The language series: C


I finally took the compulsory software engineering course notorious for its very difficult course project – writing a bitcoin client in C. Alhamdulilah, we successfully completed the project: about 18k lines of code, automated builds/documentation/tests and lots of other stuff. I figure we rank around 7 or 8 on the Joel 12-point scale, even though some of the points don’t apply to our project. :D Big UPs to the team!

I decided to review all the languages I have used, or been forced to use, while taking the course: the story behind learning each one, its strengths and weaknesses, quirks, advice for beginners and some wisecracks too :).

C is first on the list. Here goes!

How I learnt C

I was more or less forced to relearn this language this year, but my first attempt at C was self-study in 2007 or 2008 as an undergrad. Despite my dreams of building the most AWESOME program ever, my C adventure ended abruptly after I read about 3 to 5 chapters of a C book. I was discouraged by apocryphal reports which insinuated that C was no longer relevant, so I left C for C++ and then Java. That story is here.

Well, this year I had no choice but to learn it. Well, there was another choice: getting a poor grade in the software engineering course.

Likes

  • C packs a powerful punch; who doesn’t like power and speed?
  • It has a concise grammar and you can learn the language fast.
  • Purity: its simplicity forces you to think.
  • I think function pointers are kind of cool too.
  • Forces you to learn how low-level computer stuff like stacks, heaps and memory allocation work.
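
To show what I mean by function pointers, here is a minimal sketch. The names binary_op, add, mul and reduce are made up for illustration. One loop folds an array with whatever operation the caller hands in:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: a pointer to any function taking two ints and returning an int. */
typedef int (*binary_op)(int, int);

int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

/* Fold an array into one value using the operation passed in by the caller. */
int reduce(const int *values, size_t n, int start, binary_op op)
{
    int acc = start;

    assert(op != NULL);
    assert(values != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        acc = op(acc, values[i]);   /* call through the pointer */
    return acc;
}
```

With int v[] = {1, 2, 3, 4}, calling reduce(v, 4, 0, add) gives 10 and reduce(v, 4, 1, mul) gives 24; the loop never changes, only the function it is handed.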

Dislikes

  • It doesn’t support as much abstraction as I want.
  • Bah… why do I have to call free() all the time? Can’t the language help me with this? I know, and agree, that I’m spoilt, but why make programming harder?
  • No hashtables? No string support? Beats me… every language seems to have these.
  • There is some redundancy in the functions available in the C library, e.g. strtol and atol; seems PHP got a predecessor in C.
  • Pointer tangles; what does this point to or mean? ***a.
  • Uninitialized values can hold all sorts of values; woe betide you if you make the mistake of using them straightaway; C won’t raise any errors.
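
On the strtol/atol overlap, one practical difference is error reporting: atol gives you no way to tell bad input from a genuine 0, while strtol tells you where parsing stopped. A hedged sketch, where parse_long is a hypothetical helper name of my own and the "reject trailing junk" policy is just one reasonable choice:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 long and report failure explicitly.
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_long(const char *text, long *out)
{
    char *end;
    long value;

    assert(text != NULL && out != NULL);

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text)        /* no digits were consumed */
        return 0;
    if (*end != '\0')       /* trailing characters after the number */
        return 0;
    if (errno == ERANGE)    /* value does not fit in a long */
        return 0;

    *out = value;
    return 1;
}
```

Here parse_long("12x", &n) returns 0, whereas atol("12x") quietly returns 12.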

Writing code in C

It’s one of two things: you’ll either learn code purity and write pretty nifty code or massacre lots of innocent computer bits à la segmentation faults, memory overwriting and stack overflows…

I think everyone starts out in the latter group and moves to the former :).

Recommended For Beginners?

C is pure and has a small grammar, which makes it easy to learn, but it is a bit challenging for a beginner to start with. I think Python or Scheme will be easier.
You’ll probably find OOP difficult to grasp if C is your first language; however, you’ll find other languages really easy.

C Quirks

7[a] == a[7] if a is an array; it was even on my exam! :P
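
This works because a[i] is defined as *(a + i), and addition commutes, so i[a] is just *(i + a). A small sketch to check it; same_element is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when a[i], i[a] and *(a + i) all refer to the same element. */
int same_element(const int *a, size_t i)
{
    assert(a != NULL);
    return &a[i] == &i[a] && &a[i] == a + i;
}
```

For an int a[10], same_element(a, 7) returns 1. Legal, yes; readable, no.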

while (*s++ = *t++); copies a string t to a string s, stopping once the terminating '\0' has been copied.
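
A minimal sketch of that idiom wrapped in a function. copy_string is a hypothetical name, and it assumes s points to a buffer big enough for t including the terminating '\0', which C will not check for you. The loop needs an empty body: each pass copies one character and advances both pointers, and the condition becomes false right after the '\0' is copied.

```c
#include <assert.h>
#include <stddef.h>

/* Copies the NUL-terminated string t into s and returns the start of s.
   Assumption: the caller guarantees s has room for t plus the '\0'. */
char *copy_string(char *s, const char *t)
{
    char *start = s;

    assert(s != NULL && t != NULL);

    while ((*s++ = *t++) != '\0')
        ;   /* empty body: all the work happens in the condition */

    return start;
}
```

A single *s++ = *t++ statement on its own copies just one character; it is the while around it that walks the whole string.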

Rating

6/10

Pretty powerful, compact and small, although it lacks a lot of expected features and development is sometimes painful. There are a couple of libraries that you can use though.

I hear C++ is more challenging… Do the ++  signs signify difficulty? :)

Read my reviews of Python, Java, PHP and JavaScript too.

20 thoughts on “The language series: C”

  1. Awesome post as usual.
    Corrections though: 1. “No break statement”: there is a break statement. 2. *s++ = *t++ will not copy a string t to string s. You will only copy the first character in this case. (Assuming both t and s are char *, aka strings in C.)

      1. It’s possible you had a teacher who so discouraged students from using the ‘break’ statement in C that he/she made it sound like the language doesn’t have one. Many see using it as bad practice, whereas others see some instances in which it makes the code more readable, not less.

  2. C is like a really fast and powerful sports car that comes with an extra pedal to control fuel injection… I’m still in my undergrad, but memory management “sucks”.

  3. C was originally a systems language – it was intended to be (barely) one step above hand-coding in assembler. It is exceptionally compact and simple, but does presuppose that the developer understands a lot about the hardware architecture that they are working with.

    It does not provide any direct memory management facilities itself beyond malloc() and free(), which enable someone to develop a memory management framework that is appropriate for their domain. Remember, at the end of the day, someone has to create memory management structures for OSes, and a language which imposes a memory management scheme is not necessarily ideal in all circumstances.

    C is an excellent language still for quite a wide range of application domains where performance counts, such as embedded systems and suchlike. Although there are Java runtimes which can be embedded, the overhead of them is brutal and the performance penalties are high enough to exclude them from use in environments where timing is critical such as process control systems.

    Would I use C to write a web app? No – that would be silly. If I wanted a PID loop controller written for field RTU where timing and compactness are critical, then C is ideal.

    I would absolutely recommend that any new developer learning C take some time to learn the workings of pointers and memory addressing in a relatively simple processor such as the DEC PDP-11 – it helps make a lot of the shorthand in C comprehensible. (and yes, I am quite aware that the PDP-11 is a dead architecture, but it is also very much part of the family of hardware that C was originally developed on)

  4. I want to piggyback on the PDP-11 comment by MgS. For the younger generation all you really need to know IMHO is that on this machine it was possible in a single machine instruction to BOTH fetch/use AND increment/decrement a register’s value in one machine cycle. This was made possible by the introduction of operand mode-bits within the instruction, something mini-computers (like the PDP series) were free to implement but large mainframe manufacturers couldn’t since a hardware change like that would disrupt their large customer base running legacy software. Programming languages are called “higher level” languages in part because one instruction gets translated into many low level machine operations (one-to-many). But now we had a machine (PDP-11) that could do two things in a single machine instruction that required several instructions in conventional languages (many-to-one). For example, the following two high-level statements in a conventional language could be translated into one machine instruction on the PDP-11:
    a[i] = 0
    i = i+1

    To fix this, the unary ++ (and its -- opposite) were added to a new language, the C language:
    a[i++] = 0;

    Today’s PC, workstation, and server processors have hardware features like the PDP-11’s auto-increment and decrement and many of today’s popular languages include C’s ++ and -- operators (one notable exception is Python).

  5. I have taught C programming (at the University level) in the past, and continue to use it today.

    I have always referred to it as “the rope” language, because it gives you everything you need to hang yourself ;-)

    Almost anything will compile, but the compiler assumes you know what you are doing. This goes in concert with the earlier comment about “with great power comes great responsibility”.

    C can do major abstraction, though it takes work. When the C++ language first came out, the technology used was called Cfront – literally a C++ compiler, but rather than output machine code, it output C code – which was then run through the normal C compiler. Looking at the intermediate (the output C code) gave hints on how to abstract.

    As a small, compact programming language, much of what is implemented is done in external libraries. This is what you want as a system level programming language – there shouldn’t be a need to develop intrinsic math functions for each compiler/library – they should all use the same one so that optimizations a) only have to be done once in the library and b) all codes (C/Fortran/C++) generate the same answers when calling the same functions. Having all those libraries as part of the C-runtime just makes it bulkier and harder to optimize.

    This leads me to allocation/freeing of memory. Hiding this ‘magic’ from the user (via garbage collection) means that the runtime is much larger, and has significantly more interaction with the user’s program. C is a minimalist language, and as such can be used anywhere from embedded processors (with fixed amounts of memory and no virtual memory) to large supercomputers. Understanding how each byte of storage is used by your application is key to the success of embedded and smartphone applications. If you’ve never had to deal with a memory leak (or optimized memory utilization), you’ve missed a huge way to make your application perform better.

    Teaching C as a first programming language is arguable. Since Assembly language programming is no longer often taught, there is no understanding of the ‘magic’ behind allocation/deallocation – or implementing as arrays of fixed memory – both of which are necessary for students to understand embedded/app programming.
