C is a SystemProgramming language, which also became popular for writing ApplicationPrograms, largely because of its speed and portability.
 int main() { void (*i_am_bleeding)() = (void(*)())15; i_am_bleeding(); return 0; }
The problem with C is that the techniques that help it compete with modern dynamic languages are invariably the same techniques that make it dangerous.
The C language was created by DennisRitchie over the period 1969 through 1973, based on the predecessor languages B and BCPL. By 1973 it was capable of reimplementing Unix on an early PDP-11; Unix had been implemented purely in assembly before that. See KernighanAndRitchie (book).
It was one of the first successful high-level systems programming languages, and since the 1980s it has been the most widely used systems programming language, mostly displacing assembly code for that purpose. See also BlissLanguage.
A very insightful contrarian comment about the nature of C appears in an AlexanderStepanov (C++ STL designer) interview by Al Stevens in DrDobbsJournal (3/1995) (http://www.sgi.com/tech/stl/drdobbs-interview.html):
"Let's consider now why C is a great language. It is commonly believed that C is a hack which was successful because Unix was written in it. I disagree. Over a long period of time computer architectures evolved, not because of some clever people figuring how to evolve architectures---as a matter of fact, clever people were pushing tagged architectures during that period of time---but because of the demands of different programmers to solve real problems. Computers that were able to deal just with numbers evolved into computers with byte-addressable memory, flat address spaces, and pointers. This was a natural evolution reflecting the growing set of problems that people were solving. C, reflecting the genius of Dennis Ritchie, provided a minimal model of the computer that had evolved over 30 years. C was not a quick hack. As computers evolved to handle all kinds of problems, C, being the minimal model of such a computer, became a very powerful language to solve all kinds of problems in different domains very effectively. This is the secret of C's portability: it is the best representation of an abstract computer that we have. Of course, the abstraction is done over the set of real computers, not some imaginary computational devices. Moreover, people could understand the machine model behind C. It is much easier for an average engineer to understand the machine model behind C than the machine model behind Ada or even Scheme. C succeeded because it was doing the right thing, not because of AT&T promoting it or Unix being written with it." (emphasis added)
C became successful because it provided a range of data and control structures that were general enough to be sufficient for many programming tasks, but limited enough to be fairly easy to implement efficiently on common computer architectures. Assembly language was in near universal use for systems programming in the 1970s; to be a suitable replacement, C needed to offer comparable speed with improved usability. To that end it offered relative machine independence, structured control constructs (if, while, for), and importantly, TextSubstitutionMacros, which are widely deplored today but which were wildly popular in AssemblyLanguages.
C supports these macros via the CeePreprocessor.
For years it wasn't considered part of the C language per se, and was advocated, designed, and implemented by different people than the rest of C: Alan Snyder, Mike Lesk, and John Reiser. See Ritchie's history of C at http://cm.bell-labs.com/cm/cs/who/dmr/chist.html.
The design of the preprocessor is horrid in several ways, but on the other hand, Reiser's implementation of it is an amazing case study in how to make software run really, really, really fast. In one environment I used in the 1980s, it was faster than "cat file.c > /tmp/foo"! ("cat" uses getchar() and the stdio lib was inefficient).
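For readers who have never seen them, here is a minimal sketch of what TextSubstitutionMacros look like in practice (the macro names are illustrative only); the expansion is pure text replacement, which is why all the parentheses matter:

 #include <stdio.h>

 /* Pure textual substitution: the preprocessor pastes the argument text
  * into the replacement before the compiler proper ever sees it. */
 #define SQUARE(x)  ((x) * (x))
 #define MAX(a, b)  ((a) > (b) ? (a) : (b))

 int main(void) {
     printf("%d\n", SQUARE(3 + 1));  /* expands to ((3 + 1) * (3 + 1)), i.e. 16 */
     printf("%d\n", MAX(2, 5));      /* expands to ((2) > (5) ? (2) : (5)), i.e. 5 */
     return 0;
 }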
Although C is often criticized for being a "low-level HighLevelLanguage", this is also its strength: it allows very fast MachineCode to be generated for an especially wide range of programs, making C especially suitable as a systems programming language.
A variety of more sophisticated languages have been claimed to be suitable as replacements for C, but as of 2003 there still are no other candidates that are 100% technically successful in matching or surpassing C in its areas of strength.
On the other hand, C has been displaced to a noticeable extent in non-systems programming areas, and displaced to a certain (and some say growing) extent even in systems programming, by languages such as C++, Java, Perl, etc.
The lack of Object Oriented features in C creates dissatisfaction, but nonetheless CeePlusPlus is not universally considered to be a suitable replacement for C due to a large variety of perceived flaws, including complexity.
The most important of the "object oriented" features can be implemented in CeeLanguage through a combination of convention and macros (see PointerCastPolymorphism). Many programmers who understand enough about object-oriented programming to take advantage of the OO features in CeePlusPlus find it easier to simply do the same things in CeeLanguage.
This has been widely debated. Some people believe it to be true, others do not. I personally find that this approach is better than nothing, but does not give me as much support from the compiler as I would like. I've done OO programming even in AssemblyLanguage, but obviously that doesn't make assembly an OO language.
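As a concrete illustration of the convention-and-macros approach (PointerCastPolymorphism), here is a minimal sketch, with made-up Shape/Circle names rather than code from any real library: the derived struct embeds the base struct as its first member, so a pointer to the derived type can be cast to the base type and dispatched through a function pointer.

 #include <stdio.h>

 typedef struct Shape {
     const char *name;
     double (*area)(const struct Shape *self);   /* the "virtual" method */
 } Shape;

 typedef struct Circle {
     Shape base;        /* must be the first member so Circle* casts to Shape* */
     double radius;
 } Circle;

 static double circle_area(const Shape *self) {
     const Circle *c = (const Circle *)self;     /* cast back to the derived type */
     return 3.14159265358979 * c->radius * c->radius;
 }

 int main(void) {
     Circle c = { { "circle", circle_area }, 2.0 };
     Shape *s = (Shape *)&c;                      /* treat the Circle as a Shape */
     printf("%s area = %f\n", s->name, s->area(s));
     return 0;
 }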
ObjectiveCee offers more expressive object oriented features than CeePlusPlus and doesn't do nearly as much damage to the rest of the language as CeePlusPlus does. Many of the old macro packages were similar (though they are long gone). I think the point is that there are many choices besides CeePlusPlus for C programmers who want object oriented capabilities without having to learn an entirely new language.
Agreed. And ObjectiveCee is also a wonderfully small language; I (who?) implemented it on top of a C compiler once, commercially, all by myself in a short period of time. The core language itself, that is; naturally HP licensed the libraries from NeXT.
Don't forget the most famous example of polymorphism in CeeLanguage, the table of DeviceDrivers at the heart of every unix-like operating system. Every device driver is an object (class: major number, instance: minor number) that implements the standard open/close/read/write/ioctl/poll device interface.
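A minimal sketch of that style of dispatch (made-up names, not taken from any real kernel): each driver fills in a struct of function pointers implementing the common interface, and the caller goes through a table indexed by major number without knowing which driver it reached.

 #include <stddef.h>
 #include <stdio.h>

 struct dev_ops {
     int (*open)(int minor);
     int (*read)(int minor, char *buf, size_t len);
 };

 static int null_open(int minor) { (void)minor; return 0; }
 static int null_read(int minor, char *buf, size_t len) { (void)minor; (void)buf; (void)len; return 0; }

 static int zero_open(int minor) { (void)minor; return 0; }
 static int zero_read(int minor, char *buf, size_t len) {
     (void)minor;
     for (size_t i = 0; i < len; i++) buf[i] = 0;
     return (int)len;
 }

 static const struct dev_ops dev_table[] = {    /* indexed by major number */
     { null_open, null_read },                  /* major 0: a "null"-like device */
     { zero_open, zero_read },                  /* major 1: a "zero"-like device */
 };

 int main(void) {
     char buf[4];
     int major = 1, minor = 0;
     dev_table[major].open(minor);
     printf("read %d bytes\n", dev_table[major].read(minor, buf, sizeof buf));
     return 0;
 }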
C is a nice low-level general-purpose language, which personally I love, but other people tend to get lazy when it comes to memory management, specifically pointers.
Definitely, C is the dominant language today for systems programming. Much system software is written in C, including Windows, Linux, MacOsx, etc. -- TakuyaMurata
CeePlusPlus is a language extended from C. C++ is a superset of C except for slight differences.
Not anymore. The most recent C standard, AnsiCee 99, contains numerous new incompatibilities with C++ - including doing quite a few things differently than C++ does them, in ways which will likely be difficult for C++ implementors (and the C++ committee) to reconcile. Some think that this was done intentionally - quite a few influential C folks are openly hostile to CeePlusPlus (isn't everybody?), and don't like being considered a subset of C++. That said, many of the new extensions are quite useful (well-defined integer types, restrict, vararg macros) and probably should wind up in C++ sooner or later.
While I don't think it is necessary for C to be a near-subset of C++, I do think they should have a usable common subset. I hope the CeePlusPlus community finds some way to reconcile these differences. -- ScottJohnson
Looking at a detailed description (e.g. http://david.tribble.com/text/cdiffs.htm), the new incompatibilities do not seem to be so dramatic. Of course, there are now some new ways to write programs that compile under C but are invalid under C++. So, if you really go at it, you'll surely succeed. But the common part of C and C++ is not at all reduced compared to C89 (meaning: you might not use some new C99 features like variable length arrays *). Furthermore there are other modifications in C99 vs. C89 that really brought C closer to C++. So, in practice you needn't go too far out on a limb if you want to write source code that compiles with both C and C++ ... (*: In some of the harder cases remember: The preprocessor is your friend :-))
Have a look at the HelloWorld page for examples.
See also BrianKernighan and DennisRitchie or maybe even KernighanAndRitchie
CeeLanguage snippet from KernighanAndRitchie:
 void strcpy(char *from, char *to) { while(*to++ = *from++); }
Is this intentionally wrong?
What's wrong with it? Is it the fact that the designers care deeply that it should run fast and care little that *from might not be terminated and overrun memory? Is that a problem? That's a user problem... I'd call it a LanguageUsability problem.
Well, actually one thing wrong with it is that the CeeStandardLibrary strcpy has the destination argument first...
And the other thing wrong is that the standard library defines strcpy to return a char * to the destination string. Besides, from should be a pointer to const char, as we don't modify the source string.
from the man page:
 #include <string.h>
 char *strcpy(char *dest, const char *src);
So I'd say a better attempt at this particular part of the library would be:
 char *strcpy(char *to, const char *from) {
     char *tmp = to;
     while ( *to++ = *from++ )  // look ma, nothing in here, C99 comment style :-)
         ;
     return tmp;
 }
Except that the arguments are restricted pointers according to the C99 Standard, so even the above isn't quite right. -- JamesDennett
Heretics. The reason strcpy is a library call is to have it optimised for each architecture. The infamous empty-block while() above is 5-30 times slower than dedicated assembly, depending on the architecture.
BTW, it'd be better to use char *strncpy(char *to, const char *from, size_t num); that way there are no buffer overflows.
''Not necessarily, strncpy doesn't guarantee NUL-termination, so you still have to check the result. Better to explicitly check, or use a real string library. Don't even think about strncat().''
No you don't.
 strncpy(dst, src, n);
 dst[n-1] = 0;
End of discussion, and no "checking" necessary. If src is too big to fit in dst, then it would have been truncated anyway. This is something you'd have to check for with strcpy anyway to prevent a buffer overflow.
No. strncpy() is buggy (or misnamed). See http://blog.liw.fi/posts/strncpy/ It doesn't guarantee the result is a string and it may copy many more characters than necessary. It can be useful but it *isn't* a string function!
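If what you want is a copy that always yields a NUL-terminated string and lets you detect truncation, a small helper in the style of the BSD strlcpy is easy to write. This is only a sketch under that assumption; safe_copy is a made-up name here, not a standard library function:

 #include <stddef.h>
 #include <string.h>

 /* Copy src into dst (dst has room for dstsize bytes). The result is
  * always NUL-terminated when dstsize > 0. Returns the length of src,
  * so the caller can detect truncation by comparing against dstsize.
  * This mimics the BSD strlcpy contract; the name is made up. */
 size_t safe_copy(char *dst, const char *src, size_t dstsize) {
     size_t srclen = strlen(src);
     if (dstsize > 0) {
         size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;
         memcpy(dst, src, n);
         dst[n] = '\0';
     }
     return srclen;  /* >= dstsize means the copy was truncated */
 }

A caller can then write if (safe_copy(buf, name, sizeof buf) >= sizeof buf) to notice that truncation happened.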
Here's a dumb newbie question for all you C gurus.... I've been reading through sample code, and seeing a lot of things like this:
 char* this_variable_will_store_a_string;
Now... my understanding of these things is that char* tells you we're dealing with a pointer to a variable of type character, but then how can a pointer to a single character hold a whole string?
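A sketch of the usual answer, for what it's worth: the char* doesn't hold the string itself, it holds the address of the first character of a NUL-terminated array that lives somewhere else in memory.

 #include <stdio.h>

 int main(void) {
     char buffer[] = "hello";          /* six bytes: 'h' 'e' 'l' 'l' 'o' '\0' */
     char *this_variable_will_store_a_string = buffer;  /* points at buffer[0] */
     printf("%s\n", this_variable_will_store_a_string); /* printf walks to the '\0' */
     return 0;
 }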
For copy-style functions, what's the PreferredOrderOfSrcDstArguments?
Has the CeeLanguage, Circa1970, passed its UseByDate?
CeeLanguage will persist as long as Unix-like systems exist. Also, C is usually the first language ported to a new microprocessor or microcontroller, so C will persist as long as there are EmbeddedSystems to develop.
Well-written C code is more portable than code in any other language, since virtually every device has a C compiler, and every language has a mechanism for calling C code (like JNI or CriTCL, for example). Therefore, if you want your library to be usable anywhere and in any language, write it in C.
Also note that some younger languages compile to C.
Why is C AsFastAsCee?
Nothing on this page truly explains why C is a fast language. A set of instructions is a set of instructions; why is one set written in C faster than one written in, say, Pascal or BASIC? Pascal was not necessarily slow, but had a very poor set of built-in I/O functions that made it seem very slow. BASIC is generally slowed down by its interpreter. C, from the beginning, was very little "higher level" than assembly language. You could generally "see" the resulting assembly code from the source code. This allowed expert programmers to write efficient code, though sometimes at the expense of readability. Over time, this emphasis was generally found to be counter-productive for the long term, but at the same time, compiler optimization was making great strides. It's amazing how tight the assembly output can be from a C compiler! (I've looked.)
C is fast because its constructs map closely onto the underlying machine, so there are essentially no hidden run-time costs the programmer didn't ask for.
[I would argue that there is a self reinforcing cycle also in play here. Most modern benchmarks are written in C (initially because of portability), so processor design tends towards things that make the benchmarks run fast, some of which will have more general applicability to C programs. Then C gets used more (and more benchmarks get written in it) because it is fast.]
Here are some examples of features that were used in other languages in the late 1960s/early 1970s that were left out of C very much on purpose in order to make C a faster language: run-time array bounds checking, garbage collection, tagged data, and built-in string and I/O primitives.
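To make the point concrete, here is a minimal sketch of an ordinary C loop: nothing in it implies a bounds check, a type-tag test, or a garbage collector, so a compiler can turn it into little more than a load, an add, and a pointer increment per element.

 #include <stddef.h>

 /* Minimal sketch: the generated code is essentially a pointer walk. */
 long sum(const int *a, size_t n) {
     long total = 0;
     const int *end = a + n;
     while (a < end)
         total += *a++;     /* no bounds check, no tag check, no allocation */
     return total;
 }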
ObjectiveCee, ObjectCee? (a variant created by Apple and BBN) and numerous private variants added object oriented capabilities to CeeLanguage while attempting to preserve the performance of CeeLanguage. As was mentioned above, such extensions were straightforward for a competent CeeLanguage programmer who understood object-oriented practice as it existed in the early eighties. CeePlusPlus was, in the opinion of many, one of the less successful attempts to accomplish this extension.
ObjectiveCee method calls are more expensive than CeePlusPlus virtual function calls, as ObjectiveCee uses DynamicTyping rather than StaticTyping for the OO part. Calling via DynamicTyping is more expensive than static binding - though again, not "thousands of instructions" (unless you have a very simplistic implementation which searches the class hierarchy for the correct function to call every time). ObjectiveCee doesn't have MultipleInheritance (efficient implementations of DynamicTyping coupled with MultipleInheritance are problematic). I don't believe ObjectiveCee has exceptions, either.
[Agreed. I'm unclear, however, why more people don't switch to Objective-C when they are unhappy with C++... any thoughts, or has this been addressed well on another page?]
Because why would they? Objective-C doesn't fix any of the common C++ annoyances (at least those of actual C++ programmers). And it doesn't have many of the C++ features. No exceptions, no way to do RAII, etc. Apparently no portable standard library (other than the C library). Objective-C is less efficient, has fewer features, and there's less literature about it.
ForthLanguage is almost as fast as C, but typically not as fast, since it explicitly uses a stack rather than making it easy for implementations to make efficient use of machine registers, which is critically important to get maximum speed from a cpu. It is also considered somewhat lower-level than C, and the average programmer dislikes its PostfixNotation, and prefers the familiar C/Algol-family InfixNotation for arithmetic.
As mentioned, Lisp has been made fast...even faster than Fortran for number crunching at one point in the 1970s. But this required heroic efforts, and requires carefully avoiding higher level power constructs. The kind of program that Lisp programmers really enjoy writing in Lisp (e.g. unrestrained recursion that isn't necessarily tail-recursive, use of lexically-scoped outer-scope variables, use of higher order functions, etc) will typically be much slower than the corresponding C program in any implementation.
This begins to get into the rather deep specialty areas of compiler design and cpu architecture, which are considered obscure and arcane even by many programmers (especially application programmers who don't do systems programming, don't care about systems programming, and just want to use languages that are powerful and convenient), so extreme technical detail would be inappropriate here.
Nothing on this page truly explains why C is a fast language.
CeeLanguage is fast because it was in daily use by an enormous number of dedicated developers who profiled and optimized the bejesus out of it (primarily the "portable" C compiler). The (Unix) community into which it was born was the primal open-source community, which welcomed and propagated improvements. The first implementations of CeeLanguage did not produce code that was noticeably faster than comparable code written in Pascal. The early Unix community learned, through experience, how to make their environment go faster. Since that environment was written in CeeLanguage, the language improved as well. The proprietary languages that attempted to compete with CeeLanguage could not keep up, mostly because no single company could match the developer resources leveraged by the academic and unix community.
Overall that's true, but a nitpick: I used the first two PascalLanguage implementations in the world, at CalBerkeley, on the CDC 6400 and on the PDP 11. The former was on a machine that didn't support C, so no direct comparison was possible. The second was byte coded, and hence far slower than the C compiler. You are doubtless referring to some of the later (but still early) optimizing Pascal compilers, which took a while to come along due to the popularity of the UcsdPascal system (which again was byte coded). So C must have been hands-down faster than any Pascal system for something like the first 8 years of Pascal's existence, until people started seriously applying the state of the art to optimizing Pascal compilers.
My first-hand experience began in 1982 or so (I was a hardware guy before then), I was using Pascal on a Perq (I worked for Three Rivers at the time) that was optimized for Pascal, and so my very earliest memories are probably atypical. We compiled C into "C-codes", because the Perq was microcoded to be blazing (for its time) at bytecoded implementations. By 1983-85, I was working with more vanilla-flavored M68K machines, and the portable C compiler was significantly faster than Pascal. As I recall, it shared a code generator with the Fortran compiler, so Fortran and C did pretty well on those machines.
C made significant contributions to the state of the art of optimization and of portable compilers in that era, BTW, long before C became a dominant language.
Perhaps AlternativeMicroprocessorDesign would be a better place to discuss this. I'm curious - what about the Perq made it so good for bytecoded implementations? Would those features still be useful today if I wanted to, say, build a custom FpgaCpu that I wanted to run Java? -- DavidCary
it was the first efficient and portable high-level systems programming language
This may be a controversial claim, but PL/I-S only ran on IBM systems, Burroughs did systems programming in Algol, but only on Burroughs architectures, etc. The apparent exceptions either do not seem to be portable, or if they were (e.g. Fortran), they weren't very widely used for systems programming, because of an impedance mismatch (including inefficiency).
BlissLanguage was effectively used for systems programming, and pioneered many compiler optimizations, but was not portable (PDP 11 only). Actually, that's not quite true; there were Bliss compilers for DecSystemTwenty? and the VAX as well (AFAICT; I wasn't there, but I did some research in to it recently as part of a RetroComputing? compiler project). However, this is not as straightforward as it sounds, as Bliss wasn't a single language, but a family of dialects, each optimized for the system (and in some cases, the OperatingSystem) they ran on. Furthermore, the language evolved substantially over time; thus the Bliss-10 that ran on TopsTen? in 1970 was quite different from Bliss-11 for RSX-11 in 1978, or Bliss-32 for VmsOperatingSystem in 1980. There was something called Common Bliss, which was supposedly a subset of all the later dialects, but apparently it was never widely used. CommentsAndCorrectionsWelcome. -- JayOsako
ForthLanguage might be an exception, except that it is usually considered low-level rather than high-level, and it has primarily been used as an embedded language rather than for general systems programming. (Actually, not true. When Forth was first employed, it was used to implement multi-user, multitasking, data visualization and processing environments at NRAO. Its applicability towards embedded development came much later.)
The Algol-60-like language 'Imp' from Edinburgh University (a descendant of Atlas Autocode, which itself was ported from the Atlas to the KDF-9) was a high-level language used initially for operating system coding, and later for applications - just like C, but predating it by several years. Imp was widely ported to systems as diverse as the KDF-9, ICL4-50, 4-75/IBM360,370/Amdahl V7, ICL1900, ICL2900, Modular 1, Univac 1108, IBM 7090, CDC Cyber, PDP-9/PDP-11/PDP-15/DEC-10, Interdata 32 series, Perkin-Elmer 3220, Sparc/SunOS, Acorn ARM/Archimedes [AcornArchimedes], NS32000/Acorn Panos, Transputer, Ferranti DISC, Z80, 6809, (some HP calculator that has been forgotten), MTS, 68000, Sequent Symmetry, and finally Intel 386.
Pascal doesn't allow you to sling pointers around, but C requires some pointer use to even talk to the standard library. Common C style encourages pointer use at a level unheard-of in any language OTHER than assembly, to the point of allowing semi-arbitrary pointer arithmetic. (Fully arbitrary if you freely interchange pointers and integers.) Pascal has the same number of pointers, granted, but the language does a better job of hiding the pesky things. That, and Pascal seems to have a stronger notion of what strings are. (Especially in later non-standard variants. Strings in ISO Pascal are legendarily bad.)
My experience from systems and other performance-oriented programming since 1982 prompts me to claim that (at least for me) the WYSIWYG nature of C is the key to its speed. There will be no hidden calls, constructors/destructors etc. unless I want them there. I've written so much C code and debugged it in lower-level debuggers that I can more or less predict what assembly language code will be generated. This, in turn, causes me to choose certain constructs over others in certain cases because I'm aware of their performance implications. Two that come to mind are using consecutively numbered cases in switch statements, and using intermediate pointers to values/structure members so as to make sure an address isn't computed over and over again. The former produces an indirect jump through a vector of addresses as opposed to a giant binary test with lots of CoMPares and (conditional) JuMPs. The latter is especially important on compilers for CISCs with few registers (which forces the compiler to constantly re-use them) and much less important on RISC compilers.
This is how I can write maintainable C code; others may wonder slightly why I chose certain constructs, but the code will be understandable.
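As a minimal sketch of the intermediate-pointer idiom mentioned above (the struct and field names are made up for illustration), the address computation is hoisted into a local pointer once instead of being re-derived from the base pointer on every iteration:

 struct sample { int values[64]; int count; };

 long sum_samples(const struct sample *s) {
     const int *v = s->values;     /* intermediate pointer, computed once */
     long total = 0;
     for (int i = 0; i < s->count; i++)
         total += v[i];            /* no repeated s->values address computation */
     return total;
 }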
I also like to use a ("semi-arbitrary") pointer scaling macro which scares the **** out of some but is appreciated by others once they figure it out. Another is pointer subtraction to get the byte offset.
 #define pointer_manipulation(base,scale)   ((void *)(((char *)(base)) + (scale)))
 #define pointer_subtraction(base,member)   ((int)(((char *)(member)) - ((char *)(base))))
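For what it's worth, a small usage sketch (assuming the two macros above are in scope; the header struct is made up for illustration):

 #include <stdio.h>

 struct header { int type; int length; };

 int main(void) {
     char buffer[256];
     /* Step past the header by a byte count rather than in sizeof-units: */
     void *payload = pointer_manipulation(buffer, sizeof(struct header));
     /* Byte offset of the payload from the start of the buffer: */
     printf("%d\n", pointer_subtraction(buffer, payload));  /* prints sizeof(struct header) */
     return 0;
 }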
I've also made use of the x86 RDTSC instruction, which allows me to implement elapsed time measurements inside the application code without significant overhead, not to mention elapsed clock cycle measurements!
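For readers on GCC or Clang rather than OpenWatcomC, roughly the same measurement can be sketched with the __rdtsc() intrinsic; this is only an illustration under that assumption, not the #pragma aux technique mentioned just below, and raw cycle counts need care on modern CPUs (out-of-order execution, frequency scaling, core migration):

 #include <stdio.h>
 #include <x86intrin.h>   /* __rdtsc() on GCC/Clang; x86 only */

 int main(void) {
     unsigned long long start = __rdtsc();
     volatile long sink = 0;
     for (long i = 0; i < 1000000; i++)
         sink += i;                           /* the work being timed */
     unsigned long long cycles = __rdtsc() - start;
     printf("elapsed: %llu cycles (sink=%ld)\n", cycles, sink);
     return 0;
 }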
If I write good C code, the compiler will produce assembly code which is so fast that I cannot significantly better it by coding manually. Believe me, I've really tried. My favorite compiler is OpenWatcomC, mostly because of its #pragma aux construct, which allows specifying inline assembler and which registers the instructions affect. In this way I can fold the x86's block manipulation instructions into the compiler's optimization work.
The only other language I found suitable for working in anywhere near a similar manner was PL/M-86, but that language was hampered by its almost complete lack of floating point support. On the other hand, it forced me to dig very deeply into the IEEE 754 floating point formats and figure out quite a few really powerful constructs and algorithms.
To me, a fast (non-interactive) program is one which appears to exit immediately after it has been invoked. Of course, in the time in-between, the program has done everything it should. An acceptably but not impressively fast program is one where computing time is equal to I/O time.
A very interesting site, this. - OlofForshell
See some interesting discussion on a comp.compilers post: http://compilers.iecc.com/comparch/article/98-05-052.
SimplifiedWrapperAndInterfaceGenerator (SWIG) can be used to make calls from many other languages to CeeLanguage or CeePlusPlus code.
CeeAitch (Ch) is C implemented in an interpreter.
See Also:
I know many a programming language, but overall I like C the most. Sure, it's not object-oriented; sure, it's lower level than Java; but I like C. Besides, C can do so much. Maybe not with only the standard library, but even with just ncurses one can make a word processor.