While CeeLanguage has long been popular as a source language (though it is losing respectability for things other than systems programming and high-performance infrastructure), it also functions well as an intermediate language--a "high level assembler", if you will.
(This also applies to C++; there is a subset of C++ that is also useful for this sort of stuff. Via static constructors/destructors, one can use C++ to provide per-module init functions that run before the program starts. Of course, if you use C++ then you lose useful AnsiCee 99 features like RestrictedPointers?, VarArgMacros, and well-defined integral types).
Reasons that this is so:
- C is ported to (almost?) any platform
- C has a large number of good optimizing compilers available.
- C has well-documented and simple-to-implement calling conventions.
- If C is used as an intermediate or target language, a lot of C's misfeatures (like allowing the user to have FunWithPointers) that make it less desirable as a source language (especially for high level code) simply become irrelevant. The compiler for the HighLevelLanguage takes care of the semantic error-checking and either emits C code that is provably correct, or else which contains error checks built into the code.
- Virtually every OS comes with a linker that can link C (or has it available).
- Can easily embed assembly language if you need to.
Limitations of C
- C doesn't provide access to "carry" or "status" flags of underlying processor. (Try implementing a BigInt package in standard C and you will C what I mean)
- C has a flat NameSpace; mapping symbol names of higher-language OSes to C can be a problem. Especially true in languages which allow overloading. Name mangling is the obvious solution; but then the name-mangling scheme becomes an issue. (This has long plagued C++; different compilers mangled names in different ways)
- Dumb C linkers often have a difficult time with many higher-level language constructs. (C++ templates were one huge source of problems--the eventual solution was to augment system linkers with features to support C++ and other higher level languages). One way around this (for statically linked stuff) is to perform "linking" in a high-level tool; then compile the result to C. You're pretty much stuck with the system linker/loader if you want to do dynamic linking, though.
- Prior to AnsiCee 99, the sizes and alignments of the standard machine types were not completely specified by the language.
- Rather poor at handling unusual/exceptional flow control situations--CallWithCurrentContinuation (a rather handy low-level construct) is missing from the language. (Some OSes have methods for grabbing and encapsulating the state of a task--a facility which can be used to implement such things as exceptions, coroutines, generators, iterators, etc.) The only facility provided by the language definition which even comes close is setjmp()/longjmp()--and it is the opinion of many that these are ConsideredHarmful.
- No native support for synchronization or concurrency prior to C11. Most OSes have this, and OS-independent wrappers exist for C to handle this, but difficult-if-not-impossible to do in C99 and older.
- C has no concept of an "object", or of the many mechanisms used to implement objects (vtables, and the like). The compiler has the task of implementing these on top of C. (If one uses the JavaVirtualMachine as a runtime; a lot of this stuff comes for free.) Which is certainly possible...but may lead to inconsistencies between implementations for things which ought to be common.
Examples of using C in this fashion:
- CeeFront?; the original implementation of C++, translated to C.
- Comeau C++ (http://www.comeaucomputing.com/) translates C++ to C. It is the first C++ compiler to provide full core language support for standard C++.
- Several Java-to-C translators out there (some translate Java source; others translate JavaByteCode to C)
- Many experimental language compilers use C as a backend, rather than emitting assembly language directly.
- SqueakSmalltalk's VirtualMachine is written in a subset of Smalltalk which gets translated to C and fed to the C compiler. The VirtualMachine used by Scheme48 is written in a StaticallyTyped SchemeLanguage dialect called PreScheme? which is compiled to C. (The PreScheme? compiler itself is written in full Scheme.)
- Several SchemeImplementations compile to C (e.g. RScheme, Bigloo and Chicken). These Schemes often use the technique described in CheneyOnTheMta to provide support for ProperTailRecursion.
CeeMinusMinusLanguage is a language intended as a portable AssemblyLanguage that can replace CeeLanguage as a target language used by compiler writers.
Also see MicrosoftIntermediateLanguage.
CategoryCee