How To Writea Compiler

Compilers are typically organized into "phases". The output of each phase acts as the input to the next, much as in a pipeline. The first phase is (typically) LexicalAnalysis. The output of LexicalAnalysis is a stream of tokens and their associated lexemes. This token stream is then fed into a parser, often called a SyntaxAnalysis phase. Typically, as the parse progresses, an intermediate data structure is built in memory. This intermediate data structure is then, optionally, manipulated by one or more optimization passes. Finally, during code generation, the tree is walked and whatever output is generated.

-- MarkGrosberg

Useful book: See also CompilersPrinciplesTechniquesAndTools


Alternative approach:

  1. Begin typing
  2. Continue typing
  3. Repeat step 2

A compiler should eventually emerge, and you will not have had to read TheDragonBook.


At least for fairly simple languages, you can write compilers using test-first design. It helps to have some notion of how a pipelined lexer/parser/optimizer compiler works, but it's not a necessity. You can start off with a small subset of the language, write failing tests for small programs, make them pass, and then slowly add more features. If you refactor carefully, a good design will emerge. I successfully used TestDrivenDevelopment in making an interpreter for KarelTheRobot.


Erik Hilsdale, J. Michael Ashley, R. KentDybvig, and DanielFriedman. Compiler Construction using Scheme. First International Symposium, Functional Programming Languages in education, LNCS Volume 1022, 1995, pp. 251-267, 1995.

This paper describes a course in compiler design that focuses on the SchemeLanguage implementation of a Scheme compiler that generates native AssemblyLanguage code for a real architecture. The course is suitable for advanced undergraduate and beginning graduate students. It is intended both to provide a general knowledge about compiler design and implementation and to serve as a springboard to more advanced courses. Although this paper concentrates on the implementation of a compiler, an outline for an advanced topics course that builds upon the compiler is also presented.

See

Also

Compiling: A high-level introduction using Scheme, in Proceedings of the Twenty-eighth SIGCSE Technical Symposium on Computer Science Education, February 1997, pp. 253-257.

Traditional compiler courses use formal methods for parsing, but treat the more important semantic aspects informally. We present a one semester course in which compiler development is reduced to a number of transformation steps, each of which is formally specified, easily tested, and clearly motivated by semantic considerations. Furthermore, the source language is substantial (essentially the host language of the compiler) and the target is a popular RISC architecture...

These developments have been tested both in class and in a summer workshop attended by college teachers, who were uniformly enthusiastic about this approach.

See


Those papers are pretty interesting. In the mid 80s I wrote a CeeLanguage compiler in CommonLisp. I was wondering if I was nuts, because I hadn't seen many people using Lisp dialects for that job, but it really was a great choice. The target was an embedded VLIW CPU and we wanted to do a lot of the optimizations described in J. R. Ellis. Bulldog: A Compiler for VLIW Architectures. PhD thesis, Yale University, February 1985. YALEU/DCS/RR-364. These involve performing transformations on dominator chains and such. A list oriented language made that sort of operation simple. Compile time wasn't great, but that wasn't a major problem for an embedded processor environment. And that might have been caused more by the degree of optimization than by the language choice.

-- MikeGarrity


The easiest path to writing your own compiler is to read "LetsBuildaCompiler".


EditText of this page (last edited August 10, 2004) or FindPage with title or text search