Introduction
The following is a view of Forth that builds upon a classical implementation by the Forth Interest Group. Currently Chuck Moore has his colorforth that deviates substantially from the following. To state it simplistically color has replaces STATE. Then there are Forth's like lina where NUMBERS are not separate from SYMBOLS, but are merely ambiguous SYMBOLS where only a prefix is require to match. Then there is the ISO standard that says nothing about implementation, only behaviour. I don't think that it is good to edit the following as it is a coherent and valuable view of Forth, extreme and provocative as it is. It is just fair to warn outsiders that this is not all about Forth, not even a more or less consensus view.
Forth has just two rules:
1) You have NUMBERS (in any numeric base you could imagine). They are pushed on to the DATA STACK or compiled on to the DICTIONARY.
2) You have SYMBOLS. They are searched through the DICTIONARY and, if found, EXECUTED or compiled on to the DICTIONARY too.
Forth has two main modes: EXECUTE and COMPILE. Depending what mode you are in, the behavior of numbers and symbols changes accordingly.
That's it. Everything else is about WORDS, that means, what one of the SYMBOLS does when EXECUTED.
Here is a simple but correct implementation of the core Forth colon compiler:
: IMMEDIATE?-1 = ; : NEXTWORDBL WORD FIND ; : NUMBER,NUMBER POSTPONE LITERAL ; : COMPILEWORDDUP IF IMMEDIATE? IF EXECUTE ELSE COMPILE, THEN ELSE NUMBER, THEN ; : ]BEGIN NEXTWORD COMPILEWORD AGAIN ; : [R> R> 2DROP ; IMMEDIATE( Breaks out of compiler into interpret mode again )
Thats interesting, but isnt it slightly tautological? I mean that you are defining the colon compiler ':' but you are using it to define itself (in each definition). Also, you dont define the words WORD or FIND or POSTPONE etc etc so its not really complete. I think that it is an interesting question of what is the minimal subset of forth words which can be used to define all the other words. This has practical applications, because it should allow a forth system to be ported to another architecture easily (or even more easily) since it minimized the number of words which have to be written in assembler or c.
There was an old ioccc entry (1992) titled First & Third almost Forth that builds Third (which is almost Forth) from First (an almost 'minimal subset' of 12 words). The description is rather extensive, about 19 printed pages long, and shows how to synthesize new words and features (including things like basic control structures and support for comments). It's a fantastic read, and may be a good place to start looking for a 'minimal subset' of Forth.
In the late 1970s, I had a copy of the Forth source code from the Forth Interest Group for the Ohio Scientific Superboard II microcomputer -- a 6502-based "bare bones" system. The Forth source was quite elegant and simple; the vast majority was defined as Forth words. Only a handful -- I/O & stack primitives and some minimal infrastructure for interpreting or compiling the input stream -- were defined in assembly language.
Its seems a feature of ForthCulture that forth enthusiasts talk about what forth could do or did but do not provide actual source code. This leads to the perception of the Forth world being, in some sense closed or arcane, a realm only for the initiated where programmers are unwilling to release source code. It seems strange for example, that, after so many years, there do not appear to be open source libraries for performing common tasks with forth. The most obvious candidate would be a Tcp software stack. On now defunct 'ITV' site, it is claimed that a complete networking software stack for forth had been developed, but it seems that it was never released in the public domain, even when the project for which it was written became defunct. Is this not strange?
This probably has to do with Forth's anti-reuse culture. Many Forth enthusiasts recommend avoiding bloat by rewriting all library code for each application, paring it down and tailoring it to the particular application.
http://code.google.com/p/subtle-stack/downloads/list <- Here's a fantastic simple 4th implementation written from assembly, that is remarkably readable. You can watch, from the ground up, how the entire thing works. (What the prior commenter said is correct as well: In ForthCulture, the element of reuse is _ideas,_ rather than _implementations._ Forth programming works radically different than most programming, and Forth culture works radically different than most programming culture.)