Snobol Language

The SNOBOL language was invented by RalphGriswold, et al. at BellLabs.

It is a line-structured programming language with GoTos (!). The basic datatype is the string. Strings can be parsed and matched using Patterns which are much like RegularExpressions today. It supports records as well as functions. It is not object oriented.

The successor to SnobolLanguage is IconLanguage by Ralph and Madge Griswold at UniversityOfArizona.

Other inheritors of the Snobol4 flame include: AwkLanguage, PerlLanguage, IconLanguage.

On MSDOS, you can get an interpreter called Vanilla Snobol. Search for it! On Mac, there is one by a company called Catspaw [or something!].

Spitbol [SpitbolLanguage] is a compiled variant of Snobol.

Anyone else care to comment on SnobolLanguage?

-- RonPerrella


In answer to "On MSDOS, you can get an interpreter called Vanilla Snobol. Search for it! On Mac, there is one by a company called Catspaw [or something!]":

Vanilla Snobol is a crippled version: perfectly usable, but very limited in some ways. It was originally a free taster of a commercial version called Snobol4+. Now Snobol4+ is also free, so there is little reason to use Vanilla Snobol. Snobol4+ is very good, and I have used it quite a lot. Both these versions of Snobol4 are produced by Catspaw: information at http://www.snobol4.com/. Catspaw also produces a still better version, Spitbol, which, however, is not free. The price is on the order of a few hundred dollars, if I remember correctly.

Catspaw produces implementations of Snobol for many platforms. As far as I know, its free versions are only available for DOS (and usable on Windows). However, Phil Budne has produced "CSNOBOL4", a free Snobol implemented in CeeLanguage, so that it can be compiled to run on a wide range of platforms. Pre-compiled binaries are also available for some platforms. I have used CSNOBOL4 much less than the Catspaw versions, but it seems to be very good. Phil has a Snobol resources page at http://www.snobol4.org/.

Documentation is variable. Vanilla Snobol comes with EXCELLENT documentation, Spitbol with fairly good documentation, CSNOBOL4 with little, and Snobol4+ now comes with none, since it has been made free. If you are interested in using Snobol4+ I suggest you download Vanilla Snobol for the documentation, and supplement it with notes I have written to extend coverage to most features of Snobol4+. These notes can be downloaded at http://www.sachsdavis.clara.net/Snobol4Pdoc.pdf.

-- Michael Davis


A predecessor of SNOBOL was the (now little remembered) COMIT language, which was (to the best of my knowledge and belief) the first computer programming language that contained built-in language features for matching strings against patterns.

I have never written a COMIT program, but I have read a few. The language was awful. SNOBOL (and SNOBOL3 and SNOBOL4) (no, I believe there wasn't a SNOBOL2 - don't ask) vastly improved on COMIT. But credit where credit is due - the pattern matching in SNOBOL was designed with COMIT in mind, in a conscious effort to embed the same ideas (pattern matching and string manipulation) in a better language.

Incidentally, one of SNOBOL's significant contributions to the evolution of programming languages was a feature not immediately evident to a user: its portability. SNOBOL was implemented in SIL, the "SNOBOL Implementation Language", a low-level "assembly" language for a VirtualMachine. The syntax and semantics of SIL were such that it could be (and generally was) implemented as a set of macros for your favorite assembler. The idea was that you wrote a few dozen macros implementing the SIL "opcodes" in a manner appropriate for your hardware and OS architecture, then you fed the macro definitions, followed by the SIL program for SNOBOL, into your assembler, and voilà; out popped a SNOBOL interpreter. (Generally you had to link it with the FORTRAN IV I/O library to get something you could actually use.) The book The Macro Implementation of SNOBOL4 describes the process, which was carried out successfully on many platforms. (In the days in which SNOBOL was invented, it was exceedingly rare for a language without an ANSI/ISO standard - i.e., nearly all of them - to be available on any machine other than the one for which it was initially developed.)

The idea of a virtual machine applied not only to the construction but also the execution of the SNOBOL language processor, which comprised a "compiler" that compiled the SNOBOL program text into opcodes, and an interpreter that executed the opcodes. The compiler remained available at run time, so you could have a SNOBOL program that extended itself, by assembling SNOBOL program text in a string, feeding the string to the compiler to obtain an object of type CODE (a primitive type in SNOBOL), and then using a "direct GOTO" to transfer control (in the interpreter) to the "entry point" of the CODE object.
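As a minimal sketch of the run-time compilation described above (the variable and label names SRC, NEWCODE, BACK and BAD are made up for illustration), a program can build a statement in a string, hand it to CODE(), and jump into the result with a direct goto:

```snobol
* Build a statement as text, compile it at run time, then jump into it.
* SRC, NEWCODE, BACK and BAD are illustrative names only.
        SRC = ' OUTPUT = "generated at run time" :(BACK)'
        NEWCODE = CODE(SRC)                     :F(BAD)
* Direct goto: transfer control to the compiled CODE object.
                                                :<NEWCODE>
BACK    OUTPUT = 'back in the original program' :(END)
BAD     OUTPUT = 'the generated text did not compile'
END
```

The generated statement begins with a blank because it carries no label, and it ends with an ordinary goto so that control returns to a label in the original program.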

I would be interested to know whether SNOBOL was the first language to be implemented in pseudo-ops for a virtual machine (obviously, it wasn't the last...).

-- CameronSmith


Years ago I wrote a fairly long program in Snobol, a macro processor for a long-extinct circuit description language. I enjoyed it, once I gave up trying to get a single expression to do too much. I feel that Perl [PerlLanguage] has now filled Snobol's niche with better control structures and better interaction with the operating system. -- RobertField


Years ago, in the QuestForThePerfectLanguage, I tried some interesting little programs in an IBM-PC version of Snobol4. There is one thing I remember well: gotos were the standard way of doing program control. Actually, as I remember, at the end of every statement there was an optional jump, which could happen or not according to certain conditions. Also, every instruction (line) had a guard that was a pattern expression, more like awk, if I remember both.
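For readers who have not seen the style, here is a small sketch of that control model (the names COUNT, LOOP and DONE are arbitrary). Every statement either succeeds or fails, and the optional goto field at the end of the line chooses the next statement accordingly:

```snobol
* Count the input lines that contain the string SNOBOL.
* :F(label) jumps when the statement fails, :S(label) when it
* succeeds, and :(label) jumps unconditionally.
        COUNT = 0
LOOP    LINE = INPUT                    :F(DONE)
        LINE 'SNOBOL'                   :F(LOOP)
        COUNT = COUNT + 1               :(LOOP)
DONE    OUTPUT = COUNT ' matching lines'
END
```

Reading INPUT fails at end of file, and the pattern match on the third line fails when the line does not contain the string, so the two failure gotos together form the loop and the filter.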

Fun it was, as useful as AwkLanguage or PerlLanguage are today, kinda stylish and baroque.

However, it is a good paradigm breaker. You have to think in a different way than in FortranLanguage.

I believe that Snobol, for a certain time, could be listed with CobolLanguage, FortranLanguage, LispLanguage, PrologLanguage and AlgolLanguage as different ways of thinking while programming.


In the early nineties, I was a strong advocate for SNOBOL for Natural Language processing. It was a compact language that could easily be learned in an afternoon, as it has very few command words. Its paradigms were useful for the language text processing we were doing, and we could get non-English speakers to use this language very quickly.

However, PerlLanguage came along and squashed our efforts in SNOBOL/SPITBOL. The compilers quickly overtook the speed of SpitbolLanguage and the new controls and functions made the upgrade to Perl a natural step. However, we noticed that our non-English speaking colleagues were having trouble in the transition and still to this day find the transition to Perl or PythonLanguage very painful.

The long and short of this is that we need to be careful to look at the whole picture in our computer language choices. Learnability is an important parameter which is hard to measure. However, when you compare the minimal instruction set of SNOBOL against the ever-growing list of punctuation and modules for Perl, significant differences in the learnability of the language become very important.

-- BobBatzinger


I used Snobol for my MSc project at the Institute for Computer Science London, in 1971. The program was called "surgery", and implemented various transformations on Context-Free grammars defined by Chomsky and Miller. It was ideal for the purpose, and opened my eyes to many programming issues and techniques which the more conventional languages of the time (like Fortran and Cobol) couldn't handle. It wasn't until I started to use Perl and Tcl many years later that I found the same freedom of expression. -- SusanJones


I have a permanent soft spot in my heart for SNOBOL, because the first computer program I ever wrote "professionally" (meaning that I was paid for writing it) was in SNOBOL. I was a Math/CS major, and a friend of mine was a Linguistics major who had a corpus of Old Saxon (see AngloSaxonLanguage) that she needed to analyze for certain orthographic and phonological phenomena (whose nature escapes me... this was over 25 years ago). She had transcribed the passages into the Roman alphabet and typed them into the computer, and my SNOBOL program helped her scan the text looking for words and phrases matching certain patterns. I couldn't just write a program that would scan the text and match it against a fixed set of patterns, because she needed to look for different things in different contexts at different stages of her research. So I had to write a program that would allow her to enter descriptions (in a form she could use, which ruled out SNOBOL syntax; otherwise she could have just written her own programs) from which I would assemble, at run time, SNOBOL pattern objects, which I then used to scan her corpus. Basically I "compiled" her entry form into patterns, then read text in a loop, matching patterns against the input, and printing out matching sentences. Yes, another complexity was that she had entered the text in lines of no more than 80 characters (so it could be punched, naturally!), but she wanted to scan and match sentences, not lines. So I had to slurp the text line-by-line and break it into sentences on the fly.

At the time, SNOBOL was the only language available to me that made this task possible. I wouldn't have even wanted to think about doing it in FORTRAN IV. Of course it would be trivial today in Perl, and even in AWK it would be possible (although the part about constructing patterns on the fly based on input would be a challenge in AWK, I think).
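The original program is long gone, but the idea of assembling a pattern object at run time can be sketched roughly as follows. The comma-separated SPEC format, the sample words and all names here are invented for illustration and are not the entry form described above:

```snobol
* Build an alternation pattern from a comma-separated list at run
* time, then print every input line matching any alternative.
* SPEC stands in for whatever description the user typed.
        SPEC = 'cat,dog,mouse' ','
        SPEC BREAK(',') . WORD ',' =            :F(END)
        PAT = WORD
MORE    SPEC BREAK(',') . WORD ',' =            :F(SCAN)
        PAT = PAT | WORD                        :(MORE)
SCAN    LINE = INPUT                            :F(END)
        LINE PAT                                :F(SCAN)
        OUTPUT = LINE                           :(SCAN)
END
```

Because patterns are ordinary values, PAT can be extended one alternative at a time inside a loop and then applied to each line like any pattern written directly in the program text.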

-- CameronSmith


I use SNOBOL all the time. I used it today to scan a bunch of CeeLanguage source files looking for possible declarations of an array with a constant 4. I have studied PERL and yes, I could have done the job in PERL but the RegularExpression syntax is just terrible. Here's the SNOBOL version.

 tab = char(9)
 w = '' | span( tab ' ' )

main_0 line = input :f(end)
 line '[' w 4 w ']' :f(main_0)
 output = line :(main_0)
end

-- SteveSnelgrove


Although SNOBOL has faded out of view, there are still a valiant few programmers who use it extensively. They are firmly of the opinion that the 'RegularExpressions' in PerlLanguage are a VERY poor second best to SNOBOL's much more powerful pattern-matching capabilities, which is what Steve Snelgrove seems to think. It is true that the control structures of SNOBOL are primitive, but the more you get to know SNOBOL the more you find that the pattern-matching facilities often substitute for control structures, making the latter less necessary than in many other languages. My own use of SNOBOL has been very limited, but I find for some purposes it is far more convenient than any other language I have come across. Even the pattern-matching facilities in IconLanguage and its descendants do not have the same smoothness and ease of use. I think it is a mistake to write of SNOBOL entirely in the past tense: it is still worth using.

Also, it is worth mentioning that some of the comments on this page do not apply to current versions of SNOBOL: for example, the days of integer-only arithmetic are long since gone.

-- Michael Davis


It was probably 1966 when I last touched SNOBOL. I first "met" it in the fall of '64 in my first programming course. All these years later (2001) when I started learning GNU/Linux & got serious about using RegularExpressions, I had a strong sense of [[http://blog.aplusreports.com/2010/06/04/deja-vu-phenomenon-disease-abnormality-or-providence/|deja vu]]. Of course a little 'net research revealed the genealogy of regexes, and I understood why.

Eventually I wrote a SNOBOL program for my Descriptive Linguistics course instead of a term paper. It was called "LINGEN" for "LINguistic GENerator", and its job was to generate random strings according to linguistic rules supplied by the user. The purpose was to allow the user to test his theories of linguistic structure by generating N random samples, and then inspecting them for "nonsense" vs. "garbage". My demonstration was an examination of "clause" structure. It had "mouse", "cheese", "cat", & "dog" for the nouns; and "chased", "ate", "caught" among the verbs. This led to results like:

 THE DOG CHASED THE CAT 
 THE MOUSE ATE THE CHEESE
 THE CAT ATE THE CHEESE
 THE CHEESE CHASED THE MOUSE
 THE CHEESE ATE THE MOUSE
not
 CAT THE DOG CHEESE
(btw, I am not yelling, just reminding you what output looked like back then.) You get the idea. However humorous, & I did pick the vocabulary with that in mind, the first 5 examples would be classed as "nonsense", while the 6th is "garbage".
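A generator in this spirit might look something like the sketch below. This is not LINGEN itself: the single hard-wired clause rule, the function names ROLL and PICK, and the constants in the integer-only pseudo-random generator are all assumptions made for the example.

```snobol
* Sketch of a LINGEN-style generator for clauses of the form
* THE noun verb THE noun.  All names and constants are illustrative.
* ROLL(N) returns a pseudo-random integer from 1 to N, using only
* integer arithmetic (a small linear congruential generator).
        DEFINE('ROLL(N)')                       :(ROLL_END)
ROLL    SEED = REMDR(SEED * 4093 + 1, 65536)
        ROLL = REMDR(SEED / 256, N) + 1         :(RETURN)
ROLL_END
* PICK(A) returns a random element of the one-dimensional array A.
        DEFINE('PICK(A)')                       :(PICK_END)
PICK    PICK = A<ROLL(PROTOTYPE(A))>            :(RETURN)
PICK_END
        NOUNS = ARRAY(4)
        NOUNS<1> = 'MOUSE'
        NOUNS<2> = 'CHEESE'
        NOUNS<3> = 'CAT'
        NOUNS<4> = 'DOG'
        VERBS = ARRAY(3)
        VERBS<1> = 'CHASED'
        VERBS<2> = 'ATE'
        VERBS<3> = 'CAUGHT'
* Print five sample clauses.
        I = 0
LOOP    I = LT(I, 5) I + 1                      :F(END)
        OUTPUT = 'THE ' PICK(NOUNS) ' ' PICK(VERBS) ' THE ' PICK(NOUNS) :(LOOP)
END
```

Since the rule fixes the word order, everything this sketch prints is at worst "nonsense" and never "garbage"; a real rule set supplied by the user could of course produce either.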

In the process I got account "LNG001" at the computer center; this may mean that I was the first kid on the block to think of using computers for linguistics.

I too had to design a syntax for the prospective user, to communicate his proposed ruleset to the computer without having to learn the whole SNOBOL language. My original goal was to get out of writing a full term paper, so I really didn't expect the program to be used again. But a year later, with both professors' permissions, I re-did it and again submitted it in lieu of a term paper; this time for "History of the English Language". What I find most interesting is that about 5 years later, the HotEL prof. published an article in the Alumni Magazine on computer generated poetry. The program description sounded identical to mine, but the article credited someone else with writing it. I have always wondered what really happened.

One of the other possibly ground-breaking features of SNOBOL3 was automatic defragmentation. I have no idea if this was a first for SNOBOL, but it may have been. Since it was working on a 36-bit word machine (six 6-bit characters per word), the designers never wrote over stored strings: if a string was changed, they just wrote the new version in contiguous (machine) words at the next available memory address and changed the pointer. When all their allotted memory was used up, they ran a "GarbageCollector" which rewrote all active strings at the beginning of the string memory space, i.e., defragmented string memory.

Automatically.

Can Microsoft do that yet?

Another "feature" was integer-only arithmetic. This made writing the LINGEN PseudoRandomNumberGenerator interesting...

Thanks for the chance to stroll down memory lane, I am another with a soft spot for SNOBOL.

-- Rick Archibald - 10 Jan 2004 (http://www.adamsinfoserv.com/twiki/bin/view.cgi/Main/RickArchibald)

Rick Archibald says above: 'Another "feature" was integer-only arithmetic'. Modern versions of SNOBOL have real arithmetic. -- Michael Davis


IIRC one exercise we did for a programming class involved rebinding operations on the fly, e.g., you could bind subtraction to addition. I was impressed for some twisted reason by that. -- pjl

How on earth did you do that? It beats me. Maybe it was only possible to do in earlier versions - certainly I can't see any way to do it in SNOBOL4. (It's the sort of thing you can do in ForthLanguage.) -- md

SNOBOL4 supports a primitive function OPSYN(). OPSYN() allows for the definition or redefinition of functions and operators. For example, # is normally not a defined operator, but can be defined in unary or binary forms: OPSYN('#', 'HASH', 1) defines # as a unary operator calling function HASH() -- fmgw
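A small sketch of that OPSYN() call in context follows. HASH is not a built-in; the body given to it here, which just returns the length of its argument, is a placeholder chosen only so that the example does something visible:

```snobol
* HASH is a hypothetical user function; this one simply returns
* the length of its argument.
        DEFINE('HASH(S)')                       :(HASH_END)
HASH    HASH = SIZE(S)                          :(RETURN)
HASH_END
* Make unary # a synonym for HASH, then apply it as an operator.
        OPSYN('#', 'HASH', 1)
        OUTPUT = #'SNOBOL'
END
```

The third argument selects the form being defined: 1 for a unary operator, as here, and 2 for a binary one.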


Even under LinuxOs it's the "king of small languages". I was born in 1971 (I'm younger than Snobol4), I'm really proficient in Perl/Awk/Shell - and for some sorts of things (logfiles, generators) there's virtually no other language like Snobol4 -- rs


I am one of a generation of Princeton alumni who had SpitbolLanguage in freshman programming. I didn't think of it for many years; in fact, I didn't become a programmer until much, much later. I don't understand why PerlLanguage is considered the language of choice for this type of string pattern matching. When cgi-scripts first started and everyone was starting their code with the same Perl script to parse the input, I was thinking how the ten lines of Perl equalled one line of Spitbol.

If I ever needed that type of parsing capability, I think I'd look at Spitbol again. I don't find Perl appealing. Unfortunately, when I last was faced with this, the Windows version was standalone only, while I needed a DLL.

-- Andrew Lazarus


To this day it makes me laugh when I think about this. It must have been my second year in college, 1976, studying maths at Trinity College Dublin. I reckon it was on the IBM360, because I was doing this as a batch job overnight, but it could have been the DEC-20. I got Griswold's book on Snobol and thought it was cool. After PL/C and PL/1 it certainly was. Anyway, I was playing "monkey see, monkey do" and ran one of the example program(me)s to generate poetry. Unfortunately I made a mistake in the transcription and the next morning when I went into the computer lab to pick up my printout they handed me a box of lineprinter paper. The stack must have been over a foot high. Instead of a short little poem there was a continuous stream of:

"Frogs or mice, no snow. Frogs or mice, no snow, no snow."

I got a bit of a ticking off if I recall correctly. We used the paper in the Maths Society room in college for the next couple of years. We used to paper the walls with it.

-- Simon Kenyon

You can't parse SGML or XML with Perl, because of recursion issues. You can with Snobol, because it doesn't have those issues. I can't help but think that there are Big Data applications for Snobol, since its pattern matching capabilities are so superior to, well, anything.


CategoryProgrammingLanguage


Last edited August 8, 2014.