Large And Small Languages

You often see statements like "C++ is a bad language, because it's just too big" or "CommonLisp is big and bloated; Scheme is simpler and more elegant". On the other hand, you also often see statements like "C++ lets you program in any paradigm you choose" or "Java has a huge range of helpful standard libraries".

So, with programming languages, is it true that SmallIsBeautiful or not?

Some helpful distinctions:

Size versus complexity. If there's enough regularity, size may not matter as much as you think. (Very small example: compare C++'s find_first_of and friends with C's strcspn and friends.) Usually it's not possible for a language to be large but simple; only its library, which brings us to ...

Language size versus library size. A small language can have a large library. It's generally more practical to learn a large library incrementally than to do the same with a large language. A large language can have a small library, too, though this isn't so common.

So, which is better? One possible answer (feel free to disagree; we can un-thread-ify the discussion later) is:

Large libraries are better, provided they are simple and regular. Would C++ be any better if its strings had find_first_of but not any of the other related functions? Would CommonLisp be any worse if it gained a standard, well designed, library for handling regular expressions? If your library is small, then you have to build facilities yourself. (Case in point: every major implementation of Scheme includes its own version of a bunch of the facilities found as standard in CommonLisp. Is the smallness of Scheme good when this happens?)

Size needn't matter much if you can afford to know only a subset. You can use C++ just like C. This is almost a proof that C++ isn't worse than C. There are two problems. Firstly, you need to be able to work with other people, and the subset they use may be different from the subset you use. Secondly, complexity isn't always well enough isolated. C++'s scoping rules are quite complicated, and you probably need to know most of them even if you're only using a subset of the language.

Small languages are better, provided they do what you need. This is mostly for psychological reasons; using a language is much more comfortable if you feel that you know almost all of it, and that there aren't surprises waiting to trip you up.

It's good to move size and complexity out of the language and into the libraries, when possible. Because then you can learn the language completely, and explore the libraries at whatever pace suits you best.

Of course none of this answers questions like "Is C better or worse than C++?" and "Is Scheme better or worse than CommonLisp". That's good, because those questions will always lead to flame wars even if someone finds a conclusive answer to them.

I like to consider not so much the size of the language but the size of the programs for which it scales well.

For example Perl5 scales well in the very small to the medium but not well on the very big. The lack of real functions/modules signatures hinders modular programming in Perl Also, having only dynamic typing implies overhead in space and time that is not acceptable in many contexts. Scaling in size would imply to have the choice between dynamic and static typing within one language. It seems that the next generation of scripting languages (Perl6, Python) will provide that choice. When that will happen, I expect they will displace application languages.

But scaling as a language quality can be seen as detrimental because one can wrongly infer that the style for one type of programming is the only style for the language. Perl5 is criticized because it supports Perl4 style. This brings us to the next kind of scaling.

One can also study how the language scales over time. Starting small and getting bigger. It has been said that a language must start small. Growing without major discontinuities guarantees to keep a mind share but results in ugly languages like Perl5 and C++. Also, like the example of Perl5 shows, one must be careful to keep people educated about the evolution of the language.

I've given up completely on the concept of some languages being bad and some being good. It depends on personal taste and purpose. Even badly designed languages like VisualBasic and PHP have a purpose.

Ideally, the language spec itself should cover only what can't be done effectively (& perhaps we should add, safely) in a library. (Anything that can make libraries more effective is, therefore, a good candidate for a language feature.) This is because writing, maintaining, & updating a library is easier than a compiler or interpreter.

(You now ask, "But, what is easier?" I think, however, that you know what I mean. Still, it could be worth expanding on at some point.)

Likewise, the general library standard should be allowed to evolve faster than the language standard.

Like software version numbers: x.y.z. Increment x when the language spec is changed. Increment Y when the "standard library" is changed. Increment z for bug fixes.

And, of course, there should be formal & de facto standards for libraries targeting specific purposes.