A KeyLanguageFeature is a feature in a programming language which, according to the BlubParadox in PaulGraham's BeatingTheAverages, has the following properties:
- Domain independent -- useful in most (if not all) problem domains; ConsideredHarmful in at most a few.
- If not present in a language, difficult or impossible to simulate without writing an interpreter for a (different) language that has the feature.
In several parts of the discussion below, people also assumed that the feature should be "essential", which is not actually part of the given definition of a KeyLanguageFeature.
The list of candidates for such features below is currently unsorted; as a possible refactoring it might be nice to sort it somehow. Note that the term "objects" refers to any datatype here and does not necessarily imply what is commonly referred to as ObjectOriented.
- Support for each of the basic Paradigms (Imperative, Functional, Constraint/Logic, Goal-based - see ThereAreExactlyThreeParadigms) - all of these are both DomainIndependent and subject to the BlubParadox. This actually covers a great number of the features listed, in duplicate, below.
- Tables support - ability to create tables/relations (or something more general) capable of associating data, in addition to support for relational operations between tables, constraints within and between tables, and automatic maintenance of appropriate indexing for the tables. Most languages don't have this; it is very much DomainIndependent and has proven extremely difficult to implement via a library (in any elegant, fast, bug-free manner) in languages that don't support it natively.
- Language Integrated Query support - ability to perform queries over tables, tree-structures, data (for LogicProgramming) and just about anything else. Automatic optimization support for these queries based upon indexes.
- Pattern Matching and Analysis - support for pattern-matching over and between values in the language and automatic extraction of sub-values for use when the pattern does match... ideally usable for such things as 'case' statements or expressions. If not part of the language by default, it is rare to see a language that supports more than the most primitive forms of pattern-matching (e.g. regexps). Ideally the pattern-matching ought to be advanced enough to casually write parsers for new languages (or support a library capable of such).
- Numbers - Useful in any domain that involves measurement or that is related to a domain that involves measurement, which means EVERY domain (with maybe a couple exceptions). Depending on the language, you might be able to implement numbers if you didn't already have them... but you'd need to have several other features instead (especially pattern-matching, structural aggregates, and sequential composition). Really, they are too convenient to go without.
- Functions. The ability to write functions, callable from any point in the program (subject to programmer limitations), with defined arguments and return value; with the requirement that upon completion of the function, control is transferred back to the caller.
- Conditional Execution. Ability to condition the execution of a portion of the program (or evaluate only part of an expression) based on a condition.
- Recursive definitions (of procedures, functions, etc.). Requires either a multiple-pass compiler/interpreter or prototypes of some sort (this isn't really a big issue these days, but I remember when HolyWars were waged over the merits of two-pass compilers).
- Recursive processes (arising from recursive definitions) which require one of:
- Enforced TailCallElimination, or
- LIFO Memory Allocation - the ability to allocate additional memory from the runtime system in a fashion which obeys a LastInFirstOut discipline: the most recent object/memory block allocated is the first freed. Suitable for implementation on a stack.
- Dynamic memory allocation (explicit or implicit). The ability to create additional "objects" at runtime which can have unbounded lifetime, subject to the limitations of system memory. In general, not suitable for stack-based implementation; requires what is commonly known as a heap.
- HigherOrderFunctions. The ability of a function to accept functions as arguments and return functions as results.
- LambdaExpressions. The ability to create a "new" function by binding arguments (supplied at runtime) with existing functions and logic.
- ExceptionHandling. The ability to signal a potentially parameterized fault and propagate flow of control back to a handler.
- ResumableException: The ability for an exception handler to propagate flow of control, potentially with parameters, back into the original operation or predetermined resumption points. Allows much greater modularization of error handling because the handler doesn't need to know as much about the implementation. Instead, it needs to determine only where it may resume and the general semantics of such resumption, usually in relation to the exception (e.g. catch(div_by_zero<int>) { resume continue_with<int>(0); }).
- LexicalClosures. The ability to create functions at runtime that "capture" (close over) bindings in the surrounding lexical environment. (Most commonly the captured bindings are of variables, but any class of thing whose names follow lexical scoping discipline is a candidate to be captured. For example, in CommonLisp closures can capture block names, allowing a closure passed down the stack to unwind back to the block.)
- Structural aggregates - the ability to create complex datatypes/objects out of simpler ones, by providing associative aggregates (think structs/records/a subset of classes), and to create multiple instances of the aggregate.
- Structured flow control. Structured conditions (if/then/elif/else) and loops (for/foreach/do/while). Can be implemented atop recursive definitions or CallWithCurrentContinuation, GoTo, and conditional execution if you have them in the right combination.
- Sequential aggregates - tuples and arrays, essentially.
- ReferenceSemantics and aliasing - the ability to have multiple references to an object, either implicitly (in a language such as Smalltalk or Java, in which all/most variables are references) or explicitly (via pointers in C/C++/Pascal, etc.)
- ParametricPolymorphism - the ability to define functions/types/whatever which are really functions mapping a "type" to something else. Templates in C++; pretty much any function in Lisp/Smalltalk.
- MultipleDispatch. The ability to dispatch on more than one argument to a function. If you don't have this feature you end up constructing a veritable zoo of various kinds of VisitorPattern (ranging from extrinsic to hierarchical to acyclic) just to hack around this limitation. Each time you implement these patterns, however, you're basically building a one-off dispatching framework that you'll be duplicating a few days later when you get to the next bit of complicated logic. (A CLOS sketch appears just after this list.)
- FirstClassTypes: Support for construction and communication of type-descriptors at runtime, along with their subsequent use when it comes time to validate TypeSafety. If EverythingIsa object, then types and classes are objects (MetaObjectProtocol). If EverythingIsa value, then types are values. This is extremely useful for MetaProgramming, and is of value whether one is using DynamicTyping, StaticTyping, or something in between (SoftTyping). However, it is worth noting that FirstClassTypes at runtime under StaticTyping will also require including the compiler in the runtime environment.
- CausallyReflectiveEnvironment - the environment includes a full dynamic model of its own dynamic machinery, such as processes, stack frames, dispatch tables, semaphores, and so on. Changes to this model change the behavior of the environment accordingly.
- IntrospectionAndReflection - the ability to discover at runtime the properties/attributes of an object (without having to give the object that capability explicitly).
- Modules. This is a basic software engineering requirement: divide software systems into cohesive units, define contracts for the modules, separate interface from implementation, allow independent evolution of caller and callee, etc. Most popular languages do not support this (yet), though the lack can be worked around painfully. Languages that support good modularity features: Ada, the Modula family, Standard ML. Java and C/C++ can be counted as having partial support. Ideally, modules should be FirstClass.
- AspectOrientedProgramming: or, more generally, the ability to describe CrossCuttingConcerns in one place, whether they be business rules or logging of messages, and automatically have these descriptions be leveraged at all other relevant points in the code. The fundamental capability to accomplish this has a categorical duality with modules - a direct inversion of dependencies. Instead of a value/function/class/etc. being described at one location and 'pulled' by clients that import it, client modules must be able to describe parts of a value/function/class/etc. and 'push' these parts to a common location, often across modules. Lesser forms of this feature include 'open' functions (add new pattern-matches to a function) and 'open' types (add new tagged unions to a data type).
- CompileTimeResolution: Formal support for performing communications to link arbitrary remote sources. In a 'dynamic' environment that possesses no distinct CompileTime, this might relate more to the ability to specify lazy one-time executions to be performed at need or when initially loading the environment.
- Automatic TypeInference. Essentially a precondition for TypefulProgramming and for internally supporting types more complex than are conveniently described by hand (e.g. structured monads, EffectTyping, etc.). Utterly impossible to implement in a language supporting only ManifestTyping; it clearly requires implementing an interpreter or compiler to acquire its benefits there. Languages with it include both dynamically typed languages (CommonLisp, SmalltalkLanguage) and statically typed languages which use TypeInference (MlLanguage and its successors, HaskellLanguage).
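As a hedged illustration of the MultipleDispatch item above (a sketch only; the class and generic-function names are invented for the example), CommonLisp's CLOS dispatches on the classes of all arguments:

  (defclass asteroid () ())
  (defclass spaceship () ())

  ;; DEFMETHOD specializes on *all* arguments, not just the first.
  (defgeneric collide (a b))
  (defmethod collide ((a asteroid) (b asteroid)) 'rocks-bounce)
  (defmethod collide ((a asteroid) (b spaceship)) 'ship-explodes)
  (defmethod collide ((a spaceship) (b asteroid)) 'ship-explodes)
  (defmethod collide ((a spaceship) (b spaceship)) 'both-explode)

  ;; (collide (make-instance 'spaceship) (make-instance 'asteroid))
  ;; => SHIP-EXPLODES -- no VisitorPattern machinery required.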
The maybe list:
- RealMacros (more generally, ExtensibleProgrammingLanguage). Lisp users swear by these (see comments below). Other languages do without. Certainly, they are one of the cool features of CommonLisp and SchemeLanguage. On the other hand, macros are a way of automatically expanding things that can already be done in the language; anything which can be done with a macro can be done by hand (though this violates OnceAndOnlyOnce -- which is why language-aware macros are here...)
- Support for Annotations: Formal support for including and processing annotations that aren't intended to be processed by the final interpreter or compiler (but might be processed by something else in the post-processor pipeline, or some back-end other than the compiler such as an IDE or LiterateProgramming, or plug-in type-checker or theorem prover or static verifier like lint or emily, or plug-in optimizers, or semantics extensions). More discussion available in HotComments. This class of extensions cannot be accomplished by RealMacros alone. (AnnotationMetadata doesn't support the sort of processing described here.)
- ExplicitManagementOfImplicitContext: A mechanism to manage the context in which programs, subprograms, threads, thunks, and continuations are operating that is consistent across the language, all language libraries and modules, and integrated with the core language services. Especially the ability to replace much of the 'global' context with such context-limited services. Especially useful for supporting security, modularity, and scripting (AlternateHardAndSoftLayers) in a multi-user environment like a WebServer or OperatingSystem or vehicle controller.
- DynamicallyScopedVariables (not by default, but as an option). The ability to create variable bindings with dynamic extent (they exist for as long as the binding form is on the stack) and indefinite scope (they can be referred to from anywhere). Moved to "maybe" because they are ConsideredHarmful in the ObjectCapabilityModel. Support for this is implied by ExplicitManagementOfImplicitContext. (A CommonLisp sketch appears just after this list.)
- FirstClassUndo: Ability to contextually 'undo' user-driven actions, especially those involving state-manipulation, at some point after they have already been committed, within the limits of what can logically be undone. Dealing with HumanComputerInteraction either directly or indirectly is part of almost any programming environment, and FirstClassUndo can provide great support to programmers involved in these fields. Further, even for internal-system coding, it can be useful in post-exception and error-handling contexts. Also relates to language features that perform backtracking, such as those seen in ConstraintLogicProgramming.
- PartialEvaluation: An optimization that, if given formal support within a language, makes practical a much greater degree of MetaProgramming. If not guaranteed, then programmers cannot count upon it.
- StaticAssert: Essentially DesignByContract + PartialEvaluation on steroids; would provide among the most powerful (and arbitrary) code proofs. Ideally allows for some flexible expressions.
- GarbageCollection. The ability of the system to recycle the storage allocated to objects which no longer contribute to the behavior of a program (or a useful subset thereof; such as those which are not linked to the root set.) Flawed implementations make it ConsideredHarmful in many domains.
- Encapsulation. Different access levels (public, private, etc.). A key feature of many languages, particularly those which support object orientation and modules, although PythonLanguage does not support this for either. It proves most valuable when programming in a team environment, where dictatorial control over how every programmer may be allowed to code his work proves impossible. For single-programmer or small-team projects, it potentially adds no value. Not quite as powerful or generic as a true, explicit SecurityModel, especially a CapabilityModel (which can provide far more capabilities than 'public/private' access levels), but much easier to implement without introducing runtime costs.
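A minimal sketch of the opt-in DynamicallyScopedVariables item above, using CommonLisp special variables (the *LOG-STREAM* and function names are invented for the example):

  (defvar *log-stream* *standard-output*)  ; declared dynamically scoped

  (defun log-line (text)
    ;; Sees whichever binding of *LOG-STREAM* is currently on the stack.
    (format *log-stream* "~a~%" text))

  (defun run-quietly (thunk)
    ;; The rebinding has dynamic extent and indefinite scope: visible from
    ;; anywhere while this LET is live, then it reverts automatically.
    (let ((*log-stream* (make-broadcast-stream)))  ; empty broadcast = null sink
      (funcall thunk)))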
Things explicitly not included; move these above if you think they should be included (and say why):
- I/O. Provided by the library in most languages; highly OS dependent in many cases. Typically not a key differentiator between languages.
- Lists. Can be done as a combination of sequential/structural aggregation and references/pointers. Likewise, other aggregate forms can be implemented as lists.
- GenericAssociativeCollections. Tables and Relations (ideal). Second-best: sets, maps, hashes, etc. Useful, but can generally be implemented in a library in a language which doesn't have these. (Should be part of the standard library, though... which really is just as much a part of the language as the keywords.)
- CallWithCurrentContinuation. Maybe; it's useful but I'm not entirely convinced it's essential. (It's also dangerous). Of course, if there was some way to limit the extent of the continuation...
- GoTo. Transfer control to an arbitrary point in the same function (a lesser form of CallWithCurrentContinuation, still dangerous). But might not qualify - often ConsideredHarmful.
- SingleAssignment/ReferentialTransparency. Powerful features for complexity management and optimization (especially amenable to PartialEvaluation, selective LazyEvaluation, and operation reordering). However, excluded above because these features aren't an 'end' in and of themselves... the optimizations they support are. Also, forcing the whole language to be 'pure' requires some counter-intuitive approaches to supporting concurrency, communications, and OS integration. Something like MercuryLanguage with its support for explicit and implicit 'purity' levels, or otherwise dividing functions from procedures within the language, might provide both the desired complexity control and the optimizations without the counter-intuitive IO monads. See KillMutableState for more on this view.
- Variables. Objects with mutable state. Assumes some method to mutate the state. Opposite view of SingleAssignment/ReferentialTransparency.
- PointerArithmetic. Used as a surrogate in low-level languages to implement (in an unclean fashion) the higher-level features found in higher-level languages. This also includes such other unsafe practices as pointers to stack objects and the like. While flexible, often ConsideredHarmful.
- UnsafeTypeCasts. Like pointer arithmetic, necessary in low-level languages to get around the restrictions of the language. Unnecessary in high-level languages.
- StaticTyping/DynamicTyping/SoftTyping. I believe that none of these is inherently better than the others; though certainly automatic TypeInference is better than being forced to use ManifestTyping at every place in the code.
- Preprocessors and Text-based Macros. (In other words, macros which operate on the text stream before it is parsed by the compiler/interpreter; ala C/C++) Some consider these evil; many languages have no such capability and do fine. Others consider these essential - usually to work around a deficiency or limitation in the language proper. (Part of the problem, I suppose, is that the C/C++ preprocessor is so god-awful...)
- Concurrency. While this is a good thing (and somewhat necessary given that the future of processors is in multiple processing cores), the jury is still out on the best way to implement concurrency. Several approaches have been tried. CoRoutines were one of the first, and they are still useful in some instances -- however, when used to implement non-preemptive multithreading, they don't scale to multiprocessor systems, and few production languages these days have generic coroutines as a language feature. PreemptiveMultithreading (multiple threads running in an address space, with context switching beyond the control of the language in most cases) is common in many languages, either as a language feature (Java) or as an OS extension (C/C++). It introduces a whole rats-nest of synchronization issues. It scales well to multiprocessors, but less well to distributed systems. Coroutines or other mechanisms used to implement CommunicatingSequentialProcesses and ActorsModel approaches look promising; they scale to any system (single-CPU, multiple-CPU, distributed) and avoid some of the synchronization problems of multithreading.
- DataflowProgramming, especially in combination with DataDeltaIsolation - essentially the ability to 'subscribe to' expressions written in the language and receive the altered values with a controllable latency. In normal programming, it seems there is only support for 'pulling' from data cells to evaluate the expression. Languages without support force you to jump through hoops to make this work: dig in and modify memory/variable services to add subscriptions, intelligently resist cyclic recursion, deal with inconvenient syntax (dataflow expressions would likely look extremely different from regular ones), etc. DataflowProgramming is of high value in realtime distributed applications, where the cost of polling expressions is too high and you want latency to be minimized. However, it is down in the 'excluded' list because DataflowProgramming essentially depends on Concurrency (which is in the 'excluded' list for some reason).
- Transactions - support for atomic manipulation of shared mutable state in the context of concurrent operations, useful whether that shared state is a filesystem or individual cells of memory shared between threads. As with concurrency, the jury is out on how best to perform it. Optimistic SoftwareTransactionalMemory looks promising.
- Linkage - ability to link high-level code to low-level code written in the correct tool for the low-level problem. I am being perfectly serious here: I have seen and worked on problems where the only solution was to use CeeLanguage or AssemblyLanguage with pointer hopping. One bad case actually required allocating 1.5 GB of RAM in one contiguous chunk and using it as a bitvector. It turned out that any other method would exceed the 4GB address space limitation of the processor in the worst case. AlternateHardAndSoftLayers can pay off. (But this is a more specific form of CompileTimeResolution, and doesn't need to be formal within the language; given good support for CompileTimeResolution and PartialEvaluation, this isn't necessary.)
- SubtypingAndSubsumption - the heart of ObjectOriented (part of it, anyway). The ability to specify that one type can substitute for another in any given context, and the ability to actually perform that substitution. Note that inheritance is a mechanism for this, not the only one. This is only really an issue for statically typed languages; dynamically typed languages get this for free. On the 'explicitly not included' list because ObjectOriented itself is not among the KeyLanguageFeatures, and it is strongly related to MultipleDispatch and AspectOrientedProgramming.
- Explicit SecurityModel - can provide far stronger, truer, more generic security than mere Encapsulation. Especially applicable within 'systems' programs that involve multiple users or multiple programmers. If one is going to have any security model, it is important to have a common, pervasive security model integrated with the language standard libraries and other shared program components; the alternative, having different security models for different components and needing to translate between them for each library, is such a hassle as to render the feature useless in every library. Which SecurityModel doesn't matter so much as that it is a provable security model, and is readily used in practice with minimal hassle (the easiest way to do things ought to be the most secure way to do things), so integration with ExplicitManagementOfImplicitContext is desirable (reducing the need to pass security-parameters around explicitly). CapabilitySecurityModels seem the most promising for this sort of integration (due, largely, to their inherent locality of internal reference, which is both highly subject to optimization and allows for distributed management of authority). ObjectCapabilityModel naturally applies as an extension to Encapsulation in OOP languages, whereas SimplePublicKeyInfrastructure (a PasswordCapabilityModel) or some modification thereof (one such model described in ExplicitManagementOfImplicitContext) might be useful in a distributed language with FirstClass processes (moving towards LanguagesAreOperatingSystems). While a good SecurityModel is certainly subject to the BlubParadox (being very difficult to implement, add, and especially integrate with a system where it doesn't already exist), it should probably remain with the concurrency support issues above: there simply isn't much need for this level of 'true' security before one has concurrency; any demand to diminish accidental coupling can be supported via the lesser Encapsulation.
Ok, I think somebody has an attitude about what a key language feature is. Considering that IO, Linkage, and GoTo are on the list of explicitly not included, while Reflection is on the must-have list, this list is rather off the wall.
You misunderstand the list. The categorization is NOT one of 'must have' and otherwise. It is a list of 'definitely is a KeyLanguageFeature' (in accordance to the definition at the top of this page) vs. 'useful, but either not subject to BlubParadox or not Domain Independent'.
No language will ever get off the ground without IO.
Perhaps. But plenty of languages have gotten off the ground without standardizing IO - i.e. without making IO part of the language specification. That said, I happen to be a distributed systems weenie, and I am all for putting IO and related features (pickling, moving executables, etc.) as KeyLanguageFeatures, but I'm willing to acknowledge that most problem domains really don't require it.
No language can grow beyond the vision of its creator without low-level Linkage available (hint: CompileTimeResolution will not get you this; it is irrelevant to the issue): it is the only way to access OS features that do not exist on the designer's platform.
Access to the OS is often well abstracted as a set of services, modules, or procedures in a language whether or not the language is interpreted vs. compiled. Where does 'linkage' as a language feature come into that? In any case, 'linkage' doesn't seem very domain independent as a language feature. I do agree that CompileTimeResolution won't help for OperatingSystem access; it helps more for CompileTime access to foreign resources (be they text files, remote databases, or ELF executables).
A few years ago I would have said unsafe casts are rare but mandatory. I now find that they can be subsumed into the core by adding a handful of core library functions that convert between scalar types and arrays of bytes without changing any bits.
If you think about the rule about writing an interpreter to simulate a different language, there is no getting out of the need to add linkage to assembly code, as sometimes you just cannot do without it. I really shouldn't have to monkeypatch some other process besides my own, but sometimes it is necessary.
Now, dependency information between the various features:
- "A implies B" means that if a language has feature A, then feature B comes "for free". Likewise, (A && B) implies C means if a language has both A and B, then feature C comes "for free" (for free means with minimal effort on the part of a programmer, and suitable for implementation in a library written in the language).
- "A requires B" means that if language has feature A, it MUST also have feature B. Likewise for the (A && B) requires C
RecursiveProcesses require Functions. Should be obvious.
RecursiveProcesses require RecursiveDefinitions. Also should be obvious.
RecursiveProcesses require (LIFO Memory Allocation or Enforced TailCallElimination).
Dynamic Memory Allocation implies LIFO Memory Allocation. If you have a heap, you can simulate a stack. Simple enough.
GarbageCollection implies Dynamic Memory Allocation. Doesn't make any sense without it.
LambdaExpressions requires HigherOrderFunctions. Should be obvious. Note that the converse is not true; C/C++ has HigherOrderFunctions if you count function pointers, but it doesn't have lambda expressions. See the next rule for why not.
LambdaExpressions requires GarbageCollection. It's often said that garbage collection is required for a FunctionalProgrammingLanguage (especially one with SideEffects; without side effects, reference counting will suffice). This is one reason why. The result of a LambdaExpression must almost always be heap-allocated, as it does not obey LIFO discipline. Furthermore, the usage patterns of lambda expressions (they get passed around like a cigarette) make manual memory management of these dang near impossible. For this reason, I claim that lambda expressions require GarbageCollection, as the sketch below illustrates.
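A minimal sketch of that claim in CommonLisp (MAKE-COUNTER is an invented name): the returned closure outlives the call that created it, so the binding of N cannot live on a LIFO stack and must eventually be reclaimed by the collector:

  (defun make-counter ()
    (let ((n 0))
      (lambda () (incf n))))  ; N escapes MAKE-COUNTER's activation record

  ;; (defparameter *c* (make-counter))
  ;; (funcall *c*) => 1
  ;; (funcall *c*) => 2  -- state persists long after MAKE-COUNTER returned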
Thread mode stuff
Regarding Preprocessors and Text Macros
This is an interesting entry given the genesis of this page. From a SmugLispWeenie's point of view, this is the poster child for the BlubParadox -- anyone who thinks that macros are an optional feature (let alone evil) is programming in Blub, where Blub is a language that either doesn't have macros at all (e.g. Java or Python) or has a horrendously broken thing that it happens to call macros (e.g. C's text/token macros). Lumping preprocessors and macros (in the CommonLisp sense) together is probably not going to lead to clarity.
Good point; I'll happily separate preprocessor text macros (which perform substitutions on the text before the bulk of the scanning and parsing is done) from language-aware semantic macros a la CommonLisp. However, an interesting question still remains: if you accept the viewpoint that macros operate "outside" the core language, they are often used to automate things that the core language cannot. Which leads to two viewpoints: 1) Macros reveal a deficiency in the core language; rather than using an "outside" mechanism, the core language should be modified to accommodate. This may be more C++ bias (BjarneStroustrup is known to despise the C preprocessor; many features in C++ were put in to eliminate a common preprocessor use). 2) Macros are a legitimate tool; and as they operate using the standard capabilities of the core, they allow the language to be extended in ways which do not compromise the core. Macros are a layer on top, and layering is good. Personally, I tend to fall in between - when I use a macro, I wish there were a better way. Perhaps this is again C/C++ bias; though I've yet to encounter a preprocessor whose semantics were completely clean and neat. -- ScottJohnson
Hmmm. I vote for 2) but I'm not sure I buy your definition of "outside the core". In CommonLisp, macros are part of the core which, by their presence, allow other things to be left out of the core. For instance, in CommonLisp most of the "control constructs" that programmers normally use are macros on top of more primitive control flow constructs. (For instance, all the structured looping constructs in CommonLisp are built on top of a primitive that is essentially the same as C's goto.) Thus those control constructs are not part of the "core" the way for and while loops are in C/Java/Python/Perl. (They are part of the language standard - most of the language standard is really specifying the standard library, not the language core.) Also perhaps worth noting: CommonLisp macros are not a preprocessor - they are more like a hook into the compiler itself. Basically whenever the compiler hits a "call" to a macro it passes the macro form to the macro code, which then returns a form that the compiler compiles in place of the original form (which might entail expanding macros that occur in the new form). If that doesn't seem clean and neat enough for you, I'd be curious what seems unclean or unneat about it. -- PeterSeibel
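To make that concrete, here is a hedged sketch (the WHILE macro is invented for illustration; CommonLisp's standard looping constructs are defined in the same spirit) of a structured loop built as a macro over the goto-like primitives TAGBODY and GO:

  (defmacro while (test &body body)
    (let ((start (gensym "START")))
      `(tagbody
          ,start
          (when ,test
            ,@body
            (go ,start)))))

  ;; MACROEXPAND-1 shows the "hook into the compiler" at work:
  ;; (macroexpand-1 '(while (< i 10) (incf i)))
  ;; => (TAGBODY #:START42 (WHEN (< I 10) (INCF I) (GO #:START42)))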
In Lisp and Scheme, "macros" are just ordinary code that happens to execute at compile time, and as such they are utterly different in nature than text macros in other languages. #2 is definitely the only one that applies, not #1, in Lisp-family languages. [Nitpick: Scheme has two macro systems, and in one of them macros are not just ordinary code.]
The word "macro" has always meant, as the core of its definition, "something that happens at compile time, not run time", which is why the word "macro" is used in both Lisp and e.g. C, even though the constructs thus referred to are otherwise unrelated. Mentally substitute the phrase "compile-time first-class function" for "Lisp macro" when in doubt.
Thus it is mostly uncontroversial in the C/C++ world that C/C++ macros are mostly a bad language feature (although essential in C and in a few contexts in C++), yet uncontroversial in the Lisp world that Lisp/Scheme macros are mostly a good and essential language feature (although susceptible to misuse, as with anything).
There is no contradiction; the "macros" in question are quite different things in the two language families.
-- DougMerritt
So that's what LispMacros are... ForthLanguage also has the notion of compile-time (immediate) vs. run-time semantics. One can define a Forth word to have either or both types of semantics. As with Lisp, it can be used to good effect to create domain specific languages and to extend the compiler. Also as with Lisp, all the flow control words in Forth are immediate words implemented using a couple of branching primitives. This allows the user to extend the Forth compiler with novel flow control constructs.
[Lisp Lisp Lisp. Macros Macros Macros. Blah blah blah. If macros and lisp are so great, let's see them implement TutorialDee or a similar query language right into the Lisp program using Macros. Good luck. And no, I don't mean just a bunch of brackets smashed together that kind of look like a half assed macaroni/nail clipping based relational language in oatmeal.]
TutorialDee could be done. SQL has been done. Admittedly, LispMacros don't offer considerable manipulation of syntax (Lisp has no real syntax) so it would look 'ugly' by some peoples' standards (and beautiful by others...). What's up with your ranting? Weenie FeatureEnvy?
Actually, I'm envious of Algol style syntax - and using Lisp macros I couldn't implement the most important feature I needed - Algol derivative syntax - which TutorialDee and Cee/Oberon style languages have. In other words, if I was using Lisp - I would be very envious of Algol style languages. If a weenie steps out of first person SmugLispWeenie view for a moment - he can see that all the snobbish arguments for lisp can be used against lisp. Its strength is its weakness. One can always fork a Lisp process from within an Algol program if they need Lisp, too.
It seems wrong to complain that a feature not meant to deliver some other feature you desire isn't delivering it. A bit like saying: "Tables suck! They don't give me secure communications over a network!" Admittedly, macros and syntax are somewhat more related, but they're still distinct facilities. Macros allow compile-time execution of code. Extensible syntax allows manipulation of the parser. Lisp has the former and lacks the latter. Either of them offer mechanisms for embedding DomainSpecificLanguages, and they combine in a rather powerful way, but they are distinct features.
It doesn't provide me with a DomainSpecificLanguage, it only provides more Lisp.
Ah, you must be promoting your strange idea that language = syntax and is independent of semantics.
The idea that it offers a DomainSpecificLanguage is actually just another way of saying it provides more Lisp. Providing domain specific languages can be done by forking processes in most languages - so I don't consider it a feature of Lisp to provide domain specific languages (think about it: since LispDoesNotProvideDomainSpecificLanguages, it just provides more Lisp). Forking an interpreter (such as forking a PHP interpreter from a Cee program, or forking a Python interpreter from a Cee program, or forking some PascalScript interpreter from a FreePascal program) provides more power. First, someone has already written the interpreter or domain specific language which I can fork and make use of. Second, these interpreters are actual languages - not just more Lisp on top of Lisp. They truly are domain specific languages (consider forking a TutorialDee compiler or interpreter).
Another domain specific language is Regex. Lisp is not the only language capable of extending itself - consider a regex interpreter built into an executable (a Cee or FreePascal program). Consider I wrote an SQL interpreter inside a program. SQL and regexes are truly domain specific languages - whereas Lisp on top of Lisp is just Lisp - it isn't a domain specific language. It's maybe domain specific Lisp.
SmugLispWeenies go on to argue that it takes too long to write languages that are forked or parsed. They argue that Lisp offers us this power of writing a language inside Lisp. But Lisp is just Lisp. It isn't a domain specific language - it's more like domain specific Lisp - which is the very flaw of Lisp, in that it does not create domain specific languages... but supposed domain specific Lisp (which forces one to program with odd functional syntax, which is not very domain specific). Forking interpreters that are already written and ready to go is much more domain specific. Parsing a string inside a program is more domain specific (parsing INI, parsing SQL, etc.). And don't think that one has to write his own SQL or INI interpreter - there are already plenty written and available as modules.
SmugLispWeenies are contradictory: the domain specific Lisp languages are not languages at all. They are just more Lisp on Lisp. However, if I write an SQL interpreter or Regex interpreter module for a Cee program or FreePascal program, this truly is domain specific. Actually, I don't have to write a domain specific language many times - there are plenty of existing domain specific languages available that can be forked (perl, awk, regex, php, an INI parser, pascalscript, JScript, a web language could be forked, BrainFuck could be forked, a compiler could be forked to launch a program right after compiling, etc.). I can reuse a domain specific INI, Regex, or SQL parsing module over and over again in any language that can fork a process or parse a string.
This is truly domain specific: being able to utilize INI syntax, regex syntax, SQL syntax, script syntax, right inside a program on a string (or on a file that is read). One could even fork a Lisp interpreter - if they needed - but I think rather a more domain specific language should be forked. See the irony: lisp isn't very domain specific. In other words, its strength is a weakness - and its claims are jokingly recursively contradictory.
The assertions found in the above italic paragraphs are bizarre - somehow forking an interpreter constitutes embedding, but a language whose syntax consists of EssExpressions isn't a DomainSpecificLanguage? I think PaulGraham would point to RTML as an example of a DomainSpecificLanguage. It's hard to deny that it's domain specific. Is it not a language because it's got Lisp syntax? Isn't Lisp a language?
Incidentally, contrary to the admissions made by the non-italic participant in the preceding discussion, CommonLisp does allow manipulating the parser, through the use of reader macros (a sketch follows below). It's just that it's far too painful to implement any language that isn't LL(0) that way, and, as execrable as Lisp's syntax is, it's hard to design something better that's LL(0), especially if you also want to be able to use macros to manipulate it. (I would ask defenders of Lisp syntax to consider why all-capitals is hard to read, and compare that to an all-round-bracket syntax.)
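For the curious, a minimal sketch of such a reader macro (the bracket syntax is invented for the example); it teaches the CommonLisp reader to parse [1 2 3] as a literal vector:

  ;; Make [a b c ...] read as a vector.
  (set-macro-character #\[
    (lambda (stream char)
      (declare (ignore char))
      (coerce (read-delimited-list #\] stream t) 'vector)))

  ;; Make ] a terminating character, like ).
  (set-macro-character #\] (get-macro-character #\)))

  ;; Now [1 2 3] reads as #(1 2 3).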
You are in a maze of twisty little parentheses, all alike.
Actually, I would argue that Lisp is (with the possible exception of syntax) the ultimate interpreted language. Yes, Lisp compilers exist, but Lisp's features can (as noted by Greenspun) be implemented in other languages by writing a Lisp interpreter. Whereas the features missing from Lisp (other than nice syntax) that make Lisp look a little Blubish to, say, SmugHaskellWeenies can, as far as I can see, only be implemented by writing a compiler (loosely speaking, that is - Haskell interpreters exist, but they do a compiler-like amount of static analysis at load-time).
I would agree that the italic rant is a bit weird. He talks about "just" forking processes, but ignores the overhead that simple forking requires; and I don't think he understands how difficult it is to mingle domain specific languages when you have to "fork out" to get those advantages. And, finally, he's assuming that there exist specific languages for your domains, and that if there isn't one, it's a simple matter of firing up your favorite parsing tools, creating a compiler (or interpreter) for your new domain, and writing your language, all in one go! What are you supposed to do if you don't want to go through that work -- or, for that matter, can't know the syntax of the yet-to-be-defined language, because you don't know what it's going to look like? Well, I would suppose that you can start with a simple, easy to manipulate and extend syntax, that already comes with an industrial-strength language behind it, and then gradually extend its syntax piecewise and experimentally, until you have a brand new domain specific language that can be optimized to near-C levels if necessary, and can be easily mixed with other existing (and even developing) domain specific languages where those domains intersect... or, I suppose, you could just write a compiler, and fork it...
Somehow, I have the feeling that if forking processes really is more convenient than creating new domain specific languages as needs arise, that SmugLispWeenies would have figured that out decades ago. But then again, perhaps they did...you could find C and FORTRAN compilers for Lisp Machines...but then yet again, perhaps that's just a fallback to the idea that "once something is written, you shouldn't rewrite it, but just let it be, in the language that it's originally written in, for all sorts of different reasons!" -- Alpheus
Re: LambdaExpressions:
(Is this different from the definition of LexicalClosures below?)
My distinction between the two is: LambdaExpressions might not have access to the referencing environment in which they were created, and can be used in languages which don't have nested lexical scoping. A C++ FunctorObject (ignoring the GarbageCollection issue noted above) could thus be considered a form of a LambdaExpression. LambdaExpressions need to be FirstClassObjects to be of much use. A JavaInnerClass object declared within a method is kind of like a LambdaExpression (it does have access to the environment, but only to variables declared final. In reality, the variables are copied into the object, and no reference is kept to the enclosing scope beyond creation).
A LexicalClosure, on the other hand, does have access to the referencing environment in which it is created (including the ability to modify that environment); but need not be first class. LexicalClosures aren't that tricky to implement until you try to have them last longer than their referencing environment; in which case they become a royal pain (they tend to require spaghetti/cactus stacks). FirstClass LexicalClosures are nice to have, obviously, as they generalize both concepts. However, they complicate the implementation of a language greatly. (See the sketch after this exchange.)
Of course, I may be all wet here.
-- ScottJohnson
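A small CommonLisp sketch of that distinction (MAKE-ACCOUNT is an invented name): both closures below capture the same binding of BALANCE and may mutate it, which a JavaInnerClass-style copy of a final variable cannot do:

  (defun make-account (balance)
    ;; Both lambdas close over the SAME binding of BALANCE --
    ;; a true LexicalClosure, not a copy taken at creation time.
    (values (lambda (amount) (incf balance amount))  ; deposit
            (lambda () balance)))                    ; inquire

  ;; (multiple-value-bind (deposit inquire) (make-account 100)
  ;;   (funcall deposit 50)
  ;;   (funcall inquire))  => 150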
See also QuestForThePerfectLanguage
AprilZeroEight
CategoryProgrammingLanguage