Semantic Semtex

This is my attempt to coin a new term. ... And my attempt at renaming a new term :P Originally called SyntacticSemtex, the problems described are mostly Semantic in nature.

SemanticSemtex is a language feature which, while appealing to experts and gurus for all the power that it gives (for producing efficient code, usually) is something that (like real Semtex) allows a novice user to blow his leg off.

Some programmers (and language architects) love SemanticSemtex, for the ease with which they can manipulate data (or even program structures). Others hate it; and seek to design languages where the semtex is carefully abstracted away.

Some types of semtex are low level structures that might be better wrapped up in a higher level structure, that always uses the semtex in a safe way. Others are HighLevelLanguage features that have demonstrable benefits, but a high cost (in implementation overhead or difficult-to-understand semantics).

Lots of things have been claimed by various folks to be SemanticSemtex; many of them have had a ConsideredHarmful paper written about them. Some of these things are (the author does not necessarily agree with all or any of these being on the list...)

GotoStatement?. In the original unstructured form (which allowed jumping between functions, and across other instruction boundaries), virtually everyone agrees today that the raw goto is too low-level of a programming construct for a high-level language. The one use of goto still considered somewhat kosher is breaking out of multiple layers of loops; kind of a poor-man's exception handling. [See Note 1.]
PointerArithmetic. C (and some C++) programmers love this; everyone else hates it. Certainly powerful; certainly dangerous in the hands of the uninitiated. The unsafe form used in C and C++ can be used to violate any language abstraction ever devised.
ManualMemoryDeallocation?. One which still is a bit controversial; though the folks in support of GarbageCollection seem to be winning the argument for more and more problem domains as time marches on (and memory and CPU cycles get cheaper). For many problem domains; GC still has issues so typing "free" (or "delete" or whatever) is still done.
MultipleInheritance. Another one which is still an open topic of debate. I suspect in the final analysis; languages will settle on something between full MI (ala CeePlusPlus or EiffelLanguage) and no MI whatsoever (ala SmallTalk or ObjectiveCee). JavaLanguage and CsharpLanguage, with partial MI implementations are strong indicators of this trend.
- Actually, ObjectiveCee does have "partial MI" via protocols, which are similar to Java's interfaces.
SideEffects. For many programmers, side effects are part of everyday life. For others, they are evil (WhatIsEvil?) and to be avoided at all costs. Certainly, it is easier for a compiler/optimizer to reason about software in the absence of side effects; but much real work requires them (or something that acts like them).
TypePunning. In CeePlusPlus, you can effectively sidestep the type system completetly, and do stuff that not even reintepret_cast will allow you to do, such as convert from function-pointer to floating point (There are very few valid reasons for doing this, but the language allows it). Here's an example:

  union Semtex
  {
float f;
void (*p)();
  }
  Semtex s;
  s.f = 4.000;
  s.p(); // you are now dead.

"the language allows it"? Neither CeeLanguage nor CeePlusPlus claims that the abuse of a union example above will do anything meaningful. Sure, it won't generate a compile error, but neither will division by zero. Perhaps "allows" is correct, but it seems quite misleading to me.

Separate namespaces for different types of objects, allowing sharing of names. CommonLisp separates functions from variables. PERL has separate namespaces for scalars, arrays and hashes.
CallWithCurrentContinuation
- Usually sugared away by exception handling, coroutines, generators and suchlike.
MetaObjectProtocol
- Sugared away by AspectOrientedProgramming languages/tools.
Standard language keywords/functions that can be redefined.
Implicit variables (like $_ in PERL)
Allowing redefinition of functions in run-time. Common for DynamicLanguages such as LispLanguage, JavaScript, and more.
DynamicScope

Lots of other types of Semtex (real and alleged) out there....

-- ScottJohnson

NOTES:

The use of goto in breaking out of deeply nested loops and such things is still a hack; if there exist conditions that would prevent an operation from succeeding then those conditions need to be tested before the deeply nested business gets under way. See GatedCommunityPattern.
- What? Let's say you have a large two-dimensional matrix, and you're doing some operation on it. And you have a condition that if there's any element of the result is zero, you're going to fail later on, so you don't need to do the rest of the processing. So you're proposing the correct thing to do is to iterate over the matrix twice, once to evaluate whether the second time will fail and set a flag, rather than just aborting the easy way by breaking out of the loop? That doesn't make sense, but I don't see any other way to interpret what you're saying... -- AdamBerger
  - Alternatively, consider the case where you want to know if something exists, but the only way to find it is with an exhaustive, recursive search. If the target exists then the simplest thing to do is to perform a longjmp out from deep inside the recursion. If it doesn't exist the search can terminate normally. In such a case there is no possibility of testing beforehand - the search is the test. To claim you can simply test beforehand shows a limited experience base.

See: OccamsDebugger

CategoryProgrammingLanguage