Terse Language Weenies

Terse Language Weenies are those who advocate terse syntax, as found in languages such as AplLanguage, JayLanguage, RubyLanguage, CeeLanguage, PythonLanguage, PerlLanguage (and PerlGolf), and QompLanguage.

The opposite of Terse Language Weenies are VerboseLanguageWeenies.


If the instance be, 'ten and thing to be multiplied by thing less ten,' then this is the same as 'if it were said thing and ten by thing less ten. You say, therefore, thing multiplied by thing is a square positive; and ten by thing is ten things positive; and minus ten by thing is ten things negative. You now remove the positive by the negative, then there only remains a square. Minus ten multiplied by ten is a hundred, to be subtracted from the square. This, therefore, altogether, is a square less a hundred dirhems. -- (Al-Khwarizmi)

Compare above verbosity to:

  (10+x)*(x-10) = (x+10)*(x-10) = x*x + 10*x - 10*x - 10*10 = x*x - 100
Languages save us time (and, once the language is well known, confusion), no matter how many arguments there are that terseness does not help one bit. Terseness, used the right way, adds precision to our field of work, and keeps attention higher across a screen of text (which is also a reason why short, to-the-point methods or procedures should be used wherever possible, instead of long, drawn-out ones).

That is not to say terseness cannot go too far. An example is a regex: once written, it is not so easy to tweak, compared to a char-by-char parser that analyzes text snippets one by one. This has more to do with the abilities and goals of regexes, though, and is not purely the fault of the syntax and notation. Rather, the fact that a regex isn't a full programming language itself is a reason it fails to scale for many situations (in other words, don't always blame the terseness of the notation; blame the high-levelness of the tool). Regexes wrap algorithms; they would still not work well for some situations even if they were verbose English instead of character symbols - so one can't just blame terseness; one has to look at the entire picture.
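To make the tradeoff concrete, here is a minimal sketch in Python (the task and names are invented purely for illustration): the same check written as a terse regex and as an explicit char-by-char scanner, whose individual steps can be tweaked or given error messages independently.

  import re

  # Terse: match an identifier followed by '=' and digits, e.g. 'x1=42'.
  TERSE = re.compile(r'^[A-Za-z]\w*=\d+$')

  # Verbose char-by-char scanner: longer, but each step is a separate,
  # individually modifiable piece of ordinary code.
  def scan(text):
      i, n = 0, len(text)
      if i == n or not text[i].isalpha():
          return False                      # must start with a letter
      while i < n and (text[i].isalnum() or text[i] == '_'):
          i += 1                            # consume the identifier
      if i == n or text[i] != '=':
          return False                      # expect '='
      i += 1
      return i < n and text[i:].isdigit()   # expect one or more digits

  assert bool(TERSE.match('x1=42')) == scan('x1=42') == True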

  *&@$@
  Asterisk Ampersand At symbol Dollar sign At symbol
A compromise may be:

 Ast Amp At Sym Dol At Sym
But those symbols spelled out in English don't mean anything other than English - and we do not want them to mean English, since we are notating a regex (a language whose symbols mean something to our brain other than words; they almost visually represent the parsing).

Even when written in English, a regex-like syntax is still the same maintainability-wise. It's the same stuff, just in English, and the Ast Amp At Sym style of notation doesn't offer many advantages in this case.

A regex does have different goals than other types of tools (i.e. full programming languages). One problem may be that people use regexes in cases where they could use a programming language. A lot of long-term maintenance scripts seem to use regexes quite often, and the extensibility of a regex is not great compared to a char-by-char parser. The problem starts when someone chooses regexes in the beginning and sticks with them, repairing them as the project grows larger. With time, regexes become a duct tape of hacks in large source bases, and grow out of control. The WikiPedia parser and the PHP Smarty source code are demonstrations of this.

AlternativesToRegularExpressions suggests that regexes could be reworked to be a bit more mnemonic.


From a discussion in AdaLanguage:

Remember that mathematical notation evolved during a time when paper and pen were the rule, and hand strain from hired (or slave-labor) scribes proved a significant limiting factor in transcribing texts. Math notation had to be concise then. Today, it needn't be nearly as concise; the limitations of years past no longer apply. There is nothing wrong with domain-specificity. There is something wrong with overt attempts to re-invent and subjugate human alphabetical or ideographic systems in the name of "brevity."

{Imagine if procedure brackets () were instead stated as "This Is The Start Of A Group Of Parameters" and "This Is The End Of A Group Of Parameters" respectively. Symbols do have their use! Words in a programming language are something we compose - and are very powerful.}

{But the fixed symbols such as assignment, plus, minus, beginning, ending, enclosing groups of parameters (brackets), etc. should indeed be brief. Why should they be brief? Because if you use "ADD_LEFT_TO_RIGHT" instead of "+" or "OPEN_OF_PARAMETER_GROUPING" instead of "(" or "BEGIN_OF_BLOCK" and "END_OF_BLOCK", it becomes obnoxiously hideous and inelegant.}

{The "is" keyword and the requirement to end each procedure in Ada with NeedlessRepetition of the identifier word get on people's nerves.}

{. . ., especially since procedures that are kept short can be seen on one screen anyway; there is no use for much of the Ada boilerplate syntax (I'm not even sure what 'boiler plate' refers to; it sounds like a BuzzPhrase)}

{but I assume it is something along the lines of "up-front ridiculous verbosity that appears to look like a verbose English contract but is really just a lot of fluff". The contract could be shorter and more concise, and just as correct - some Ada advocates act as if the reason Ada is safe and successful is mostly its verbosity}

{which is silly. It's more that Ada doesn't encourage dangerous structures such as PChars, pointers to pointers to pointers, etc.}

I see two problems with brevity versus verbosity: {Indeed, it is even suggested that beginners see (or use) languages such as Oberon or TutorialDee as an introduction. However, programmers (beyond beginners) need not put up with beginner and newbie notation and syntax. Programmers need a migration path to a language that lets the notation do the work, while beginners need a language which is more like English. An equation in math may be explained by an essay to a child, but an adult or teenager - and most children - should be able to read a precise equation that uses notation and symbols. The symbols and notation, however, shall be consistent - and not baroque and inconsistent. Perl and APL are examples of where letting baroque and complex symbols and notation do the work has gone wrong. Related: Leibniz's Dream}

I agree. You seem to propose having different languages for learning programming: simple, explicit, redundant ones for beginners, and terse, uniform ones for experienced developers. Or do you think it would be a good idea to have one language that is good for both? Or is that impossible?

It is tough to have a WinWin TheBestOfBothWorlds setup where we can have it both ways. Having multiple languages causes NeedlessRepetition, as does having multiple syntax choices - no matter whether two separate languages are chosen or one extensible language is chosen. I am guessing you are heading toward discussing extensible programming languages or domain-specific languages. One option is to have a compiler mode system, where one can switch the learner/newbie notation and syntax off.

The only problem with this is that you still need to read source code written by relative newbies, whose source files will be in "newbie mode". The author's original beef seems to be that verbosity is undesirable from a maintenance point of view, which I contend isn't true. I make a clear distinction between languages that optimize reading (typically verbose languages, provided it's not as excessive as CobolLanguage) versus writing (as exemplified by JayLanguage and, to a lesser extent, CeeLanguage/CeePlusPlus).

In order to realize this happy medium, we need a system where source files are stored in a purely symbolic notation -- roughly approximating IntentionalProgramming, so that the user's editor can be individually configured for program representation when viewing the pre-parsed source.

Well, one can use a tokenizer to convert the syntax to newbie mode; it is definitely possible. A syntax highlighter could replace b with a BEGIN symbol on the screen, or replace { and } with BEGIN and END. This is not a dream - it is easily doable. The question is whether it is really worth doing, and to what extent? The assignment operator (:= or =) could be replaced with the words SET TO or ASSIGN by a syntax highlighter too.
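As a minimal sketch of that idea in Python (the mapping and function names are invented for illustration), a display layer could swap terse symbols for verbose words while the file on disk keeps the terse form. Note that it naively substitutes inside strings and comments too, which hints at the context-sensitivity problem raised below.

  # Hypothetical display-only substitution: the stored source keeps the
  # terse symbols; only the on-screen rendering becomes verbose.
  import re

  DISPLAY_MAP = {'{': 'BEGIN', '}': 'END', ':=': 'SET TO'}

  def render_for_newbies(source):
      # Try longer symbols first so ':=' is not split into ':' and '='.
      pattern = '|'.join(re.escape(sym) for sym in
                         sorted(DISPLAY_MAP, key=len, reverse=True))
      return re.sub(pattern, lambda m: DISPLAY_MAP[m.group(0)], source)

  print(render_for_newbies('x := 1 { y := 2 }'))
  # -> x SET TO 1 BEGIN y SET TO 2 END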

[Don't be too hasty, there. I've experimented a great deal (on paper) with the extent of what syntax highlighters can get away with. Manipulating the actual symbol stream causes problems - among them: it teaches people to inject symbols that aren't actually part of the source-code, it requires an additional context-sensitive translation layer on the input stream (so that when someone inputs '}' in a particular context it becomes 'END' in the source), and it increases probability for language-recognition errors and thus divides a developer community even further (DomainSpecificLanguages and competing libraries are bad enough already).]

[As far as assignment operators go, I've some affection for <-. But set works well, too.]

I didn't intend a 'compiler mode system', but rather a syntax that has short and long forms (in the example above, '{' and 'begin' could be used interchangeably; of course, the syntax would have to ensure that this cannot lead to ambiguities in cases like 'begin' also being a valid identifier). Then, together with either a suitable editor (like the one proposed above) or a tool to convert between long and short forms (yes, difficult in the presence of line breaks and formatted comments, etc.), we have the required language. One nice extra would be type inference: experts need only state a few types, and beginners can add all of them. The conversion then removes/adds everything that can be inferred automatically.

Reason for a compiler mode: consistency and sanity. Having both adult and child modes in the same source file is very ambiguous, and leads to inconsistent, mixed-and-matched code. Conventions and style are up in the air. Plus, the compiler is much harder to maintain when different notations/syntaxes can be used interchangeably. In fact, it reminds me of Ruby's ambiguous loop and if-logic choices, or the C preprocessor. It is wise to pick one consistent way wherever possible (what's that Python saying? [there should be one - and preferably only one - obvious way to do it]). One consistent style per source file is (in some people's opinion) better.

However, the advantage of mixing and matching child syntax with adult syntax in the same hodgepodge would be immediate backwards compatibility. But that leads to parsing complexities and, IMO, horrible ambiguity. I've seen people use the C macro preprocessor to make their C code look like Pascal - and it is scary when they use C notation throughout half of the file and Pascal notation for the other half, or worse: two different scattered, mixed-and-matched styles in one file!

With compiler modes, if someone creates a module in newbie child syntax, it can be compiled (used) alongside an adult-styled module. They are compatible at the module level. Modules remain consistent throughout - which is superior, IMO, to inconsistent, unenforced modules! The compiler modes are not just a theory, by the way - it is actually how qomp currently works. As an insult to FPC, maybe "MODE FPC" will indeed be called "MODE NEWBIE".

I think notation inconsistency (combining a lot of ambiguous syntax into one module) is evil. Consistency within one source file is always preferred!

[I also share the opinion that consistency within a language is important - SymmetryOfLanguage is far more important to me than aiming for terseness. I suppose that is why I feel that, if you ARE going to have inconsistent syntax forms for the language, you at least ought to have a consistent means of making it inconsistent. My own suggestion in this vein would be to avoid 'hard coding' both long and short forms into a language; instead, make one form that is 'good-enough' (the 'standard' form) and then make the language extensible (as per ExtensibleProgrammingLanguage) such that people can add and tweak long and short forms of the language along with any desirable macros or DSLs (those being the main reasons for extension). If it is non-monotonically extensible, you can actually remove and combine and mix-and-match different language-forms (at least so long as you don't make things too ambiguous for the parser to handle) and even remove the 'standard' form and the ability to manipulate the syntax (e.g. creating a restricted language-form for students that teachers will be able to read and comprehend). The only potential problem with this approach is that conversions don't possess any obvious 1:1 correspondence between forms, so rather than converting the source for experts and newbies one would depend on an editor that has other nifty features like hover-text and colorings indicating how particular segments of code are being parsed - i.e. to help people who aren't experts in a given language-form or DSL understand how it is being processed and what things mean.]

What about Lisp and Ruby since they offer domain specific extensions (and tweaks) already? They are existing tools at our disposal right now. Would your suggestion be reinventing existing solutions that we have available already?

[Lisp does not offer syntactic extensions, only semantic ones. RealMacros does not imply syntactic extension. RealMacros are also spatially limited in their application, which prevents syntax manipulations on a broader scale. Ruby possesses a MetaObjectProtocol, but that also doesn't qualify as syntactic extension (i.e. the parser is unaffected). OperatorOverloading, polymorphism, creating domain-specific libraries of functions, etc. - none of these things determines notation. And while I wouldn't suggest reinventing solutions we have already... making existing solutions available? yes, that I'd recommend. Most of the research done on syntax extension from the seventies to the nineties hasn't yet been integrated into a modern programming language.]

As a partial Dijkstra and Wirth follower (they are not Gods, indeed), I think allowing too many extensible features into a language can cause readability problems, because programmers are not disciplined and they have egos (even artistic and creative abilities, which can be harmful).

[I think bad programmers can find enough rope to shoot themselves in the foot no matter which language they're using. And if you're too limiting, even good programmers will start cramming data and behavior into strings and external files that are even more opaque, less readable, more difficult to verify, secure, optimize, type-check, and debug, etc. Programmers should always have the power to do what they need without temptation to implement an incomplete, slow, buggy version of one language within another (or, worse, need to rely upon 3rd-party CodeGeneration). If doing so requires making easy the extensions and implementation of other languages, so be it: at least those other languages can take advantage of my language's optimizer, type-checking, unit-testing, debugging, etc. If I do it right, they can even leverage the syntax-highlighting. They won't be "slow" and "buggy" and they can fall back upon the root language to obtain "completeness" should a sub-language require it. It is not a bad decision, IMO, considering this automatically supports DSLs and domain-specific extensions which offer various other clarity advantages (reduced syntactic and semantic noise). My policy: Give well-integrated power to the programmers within the language or they'll repeatedly seek and re-invent poorly-integrated power from outside of it.]

Okay, so consider Rake, since people brag about Rake showing the advantages and benefits of a DSL (I don't consider it a DSL, though; it is more like a DSE): wouldn't it be just as useful to have a build system written in Ruby itself, and not in the "Rake Ruby" language? Consider FpMake or PowBuild?, which is written in FPC without being a new language or a new extension to the language!

How can people create an elegant build tool like FpMake, using their existing language, without any extensions? Why does Rake require domain-specific extensions when FpMake does not? Are we 100 percent sure that domain-specific languages are required in the many cases where they supposedly shine? FpMake is proof that one doesn't need to extend the language to make a build system... so I have my doubts.
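To make the 'no extensions needed' point concrete, here is a minimal sketch in Python rather than FPC (the file names and commands are invented; this is not FpMake's actual API): a build script using nothing but ordinary functions of the host language.

  # A plain build script: ordinary functions, no DSL, no macros.
  import shutil
  import subprocess

  def compile_sources():
      # Invoke the compiler; check=True stops the build on failure.
      subprocess.run(['fpc', 'main.pas'], check=True)

  def install():
      compile_sources()   # the plain call graph expresses dependencies
      shutil.copy('main', '/usr/local/bin/')

  if __name__ == '__main__':
      install()

The tradeoff, as noted further down, is that up-to-date checks and dependency ordering must then be written by hand or provided by a library.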

Even the procedural paradigm is a "domain-specific" ability of some languages - it allows us to quickly create make systems, since make systems are very procedural: build this, build that, delete this, move that, copy that. Supposedly the atrocious GnuMake is "logical", but from my experience GnuMake is horrid and ugly (even though it is a domain-specific language!).

[Technically, libraries and modules of functions and macros and classes and such are already 'extensions'. Adding to the set of meaningful words in a language is always an extension to that language; it's just we're so familiar with function and sub-procedures in modern languages that we don't typically think of them as 'extensions' - we have more specific names for them. From the perspective of programming in MachineCode or BefungeLanguage or certain regular expressions, which don't provide even that much, they're clearly extensions. But they aren't 'syntactic' extensions.]

[Regarding your 'procedural paradigm is "domain specific" because it helps you create make systems' argument, by that same logic the Sun is a "calculator-specific" energy source because it helps run my solar-powered calculator. Please avoid such obvious fallacies in your ranting; it does much to discredit everything else you say.]

Languages that offer the procedural paradigm are domain specific. I'll prove it. Consider Java: one cannot write a program such as the batch task below in a very terse, simple, easy-to-view syntax/notation. The non-procedural nature of many languages severely limits the language's domain. In a procedural language one can make a quick batch program such as this:

  use fileutil;

  pro moveFiles; b Clone('/tmp/*.tmp', '/foo/bar/'); Delete('/tmp/*.tmp'); e;

  b moveFiles(); e.
We wish to copy some files - the procedural domain makes this extremely easy. That is great for build/make systems, since deleting, moving, and performing quick operations on files is often required (no new class instances; short and simple).
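For comparison, roughly the same batch task sketched in Python (the paths are the same illustrative ones; this makes no claim about qomp's exact semantics):

  # Copy every /tmp/*.tmp into /foo/bar/, then delete the originals.
  import glob
  import os
  import shutil

  def move_files():
      for path in glob.glob('/tmp/*.tmp'):
          shutil.copy(path, '/foo/bar/')
          os.remove(path)

  move_files()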

Is Bash a shell-specific (domain-specific) language, or general purpose? The line can get murky. It can be both. Languages with procedural abilities assist in the domain of batch tasks and prototypes. Computers were first designed to do batch tasks - a domain-specific area (they weren't necessarily designed for GUIs).

[Sigh. You cannot logically demonstrate that procedural programming is 'domain specific' by showing it helps a particular programming domain. If that were the case then every KeyLanguageFeature would be 'domain specific' because such features help many different programming domains which always, necessarily, includes a variety of 'particular' programming domains. 'Domain specific' literally means specialized to a particular domain - i.e. designed for a particular domain, making assumptions or leveraging foreknowledge associated with a particular domain, optimized to a set of needs unique to a particular domain, etc. - usually all of these things at once. Your entire effort above is completely irrelevant if your goal is to prove that procedural is domain-specific. Similar to how you can prove the sun isn't a 'calculator-specific energy-source' by pointing out that it also feeds plants and warms the planet, you can prove that procedural isn't 'build/make'-specific by pointing out two or three distinct other domains it helps in, such as (for procedural) activity scripting, DSP and CGI.]

[In any case, I don't know enough about RakeMake or FPMake MakeTools to make any fully qualified statements about one or the other. What I understand about build systems in general is that dealing with update times (up-to-date checks), prerequisites and dependencies, cleanup, modes (e.g. debug vs. release), indicating where to search for files and where to place them, etc. are all important domain considerations that show up repeatedly. And so the expression of these issues needs to be optimized, even made implicit where defaults will serve - doing so makes the intent of the program clearer, which means you'll spend less time creating and repairing errors in the build itself and more time being productive. It is quite possible that a language could be non-intrusive enough to support all this using plain'ol functions and such without much overhead in getting the operations all glued together... but most general-purpose languages won't serve; they'll add undesirable syntactic noise for the task at hand. Based on a Rake tutorial (http://martinfowler.com/articles/rake.html), it seems that Rake takes advantage of Ruby symbols (':id'), blocks, and an object-constructor for a class called 'task' to automatically load task descriptions into some sort of central repository that can then formulate a partial ordering of activities (based on dependencies) for any given build request. However, it also seems that Rake could use some extra work in the handling of already-built prerequisites and up-to-date checks: it doesn't make the up-to-date checks nearly as easy or implicit as GnuMake. The advantage RakeMake offers over GnuMake is a great deal of expressive and semantic power within those blocks of code describing each task. At a glance, I can't tell whether FPMake provides dependency management and up-to-date checks, but I can tell that it seems to create a lot of syntactic overhead for the programmer.]
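As a rough sketch of that 'central repository plus partial ordering' model in Python (the names are invented; this is not Rake's actual implementation, and it omits cycle detection and up-to-date checks):

  # Minimal task registry with dependency ordering, in the spirit of Rake.
  TASKS = {}

  def task(name, deps=()):
      def register(fn):
          TASKS[name] = (deps, fn)
          return fn
      return register

  def build(name, done=None):
      done = set() if done is None else done
      if name in done:
          return                       # each task runs at most once
      deps, fn = TASKS[name]
      for dep in deps:                 # run prerequisites first
          build(dep, done)
      fn()
      done.add(name)

  @task('compile')
  def compile_step(): print('compiling')

  @task('test', deps=('compile',))
  def test_step(): print('testing')

  build('test')   # -> compiling, then testing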

On another note: it seems that a lot of the advantage of extending a language is about terseness. Make files, or make systems, are terse (one should not have to verbosely MemAlloc? a *char just to make something, which is why people don't use Cee for making projects). Mixing SQL "strings" into a language is verbose and messy; a built-in SQL would be terser. Extensions/integration often create terseness! A lot of people may not immediately see this relation - but indeed the entire point of a lot of "domain-specific" things is that they create terseness: quick, short, precise notations to use. When I say domain-specific things, I don't mean only new macro extensions - because, as I say, FpMake is domain specific, and it doesn't use "macros" or any fancy extensions at all. Even better: one can use FpMake or PowBuild? without learning a new language - because it isn't one!

[The whole reason for DSLs is to remove the messy semantic and syntactic overhead and optimize the language to the needs of a particular domain. Doing so is not only about terseness; it's just as much about clarity... but terseness is one big aspect of it. But keep in mind that this isn't (generally) the sort of terseness that comes from trimming symbols down to a minimum width; it's the sort of terseness that comes from the ability to make assumptions and use default policies based on foreknowledge of the domain, to cut away general-purpose 'fat' and boilerplate code, etc. I.e. it isn't the same sort of terseness that is described as desirable for TerseLanguageWeenies.]

[And with FpMake and RakeMake both you need to learn a new library/API/set of words which isn't quite the same as learning a new language, but is very similar (more like learning a new jargon). The main advantage of these 'internal' languages is that (a) you have the full power of the language at your fingertips when attempting to handle complex tasks, (b) you get to take advantage of the debugger, potentially a compiler and optimizer, testing system, syntax highlighting and IDEs, error handling and reporting, etc. that is already provided for the base language (i.e. no need to reinvent these things). In relative terms, GnuMake lacks most of these features (though some editors provide syntax highlighting for makefiles).]

Re: "The main advantage of these 'internal' languages"...

FpMake is not an internal language, though... it's just using the existing language.

[Sure it is. It adds new words, new interfaces, new protocols, and all the aspects of a new language and a new culture... all associated with the build-system. It's just as much an 'internal language' as any library API or jargon or field vernacular.]

Another domain-specific trick is to use the stack [Specific to which domain?] to reduce line noise (AntiCreation). My point is that not all domain-specific abilities stem from creating new languages - in fact, a language often has features within it that make domain-specific tasks easier, such as the stack instead of the heap, or the ability to use procedures instead of class instantiations. All these features help the language, even if only for the terseness that a stack, a procedure, or an imperative nature offers. A make system is indeed much easier to create in FPC or Ruby, because Ruby and FPC are much more terse than Java. It's not the only reason - but it plays a big role. The procedural paradigms offered in FPC and Ruby (Ruby uses global def's as its procedures) also play a key role.

[Features and languages are called 'general-purpose' when they help in a variety of 'specific' domains.]

Make files help with batch tasks, making, copying, installing - domains other than just making. Perhaps regexes are an example of a really domain-specific language (regexes are more of a notation than anything... are plain wildcards like *.* a language? WhatsaLanguage). Lo and behold, wildcards and regexes are terse - once again demonstrating the power (and danger) of terseness.
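As a small illustration of how much a terse wildcard compresses, Python's standard library can translate one into the equivalent regex (shown here purely for demonstration; exact output varies by version):

  # A five-character wildcard expands into a noticeably longer regex.
  import fnmatch
  print(fnmatch.translate('*.tmp'))   # e.g. (?s:.*\.tmp)\Z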


Interestingly, an example of a terse markup (arguably a partial language) that is extremely useful is this wiki markup on C2. Imagine if this wiki markup were more verbose - would we, as programmers, be as motivated to contribute? Sure, beginners may prefer a more verbose markup, but the terse wiki markup is indeed useful. In fact, even beginners may not prefer something as verbose as XML or BBCode for wiki markup.


See also: HackerLanguage, ExpressionApiComplaints


MayZeroEight

CategoryWeenie

