Argument By Elegance

This is the potential fallacy that if a language or concept is "elegant", then it is automatically better than the competition. I'll agree that all other things being equal, elegance is a desirable feature, but usually there are many other real-world factors involved. See WaterbedTheory.

Elegance is generally measured by "simplicity" of the math or model. If it has fewer "parts", then allegedly it's easier for both humans and machines to reason about, resulting in less mistakes or more efficiency.

Generally a conflict arises when building a DomainSpecificLanguage (DSL) or library. The domain view (or designer) may want features that "muck up" the "purity" of the elegant version, requiring additional steps are work-arounds. The DSL designer wants to remove these extra adapters or steps.

For example, the domain may use a lot of lists, but the collection system may depend on sets, which lack order and repetition. It may be argued that a "list engine" is less elegant and less efficient than a set engine and that the domain design should learn to live with adaption/translation from sets to lists and vice verse. The DSL designer may however prefer a list-centric engine regardless of the theoretical loss of some elegance, skeptical of the downsides of heavy usage of lists for this domain.

-top

It really wouldn't matter if the implementation engine used 'lists' under-the-hood so long as the users cannot come to depend upon that fact from their interface (the DSL).

One may have to reinvent a lot of list-oriented idioms if using a set engine to implement it. Sure, one can be made to look like the other with enough programming, but ideally we don't want to have to reinvent stuff.

There seems to be a misunderstanding. To clarify: your "set engine" and "list engine" phrases have no technical meaning to anyone other than yourself. It is trivial, and legitimate, to use lists to implement finite sets - and, so long as one doesn't expose this fact through the interface, it will not cause problems for later changing the underlying implementation to something other than lists.

In any case, if the domain needs lists, then use lists for those cases. Use sets where they are appropriate, such as managing data.

Interfaces may be fine for a specific thing in isolation, but re-inventing an entire engine to get a decent set of CollectionOrientedVerbs can be a lot of code and project complexity. Ideally we want to avoid impedance mismatches between domain structures and tool structures so that we don't have to build lots of translators and adapters. Using lists for sets is generally easier than using sets for lists in my experience, at least from an usage perspective. I cannot claim it would scale better for say a big-iron-type shop, but that may not be necessary for a given project. You seem to work in hardware-bottlenecked projects while I tend to work on human-labor-bottlenecked projects. Fitting the domain is thus a bigger concern than finding a structure with more potential machine optimization abilities.

I do quite a bit of work in both worlds. An example of a domain that fits list characteristics is dealing with waypoints in a route. Waypoints have explicit order, and each is associated with extra data (commands to execute on arrival or while on the trip to the next waypoint, for example). I would suggest that these waypoints be managed as data (a set of facts about a route) rather than as a list. Fundamentally, if something is true then stating it twice or stating it in a different order will make it no more true. By avoiding list operations (insert, foldl, foldr, reverse, append, etc.) you also avoid complexity. At the GUI level, operators want to manage and tweak waypoints and see sane names for them. Regardless of how it is managed, a variety of different views will need to be displayed to the operator: points with a poly-line, tables with a list of routes and waypoints with trees of commands under them, etc.

Based on your claims, it seems more like you have a single view in mind, and thus want list-like operations built into the data management from the start, whereas I recognize that dedicated view languages are necessary to deal with all the different views a human will need. I believe your appeal for pushing lists into the set operations to be short-sighted PrematureOptimization of one particular view at the expense of both human labor AND performance for all other views. You go on and on about making things easier or more practical for the human, but I've never felt that argument holds water.

Creating views/translation can be very time-consuming. It's a very large part of the programming I do: marshaling stuff in and out and back and forth between interfaces and languages. Thus, "JUST create views" is more than a "just" (JustIsaDangerousWord). Further, customers often don't know what they want.

All these are good motivations for having dedicated support for producing views of systems. My own work consists of the same, though I usually work with live data rather than batch processing. I have never been the one in our discussions to hand-wave away the value or cost of 'views' (unlike a certain TopMind who often argues X is Y because X can be viewed as Y, totally ignoring the cost and value and reality of said view).

Are we talking about "X is Y" in the head or in the machine? The machine may dictate a specific implementation, but that doesn't necessarily mean we have to think and communicate about a concept in terms of the machine's original/root representation. Sometimes thinking/communicating in "X is Y" can be a UsefulLie in some circumstances, but not others. If I misused a UsefulLie somewhere, I apologize. -t

If something can be turned into a list, then a bag, then a set, and back to a list with the least amount of fuss, then it can save a lot of time and effort as the customer learns by exposure what they really want.

I would much prefer a system that lets me view data simultaneously as a list, a tree, various graph views, and so on - than one that lets me change from one to another across multiple steps. I would also like all these views to be automatically associated with enough command-and-control options that I can issue updates or commands - i.e. without fighting the system or knowing precisely how the view was produced. I'm actually developing a new programming paradigm around these requirements: ReactiveDemandProgramming.

"Elegance" really just means "has a lot of nice properties". So, when someone claims 'elegance', you should stop them and ask for clarification on the precise properties or features they are examining.

Rather than 'elegance of expression', for example, one might aim to avoid various instances of the following properties:

semantic junk: is something that exists in the model, but that cannot be expressed in the syntax. For example, the Real numbers are mostly semantic junk: for every number that can be expressed, there is an infinite set of UnknowableNumbers that cannot be expressed.
semantic noise: exists when the syntax for expressing or using a model imposes more meaning than is necessary. For example, we can represent sets as lists, but doing so gives ordering properties to our sets, allows artificial distinctions between identical sets, and introduces possibility of duplicate elements. Or we could represent graphs as a pair of sets, but doing so introduces 'labels' for nodes that might make artificial distinctions between two semantically identical graphs.
- Just about all computer implementations of abstractions require unique addresses/ID's. Even a drawing of a graph uses some kind of pixel coordinate system and file path/name. The difference is what is "shown" to the viewer. If we show the viewer lines instead of address/ID's, were are not really getting rid of an identity system, just re-projecting it. -t
syntactic noise: is something that exists in the syntax, but that has no particular meaning in the model. For example, needing to add a semicolon to the end of each statement, or a comma between tuple elements, is 'syntactic noise' because it doesn't mean anything; it's just there to help the parser.
- There may be some WetWare issues here. For example, I've found that for me personally, commas between key-value pairs (x=1, y=2) usually make the association easier to read and see over using spaces, especially if syntax coloring is not available.
- What is the issue?
- Let me back up a bit. What do you mean by "no meaning to the model"? Domain model? Being useful to "the domain model" and being useful as a "brain marker" may be two different things.
- I mean it has no meaning to whatever is being expressed, even for general purpose programming. I'm not seeing how the 'brain marker' objection is relevant. I've already noted that the comma can help the parser.
- Do you mean computer parser or human "parser"?
- Both. A parser is a parser.
- I would like further clarification. It does "mean" something if it's used to indicate to humans and/or parser where something starts and ends. When you say "meaning in the model", you appear to be anthropomorphizing the model. -t

One can additionally add properties such as LiveProgramming (ability to edit, debug, and integrate without stopping everything); tolerance to changes in dependencies (related to CodeChangeImpactAnalysis); tolerance to disruption; ability to analyze real-time properties, information security, safety; suitability for parallelization, persistence, scalability, distribution, redundancy; which optimizations can be achieved without relying on clever or smart tools; and more.

In any case, the list of properties or features deemed 'elegant', or the avoidance of properties deemed 'inelegant', allows a more reasonable discussion of the system being studied.

I'd argue that "elegance" generally includes some form of conservation or reuse of concepts. It's rarely used to indicate feature quantity by itself. Lisp is considered "elegant", for example, because it does so much with the concept of nested lists. It reuses this concept over again for features that would be dedicated or use-specific and hard-wired in other languages. (It's also a case where elegance may not translate to practical benefits.)

Minimalism of concepts is not a sufficient condition for elegance (confer TuringTarpit). However, minimalism is a requirement for various other features: if primitives overlap in responsibility, they introduce semantic noise; if primitives do not cover all bases, then the system isn't general purpose; thus, any general purpose language design pursuing 'elegance' will have a minimal set of concepts to cover a lot of bases. Do not confuse 'feature quantity' with 'concept quantity'.

The boundary between "Feature" and "concept" can be overlapping. For example, is a "for" loop a feature or a concept? But I otherwise agree with your statement. As an analogy, we want to pack a tool bag with tools that can do a lot of jobs efficiently [and effectively], yet keep the weight of the bag down, and cost and complexity of the tools themselves down, and be relatively easy to learn to use. An "elegant" tool bag is one that appears to score well on all these when added up. However, some problems are that different people will assign different weights to each of these, or be concerned about how the toolkit works with their particular niche, the jack-of-all-trades-master-of-none syndrome.

[The concept is "iteration". The 'for' loop is a feature that implements iteration.]

There isn't much point arguing it, as TopMind was the first to use 'concept' in this page. I took him to mean 'language primitive concepts'. For 'feature' and 'property' I've been speaking more about KeyLanguageFeatures and useful engineering properties (interactive editing, ZeroButtonTesting, plug-in extensibility, portability, runtime upgrade, RealTime support and scheduling, synchronization of effectors (audio, video, robotic motion), concurrency control with resistance or immunity to deadlock and livelock and priority inversions, PersistentLanguage, memory safety, GarbageCollection, support for data-parallel and vector operations of the CPU, etc.). Thus, my understanding is that the for loop (assuming it is a keyword) would be a language primitive concept, whereas the primary feature it represents is iteration on a collection (as opposed to the collection-level operations you would see in APL or RelationalModel). --AnonymousDonor

I don't want to get stuck into a classification debate over what a for-loop "is". (And it's not necessarily "implementation" as claimed above.) But the most "elegant" solution would be to not hard-wire the for-loop into the language syntax and instead make it a library function. In other words, being able to "roll your own" loop and block constructs is the most "elegant" in terms of conceptual reuse, abstraction, and flexibility. (Perhaps it's more efficient performance-wise to hard-wire it, but let's stay away from that right now. Also, it may not necessarily be "iteration" either, as it may be implemented or optimized to avoid explicit looping under the hood.) -t

Quantity Versus Simplicity

For me, elegance is tied up with Occam's razor: have you achieved the goal with the minimum of entities/concepts/assumptions/keystrokes (as applicable to the context)? As such, it's highly likely that a programming language will be elegant in some domains and awful in others. APL springs to mind.

Simple code is not necessarily the same as compact code.

  // approach A:
  for i = 1 to 5 {
    foo(i, x)
  }
  //
  // approach B:
  foo(1, x)
  foo(2, x)
  foo(3, x)
  foo(4, x)
  foo(5, x)

I would say approach B is conceptually simpler than approach A because it requires grokking fewer concepts/idioms. However, it's less compact code-quantity-wise. There is thus a tradeoff of factors: fewest idioms versus smallest "code size".

The "overly educated" seem to usually assume approach A is "better". But in my experience, "excess abstraction" can often confuse other team members or future maintainers so much so that a simpler but more repetitious solution is often better for the organization in the medium and long run. (For this simple example, it doesn't matter much, but for more involved or complex designs, it does.)

By the way, my design decision between A or B would depend on a SimulationOfTheFuture to estimate which would be less maintenance down the road (approx 7 years). If the org is not likely to add or significantly change the quantities of foo, then I'd lean toward B, for example. If I saw the code samples but had no idea what the future change patterns are likely to be, I'm not certain I could make a reasonably-informed choice on that. (In this case it's trivial either way, but there are more complex variations of the concept.) -t

Here's a change scenario state that clearly favors B, by the way:

  foo(1, x)
  foo(2, x)
  foo(3, yyy)
  foo(5, x)  // note "order" change in first parameter
  foo(4, x)

I believe a lot of this is an economic issue: developers who grok a sufficient quantity of abstract coding to readily take up a given project are relatively rare. Plus, masters of abstraction often have poor people skills and are not likely to be domain experts. These things have roughly equal value to most organizations:

1. Technical skills
2. People/team skills
3. Domain knowledge

There is a short supply of people who do relatively well in all 3 (or at least a short supply of find-able matches). Thus, organizations will sacrifice some of #1 to balance the other two, which means not pushing the abstraction envelope so that a wider variety of personnel can take on the code. (Software-only shops will tend to value #2 and #3 less than other domains such that the ideal technique for one shop may not be for another.[1]) --top

Footnotes

[1] Personally I prefer being more of a generalist developer/analyst. Being a mostly heads-down coder gets tiresome to me. I like interaction with the end-users because of the direct feedback. Some of my best memories are praises from end-users who really liked certain design or UI features that help them get their work done smoother. Personally that is far more satisfying to me than money. But one must dig into domain issues and end-user viewpoints to obtain such solution "connections". -t

CategoryEvidence