Signatures And Soft Polymorphism

Question regarding your signatures in TypeSystemCategoriesInImperativeLanguages

   +(integer, string) returns string
If a parameter is designated "integer" in your model signature for a dynamic language, does that mean its type tag/indicator must be integer, or that the value/representation is "parsable as" (interpretable as) integer? Both variations may exist in various dynamic languages. A more thorough model would address and clarify that issue.

The notation used in the signatures is based on the assumption that a value is V=(R,T) where R is the value's representation (a bit string, typically) and T is the value's type (which is always "string" in the case of Category D2 languages). T is what I presume you mean by "type tag/indicator". Where values can be encoded within strings, they are treated as type "string". For example, the signature of a binary operator DateDiff that returns an integer encoded in a string and parses its string operands to identify encoded calendar dates within them would be written as "DateDiff(string, string) returns string".
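To make the V=(R,T) notation concrete, here is a minimal sketch (Python, purely illustrative; `Value` and `date_diff` are invented names, not taken from any actual language). Every value pairs a representation R with a type reference T, and a DateDiff-style operator in a Category D2 language both receives and returns values whose T is "string":

```python
# Illustrative sketch of the V=(R,T) value model.  In a Category D2
# language, T is always "string"; dates are merely encoded within R.
from collections import namedtuple
from datetime import date

Value = namedtuple("Value", ["R", "T"])

def date_diff(a, b):
    """Hypothetical DateDiff(string, string) returns string:
    parses calendar dates encoded in its string operands and
    returns the difference in days, encoded in a string."""
    def parse(v):
        y, m, d = (int(part) for part in v.R.split("-"))
        return date(y, m, d)
    days = (parse(b) - parse(a)).days
    return Value(str(days), "string")

print(date_diff(Value("2014-10-01", "string"),
                Value("2014-10-27", "string")))  # Value(R='26', T='string')
```

The point of the sketch is that the signature "DateDiff(string, string) returns string" describes only T; whatever is encoded inside R is the operator's private business.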

And another possibility is "voting" parameters where plus (for example) will consider the hard (tag) or soft (parsable) type of both parameters before making a decision. I've tried to enumerate all the possible combinations for two parameters and two types, but haven't been able to clean out the equivalence-related duplicates for presentation yet. (ThirtyFourThirtyFour has a sampler plate.) -t

That's conceivable, but I know of no popular imperative programming language that uses the "soft (parsable) type" for operator dispatch. There would be considerable performance overhead for doing so, because it would require parsing operands on every operator invocation. Implementing values as V=(R,T) means parsing occurs once (typically during LexicalAnalysis, when values are created from the literals that specify them) and operator dispatch can be done efficiently based on T, rather than requiring the overhead of repeatedly parsing R.
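A sketch of what tag-based dispatch looks like (Python; the dispatch table and function names are invented for illustration, not any specific language's internals). The implementation of "+" is selected purely by inspecting T; R is never parsed in order to choose it:

```python
# Values are (R, T) pairs.  Dispatch consults only T -- the type tag --
# so no parsing of R is needed at the point of operator selection.
def plus_int(a, b):
    return (str(int(a[0]) + int(b[0])), "integer")   # arithmetic addition

def plus_str(a, b):
    return (a[0] + b[0], "string")                   # string concatenation

# Dispatch table: signature (T of each operand) -> implementation.
PLUS = {
    ("integer", "integer"): plus_int,
    ("string", "string"): plus_str,
}

def plus(a, b):
    return PLUS[(a[1], b[1])](a, b)

print(plus(("2", "integer"), ("3", "integer")))  # ('5', 'integer')
print(plus(("2", "string"), ("3", "string")))    # ('23', 'string')
```

"Soft" dispatch, by contrast, would have to examine (parse) R on every call before the table lookup could even be made.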

Scriptish languages often don't worry much about efficiency. And an implementation can use the tag as a shortcut to avoid parsing a good portion of the time, but the result is the same AS IF parsing always occurred (only slower), such that we can simplify the model by saying it "always parses" and avoid conditionals in the model. I'm not going to assume the language was designed with efficiency as its primary signature-processing choice, for that would limit the possibilities that have to be considered. Partly for that reason, I don't like your signature approach. Plus, nobody's done a thorough survey of languages, so "soft polymorphism" cannot be ruled out. Further, such a language could say, "if speed is your goal, then make sure the tags (explicit types) are of the intended type to avoid a parsing step." Some of my earliest exposure to scriptish languages was for minicomputer OS scripting, and those were not designed for math and accounting, but rather for gluing the likes of FORTRAN or BASIC programs together and managing or automating system tasks. (Those languages were often still far better than the shit one finds with Windows.)

Perhaps upper-case can be used to indicate "hard" signatures (tag-only) and lower-case for "soft" signatures. ("Soft" meaning "interpretable as".)

 +(NUMBER, NUMBER) returns NUMBER
 // matches: 2 + 3
 // does NOT match "2" + "3"
 .
 +(number, number) returns NUMBER
 // matches: 2 + 3 and "2" + "3"
But I'm still not sure this is thorough and/or clear enough to cover the full gamut. The coding details are dumped onto the kit user without any help [sentence added later]. For one, the priority rules could get long or messy because soft types can overlap. I'm still trying to see if it can be fully regimented per the ThirtyFourThirtyFour example. -t
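The upper/lower-case convention above can be sketched as a signature matcher (Python; `matches` and `is_numeric` are invented helper names for illustration). "NUMBER" (hard) matches only when the operand's type tag is already number; "number" (soft) also matches a string whose representation parses as a number:

```python
# Operand values are (R, T) pairs.
def is_numeric(rep):
    """Soft check: is the representation parsable as a number?"""
    try:
        float(rep)
        return True
    except ValueError:
        return False

def matches(param, value):
    rep, tag = value
    if param == "NUMBER":      # hard: tag must already be number
        return tag == "number"
    if param == "number":      # soft: number tag, OR parsable as number
        return tag == "number" or is_numeric(rep)
    return False

assert matches("NUMBER", ("2", "number"))        # 2 + 3
assert not matches("NUMBER", ("2", "string"))    # "2" + "3" rejected by hard
assert matches("number", ("2", "string"))        # "2" + "3" accepted by soft
```

The overlap problem mentioned above shows up immediately: a value like ("2", "string") satisfies both a soft "number" parameter and a hard "STRING" parameter, so the model needs priority rules to decide which signature wins.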

Again, I've not seen a dynamically-typed imperative programming language that uses "soft" signatures for operator dispatch. I'm sure it could be done, though. The operator definitions would need some mechanism to identify what "soft" type their parameters expect.

Again, without a wider survey, it's difficult to say with certainty. Have you explicitly tested many? If you don't have a wide and thorough survey to rely on, I'd suggest it at least be covered in your "model" page as a disclaimer of some sort. I'm not comfortable at this point with the "hard polymorphism" assumption.

I am quite familiar with the TypeSystems of popular imperative programming languages and have explicitly and extensively studied them. Among C, C++, C#, Java, Javascript, Python, Perl, Ruby, VB.NET, VB, VBA, Objective-C and PHP, none perform operator dispatch on "soft" signatures, i.e., parsing string values at run-time to identify other values encoded within them, and then invoking operators based on the types of those values.

TutorialDee supports specialization-by-constraint, in which each type in an inheritance hierarchy is associated with a boolean constraint expression. When a value belonging to a particular type hierarchy is instantiated (or "selected", in TutorialDee parlance), the constraint expressions are evaluated to determine the most specific type to which the value belongs. A programmer can define constraint expressions that are based on parsing values. Since TutorialDee supports MultipleDispatch based on the most specific type of values, a programmer can use it to implement "soft" polymorphism as you've described it. I know of no popular imperative programming language that permits this. I know of no other language -- popular or otherwise -- that does this internally.

The closest approximation to "soft polymorphism" that is sometimes found "in the field" is the behaviour of certain operators in "string typed" (i.e., category D2 languages described in TypeSystemCategoriesInImperativeLanguages) languages, in which a few operators vary their behaviour depending on what values are encoded in their (always) string-typed operands. For example, a '+' operator -- whose operands are always a pair of strings -- may check to see if the operands encode numeric values, and perform addition if (and only if) they do. Otherwise, it performs string concatenation. Due to the performance overhead this represents -- the '+' operator has to parse its operands on every invocation -- its use is (appropriately) very limited. Generally, it's an order of magnitude more computationally efficient to associate type references with value representations once -- at the point where they're parsed in the source code -- rather than have to do it repeatedly by doing it at the point of each operator invocation.
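The behaviour of such a semantically overloaded '+' in a "string typed" (Category D2) language can be sketched as follows (Python, illustrative; `d2_plus` is an invented name, and the sketch handles only integer encodings):

```python
# Both operands are always strings.  The operator itself parses them on
# every invocation to decide between addition and concatenation.
def d2_plus(a, b):
    try:
        return str(int(a) + int(b))   # both encode integers: add
    except ValueError:
        return a + b                  # otherwise: concatenate

print(d2_plus("2", "3"))   # "5"
print(d2_plus("2", "x"))   # "2x"
```

Note that the parse attempt inside `d2_plus` happens at every call, which is exactly the per-invocation overhead described above.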

Regarding efficiency, technically that's not true. An "internal" type indicator/tag can be kept to enable processing shortcuts where appropriate. For example, a language parser could determine whether a given literal has quotes or not (i.e., whether it's numeric) at the load or pre-parsing stage (such as p-code generation). If it's numeric (non-quoted), then an internal tag can track that fact and associate it with a variable upon assignment from the literal. When an overloaded operation comes along, the tag can be checked and parsing skipped, because the interpreter already knows the "value" represented by the variable comes from a literal known to be numeric. But that's an implementation detail, and one may have no behavioral way (IoProfile) to tell whether such a tag is actually used. I use a programmer-detectable tag as the category separation criterion, not implementation tags, because language implementation can change, or new vendors may build interpreters that mirror the existing behavior, but not the implementation, of the existing specimens. It's like caching in a sense, in that it can be invisible to the programmer such that they don't "model" it in their head when reviewing code. An (internal) explicit type indicator/tag can improve efficiency without making its presence known in terms of I/O. As I prefer to describe it, a tag-based language acts like it has tags from a programmer's perspective, and a non-tag-based language acts like it has NO tag. If tags are "inside" merely to speed up implementation but don't otherwise change observable behavior (IoProfile), then it's okay to model them as not existing, if prediction is the goal. (Modelling speed is another matter.) -t

It was established early on that an IoProfile can't detect whether a "tag" is present or not. Remember the trivial C example? Your notion of an "internal type indicator/tag" is simply the conventional notion of a language with types, e.g., Category D1 on TypeSystemCategoriesInImperativeLanguages.

As for using an "internal type indicator/tag", once you do that you no longer have "soft polymorphism", you have conventional polymorphism.

Please clarify. I don't know what you mean. The programmer could NOT detect an internal-only tag any more than they could detect caching.

If you're dispatching on type, that's ad-hoc polymorphism. See http://en.wikipedia.org/wiki/Ad_hoc_polymorphism

If they can detect tags, the language may employ conventional polymorphism. If they can't detect tags, the language may employ conventional polymorphism. There is no language which parses strings prior to invoking a polymorphic function in order to determine which implementation of a polymorphic function should be invoked. Internally, a function may parse strings and alter its behaviour accordingly, but every language can do that. It's not a distinguishing characteristic.

Basically it's a rule that says:

1. If you already parsed a literal (value/representation) during code parsing
2. You copy that value/representation into a variable during a regular assignment
3. The variable has not been changed since (via a non-tracked event)
4. If you tracked the result of the first parsing (tag)
5. Then you don't have to parse it again.

But the programmer cannot see this take place nor detect it through non-speed-related experiments (unless there is a bug). Also note that if the programmer cannot see how the dispatching is implemented under the hood, then classifying it as OO polymorphism versus case statements versus something else may be moot. That's essentially an implementation detail.
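The rule above can be sketched as follows (Python; `Var`, `treat_as_number`, and the other names are invented for illustration). An internal, programmer-invisible tag caches the result of the first (load-time) parse so the runtime can skip re-parsing, without changing observable behavior:

```python
class Var:
    def __init__(self, literal, quoted):
        self.rep = literal
        # Steps 1 and 4: the load-time parse already classified the
        # literal; the tag records that fact.  Internal only.
        self.numeric_tag = not quoted

def is_numeric_slow(rep):
    """The full runtime parse the tag lets us skip."""
    try:
        float(rep)
        return True
    except ValueError:
        return False

def treat_as_number(v):
    # Step 5: if the tag was tracked (and the variable unchanged),
    # skip the parse; otherwise fall back to parsing as usual.
    if v.numeric_tag:
        return True
    return is_numeric_slow(v.rep)

x = Var("42", quoted=False)   # source literal: 42
y = Var("42", quoted=True)    # source literal: "42"
assert treat_as_number(x) and treat_as_number(y)  # same observable result
```

Both variables produce the same answer; only the speed differs, which is why the tag can be modelled as not existing if prediction (rather than performance) is the goal.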

Who said anything about OO?

What exactly do you mean by "conventional polymorphism"?

See http://en.wikipedia.org/wiki/Ad_hoc_polymorphism which is "a kind of polymorphism in which polymorphic functions can be applied to arguments of different types, because a polymorphic function can denote a number of distinct and potentially heterogeneous implementations depending on the type of argument(s) to which it is applied."

Again, that's an implementation detail (and OO-specific).

It's a characteristic of certain programming languages, typically referring to a facility that can be exploited by the programmer. A '+' operator that supports concatenation and addition may be described as a semantically overloaded operator, but the language does not necessarily support ad hoc polymorphism. It is common to many OO languages, but not unique to them. In particular, it is not dependent on "objects" in the usual OO sense, only on types.

There are multiple ways to skin the [insert preferred animal]. Note that I consider "soft" polymorphism to involve (at least) parse-based (or equivalent) examination of the "representation" to determine how to process the arguments. Perhaps "polymorphism" is a confusing word choice, but I don't want to dwell on vocab. The "selection" (dispatching) can potentially be based on the explicit type indicator(s) and/or the value/representation, per specific language and/or operator. For an illustration, contrast lines 4837 and 94838 in TopsTagModelTwo. Whether OO polymorphism or case statements or HOFs are used inside the interpreter is a swappable implementation detail. The outsider (programmer) can only observe the patterns: "Tag and value combo X results in pattern A, combo Y results in pattern B," etc. The explicit dispatching mechanism to do this combo "lookup" is an interpreter/model implementation detail.

There is no popular dynamically-typed programming language that performs operator dispatch based on parsing values. Some, however, have semantically overloaded operators.

TypeHandlingGrid discusses why I dismiss the "popularity" issue.

As has been pointed out, your dismissal is weak. If you're going to include unpopular languages, why not incorporate (say) predicate dispatch, instead of something obscure like "soft polymorphism"?

Somewhere else, I forgot where, I pointed out that in the field there is a pretty good chance one may end up having to use an embedded or bundled language, often a "scripting" language, that comes with a specialized product. For example, an automated telephone answering system (voice menus etc.) may come with its own embedded scripting language to code phone response logic. I've encountered at least a half-dozen such embedded/bundled languages over the years. Most of them were AlgolFamily-styled languages and/or had domain-specialized keywords, and usually didn't try to introduce "fancy" paradigms etc., because they are not marketing to experienced programmers only, but rather to a wide variety of IT shops with perhaps limited programming experience. They are selling primarily a domain product, not a programming tool, and don't want to hurt their market by using "high brow" techniques unnecessarily. (I didn't analyze the specific type characteristics of such languages in detail.) -t

Do you know of any language -- popular or obscure, embedded or general-purpose -- other than TutorialDee, that supports soft polymorphism? I.e., that parses operands to determine what type they encode, prior to operator dispatch, as part of the general operator dispatch mechanism?

There are numerous languages that overload certain operators -- most notably (and typically) "+" for both string concatenation and numeric addition -- but these do not rely on parsing operands prior to operator dispatch as part of the general operator dispatch mechanism. In every example I've seen where operands have to be parsed to determine whether "+" should be string concatenation or numeric addition, it's done by the "+" operator itself.

Now, you could argue that the "+" operator itself is a specialist dispatch mechanism that delegates to string concatenation or numeric addition, but that's covered by TypeSystemCategoriesInImperativeLanguages in the part that states:

"Some languages do not distinguish operand types outside of operators and treat all values as strings, so the only signature (for "+") is effectively:
 +(string, string) returns string
In such languages, when "+" is invoked it internally attempts to convert its operands to numeric values. If successful, the operator performs addition and returns a string containing only digits. If the conversion to numeric values is unsuccessful, the operator performs string concatenation on the operands and returns the result."


Something like TypeHandlingGrid may be more flexible and compact for describing combinations of hard and soft polymorphism. Language usage frequency is also discussed.


(last edited October 27, 2014)