Types Server Two Purposes

TypesServerTwoPurposes.

[Did you mean "serve"?]

Historically these have been lumped together. Lisp, for example, provides a test for "numberness" which just checks the representation chosen internally as an optimization. This allows programmers and optimizers to specialize their code.

I believe part of this is that our CPU's are biased toward explicit typing, as found in C/C++ in particular. If they had built-in dynamic-typing operations, the efficiency advantage might be smaller.

{What's "explicit typing"? Do you mean manifest typing? Static typing? In what way are "CPU's" [sic] biased toward "explicit typing"?}

For one, they don't natively process strings; in particular, they can't do math on numbers stored as strings.

{Some processors do natively process strings, such as the VAX processors. I'm sure there are others. Processors do tend to be oriented toward their primary uses, though, so doing math on numbers stored as strings (which is highly questionable at a processor level) is unlikely to be a priority, especially since the performance gains are likely to be minimal. Determining whether a string is numeric or not is typically O(n) wherever you do it.}

This seems to be dismissing the "efficiency" claim above. However, O(n) still "matters" in practice if there is a large constant factor attached to the actual performance.

{Why do you think my comment dismisses the "efficiency" claim above? I have not contradicted the fact that using optimised representations for specific types does result in significant performance advantages.}

{Your comment regarding O(n) does not make sense to me.}

When one does BigOh comparisons, one discards constants. However, there may be a big constant influencing the practical application of the above. Maybe it's, say, 5 times slower to do math on string-based numbers, for example. That could have a practical impact if a given program does a lot of such operations.

{You apparently do not understand BigOh. One discards the constant not because it is significant -- i.e., a "big constant" -- but because it is insignificant. What is significant is 'n'; in this case the length of the string.}

But we are not comparing general string processing. We are comparing the processing of "numeric" strings to the processing of dedicated "types" (such as "double"). A typical numeric string will be about 7 characters. For discussion purposes, let's look at the case where the average is 7, and then the case where it is 12. There's no reason why a CPU has to process such strings one character at a time. I realize there may be a DiscontinuitySpike if the string size goes past a certain length, but the frequency of such cases will vary, or can be mitigated by balancing the various tradeoffs.

Let's say there's a single machine operation that reads two strings of up to 16 characters each from given RAM addresses, performs addition, subtraction, multiplication, or division on them, and puts the result (up to 16 characters) in a register. (If a result is longer than 16 characters, then other techniques would perhaps be used. Division may need a second result register.)
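
For concreteness, here is a minimal C sketch of what such a hypothetical operation would have to do if emulated in software today. The function name, the 16-character limit, and the use of a 64-bit integer are assumptions taken from the paragraph above, not features of any real instruction set:

 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>

 /* Software emulation of the hypothetical "string ALU" operation:
    read two decimal strings of up to 16 characters, apply the requested
    operator, and write the result (up to 16 characters) into 'result'.
    Returns 0 on success, -1 on bad input or an over-long result.
    (Overflow handling and the second register for division remainders
    are omitted in this sketch; assumes a 64-bit 'long'.) */
 int string_alu(const char *a, const char *b, char op, char result[17])
 {
     char *enda, *endb;
     long x, y, r;

     if (strlen(a) > 16 || strlen(b) > 16) return -1;  /* other techniques apply */
     x = strtol(a, &enda, 10);
     y = strtol(b, &endb, 10);
     if (*enda != '\0' || *endb != '\0') return -1;    /* operand is not numeric */

     switch (op) {
     case '+': r = x + y; break;
     case '-': r = x - y; break;
     case '*': r = x * y; break;
     case '/': if (y == 0) return -1; r = x / y; break;
     default:  return -1;
     }
     return snprintf(result, 17, "%ld", r) < 17 ? 0 : -1;
 }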

{Determination of whether a given string is a number or not is essentially one character at a time. For example, to determine that "1234567" is numeric but "123456a" is not requires iterating over seven characters. Some clever optimisations may be possible, but these will still require n iterations where n is proportional to the length of the string.}
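
A minimal C sketch of that numericness test (the helper name is illustrative only). The loop visits each character once, which is where the n iterations come from:

 #include <ctype.h>

 /* Returns 1 if 's' is an optionally signed run of decimal digits,
    0 otherwise.  Each character is examined once, so the cost is
    proportional to the string length. */
 int is_numeric(const char *s)
 {
     if (*s == '+' || *s == '-') s++;      /* optional sign */
     if (*s == '\0') return 0;             /* empty or sign-only string */
     for (; *s != '\0'; s++)
         if (!isdigit((unsigned char)*s))
             return 0;                     /* rejects "123456a" at the 'a' */
     return 1;                             /* accepts "1234567" after 7 checks */
 }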

BigOh tells you the efficiency in the limit. But this is completely irrelevant here, because all the numbers are small and the constant factor surely dominates in this range. Anyway, you seem to be in ViolentAgreement that the CPU could do numeric string processing. Note that x86 still has BCD support, but nobody uses it, so it is not optimized -- quite different from vector operations of limited precision, which are heavily used...

[Actually, BigOh is quite relevant. The overall average computational efficiency difference between a scheme that internally represents values as their actual type, vs a scheme that requires string-to-numeric conversions on each operator invocation because all values are strings, is O(1) vs O(n). Whether "all numbers are small" (?!) or not is irrelevant. Optimisations of various sorts may improve the real efficiency, of course. One obvious optimisation is simply to -- at least at the machine level -- internally represent types, where appropriate, as native values rather than represent all values as character strings. High level languages can hide this, and Top's desire for a language in which all types are strings -- or at least can be viewed that way -- is not unreasonable. ToolCommandLanguage does this.]
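
To make the two schemes concrete, here is a hedged C sketch of the same addition done both ways. The function names are illustrative; the string version stands in for "all values are strings", where every operator invocation pays for a parse and a re-format:

 #include <stdio.h>
 #include <stdlib.h>

 /* Scheme A: values carried in a native representation.
    One machine add; the cost does not grow with the number of digits. */
 double add_native(double x, double y)
 {
     return x + y;
 }

 /* Scheme B: all values carried as strings.  Each operator invocation
    must parse both operands and format the result, each step
    proportional to the number of characters involved. */
 void add_as_strings(const char *a, const char *b, char *out, size_t outsz)
 {
     double x = strtod(a, NULL);            /* parse: O(len(a)) */
     double y = strtod(b, NULL);            /* parse: O(len(b)) */
     snprintf(out, outsz, "%.17g", x + y);  /* format: O(length of result) */
 }

Timing a loop over each version is a crude way to estimate the constant ratio argued about below, without settling the asymptotic question either way.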

BigOh notation mostly matters when comparing differences of multiple orders of magnitude. I will agree that "native" types such as doubles will probably always be faster, but a chip that has number-string-friendly operations can greatly reduce the difference. Most CPU's don't, making the difference fairly large. For example, maybe the difference now is 100-to-1 on average, whereas it might be, say, 5-to-1 if the chip had native support for numeric strings. Thus, my statement that current chips are biased toward certain value models still stands. -t

[The biases are entirely reasonable, but feel free to buy a cheap FPGA board and implement your dynamic-typing-favourable CPU to show how it would be better.]

Why are they "reasonable"? I see it as a case of LeftHandedSyndrome. If many languages used the dynamic model, then chip makers might give them more attention. And no, I'm not going to build a fucking CPU in my garage.

[They are reasonable because TypeSafety is generally critical at the OperatingSystem level, even if not critical at various user levels. Building an experimental CPU using an FPGA is not unreasonable. The experimenter boards are cheap -- usually less than the cost of a mediocre laptop -- and it's a popular activity amongst hobbyists and in university-level courses. They're programmed using languages like VHDL.]

Dynamic languages are not intended for SystemsSoftware. Thus, your example is a straw man. And it's not Intel's job to dictate what programming languages people should use. If BrainFsck becomes common, then the chip makers should take that into consideration. They are hardware experts, not software engineering (SE) experts. Anyhow, I don't wish to mix up efficiency issues with language design and SE issues if we don't have to.

