Smart Data

This is a tentative name for a whole area of computing that I feel has been largely neglected, namely the mismatch between the data types used in the real world and the data types supported by most programming languages. Perhaps this topic is addressed in another page on this wiki - if so, could someone connect or refactor them as appropriate.

Most numbers in the real world have units attached to them - for instance, distances are in feet, metres or parsecs, bank balances and prices have to be in some currency, dates and times have to be relative to a given time zone, and so on, but most computer languages treat all quantities as if they were dimensionless. Most business values in computer applications do not even have an explicit precision, as floating point is not appropriate for them, so this knowledge is only held within the program code. When programmers make incorrect assumptions about the dimensions of data, you get examples like the recent Mars Lander debacle. Here is a quote from a NASA release about the loss of the Mars Lander:

The peer review preliminary findings indicate that one team used English units (e.g., inches, feet and pounds) while the other used metric units for a key spacecraft operation. This information was critical to the maneuvers required to place the spacecraft in the proper Mars orbit.

I also feel very strongly that dimension checking should be done at compile time, not run time. You don't want to discover an error like the one referred to above when the lander is approaching the Martian surface! Does anyone know of a language where you are forced to use units, rather than it being an afterthought?

I have also made two attempts at describing such a system: one in the business arena, and one involving physical units. Feedback would be appreciated. They are respectively

http://www.jpaulmorrison.com/busdtyps.shtml, and http://www.jpaulmorrison.com/physunits.shtml


There are several libraries for Haskell which exploits its type checker to ensure proper units of measurement throughout a calculation. I found 3 via Google. --SamuelFalvo?

In the latter I ran into character set problems - naming physical units like square cms following Java naming conventions is highly problematic. However, you cannot ever expect a programming language to accept units of measurement precisely as we'd write them in a physics journal, just as we cannot expect languages to express type notation as written in computer science journal entries. Approximations can be agreed upon, but that is as far as things will ever go.

Maybe, but almost certainly not likely at all, we will have to invent yet another programming language! The absence of suggestions is telling, and many folks find Haskell's solution more than adequate for their needs.

Additional suggestions would be welcome.

Possible connections: ValueObject, BusinessObject


For one approach, see FrinkLanguage.


For a good analysis, and code, to tackle this problem in C++, using templates, see "Scientific and Engineering C++, an introduction with advanced techniques and examples", John J. Barton and Lee R. Nackman, Addison Wesley Longman, 1994, ISBN 0-201-53393-6 , chapter 16.

See also:- http://se.ethz.ch/~meyer/publications/OTHERS/scott_meyers/dimensions.pdf


EditText of this page (last edited July 30, 2010) or FindPage with title or text search