Tuple Definition Discussion

Extension of TupleDefinition.

How exactly is a tuple different than a map (dictionary)?

Tuples do not have keys. They are 1-dimensional, static arrays.

Arrays are a sub-set of maps. The "keys" are (usually) sequential integers.

I am not sure "array" is well-defined either. Different languages have varying features in their array implementation and interfaces.

I should have said vector instead of array, which is the more correct terminology. My C++ past at work. In any case, while in the strict sense arrays can be considered a subset of maps (because they're indexed by integer), it's a useless definition for explaining why tuples aren't maps.

Perhaps this is a LaynesLaw issue. Positional "parameters" can be used to represent named parameters and visa versa, for example. It is a matter of convenience and circumstances which is preferred. This is because a position can stand for only an integer, or can represent something else such as a color. It just requires a conversion between position and color either explicitly or by convention.


<rant enjoyable="true" irrelevant="true" unproductive="definitely">

I blame the RelationalWeenies for this one. In most branches of mathematics, a "tuple" is a fixed-length, ordered, unlabelled sequence of things (technically, a "tuple" is an abstract concept that requires a size for its instantiation - instead of speaking of a "tuple", you instead often hear of a "triple" or a "5-tuple" or a "n-tuple").

However, in the RelationalModel, the word "tuple" is used to mean what most mathematicians call a "record" - an unordered, labelled aggregation of things.

And of course, it's always amusing that in much of the industrial database literature (in particular, documentation for numerous RDBMS products as well as SQL), the word "record" is often used for the individual rows within a table (here referring to the data structure in many programming languages, not the mathematical concepts) often prompting database theoreticians like ChrisDate to tsk-tsk-tsk the authors of such literature for their sloppy terminology - such a thing ought to be called a tuple, according to them. But when you consider the centuries of mathematical history that occurred before EfCodd started writing about the RelationalModel - who's right and who's wrong?

</rant>


Actually there's no need to blame DrCodd for this one. Nor there is need for LaynesLaw, unless Top desperately seeks yet another fight for a definition to circularly prove his points.

The existence of overloaded words is acceptable in mathematics, especially since the two usages are related (just think of a definition for isomorphism).

So for simple-minded mathematical use an n-tuple is a function f : {1..n} -> X where X can be whatever. For students of set-theory this is technically wrong because a function is a special kind of a relation, and a relation is a set of tuples, so voila, we have a circular definition. Therefore a little technical trick allows us to define the ordered pair:

 (a,b) := {{a}, {a,b}}// other variations of this trick are possible as well
with the basic theorem
 (a,b) = (a',b') <=> (a=a') /\ (b=b')
and then by induction we defined n-tuple to be an ordered pair of (n-1) tuple and the n-th element, and then we prove a similar theorem on equality.

The definition of an n-tuple as a function f : {1..n} -> X isn't wrong, it can be made to work perfectly easily within any reasonable SetTheory. Define a relation as a set of maplets, and a maplet a --> b as {{a}, {a,b}}. There are various other ways.

But the mathematical trick above, albeit elegant and useful for its purpose is not practical for programming, so what DrCodd did was to "decorate" the definition, see DecoratorPattern, with stuff that doesn't change much the mathematical apparatus, but makes it more convenient for programming, namely column names and domains for each columns. The two versions of tuples are perfectly interchangeable, FoundationsOfDatabases, for example, uses both versions: the authors call that the named versus the unnamed "perspective" of the same RelationalModel. Some theorems are easier to prove in the unnamed perspective, while others, and especially practical example and exercises are easier to work with in the named perspective. Definitely for programmers the named perspective is generally better, because we don't want to refer to information items by integers rather than names.

That's an advantage of the f : {1..n} -> X formulation; it can easily be generalized to f : Name -> X. No need for separate "perspectives".


Nice cleanup of the original page, whoever did it. Thanks. -- DanM


Tuples impose meaning to order, but columns are not required to have order in relational theory IIRC. Is there a conflict there? It seems to me a map is more appropriate because it does not require an ordering. In functional terms, any "cell" can be referenced via the following function:

  cellValue = getCell(table, primaryKey, columnName)

Whether that is implemented with tuples, maps, or gerbils on treadmills is an implementation detail. If "tuple" is hard-wired into the definition of relation, then perhaps DrCodd made a PrematureOptimization of sorts. As far as a "getRow" function, what comes back is also an implementation detail. But if columns have no required order, then where does the order of a tuple come from if we use a tuple in the implementation? We are back to the same issue. Maps seem a better fit because they don't require any more information that is required by the definition.

-- top

Oh, you just don't get it. Tuple is an abstract mathematical concept. It's a specification. You can implement it with stone tablets if that's what suits you. Or with hashmaps. Or with whatever. Provided whatever you implement matches the mathematical definitions. Now the mathematical definitions for tuple in database theory are two and they are used interchangeably: one is based on 1..n indices and one uses named columns. Again, the one without column names is not used to implement anything nor to limit the programmer in any way shape or form. It's just used to prove some theorems. and FYI DrCodd used the named perspective.

So your comments are totally misinformed.

If named columns are used, then it is or is equivalent to a map. Done.

No, it ain't. Almost done. Maybe I should ask you to provide TopDefinitionForMap?, then who knows maybe you refer to a "map" that is the same as tuple.

In functional terms, a map can be defined as:

  value = getValue(mapInstance, key)
  setValue(mapInstance, key, value)
  booleanValue = keyExists(mapInstance, candidateKey)

(Bonus options are iterators, getListOfKeys(), clearMap(), etc.)

So, no, it's not the same thing. See TupleDefinition and you can convince for yourself that it is not the same thing.

Immutability. A tuple is immutable. Further, the order of a tuple only matters so far as the ordering of two related tuples are the same (i.e., (mapcar [1 2 3] square) results in [1 4 9]). In this view, an immutable map may be held to be equivalent to a tuple iff the keys can be completely arbitrary. Therefore:

[1 2 3] != [3 2 1]
( [1 2 3]->[1 4 9] ) == ( [3 2 1]->[9 4 1] )

{a:1 b:2 c:3} != {x:1 y:2 z:3}
( {a:1 b:2 c:3}->{a:1 b:4 c:9} ) == ( {x:1 y:2 z:3}->{x:1 y:4 z:9} )

{a:1 b:2 c:3} != [1 2 3]
( {a:1 b:2 c:3}->{a:1 b:4 c:9} ) == ( [3 2 1]->[9 4 1] )

Defend or refute. :)

-- WilliamUnderwood

What does "immutable" mean? I could not find it in any online math dictionary.

That's because it's a computing concept. Something is immutable if it can't be changed. In imperative languages you constantly change the value given to a variable. In Functional languages you (often) can't, or don't. If something can't have its value changed, it's immutable.

That explains why the only search hits I got were complaints about Java's immutable strings. Anyhow I fail to see how immutability is a big deal. In practice, rows are obviously mutable, and one can use the concept of an immutable map if need be to describe functional versions of the world. -- TopMind

You need to re-read AnIntroductionToDatabaseSystems. Technically, rows (and tables!) are not mutable. If they were, they would then have ObjectIdentity, etc. What happens (at the theoretical level) is that a database is a collection of RelationalVariables, and that these relvars are the only mutable things in the database. When you do an UPDATE operation, what theoretically happens is that a new relation is created, containing the stuff in the old relation plus whatever changes, and the relvar is then bound to this new relation (the old one may be discarded). In many cases, the relation is only referenced in one place (through the relvar), so an in-place update can be done (in TypeTheory, relations in a database are LinearTypes). Of course, all of this is transparent to the client. -- ScottJohnson

This sounds like the "no side-effects" concept from functional programming. Whether it is updating or replacing is perhaps an EverythingIsRelative issue. Maybe you don't really move your arm, but God quickly replaces an old you with a new one but with a changed arm position. Either way, it looks the same to us.

The side effect is that the relvar changes. The whole idea of no side effects in functional programming has only confused the industry and done damage IMO. A computer has a hard drive and memory. The whole point of using a computer is to have side effects galore. A relvar is very very mutable.


PageAnchor: Equiv01

Here's an example to clarify the equivalency issue:

 A:  // missing equiv to "null"
 <row foo="zock" bar=7>
 <row bar=5
 <row foo="ff">

B: // (traditional) <row foo="zock" bar=7> <row foo=null bar=5> <row foo="ff" bar=null>

-top


See also: DataStructure


EditText of this page (last edited March 10, 2012) or FindPage with title or text search