Extension of TupleDefinition.
How exactly is a tuple different than a map (dictionary)?
Tuples do not have keys. They are 1-dimensional, static arrays.
Arrays are a sub-set of maps. The "keys" are (usually) sequential integers.
I am not sure "array" is well-defined either. Different languages have varying features in their array implementation and interfaces.
I should have said vector instead of array, which is the more correct terminology. My C++ past at work. In any case, while in the strict sense arrays can be considered a subset of maps (because they're indexed by integer), it's a useless definition for explaining why tuples aren't maps.
Perhaps this is a LaynesLaw issue. Positional "parameters" can be used to represent named parameters and visa versa, for example. It is a matter of convenience and circumstances which is preferred. This is because a position can stand for only an integer, or can represent something else such as a color. It just requires a conversion between position and color either explicitly or by convention.
<rant enjoyable="true" irrelevant="true" unproductive="definitely">
- It is not irrelevant because such discussion can spark alternative ways of thinking about or implementing something, such as dynamic RDBMS (DynamicRelational). -- top
I blame the
RelationalWeenies for this one. In most branches of mathematics, a "tuple" is a fixed-length, ordered, unlabelled sequence of things (technically, a "tuple" is an abstract concept that requires a size for its instantiation - instead of speaking of a "tuple", you instead often hear of a "triple" or a "5-tuple" or a "
n-tuple").
However, in the RelationalModel, the word "tuple" is used to mean what most mathematicians call a "record" - an unordered, labelled aggregation of things.
And of course, it's always amusing that in much of the industrial database literature (in particular, documentation for numerous RDBMS products as well as SQL), the word "record" is often used for the individual rows within a table (here referring to the data structure in many programming languages, not the mathematical concepts) often prompting database theoreticians like ChrisDate to tsk-tsk-tsk the authors of such literature for their sloppy terminology - such a thing ought to be called a tuple, according to them. But when you consider the centuries of mathematical history that occurred before EfCodd started writing about the RelationalModel - who's right and who's wrong?
</rant>
Actually there's no need to blame DrCodd for this one. Nor there is need for LaynesLaw, unless Top desperately seeks yet another fight for a definition to circularly prove his points.
- I only focused on it is an issue when others got picky about its usage. Otherwise, I would have let it be. -- top
- Maybe I'm wrong but you did try to imply that in RelationalModel a table is an array of dictionaries or something like this, and when challenged came here to suggest a LaynesLaw fight to suit your purpose. So no, tuple is not a map or dictionary structure, although you can use a dictionary to represent tuples. And defining a table as an array of dictionaries serves no useful purpose. Just let the RelationalModel be defined as DrCodd did it and the database theory used it for decades already. Will you ?
- Because DrCodd, although otherwise a genius, was a poor communicator.
- And you think you are better ?
- I would bet money that more people know what a "map" or "dictionary" is than "tuple". And, yes, I do feel that I could write a more approachable book on relational than DrCodd. His may be academically more precise, but that is not the only audience for such books. I would start by using everyday concepts and analogies, and build on them as we go deeper into the topic. I wish to bring tables to the masses so that they can share in my joy of tables.
- Or you might read ISBN 0764507524
- DrCodd changed his name?
- Uh, no. The point is, that non-technical, non-scholarly database books are a dime a dozen. Oh, and joy of tables? It sounds to me like you're not peddling science; instead you're peddling religion. :)
- [Some are not fond of the term "dictionary" since we all know a dictionary is that book with pages that you flip open. I for one am a bit fed up with all the terms like associative array, dictionary, hash - it just seems people have too many terms to describe the same thing. I'm also annoyed that there is a struct, record, tuple - it seems there is some overlap. I suppose I could just GetOverIt though.]
- Forgive me for telling you bluntly but you are either delusional or suffering from UnconsciousIncompetence. Most OO developers whom you fought endlessly on comp.object still hold DrCodd in high esteem, while they consider you more or less of a troll. Your approach certainly had little convincing power out in the wild, therefore your hypothesis was not quite confirmed in practice. But let no facts interfere with your deeply held convictions, write your own book, Top, and see how that goes. Heck, you can even write it on wiki, if you are careful and honest to avoid the confusion between the proper RelationalModel and your private "school of thought" on the subject. You can easily inflate the right to have your own opinion, to the point where you want that to have equal stature with others which have a lot more to show for it. In other words be mindful of EveryoneHasHisOpinion / EveryoneHasHisDefinition fallacy.
- I always find that if one says the same thing from different viewpoints, the audience is more likely to cling. No one single way of explaining something will gel with everybody. Thus, if you try multiple approaches you increase the chance of one of them gelling. Explaining tables both in terms of maps and vectors increases the "learning paths" available. As far as pleasing the comp.object crowd is concerned, it is a lost cause. They think OO is the greatest thing since sliced bread and I will never convince them otherwise. Perhaps OO models their pointer-hopping heads. I can't change that, other than make sure they don't extrapolate subjective preferences onto everybody else, which is what I feel they do. -- top
- [On a softer note: It's an admirable sentiment, but do be sure in your approach to make clear what is analogy, and where the analogy breaks down. I think you'll find in the end that it will be a more verbose way of explaining something that really isn't very complicated to begin with. I personally have yet to meet a developer who couldn't understand a simple explanation of "tuple" (in the relational sense) without resorting to analogies to more complex data structures. -- DanMuller]
- Seconded. Vectors are very simple, dictionaries are more complicated. Besides, do you really think that "A tuple is a dictionary, except you can only access values by index, and the size is constant, and you can't change the values" is a good definition? Maybe we need JustLikeaFooButDifferent?.
- But the size does not have to be constant, at least not per "row". See DynamicRelational and PageAnchor "Equiv01" below. Does doing that make it non-relational? I don't think so. It is just a different approach, a different view or implementation of the same info. We could enumerate all the possible attribute names in a central spot and implement it positionally, but that is an implementation detail. It is representationally equivalent as far as the output.
- Doing that definitely makes it non-relational. That conflicts with the very definition of relation. Go check it. Since what you can represent this way is superset of relations, the output of your operators could only be "equivalent" in cases where the inputs are in fact relations, and you'll have to go through each and every relational operation and define very exactly what it means for your "super-relations". Then you'll have to work at making it all logically consistent, proving that what you end up with is a closed algebra (or that it isn't, and live with the unaesthetic consequences), and explain it well enough to convince others that it is. What you're talking about is a whole new theory, you can't blithely change one of the underpinnings of the relational model and pretend that you're only "tweaking" it - every definition and logical consistency of every operator proceeds from and depends on these fundamental definitions. -- DanM
- How is it a superset? Please show an example. Any set of maps can be turned into a set of tuples and visa versa. As far as I can tell, they are representationally equivalent. (Which is more "convenient" is a separate matter.)
- By "size" I assume that you mean the number of attributes in a tuple. If I've misunderstood, explain. By definition, each tuple in a given relation has the same number of attributes, with the same names. (I'll leave domains aside for the moment.) So one of your tables would only be relational in the case where each one of its rows had the same number of fields, with the same names. Should be obvious without an example, which would be tedious to construct. -- DanM
- Afterthought: If you define the heading of one of your tables to be the union of the headings of all the rows (or conversely restrict your rows to have no attributes that are not defined in the heading of the owning table), and treat each row as if it had this heading, but consider "missing" attributes to be null, then you could indeed call your view relational. You've just described it differently and confusingly. (This interpretation didn't come to me immediately, since I'm firmly in the anti-null relational camp.) -- DanM
- There's a pro-null relational camp? :)
- Sadly, yes, we haven't managed to nullify them yet.
- Of course, this view is only relational if the union of all of the attributes - the maximal set - is fixed. If you can insert arbitrary attributes into arbitrary rows, with no constraints on what can get stuffed where; that can't be considered relational under any definition of the term.
- Well, you'd be potentially redefining the relation heading with each insertion. Yup, that definitely puts you in new territory.
- This is the "dynamic" approach described in MultiParadigmDatabase (mentioned above already). If it is not clear there to you, you are welcome to add clarification. Relational does not limit tables "growing" columns in the middle of operation that I know of. Many existing RDBMS allow that. We are just using that feature an extreme in a dynamic relational DB. I generally defined a non-explicit column as blank, but if you want to use null it does not change the general concept. It can act just like a static RDBMS, it is just that a list of all (active) columns takes longer to compute. -- top
- OK, this can fit in with thoughts that I touched on in DoesRelationalRequireTypes. But I don't like the implications. I'll explain there later tonight if I can make time. (However, I think you're quite wrong about tables changing in mid-operation in either relational theory or existing RDBMS. Data definition operations are distinct and outside of the relational algebra.) -- DanM
- If they are outside the definition, then it is not something that can "violate" it, it appears. I would note that I purposely made a non-distinction between empty cells and non-existent cells in dynamic relational proposal (MultiParadigmDatabase). Does this potentially violate relational theory because it creates a potentially infinite set of empty columns, and if so does it contribute to nasty side-effects beyond traditional static-vs-dynamic issues?
- Depends on what "outside the definition" means. If A' introduces new requirements on A such that none of the constraints on A are violated, then A' can still be part of A. If A', on the other hand, waives requirements on A, then A' is most definitely not part of A (though it might be the case then that A is part of A'). In the present instances, adding variable schemas like this is in direct contradiction to the RelationalModel, which holds that the shape of tables are fixed. That doesn't mean your idea is bad; it just means it's not strictly "relational" any more (here, defining "relational" to refer to DrCodd's model specifically; not to something that might have similar properties).
- {Note that Microsoft's RDBMS allows one to add a column while it is still running. Although, it must be an explicit DBA action, it would probably have similar implications. I don't see how that causes any great logical conundrums. I suppose it may introduce name-space collision possibilities, but these can happen even if you stop and start the RDBMS to add a column.}
- That is what I cannot figure out. Does dynamic schemas violate relational theory, or only relational tradition? Has DrCodd or his colleagues ever considered the idea of dynamic schemas, and if they did, did they reject them on practical grounds, on theoretical grounds, etc.
- You'll probably have to go tromping through the literature to answer that question (I don't know myself). Based on a back-of-my-hand estimation, I suspect a consistent theory could be built on such a system, but I haven't delved into it deeply. OTOH, it is often the case that relaxing a constraint on some system/model produces undesirable results (making the system inconsistent, making formal manipulation or examination of the system undecidable or intractable, etc.) There is undoubtedly lots of research material out there on dynamic schemas and other relational extensions.
- So in summary, you are not necessarily disputing its "relational" qualifications, at least not without researching it further. But, you are suggesting it may have practical downsides (which are probably similar to the heavy-typing versus dynamic arguments/battles already found on this wiki.) --top
The existence of overloaded words is acceptable in mathematics, especially since the two usages are related (just think of a definition for isomorphism).
So for simple-minded mathematical use an n-tuple is a function f : {1..n} -> X where X can be whatever. For students of set-theory this is technically wrong because a function is a special kind of a relation, and a relation is a set of tuples, so voila, we have a circular definition. Therefore a little technical trick allows us to define the ordered pair:
(a,b) := {{a}, {a,b}}// other variations of this trick are possible as well
with the basic theorem
(a,b) = (a',b') <=> (a=a') /\ (b=b')
and then by induction we defined n-tuple to be an ordered pair of (n-1) tuple and the n-th element, and then we prove a similar theorem on equality.
The definition of an n-tuple as a function f : {1..n} -> X isn't wrong, it can be made to work perfectly easily within any reasonable SetTheory. Define a relation as a set of maplets, and a maplet a --> b as {{a}, {a,b}}. There are various other ways.
But the mathematical trick above, albeit elegant and useful for its purpose is not practical for programming, so what DrCodd did was to "decorate" the definition, see DecoratorPattern, with stuff that doesn't change much the mathematical apparatus, but makes it more convenient for programming, namely column names and domains for each columns. The two versions of tuples are perfectly interchangeable, FoundationsOfDatabases, for example, uses both versions: the authors call that the named versus the unnamed "perspective" of the same RelationalModel. Some theorems are easier to prove in the unnamed perspective, while others, and especially practical example and exercises are easier to work with in the named perspective. Definitely for programmers the named perspective is generally better, because we don't want to refer to information items by integers rather than names.
That's an advantage of the f : {1..n} -> X formulation; it can easily be generalized to f : Name -> X. No need for separate "perspectives".
Nice cleanup of the original page, whoever did it. Thanks. -- DanM
Tuples impose meaning to order, but columns are not required to have order in relational theory IIRC. Is there a conflict there? It seems to me a map is more appropriate because it does not require an ordering. In functional terms, any "cell" can be referenced via the following function:
cellValue = getCell(table, primaryKey, columnName)
Whether that is implemented with tuples, maps, or gerbils on treadmills is an implementation detail. If "tuple" is hard-wired into the definition of relation, then perhaps
DrCodd made a
PrematureOptimization of sorts. As far as a "getRow" function, what comes back is also an implementation detail. But if columns have no required order, then where does the order of a tuple come from if we use a tuple in the implementation? We are back to the same issue. Maps seem a better fit because they don't require any more information that is required by the definition.
-- top
Oh, you just don't get it. Tuple is an abstract mathematical concept. It's a specification. You can implement it with stone tablets if that's what suits you. Or with hashmaps. Or with whatever. Provided whatever you implement matches the mathematical definitions. Now the mathematical definitions for tuple in database theory are two and they are used interchangeably: one is based on 1..n indices and one uses named columns. Again, the one without column names is not used to implement anything nor to limit the programmer in any way shape or form. It's just used to prove some theorems. and FYI DrCodd used the named perspective.
So your comments are totally misinformed.
If named columns are used, then it is or is equivalent to a map. Done.
No, it ain't. Almost done. Maybe I should ask you to provide TopDefinitionForMap?, then who knows maybe you refer to a "map" that is the same as tuple.
In functional terms, a map can be defined as:
value = getValue(mapInstance, key)
setValue(mapInstance, key, value)
booleanValue = keyExists(mapInstance, candidateKey)
(Bonus options are iterators, getListOfKeys(), clearMap(), etc.)
So, no, it's not the same thing. See TupleDefinition and you can convince for yourself that it is not the same thing.
-
- Immutability. A tuple is immutable. Further, the order of a tuple only matters so far as the ordering of two related tuples are the same (i.e., (mapcar [1 2 3] square) results in [1 4 9]). In this view, an immutable map may be held to be equivalent to a tuple iff the keys can be completely arbitrary. Therefore:
-
- [1 2 3] != [3 2 1]
-
- ( [1 2 3]->[1 4 9] ) == ( [3 2 1]->[9 4 1] )
-
- {a:1 b:2 c:3} != {x:1 y:2 z:3}
-
- ( {a:1 b:2 c:3}->{a:1 b:4 c:9} ) == ( {x:1 y:2 z:3}->{x:1 y:4 z:9} )
-
- {a:1 b:2 c:3} != [1 2 3]
-
- ( {a:1 b:2 c:3}->{a:1 b:4 c:9} ) == ( [3 2 1]->[9 4 1] )
-
- Defend or refute. :)
-
- -- WilliamUnderwood
What does "immutable" mean? I could not find it in any online math dictionary.
That's because it's a computing concept. Something is immutable if it can't be changed. In imperative languages you constantly change the value given to a variable. In Functional languages you (often) can't, or don't. If something can't have its value changed, it's immutable.
That explains why the only search hits I got were complaints about Java's immutable strings. Anyhow I fail to see how immutability is a big deal. In practice, rows are obviously mutable, and one can use the concept of an immutable map if need be to describe functional versions of the world. -- TopMind
You need to re-read AnIntroductionToDatabaseSystems. Technically, rows (and tables!) are not mutable. If they were, they would then have ObjectIdentity, etc. What happens (at the theoretical level) is that a database is a collection of RelationalVariables, and that these relvars are the only mutable things in the database. When you do an UPDATE operation, what theoretically happens is that a new relation is created, containing the stuff in the old relation plus whatever changes, and the relvar is then bound to this new relation (the old one may be discarded). In many cases, the relation is only referenced in one place (through the relvar), so an in-place update can be done (in TypeTheory, relations in a database are LinearTypes). Of course, all of this is transparent to the client. -- ScottJohnson
This sounds like the "no side-effects" concept from functional programming. Whether it is updating or replacing is perhaps an EverythingIsRelative issue. Maybe you don't really move your arm, but God quickly replaces an old you with a new one but with a changed arm position. Either way, it looks the same to us.
The side effect is that the relvar changes. The whole idea of no side effects in functional programming has only confused the industry and done damage IMO. A computer has a hard drive and memory. The whole point of using a computer is to have side effects galore. A relvar is very very mutable.
PageAnchor: Equiv01
Here's an example to clarify the equivalency issue:
A: // missing equiv to "null"
<row foo="zock" bar=7>
<row bar=5
<row foo="ff">
B: // (traditional)
<row foo="zock" bar=7>
<row foo=null bar=5>
<row foo="ff" bar=null>
-top
See also: DataStructure