Pointers Versus Keys

First, go read WhatIsaPointer.

Advantages of keys (where "keys" means a triple of <database, table, CandidateKey>; or possibly just <table, CandidateKey> if the database is implicit; or possibly just <CandidateKey> if both the database and table are implicit).

Keys have a simple text representation which is portable:
- Generally keys can be used or "read" by multiple languages
- Keys usually have a human-readable/usable representation
- I thought keys could be any possible database format, for example, text, numeric, or date. Furthermore, keys can be multiple columns.
- I don't think it is a matter of native database types, but of being "printable".
- [How do you "print" a key on several columns such that it can be unambiguously interpreted by multiple languages?]
Pointers have a simple machine specific representation which is portable:
- Pointers can always be used or "read" by multiple languages.
  - Please clarify
- Pointers always have the same representation, i.e., a machine specific location.
Keys are an alternative way to reference stuff, but not the only way. In tables, one can look at just an entity's contents without any use of (visible) pointers. Pointer-based approaches often have pointers as the *only* or primary way to "get to" stuff.
- Is that a necessary trait of pointers, or just a detail that is commonly found?
- In objects, one can look at just an entity's contents without any use of (visible) pointers.
Keys have "lasting power". If you print the key on paper today, it still means what it did a few weeks from now. (However, some keys are recycled in practice to keep them small.)
- [Assuming you have a consistent keying context. If you do, you can probably also just as easily dereference the same pointer.]
Pointers provide a consistent way to access a structure. No need to revert to an alternative.
Keys are not tied to RAM addresses or architecture.
- Depending on the semantics of your pointer, that may or may not be true. Were machine/disk addresses on your RDBMS to be exported (certainly an AntiPattern), you'd have that problem and many more. OTOH, other types of "pointer" could be conjured up without this issue.
- Keys are most definitely tied to architecture; they define the database architecture.
Pointers are tied to RAM addresses ensuring every element has a pointer.
Keys can optionally be "domain keys", such as Social Security Number.
Pointers are independent of the data held, hence pointers do not have uniqueness concerns.
Keys are an optional way to do "joins" or look-ups, but not the only way. One can join based on calculated criteria in relational algebra. In relational, keys are treated just like any other attribute.
Pointers provide a complete solution to do "joins" or look-ups without a need to fall back to another mechanism.
Pointers are often hidden from the programmer-user and completely hidden from the end-user.

[Note: The following is an attempt to refactor the above into a side by side comparison with any evaluation of benefits removed. Please feel free to edit to improve objectivity as needed. This was not an easy task and I added and edited text as I felt necessary for the comparison, but no bias was intended. Fix as needed.]

Portability
- Keys usually map to language primitive data types making them usable across languages
- Pointers have a simple machine specific representation which is portable
Alternatives
- Keys are an alternative way to reference stuff, but not the only way.
- Pointers provide a consistent way to access a structure. No need to revert to an alternative.
Architecture
- Database architecture is represented by keys.
- Program architecture is represented by pointers.
Coupling to Data
- Keys can optionally be "domain keys"
- Pointers are tied to RAM addresses; Pointers are independent of the data held.
Visibility
- Keys are visible to the end-user
- Pointers are often hidden from the programmer-user and completely hidden from the end-user
TopMind claims that keys are accessed via relational operators, while each class may invent its own interface to pointers. Further, he says without evidence, relational fans believe this offers consistency in design, while OOP fans see it as styming creativity and modeling flexibility.
Query-ability - For TopMind, it is usually fairly easy to query keys to study them or take inventory, but not pointers. One must study pointers via navigational query techniques (see NavigationalDatabase).

To me the difference is that keys usually have an ascii (or at least "printable") representation that is tied to a model of something external to the machine. For example, an employee ID is tied to an employee, not a slot in RAM. One talks about the key as if it labels some physical object or at least something that is part of the domain. -- top

If keys are identifiers for members of sets, then a pointer would be a key to the set 'memory'. If the use of a key is to allow access to an individual member of a set, then a process for retrieving the data stored at the key could be -

 Data = Get.Row.Value(Set.Name, Key)

and the syntax for retrieving it from a pointer could be -

 Data = Get.Variable.Value(Variable.Name)

but this syntax could easily be envisaged as -

 Data = Get.Row.Value("Current.Scope", Variable.Name)

So I think keys and pointers are the same. The difference is only in syntax, where the implicit operation is conveniently reduced to the naming of the pointer variable in code. -- PeterLynch

Isn't the biggest difference that keys are persistent while pointers are transient? I can write down a key on a piece of paper and use it a year later. A pointer may not be consistent between application runs nor even within a single application run. Keys and pointers each reflect the transience of the data they refer to.

Everything is transient.

[Perhaps the paragraph above should state, "The choice of key vs pointer may reflect the degree of transience (which may be dependent on the environment) of the data to which it refers."]

As PeterLynch stated above, conceptually a pointer is simply an auto-generated key belonging to a domain of memory addresses -- keeping in mind that what we mean by "memory address" may vary from environment to environment. However, there is a significant difference: Since a pointer is typically an index into an linear array of machine memory slots, retrieval of referenced data is direct. Or, at least it is relatively direct, because the pointer may represent a virtual memory address that has to be translated to a physical address in order to retrieve the associated data.

A key references data indirectly via lookup from a map of key/pointer pairs, because internally the machine only references data via pointers. Thus, dereferencing a key involves the overhead of a key lookup to obtain the associated pointer, plus the pointer dereference (and any associated translation, as noted above) to obtain the data.

In general, pointers offer better speed and storage efficiency at the expense of portability and environment independence. Keys offer better portability and environment independence at the expense of speed and storage efficiency. Choose one or the other depending on the speed/storage/portability/independence requirements of the problem at hand.

-- DaveVoorhis

MayZeroSix

CategoryComparisons CategoryPointer