Tables And Objects Are Too Different

[Should be "TableRecordsAndObjectsAreTooDifferent?".] {Too long IMO}

Summary of Points (or at least an attempt at a summary)


Related to issues of ObjectRelationalImpedanceMismatch, the fundamental difference is that table "records" and object "records" (see ObjectsAreDictionaries) have a different base rule set. The main difference is that table records must satisfy a table "shape" while object "records" don't. In this regard, table records have constraints that objects are not subject to. Objects in general can be more willy-nilly about what "slots" (methods, attributes, object pointers) they have. The "slots" that relational records can have are dictated by the "owning" table. You cannot add or remove attribute slots for any given record. (Note that the slots may be virtual, they don't have to take physical space in RDBMS.)

Whether this is limiting or empowering for tables probably belongs in another topic (see OoLacksMathArgument, for example). The point here is that they are just different animals with no straight-forward way to tie them together. It is like translating different spoken languages: it cannot be 100% automated and some things get distorted or long-winded upon translation.

[The counter-argument is that object slots/attributes can be treated simply as computed fields, and the fact that an object may have additional fields not used in a particular query just doesn't matter. There is no particular difficulty in applying any of the relational database theory to this treatment.]

Some object-relational bridge tools sometimes try to hide or reduce this difference by making object designs more like relational ones or by limiting full usage of query languages, but I don't think very many developers are fully happy with the results. Such tools try to fit square pegs into round holes (or visa versa) with mixed results. There is a "translation cost" in terms of project complexity and perhaps performance.

Please clarify. Database rows are merely instances of database tables. Objects are merely instances of classes. Values stored in rows are constrained by the basic column data type and the constraints may be further refined through written triggers. Values stored in objects are constrained by the basic data types declared in the class and the constraints may be further refined through method code. Adding or removing columns from a table is not inherently any more difficult than adding or removing data declarations from classes. Deployment and compatibility between old and new versions is a difficult task in either case. What specific differences are being discussed?

OO can have inheritance, and in dynamic OO languages new "cells" (attributes, methods, references) can be added at run-time. If we ignore inheritance and dynamic ability, perhaps the two are closer. But, then it is not true OO, but a subset of OO. Relational also requires a unique key or key-set to identify each "instance".

And that's different from and better than an object's identity because?

This has been already debated on wiki, see ObjectIdentity, RelationalHasNoObjectIdentity RelationalHasLimitedModelingCapability and probably a few more. I'd strongly suggest that you start with FundamentalsOfObjectOrientedDatabases if you are unconvinced that object identity is not sufficient.

I can't find answers to my questions on any of those pages. Can you narrow the search or reiterate the answers? As I see it, since an RDBMS can be implemented in an OO language then object identity must be sufficient.

That's why I recommended that you start slowly with FundamentalsOfObjectOrientedDatabases, which is written entirely from an OO point of view. Since even when using OO languages one needs not rely on object identity, and indeed one should better not rely on object identity for most practical purposes, it doesn't follow that "object identity must be sufficient". Instead of arguing that you have this counter example that proves the contrary to the contrary to the contrary, you might want to propose a new framework that builds on object identity solely d'a cappo al fine and meets a majority of users' and programmers' needs. If you succeed in doing that you might have a point, otherwise your argument sound like "in theory we can build a theory that is wonderful and perfect, especially if we look through this pair of glasses".

One always relies on object identity when using OO languages. Without identity an object doesn't exist. I don't have to propose a new framework. Existing RDBMSs written in OO languages rely on object identity. No theory, no glasses.

I fail to see how the implementation has anything to do with anything. It does not matter if RDBMS are implemented via gerbils on treadmills and Tinkertoys.

There are an infinite number of ways to implement the tables mentioned in the title of this page in an OO language like Java, C++ or Smalltalk. What is true of any of those implementations is that they rely on object identity to keep track of where things are. Unique keys, key sets, indexes, all the cool stuff in any RDBMSs implemented in an OO language is built on object identity. Since anything that can be implemented in OO can be used in OO, that means object identity is as good as unique keys, etc. It's true that using nothing but direct object identity for organizing data is difficult. But any scheme for organizing data, from simple lists and maps to Oracle and beyond can be implemented in an OO language. That extends what the programmer can do in the language. Object identity all you need.

No, it is not called object identity it is called object equality in popular terms. So you typically need to override equals or the == operator in C++ or pick your favorite. That's why most OO languages and frameworks provide a special equals method for their object. For more details you should go study FundamentalsOfObjectOrientedDatabases as recommended. The hurry with which you post replies tell me you haven't even tried. Also the fact that everything is built by moving bytes at machine code / assembly level doesn't show that machine code is fundamental to anything. It's true that a good assembler is all I need to implement a database if I had to work on it for the rest of my days. What's your conclusion? That assembly is as good as anything?

No, it is called object identity. We use object identity when we change the way the comparison operator works in C++. It's the only way the compiler or run time knows where objects are, so any other way to organize data can be expressed in terms of object identity.

No, assembly isn't "as good as anything". Neither are Java, C++, Smalltalk or SQL. Each is well suited to some tasks but none is perfect for every task. The biggest difference is that one can create SQL interpreters in any of the other languages in that list, but not vice versa. Therefore a user of one of those languages will have more choice of data structure, ranging from the very simple to the very complex.

Relational does not dictate implementation. Perhaps you are arguing that relational query languages are overkill for simple stuff. Perhaps, but the simple things are also less likely to be shared in my observation. A non-nested dictionary array in the language is usually sufficient for temporary, local needs. As soon as the possibility of sharing or x-to-many relationships comes into play or is likely to come into play, custom-built stuff is usually inferior IMO. (SharingDataIsImportant and YagniAndDatabases).

No, that isn't what I'm arguing.

Well, if it depends on reading the cited book, then I guess we stop here.

The book is about object oriented databases and has nothing to do with my argument.


Alternatively, TablesCanBeObjects.


Further, perhaps TablesCanBeObjects, but objects cannot be tables (unless they limit themselves to relational rules).

(rest of discussion moved to OoLacksMathArgument)


Is that "are too different" as in the difference is too much, or as in "Are not! Are too!"?

It is hard to integrate and communicate between them. They have different underlying philosophies that are at odds with each other. Relational is a square peg and OO is a round hole. There is a back-and-forth "translation cost" that many OO developers recognize (but some still tolerate because they prefer an OO side).


Navigation Differences

Another difference is possibly navigation. OO designs tend to want to hop around pointers to query and navigate object collections, but relational tends to be declarative: you ask for what you want, not how to get there.


Another philosophical difference described in SeparationOfDataAndCode is that a data-centric approach uses data as the primary glue/messaging/interface system, while OO wants to make behavioral interfaces be the primary. They fight over whether data or behavior is the "king". It has proven difficult to compromise.


See Also: MultiParadigmDatabase, CantEncapsulateLinks, RelationalWithSideEffects


CategoryComparisons


EditText of this page (last edited November 23, 2014) or FindPage with title or text search