Tables Can Be Objects

Tables (and rows and columns) are easily represented as objects. Let go of the desire for transparency and treat tables as just another externally imposed requirement. You may be accused by one RelationalWeenie of ReinventingTheDatabaseInApplication, but there's no pleasing some people.

Transparency?

Yes, the objections on TablesAndObjectsAreTooDifferent can be traced to a desire for transparent integration of objects and records.

How can tables easily be represented as objects! Objects have methods (procedures)! Saying it is easy is WishfulThinking. Or do tables have methods, and I am missing something?


Fine with me if you try, but at least satisfy these requirements: [try what?]

The object/class design will fit the tables like it does any other externally imposed requirement. {But then it is relational-shaped OO, not OO. TuringEquivalency may be possible, but is not necessarily practical here.} [[Why would being "relational-shaped OO" be a concern?]] The RDBMS handles relational queries from multiple clients. The OO application will use what it needs and share what it can, like any other OO application. The OO application will use whatever OO language you pick. My advice doesn't intersect with performance penalties so I'm not sure what you mean here.

You could use the CampusExample as a testing ground and example source. (Well, maybe not the performance part, but first things first.)

CampusExample is a schema. I don't see any requirements for an application there.

It is just a suggestion. You are welcome to find a different example or add to it.

It's a suggestion of what? All I see is a schema. I don't see how it applies to my advice at the top of this page.

I give up. I don't know what you are trying to convey and you don't seem to want to provide an example.

Here are some examples of treating tables as just another externally imposed requirement: http://java.sun.com/docs/books/tutorial/jdbc/basics/index.html

Here is a snippet:

  String query = "
  SELECT COFFEES.COF_NAME " +
    "FROM COFFEES, SUPPLIERS " +
    "WHERE SUPPLIERS.SUP_NAME LIKE 'Acme, Inc.' " +
    "and SUPPLIERS.SUP_ID = COFFEES.SUP_ID";
  resultSet rs = stmt.executeQuery(query);
  ....

They are using embedded SQL to do a lot of the work, such as joins and filtering. If you use something besides SQL, then you have to overhaul the embedded SQL, no? Note that JDBC is not significantly different from the early FORTRAN API's that hooked into databases. The pattern you see in most API's is roughly:

   sql = "select * from ....."
   rs = getResultSet(sql)
   while getNextRecord(rs) {  // loop thru result set
     [do something with current record dictionary or object]
   }
   closeRS(rs)

See also QueryAndLoop

Yes, if you don't use SQL you can't use JDBC.

Well, it may be possible to use JDBC with some other query language, but I don't know of a case in practice. In practice, your JDBC code is tied to SQL for good or bad.

But whatever RDBMS you use will most likely have tables (and rows and columns) which can be modeled as objects. If the code has to be query language neutral another abstraction can be added for that. My point is that since objects can model tables, tables and objects are not too different. The problem lies in the desire for transparent object to relational mapping.

These are just the temporary result sets we are seeing with JDBC, ODBC, and their historic cousins. OOP is not really adding anything to them that non-OO API's didn't have.

Agreed. How does that relate to the subject of this page?

I suggest we call this topic ResultSetsCanBeObjects?.

It is true that result sets can be objects but that isn't the subject of this page. Read the opening paragraph again.

I am not sure what you are getting at from a practical standpoint.

The argument on TablesAndObjectsAreTooDifferent can be safely ignored. Treat tables as any other externally imposed requirement. Writing code to deal with the RDBMS like you would a printer or mouse. If they present a relational table view of data, write your code to interact with that view. There's no more need to reinvent the database than there is to reinvent the printer. Ask it for data. Tell it things. Tables and objects are different, but we use objects because so many things are so very different. Different is not a threat.

It is when the choices are not orthogonal. Do we use bags of object pointers to represent one-to-many relationships, or the database, for example.

If you need a database, use a database. If you need bags of pointers, use bags of pointers. Use whatever works.

Nobody can ever agree on when to use what. They choose based on some internal inarticulatible feeling.

People agree every day on when to use what. The ones I've worked with use articulatible math to decide which one.

I would like to inspect such "math".

Read a ChrisDate book.

Chris talks about decision math to determine when to use bags of pointers instead of relational tables? (I suspect there has been a text mixup here)


How is this different from an ObjectRelationalMapping? Applications do this all the time; the whole reason the ObjectRelationalImpedanceMismatch is a problem is that this takes a significant amount of work for any project, and people don't like having to do it over and over again. It seems like the suggestion described here is simply common practice for anyone that works with a database in an OO language. -- JonathanTang


As an experiment, I wonder what would happen if an OO system forced relational rules on objects/classes. Not just "can be", otherwise one violation can bust the whole match. For example, toss inheritance, classes define entities, there are "permanent" classes and "work" (temporary) classes whose objects don't have post-application persistence. Objects would be immutable other than perhaps internal temporary variables. If one sticks with these limits, then relational and OO may get along better. I doubt many OO fans would go for it though. It might help to find out when and why they would want to violate the rules, and see if relational is limiting them, or there is a fundamental philosophical disconnect.


I have tried working with database libraries that mapped each class to a single table, and disallowed inheritance for classes that used a table as backing. In other words, the part of the application that used the database was not allowed to use inheritance for refinement. This has led to an implementation of "inheritance by proxy" - instead of doing inheritance from a virtual base class (which was what would be "OO-wise correct"), people implemented proxy methods in the parent for all methods that had to be forwarded, and associated a separate object that it could be proxied to. A poor man's vtable, effectively.

The programmers that did this are *not* OO-fanatics - the lead for the project has never been very keen on OO, but switched to it for this project due to all the 3. party APIs he used being OO, and he wanted to avoid mixing styles too much. --EivindEklund (feel free to anonymize for refactoring)


However,

Many database experts (including ChrisDate and HughDarwen) claim that equating tables/record with objects is a mistake; and that the proper way to include objects in a RelationalDatabase is as attribute values/domains. In this view, a database is a collection of relations (tables and views); a table is a set of records, and a record is an associative array mapping strings (attribute/column names) to objects. (And a view is function in the RelationalAlgebra or RelationalCalculus allowing one to derive compute new relations from other relations).

See DateAndDarwensTypeSystem, TheThirdManifesto.


The above seems to attempt to use the authority of ChrisDate and HughDarwen to "invalidate" other points of view. As far as I can tell, ChrisDate does not have OO experience (see the ChrisDate page for background). As a such, I find the discussions between dual practitioners here in this Wiki to have a value that is orthogonal to ChrisDate's work, and believe that the views here can be of more relevance than ChrisDate's view, even though he is strong in the database area.

From my experience in doing work in the intersection between databases and OO (and I've spent a couple of years in the area): I'm not sure there exists a "proper way to include objects in a RelationalDatabase" - this seems to me to be (usually) the wrong way around the problem. Databases are convenient for handling large amounts of data data (depending a bit on which database), and inconvenient for doing substantial programming in. This is likely to stay true, as the design goal for databases is to be convenient to store data in. Programming languages are convenient to do substantial programming in (depending a bit on which language), and inconvenient for handling large amounts of data.

I've mostly worked with databases with a web interface, mapping one class per table, one object per row, and abstracting the simplest classes of queries into methods with parameters. The object / row mapping has worked reasonably well, with a few cases where there "should" be inheritance, leading to ad-hoc implementation of inheritance. What has made serious "problems" has been the second-class status that queries ended up as - there has been no way to combine or refine queries. This has led to problems when we've e.g. changed a table from having only one way to look at the data to having different contexts that data should be available in (represented by a separate many-to-many relationship), Instead of passing in a suitable relation (made by a suitable select + join from the two tables), we've had to hardcode the relationship into every single query that referred to the original table and were to be restricted.

My (temporary) conclusion is that TablesCanBeObjects (Really: Tables can be classes, rows can be objects), and that generally works, but occasionally inheritance problems will crop up. However, this only covers one area of the problem. A separate issue is that queries/relations needs to be first class objects to give good factoring to the OO program that run atop the database. At the moment, I hang towards that being a minimal "query language" based on the relational algebra and generating SQL, but I haven't yet finished an implementation to test how that works out in practice.

-- EivindEklund (but feel free to anonymize for refactoring)

I think you've encountered some of the same problems with rows-as-objects as I have, but I consider them more serious than you do. No doubt perception tends to be colored by the applications one works on.

What has made serious "problems" has been the second-class status that queries ended up as... This is a serious problem for some applications. In many applications, and I suspect in most data-centric applications that have matured and added features beyond basic functionality, there will be many more ways of viewing data than of editing it. Making queries harder to use will have a big impact on development.

You don't mention why you are encapsulating the rows as objects. As a convenient method for reading data from the database? Or are they enforcing database constraints (aka business logic) and serving as your write path to the data? You've identified some of the shortcomings to this approach; what are the benefits, in your case?

-- DanMuller

We use objects-per-row both for reading and writing. We do generally not enforce database constraints at this level; they are, for better or (in my opinion) worse mostly enforced at the presentation level. We also hang a lot of logic off the objects. Looking at one of my classes here (for a layout of boxes, created by humans), it can

Also, there is a separate implementation of the same interface that does the exact same job for tabular layouts.

The primary advantages of the present mapping (one row - one object) I can think of are

Note that my experience with this mapping has only been in settings where the developers are also the DBAs, and the amount of legacy code tying down the database structure is fairly low. The tradeoffs in other scenarios might be totally different.

On a separate (but related) note: My relational algebra implementation for Ruby started doing cross-table natural joins yesterday :-) (You see, I'd really like to be able to handle those queries better, too...)

--EivindEklund (but feel free to anonymize for refactoring)


Possible Functional Ties

If one does a join or filter, what object are they asking? Do the objects represent actual tables, or views (including joins)? Relational does not, or should not, make any notational distinction between them. Relational can be viewed from a functional standpoint of taking a table or table view combined with possible expressions, and results in a new view (virtual table). Perhaps we should look at the bigger picture of how OO emulates or functions as functional programming. For example, how is this done in OO?:

  virtualTable3 <-- join(virtualTable1, virtualTable2, expression)
  virtualTable4 <-- union(virtualTable3, virtualTableOther)

(Virtual tables 1 and 2 may also be real tables. True relational does not care.)

In general, if we have:

   t3 = op(t1, t2)

If t1 and t2 are objects instead of tables, which object has the responsibility? Do we arbitrarily pick one?

  t3 = t1.op(t2)

Or

t3 = t2.op(t1)

What if t3 has properties of both t1 and t2 such that single inheritance cannot be used? For example, t3 might have a union of the columns of t1 and t2.

I wrote a relational algebra in Java. I provided both class and object methods for join and union, so the user could call

  t3 = op(t1, t2)

Or

t3 = t1.op(t2)

Or

t3 = t2.op(t1)


See also: TablesAndObjectsAreTooDifferent, MultiParadigmDatabase


EditText of this page (last edited February 6, 2012) or FindPage with title or text search