When Do Schemas And Classes Deviate

When and why should class structures deviate from relational schemas? The controversy over hierarchies and subtypes is relatively well known, but outside of that, what are specific cases or general rules for why and how they differ? Let's also exclude or separate cases where the schema is poorly designed. For the sake of scope, let's consider "good" schemas and "good" class designs.

For some loose definition of "good" :) In practice, they do differ in some places.

Database and schemas differ from each other by their nature. Intuitively, of course many people will understand what you are what you're trying to say, but trying to formalize when we call them "the same" (or some other good term, maybe let's call them "in agreement": the class definitions and schema definitions are in agreement), may shed some useful insights on the subject.

1) We shall depart from the simplest case: a schema with one table versus a schema with one class. Let's further assume that the class represents some domain of interest for the business, the same as the table. Let's say a STUDENTS table versus STUDENTS class. Intuitively, we'll say they are in agreement when all fields of the class map one to one with the columns of the table. The mapping ideally conserves the names or applies a simple bijective transformation of the name, the same about the types. For example, studentID in the class becomes STUDENT_ID in the table, the type String in Java becomes VARCHAR2(20) Oracle or the type int in java becomes NUMBER (38,0) in Oracle.

Are we speaking theoretically or practically? There is no theoretical reason that an OO language and a relational database can't share the same set of (immutable) ValueObject types - the domains in the relational case. Just because SqlLanguage blows doesn't mean its warts should be representative of relational if we're talking theory. The original question asked "object" vs "relational" and not "Java" vs "SQL", so I presume this is a fair question.

Even if you speak theoretically, you won't be able to get to very different systems to operate under the same type theory. Just think ML and Haskell and SML and Scheme. Do they perfectly agree on the basic building blocks of data structures (excluding functional values, modules, etc., - just plain old data). Is it realistic to assume that at some point they will agree? Practical considerations are also theoretical considerations.

VARCHAR2(20) and String are different domains, one cannot hold the values of the other (and, depending on charset, vice-versa). The same applies to NUMBER/int. It is usually not possible to get a one-to-one mapping between the domains. This means, that there can be values in the db, which you cannot represent in the program (or vice-versa). A solution would be to a) Create custom type classes (say StudentName?, which ensures length) or b) ensure the conditions is the class. In the first case we already deviate by introducing a class, that has no direct correspondence in the db.

Unless I misunderstand... Date and Darwen seem to think that there's no issue with having aggregates - even tables - as domain values, which would be one relational way of doing this sort of composition.

This 1.b observation was more from a practical point of view as of 2004. From a theoretical point, you may still have 2 different type systems. So we go back to a maping between the types of the two different type systems.

2)


I could swear we've been down this road before, many times, and they never seem to result in any illumination. So, I'm going to try and recap briefly and hopefully settle this before it becomes sprawling ThreadMess:

-- JonathanTang

Shall we try to synthesize something helpful, though? I mean, we could repeat the whole damn enchillada from DbDebunk or the usual suspects on comp.database.theory.

''One example: not everybody in the "relational camp" agrees that FirstGreatBlunder is actually a blunder.

Another example, classes organize code + data (not only code), more so in languages like Java and Smalltalk where the only way to structure data is to create classes and instantiate objects. As such, the prompting statement of this page is justified: when the structure of data provided by classes in OO code be divergent from the structure of data defined in the relational schema? D&D's standard answer that one should map classes to database domains (doing otherwise would contradict the thesis put forth in FirstGreatBlunder) is unsatisfactory.

Yet another example: classes support constraint not with extreme difficulty but rather extremely easy. For example do not provide mutatirs provide only constructors, and you only have to check the invariant upon object constraction. Alternatively: do not provide individual mutators, provide only state transition methods that take the object from one valid state to another. In a language like Eiffel, you even have language support for invariants, in AOJ you can easily add it automatically, in Java or Smalltalk you can add it manually. At most you can say that majority of OO developers are not aware or do not practice these techniques.

And last but not least, your bold claim: "Schemas do not support inheritance. Trying to introduce it essentially breaks the RelationalModel.". I find it very hard to support it. It follows from where?

Generally relational "thinking" tends to believe there is no real value in schema inheritance. The philosophy is generally that sets make a better, more flexible classification system than hierarchies. (See TableInheritance)

[That does not address "essentially breaks the RelationalModel"; it only argues "no real value" and "better".]

Actually, it appears the relational model does not really address how schemas are created. Thus, whether they come about via inharitance or hamsters on a wheel is not really its concern. Now value inheritance may be a different issue.

[Well, ok, if that's your new position. But then you appear to be abandoning your previous position that "Trying to introduce [inheritance] essentially breaks the RelationalModel", yes? If so, go ahead and make the appropriate edit.]


The question presumes that classes should mirror schemas or vice versa. I've come to believe that this is a flawed assumption. It seems like relations ought to be classes, but in practice they often aren't. Outside the realm of CRUD screens, isomorphism between the two is just a happy accident. -- EricHodges

But is there a pattern to when they happen to match and when they don't?


EditText of this page (last edited December 4, 2006) or FindPage with title or text search