Oo Vs Relational

Isn't debating OO vs relational like debating LinkedLists? vs namespaces? Two totally orthogonal ideas..

I have always viewed OO as a tool for data hiding and for binding data and the fundamental operations you can do on that data together so that you never end up with a situation where data ends up in a nonsensical state. Databases, of any sort, are just data structures. Sure, there's been allot of work in making these data structures accessible over the network, making them fast in many useful respects, making them persistent, creating a common vocabulary to allow these data structures to be accessed by a variety of languages and environments in similar ways, to fix most synchronization problems etc... Ultimately they are a data structure. The OO part would be defining the interface / interfaces to use this data structure and preferably enforcing this. Really, this is not unique to objects, it's just good programming. Objects are a tool that can allow for the expression of more formal way of accessing the data structure. Oddly enough, a typical query language does a part of this. Your typical query language is a way of specifying which kinds of operations you want to do on the database. It abstracts the implementation of the database from the operations you want to do on it. This is a great idea but it's ultimately only a start. On any project of size, you'll need to make your own application domain specific meta-query-language, which will typically take the form of functions in the language you are interacting with the database in. This manifests itself as something like this ->

Vector delagates = database.getListOfDelegates();

instead of

result = database.query("[People]","[People]type = "+DELEGATE");

Vector delegates = new Vector(); while (result.isMore()) {

    delegates.addElement(result.get(1));
    result.next();
}

This is, of course, a trivial example. A case of pure code reuse. Hopefully every one is wrapping repeated code like this in functions already. Even this can be handy, however, if the implementation of the database changes.. So now instead of there being DELEGATE, there's also a type SUPER_DELEGATE which happens to be a delegate and super hero. Since everyone is using the function, no client code has to worry about it. Why you'd want to to this is irrelevant, since the point is to show you can abstract the implementation or the data model from the interface used to access it. This is an OO principle.

You are changing the meaning if you add another "set" to the list, not mere implementation. This may result in unprectictable results. Besides, under proper set theory, "super-delegate" may already be a member of set delegate. We would perhaps have to know about the domain and its set management system to know if we are creating problems. Anyhow, in the spirit of YagNi, I don't recommend wrapping until 3 or repeat uses. Wrapping everything for the sake of wrapping can create messy, verbose code. Anyhow, this probably belongs under ScatterSqlEverywhere.

If we also start to add functions that don't exist for code reuse purposes, but instead exists to mitigate access - to preserve database coherency then we start to have a database that has been designed using OO principles (which also happen to be just plain good, scalable design principles). If we put all the database assessors together, preferably in a common namespace. We're basically all the way there..

What do you mean by "together"? You mean grouped by entity (assuming there is only one appropriate PrimaryNoun), or all DB accessors in one giant file/module? There was a big debate about all 3 options (low wrap, entity-grouping, central) about 6 months ago.

Nothing so far conflicts with relational databases.

Humm, so what about comparing a relational database data structure to an ad-hoc data structure designed and built using an OO language? Well, you can create bad data structure in any language. Really, that debate is not an OO vs relational but a design question. Which data structure.. which tools do you want to use.

So anyway, I don't understand where this comparison comes from.. unless there are programmers out there who think programming consists only of writing glue code between some interface API and some pre-packaged backend.

- AdderTheBlack?


moved from PersistenceLayer

I don't see databases as being just about persistence. They are a tool for managing and processing state. Persistence is just one of many services they provide (or can provide). I would use them even if there was no disk I/O allowed. Wrapping them just because they happen to use disk seems like a waste of effort unless you know you are going to switch DB vendors anytime soon. There are some big HolyWar topics on this, by the way, so perhaps we should consolidate all the topics battling over database wrapping.

Covered in more detail at DatabasesAreMoreThanJustStorage.


But persistence is the best important thing DBMS want to deal with.

If disk and RAM merge or become indistinguishable, I don't think DBMS's will die. For example, their query abilities and the ability to share data with multiple applications and paradigms and manage user contention (transactions, rollbacks, etc.) are still very popular features.

Why not separate persistence aspect from other aspects? There are many other applications that can do other things, like EJB, EOB, CORBA, WebServer.

The very concept of PersistenceLayer implies a focus on an application which needs persistence for its data. Some applications are like that, but many projects revolve around shared databases with many independent and relatively small, unimportant and often short-lived applications that must perform specific queries with a specific transaction structure using a more or less custom DatabaseAccessLayer.


I don't know, dudes, it just seems like some people like databases and others hate them. It will probably never change. It would be interesting to understand the psychology of such preference. I have not been able to figure it out after years with similar debates. (Most of the above text is not mine, I would note.) Databases and declarative approaches just seem to bother some, same way that OO bothers me. -- top


Re: "One does not have to use the database as a big global variable, one can have each app manage it's own data and itegrate the apps rather than the databases."

I am curious how stuff is divided up under that approach. I would assume that the various apps would be "responsible" for given entities/nouns, and one would have to request information from the responsible application if they want info about a given noun. Am I close? I suppose it is possible, but you are essentially reinventing a database of sorts from scratch. For example, if you have to cross-reference information from entity A with entity B from a different app, you may have a lot of work to manually program such a Join operation, manage indexes (lists of pointers), deal with concurrency, etc. In other words, ResponsibilityDrivenDesign does not do InterfaceFactoring well with DatabaseVerbs. -- top

One way stuff is divided up under that approach: By responsibility. Each component is responsible for a specific set of behavior. Components interact by making requests of one another. There is no need for shared data.

This approach helps simplify large, complicated problems. It can also streamline throughput. Instead of all data flowing into and out of a single point, high level commands flow to components that send lower level commands to other components.

It's at the core of object oriented programming. Communicating through a single shared block of memory is anathema to OO design. That's probably why many OO folks dislike using databases in that way.

-- EricHodges

They dislike it because they don't really understand it. We'll have to face some very basic foreign keys integrity issues. We'll set up some WebServices/messaging ad-hoc stuff rather than just use the trivial features available in databases ? Specific set of behavior says nothing as fae as any pair of application systems (say accounting and human resource) do share some data by the very nature of the business. If one app system is using Java + Prevayler and other Gemstone + Smalltalk, only then you've created yourself oine big integration problem. DoTheSimplestThingThatCouldPossiblyWork anyone ? How about DoTheThingThatMightWorkWell ?

So let's see how could we use that big scary shared database without global variables. HR will use the HR schema, Accounting will use Accounting Schema (both on the same DBMS or on distinct nodes -- two DBMSes) HR will either expose a master snapshot with valid Employee data needed for accounting, or it can even expose a table/view on which Accounting has only select but no update rights. What's the very big deal about that ? --Costin

The big deal about that, Costin, is that my company is not going to expose a master snapshot or view of its HR database to your company. That's the problem solved by WebServices. And would you please stop with the "they don't really understand it" bit - that's neither necessary nor helpful. Contrary to what Eric surmises above, I like databases just fine, including relational databases, and I think I understand them well enough. --RandyStafford

P.S. The protagonists on this page may be interested in the points of view expressed on DatabaseApplicationIndependence.

That's a different problem to be solved. And my company won't expose much HR functionality to your company nor accounting functionality for that matter -- webservices, snapshots or whatever. But the claim behind all this hoopla was that "shared databases are a bad idea to begin with". Since we count on your understanding of databases, I take that you'd have some exceptions to that position, wouldn't you ? --Costin

I suspect what happens is that databases are initially designed to support one application, and the appliction assumes ownership of the database, feeling free to cache, refactor schema, etc., as need be. But then over time additional "applications" are added as dependents on the database, perhaps to produce reports or perform analyses. At some point a line is crossed where another application begins committing to the database in addition to the original application, complicating or precluding caching strategies, and the general accumulation of dependent applications begins to act as a serious drag on refactoring agility.

Perhaps it would have been "better" to maintain the encapsulation of the database by the original application all along. It's hard to say without an experiment - all we have are anecdotes - and the definition of "better" is context dependent. Like RalphJohnson on DatabaseApplicationIndependence, I'm skeptical than an organization can design a schema up front that is omniscient enough and robust enough to support most of its applications over many years of operation.

Instead I suspect schemas grow by accretion, and precisely because it's painful to change dependent client applications, they grow by addition instead of changing to reflect new insight. More views get added, more triggers get added, more ETL machinery gets added, to the point where they eventually become a BigBallOfMud and nobody really knows what the original semantics were. Of course, the same thing can happen in any codebase, but I think it's easier to refactor schema the fewer dependent applications there are. Re-reading Eric's comment, I think I overlooked the "in that way" part - meaning, using databases to integrate applications. We're actually doing that at my company, not by a big shared schema, but by a set of "staging tables". It works fine for the purpose, but it's kind of crude. I would prefer a messaging approach. I think the reason that approach was chosen in the first place is because one of the applications being integrated is heavily PL/SQL based, and doesn't have a good messaging interface. --Randy


Re: "Communicating through a single shared block of memory is anathema to OO design. That's probably why many OO folks dislike using databases in that way."

Focusing on just performance issues, why wouldn't such a solution also be avialable to a RDBMS? Relational (ideally) does NOT dictate implementation. One could split each entity to its own server, and in some cases big shops probably do that sometimes. However, it may complicate things like joins and AcId for multi-table transactions. Generally the nice things about RDBMS is that Oracle and IBM worry about such things so that developers and to a large extent DBA's don't have to. It is about reuse of attribute/state/fact management solutions. OO'ers seem to want to reinvent all that stuff from scratch for some reason. I suspect it gives them a sense of control, even if it is more work. (Eventually link-move this to AreRdbmsSlow.) -- top

"Developers no longer have to worry about it" is one of my AlarmBellPhrases. It usually means you have to understand a lot more than you would have had to if the vendor had not tried to "help" you. We have to be careful using this "single shared block of memory" analogy. I think the point Eric is making is about encapsulation - do applications encapsulate databases - or are the innards, the database schema, exposed for the world of applications to see? I think that's the philosophical divide. When it comes to performant implementation, big blocks of shared memory are crucial - that's what the System Global Area in Oracle is, and that's what the Shared Object Cache in GemStone is. What's anathema to me is unnecessary distribution, e.g. a separate server per entity. I'm not interested in reinventing, or "control" - I'm interested in DomainDrivenDesign and the "best" way to persist the state of my DomainObjects. -- Randy

Using encapsulation to "protect the innards" is a myth I have decided after several debates on that issue. See GateKeeper and DatabaseNotMoreGlobalThanClasses. Especially notice "Widest Method Access Principle" near the bottom of GateKeeper. Any protection offered by OO almost always is based on convention in the end rather than the machine forcing stuff on people. And, any paradigm can rely on convention to "do stuff the right way". Further, I am not saying relying on the vendor is the ideal situation, but usually a hell of a lot simpler than reinventing it from scratch just for the sake of (perceived) control. Oracle probably has lots of bugs, but probably far less than a home-rolled solution, and easier to find experts in also. Software is getting to complex and needs too many changes to create the whole state management thing from scratch. -- top

No one wants to create state management from scratch, we just want to use the database for state management, you want us to do everything under the sun with it. Programs have state, databases are storage for that state, what's so complicated about that?

OO forces you to have TWO layers of "storage": one for RAM and one for disk. With a RDBMS one can forget about such distinction. (Sure, we have local variables, but they only last for the current task and then are gone.) We talk to a single "state machine", not one for RAM and one for disk. OO designs go nuts trying to deal with the duality. Plus, databases handle things like indexes, joins, ad-hoc queries, aggregation, and concurrency out-of-the-box. You guys like reinventing it from scratch brick by brick. Those things don't come with OO and attempts to simply inherit them all turn your system into a NavigationalDatabase, so you cannot escape from a database one way or another. GreencoddsTenthRuleOfProgramming. We don't have to have giant lists of pointers in arrays. The DBA can add a new index without having to change a single line of app code, but OO "indexes" become part of the app. -- top

Some folks can forget about such distinctions. Some folks can't. It depends on what you're doing.

How do you know ahead of time what database-like needs will come up and what won't?

You don't. If you start without a database and later need one, you add one.

Some are no doubt going to suggest that it is less change risk the other way around. Which is the "default" probably depends on whether you think databases are annoying or living without a database is annoying.

And writing code to handle indexes, joins, ad-hoc queries, aggregation, concurrency, etc., isn't all that difficult. -- EH

So it is efficient if everyone is their own Oracle making their own conventions and protocols and Oracle and DB2 are just ripping people off for billions? But a bigger problem is that newly hired developers have to learn entirely new protocols and conventions for doing such. They go to your shop and have to learn the Eric Hodges State Management and Query System, and go to another shop and then have to learn the Dr. Foo's State Management and Query System. That is not good reuse. -- top

Who said anything like that? [reinvent Oracle] -- EH

That is the implication I get. Even if by chance it was easy (I doubt it), it creates inconsistency because every developer will do it very different. OoLacksConsistencyDiscussion. RDBMS are simply OnceAndOnlyOnce applied to commonly-seen state and attribute management needs. OO does not factor them together in any identifiable way, letting each developer/shop cowboy-up their own. -- top

And are you really saying that if a [OO] program has those features then it becomes a navigational database? -- EH

In a round-about way, yes. It is integrated inside the application such that the distinction between database and app code is blurred. OO likes to do that, calling it "encapsulation". -- top

So if Doom has code that performs joins, creates indexes, does queries, etc., then Doom has become a navigational database? What's the point of that classification? It doesn't mean you can implement Doom inside Oracle or anything. -- EH

It is "has", not necessarily "is". I bet Doom does have navigational structures inside of it, and I expect the more complicated role-playing games will eventually use lite RDBMS because keeping track of all the rules and attributes using dedicated structures and pointer hopping is a maintenance nightmare. (In fact, I have been kicking around the idea of such in my head even tough I am not much of a gamer.) -- top

"Those things don't come with OO and attempts to simply inherit them all turn your system into a NavigationalDatabase, so you cannot escape from a database one way or another."

That is "is" not "has". What's the point of this categorization? -- EH

{Is it even possible to "simply inherit them"? How does a class inherit things like joins?}

Why is it that a page about using something other than a database can't seem to exist without being flooded by Costin and Top about why we must use databases instead? None of these arguments have anything to do with the topic of the page, go preach about databases on DatabasesAreMoreThanJustStorage, because a persistance layer is about using a database for just storage. A persistence layer is an OO technique for storing objects, that doesn't mean the page has to devolve into more OO vs relational HolyWars, it's an OO topic for Christ's sake, let it be. Let's talk about persistence layers, or is that too much for you Costin?

Now whoever wrote that note is not paying attention to the discussion. I never touched ther page before the gratuituous claim was made that:

This is a bad claim and should be debunked altogther on the grounds that is is without merits, technically unsound and harmful. We can't improve the value of wiki by letting baseless claims stabnd as if nothing happens. Would you let it pass a claim that OO is bad idea to begin with and we should all be using assembly instead ?

Then don't misquote it, you're being intellectually dishonest. No such claim was made. What was said was.. "many think using shared databases with many applications is a bad idea to begin with". This is a true statement, many do think that, it's irrelevant whether or not it is a bad idea, for it doesn't invalidate the truth that some think it is bad, others think it's not, so what?

Costin, are you denying the fact that the webservice based approach has many followers?

If you say no, then the statement that many believe is true, it's that simple. You want to fly off the handle because you think that saying many is handwaving, when in fact, it's a simple statement of fact. Here's two more simple statements of truth. Many believe that application integration should be done via the database. Many believe than application integration should be done via messaging and not through the database. Both of those statement are true, and require no further explanation, so quit bitching, because it doesn't matter how many are on either side, the number isn't relevant to the simple truth of those statements.

It isn't handwaving to simply point out that there are two sides to an argument and some believe this and some believe that. I'm not required to justify both positions just to point out that they exist.


see ObjectRelationalPsychologicalMismatch, AreOoAndRelationalOrthogonalDiscussion, ArgumentsAgainstOop, RelationalHasLimitedModelingCapability


EditText of this page (last edited February 18, 2006) or FindPage with title or text search