Persistence Store

An officially sanctioned and widespread AntiPattern. It's been advocated as a pattern by ScottAmbler, KyleBrown, MartinFowler and all the usual OO suspects. The J2EE literature is particularly keen on promoting it. The high point of this seems to be the ReinventedWheel commonly known as EQL.

Basically it goes like this:

So you have this incredibly versatile, very complex and very refined piece of machinery called RelationalDatabase. A RelationalDatabase at its core is basically a logic engine (although you can call it a Prolog--) with transactional support.

However, as it turns out, in modern OO architecture this crucial and very valuable component becomes the lowest layer in FourLayerArchitecture (or something similar), and the official OO propaganda tells you that developers should talk only in terms of domain objects, object model, etc. And above all, "they should be isolated from the low level PersistenceLogic". The database stays somewhere in the background with the sole mission of loading and saving objects.

The end result is that your fancy database becomes a glorified file system (aka PersistenceStore). It's a waste of money and opportunities. If you don't suffer from the usual OO prejudice against SQL and stored procedures, many use cases can be solved with an order of magnitude less effort in LOC and will run an order of magnitude faster. But of course, you'd have to break serious OO dogma to do the SimplestThing.
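To make the LOC claim concrete, here is a minimal sketch using Python's standard sqlite3 module. The order_line table, its columns and both function names are invented for illustration; they are not from any system discussed on this page. The first function treats the database as a persistence store (load objects, aggregate in application code), the second lets the database do the work.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class OrderLine:
    """Hypothetical domain object, one per row of order_line."""
    customer: str
    amount: int


def make_order_db():
    """Build a small in-memory database with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE order_line (customer TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO order_line VALUES (?, ?)",
        [("alice", 10), ("bob", 5), ("alice", 2)],
    )
    return conn


def totals_via_objects(conn):
    """Persistence-store style: load every row as an object, then sum in code."""
    lines = [
        OrderLine(*row)
        for row in conn.execute("SELECT customer, amount FROM order_line")
    ]
    totals = {}
    for line in lines:
        totals[line.customer] = totals.get(line.customer, 0) + line.amount
    return totals


def totals_via_sql(conn):
    """Let the database aggregate; only the result rows cross the boundary."""
    return dict(
        conn.execute(
            "SELECT customer, SUM(amount) FROM order_line GROUP BY customer"
        )
    )
```

Both functions return the same mapping. With three rows the difference is cosmetic; the argument above is about what happens when the table is large and the object version has to drag every row across the wire first.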

Another aspect is that transactions between the OO layer and the relational database tend to take a particular two-stage shape. The first stage contains only reads, which load data into an object graph; the updates and inserts all come at the end. Well, this spells trouble: increased probability of deadlocks, horrible run-time performance, decreased concurrency. These are the traditional evils that OO programmers have complained about when doing projects with relational databases, and of course they blamed them on ObjectRelationalImpedanceMismatch.
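The two-stage shape can be sketched as follows, again with Python's sqlite3 and an invented account table. This only illustrates the shape of the transaction; an in-memory SQLite database with a single connection will not reproduce the deadlock or concurrency problems, which depend on the locking behaviour of a multi-user server.

```python
import sqlite3


def make_accounts():
    """Build an in-memory database with an invented account table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, active INTEGER)"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?, ?)",
        [(1, 100, 1), (2, 200, 0), (3, 300, 1)],
    )
    conn.commit()
    return conn


def add_bonus_two_stage(conn, bonus):
    """Stage 1: read rows into the application. Stage 2: write them all back."""
    with conn:  # commits on success, rolls back on exception
        rows = conn.execute(
            "SELECT id, balance FROM account WHERE active = 1"
        ).fetchall()
        updated = [(balance + bonus, account_id) for account_id, balance in rows]
        conn.executemany("UPDATE account SET balance = ? WHERE id = ?", updated)


def add_bonus_set_based(conn, bonus):
    """One set-based statement: no read phase, no round trip per row."""
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE active = 1", (bonus,)
        )


def balances(conn):
    """Return {id: balance} for checking results."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

The two functions leave the table in the same state. The second one never holds application-side copies of the rows between a read and a later write, which is the window the paragraph above is complaining about.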

To break out of this vicious cycle, the thing you have to do is the SimplestThing: don't assume that for every use case you have to "load" your nicely crafted slice of the object graph and move bytes around using ModelViewController. Instead, for most of the use cases in a typical project, the SimplestThing to do is PutTheDamnDataOnTheDamnScreen.
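A minimal sketch of what putting the data on the screen might look like, assuming Python's sqlite3 and a plain-text "screen". The render_report function and the order_line table are hypothetical names made up for this example. The query result goes straight to the display with no object graph in between.

```python
import sqlite3


def make_report_db():
    """Build an in-memory database with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE order_line (customer TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO order_line VALUES (?, ?)",
        [("alice", 10), ("bob", 5), ("alice", 2)],
    )
    return conn


def render_report(conn, sql, params=()):
    """Run a query and format its rows as text, headers taken from the cursor."""
    cur = conn.execute(sql, params)
    headers = [col[0] for col in cur.description]
    lines = [" | ".join(headers)]
    for row in cur:
        lines.append(" | ".join(str(value) for value in row))
    return "\n".join(lines)


REPORT_SQL = (
    "SELECT customer AS name, SUM(amount) AS total "
    "FROM order_line GROUP BY customer ORDER BY customer"
)
```

Calling render_report(make_report_db(), REPORT_SQL) yields a header line followed by one line per customer. Changing what the screen shows means changing the query, not a class hierarchy.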

OO "bigots" (MartinFowler's self-assumed characterization) might tell you that your business logic will become brittle, that you'll create a dependency between the more abstract and high level "object model"/"application model" and the "low level" PersistenceLogic. Don't believe them:

Microsoft is building an ObjectRelationalMapping tool. It's called ObjectSpaces.

It's true that the persistence store idea often results in an interface that's designed for the lowest common denominator between all the possible implementations. However, there's a difference between saying that if you have a database you should strive to take advantage of all the benefits it offers over a file system and saying that you should have a 2 tier system.

If you have 2 tiers and your business logic ends up being duplicated, you now have two sets of code to maintain. This means all changes to business rules must be exactly duplicated. Add to that the tight coupling between the presentation tier and the persistence tier. If you've worked on an ASP project, you soon realize that scattering SQL throughout your HTML is a recipe for disaster unless you make an arbitrary decision that some ASPs will be used to talk to the database whilst some ASPs will only do presentation. You basically end up faking MVC.

I'm not sure MartinFowler and co should be singled out for this. To my knowledge, the whole issue didn't arise until Sun et al. invented J2EE. The particular source of pain was the entity bean. Most shops now ignore entity beans except for simple session state serialization, as they are too heavyweight for the job they were designed for, and architecturally deficient for any increased role. The problem is, very simply, that shared updatable data should be in only one place unless you really, really have to distribute it and you can take the resulting pain, wallet-wise or performance-wise.

Mixing SQL code into your business logic is just wrong. Put it in stored procedures, or its OO domain object equivalent. Similarly, don't put your business logic into your SQL code. No need to do this any more. We have application servers now.

As for EQL, what was wrong with SQL again? You're going to generate it anyway, so why not just generate it? We have cheap tools for that. You can run them every time you change persistence mechanisms (like that ever happens). Use polymorphism to associate the appropriate dialect of SQL with a particular database if you need more than one. I agree that Mr Gates has taken the path of least resistance on this one. .NET will perform well because it matches the software architecture to the underlying data architecture. I never thought I'd say that :).
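The polymorphism suggestion might look something like the sketch below. The class and method names are invented for illustration; the only facts relied on are that PostgreSQL limits rows with a trailing LIMIT clause and SQL Server with SELECT TOP. Table names are interpolated directly, so this assumes they come from trusted code, never from user input.

```python
from abc import ABC, abstractmethod


class Dialect(ABC):
    """Hypothetical base class: one subclass per database product."""

    @abstractmethod
    def first_rows(self, table: str, n: int) -> str:
        """Return SQL selecting the first n rows of a trusted table name."""


class PostgresDialect(Dialect):
    def first_rows(self, table: str, n: int) -> str:
        # PostgreSQL puts the row limit at the end of the statement.
        return f"SELECT * FROM {table} LIMIT {int(n)}"


class SqlServerDialect(Dialect):
    def first_rows(self, table: str, n: int) -> str:
        # SQL Server puts the row limit right after SELECT.
        return f"SELECT TOP {int(n)} * FROM {table}"


def preview_sql(dialect: Dialect, table: str) -> str:
    """Calling code stays the same whichever dialect is plugged in."""
    return dialect.first_rows(table, 5)
```

Swapping databases then means constructing a different Dialect, and the generated text is still plain SQL that you can read, log and tune.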

Don't entity beans still have a role as in-memory data caches?

That would be a good fit to their architecture, explicitly separated data. I can think of a couple of others too, such as non-shared large-grain persistent session data (Ye Olde Shoppinge Carte) and as persistence marshalling points (as suggested by message beans).

That really reduces them to a persistence mapping layer, however, which they seem to do in a very clunky way. I'm guessing Sun/IBM et al are hanging on to the idea of resaleable non-code components. If you've got the source, and you need to share your data, then a 2-way O-R mapping product, or just hand-coding, makes more sense IMO.

Code is more malleable than data, and so often code should follow the data, not the other way round. One point of unease I feel is that I have often needed to mess with data structures to achieve performance. I'm not sure how I'd do this in EJBs since they seem heavily biased to forward mapping from objects to data. This may just be my ignorance though, so don't quote me :).

I tend to recommend EntityBeansAsInMemoryDataCaches when people need transaction-aware caches that can be clustered. There's also room for flexibility because CMP can be used for the simplest entities (the ones that really do map on to just one table) and BMP can be used for the more complex entities where the advantage of caching is more important than speedy access to the database.

And, as I remind people in my classes, nothing stops them wrapping entity beans around stored procedures, especially in organizations that have more DB experience than Java experience. They then have to live with the duplication of their business logic, though some people find that an acceptable trade-off. -- AdewaleOshineye


Last edited December 30, 2008