Ejb Query Language Discussion

This short story, on EjbQueryLanguage (after all it's the only query language defined in less than 20 pages ) exemplifies an unfortunate conjunction of antipatterns as follows: CommitteesDontRead, CommitteesDontCode, and last but not least SoftwareEngineeringVsComputerScience. Maybe PraiseOfFolly also applies, since the new version of CMP is already acclaimed by the crowds and seen as the great promise for the future of EJB. On the positive side it could be a good case of JobSecurity for senior software architects that will be busy modeling the world in entities, and for the vendor consultants that will be busy trying to optimize the whole enchilada after all is said and done.

But as a practical matter, for anyone doing EJB projects and trying to get the job done, you're a lot safer using StatelessSessionBeans either with straight hand-coded JDBC or with a decent ObjectRelationalMapping product.

The above is wrong, imho. CMP in EJB 2.0 is often significantly more productive and performant than straight coded JDBC when involved in business transactions. (Reporting is naturally easier in JDBC). Obviously you'll need a good ObjectRelationalMapping product, but that's the whole point of CMP 2.0 - it provides a coding standard for them. You're belittling all those people that think it has promise by suggesting it's for JobSecurity. Since I'm one of those people whose people in support of CMP, I'm calling you on that. How is modeling the system in entities any different from using an O/R mapping tool? Have you looked at the capabilities of EJB 2 CMP? There are limitations, (inheritance being a big one), but most optimizations can be covered by thinks like TopLink or the new CMP engine in BEA WebLogic 7.

Ok, if it is so wrong, can you tell me what can and what cannot be done with CMP/EQL, more exactly what is the subset of SQL that I get for the easiness. How much do I have to pay for it, and what is the exit strategy when you discover that your beautiful entity model doesn't support a query? Do you write SQL for the yet unspecified real database fields?

EJB QL Aggregates, order by, etc. are supported in EJB 2.1.

So, EJB 2.1 will have SQL fully reinvented? Considering how much [time?] they spent to go so far, I'd rather doubt it. But even if, what is the point to have the first RedundantQueryLanguage??

The whole point of EJB QL is to provide querying capabilities similar to SQL but allow any underlying store to be used - including mainframes like IMS or VSAM flat files. It's also about separating the logical (object) schema from the physical (data) schema to allow for easier evolution.

If that was the point I'm afraid to say that the point is misguided. Getting the least common denominator among files and databases would be a true nightmare. But luckily that was only for marketing brochures purposes. No app server implements or has the intention to implement a query engine in middleware over non-relational legacy data sources. Their whole point is to have a straight-forward EQL to SQL-- translator.

CMP is not moving "back" to network or pointer-based database designs because it *isn't* the database, it's the application server, which is inherently object-oriented, and hence, pointer-based. It's about minimizing the manual programming involved in making the transition between the in-memory pointer based paradigm and the data storage paradigm in use (usually relational).

-- StuCharlton

I dunno. Call me a dinosaur, but this specification doesn't stand still, which is the classic smell of a poor implementation. Maybe in a couple of years it will be something worth having, but right now it is a commercial experiment with little to recommend it. It seems designed to try and fix the problems introduced by EJBs in the first place. YAGNI applies IMO.

YAGNI applies only in the context of a particular project with a particular set of requirements, and a particular part of your project life cycle. It isn't a good way to write off a technology. Certainly one wouldn't use O/R mapping at the very beginning of a project if reporting were the original goals. But if WorstThingsFirst was followed, and the initial tasks of a project were for some crucial & complex transactions , O/R mapping is definitely candidate for SimplestThingThatCouldPossiblyWork. Straight JDBC is not simple in these cases.

All EJB 2 represents is a standard to help foster the use of object - database mapping frameworks. These are tools that have been around for a while with growing success. Certainly EJB 2 fixes prior problems, that's no reason to discredit it. -- StuCharlton

We don't discredit it, we judge it based on merits. Well, based on merits, EJB 2 is a pretty bad standard in that matter, simply because it reinvents the wheel. And this reinvented square wheel is later imposed on the ignorant marketplace.

Let me give you a very basic example of the SimplestThing: you write a framework (alternatively you dictate it to the marketplace as a JCP standard :), that intercepts on the way back the result sets of full fledged SQL, and, only if the metadata of the result set corresponds and the programmer asks for it, the framework instantiates full-fledged objects/(CMP entities), returning back a set of objects, or else just returns a table of values. You have the full power of the database query language, you have your objects when and if you need them and everything's in there. You reinvent nothing at all.

: Sounds like ADO.NET? Has anyone done a practical comparison of ADO.NET vs EJB2.0? -- DinoChiesa

I'm doing a lot of .NET stuff, I'm thinking of doing this very thing. EjbVsAdoDotNet -- StuCharlton

And by the way, you avoid another terrible shortsightedness of EJB standard: you don't have to declare in advance all imaginable queries in a stupid XML file and corresponding ejbSelect methods. That's what I would call a kind-of SimplestThing. While you claim that CMP/EQL is simple? It is a little hard to believe. -- Costin

Dynamic EJB-QL is probably forthcoming, it's already supported in Weblogic.

I claimed that O/R mapping is arguably the SimplestThing. Certainly EJB 2 isn't the simplest way to do O/R mapping, it just happens to probably be the path of least resistance - it requires the least amount of manual work. The reason I'm defending it is primarily twofold: I want to see OR mapping technologies to flourish, and secondly to debate the limitations and usefulness of EJB 2 for general understanding. I'll respond to the above stuff soon (and probably refactor a bunch of stuff, this page is getting a bit unwieldy :) .. -- StuCharlton

So, dynamic EJB-QL is forthcoming, aggregates are forthcoming, subqueries are forthcoming, and what do we end-up with? A reinvented SQL. Why bother?

I'll add my vote to 'why bother'. It is slowly becoming SQL. Here's a specific point. Joins are expressed in two completely different ways depending on the multiplicity. If it is singular, then 'dot' syntax is okay. Otherwise you get the horrendous IN syntax. That strikes me as ugly and redundant. The whole syntax seems entirely biased towards machine generation which is not a good thing IME. Performance tuning/debugging is going to be tricky, though profitable for certain parties.

Why bother? Well I guess you've both made your mind up and obviously know more than I do. I was in this for learning something new, but it appears I'm the only one.

EJB-QL is a reinvented OQL. Arguably it's redundant, but it serves a functional need: the ability to query EJB objects in a server independent manner. Theoretically it could be replaced by OQL or SQL99 at some point. It's a stop-gap, but a useful one.

If you'd like to debate this issue, feel free. But it is a cop out to dismiss it with the rhetorical equivalent of "it's all just bunk to make money", which is what I read above. -- StuCharlton

The OQL point is the key thing: EQL exists to query _entities_. Entities are not always stored in a relational database, you know (in fact, there is no requirement for them to be stored in a database at all!). Because of this, it is inappropriate for SQL to be used (directly).

Fair enough. Lets get serious :). My thesis is that OQL/EJB-QL is redundant. It doesn't appear to be needed. We might need to adjust SQL slightly (maybe) but not completely replace SQL. I think we should extend and reuse. I do not trust Sun or IBM to define this as they have an obvious financial interest. I am, however, open to reasoning on why SQL is no good for EJB's.

I disagree with the idea that SQL is tied to relational databases. There's nothing wrong with SQL for querying flat-files (or whatever). If the relations exist in the data, then they can be queried using SQL. A relational database is just an implementation that accepts SQL and queries relational data. You can move the interpreter somewhere else and examine flat-files if you want. A number of relational databases do exactly this. -- RichardHenderson.

SQL is not suitable to work on querying objects, basically because it can't traverse the object hierarchy. That's why OQL was born. Of course, you can translate OQL -> SQL, that's what most EJB container are doing, but that doesn't mean that SQL is better that OQL for querying objects. (Note that I'm not saying that OQL is better than SQL for querying related tabular data)

I agree that EJB-QL and OQL are redundant, but as long as EJB-SQL remain a subset of OQL we're going in the right direction. By making the evolution of EJB-QL to advance at little steeps at a time, the vendors can deliver the features faster. And by putting the simpler things and those with most value first we all win. And eventually we will have the complete query features of OQL.

-- Rafael Alvarez

An important part of EJB-QL's role is as an abstraction which shields against variances in DBMS implementations, including the type of dbms/data store (BridgePattern). In this light, it seems misguided to accuse EJB-QL of redundancy and being least common denominator. Of course everything that is possible through the abstraction has to exist in some way in the implementations, and of course the abstraction is in some ways the least common denominator of the implementations, but that is true of all abstractions.

One could argue about whether using a bridge for the query language doesn't introduce more problems than it solves, or that Sun did a bad job with EJB-QL. But attacking it because it mirrors many but not all features of all SQL implementations seems to be missing the point of it.

-- FalkBruegmann

There are several problems with your argument:

First, EJB-QL is not an abstraction over SQL, it is less abstract than SQL. Second, there is really little more left to abstract over SQL, SQL is at a level of abstraction where you really can't abstract more. You can create something more powerful and more expressive (DataLog is an example), or more concise, but not really more abstract. SQL doesn't mandate that the implementation be a RDBMS: SQL is an implementation independent specification (and is standard). You can use it to query anything that implements SQL, and indeed there are examples of using SQL to query text files and the piped output of Unix processes. It is abstract enough that you can use it to query EntityBeans (except that current vendors and implementations are way too primitive to make that a reality).

What if after we fix all the flaws and limitations in EQL we end up with a mirror of SQL? What will that tell us? It will tell us that EQL was redundant and unnecessary all along (we could have adopted SQL from the very beginning) and that various problems during EQL lifetime were introduced because of bad implementations (we don't have decent aggregates, we don't have dynamic queries, we don't have sub-queries, we don't have views because WebSphere, WebLogic et. comp, can't support them at this time), thus EQL lowered the level of abstraction to make room for implementation concerns. Already EQL lowered the level of abstraction by the fact that it's whole thingie is about navigating collection of pointers and picking up data like apples from the trees. Well, we had that before: that's how COBOL programs were written from the 60's.

Last but not least, if one really wanted an improvement over SQL, a more "abstract" query language, one wouldn't need a committee for that. We already have relational algebra/reltional calculus, we have several well defined and mathematically sound E/R algebras (that would probably be psychologically closer to EntityBeans crowd). What we had instead, was a bunch of guys who manifestly had no clue on query languages, appointed as committee, came up with a half baked proposal, that is patched ever since, and will probably lead nowhere but will lead us there slowly.

So let's recap again: EQL is not and cannot be an abstraction over SQL. SQL is an abstract language and what it abstracts from are implementation details. EQL can be substituted at any time with any of a number of better and more abstract query languages : relational algebra/ calculus, E/R algebras, SQL '99, etc. It is extremely likely that if we add all the decent features of a query language, and if we remove all the warts, we'll end up with SQL. The conclusion (what all these mean) is left as an exercise for the reader. -- CostinCozianu

I'll concede that EJB-QL is a bad idea badly implemented. However I still think that there are benefits to having an object-oriented query language such as the OMG's OQL or even JDO's OQL that abstracts away the underlying datastore. The main advantage I can see is that it makes OR mapping and the object-relational impedance mismatch a separate concern. This concern can then be isolated and made the responsibility of vendors and/or a separate J2EE standard/library. -- AdewaleOshineye

Please define what "abstracts away" the underlying datastore means, and why you can't use SQL '99 or any other decent query language for that purpose. Please note that OQL wasn't adopted in practice by the ODBMS products it was trying to help, it would be really ridiculous then to use OQL to access information in SQL databases. The JDO QL is by all standards far worse than EQL. As to the mismatch, yep, it should be eradicated, the only real solution to that is developer education vis-a-vis ObjectRelationalPsychologicalMismatch. I really don't see how half baked ad-hockery can further the cause any more.

Consider a design that has resulted in a DomainModel. Now we're looking for ways to persist that domain model. We end up with a layer that maps our domain objects to tables and back again. One possibility is to educate ourselves about the best ways of doing that mapping, learn the intricacies of converting objects to n-ary relationships and how to end up with a useful domain model and a database that satisfies your DBA's standards. The other alternative, and the one I would advocate, is to say: "this is hard, let's get someone else to do it for us." That someone else should be the authors of a standard OR mapping layer. In which case I don't see why these experts shouldn't also provide us a decent query language that's designed to satisfy OO developers who'd just like to get the job done. From my perspective I feel it's cheaper/better to encode this knowledge in a set of tools that the majority reuses than to have everybody learn the detailed theory underlying this area. This is why we have specialists because life is too short for everyone to become an expert in everything.

With regard to "abstracts away the underlying datastore": I find that domain models simplify a large set of problems. Moreover there are times when I don't want to care whether I'm using an RDBMS or an ODBMS. In which case I'd like to work with an API that hides the details from me (in much the same way that I don't care if I'm writing to an ext3, fat32 or reiserfs file system under Linux because the API abstracts away that sort of low-level detail). SQL is sufficiently abstract to hide a lot of this. However there are times when because your domain model contains inheritance or complex joins that SQL is insufficiently abstract. In those situations a 'decent query language' that enables me to work with objects and collections would be preferable. Perhaps all the current object query languages are inadequate. That doesn't mean that it's not possible to build a query language that uses objects and which is adequate. -- AdewaleOshineye

I guess the above paragraph is riddle with some prevailing myths among OO developers that need to be debunked. I'm afraid I'll have to finish the ObjectRelationalImpedanceMismatchDoesNotExist. Given how far I've been from the likes of transactional business applications in the last years, I'm afraid it will have to wait for my first vacation, so here I'll put some punctual replies.

Let's consider a design that will contain some DomainModel. All of a sudden you're looking for a way to "persist" the "object model". The myth nr. 1 here is that you can have an advanced OO designer who doesn't know enough data modeling and is looking for a miracle solutions. The error is that sound data modeling principles should permeate the design of object models. Too often I've seen the empty claim that "data consideration should not drive your object models" from people who were really saying "we don't know that and we don't want to be bothered to learn". And of course their object designs were predictably crappy. In reality, a good object schema implies a good underlying data schema, plus a good distribution of behaviour. This has been true both in my personal experience, in the experience of my friends whom I consulted with advice in this regard, and last but not least this principle has a sound theoretical foundation in the work of OO database experts (see FundamentalsOfObjectOrientedDatabases).

My advice to object designers who think they can create good object designs but can't create good database schemas is: it's just a prejudiced and unfounded fear. Try to learn it (data modeling), you'll see that it's nbot complex theory, it is rather intuitive and it'll make your object designs even better.

Which leads us to myth number 2: reflected above in the phrase "the intricacies of converting objects to n-ary relationships" which is kind of funny cause there's really no intricacy in there. So there's a myth that relational databases and databaser theory in general is kind of a very complex and obscure thingies that is better left to the "initiated". This is utter non-sense. In reality, what is only a part of database theory, the science of how to resolve the "database design problem", is a very simple theory, it is extremely intuitive, therefore easy to acquire and easy to practice. What is even more important, acquiring that essential knowledge directly im pacts the quality of object designs and the quality of the software projects overall.

We go on to myth number 3: that you can "abstract away" that essential knowledge and you can solve the problems automatically by delegating the "mapping issues" to "object/relational mapping tools": [["From my perspective I feel it's cheaper/better to encode this knowledge in a set of tools that the majority reuses than to have everybody learn the detailed theory underlying this area.]].

In reality, all that tools can do in generating those schemas is some pretty dumb and predictable transformations. Most of the time the "object designer" is asked anyway for key issues and he has to take decisions. In no case that I've seen so far will a tool be able to take as input a denormalized object schema and be able to produce a quality database schema. Indeed all of the O/R mapping tools (I'd like to be contradicted if possible) have no notion of normalization whatsoever. At the very least, if you consider that "life is too short" for you to become an expert in the very simple science of creating decent database schemas, you should delegate that responsibility to a person who's an expert and not to a tool - the tool will be inherently dumb and won't help you at all. And since you do that, you should consider that person primarily responsible for doing the object design also, since he is better equipped to spot the deficiencies of object designs by virtue of his knowledge in data modeling.

Now we rich the myth number 4: that SQL is insufficiently abstract.: [[However there are times when because your domain model contains inheritance or complex joins that SQL is insufficiently abstract. In those situations a 'decent query language' that enables me to work with objects and collections would be preferable.]]''.

As it turns out the decent object query languages are not so decent after all, as we've seen is the case of EQL. And the more decent you want them to be, the more unnecessarily complex and bulky they are, ending up as a reinvented . In order to hide the complexity of joins you have an extremely simple SQL solution: define a view. In any case the sound solutions to "object query problems" is not to have OR mapping tools reinvent the square wheels, but rather have them support a decent query language over their O/R mapping schema. SQL is one option, a recent innovative product for .NET (http://www.alphora.com) uses TutorialDee for this purpose.

If all these things are merely myths can you solve the issue in TheyWillNotListen? Moreover I think there's a fundamental difference of perspective between those of us who feel that objects/domain models are important (thus relegating databases to the same status as filesystems - this doesn't mean that DBs are trivial. In the same way ReiserFS is a very sophisticated system but I don't want to have to care about it when all I want to do is store some data.) and those who feel that the database is important.

The "issues" in TheyWillNotListen are solved easily by sending developers back to school (maybe a one week training session would suffice). Other than that, perspectives don't matter, the only thing that matters is getting results. EjbQL as it stands now is unlikely to help you in achieving results for all but the most simple things. For the real thing, you'll have to go back to SQL whether you like it or not, with or without domain models and everything.

I don't think anyone is actually defending EJB-QL. However quite a few of us misguided, under-educated OO developers think that better implementations of the same idea are possible. What would be the agenda for this "one week training session" that will resolve the issues in TheyWillNotListen? I ask because it's an increasingly common problem.

Please add agenda items here