Conceptual Query Example Of Advantages

See ConceptualQueries for context.

JPAQL does not help much with AccessPathIndependence, even if a query is written in JPAQL (the standard query languange with ObjectRelationalMapping extensions in Java), it still depends too much on the shape of the relational model (I think JPAQL could be extended to support Conceptual Queries). .

For example, lets say you want to:

Find each white cat that lives in a house with doors made of wood

A cat is of one color, it lives in one house, a house can have 1 door and the door can be made of 1 material

So, the query in JPAQL looks like this:

select c from Cat c where c.color=”white” and c.house.door.material.name = "wood"

But then, lets say our customer changes the requirements:

Find each white cat that lives in a house with doors made of wood

A cat is of one color, it lives in many houses, a each house has many doors and the door is made of one or more materials (for example half glass, half wood)

Since our relationships are now @OneToMany? so we write their names in plural, and if we are newbies in JPAQL we try doing this (and it, of course does not work):

select c from Cat c Cat c where c.color=”white” and c.houses.doors.materials.name = "wood"

Now, we can of course solve this problem using subqueries and the exists keyword (that is pretty similar to the way it can be solve in plain SQL), but that makes the resulting query way more complex, and even if the above worked, it still is a different query, but, in our English example, the query didn’t change:

Find each white cat that lives in a house with doors made of wood

So, why we can not write in JPAQL (or in SQL) something like:

select c from Cat c where c with color= “white” and lives in House has Doors made of Material with name = "wood"

That way the query wouldn’t have to change even if the multiplicity of the relationships changed. Of course now the question is, from where do I get the “with”,“lives in”, “has”, “made of” and “with” well, simple:

The with operator is available for each non @Entity member of the class (for strings, integers, etc).
For relationships with other entities we just add the conceptual name of the relationship name as an attribute in the @OneToMany? or @ManyToOne? annotations:

Example:

Before:

 public class Cat{

 @Column

 private String color;

 @ManyToOne?(conceptualName=”lives in ”)
 private House houses;

 }

After:

 public class Cat{

 @Column

 private String color;

 @OneToMany?(mappedBy=”cat”, conceptualName=”lives in ”)

 private Set<House> houses;

 }

This way changes in the object model do not necessarily affect our queries. That means we can make changes to our relational/object/domain model and we are not forced to re-write our queries, because they still are conceptually correct.

I work with domain models where 'root.x.y.z' is implicitly all z's that are found under y's that are found under x's that are found under root. Uniqueness is not guaranteed by the implementation, and the path query can return many results. For queries, this isn't a challenge at all.

So you mean you can change the multiplicity relationship in you domain model but you do not need to change your code after that? May I ask what programming language do you use?

In this case I work with Javascript, with bindings to the domain object model. One can explicitly scan using root.x[0] vs. root.x[1] if they wish (the order is fixed based on temporal entry of x's).

A filtered breadth first search does the trick very easily. With an ambiguous language, "c.house.door.material.name = "wood"" could, in fact, return many houses for each cat, many doors for each house, many materials for each door, and many names for each material. Where it becomes difficult to handle is when combined with reactive programming, automated updates, and delta isolation.

The problem above seems to be that you are being forced to represent (or assume) constraints on the model within the query language. What you seem to want is to represent the query independently of the 'rules' for the model.

No, I do not want to represent the query independently of the 'rules' for the model, I want to have a more abstract model that shields me from implementation details like multiplicity.

Fair enough. May I ask you what the distinction is between "rules" like multiplicity and "implementation details" like multiplicity? And can I ask how multiplicity becomes a mere implementation detail rather than a design decision?

From the perspective of the query, multiplicity is an implementation detail, because I do not care about mutiplicity I want to "Find each white cat that lives in a house with doors made of wood" and I do not care how many houses are related to the cat, I do not care about how many doors the house has, and I do not care if they are of 1 or many material, I only care if one of them is wood. So, since I am not saying something explicit about multiplicity in my query, for my query, multiplicity is (should be) an irrelevant implementation detail.

Ah, I think I understand you now. So by "implementation detail" you mean what I'd call "a detail of the domain model about which the query should be unconcerned". I tend to limit "implementation detail" to expressing, say, the physical/code implementation of the model or the query, as opposed to abstraction of them.

Exactly, by "implementation detail" I mean what you would call "a detail of the domain model about which the query should be unconcerned". I feel that implementation details are always about the relevant concerns from a particular perspective: If you are watching a show in a TV, you only care about the interface, not the implementation details on how a TV works, but, if you need to fix a TV that does not work, then you interface with the TV changes, and its internal mechanisms, previously considered irrelevant implementation details become your interface.

Oh, I agree with much of what you say. Still, though, I don't believe that details of the model being queried really qualify as "implementation details" of the query, which is what is implied in how you stated it. It may help to distinguish between "irrelevant" detail and "implementation" detail.

Your belief that this requires a more relational syntax isn't really the case... it only requires a more relational/logic programming semantics.

I do not believe that this requires a more relational syntax, I believe this requires a more conceptual syntax. Could you provide an example on real world language on how relational/logic programming semantics make it possible to write queries that are not affected by multiplicity changes in the domain model?

I am not aware of what "conceptual" syntax means.

Well, please read ConceptualQueries for context. Note that I already stated this at the very beginning of this page :-P

I had read the aforementioned page. ConceptualQueries does not cover syntax.

Did you read the linked PDF document in that page ( http://www.orm.net/pdf/ConceptQueries.pdf )?: At the conceptual level, the information is expressed in its most fundamental form, using concepts and language familiar to the users (e.g. Employee drives Car) and ignoring implementation and external presentation aspects...

But as for relational semantics, the above multiple-return query path qualifies as a form of relational programming because 'inappropriate' return values (e.g. y's without z's) are filtered out automatically. (In this sense I am referring to "relational" programming as defined in ConceptsTechniquesAndModelsOfComputerProgramming.) Logic programming would generally allow greater flexibility in search/discovery of 'objects' that can't be found directly in the domain model. But in either case, multiplicity changes in the initial inputs are handled in stride to produce multiplicity changes in the query outputs. For a typical logic language, one might have:

 MODEL
 isaCat(Emo)
 isaHouse(house_3121)
 livesin(Emo,house_3121)
 hasDoor(house_3121,door_12234)
 hasDoor(house_3121,door_12235)
 madeWith(house_3121,steel)
 madeWith(h,m) :- isaHouse(h),hasDoor(h,d),madeWith(d,m)
 madeWith(door_12234,aluminum)
 madeWith(door_12234,wood)
 madeWith(c,meat) :- isaCat(c)
 madeWith(c,bone) :- isaCat(c)
 madeWith(c,sinew) :- isaCat(c)
 madeWith(door_12235,brass)
 madeWith(door_12235,wood)

 QUERY
 #Find each white cat that lives in a house with doors made of wood
 (c) ? isaCat(c), livesin(c,h), hasDoor(h,d), madeWith(d,wood)

In this particular model, there is one cat that lives in one house that has two doors each made with two materials. But it should be clear that I can easily add or remove doors, houses, and cats from the model. The query itself (which only returns cats) could return multiple valid answers, but in this particular case it would return only 'Emo'.

It is the ability to add implication/entailment to the model of a logic language (such as saying: "all cats are made with sinew" as "madeWith(c,sinew) :- isaCat(c)") that greatly simplifies both the queries and the model.

Yes, but when you say that, you are also not limiting the multiplicity of the relationship, while if you were using SQL or JPAQL there is no way to shield your query from the multiplicity of the relationship. SQL and JPAQL do not (by default) offer AccessPathIndependence.

On the other hand a problem I see with you approach is that for insert or updates it doesn't appear to be a way to enforce a particular multiplicity (while for querying multiplicity can be considered an implementation detail, because one may want to write a query that works even if the multiplicity changes, for updates or inserts, multiplicity may be detail that can not be ignored (note that I wrote "may be" because in some cases it could be useful to be able to ignore multiplicity when inserting or updating data))

If I wished to limit the multiplicity of a relationship, I'd need to make it explicit in the model. Specifying that a given property must always exist and must be unique, vs. specifying that it may exist and must be unique, etc. are possible (though not every logic language provides such semantics for it). It is quite similar to expression candidate key or foreign key constraints in SQL. It is not the query language's job to limit multiplicity. If desirable, a query language could potentially provide heuristics for a "best" choice (e.g. if you only wanted the best matches by some measure of 'best'). Figuring out how to specify the 'best choice' in a query would be a problem you'd need to solve no matter how the domain model was expressed (e.g. object oriented vs. logic).

It is also possible to add transient model components to support queries-with-assumptions. As an example: find all cats living in houses with doors made with wood assuming that Omar is a cat that lives in all the same places that Emo lives:

  (c) ? isaCat(c), livesin(c,h), hasDoor(h,d), madeWith(d,wood)
        assuming  isaCat(Omar), livesIn(Omar,h) :- livesIn(Emo,h)

Sound interesting, how would you extend the syntax of SQL or of TutorialDee to support this?

I wouldn't. I'd extend the syntax of DataLog to support this. SQL and TutorialDee were not designed for logic operations.

But, would you agree that the extension that I described before could move JPAQL a little closer to in DataLog direction?

It does, which is why we ultimately started discussing DataLog in this context. I expect that JPAQL (even with your extension) still doesn't quite match the expressiveness of DataLog (e.g. the above query would also allow livesin(c,X),hasDoor(X,d) with X not being a house since I didn't specify isaHouse(X)). But there are always tradeoffs between expressiveness and optimization.

Some rough-and-dirty observations:

It requires more up-front specification than a typical RDBMS does. However, one can also "throw views at the problem" with a RDBMS to sort of do the same thing: pre-name relationships and groupings.
- Well in plain SQL there is no way to pre-name/pre-store a relationship. See NaturalJoin
- You can have a view such as "DoorsOfHomes?".
It seems optimized for use with roughly hierarchical decomposition descriptions of things (part, sub-part, sub-sub-part, etc.). This makes me wonder how it works in domains that are not so hierarchical.
- I am not sure why it shouldn't work for less hierarchical domains... if you give me an example of less hierarchical domain maybe I can show you how a ConceptualQueries can help you
If the point is to make it user-friendly, I'd suggest putting the effort into graphical techniques instead of so much into verbal. It could even have house icons, door icons, etc. in the "drill-down trees".
- The point is not to make it end-user-friendly. The point is to make it more developer-friendly (I hate it when I have to change queries everywhere just because my user realized that a relationship is not many to one but one to many, I want full AccessPathIndependence)
- This doesn't happen that often in my experience. It's not where'd I spend top effort to improve queries.
- Ah, yes. We're still waiting on your "top effort to improve queries" (TopsQueryLanguage). Zing! Sorry. Bad pun. But you gotta admit you walked right into that one...

NovemberZeroEight