Tql Lacksa Natural Join

Top's Query Language does not have an obvious NaturalJoin feature, but TopMind might add an even more powerful TqlChainedJoin feature.


Once I discovered NATURAL JOIN, I found nonsense like a.dept=b.dept to be so much noise.

{If you mean joining based on schema information instead of an explicit expression, not all database engines support natural joins. But as you say, adding an operator for that is possible.}

And why should criteria get second class treatment and only be represented as strings?

This is discussed in TqlOverloading. It has to do with keeping the meta-ability of the column info without overloading parameter types, but perhaps that requirement could be relaxed.

Although I'd probably go with a different syntax entirely, I'll try to keep a similar syntax just to keep it familiar. Added is a named arg syntax. expr is me being lazy. It could probably do with alternate grouping brackets ala scheme for when the parens get thick. When I reall want an elegant syntax, I'll look to Haskell for inspiration (go look at HaskellDB sometime)

  join((srt,top), on:(natural, expr("b.order - a.order <= 5")))
I don't really like the "on:" thing. It creates potentially unfamiliar or cluttered syntax.

I guess one could have a natjoin() function, but it really is largely a shortcut for a bunch of a.foo=b.foo criteria, except one isn't actually required to know the names of the columns being joined. In fact, if one table has a foreign key referencing, the natural join should probably join them on equality whether or not they're named the same.

Reasonable people can reasonably disagree on syntax. Named args wouldn't be necessary if positional args are short and fixed, and strings are reasonable for containing expressions (it works for tcl anyway).

You do have to support joining more than two tables, otherwise you have a pretty anemic query language.

One would join them two at a time. Personally I wish RDBMS had "view columns" to reduce the need explicit joins.

I personally would be sold on a new query language if it implicitly joined foreign key references, as well as supporting natural joins where the key relationship is implied by the schema - I don't like to repeat myself and have to restate the table relations every time I perform a query.

Natural joins depend more on the database engine than on the query language.

NATURAL JOIN is part of the SQL standard. That makes it part of the SQL Query Language. In fact, the database engine doesn't need any concept of a natural join, since a SQL parser could just expand it into a bunch of WHERE clauses (proviso it invalidated it if the schema changed). There's no point in getting hung up on it: NATURAL JOIN is syntactic sugar, but it'd be nice to have some even sweeter sugar that knew about and joined on foreign key relationships implicitly - this is almost always what you want anyway. What I'm seeing here so far is a query language slightly weaker than the procedural query builders like tangram or alzabo, and nothing like an intuitive or more powerful encoding of actual relational calculus. If you're going to replace SQL, you're going to have to have a language at least as powerful as SQL.

I guess my problem with it is that it requires schema analysis to find relationships. It is a different animal than the other Joins. But, I am not forsaking it anyhow. We can have a NaturalJoin or NatJoin operation. If the engine supports it, natJoin() is available; if not, it's not available. That would not drag down other operations. Note that even though it is part of the SQL standard, its implementation coverage appears spotty such that it can harm portability.

Of course it requires schema analysis - that's done at the query parser level, to ensure the colums you're querying actually exist, for example. SQL itself, praise Date, standardizes that sort of access to the schema with the data dictionary (the level of compliance in implementing the DD varies pretty wildly though). Additionally, query optimizers have intimate knowledge of the schema, and presumably TQL is ideally implemented as a "native" query language with that sort of low-level access, and not merely a translator to SQL, no?

Entity-relationship information (table cross-link info) is not a requirement of relational theory. It is just a nice convenience.

[Natural joins don't require foreign key information, if that's what you're referring to. They're based on the matching of attribute names.]

Maybe you are right. However, they should be based on schema info if you ask me (and if you didn't, just ignore me). I like the convention where foreign keys use "fooRef" instead of "fooID". It makes the schema much more self-explaining in my opinion. Plus, using the name alone to match may result accidental name collisions for common names such as "description", "status", "isInactive", etc.

[Yes, I'm right, natural joins are a well-defined concept, are based on names only, and require no foreign key information. Since names are in the schema, schema information is of course needed in order to interpret a natural join request. The naming issue is an interesting one. To make effective use of natural joins (which are very handy because they are succinct compared to explicit joins), you have to be very strict about naming in the schema, in general following this principle: Fields that mean the same thing have the same name. Fields that mean different things have different names. That's sounds trivially reasonable, but still requires some judgement in practice. Certainly, under this guideline, 'fooRef' is a bad idea; a 'fooID' is a 'fooID' wherever it appears. Things like 'Description' and 'Status' are bad because they're too general, they don't really say what the field means. 'CheckDescription?', 'TransactionReconciliationStatus?' would be examples of more suitably descriptive names. Despite best efforts, though, accidental name matches can occur, and that is a disadvantage of natural joins.]

I've found the "ref" convention for foreign keys very useful. I've used both ways and I vote for "ref". Perhaps we'll just have to AgreeToDisagree. I am not against a "natJoin" operator, but probably would not make it a top priority (no pun intended). But the idea is that SMEQL/TQL has simpler syntax than SQL and Tutorial-D in order to facilitate DBA-added operators. Thus, your site can have your cake without my site having to eat it too. -- top


For one possible approach to natural joins, see TqlChainedJoin.


EditHint: Topic name change proposal to "TqlAndNaturalJoins?".


>>> Personally I wish RDBMS had "view columns" to reduce the need explicit joins.

Are you referring to computed columns? Most databases have those, but they can't reference other tables. If you want columns computed on a join, you want a view. If you're looking for something completely different, then well, edit away.

The requirement that a table view be created for cross-table joins makes it hard to wrap certain kinds of changes. One does not want to have to sift through a bunch of code to change the table name to a view name in a bunch of queries just because we move some columns to another table. This shows a flaw in the abstraction.

The flaw is in the implementation, not the abstraction. Specifically, in the poor support for updateable views in SQL and in SQL DBMSs. If (more) views were updateable, so that views and (base)tables were more nearly indistinguishable, then code would not have to be changed when such changes were made (just define a the new base tables with new names, and define a view with the name and columns of the old base table which does the join).

Many shops are understandably uncomfortable having actual table names overlap with view names. It could result in debugging nightmares because you are not looking at what you think you are looking at.

[Ideally all visible tables would be views, with only the admin fiddling with the true tables, and with the two existing in different namespaces to disambiguate naturally. This would nicely separate implementation issues. ("Ideally" means, in a hypothetical ideal situation where issues of performance, vendor support, blah blah, have all been magically fixed.)]


CategoryTql, TqlRoadmap, AlternativesToNaturalJoins


EditText of this page (last edited April 7, 2009) or FindPage with title or text search