Living With Sql Annoyances

(refactoring in progress)

Incidentally, I don't believe that you could implement the more complicated Windows NT user/group/capability model using only SQL (there's some Oracle extensions that would let you). Anyone want to take a shot?

Actually you can trivially do it in ANSI SQL but the feature is currently implemented only in DB2 and SQL Server 2005 (with recursive clause), you can also do it with Oracle's CONNECT BY if you do not have cycles (which you shouldn't anyways). For other databases it takes a bit more work, but one path (maintaining a materialized transitive closure ) is already beaten: http://citeseer.ist.psu.edu/dong99maintaining.html. Maintaining transitive closure still looks more terse then you would have it in your stock OO language.

See AccessControlList. Once the schemas are set up, the rest is just run-of-mill CrudScreen work. -- top

I should have been more specific, but it really was an aside. You can model the relations just fine, I don't think you can implement the logic(behavior) without resorting to vendor-specific SQL extensions or post-processing the query in some other language.

I never said the DB does the entire thing. It is about DivideAndConquer. If I can get the DB to do 80% of the work, then I only have to do the last 20% in app code. On a few occasions it will be 20%/80% (reversed proportion), but I still win on the total. There is a similar discussion under IsDeclarativeLessExpressive.

I understand that (and I believe that most people actually do the same thing, arguments over what is more efficient aside), but it goes to what the challenge purports to prove - of course you can model a schema very quickly. Implementing the lookups (not done for this challenge, note) adds to the complexity, and in some cases isn't possible, at least using currently available tools. When I implemented a security model (ACL based in concept, slightly different model than NTs), I wasn't able to write a query that could implement a yes/no answer to whether or not a user could access a specific resource, because I couldn't model the collapsing of allow/deny records without using Oracles decode() functions. Might just be that I didn't know enough about SQL, of course, I'm a much better OO programmer than I am a relational one. I've also written a hand-rolled version in Python (which did some neat stuff with weakrefs to keep the cost of deletes cheap) which I far prefer from an elegance point of view - I really, really, really hate code generation as strings. -- ChrisMellon?

Sometimes the problem is SQL itself. I don't consider it the pinnacle of relational query languages in any way. But, SQL not being able to easily do an entire specific algorithm is not a show-stopper as long as it does not happen too often in a given app. In such cases I usually find that SQL can be used greatly narrow down the result set so that one is not working with the entire table. Then this smaller result set is used to do the heavy-duty chomping needed by the algorithm.

Some languages limit what can be done with the result set, such as only allowing one to move forward through it. I generally consider this the fault of the language or API's, not SQL itself. For a language/API with such limits, perhaps loading the result set into an array may be the way to go in order to get full random access. This copying is a violation of OnceAndOnlyOnce, but sometimes necessary if you have lack-luster tools to work with.

[Forward-only cursors are usually an implementation detail leaking through the relational abstraction. Loading into an array for full random access is the normal solution, but impossible/impractical when working with very large datasets. Regardless, it's a deficiency of specific databases and APIs rather than of the concept in general. -- ChrisMellon?]

Generally the processing amount can be divided as follows:

1. Must loop through entire table
2. Can narrow down working result set but still have to do most processing outside of SQL
3. Can narrow down working result set but still have to do a fair amount of processing outside of SQL
4. SQL does all or nearly all the processing

If I had to graph the frequency of these, it would generally look something like this:

1. **** 2. ************** 3. **************************** 4. **************

Also, the better the schema design, the more it is tilted toward the higher numbers (3 and 4). If you have to do a lot of 1 and 2, then it is a yellow alert to re-explore the schema design. Then again, some shops are stuck with bad schemas and one must live with what exists. For example, the "category per column" anti-pattern discussed in DatabaseBestPractices is quite often a hindrance to letting the query language do more. It is allegedly done for performance reasons, but I question that if all the resulting workarounds are factored in.

As far as "not liking strings", I have generally grown used to it. (I used integrated language + queries in my ExBase days, and still prefer the approach for small RAD, but it does not scale). There are various ways to avoid having to be too involved with them, as described in HelpersInsteadOfWrappers. Maybe in the future ways to better integrate them will be found, but it is difficult to have a language-neutral query language without some kind of "boundary", such as strings. The first step would perhaps be to toss SQL for a more programmer-friendly query language.

[For many of the same reasons you prefer to expose the runtime engine, I dislike creating code via string formatting and concatenation. It's basically tossing aside all of the tools for effective programming I have at my disposal. It's not just SQL that irritates me in this request, I end up doing it a lot with JavaScript too and it's really suboptimal. That's totally aside from the performance stupidity of converting from a bytecode/machine code representation, to a string (SQL), compiling and parsing that string, executing machine code (in the DB), returning a dataset which is then walked and converted into yet another binary/bytecode representation. It's usually FastEnough but it's still a stupid way to do things and the inelegance irritates me. -- ChrisMellon?]

One of my goals in TopsQueryLanguage was to create a language that could be implemented in most other languages such that if you wanted to do query processing "locally”, you could, but still use a language that ran on BigIron databases. However, unless your language has heavy meta features, you would still have to convert it to strings to be able to use the same queries on a BigIron database. Maybe languages with heavier meta features are the way to go in order to ease such a transition.

-- top

[I find the syntax of TQL interesting and I like the easy exposure of columns. One of the biggest annoyances when mapping from the database to program code is handling datatypes - in my web programming, I often end up re-creating the table schemas in app code so I can properly dispatch on types. Stupid. -- ChrisMellon?]

Perhaps it is time to explore DynamicRelational and have away with column types altogether for some shops. For example, dealing with the issue of whether to put quotes around a value or not would be gone: always put quotes and the RDBMS won't complain. (Actually, Microsoft SQL-Server allows this.) Note that DynamicRelational does not rule out validation triggers to assure that something is (formatted as) a number.