Database Abstract Interface Discussion

(Moved from GreatLispWar, which is getting TooBigToEdit.)

...Except, of course, for every indivisible algorithm that requires customisation, as shown in HofPattern.

It's great because you say it's great. The pattern is rare in the wild, or at least uncompetitive with alternative solutions.

[What about qsort(), which is part of C's standard library? What about approximately half of Ruby's standard library? Are standard libraries rare in the wild? What about any genetic algorithm ever written, DijkstrasAlgorithm, AstarSearch, and so on? -DavidMcLean?]

I've never ever used qsort. I imagine it's useful for SystemsSoftware and embedded systems, but those are not my niche. And much of Rudy's library, and perhaps qsort's comparison techniques, could be implemented in objects. If you don't like Ruby's object syntax, complain to the Ruby makers, not me.

[If you use an object constructed on-the-fly with an associated callable method---the C++ functor pattern, effectively---then you're using higher-order functions. It's exactly the same thing, just with syntactic overhead. The minimal syntactic overhead on blocks in Ruby is vital to their utility in code; any syntax that required declaring an object-with-some-named-method to pass to things like Enumerable#each would be too verbose to be used as ubiquitously as Enumerable#each is and should be. JavaScript might be a better example to look at here, because its object syntax is extremely concise; despite this, one finds that the vast majority of functions-that-need-to-take-functions just take functions, not objects. -DavidMcLean?]

Again, that's perhaps a problem with Ruby, not objects in general. When deciding what to include in a language or shop conventions, if x and y are similar and overlap in many areas, and you agree that it should have x, but circumstances where y shines over x are not used/found very often in the field, then it makes sense to exclude y if you want to keep the language relatively simple. (NonOrthogonalLanguageFeatures.)

[You could change the object syntax such that there's syntactic sugar for creating a C++-functor-type object, i.e., an object with a callable associated method, which is what you're actually trying to pass around. However, providing sugar like that is providing lambda syntax: It's exactly what Java 8 is doing. On the other hand, if you're trying to reduce the number of related concepts in a language, why not remove named functions and methods? CoffeeScript proves that you only actually need anonymous ones in the syntax. -DavidMcLean?]

Or we could hard-wire table/collection-oriented idioms into the language rather than rely on libraries (a bit like ExBase). This may prevent unwanted creativity. I'm sure you would object (no pun intended), but it's back to the 'ol "herding" debate.

[That's adding more NonOrthogonalLanguageFeatures, Top, and specialised ones at that. In addition, if we represent collections using parts of the standard library (the Enumerable module, in Ruby), we can add more collections. If we hardwire collection handling, how would we do that? -DavidMcLean?]

First let's analyze some realistic situations where that might be a need.

[There's a database vendor who isn't the developers of the language. We want to be able to query that database. -DavidMcLean?]

ODBC driver and/or a C API.

[And how do we make our hardwired collections handling use that C API, when it wasn't designed with such an API in mind? -DavidMcLean?]

Maybe I misunderstood you. I thought you meant like a tool vender being able to use this hypothetical TOP language's "internal" database data. Instead you mean having "TOPX" communicate with another vendor's database? Something like ODBC is the usual way. And I don't see how Ruby improves on cross-language and cross-vendor data sharing.

[Yes, I meant querying another vendor's database. Let's make it concrete: You're using TOPX and you find you have to query a MongoDB database. How do you do this, preferably without sacrificing your collection syntax? -DavidMcLean?]

I am not familiar with Mongo's query language. But the "lowest common denominator" driver would be able to handle something like this:

  queryLoop to=q, db=mongo1, qry="Magic mongo syntax"  // for each result record
    print q.name, q.rank, q.serial  
    doSomethingElseWithRow(q)
  End

Basically you feed it a query string that uses Mongo's native query syntax and it returns a work table (record set) to process in our loop. A fancier driver may be able to be mapped to TOPX's own syntax for a more "native" fit (such as WHERE= clauses etc.). But just about anything that can return a single table when issued a request string can use the above pattern. We could pre-define a way to return multiple work-tables also in preparation for such situations, which may come in handy for XML result sets, although there's different ways to deal with such semi-regular structures that would be too long to address here. Related: TighterAppAndDatabaseIntegration.
[Your response is tantamount to "you give up the collection syntax when accessing this resource". That's a problem. -DavidMcLean?]
Please explain. The features available are generally defined by the driver, not our app syntax. We could even make it 'pass arbitrary attributes to the driver:

  queryLoop to=q, db=mongo1, glib="nof", florf=7, tloog="pift", zarg=x
    print q.name, q.rank, q.serial  
    doSomethingElseWithRow(q)
  End

It's up to the driver to process those attributes, flag them as unrecognized, pass them on to the database, etc. The sky is the limit. Let's see Ruby do that as nicely. An OOP interface could perhaps be available to allow developers to write their own drivers if they want. (Some drivers and/or settings will do cursor-like calls to the database or source and fetch a row at a time for each iteration, and some will pre-fill a virtual local "temp" table before looping.)
["Please explain." Okay. You're proposing languages provide built-in syntactic support for querying tables. Then you go on to say that, in fact, there's no syntactic support for constructing queries, as queries are expected still to be written as simple strings. Therefore, your language actually has no special syntax for tables. It just has a foreach loop with a name that makes it sound more like it's query-related. An example of actual special syntax for querying tables is LINQ, which we've discussed before. "Let's see Ruby do that as nicely." Ruby does it much more nicely by default, because Ruby actually does something beyond weirdly-named foreach loops; take a look at ActiveRecord, for one popular example. -DavidMcLean?]
I never said limit it to only strings.
["Basically you feed it a query string that uses Mongo's native query syntax". I asked how you'd query a database, and your immediate response is "you use a raw query string". Granted, you went on to add that you could use other attributes, but that's not special query syntax either: That's simply keyword arguments, and you can't properly express all queries with just keyword arguments (because query clauses are trees). -DavidMcLean?]
In some cases one would want direct string queries, in others some kind of tree-builder approach, in others domain-specific or custom wrappers or sub-clause generators. A non-SQL database is going to have different syntax and idioms than SQL such that a driver would need to know about those so it can create a MirrorModel grammar. So what syntax are you referring to when you mention "sacrificing your collection syntax"? There is no "your".
["your collection syntax" means "the special collection-querying syntax your language offers". It's your collection syntax and your language because they're your ideas. If your collection syntax expects users to write out the queries as strings, in SQL or another querying language, then there's not special syntax for queries; there's only special syntax for one specific type of foreach loop. -DavidMcLean?]
Even if TOPX or any language/API had its own default set of collection-oriented commands and syntax, someone still has to map those into a new database brand, such as Mongo.
[Obviously. How do they do that? -DavidMcLean?]
Object API's
[Great. Let's see a typical example. -DavidMcLean?]
One approach is to use an SQL-like native syntax because SQL is the de-facto query standard (for good or bad). When you issue a query, the language parses your SQL-ish statements into a (usually invisible) parse tree, perhaps a table-form of the syntax tree, and then it's up to the vendor-specific driver to take that structure and translate it into the target DB's (such as Mongo) native query language. I'm not saying this is the only approach, but it has the advantage that the parse-tree can be used for many purposes and that app and/or systems programmers can potentially generate or modify such a tree in other ways to better fit the domain. Building a perfect cross-query-language translator is probably not practical, but that's not due lack of HOF's, just that translating between computer languages is not a trivial problem unless you stick with a narrow sub-set. Some kind of "escape to native" commands may be needed for portions of queries, but it may be hard to get the driver to place them in the right spot being that ordering may be very different between query languages. For this reason, direct full query strings are often the least headache for non-trivial queries. I often use HelpersInsteadOfWrappers for the repetitious parts of the queries, such as column lists and WhereAndAnd criteria clauses. A good data-centric language should facilitate this practice. It could look something like:

          Query native=true :
             SELECT $myFields
             FROM FOO
             WHERE X=5 AND $myAndClauses
             ORDER BY $myOrderFields
          Loop: 
             print a, b, c
          End Loop

Note that defining the query and the looping don't have to be together, but often they are so it's best to make doing such simple when needed. (We could also have named templates of queries.)
[Using SQL-like native syntax is a reasonable choice. LINQ did it. I'd still like to see a typical example, though. How do you implement a queryable? -DavidMcLean?]
Please explain. I thought I covered that above already.
[You explained how it might work. I'd like to see an example of implementing a queryable, preferably in code, because your above description doesn't establish anything about the "object APIs" that might be available. -DavidMcLean?]
I don't think there's anything special about API's. We'd want a standardized way for the drivers to access the parse tree structure mentioned so that they can translate it into the target vendor's format and submit results back (including warnings and errors). It could be done in plane-jane C. I'm not really that concerned with making a driver writer's job easier, but driver writing is outside of the scope of this discussion re custom biz-ware.
And the parse-tree tables could be available to app-level developers so that they can add a "native feel" to say a web-service and issue SQL-like queries to it. This is a bit different from Linq in that one is not directly building the parse tree with the app language, but rather entering SQL-ish statements, and then TOPX parses them and puts the results into a structure to be further processed.
[If the parse trees are available to app-level devs, the APIs for such will be important to app developers and not just driver developers. Also, yes, your system is slightly different to LINQ; what makes your system better than LINQ? -DavidMcLean?]
I don't see a need to muck with such API's very often from the app developer's perspective. Yes, occasionally they may make custom translators for say web services, but that's not where most app developers spend most of their time.
- [I asked how you'd implement some reasonably common things using those APIs, though. I'd still like to see how it's done. -DavidMcLean?]
- What's an example case you'd like to look at?
- [Let's go with a Web service. I already suggested Google Images and still think it's a good example. How do you make a queryable to Google's REST API? -DavidMcLean?] [[Note: I removed the original request because it broke up the flow due to thread growth. -t]]
- Of course, somebody has to create a driver that converts the SQL tree into the Google image API calls, runs the Google service, and then return results in tabular form.

 // Example: Carmen01
 // search for monotone (greyish) jpeg's of Carmen Electra's face
 Query driver="googleImages":
   SELECT * FROM googleImages 
   WHERE faceness > 0.7 AND extension IN ('jpg','jpeg') AND 
     saturation < 0.2 AND textQuery = 'Carmen Electra'
 Loop:
   print url, size, width, height, extension, altTag
 Status: // std. query result info
   print "Rows: $rowcount, Query-Time: $qryTime"
 End Query
 // [leave dot below to work around a wiki bug]

This example is roughly based off the "Carmen Electra" example at http://developers.google.com/image-search/v1/reference
The beauty of this approach is that one ONLY has to know the table schema (DataDictionary) to query images. They don't have to learn a screwy new interface language and custom set of DatabaseVerbs invented by Google or Gates or Gorn the Space Lizard because one must learn SQL anyhow in the biz IT world, and can then leverage that knowledge. I will agree though that not everything may be able to fit nicely into an SQL "shape", but TOPX wouldn't force SQL as the only query language. However, deviating from it puts more labor on the driver creator.
[Okay, but, again, how do you implement the queryable? -DavidMcLean?]
Please clarify, I'm not necessarily trying to copy Microsoft.
[I didn't claim you were. (They're called IQueryables in LINQ, yes, but it's a good name for the concept in general, and I don't specifically mean IQueryable when I say "queryable".) I'd like an example of how you implement a queryable, like that googleImages one you just provided a usage example of. -DavidMcLean?]
The link above uses a deprecated JS-based API, which would be a pain in the butt to write a driver around anyhow because the driver would have to emulate a browser with a JS engine (or dissect HTTP communications to see the internal language). Fortunately they are switching to a URL-based technique (which REST is expected to generate for interactive pages, but is not mandatory). Basically the SQL is parsed into a native/default TOPX parse tree as usual, and the driver reads and grabs the WHERE clause items, and translates them into URL parameters and then issues an HTTP GET and awaits the results, which are then converted into table form. It looks like one has two result format options from Google's service: JSON and XML. Thus, the driver would have to parse say XML into a result table(s). (That URL sort of resembles a WHERE clause, oddly enough).
[Right, but how does one go about writing a driver that manipulates a TOPX parse tree, translating the WHERE clauses into URL parameters? -DavidMcLean?]
What aspects are you asking about? The algorithm? The language used? The API that traverses the parse table/tree? Why did you use the word "manipulate"? If the driver can use the standard/built-in SQL parser, it doesn't need to be involved in controlling or changing the parse tree structure, only reading it to translating it into the target service's language.
[I used the word "manipulate" because it's reasonably generic and hence doesn't specify intent too deeply. You've said that one maps the SQL-ish native query language of TOPX to a data source's language using "object APIs", which you've said at other points are accessible directly to (presumably) TOPX-using app developers. I would like to see those object APIs being used to map TOPX-query-language to the appropriate language for accessing a data source such as Google Images. -DavidMcLean?]
A draft parse table is given below. The API likely would have operations like, getParent, getFirstChild, getChildCount, getNextSibling, getRoot, etc. Those are pretty generic tree-traversal kinds of operations. I'm not sure why you are interested in those. Is there something specific you are looking for? I don't expect app developers to need to use them very often, so that's not where I'd spend much fanciness. Also note that the driver can parse the raw query text and skip the built-in parser if it wants to. It's not forced to use the built-in parser.

         // Example struct01
         table: parseTree
         ---------
         ID
         parentID  // zero for root
         sequence
         token
         tokenType   // token or value type
         grammarRef  // grammar rule reference number/ID
         warningMsg   
         errorMsg

[Yes, there's something specific I'm looking for. I'm looking for an actual example of implementing a queryable. I chose Google Images because it's reasonably common, but if you'd prefer to use a different data source for your example that's fine. A database table schema provides absolutely no information as to how queryables are implemented. Storing a parse tree relationally is bizarre, by the way. Why would you do that? -DavidMcLean?]
I don't see that we need "queryable" here. And I see nothing wrong with a "relational tree". However, that's another topic. Please find or create a new topic if you want to explore that. RelationalAndTrees has some leads.
[We need "queryable" here because that's what I'm asking for an example of. Would you prefer it be called a "data source driver"? As for relational trees, why? What's the value in storing a parse tree like that? -DavidMcLean?]
I'm still not clear on what you want and why. I'm not going to make something that's not needed for the tool just for the hell of it.
- [I want an example of writing a queryable, given an arbitrary data source, such that TOPX's query support may be used to query that data source. I suggested Google Images as the data source to use in your example, although I don't mind if you pick a different data source. The reason why I want this is simple: If I'm developing in TOPX, I'd like to be able to query data sources, such as Google Images, using its query support. Someone is obviously going to have to implement a driver for that data source for this to work, and I'd like to see how they'd go about doing that. -DavidMcLean?]
- See below.
As far as why to table the parse tree? Mainly so our tool can read it using its default query language and utilities if need be. Note that SQL is only an interface, what the table "is" under the hood can vary widely (linked list, array, gerbils with cards on their back, etc.) The table-creation command(s) could offer table implementation choices or hints to pick the best implementation for the task.
[Ah, so you're putting the parse trees in table form because the language is geared toward table manipulation? Not a bad design choice, I suppose; that's the whole idea behind homoiconic languages, after all. Relational tables do remain a kind-of-weird way to represent trees, but that problem is, as you say, not related to this page and belongs elsewhere (such as on RelationalAndTrees). -DavidMcLean?]

As far as why it's "better than" Linq? Because the native syntax is much closer to SQL. One can copy and paste from and to a DB browser such as Toad with little or no changes. There would be a "native" mode and a pre-parsed mode. Native would just pass on the query as-is, after any variable substitutions. The pre-parse mode would generate the parse-tree table talked about above under the hood. There are tradeoffs between each approach.
[Reasonable. I assume "variable substitutions" aren't necessarily actual variable substitutions, however, right? They can be prepared-statement value binds as well? -DavidMcLean?]
The pre-parsed approach could be used to generate prepared-statements via the drivers since they have access to the parse tree, and could be used like subroutines in which variables are passed.
[But the pass-through approach doesn't? Wouldn't that make it inherently insecure? -DavidMcLean?]
If you want to force a Nanny Switch on your version TOPX forbidding pass-thru queries, be my guest. I would make it optional. But that's another topic.
[Are you saying you want to make security optional? -DavidMcLean?]
Sometimes pass-thru is the only way to get something to work. Let's just agree it's a shop decision at this point and not get side-tracked into security.
[When you're designing syntax for querying databases, security isn't a side-track. It's a central concern. -DavidMcLean?]
Almost every common programming stack allows "direct" queries if you want them.
[Certainly, as a side-effect of the way even with secure variable binding the query itself is still a string. You aren't designing a low-level database connection API, however. You're designing a special syntax for building queries, which is intrinsically at a higher level. The expectation with such a syntax is that it's secure, rather than thoughtless string-concatenation. -DavidMcLean?]
Higher-level tools generally need to offer an escape hatch to the lower-level when there's a problem with the higher level. A high-level GUI API should be able to address, read, and set individual pixels, for example, for the times that the high-level commands can't satisfy a specific need or have a bug. (See also LeakyAbstraction.)
[Reasonable, but the low-level escape hatches shouldn't be a first resort. They're comparatively dangerous, and developers need to know that to compensate accordingly. -DavidMcLean?]
Let's leave tuning what the defaults are to another day.
[I'm not interested in tuning the defaults. I'm interested in making the design, as a whole, safe. -DavidMcLean?]
My philosophy is to make the safer option the default, but let the shop managers or owner make the final decision as far as the trade-off choices. I build tools, not cops. (And some drivers may not have the ability to provide pre-analysis.)
[If the safer option is indeed the default, why'd you immediately jump to the unsafe option as soon as I asked for usage examples? I'm honestly confused. -DavidMcLean?]
If you are not explicit, I will choose the shortest interpretation out of both laziness and to keep the examples easier to read.
[Right, I'll be explicit. I'd like to see the safest, most correct, and most idiomatic version of any example. -DavidMcLean?]
Some companies choose nimbleness or shorter-term productivity over safety. The risk profile that the owner wants is not the programmers' job to decide, other than advise.
[I'm not some companies. As I said, I'd like to see the safest, most correct, and most idiomatic solutions. With a good design, those three factors should all be met simultaneously. -DavidMcLean?]
Naive thinking about the real world. CategoryIdealism.
[Safety, correctness, and conformance with language idioms are all aspects of a design that aren't impacted by the real-world desires of "some companies", only by the decisions of the language designers themselves. In short, the real world is irrelevant. -DavidMcLean?]
Wrong. If companies and orgs don't want to use it, it gets selected into obscurity. Purity Island.
[Wrong how? It's true that unused technologies sink into obscurity. That has no impact on the properties of safety, correctness, and conformance with language idioms, which are the properties I'm concerned with. -DavidMcLean?]
And I'm concerned with world peace and saving sick puppies.
[Good to know. I'm excited to hear how supporting SqlInjection in your programming language will cause world peace. -DavidMcLean?]
It could end up doing the opposite of what you intended because people will become disillusioned with what they consider "anal" products and revert to super-loose systems instead. People tend to "fight the last battle" meaning that a bad experience with an approach will tend to make them focus on avoiding anything remotely the same such that the pendulum swings to the opposite side. A middle-level approach thus may be a better way to achieve that goal in practice. If you don't understand human nature, you will get bitten by human nature. We make products for mostly people, NOT machines. -t
[But with correctly-written data source drivers, isn't writing correct, safe queries in TOPX exactly like writing SQL? Wasn't that the whole point? -DavidMcLean?]
Like I said, somewhat often drivers can't handle something properly or do not yet support pre-parsing. The option to use the raw mode (a.k.a. "pass-through") should be available to at least the shop IT stewards. I agreed it should NOT be the default (if driver supports pre-parsing). This is not unreasonable. Good drivers can be hard to find because the audience can be relatively small because it's a combo of specific product and a specific database/service vendor. Asking a vendor or the OSS community to write query-grammar-perfect drivers for thousands of DB tools/products, many of them very "nichey", is a tall order. Raw-mode drivers are probably going to be available earlier (and more readily) than grammar-based drivers. And this applies to prepared statements (TopOnPreparedStatements) also, not just the hypothetical TOPX language above. Let me put it this way: A raw-mode driver is better than a broken or missing driver (unless you are the military, FBI, etc. where security far outweighs productivity.)
[It's not expected for vendors to write drivers for thousands of database products. Vendors just write drivers for their own database products; that's how it already works, and there's nothing fundamental about TOPX that means it can't keep working that way. In my opinion, a database driver that can't prepare statements is worthless, but clearly our opinions differ. -DavidMcLean?]
Are you talking about a generic kind of driver, such as ODBC? If there is and app-specific API or language, then something has to translate that API/language into something that the ODBC driver and/or the server can understand. For example, the Linq API is a language until itself. Something has to translate the Linq "tree" into SQL and/or something the destination server can understand. And I don't know if Dot.Net's Linq driver is different from Java's Linq driver etc. Either way, TOPX would need something similar for each DB vendor. Something has to translate a non-trivial language (app/client side) into another non-trivial language (DB server side).

      for V = each db vendor/service {  // Oracle, Sybase, Mysql, etc.
         for L = each app/client-tool language {  // VB, Java, Php, etc.
           write a Linq or TOPX driver[V,L]
        }  
      }

[I'm talking about a specific driver for a specific platform, not a generic one like ODBC. For example, MongoDB provides a driver for the .NET framework. This driver includes support for LINQ, because that's a feature of .NET that database vendors can reasonably be expected to support in their .NET drivers. There's no real distinction between a vendor providing a .NET driver, which is something they already do, and providing a TOPX driver. Naturally, said drivers must translate LINQ or TOPX-query-language (respectively) to their native query format in a sensible fashion. It's hardly impractical to do, though, because it's already being done for LINQ. -DavidMcLean?]
That's pretty much what the loops show. For top languages the investment in thorough drivers makes economic sense. However, for niche languages/tools, the incentive to match grammars accurately may not be there. If a good driver is available, you use it. If not, then you may have no other choice but pass-thru mode, or switch database vendors, which may not be so simple if you have established data and queries written for it or it's another business/service.
[It's an expectation of .NET database drivers that they provide LINQ support. It'd similarly be an expectation of TOPX database drivers that they provide TOPX-query support. You're right that everyone who's not us mightn't put effort into coding their drivers properly, though (NotInventedHere ;)). However, realise that your pass-through mode only actually works for SQL databases, because TOPX queries are written as (a close approximation of) SQL. A generic properly-parsing driver could be used for SQL databases, based on something like ODBC or using the same database-agnostic principles practically every ORM employs; if you can provide a generic-SQL driver, of what value is a pass-through mode? -DavidMcLean?]
Re: "your pass-through mode only actually works for SQL databases, because TOPX queries are written as (a close approximation of) SQL" -- Incorrect. Pass-through is exactly that: pass-through. You can send BrainFsck to a BrainFsck database, if such existed and had a driver for it. I remind you that pass-thru mode is only an option, not mandatory, but this may vary per driver as different drivers can choose to use native or pass-thru or make an option for both, with a default to one or the other. The model is flexible that way. A driver designer has these choices: -t
- Support only pass-through mode
- Support only native mode (TOPX's dialects, such as its built-in SQL dialect, XML, etc.)
- Support both, but make pass-through the default
- Support both, but make native the default
Also note that "native" may mean native to the driver, for the driver can create its own "client-side" language that may differ or not exist at the target server. Further, TOPX can in theory support a native XML, JSON, etc., although it would be up to the driver and/or server to interpret (run) those. "Native" is perhaps not specific enough being that native to TOPX, native to the driver, and native to some (distant) server all mean different things. (If TOPX did support "native" XML, it would parse it into a table, just like the SQL below, which the driver then uses). Review:
- TOPX native - TOPX can parse query text directly and use it for local app-side operations or hand parsed results to a driver. (Query may be in SQL, XML, and maybe others)
- Driver native - Driver can parse it and translate for whatever use it's designed for, both local and server-based. Useful for, but not limited to, adding a new local app query language. For example, adding a JSON parser (if not natively supported).
- Server native - A server or network service parses and processes the query text. This is generally called "pass-through".
[So it's possible to have practically any language that can express queries in some form inside a TOPX query block. Wouldn't that be, y'know, confusing? How does one know what to expect from a given query if there's no consistent rules about how they're handled? -DavidMcLean?]
Why is it "confusing"? If a given org doesn't use Foo Query Language, then they are not going to have to worry about seeing and learning Foo Query Language. Plus, if no driver or server-side adapter for FQL is included or provided, it can't be processed anyhow.
[It might be confusing because there's no obvious indication of the intended semantics for any particular query block. -DavidMcLean?]
I'm not sure what you want, a required comment and/or name? Note that normally there would be a driver name, or driver config name. If you want to distinguish between say SQL and FooQL queries, having a useful driver name is one way to do it. But keep in mind that unless you use one of the out-of-the-box drivers or driver config names, the naming is up to the organization. Flexibility also creates responsibility to label things well. That should go without saying. Oracle is often the same way: a config file is set up and each database access space (for lack of a better phrase) has a name assigned by the org. This is more or less the same thing, except that it can go beyond a given vendor's query language.
[Well, what I'm seeing in your query-loop blocks is actually something very much like Ruby's ability to host an EmbeddedDomainSpecificLanguage, in that it's a way to embed another language inside a specially-delimited block. The two differences are that a) Ruby DSLs are required to stick to Ruby's syntax in general (and to an extent to its semantics too) and b) Ruby ones have at least a method name to indicate what sort of meaning the following block might have, varying among methods instead of always being "query". These differences are both advantages in Ruby's favour, I think: The required consistency of syntax both makes the code visually cleaner and reduces impedance mismatch between languages, and DSL method names (when well-chosen) document the intended semantics of a block reasonably clearly. How would this query-loop syntax capture those advantages? -DavidMcLean?]

Translation from SQL to Google Example

This is an amateur-style parsing approach to converting the Carmen01 example query to Google's URL "command". (There are fancy language processing techniques or libraries that can be used for such, but this is not a demonstration of building "good" translators.)

To simplify things, I will only show two criteria factors instead of four, and assume the ID is the same as the sequence number (see example struct01).

Although Google's actual facial detection is given as all or nothing (&imgtype=face), I'm assuming a threshold here to make it more realistic (1.0 is 100% certainty it's a face). Thus, I added "face_op" to indicate the made-up comparison operator and am assuming any other such thresholds append "_op" to the name for the comparison operator counterpart.

[I'm amused that you saw fit to change the actually-existing API to a non-existent one to increase realism. (You do make a good point, though, so no worries.) -DavidMcLean?]

  ID   ParentID  token        tokenType
  -------------------------------------
  1    0         sql          root
  2    1         SELECT       clause.select
  3    2         *            column-filter
  4    1         FROM         clause.from
  5    4         googleImages var.tableview   
  6    1         WHERE        clause.where
  7    8         faceness     var.column
  8   10         >            op.cmpr.gt
  9    8         0.7          value.number
 10    6         AND          op.bool.and
 11   12         textQuery    var.column
 12   10         =            op.cmpr.eq
 13   12         Carmen Electra value.string  // don't need to preserve quotes

Let's say the final result is the following URL being sent to the Google Search service:

  searchkit.google.com?customerid=12345&face_op=gt&faanyhowce=0.7&q=Carman%20Electra

Here is the driver processing in pseudocode. (I use BASIC-like code mostly to make fixing TabMunging easier.)

 func processGIquery(queryText, parseTbl, info[])
    // skip verifying table name because we are lazy
    mapCrit=map("faceness":"face","textquery":"q","etc":"etc") //criteria name translation
    mapOp=map("op.cmpr.eq":"eq","op.cmpr.gt":"gt","etc":"etc") //compare op trans 
    gglURL = "searchkit.google.com?customerid=12345"   // init
    // find the WHERE clause
    wid = parseTbl.firstVal(col="id", where="tokentype='clause.where'") // quicky query
    //
    Query:  // (internal table so no driver need be specified)  
      SELECT * FROM $parseTbl.name      // get all critera expressions
      WHERE id > $wid AND tokenType LIKE 'op.cmpr%'
    Loop:   // for each comparison operator (2 in this case)
      procCriteria(id, tokenType)
    End Query
    return sendHttpGet(gglURL)  // send URL with appended parameters
   // --------- (nested function; "||" is string concat)
   func procCriteria(opid, operType)  // process a comparison operator
     left = parsTbl.firstVal("token","id="||opid-1) // left side of op
     right= parsTbl.firstVal("token","id="||opid+1) // right side of op
     gglURL.append("&"||mapCrit[left]||"="||right) // add translated crit name
     if operType != "op.cmpr.eq" // add comparison portion if applic.
       gglURL.append("&"||mapCrit[left]||"_op="||mapOp[operType])  
     end if
   end func
 end func

Not shown is the code that converts the return results (XML or JSON) into a table.

The native dialect of "TOPX" SQL could perhaps support "escape" clauses:

 query:
   SELECT * FROM table2 WHERE x=2 ESCAPE $specialStuff ORDER BY y
 eachRow:   // loop:?
   print foo
 end query

The stuff within the ESCAPE clause is sent to the database as-is. In some cases this could get messy if the translated version of the SQL (if translation is done by the driver) does not match up compared to how we expected it to and the escaped clause ends up in Timbuktu. For example, suppose a database does not support explicit JOIN clauses and translates them to the older-style WHERE...AND clause version such that the ESCAPE clause's material is "moved" to a different part of the query then where we expected.

This is not necessarily a design flaw of our draft TOPX, just the side-effects of allowing flexibility. However, the level of flexibility allowed is always a matter of debate.

[Allowing for bypassing the language's parsing might indeed be necessary as a feature in some cases, but calling it ESCAPE is probably an awful, awful choice. That keyword connotes the idea of escaping values to make them safe, as in SQL string escaping or shell escaping. This backdoor deliberately has the opposite effect, rendering an otherwise safe value "live" and dangerous. EVAL might be a better keyword for this concept, since it does indeed cause some value to be eval()-style evaluated by the database, and most devs are familiar with the injection potential and hence possible dangers of an eval() call. -DavidMcLean?]