Database Abstract Interface Discussion

(Moved from GreatLispWar, which is getting TooBigToEdit.)


...Except, of course, for every indivisible algorithm that requires customisation, as shown in HofPattern.

It's great because you say it's great. The pattern is rare in the wild, or at least uncompetitive with alternative solutions.

[What about qsort(), which is part of C's standard library? What about approximately half of Ruby's standard library? Are standard libraries rare in the wild? What about any genetic algorithm ever written, DijkstrasAlgorithm, AstarSearch, and so on? -DavidMcLean?]

I've never ever used qsort. I imagine it's useful for SystemsSoftware and embedded systems, but those are not my niche. And much of Rudy's library, and perhaps qsort's comparison techniques, could be implemented in objects. If you don't like Ruby's object syntax, complain to the Ruby makers, not me.

[If you use an object constructed on-the-fly with an associated callable method---the C++ functor pattern, effectively---then you're using higher-order functions. It's exactly the same thing, just with syntactic overhead. The minimal syntactic overhead on blocks in Ruby is vital to their utility in code; any syntax that required declaring an object-with-some-named-method to pass to things like Enumerable#each would be too verbose to be used as ubiquitously as Enumerable#each is and should be. JavaScript might be a better example to look at here, because its object syntax is extremely concise; despite this, one finds that the vast majority of functions-that-need-to-take-functions just take functions, not objects. -DavidMcLean?]

Again, that's perhaps a problem with Ruby, not objects in general. When deciding what to include in a language or shop conventions, if x and y are similar and overlap in many areas, and you agree that it should have x, but circumstances where y shines over x are not used/found very often in the field, then it makes sense to exclude y if you want to keep the language relatively simple. (NonOrthogonalLanguageFeatures.)

[You could change the object syntax such that there's syntactic sugar for creating a C++-functor-type object, i.e., an object with a callable associated method, which is what you're actually trying to pass around. However, providing sugar like that is providing lambda syntax: It's exactly what Java 8 is doing. On the other hand, if you're trying to reduce the number of related concepts in a language, why not remove named functions and methods? CoffeeScript proves that you only actually need anonymous ones in the syntax. -DavidMcLean?]

Or we could hard-wire table/collection-oriented idioms into the language rather than rely on libraries (a bit like ExBase). This may prevent unwanted creativity. I'm sure you would object (no pun intended), but it's back to the 'ol "herding" debate.

[That's adding more NonOrthogonalLanguageFeatures, Top, and specialised ones at that. In addition, if we represent collections using parts of the standard library (the Enumerable module, in Ruby), we can add more collections. If we hardwire collection handling, how would we do that? -DavidMcLean?]

First let's analyze some realistic situations where that might be a need.

[There's a database vendor who isn't the developers of the language. We want to be able to query that database. -DavidMcLean?]

ODBC driver and/or a C API.

[And how do we make our hardwired collections handling use that C API, when it wasn't designed with such an API in mind? -DavidMcLean?]

Maybe I misunderstood you. I thought you meant like a tool vender being able to use this hypothetical TOP language's "internal" database data. Instead you mean having "TOPX" communicate with another vendor's database? Something like ODBC is the usual way. And I don't see how Ruby improves on cross-language and cross-vendor data sharing.

[Yes, I meant querying another vendor's database. Let's make it concrete: You're using TOPX and you find you have to query a MongoDB database. How do you do this, preferably without sacrificing your collection syntax? -DavidMcLean?]

  queryLoop to=q, db=mongo1, qry="Magic mongo syntax"  // for each result record
    print q.name, q.rank, q.serial  
    doSomethingElseWithRow(q)
  End 

  queryLoop to=q, db=mongo1, glib="nof", florf=7, tloog="pift", zarg=x
    print q.name, q.rank, q.serial  
    doSomethingElseWithRow(q)
  End 

          Query native=true :
             SELECT $myFields
             FROM FOO
             WHERE X=5 AND $myAndClauses
             ORDER BY $myOrderFields
          Loop: 
             print a, b, c
          End Loop
 // Example: Carmen01
 // search for monotone (greyish) jpeg's of Carmen Electra's face
 Query driver="googleImages":
   SELECT * FROM googleImages 
   WHERE faceness > 0.7 AND extension IN ('jpg','jpeg') AND 
     saturation < 0.2 AND textQuery = 'Carmen Electra'
 Loop:
   print url, size, width, height, extension, altTag
 Status: // std. query result info
   print "Rows: $rowcount, Query-Time: $qryTime"
 End Query
 // [leave dot below to work around a wiki bug]
.

         // Example struct01
         table: parseTree
         ---------
         ID
         parentID  // zero for root
         sequence
         token
         tokenType   // token or value type
         grammarRef  // grammar rule reference number/ID
         warningMsg   
         errorMsg    
.
      for V = each db vendor/service {  // Oracle, Sybase, Mysql, etc.
         for L = each app/client-tool language {  // VB, Java, Php, etc.
           write a Linq or TOPX driver[V,L]
        }  
      }


Translation from SQL to Google Example

This is an amateur-style parsing approach to converting the Carmen01 example query to Google's URL "command". (There are fancy language processing techniques or libraries that can be used for such, but this is not a demonstration of building "good" translators.)

To simplify things, I will only show two criteria factors instead of four, and assume the ID is the same as the sequence number (see example struct01).

Although Google's actual facial detection is given as all or nothing (&imgtype=face), I'm assuming a threshold here to make it more realistic (1.0 is 100% certainty it's a face). Thus, I added "face_op" to indicate the made-up comparison operator and am assuming any other such thresholds append "_op" to the name for the comparison operator counterpart.

  ID   ParentID  token        tokenType
  -------------------------------------
  1    0         sql          root
  2    1         SELECT       clause.select
  3    2         *            column-filter
  4    1         FROM         clause.from
  5    4         googleImages var.tableview   
  6    1         WHERE        clause.where
  7    8         faceness     var.column
  8   10         >            op.cmpr.gt
  9    8         0.7          value.number
 10    6         AND          op.bool.and
 11   12         textQuery    var.column
 12   10         =            op.cmpr.eq
 13   12         Carmen Electra value.string  // don't need to preserve quotes

Let's say the final result is the following URL being sent to the Google Search service:

  searchkit.google.com?customerid=12345&face_op=gt&faanyhowce=0.7&q=Carman%20Electra

Here is the driver processing in pseudocode. (I use BASIC-like code mostly to make fixing TabMunging easier.)

 func processGIquery(queryText, parseTbl, info[])
    // skip verifying table name because we are lazy
    mapCrit=map("faceness":"face","textquery":"q","etc":"etc") //criteria name translation
    mapOp=map("op.cmpr.eq":"eq","op.cmpr.gt":"gt","etc":"etc") //compare op trans 
    gglURL = "searchkit.google.com?customerid=12345"   // init
    // find the WHERE clause
    wid = parseTbl.firstVal(col="id", where="tokentype='clause.where'") // quicky query
    //
    Query:  // (internal table so no driver need be specified)  
      SELECT * FROM $parseTbl.name      // get all critera expressions
      WHERE id > $wid AND tokenType LIKE 'op.cmpr%'
    Loop:   // for each comparison operator (2 in this case)
      procCriteria(id, tokenType)
    End Query
    return sendHttpGet(gglURL)  // send URL with appended parameters
   // --------- (nested function; "||" is string concat)
   func procCriteria(opid, operType)  // process a comparison operator
     left = parsTbl.firstVal("token","id="||opid-1) // left side of op
     right= parsTbl.firstVal("token","id="||opid+1) // right side of op
     gglURL.append("&"||mapCrit[left]||"="||right) // add translated crit name
     if operType != "op.cmpr.eq" // add comparison portion if applic.
       gglURL.append("&"||mapCrit[left]||"_op="||mapOp[operType])  
     end if
   end func
 end func

Not shown is the code that converts the return results (XML or JSON) into a table.


The native dialect of "TOPX" SQL could perhaps support "escape" clauses:

 query:
   SELECT * FROM table2 WHERE x=2 ESCAPE $specialStuff ORDER BY y
 eachRow:   // loop:?
   print foo
 end query

The stuff within the ESCAPE clause is sent to the database as-is. In some cases this could get messy if the translated version of the SQL (if translation is done by the driver) does not match up compared to how we expected it to and the escaped clause ends up in Timbuktu. For example, suppose a database does not support explicit JOIN clauses and translates them to the older-style WHERE...AND clause version such that the ESCAPE clause's material is "moved" to a different part of the query then where we expected.

This is not necessarily a design flaw of our draft TOPX, just the side-effects of allowing flexibility. However, the level of flexibility allowed is always a matter of debate.

[Allowing for bypassing the language's parsing might indeed be necessary as a feature in some cases, but calling it ESCAPE is probably an awful, awful choice. That keyword connotes the idea of escaping values to make them safe, as in SQL string escaping or shell escaping. This backdoor deliberately has the opposite effect, rendering an otherwise safe value "live" and dangerous. EVAL might be a better keyword for this concept, since it does indeed cause some value to be eval()-style evaluated by the database, and most devs are familiar with the injection potential and hence possible dangers of an eval() call. -DavidMcLean?]


See also: EmbraceSql


EditText of this page (last edited August 21, 2013) or FindPage with title or text search