Top's QueryLanguage (TQL) Proposal
The TqlName has not been decided yet, partly because "TQL" is already taken.
SMEQL (pronounced Smeagol and wiki-tized to SmeQl) is one contender.
SMEQL stands for Structured Meta-Enabled Query Language.
TopMind is working on a relational language to overcome SqlFlaws. It has a BusinessSystemTwelve-like functional style (FunctionalProgramming) that makes it easier to break queries into digestible chunks, avoiding the "run-on" nature of SQL. It differs from BS-12 in that the arguments can only be one of 3 things:
- Tables - virtual or real
- Expressions that return scalars or tables
- Scalars (values like numbers or strings)
Being only one of 3 three things simplifies the implementation, understanding, flexibility, and extensibility of the language. If new constructs are added in the future, it usually does not require changing the language, only creating new operations with parameters that fit the standard. It thus is a library addition instead of a language addition. The syntax also makes it easier to implement the operations as regular functions or methods of "typical" programming languages.
TopMind has not seen this ability in other query languages. The general syntax is:
resultTable = operation(parameters.....)
One can optionally nest these:
t1 = foo(blah)
t2 = bar(t1, zog)
Into:
t2 = bar(foo(blah), zog)
Note that the final result set name is not needed. Thus, one could type:
bar(foo(blah), zog)
However, most examples here will not use the nested form. The
intermediate tables should be considered virtual tables. They are not assumed to be "saved" unless explicitly requested. They are merely a way to linguistically reference query "chunks". An optimizer may even opt to skip the internal creation of parts or all of virtual tables as long as the final result is the same. (SQL provides views and/or nested SELECT's for a similar purpose, but these suffer the awkward
ThickBreadSmell, cannot be reused in the same statement, or require DBA intervention.)
Key Language Features:
Sample Code:
Discussion:
- TqlName -- to be determined.
- TqlDesignGoals
- TqlPrefix
- TqlOverloading discusses the cols() syntax, operator overloading, syntax context, and interpreters.
- TqlLacksaNaturalJoin
- TqlImplementation
- TqlChainedJoin - Questions about declarative versus imperative also raised.
- How are comments delimited? Is this implementation defined?
- How about we assume "//" for now. I have not seen a need for start-end pairs yet. I suppose maybe we should go with "--" to be consistent with current approaches, but I never liked those much. Vote?
- Very trivial detail. Just pick either -- or # by fiat, and no one will lose sleep over it.
- The examples tend to use "//". It stands out well (visual separation), avoids "wasting" any single character away from the language itself, familiar, and is easy to type.
- TQL will attempt not to over-dictate the "domain math" and typing system used as part of the standard in order to be domain-flexible. See below for type-related information.
Issues Postponed or Not Fully Addressed
- A procedural "wrapper" language such as Transact-SQL.
- How Nulls are addressed and how they differ from SQL's. (May be related to the "domain math" issue.)
- What statements start, commit, or rollback a transaction? Is this DBMS-defined? Is this implementation defined?
Contributors to the TQL pages:
Why differentiate scalars and tables? If a scalar is a 1 row, 1 column table then you don't need to worry about what you're dropping where, because EverythingIsA table. This could rapidly turn into a Lisp dialect if you aren't careful ;)
Perhaps this relates to the syntax issue raised in TqlOverloading. A TQL operation has to know what kind of parameters it is getting. If a parameter can be more than one of the 3 kinds of things listed above, then the syntax may grow complicated.
Perhaps conversion functions can be provided to convert from one kind to the other so that a reliance on type overloading isn't necessary. I would like to explore some UseCases first. --top
Comments about implementation moved to TqlImplementation.
Type Issues
- TQL does not necessarily assume "typed" relational. It leaves the typing system to the "domain expression engine/math", as long as such expressions provide Boolean-evaluate-able results, and perhaps equality for some join kinds and collating sequence interfaces for sorting (integer sequencing). Related: DoesRelationalRequireTypes. How things such as collating sequences are specified, such as system-wide configuration versus optional parameters, needs to be settled. But not all of these necessarily have to be part of the TQL standard.
- I wouldn't go building any type assumptions into TQL -- most PL/SQL I've seen that's "aware" of types only did so because of unfortunate design decisions beforehand.
- It depends on what you use your database for, types as a method of ensuring data integrity are important to me. On the other hand, the guy(s?) behind SqlLite? feel that types are contrary to relational and implement everything as strings. http://www.sqlite.org/datatypes.html for some of their thoughts. Back on the gripping hand.
- They are also completely contradictory and completely BS'ing by saying such conflicting nonsense as: "Even though SQLite allows the datatype to be omitted, it is still a good idea to include it in your CREATE TABLE statements, since the data type often serves as a good hint to other programmers about what you intend to put in the column." Oh I see now. So types are not useful, but they are useful. Then they go on to say "SQLite accepts all the usual datatypes." Oh, I see now. It is typeless - but it has types available. Right. Then they go on spouting more BS such as "If you are ever in doubt about the datatype of an expression you can use the special typeof() SQL function to determine what the datatype is.". I see. So it is typeless, and types are a misfeature - but SQLite sure has types! Right. Makes 100 percent logical sense. "This behavior is a feature, not a bug.". Way to go. Made it to FabianPascal's quote of the month on DbDebunk.
- Ideally, I'd create an optional schema "typing system" as "validation" instead of types. That way you can add validation features in a mix-and-match style. Things like "positive" is best as a validation attribute instead of "type" (or use ranges). Anyhow, I think types and relational should be more or less orthogonal issues (see DoesRelationalRequireTypes). We don't have to bind them to the relational engine nor the query language. --top
- You're attempting to maintain a DifferenceThatMakesNoDifference. See PredicateTypes.
- I don't see what this has to do with predicate types. TQL does not address types. It simply passes expressions to the "expression system". (For efficiency sake, some integration may be needed in practice.) Note that the comments about SQLite are *not* mine. --top
- Your sentence: "as 'validation' instead of 'type'" addresses types. And it is the distinction between 'validation' and 'type' that is the DifferenceThatMakesNoDifference. Validation based on an expression entirely over immutable values (as opposed to the current time, windspeed, etc.) will always qualify as 'PredicateTypes'.
- That said, I think I understand what you are aiming at based on your response: you are saying that the generic 'script' getting passed to the correct 'expression system' (which might be a plugin) does whatever it wants to produce a value, and that TQL could be enabled to 'validate' the result with yet another script (which might go to a different 'expression system' with a different plugin). A TQL implementation can effectively be a ScriptingLanguageAgnosticSystem... or at least mostly so (one might still benefit from a few common datatypes that generic scripts need to handle as input or produce as output like bool, BLOB, string, etc.). In this case, validation by script is STILL a form of PredicateTypes. But, if I'm understanding you, you're attempting to clarify that TQL is not responsible for validating the scripts or expressions themselves. (Of course, it could always call some sort of script validation method on the correct 'expression system' plugin... to at least make sure the script parses correctly).
- Am I catching your drift?
- Generally you are correct. The "validation" comment was kind of a side comment about my typing preferences and probably should be moved or rephrased to avoid confusion. Whether validation is "typing" or not gets us into the definition battle that has generated large debates around "intent" etc. As far as passing expressions to a processing engine, that may be a valid conceptual way to view the desired separation; however, the ties probably have to be closer in practice for efficiency sake. I doubt clean separation can be achieved without performance sacrifices. But if somebody did create a production-targeted engine, hopefully they will keep separation in mind so that a different sub-language can be put in (re-coded) with minimal overhaul of the relational side of things. (I don't have the skills to produce a production-level engine with regard to performance; only a plunky demo.) This advice applies to any relational language/engine, not just TQL/SMEQL. --top
The Hadoop "Pig" query language breaks queries into SmeQl-like chunks, and thus has a similar granularity.
Would that be in the family as NoSql (pronounced Nazgul)?
The NoSql movement seems to be more against "Oracle-style" database engines, not so much SQL the language.
See also: RelationalLanguage, FunctionalProgrammingLanguage
CategoryTql CategoryQueryLanguage CategorySpeculative