Flirt Data Text Format

A data exchange format originally drafted in RelationalAlternativeToXml but later influenced by Fusdx. Unlike CSV, FLIRT allows multiple tables and possibly column meta data in a single file. "Flirt" is short for "First Letter Indicates Row Type". Actually, it is a symbol, not a letter, but "fsirt" is harder to remember. (Better name suggestions welcomed.)

One possible alternative: LIIRT - Leading Item Indicates Row Type, but still not as "catchy".

Goals of FLIRT:

Ability to have multiple tables in one file
Basic optional meta-data (DataDictionary)
K.I.S.S. (simplicity)
Easy to write a parser for
A reasonable amount of error checking such that transmission or syntax errors don't "shift the whole boat", but rather localize the corruption to one or few cells or rows.
Relational-friendly
Custom extendability without breaking base standard and syntax using the ignore-if-name-not-recognized approach that HTML does.

--top

One variation that may eliminate the "lasting counting error" (described in FusdxDiscussion) problem is something like this:

@table $product_log @columns $mydate $price $shortdescription $longdescription @record $12-DEC-2002 $27.93 $red t-shirt $This red t-shirt is a perfect gift \n\ It also comes with a free tie. @record $01-JAN-2003 $15.79 $blue pants $These pants are for the winter \n\ They also can be cleaned with bleach. @record $Etc..

The character in the first character column tells whether the line is data or a section marker. The marker character also acts as check on the data. If say somebody forgets to escape a line-feed, then the next line will likely be missing a marker ("$" or "@") and the parser can display a warning. --top

[Spot on, Top. Your scheme provides facilities for error-detection on a record-by-record basis, as well as clearly distinguishing an empty value ('$' followed by a newline) from a record break, which FUSDX does not. I consider the chance of FUSDX happily absorbing 100 gigabytes or so of records -- assuming a missing column value followed by a series of record breaks mistaken for empty column values and vice versa -- before deciding the column count is erroneous, to be unacceptable. I wouldn't use it. I would use Top's approach.]

Now we just gotta agree on how to specify (optional) types and lengths. Let's see. One approach is to extend the "@columns" group:

@columns $mydate,date $price,number,12,2 $shortdescription,char,20 $longdescription,char,100

Or give each their own group:

@types $date $number $char $char @maxlength $ $12 $20 $100 @decimals $ $2 $ $ @record $...etc...

Both bother me in different ways. Maybe we could just define a DataDictionary table rather than add new element kinds to our setup. Should we reserve a table name or add an optional "@dictionary_table" or "@dictionary" group?

By the way, multiple tables (including dictionary tables) could be separated as follows:

@table $foo_table @columns ... @record ... @table $bar_table ...etc...

So now we have:

@dictionary @columns $table $name $type $maxlength $decimals @record $my_products $mydate $date $ // note that we could eliminate this line without harm $ // note that we could eliminate this line without harm @record $my_products $price $number $12 $2 ...etc... @table $my_products @record $12-DEC-2002 $27.93 $red t-shirt $This red t-shirt is a perfect gift \n\ It also comes with a free tie. @record $01-JAN-2003 ...etc...

If this looks confusing, an outline of this would resemble:

@dictionary
- @columns
- @record
- @record
- @record
- etc..
@table
- $ [table name 1]
- @ record
- @ record
- @ record
- etc..
@table
- $ [table name 2]
- @ record
- @ record
- @ record
- etc..

Optional alternative dictionary associations are discussed below.

In response to a since-removed (though not recanted) claim that FLIRT lacks the extensibility its author believes it to possess:

I saw no need to remove your suggestions. Perhaps they belong under a different topic eventually, but brainstorming is welcome.
And I'll move them to a different topic. Eventually.

What is the alleged extensibility limit on the DD? A FLIRT DD can have a million DD columns if needed (only a few are part of the standard, but bajillion extra custom ones are allowed). However, I should point out that heavy meta-data is NOT the intended use of FLIRT (even though it is possible). Simple DD's are easy and don't need new syntax, and fancy DD's are still possible, even if not so favored. One can even create their own tables for meta-data info if many-to-many DD's are needed, etc. And if you find a problem, let's think of ways to fix it without inventing new FLIRT syntax conventions. We don't have to throw out the baby with the bathwater. --top

I described the alleged extensibility limit above, if you had cared to read; finding the allegations should have been rather trivial, even if you disagreed with them. As noted, the need for FLIRT to order parameters and to have order-significant parameters to the DD is a significant blow to extensibility. You say your DD can have a million columns, but you'll literally need to fill in a (million-1) columns (rows in the FLIRT format) just to specify what the value is for the millionth, at which point you'll have long lost track of which row you're on (especially if attempting to read or write the code by hand). Humans won't be able to track much more than seven columns.

One generally would not hand-edit a complex meta table. It's a transfer format, not a programming language.

"If say somebody forgets to escape a line-feed, then the next line will likely be missing a marker ("$" or "@") and the parser can display a warning. - top"
If you want an un-editable transfer format, then don't provide suggestions about "somebody forgetting XYZ" - leave that to the machines. If you want a hand-editable transfer format, then don't rag on people who make suggestions to make things hand-editable. I've never much cared for your consistent hypocrisy, top, but I can't ask you to stop being one: I've concluded that there is only one way for you to stop being a hypocrite, and it would make you rather unproductive.
It could be programmatically forgetting to escape something. In fact, I made exactly that mistake a few weeks ago. Only one column in a table of about 60 columns had text that permitted line-feeds. I forgot to have the export program escape them and it messed up a CSV export that somebody was trying to do. I forgot to test CSV exports with that column. --top

This implies a need for KeywordParameterPassing for structures that are very sparsely loaded, such as a DataDictionary.

I too like KeywordParameterPassing. However, I don't see a real need for it here. FLIRT is generally a relational *data* transfer format whereby most info comes from existing RDBMS tables. If your specific app needs KeywordParameterPassing, then go ahead and add it as an item:

 @dictionary
 ...
 $keywords
 @record
 ...
 $foo="blah", bar="snoog"
 ...

It would just be up to your app to parse such an item because such syntax would not be part of FLIRT itself. Or, use a many-to-many AttributeTable to specify such info. If you need to rely heavily on KeywordParameterPassing for whatever reason, then I would recommend XML instead.

As far as 'throwing the baby out with the bathwater' with 'new FLIRT syntax conventions' - you're being a bit extreme in your categorization there, I believe. FLIRT doesn't have "syntax conventions" - you're still in the process of deciding 'convention' unless your mind is carved in stone, in which case any discussion on my part would be pointless. There are reasons for adding to this 'syntax convention' for more than just the DD, such as for purpose of specifying 'table' and 'order' to records without having to re-specify the whole table descriptor (which allows separation of data and schema).

Please clarify this. (Which part?)

Statement 1: FLIRT doesn't have "syntax conventions"
- Clarification: you're still in the process of deciding 'convention'
Statement 2: - unless your mind is carved in stone, in which case any discussion on my part would be pointless.
- Clarification: Top, in my experience, is a rather close-minded individual who has a (very) long history of insisting on his own view of the world, thumbing his nose at academia, screaming at people who suggest books, ignoring or resisting rational argument, and otherwise avoiding any sort of education that would lead to an eventual admission of having been wrong.
- There is "no point" in holding a discussion with such an individual, no matter how productive it might have been with someone more open-minded. Therefore, such a discussion "would be pointless" unless said individual was making an active effort to be open-minded to suggestions. This is obviously not the case for an individual who likens a change to "syntax conventions" (that aren't yet even fully conceived) to killing babies.
Statement 3: There are reasons for adding to this 'syntax convention' for more than just the DD
- Clarification: such as for purpose of specifying 'table' and 'order' to records without having to re-specify the whole table descriptor (which allows separation of data and schema).
- Even More Clarification would have been readily available if had you cared to enlighten yourself by actually reading my suggestions. But see point 2. Then see down the page a bit further, where I talk about the properties of XML a wise creator of a 'modern' format should be out to steal and make his own.

But if your mind is set, then so be it; I really don't care whether your 'FLIRT' format is simplistic because I recognize it as additionally unmarketable as currently designed, therefore I won't ever be forced to use it. I am accepting your 'reactionary' response to be your final answer, and I will be withdrawing from this page.

You seem to be envisioning a use for it much different than what I am envisioning, and its creating a communications gap.

I envision a far more flexible use than you're envisioning, apparently - including greater support for hand-editing, far improved human readability, support for DBMS management and autonums, and easier to understand and implement support for types (i.e. with real semantics, rather than just a bunch of non-semantic strings thrown into a special table that forces it to be "just be up to your app" and thus neutralizes the whole point of having a format to start with). But the same primary use at no loss to efficacy.
It seems to me that your goal is to avoid anything XML-like by some sort of foolish thought that everything XML-like is bad. A wiser person might attempt to learn what properties were good about XML, and which were bad, such that the good properties can be redesigned into your new modern format, and the bad can be dropped. The good properties about XML as I see it are:
- hand-editability and readability, including comments. These allow a 'transport' file to also be used for 'configuration' and 'database bootstrap' purposes, and just reduces risk that someone will damage it.
- Ability to divide the meta-data/schema from the data. Not only does this allow the parsing process to control the 'type' of data it is willing to accept, but it also:
  - saves on transport-costs because it is very likely that the communicating entities 'share a schema',
  - improves confidence that applications will understand one another, and
  - allows for metaprogramming to optimize the processing of data (though this requires that there be 'official' semantics to the type metadata)
- Stability in the face of extension.
  - For a Database transport format, this really requires insensitivity to column-reordering in the schema. Therefore, one must be able to specify the order of the columns when specifying a record. If a schema is separated from the data, one requires the ability to specify the table, anyway, so the ability is a WinWin.
- 'Small' data-transport - the basis of AJAX and many communications - also essentially requires separation of schema from data, but benefits from ability to handle multiple different small updates in a single message.
Also, there are the 'bad' properties of XML
- Vastly redundant syntax. Angle brackets and too many assignments make it difficult to examine (lower readability than is ideal).
- Need to specify keyword parameters every single time. (The ability to have 'implicit' column-ordering based on earlier statements is something FLIRT already possesses; my suggestions took advantage of that, simply allowing for explicit reordering to become implicit on all further statements.)
- XML's basic tendency to 'internalize' and 'wrap' stuff makes it more difficult to process procedurally or en-masse or (especially) to heavily parallelize it. Essentially, for a data transport format, rather than a single-value transport format, it would be convenient to turn XML inside out, so that 'data' in multiple different files seems to logically be one contiguous chunk. This allows for conceptually better (i.e. prettier) parallelization regarding placement and utilization and transport of the data, even though it might not much affect the real operation.

But it seems you want a dumber format that forces people to jump through hoops to get even the meta-capabilities available through SQL. Your choice, really. I'll never use FLIRT unless it actually provides a meaningful semantic and syntactic advantage over XML, which it does not as written, so it is unlikely to ever come up in my future.

Top, ignore the NaySayers? as they seem to miss the point. The FLIRT format and the FUSDX ideas are great brainstormings for a format dedicated for data storage or backup. The naysayers do not seem to quite understand that the formats are purposely designed to not be XML, just as a database is not designed to be general purpose programming language. Those who try and turn databases into cars and trucks, or those who try and turn a search engine query edit box into a Word Processor simply do not understand the intended dedicated goal of the dedicated tool. This reminds me of when a boy tried to put wings on a yellow school bus and make it fly because the little boy didn't have the discipline to understand that a bus is a bus, not an airplane. The yellow bus actually flew after he completed the project (which took 24 years to complete), but it didn't fly very well. --AnonymousDonor

I could claim 'FLIRT and FUSDX are XML-like because they are data-transport formats and XML is a data-transport format.' and it should carry about as much weight as Top's throwing around the word XML above. There are certain properties of XML that are good, and there are certain properties of XML that are not so good. Only fools and idiots would stop at "is it XML-like?". The intelligent and wise go on to ask: "In what way is it XML-like, and is it a good way or a bad way?" Only fools and idiots stop at "is this tiny little piece as simple as I can make it? I want KISS!". The intelligent and wise ask: "How does making this part simple in this manner affect the rest of the system? The true simplicity goal is 'As Simple As Possible' to achieve all non-simplicity goals 'But No Simpler'" because there is such a thing as 'simplistic' - which makes one thing simple but everything else complex; one ends up AddingEpicycles. And Top has already started AddingEpicycles to FLIRT - he did so the moment he starting suggesting '@oejfojewgfjewf' extensions as an excuse to not think (as though a new extension would magically solve the problem), and continued doing so when he created meta-data unfriendly approaches, such as shifting all meta-data into tables that aren't part of some 'relational-friendly' standard and it would "just be up to your app to parse such an item" - the very idea of meta-data is data about data, not part of the data itself such that every app needs to reinvent the parser and meta-data manager. But, in the name of KISS, he's declared he'd rather keep his syntax, epicycles and all, than really listen to suggestions.

If you hadn't noticed, I am not the CEO of this wiki (okay, well, maybe I kinda act that way sometimes). You can describe whatever suggested change or addition to FLIRT you want and nobody will stop you (except maybe the militant wiki reductionists, but they've been quiet lately). If you want to suggest a key-word based extension, feel free to suggest away. Just realize that criticism is common and expected here. This wiki is not for the sensitive. --top

You should take the time to comprehend before you criticize; as it stands, I was left with the impression that you weren't being properly critical... you were simply defending your baby (and its bathwater) like some mother bear. And you are the guy who named FLIRT. I'd much rather just excise you from the picture... I doubt your personal format will go anywhere, and if I should need a relational format... well, I've now got one saved someplace I can readily find it. Feel free to attempt to convince enough people to use FLIRT that it will actually become 'standard' or 'conventional' enough to really have "syntax conventions".

I felt you were going off on a tangent. Some systems indeed probably need something that is keyword-friendly, and if you want to suggest a keyword-friendly alternative to XML, you are welcome to propose such. Brainstorming never killed anybody I know. In my opinion, XML is "good enough" for keyword-heavy usage and I don't think I could improve upon it or EssExpressions significantly enough (if at all) to bother replacing them. But, you are welcome to try. FLIRT addresses RDBMS-centric data exchanges, which don't need key-words that often and are not well-served by XML.

By the way, we could add an "@XML" marker to the reserved words for XML embedding:

 @XML
 $<foo>
 $  <subfoo value=123>
 $  <subfoo value=456>
 $</bar>
 @Table
 ...

This allows us to piggy-back on the existing XML standard(s) without having to start our own. You mentioned displeasure with XML, but for occasional XML use I feel that sticking with an established standard is better than creating a key-word-based language just for use in FLIRT.

--top

The XML block you propose fails to address ANY of the concerns I put forth (and if your memory is failing you, these include: division of relational schema from data, division and parallelization of data across multiple clients or farms, cheap network costs for 'small' data exchanges of only a few records, support for extensibility and schema changes (new columns, dropped columns, reordered columns), ability to declare the context of the data (insert,delete), improvements for hand-editing and readability such as comments, centralization of column and table metadata, and the ability to locate all records related to one entity near one another in the file, and better standardization in the form of both formally defined types and definitions of how much a minimal client needs to support). And, while I could go into detail on exactly how and why it fails to address these, I don't believe you'd comprehend it any better now than you did before - especially not after seeing this proposal, which is a pointless addition to FLIRT that is of poor design and serves no conceivable purpose other than to attempt to assuage the thoughtless masses. If you want XML in FLIRT, just put a whole XML-structure into one of the column-cells. If you want it done 'right' in FLIRT (or RDBMS in general), create an XML-schema 'type' that references an XML-schema URI (instead of using a plain string) in order to allow RDBMS-server-side validation, server-side storage optimizations, and queries/'where' conditionals on the XML data structure (in the 'where x>4' sense, you could use full XpathLanguage queries or a related XML query language as conditions or even row-splits if the query returns >1 value). But putting XML into FLIRT, even doing it 'right', won't give FLIRT itself the properties that would truly leverage it into a modern, efficient, flexible format for communications with and between databases.

As far as how to group meta-data with a given table instead of at the top, it is indeed something that should be considered, at least as an option. The simplest change would be to simply allow multiple "@dictionary" blocks (table name column would not be needed if dictionary block after table block). But for many of the others, I think we need to ask "why" before we address "how". You seem to want to turn FLIRT into a distributed database itself. And, "standardizing types" is a huge project in itself and requires mass vendor cooperation. The phrase "getting carried away" comes to mind. --top

"You seem to want to turn FLIRT into a distributed database itself." - You seem to believe I'm an idiot. No, FLIRT would not become a distributed database; it would, however, allow for efficient 'micro' exchanges of data, and effective distribution of data, with distributed data maintaining stability even under changes to the main schema. And since you seem to be playing the idiot (or at least implying that I am one): There is a huge difference between distributed data and distributed database. Distributed data can be useful for parallel processing, distributed storage, hand-editing of smallish files, etc. Distribute database requires distributed indexing, distributed transactions, cache invalidation protocols, communications support for queries that need to touch multiple machines, etc. - and is a MUCH larger task, and one for which FLIRT (even with my suggestions) lacks any real support. As you've drafted it, FLIRT is pretty much useless for micro-exchanges of data - even if you use the same table-names, the meta-data could be entirely different every time; there is no schema URI tying everything together. You can only effectively transmit whole-queries or whole-databases. And "standardizing types" is pretty much required if you want meaningful 'data exchange' rather than a broken TowerOfBabel where every pair of vendors must individually negotiate the rules of communication, which really means they'll be reinvented every time by front-line programmers who lack any sort of big picture. And you can easily use RFCs and W3C recommendations and even the SQL standards for the majority of the important 'types' you might want (integers, decimals, floats, autonums, dates, strings, URIs, XML, etc.) so don't try to invent strawman excuses like you usually do. Remember that Data formats are NOT something you can just go back and "fix" - if they succeed, you're stuck with whatever flaws they possess. And you should always prepare for success. The phrase "short sighted" comes to mind.

These are involved issues that probably deserve their own sub-topic or topic (it already sparked NumberTypes). Generally I approach the design using the equation:

   rank = frequency / complexity

Commonly-needed features that don't add much complexity to FLIRT are included while rarely used features that complicate the syntax/specification of FLIRT will be bypassed. We'd need to determine the values for both of those factors.

The original goal of FLIRT was to allow data dumps and transfers from existing RDBMS (RelationalAlternativeToXml sparked it). You seem to be wanting to extend it to non-RDBMS systems and/or future/better RDBMS.

I read "data exchange format originally drafted in RelationalAlternativeToXml" as not necessarily involving RDBMS. It could be data-exchange between RDBMS and application, or between one application and another, or between RDBMS and RDBMS. If it isn't intended for use between applications, or between services in a ServiceOrientedArchitecture (for which RDBMS is just one type of service) it really isn't a RelationalAlternativeToXml. I believe we'd benefit from a good, real RelationalAlternativeToXml. And, yes, for BOTH of us it would be used for "future" RDBMS - i.e. ones supporting the new exchange format. Services are ALWAYS built around languages, whether it be SQL or some new form of FLIRT, because languages are interfaces to services and you CantAbstractMuchPastInterfaces - at best you can layer abstractions atop them. However, my own suggestions do not betray the goal of data-dumping an RDBMS... they just provide much more than that at very, very little extra complexity cost. Make one thing slightly more complex to make everything else simpler.

I'd like to see some specific sample scenarios. Also, if one needs a variety of uses for a format, then XML and EssExpressions are "good enough" in my opinion.

MarchZeroEight

CategoryXml, CategoryText, CategoryLanguageDesign