Fear Of Adding Tables Discussion

Oh Look! An 80KB ThreadMess!! How exciting - I can't wait to read it ...

Why, it needs to be........normalized

No, it is better in a wide and long view with as much data crammed into the table as possible, even if all the chit-chat doesn't relate to the subject. Just put some page anchors or cursors in the database, and also make sure to access the data procedurally - scanning through it all in sequential order.

If that's an accurate summary of my opinion, I'll eat a plastic fast-food movie promotional action figure. Without ketchup even.

See TopDrunkOnFastFood

ThinVersusWideTableDefinition is an attempt to define the key issue of the debate.


I've seen the downsides of over-tabling. I tend to lean toward fewer tables, and am thus bothered a bit by the suggestion to make more tables. I suspect people are trying to make tables fit OOP, when it should perhaps be the other way around. Is this another HolyWar at play? How about some examples and scenarios to explore?

In fact, if you lean toward fewer tables, that could even be seen as more object-oriented... with a good ObjectRelationalMapper (Hibernate, to use a RealWorld example), you can save a complete object graph, equivalent to a huge number of tables, into a single table... that is, of course, not considered good practice, but it is certainly possible... and easy to do.

I believe having few tables is bad from a purely relational point of view, because by reducing the number of tables you make foreign-key integrity relationships less meaningful. (If you store cities, countries, and flavors of ice cream in a single table, and then add a foreign key relationship to it from another table that only needs ice cream flavors, the database will not be able to help you avoid putting a city in the place of an ice cream flavor.)
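To make the integrity point concrete, here is a minimal sketch (all table and column names are hypothetical). With a dedicated table, the DBMS itself rejects a city where a flavor is expected; with a GodTable, the foreign key can only check that some row exists:

 -- Dedicated table: the constraint is meaningful.
 create table flavors (flavor_id integer primary key, flavor_name varchar(50));
 create table sundaes (sundae_id integer primary key,
                       flavor_ref integer references flavors(flavor_id));
 -- GodTable: the constraint only checks row existence, not kind.
 create table godtable (god_id integer primary key, type varchar(20), name varchar(50));
 create table sundaes2 (sundae_id integer primary key,
                        flavor_ref integer references godtable(god_id));
 -- Nothing stops flavor_ref from pointing at a row where type = 'city'.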

Having fewer tables also makes it harder to write queries, because you have to remember to filter stuff before writing joins:

 select cities.* from godtable as cities, godtable as states, godtable as countries where cities.type='city' and states.type='state' and countries.type='country' and cities.parentgodid=states.godid and states.parentgodid=countries.godid and countries.name='China'

or

 select cities.* from (select * from godtable where type='city') as cities, (select * from godtable where type='state') as states, (select * from godtable where type='country') as countries where cities.parentgodid=states.godid and states.parentgodid=countries.godid and countries.name='China'

are worse than the easier to read, easier to write, and perhaps even more efficient:

 select cities.* from cities, states, countries where cities.stateid=states.stateid and states.countryid=countries.countryid and countries.name='China'

Don't you think?

Of course, if you have an ObjectRelationalMapper, you can tell it that the "type" column should be used as a discriminator. Map the GodTable to three classes (City, State, and Country), explain how they are related, and then write the query like this (note that with a good ObjectRelationalMapper you could write the queries with the exact same syntax regardless of the underlying existence (or not) of a GodTable):

 select c from cities c where c.state.country.name = 'China'

Or even like this, if you choose to do so (but I prefer the abbreviated method):

 select city from city, state, country where city.state=state and state.country=country and country.name='China'

Of course, you could achieve similar effects using CREATE VIEW to simulate 3 tables, but if you do so, why not simply have 3 tables from the beginning? You also have the problem that your DBA might not want to grant you privileges to create views. Requiring special privileges to create views to make your queries more readable is one of the well-known SqlFlaws.
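For illustration, the view-based simulation just mentioned might look like this (a hedged sketch against the hypothetical godtable used in the queries above):

 create view cities as select * from godtable where type = 'city';
 create view states as select * from godtable where type = 'state';
 create view countries as select * from godtable where type = 'country';

After which the simple three-table query above works unchanged, though updates through such views are often restricted.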


I am not sure what the above country example is trying to demonstrate. A generic "location" table may be useful in the case of an international database in which the number of "levels" varies per country. Concepts such as "state" and "county" may not be applicable to other (non-US) countries' conventions. Of course, "it depends".

Exactly, it depends. The above country example is trying to demonstrate that, by default, unless you get help from an ObjectRelationalMapper, it is harder to write queries when you have a single GodTable (like location). I agree with you that a location table could be helpful for an international database (or maybe not, because SQL is not that good at handling recursive trees), but the point is that unless you're sure you are actually going to get a specific advantage by creating a GodTable, you should prefer to split stuff into different tables, because while having few tables seems to make your design simpler, in fact it only moves that complexity to another place (your queries).

In this case, if it is for a US company that does not have "odd" requirements, it generally makes sense to have STATE, COUNTY, and CITY tables (although county is not used very often by commercial firms and is usually skipped). But I have seen borderline cases where regions/areas specific to the domain (marketing or internal area carve-ups) could perhaps be better served by a locational GodTable of sorts. At least the drawbacks versus the benefits are close enough that I could not fault either decision. Dealing with ever-changing domain areas can get sticky, and a GodTable just may be a bit more flexible.

But I have seen examples where the name of the GodTable is something like "GLOBAL_CATALOG", and after the developers add locations to it and see it works fine, they start adding all kinds of stuff to this GLOBAL_CATALOG (like Colors, and auxiliary Status information (stuff like "Open, Close, Busy, Blocked, Paid, InTransit?")), and finally they decide to store the names of their favorite ice cream flavors.

I've also kicked around the idea of having one table for the smallest partition (such as a specific retail store), and then another table to map those specific spots to general categories or groupings such that they could even overlap if need be. Location info is often not a perfect tree and different departments may have different overlays/regions. Thus, this is the "set-centric" view.

  table: Spot
  -------
  spot_id
  spot_name
  etc...

  table: Spot_group_links
  -------
  spot_ref   // f.k. to Spot table
  group_ref  // f.k. to Locat_groups table

  table: Locat_groups
  -------
  group_id
  group_name
  group_type  // ex: tax, marketing, probation, etc. (optional)
  etc...

The groups perhaps could also be non-locational, or quasi-locational. The boundaries can get sticky in real-world stuff.
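As a hedged sketch of how the set-centric schema above might be queried (names are taken from the sketch; 'marketing' and the group name are made-up sample data):

 select s.spot_name
 from Spot s, Spot_group_links l, Locat_groups g
 where s.spot_id = l.spot_ref
   and l.group_ref = g.group_id
   and g.group_type = 'marketing'
   and g.group_name = 'Northeast Region';

Because the link table is many-to-many, a spot can sit in overlapping groups without any schema change.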

Separate State/County/City tables might be a bad idea if you're attempting to represent hierarchical locations (because, as you mention, there are places where these overlap, though at least counties and states don't overlap). E.g., the 'City' table should not have an entry saying which 'State' it is in, unless you wish to treat every city that lands on a state border as two or more cities. A single table expressing a relationship between entities and locations (this entity (which might be a location) participates in that location-entity) is, semantically, the better way to go... even for expressing that a particular City 'overlaps' a particular State. Then, if necessary, the 'State' table may carry facts exclusive to the state (flag, motto, symbols, governor, etc.).

Unfortunately, this also is where Relational (esp. relational calculi and query languages) breaks down a bit: having one relation-table be associated with foreign keys from several different tables. A single 'GodTable' for relating entities and their locations should, semantically, be used for everything from States and Counties to Employees and Vending Machines, each of which have their own facts and associations. However, making it work well in relational (in practice or theory) can be a pain. In practice, use of true 'GodTable's that relate lots of different kinds of entities seems to work better in the very flexible Logic programming languages, like Prolog, where there conceptually exists a true-to-the-word 'relation' for every single predicate (i.e. a full, potentially transfinite and uncountable, set of N-ary tuples with semantic meaning derived from both the name of the predicate and placement in the tuple). Logic programming languages, and their query and data specification mechanisms, don't offer many constraints or implications about how one predicate will relate to another; anything with a shared value will do so long as it is described in a list of predicate truths.

FearOfAddingTables shouldn't be any higher than FearOfAddingPredicates. However, the price paid for flexibility and correctness is usually space and speed. At which point does optimization cease being premature? Obviously, to most of you RelationalWeenies, the answer is 'the moment I arrive'. You don't at all consider jumping straight to one of the popular 'relational' languages to be a premature optimization of a predicate-logic specification for data. Nor do you feel it appropriate to 'add a new table' to, say, add the 'official state anthem' (that can be lumped into the 'state' table within which you've already lumped 'flag' and 'motto' and 'capital'). Rant rant rant rant rant rant rant. I feel a little better now. I'm quite fond of optimization myself, but would prefer to see some real advances in full DBMS systems (you know, the sort with ACID transactions and concurrency? and maybe automatic distribution and replication?) that are focused more upon correctness and flexibility than upon speed; the DBMS itself should be capable of deciding when tables should be joined in representation as part of query-optimization and space-savings; users should feel that they can add a thousand tables (one unary-table for the 'IsState' predicate, one binary table for 'EntityMotto' relationship, one binary table for 'EntityFlag', etc. where a State is one 'Entity-Kind' that can have flags and mottos) all without significant loss of speed or efficiency.

For ultimate flexibility, may I suggest the MultiParadigmDatabase, which is essentially one big dynamic-row GodTable. (If you want entities, you add an Entity attribute to a "row".)

(moved discussion to MultiParadigmDatabase)


(Moved from RelationalIsTooAbsolute)

I think it's because amateur database developers find creating tables scary, and try to avoid doing it whenever possible. (Tongue {somewhat} in cheek...) -- DV

I am not an amateur, but I disagree with the myriad-table approach. The debate can be found in FearOfAddingTables.

Having searched the literature, I fail to find "myriad-table approach" mentioned anywhere. I can only assume, therefore, that it has no academic or engineering basis, and that aberrations must be entirely attributable to CREATE TABLE phobia or philia. Otherwise, proper normalisation will -- in an automated fashion -- give you the requisite number of tables, no more and no less, every single time. -- DV

Have you found anything that "proves" the opposite? And "proper normalization" has subjective components to it, such as "related". Ultimately everything is related, at least gravitationally. The "related" we use is from our own mental models, which are UsefulLies, but still lies. OnceAndOnlyOnce is the best guide in my opinion, but there are situations where even that is not sufficient, because there are different ways to score the weight of the "violations" when there are trade-offs involved. I've seen heated debates about whether the "nulls" in sparse tables should be counted as "duplication". You get philosophical funbies such as, "How can they be duplication when they are stand-ins for something that doesn't exist? You can't duplicate something that doesn't exist. Can lack of existence be duplicated? Is the empty space between galaxies all just a bunch of duplication of emptiness?" Fun stuff, but it never gets settled. -- top

Proves the opposite of what? "Proper normalization" is not subjective. It is an algorithmization of OnceAndOnlyOnce based on functional dependencies. If I recall correctly, Date described it as "the automation of common sense". If your functional dependencies are unclear, then your analysis is insufficient. Go back to the users and ask questions. If it's still unclear (rare, but happens when the users don't know, either), this will make a difference in a few tables; on a large schema, a very small and manageable percentage. This does not represent some philosophical approach to data modeling, because there isn't one. You're either modeling the real-world facts -- to support user requirements -- in a normalised manner, or you aren't. If you aren't, then you should be -- modulo the occasional (and hopefully, deeply begrudged) denormalisation to achieve acceptable performance within budgetary constraints. -- DaveVoorhis

I've seen situations where there are trade-offs and no clear answer, especially when trying to anticipate future change patterns. I cannot remember the details at the moment. But, let's see if OnceAndOnlyOnce can pick the "proper" RelationalGuiDilemma.

If you can't identify the functional dependencies among the data elements in your "future change patterns", then they have no business being bodged into tables until you do. As for your RelationalGuiDilemma, is there something about it (I've only glanced) that precludes using the standard supertype-subtype pattern? E.g., supertype (widget_id_pk, common_attribute_1, common_attribute_2); derivedtype1 (widget_id_pk, unique_attribute_1, unique_attribute_2); etc. That is the standard and time-tested approach in such situations. Of course, SQL systems with table inheritance (e.g., PostgreSql) might be an alternative. -- DV
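A minimal DDL sketch of that supertype-subtype pattern, using the hypothetical names from the example (the shared key doubles as the foreign key to the supertype):

 create table supertype (
   widget_id_pk       integer primary key,
   common_attribute_1 varchar(50),
   common_attribute_2 varchar(50));
 create table derivedtype1 (
   widget_id_pk       integer primary key references supertype(widget_id_pk),
   unique_attribute_1 varchar(50),
   unique_attribute_2 varchar(50));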

There are various approaches as described there, and none are perfect. In other words, we are weighing trade-offs with no mathematically proven single "right" solution. That's my point. -- top

There are standard approaches to standard problems, and normalisation stands as the accepted algorithm for eliminating redundancy and update anomalies. However, all modeling decisions (especially those that denormalise) must be made in light of the requirements. I see no requirements stated on the RelationalGuiDilemma, only a request for a static model. What is the intended purpose of that thing? -- DV

You go right ahead and publish these "standard problems" along with the mathematical proof they are objectively better, and let's see how it survives public review. -- top

[It already has been published. The database model did not exist long ago. It took lots of public review and proving to realize the relational model was the way to go for large database management. It took someone to prove and invent the DBMS in the first place... otherwise we might still be using linked lists or arrays for large databases (each of us with our own application-driven database). The relational model was proven (to the extent to which TrulyTrue things can be proven, to sensible people) and is the reason we have databases today (even if they don't follow it perfectly, the proof helped people at least scheme up something like it... if it were not proven we might have just linked lists, arrays, and other reinventions... it is just a pattern).]

You are assuming here that "the relational model" automatically results in skinny tables and that skinny tables have been proven objectively better. You are not being very specific.

Please stop using white trash talk such as chopping, choppers, skinny tables, etc. Next thing you know you are going to claim that Types are just tattoos.

Foreign keys, relations, normalization, et al. have all been proven to be better than one wide Excel spreadsheet. As the project gets larger, relations, normalization, foreign keys, etc., become more important in maintaining the data. Being able to query a bitmask table for the available bitmasks, or being able to look at a bitmask enumeration Type, is simply a much better longer-term solution than hiding fields in a huge wide table without any clear specifications or relations (most likely the specifications and relations are instead stored in your application logic). Running a query over a huge wide table just to find out the five types of bitmasks you offer isn't sane... and this has been proven. In fact, if you can't find this info out quickly, you can't find out what your data is related to, because you aren't practicing the relational model. Over time, using the relational model does automatically result in tables that are not as wide -- despite your claim about skinny tables (could we please stop the white trash talk?). You are confusing the relational model with TableOrientedProgramming. You should stick to TableOrientedProgramming and just write off the idea that you will ever understand relations.


Compromise - Sets are not orthogonal

I believe somewhat of a consensus or compromise has developed on other topics spawned from this one. Whether one's view is a wide-table view or a narrow-table view should be just that: a view. What is underneath the hood does not have to matter to the query-language user. Relational has the potential to be flexible enough that one can use whatever "named group" of columns they want. It's just sets. In theory, named column sets can overlap: it does not have to be either/or. However, the limits of existing RDBMSs (such as limited updatable views and wimpy column dictionaries), combined with short-staffed, underfunded, or uncaring DBAs, make this goal difficult in practice. The battle above is mostly about which to give priority given the existing limits. --top

Fine, as long as the schema is in (at least) ThirdNormalForm. There may be some debate over whether or not some higher normal form is the minimum acceptable, but there should never be a debate over whether less than 3NF is acceptable or not. It isn't. -- DaveVoorhis


Another anecdote: I'm implementing a smallish intranet app to replace an MS-Access one that some complained was too complicated. It tracks PC's along with hardware and software. One thing I decided to do was have a single table for software and hardware rather than make a separate Software table and Hardware table. One of the reasons I did this was that the distinction can be blurred. For example, is a dongle that comes with configuration software truly just one or the other? Is a USB thumb-print reader for security access to a specific app software or hardware or both? I don't want to hard-wire such classification into the system prematurely. If I'm on vacation or get hit by a bus, there may be nobody around to adjust the columns if a software-only field becomes shared (or vice versa). There's not exactly spare staff around here to do the new-column dance all the time. --top
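A hedged sketch of what such a combined table might look like (names hypothetical; the point is that the classification flags are non-exclusive, so a dongle can be both):

 create table item (
   item_id     integer primary key,
   item_name   varchar(100),
   is_hardware char(1) default 'N',  -- 'Y'/'N'; may be 'Y' together with is_software
   is_software char(1) default 'N',
   notes       varchar(500));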

Top seems to embed a lot of what I'd consider erroneous assumptions into his arguments; responding to some of them implies agreement - a bit like trying to answer: "have you stopped beating your wife yet?" MuAnswer. Wide tables are the over-specified ones, and skinny tables aren't domain-abstractions. The more extreme normalization forms don't even allow for domain abstractions - there are no 'entities', just unique 'identifiers' that just happen to be common across a set of facts and tables.


Well, and what if it goes the other way? What if, while you are in a coma, they decide that software and hardware are pretty different stuff, that they need lots of relations from other tables to the HardSoftWare? table, and that they want the database to help - because, for example, it makes no sense to connect, I don't know, a table about voltage consumption to software? Referential integrity could have helped you here, but since you have mixed Hardware and Software, you won't receive any help. Also, whenever you want to print reports that are only about software, or only about hardware, you need to filter your results, all the time... as the system evolves it ends up having views that for most (all?) uses and purposes wrap the fact that there is a HardSoftWare? table, and every programmer deals with the Hardware view and the Software view... after a while, somebody arrives and asks: why don't we have separate tables? It makes no sense to mix them, it has no usable purpose, and it unnecessarily complicates our queries... and the answer is: that is the way it has always been. Enter the era of FearOfRefactoringTheDatabase? because of FearOfAddingTables.

Based on experience, I judged that as the less likely scenario. Having accidental extra fields is less of a problem than accidentally missing needed fields. One is an inconvenience, the other is a data stopper. --top

SkinnyTables?, Data stopper, bad chopper, skinny timmy. Enough. This person is insane - the white trash needs to end, now.

You have a round-about way of asking "please clarify". What I meant is that an extra field that is irrelevant to a particular situation is going to be less of a problem for the data entry operator and/or report reader than the lack of a field that they need.


PageAnchor: 953

Re: "...if I can't quantify or qualify human or economic factors, then neither can you - which means you can't possibly use them in any rational argument you're attempting to present to support or defend your side of things."

Well, that's a very interesting philosophical question. Think of it this way: would you want somebody to organize your (physical) desk and closet and garage based on formulas optimized for factors that could be measured from various models or pooled studies, or based on what you feel is comfortable for *your* work-style and habits? Software models are to serve humans first and machines second. They are like a custom domain-specific work-bench to solve a problem or provide a service. -- top (ctd.)

{If I could give a formal description of *my* work-style and habits, sure, I'd base it on that. But that requires the ability to qualify my work-style and habits. Indeed, putting one's preferences into code decisions can be one form of saying what those preferences are, but it's foolish indeed if one doesn't even know one's own preferences - in that case, stick with what is proven to work and decide later if there are specific things you can 'qualify' as not liking about it -- db}

One of you suggested that wide tables are done for "laziness". Obviously, people must feel comfortable doing such for *some* reason. [{Habit comes to mind.}] The suggestion is that it creates more effort in the long run, but I don't see it on net. I see lots of tables as clutter, just like clutter on a desk. The less stuff on a desk you have to sift through and grok, the faster and cleaner one's thinking. -- top (ctd.)

{Tables aren't "on your desk", so the analogy doesn't hold. With Relational, you always have a "view" that is constructed automatically from at least one table. Maybe if you had a magic button that could construct things "on your desk" into whichever formation you could specify, your little metaphor would apply. Besides, be it 100 tables vs. 100 columns - you've got the same amount of stuff to sift through and grok either way. -- db}

I've worked with both types of design, and my summary gut feeling is that thin is not in. Wide tables allow one to "drill down" incrementally. You get the general entity concept first (the "wide" table), and then filter out what you don't need via column and row filtering. It's a hierarchical thought process more or less. (Yes, there are LimitsOfHierarchies, but we're talking mostly about a mental map, not a formal taxonomy. In fact, trees are overused because they are mentally catchy.) -- top (ctd.)

{With 'wide' tables you'll also have a bunch of scattered 'exception' tables (because wide tables are very non-homogeneous) that carry all the one-to-many and many-to-many relationships, and as consequence you still don't get "the general entity concept" - just a fractured view of it. I'd much prefer to have good tools for constructing useful views on the fly than 'pretend' I'm getting some significant benefit out of wide tables. I'll also note that the value of the 'entity' concept has always been at question - it may be worth its own little debate. Thin table approaches, at least those with which I'm most familiar, don't embrace this 'entity' concept... they sort of kick it aside in favor of DatabaseIsRepresenterOfFacts. The closest they come is recognizing semantic facts like 'exists(identifier-goes-here)'. -- db}

With skinny tables it's like trying to remember the name of an obscure person: there's no mental drill-down available; the mind has to do a sequential search. Now, I perfectly agree that if better schema sifting and virtualization tools were available, such may not be an issue. But, to misquote Rumsfeld, we have to fight the war with the army we have, not the army we want. -- top (ctd.)

{I'll agree that the most popular RDBMSs aren't ideal for the thin-table approach (partially due to the spiral-effect: nobody outside the academic community uses thin-tables, so there isn't a lot of visible money-benefit of building one, so nobody builds one, so nobody outside the academic community uses thin tables ...). That can reasonably affect your current actions, but I don't believe it a great argument for why wide tables are better. And, yes, a thin-table approach would rely on the ability to rapidly identify or learn of tables. Among other things, first-order-logic as part of queries would be useful in thin-tables, making meta-data (data about the database and the data) into FirstClass data. This requires the ability to use a column-name as an attribute-value, grab it out as part of a query, and query upon it. Your proposed DataDictionary(ies) follow this pattern, but similarly lack much support from modern RDBMSs. Anyhow, what do you think Rumsfeld would have said should he have had the ability to construct armies out of well-arranged ones-and-zeroes simply by spending a few $million to hire some smart and experienced people to do it for him? -- db}
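For what it's worth, SQL's standard information schema already gives a limited taste of treating metadata as queryable data (support varies by DBMS; the column name searched for is a made-up example):

 select table_name, column_name
 from information_schema.columns
 where column_name = 'spot_id';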

Note that I don't disagree with all thin tables. The one that bugs me the most is doing it merely to get rid of nulls. Nulls are too volatile and trivial a concept to tie key design decisions to. One shouldn't have to rework bunches of tables just because a column becomes "required" instead of optional, or required for different circumstances. -- top (ctd)

{I can't imagine needing to rework a bunch of thin tables because a column becomes 'required', either. It might help to know my conception of 'thin tables' in its extreme form is essentially that which I've derived from my study of logic programming languages and automated theorem proving: exactly one fact per row, or ConjunctiveNormalForm? transformed into rows on a table. A typical entity-table basis for databases has (ENTITY-ID,LIST-OF-ATTRIBUTES). E.g. if there are three attribute-columns 'wears', 'makes', and 'eats', then this single row would represent three conjunctive facts wears(fred,what-fred-wears) AND makes(fred,what-fred-makes) AND eats(fred,what-fred-eats). The 'thin-table' approach would break this into three tables (wears, makes, eats) each with a binary relationship. However, there are relationships that cannot be expressed in this binary manner, e.g. where an abstract proposition (a predicate) has zero, one, or more than two parameters. 0-ary tables are simply boolean variables (essentially), unary tables are straightforward classifiers (like 'exists(x)' or 'isNumber(y)'), and ternary+ tables really need all three-or-more entries (e.g. delivers(bob,pizza,fred)). There are NO entity tables that also carry facts about the entity. Ever. The closest you can come is an extensive list of entity-identifiers of a single type (isPublication(y), isPeriodical(z), exists(x)). Thin tables, in the more extreme form, entirely reject EntityRelationshipModeling. Instead, the best practices focus on OnceAndOnlyOnce explicit representation of facts (e.g. you'd never say 'is("publication",x)' AND say 'isPublication(x)', though you could say 'is("publication",x) :- isPublication(x)' - i.e. use implied or entailed facts... which (in this case) turns 'is' into a query or view). And there are ways of specifying schema and constraint requirements (e.g. that 'isPublication(x) requires known(y),hasTitle(x,y)') - db}
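To make the 'one fact per row' idea concrete, here is a hedged SQL sketch of the wears/makes/eats example above (names hypothetical):

 -- Entity-table style: one row bundles three conjunctive facts.
 create table person (person_id varchar(20) primary key,
                      wears varchar(50), makes varchar(50), eats varchar(50));
 -- Thin-table style: one relation per predicate, one fact per row.
 create table wears (person_id varchar(20), item varchar(50));
 create table makes (person_id varchar(20), item varchar(50));
 create table eats  (person_id varchar(20), item varchar(50));
 -- Non-binary predicates get tables of matching degree, e.g. delivers(bob,pizza,fred):
 create table delivers (deliverer varchar(20), item varchar(50), recipient varchar(20));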

Like I said before, that's like purchasing the decorations for a house based on the clothes you happen to have on. Big design decisions should be tied to "big" concepts, those that are as invariant as possible. Empty columns often don't qualify. (Note that I am not happy about the way current RDBMS implement nulls, but that's another topic.)

Perhaps we can sum this up in a more general rule: Don't make hard partitions in designs unless the hard partition exists in the domain itself. Otherwise, changes will be painful and risky. (Originality disclaimer: I've heard similar rules from other software engineering literature, but cannot recall any specific names or sources from the top of my head.) --top (end.)

{It seems you are making assumptions about the virtues of your own choices again. Can you present any convincing arguments and evidence that the 'wide' Entity table is a 'softer' partition over the concept-space than are the thin tables? Or are you just assuming it? -- db}

I can appreciate that more tables, rather than fewer, may represent a visualisation problem. This is certainly true when you have to deal with a project that has hundreds of tables. That does not mean, however, that you should accept or allow the update anomalies that result from denormalisation. That would be like deliberately writing bad code because you have to program in Notepad. Instead, use (or create) tools that facilitate visualising complex schemata. The "relationships" view in Microsoft Access is surprisingly good for this, especially in terms of following foreign key relationships, though I've long wished for a more sophisticated schema management and visualisation tool that would permit zooming in and out, showing/hiding only the tables that are related by foreign keys to a specified table, and defining arbitrary groups of tables -- e.g., Accounts Receivable, General Ledger, Inventory, Billing, etc. -- and hiding/showing group details (i.e., the individual tables within a group, or the attributes within the tables within a group) as needed. I started working on it -- as a kind of ER-modeller/DBA tool from Hell -- around the turn of the century, but, alas, the pressing demands of other projects have largely squeezed it out of the running. I will come back to it, though -- it would make an ideal administrative front-end for the RelProject. -- DV

We'd have to look at specifics to see if there are really "update anomalies". Not all "problematic" wide tables result in the typical update issues. The devil's in the details.

If your design is not, for example, in 5NF when 5NF is warranted -- or in 3NF, period -- then there are, by definition, update anomalies. These will occur unless you can guarantee the affected tables will never be updated. One of the values of the normal forms is that they can guarantee, for each level, which anomalies will not occur. That means you don't have to look at the "specifics", or rely on conditions (such as "<x>, <y> and <z> tables will never be updated") that hold true today but may not hold true tomorrow. -- DV
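A small illustration of the kind of anomaly meant here (hypothetical schema; a transitive dependency that 3NF would remove):

 create table city_info (
   city            varchar(50) primary key,
   country         varchar(50),
   country_capital varchar(50));  -- depends on country, not on the key
 -- Since city -> country -> country_capital, changing a country's capital
 -- means updating every one of its city rows; miss one and the table
 -- contradicts itself. In 3NF, the capital lives once in a countries table.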

Moved reply to SafetyGoldPlating.


Maptional?

"Relational" means roughly "based on tables". However, the "thin table" approach is more akin to maps (dictionaries) than tables. Perhaps the thin-table approach should be called "mapational" or "maptional". (Reminds me of the Muppet Show theme song: "Muppetational" is used to rhyme with "sensational" and "inspirational" IIRC.) --top

"Thin table" approaches still allow for n-ary relations of arbitrary degree 'n'. The rule determining the degree is essentially that one row = exactly one complete fact (i.e. no conjunctive 'AND' clauses implicit in the predicate reading of the fact represented by the row because those can always be fully reconstructed by use of table joins). As such, likening 'thin tables' to 'maps' seems an invalid analogy even before one starts to consider the variation in needs for indexing (maps are indexed on exactly one key, but thin tables could be indexed by any subset of the powerset of columns) and the common difference in query and manipulation languages (maps don't typically support joins, intersects, unions, views, etc.). -- WhiteHat

I must say I am surprised that TopMind, who is acutely aware of these differences and often proselytizes over the relative benefits of the table-based systems over common collection objects like maps, even suggests this 'maptional' analogy... it is almost enough to make me suspect more disingenuous designs. -- RedHat

Please clarify. I didn't endorse "maptational".

(By the way, WhiteHat and RedHat appear to be the same person based on IP address. I almost suspect an attempt to make it look like more than one person is "ganging up on me". But it's possibly just paranoia on my part and just a burp in wikiware.)

--top

It appears to be a posting style judging from the user pages.

But they were not cross-judging each other.

[Note: A related MapTational topic has been moved to TheAdjunct as part of an EditWar compromise.]


Partial Agreement?

I thought somewhere in this ThreadMess (or another) we generally agreed that "thin versus wide" could be merely a view on the same schema/meta information if our tools and/or DB engine design were sufficiently powerful and/or properly designed. One could switch between a "thin" view and a "wide" view of tables as needed. A difference remains over what the default should be when using the half-@ss tools we currently have at our disposal. (For example, existing support for editable views is often poor.) -top
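A hedged sketch of the idea (all names hypothetical): with sufficiently capable views, the same thin tables can be presented wide, and vice versa:

 -- A wide 'view' assembled from thin tables:
 create view employee_wide as
   select n.emp_id, n.emp_name, p.phone, a.address
   from emp_name n
   left join emp_phone p on p.emp_id = n.emp_id
   left join emp_address a on a.emp_id = n.emp_id;
 -- Going the other way, a thin view carved from a wide table:
 create view emp_phone_v as
   select emp_id, phone from employee where phone is not null;

The sticking point in practice, as noted, is that many engines will not let you update through such views.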

Incidentally, how would naming-collisions be avoided with so many tables even if the multi-view approach was taken? With so many potential tables, things may start to get crowded without some good conventions. -top

