Different RelationalWeenies seem to emphasize different aspects of relational. This is an attempt to catalog the reasons.
- Transformability of View - This is the ability to transform one's view of information to suit a specific circumstance. In particular, one can filter, search, sort, group, and cross-reference information many different ways via a query language and/or query tool. (Sorting is not part of relational itself, but is provided by most RDBMS in practice.)
- Integrity - Relational techniques and tools provide various ways to ensure that certain business rules and associations are guaranteed to be true.
- Standardized - SQL is an ANSI and ISO standard. See http://en.wikipedia.org/wiki/SQL#Standardization. (However, the standard often only partially supported by existing products; and there are fuzzy areas in the standard.)
- Third Party Tool Support - Because of the SQL standards, and defacto standardization of vendor independent APIs, including ODBC (and ADO and OLE-DB) and JDBC and others, there are a wide variety of third party tools available to support and use relational databases.
- Ad-Hoc Access to Data - Easy to use third party tools enable ad-hoc query and reporting access to data stored in relational databases. This allows one to quickly respond to business needs without requiring any software development.
- Many People "Think" in Tables - The table-based arrangement is visually and conceptually natural to many people. Tables are a compact way to represent and interact with large amounts of data.
[DISCUSSION]
On being "Standardized":
RDBMS and relational have provided a somewhat standardized way to represent information and associations between it. (However, there are sometimes disputes over the level of normalization.) Also, SQL has become a common semi-standard across different RDBMS brands. Thus, one can query without learning a new query language from scratch. (Many feel that SQL is far from the ideal query language. However, that is a typical trade-off of standards).
SQL is an ANSI and ISO standard. See http://en.wikipedia.org/wiki/SQL#Standardization
Adherence to published standard by vendors varies, and also there are allegedly some fuzzy areas to the standard. One could also perhaps argue that the standard is unnecessarily complicated.
On the benefits and drawbacks of "Ad-Hoc Access to Data"...
It's still an open question as to the relative benefits and costs of enabling ad-hoc access to operational data stores.
Benefits:
- Enables improved operational support -- diagnosis and resolution of production problems.
- Enables rapid gathering, analysis and distribution of data to address rapidly changing business conditions.
- Can dramatically reduce short term costs of providing business access to data. (...possibly at the expense of long-term costs, as mentioned below.)
Costs:
- Unpredictable ad-hoc loads and locking may impact the performance of critical production functionality. Complete lockup of production systems for hours at a time has been observed, in extreme cases.
- Misinterpretation of data may result in bad business decisions. (On the other hand, lack of data could be worse.)
- A proliferation of critical production SQL queries across multiple development teams and business users inhibits changes to the common database schema, as it becomes difficult or impossible to predict the results of schema and data changes, and difficult or impossible to throughly test all uses of the database before rolling the changes into production.
Note that one does not have to allow users access to ad-hoc queries, or could restrict their queries to a limited set of views or stored procedures. Thus, these are not really a downside of relational per se. Perhaps another way to state the benefit would be "relational can provide powerful ad-hoc query ability".
On the "They don't know anything else" comment:
Really (They like relational but) have they tried something else or do they quickly dismiss everything that is not relational? Do they know that SqlFlaws are not part of the RelationalModel and a better language than SQL could be used (TutorialDee)? Do they bother to try an ObjectRelationalMapper? Do they make detailed comparison with ExBase? Have they (really) experimented Cobol? Have they used Prolog? Do they (really) know the advantages/disadvantages of a HierarchicalDatabase? Do they bother to compare SQL it with ConceptualQueries? or with JPAQL? or with OQL or OCL? or they just follow along because "Everybody who is anybody around them likes Relational". See OldRulesWithForgottenReasons
This could apply to anything, and thus should not bother being mentioned here in my opinion, other than maybe a brief reminder. Instead it rambles with a hint of insults.
I am sorry if it was interpreted as insulting,that wasn't my objective. I think it is important to remember the difference between something that is known to be superior, and something that is believed to be superior because of tradition (the principles in the tradition may be inherently superior, but I think that it would be better to explain people the concrete advantages, instead of just saying "this is the best way, because we have always done it this way and it is the only way we know"). See TraditionOverridesProof
Quite a ramble. But I'll touch on some of its points anyway:
- SqlFlaws don't really matter that much when well supported reliable and widely available products with SqlFlaws meet the business needs, and alternative products without SqlFlaws aren't as well supported or widely available, or worse yet, don't have nearly the tool support of the most popular products.
- Yes but the funny thing is that with ORDBMs we seem to be seeing a revival of the network model.. are this people blind to the advantages of relational? or do they just do not understand what is the advantage of Relational (or the drawbacks of the network model)?
- Personally, I favor ObjectOrientedDatabase technology, because it nearly eliminates the ImpedanceMismatch between the host language and its persistent data storage. (See ObjectRelationalImpedanceMismatch for RelationalDatabase issues.) If the standard representation of object instances within ObjectOriented programs is best described as being like a NetworkModel?, then it seems to me that a NetworkModel? "database" would probably be the best choice, to minimize the ImpedanceMismatch between the host language and its persistent data storage. If we found that RelationalDatabases, or some other data persistence and manipulation model, were far superior for solving our business problems, then I think this would suggest that we should redesign our host languages to match that model. -- JeffGrigg
- The ImpedanceMismatch you speak of is perhaps a difference between an application's view and a "general" view. An OOP app tends to create nested relationships between entities for example, but that's only the viewpoint of the application. Another app that uses the same info may not want or like the nested view of "parts". In that sense, ObjectOrientedDatabases are selfish in that they hard-wire the viewpoint of a single app into data potentially needed by other apps. Relational is more general in that relationships can be presented as an option or viewpoint, but are not otherwise as "hard-wired". RelationalBreaksEncapsulation for a reason. --top
- ExBase?!? Get serious. Are you really suggesting that corporations should discard Oracle, DB2, Sybase and SQL Server, in order to standardize all work on dBASE clones? They're worse than the relational databases, in terms of adherence to the relational model.
- Yes, they're worse than the relational databases, in terms of adherence to the relational model, but how many of the RelationalWeenies really know the exact disadvantages of ExBase form a practical POV (for example with comparisons between SQL queries and ExBases? showing the advantage of SQL?). A lot of developers in the RealWorld do not care about how well a tool implements a particular theoretical model, they care about how hard is for them to do their work.
- [Comments moved to "Ad-Hoc Data Processing" section below.]
- (Perhaps related: ExBaseRant)
- Are you seriously suggesting that COBOL and Prolog are alternatives to relational databases? You're comparing programming languages to persistant storage mechanisms. You'll have to explain how that could possibly work.
- I am not suggesting anything, I am only comparing COBOL and Prolog with SQL for data processing. I am just saying that it seems to me that as an industry we seem to forget why we switched from hierarchical or network to relational, in my experience it is extremely hard to find a developer that can explain the advantages with concrete examples of how things would be done in hierarchical or network, and how relational makes things much easier. (Are network and hierarchical really inherently inferior ? or were the implementations inferior?)
- Also, some people believe that DatabaseIsRepresenterOfFacts, and Prolog is a good tool for dealing with Facts (See DataLog)
- DataLog seems like an interesting idea. But it was my work with Prolog that convinced me that LogicProgramming was a dead end, for business, and that ObjectOrientedProgramming was a much more productive line of work. -- JeffGrigg
- Do you mean as far as popularity (support), or technical desirability?
- Technical merit: How well it supports typical business requirement changes. In particular, considering that "adding a column to a table" (or a field to an object) is the most common change I've seen in business systems, I quickly got to the question, "You mean I have to change every rule in the system that mentions Person, every time I add another attribute to Person, just to express that in this rule (like most of the others) I don't care about the new attribute?!?" It might be great math and theorem proving, but I've got issues on the maintainability angle. -- JeffGrigg
- {In a logic system, you don't add 'columns' to an 'entity table' (or entity predicate). Instead, you add an entirely new predicate/table that contains the entity-identifier and the associated attribute. This has been called the 'narrow table solution' elsewhere on this wiki - it enters conjunctive normal form (and, necessarily, 6NF) with each attribute having its own predicate/function. Then you rely upon the optimizer to handle any physical join in the underlying representation, should you need it - quite feasible, though the lack of optimization support in existing tools is discouraging. The main problem with Prolog as a database system is that it isn't well designed to handle updates to the logic rules (not even the extensionally defined predicates), so it is entirely impractical as an enterprise database management system.}
- Hmmm... Yet another reason to believe that my Prolog instructor didn't really know his subject very well. :-[
- OTOH DataLog seems to have better support for AccessPathIndependence than most relational (or pseudo relational) languages. See ConceptualQueryExampleOfAdvantages for example on the advantages of Logic or Conceptual query languages in this regard.
- Do you understand the historic maintenance problems of HierarchicalDatabases, as observed in actual practice?
- I would really like to answer that with a yes, but no, I do not really understand, I have never had to experience anything but relational. Can you recommend me a link with a detailed comparative analysis of Relational, Hierarchical and Network databases (preferably with examples on how things are done)? What about a page named: DifferentReasonsPeopleDoNotLikeNetwork?? --AnonymousDonor
- As I understand it, the main disadvantage of HierarchicalDatabases was inflexibility: Once you got the data physically arranged in the optimal hierarchy for the processing needed by your first application, all other views of the data were a costly hack of pointer maintenance and random disk access (IE: thrashing). RelationalDatabases "solve" that problem by having little inherent data organization, and relying on an expert system (the query optimizer) to figure out efficient ways to do things. -- JeffGrigg
- But... how much of that problem is really solved by relational databases? After all not even TutorialDee offers full AccessPathIndependence. The problem now is that once you got your queries written in a way that matches your model it is not easy to alter them to work correctly with the new shape of your model.
- Isn't that an implementation issue? Why is it not possible to do a similar thing with heirarchical databases? I find myself wanting to organize data in a hierarchical way quite often. My conceptual model is something like a unix filesystem. There is a natural hierarchy to the data in question, and the depth of the tree depends on which branch you are on. My implementations tend to be ad-hoc, non-critical and very small. The main difficulties I run up against are things like how to deal with alternative paths for the same item, or how to maintain integrity when the tree structure changes (moving and deleting). I am currently using Offsiders and Weaves to model this sort of heirarchy - and probably rediscovering all of the pitfalls of a navigatable object database along the way. -- GlenelgSmith
- {Where is implementation mentioned? Perhaps "references" should have been mentioned instead of "pointers". Possibly related: LimitsOfHierarchies}
On the ease of doing "Ad-Hoc Data Processing":
(The discussion in this section originally started with two points:
"Do they make detailed comparision with ExBase?"
and
"Are you really suggesting that corporations should discard Oracle, DB2, Sybase and SQL Server, in order to standardize all work on dBASE clones?"
)
{I don't recommend it to replace Oracle etc. for "corporate data". The above is a red-herring. Where it shines is ad-hoc processing and temporary tables, such as client-side processing. It is one of the best tools for building one-shot data conversion scripts, for example. --top}
- This is where I see embedded relational language like TutorialDee just as useful (but maybe LessVerboseTutorialDee, reducing BEGIN and END bulk and such). It wouldn't have the flaws of ExBase and would be quick and easy to mung data around instantly. For example my continual harping on the idea that one should not have to make a TCP/IP connection to use a database if they don't need a TCP/IP connection. Setting up connection info takes time and requires connection configurations, etc. One does not have to setup connection info and such when using an array or a list, so why should we with quick relational databases? One should be able to treat a database as if it was an available variable instance of a type in the language (a relvar instance of a DatabaseType) without any requirement to connect to a server or setup connections and such. It would be better than SqLite since no dynamic sql strings are required to be generated, better than parameterized queries, etc. It would be better than ExBase since it is relational. We shall be able to query the database in the program as if it were a built in array or struct with methods (but without obvious flaws that an array or struct has - no relational access or queries!). One should be able to mung data around regardless of whether one is using ExBase or not. ExBase could just be one example of a way to mung data around, but it doesn't mean other better solutions can't be created.
- I want PowerfulAdHocDataProcessingTools! -- JeffGrigg.
There's A LOT to be said for quick and easy generation of CreateReadUpdateDelete screens to let users easily see and change business data. I think we could learn something important from RubyOnRails and like-minded frameworks. I see a lot of value in the TheNakedObjectsFramework, too. -- JeffGrigg