Relational Weenie

Please, I want to know what a relational weenie is, and this page has no concise definition up front. (Or at all.)

Boo hoo; I've been using SQL databases for n years and I haven't the foggiest interest in getting with the program of XML or whatever.

Or,

"You coded how many hundred lines of Java in Object-Obfuscated syntax to do that? Why didn't you just type ... (insert arcane sql syntax here)"

Or, see FabianPascal, perhaps not so much a RelationalWeenie as a RelationalPsycho.

I am curious, what did Fabian do to get the label Psycho?

It's FP's essay site* calling for Jihad against those who "don't understand" the "fundamentals of relational algebra" etc. I mean, I consider myself a RelationalWeenie (based upon the above criteria: 1) I don't want to get with the XML picture, and 2) I eschew using bizarre object things when I need a select count(*) from ... left join ... where ... ).

http://www.dbdebunk.com/ -- TobyThain

By that definition, BertrandMeyer is also Psycho.

Question: Is there a NoSqlWeenie?

Answer: Almost: UnskilledAndUnawareOfIt. ;-)

There are many people who would like to see SQL replaced or enhanced, including myself, but being against SQL is not necessarily the same as being against relational. As far as being "anti-relational", some object and XML proponents sometimes talk of getting "rid of" or avoiding relational.

Re: "haven't the foggiest interest in getting with the program of XML or whatever."

I generally disagree with heavy use of XML because it is not relational. It is fine for transferring data between diverse systems (although comma-delimited works well also in my experience, see XmlSucks), but using it in place of a relational database is a no-no in my book.

Wow, yes, concur, absolutely. XML is a DataTransport, not a database mechanism by itself. How come Java types can't seem to grasp this? Although there are plenty of C++ers who are equally guilty, I'm sure.

Some Java types do. We're not all idiots, you know.

XML does not scale. In my situation, if I used XML the bloat would bring the network to its knees. And we are not talking about VLDB here, just a couple of GB at most. It is useless to me for data transport.

But what about compression? Surely XML being mostly text would compress very well. So I see no big issues with transport, but I do see problems with XML parsing performance.
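A rough way to check the compression claim, sketched in Python with the standard zlib module. The record layout, tag names, and row count are made up for illustration; real ratios depend entirely on the data.

```python
import zlib

# Hypothetical records; the field names and values are invented for illustration.
rows = [(i, f"user{i}@example.com", f"region{i % 5}") for i in range(1000)]

def to_xml(rows):
    """Serialize rows as verbose, tag-per-field XML."""
    items = "".join(
        f"<address><id>{i}</id><email>{e}</email><region>{r}</region></address>"
        for i, e, r in rows
    )
    return f"<addresses>{items}</addresses>".encode("utf-8")

def to_csv(rows):
    """Serialize the same rows as comma-delimited lines."""
    return "".join(f"{i},{e},{r}\n" for i, e, r in rows).encode("utf-8")

def sizes(payload):
    """Return (raw size, zlib-compressed size) in bytes."""
    return len(payload), len(zlib.compress(payload))

xml_raw, xml_packed = sizes(to_xml(rows))
csv_raw, csv_packed = sizes(to_csv(rows))
print("XML:", xml_raw, "->", xml_packed)
print("CSV:", csv_raw, "->", csv_packed)
```

Compression only shrinks the bytes on the wire. The receiver still has to inflate and parse the full document, so the parsing cost mentioned above does not go away.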

Related: RelationalAlternativeToXml

Simple way to tell if you're a RelationalWeenie: if you believe that any code in any language

copying data from one collection into another,
serializing/deserializing to anything other than a database,
searching within a collection (yes, even if you use a HashTable or other form of dictionary), or
iterating across a collection for any reason other than display

is EvilCode, then you're a RelationalWeenie.

Number three on this list, "searching within a collection...", is too extreme. It implies that most data structures ought not to exist outside of an implementation within a database. Having a small RAM-based single-purpose HashTable dedicated to storing key-value pairs is entirely appropriate. -- SwingDeveloper

I didn't make that list, so I won't comment on that one. Instead, I'd say something like, "You know you are a RW if you have pangs to run relational queries and/or use RDBMS-like tools on other software and tools: for example, the file system, code managers, file server security systems, and anything else that has complex structures or attributes." -- top

I admit, I am a RelationalWeenie. Although I feel that existing commercial relational systems can use a lot of fixes and adjustments, including fixing or tossing SQL (SqlFlaws), relational is a wonderful concept IMO. It sings to me. OoAndXml do not.

Yawn. I only ever hear ObjectWeenies whining about RelationalWeenies. As far as I can tell, being in love with your tools blurs your vision. OO certainly isn't the best hammer to pound every screw into every board. On the other hand, the big reasons I would use a RelationalDatabase are:

I don't have a choice
I fear that my data will outlive the application and I wish to avoid being locked into a proprietary application
many different applications use the same data and need a generic data store that anyone can utilize
the problem requires very sophisticated transaction semantics
the problem is a natural fit for set operations and set arithmetic

Of course an ObjectDatabase is a natural alternative to a RelationalDatabase ... but those haven't done very well in the marketplace. We should think about the reasons why that may be. On the other hand, ThePrevayler looks very attractive to me ... but then, I'm an ObjectWeenie. <grin> -- EricHerman

I guess GemStone is the exception that proves the rule. -- AnonymousDonor

Number 3 (data sharing) is probably enough to reject a PrevalenceLayer in most apps I encounter. The need to share pops up all the time. That leaves the fight between OODBMS and RDBMS.

There are countless ways to share an application's data, like CORBA, RPC, RMI, SOAP, JDBC, ODBC, HTTP, etc.

Those are okay for external transfers, but some are too rigid for internal stuff. You cannot do ad-hoc queries if your interface is "flat". (JDBC and ODBC perhaps don't belong on that list.) One does not have to write new app code to query a database in ways not anticipated (for the most part). Filtering, joining, sorting, etc. flat data from scratch each time is a waste of code, if you can even get enough of what you need. I don't have to tweak Oracle or MySQL or bother the programmer of the supplying app every time I want to get data in ways not anticipated, but with flat outputs you do. I have been down that road before. It is not fun.
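To make the ad-hoc point concrete, here is a small sketch using Python's built-in sqlite3 module. The orders table, its columns, and the data are all invented; the only point is that the second question needs no new code in the supplying application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, region TEXT, total REAL);
    INSERT INTO orders (customer, region, total) VALUES
        ('acme',    'east', 120.0),
        ('acme',    'west',  80.0),
        ('globex',  'east',  50.0),
        ('initech', 'west', 200.0);
""")

# The question the supplying app anticipated: totals per customer.
per_customer = conn.execute(
    "SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

# A question nobody anticipated: which regions have any order over 100?
big_regions = conn.execute(
    "SELECT DISTINCT region FROM orders WHERE total > 100 ORDER BY region"
).fetchall()

print(per_customer)
print(big_regions)
```

With a flat export of per-customer totals, the second question could not be answered at all, because the region and the individual order amounts would already be gone.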

(Similar discussion under SharingDataIsImportant.)

Relational Versus XP

(Moved to RelationalVsXp)

It has been implied that since RelationalWeenies are a minority on c2.wiki, the paradigm has been outvoted in general. However, I would note that C2 is hosted by an "Object Oriented Consultancy". Thus, filtering is probably taking place. If ChrisDate hosted a wiki, I am sure the balance would look different. BTW, there seem to be 3 RelationalWeenie regulars or semi-regulars here by my reckoning. -- top

Who are the other two? I've only seen you use that epithet. -- bottom

I would rather not name names. If they want to volunteer such information, I welcome it. Perhaps in a private email or something I can reveal one, but I lost the other name. Perhaps a wiki archive search is in order. -- top

There are a few around... you don't need to try hard with a wiki search to find a couple... and there are at least a few more who wouldn't normally be recognized as such.

==== List of RelationalWeenies ====

Can I vote myself into the RelationalWeenie club? I turned my nose up at database programming until about six years ago, when I took a job working on a shrink-wrapped bookkeeping application. (Single-user, migrated to a simplistic multi-user model since then, and we continue to increase opportunities for concurrency.) My main responsibility was to address performance issues. The system had an OODB-like layer in it, which is ostensibly (but not entirely consistently) used to encapsulate the business logic.

I spent a lot of time studying OODB tools, being somewhat of an OO zealot at the time. Then, I luckily stumbled across Date & Darwen's works. Some things began to crystallize for me. Soon I was able to attribute the main systemic performance problem to a lack of efficient, ad-hoc, task-specific queries. Everything went through the OODB layer, which was necessarily based on wide and deep (hierarchical sets of) queries to load complete business objects all the time. Since then, I've been migrating the system to a relational approach. The OODB remains (at least for now) the main write path, but we've been speeding up a lot of stuff with the ability to do efficient ad hoc queries. It's been a bumpy road, and I suspect that some of my design decisions weren't optimal, but I remain convinced that a relational approach is a much better way of organizing and thinking about the data. It also helps in addressing data consistency issues, which were becoming increasingly difficult to manage in the OODB layer as different business objects increasingly shared common underlying tables.

-- DanMuller

Congratulations Dan, you're in the club. I just started it above.

Relational operators are powerful, and we need relational variables in object-oriented languages so that we can stop relying on object links and hacked-up search code to find things. In fact, I'm dealing with this right now. I have a set of EmailAddress objects, each of which has, among other things, a region. My MessageSelector, which selects a message for each recipient, needs a list of all of the regions so it can verify, before the mailout starts, that it has been given a message for each region it's going to be asked to select for. My crappy solution: iterate through that set of EmailAddress objects, asking each one for its region, and collect up all the distinct regions. The relational solution would just be to ask the set for the set of all regions, and let it deal with figuring out how to get that. -- AnonymousDonor

ObjectWeenie solution (using Java 1.5):

  Set<Region> regions = new HashSet<Region>();
  for(EmailAddress email : emails) regions.add(email.getRegion());

RelationalWeenie solution (using SQL):

  SELECT DISTINCT region FROM email_addresses;

ObjectWeenie solution in a real OO language (Smalltalk)

  (emails collect: #region) asSet

FunctionalWeenie solution (using Haskell):

  nub (map getRegion emails)

None of those are particularly long. Yeah, it would've been a pain before Java introduced Set classes, but now that it's got GenericTypes and a sane ForEach construct, it's down to 2 lines, not much more than the SQL or Haskell versions. -- JonathanTang

Yeah, but they don't really map cleanly. The ObjectWeenie solution uses two types, Region and Email, whereas the RelationalWeenie solution uses only one type, email_addresses, which incidentally must store a 'region' in each row. Is the 'region' a name? An id? What?

A more appropriate example:

C# (.NET 2.0) approach:

  List<Region> regions = new List<Region>(); //Yes, but this doesn't remove duplicates... I think there aren't Sets for .NET
  foreach( Email email in emails ) regions.Add( email.Region ); 

Or with LINQ (MicrosoftLinq) .NET 3.5:

  List<Region> regions = new List<Region>();
  emails.Distinct(new EmailRegionComparer()).ToList().ForEach(x => regions.Add(x.Region)); // read the lambda's => as "goes to"

(((

more properly:

  IList<Region> regions = new List<Region>(); //abstract = new concrete();
  foreach(Email email in emails) regions.Add(email.Region);

or more succinctly:

  var regions = new List<Region>(); //type inference w/ var
  foreach(var email in emails) regions.Add(email.Region);

and this may not be quite right as I just hacked it out without real types, but you can also use a projection to do something like:

  var regions = new List<Region>(emails.Select(email => email.Region).Distinct()); // Select, not SelectMany: each email has a single Region

Also, .NET added a generic HashSet in 3.5 and, as memory serves, SortedSet in 4.0 (though I haven't used SortedSet personally yet):

http://msdn.microsoft.com/en-us/library/bb495294.aspx

--ERH )))

This approach has the benefits of types, scaling to parallel processing, and using filters as objects, since more intricate Distinct rules can be set in the EmailRegionComparer object than in bloated SQL filters. It's not as fast as a good sproc, though...

SQL approach:

  SELECT DISTINCT Regions.*  -- DISTINCT added: without it, each region repeats once per email
  FROM Regions
  INNER JOIN Emails ON Emails.region_id = Regions.region_id  -- each email has one region

ObjectWeenie approach (using Java 1.5):

  Set<Region> regions = new HashSet<Region>(); //Here we do remove duplicates...
  for(EmailAddress email : emails) regions.add(email.getRegion());

Both cases return a set of all the information in the region class/table that is referenced by any email. This assumes we have at least one region per email. It also assumes we cannot have more than one Region on each Email... both OO and Relational are fragile when dealing with multiplicity.

On the other hand, the C#/OO solution still has roughly 30% more characters.

-- JonathanMitchem

Perhaps, but the OO solution can be written without generics, and in that case it doesn't need to change if Region is a join or simply a string (it deals with change in a better way, as long as it still keeps ToOneMultiplicity):

ObjectWeenie approach (using Java 1.4, so raw types and an explicit Iterator):

  Set regions = new HashSet(); // raw type: we don't care if "Region" becomes a string, an integer, or an entity
  for (Iterator it = emails.iterator(); it.hasNext();) regions.add(((EmailAddress) it.next()).getRegion());

Now... can you "hide" so easily the implementation detail that you may or may not need a join in relational?

Yes: views and natural joins. (I suggest other ways for SmeQl) -t
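A minimal sketch of the view answer, using Python's built-in sqlite3 module. The table names, the view name email_regions, and the sample rows are assumptions made up for this illustration. The same client query runs against both schemas; only the view definition knows whether a join is involved.

```python
import sqlite3

# The only SQL the client code ever sees.
CLIENT_QUERY = "SELECT DISTINCT region FROM email_regions ORDER BY region"

def regions_with_plain_column():
    """Schema A: region is just a string column on the email table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE emails (address TEXT, region TEXT);
        INSERT INTO emails VALUES ('a@example.com', 'east'),
                                  ('b@example.com', 'west'),
                                  ('c@example.com', 'east');
        CREATE VIEW email_regions AS SELECT address, region FROM emails;
    """)
    return [row[0] for row in conn.execute(CLIENT_QUERY)]

def regions_with_join():
    """Schema B: region moved to its own table; the view hides the join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE regions (region_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE emails (address TEXT,
                             region_id INTEGER REFERENCES regions(region_id));
        INSERT INTO regions VALUES (1, 'east'), (2, 'west');
        INSERT INTO emails VALUES ('a@example.com', 1),
                                  ('b@example.com', 2),
                                  ('c@example.com', 1);
        CREATE VIEW email_regions AS
            SELECT e.address, r.name AS region
            FROM emails e JOIN regions r ON r.region_id = e.region_id;
    """)
    return [row[0] for row in conn.execute(CLIENT_QUERY)]

print(regions_with_plain_column())
print(regions_with_join())
```

This plays the same role as getRegion() in the Java version: the caller asks for a region and does not learn how it is stored.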

Wouldn't an ObjectWeenie handle this requirement by listening to changes in the EmailAddress master list and updating the Region set as necessary? -- RichardCordova
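For what it's worth, the listener idea might look something like the sketch below (Python, standard library only). EmailList and RegionIndex are hypothetical names invented here, not anything from the code under discussion. Note that removal forces you to count addresses per region, otherwise you cannot tell when a region has disappeared.

```python
from collections import Counter

class EmailList:
    """Hypothetical master list that notifies listeners of adds and removes."""
    def __init__(self):
        self._emails = []
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def add(self, address, region):
        self._emails.append((address, region))
        for listener in self._listeners:
            listener.added(address, region)

    def remove(self, address, region):
        self._emails.remove((address, region))
        for listener in self._listeners:
            listener.removed(address, region)

class RegionIndex:
    """Keeps the distinct regions current by counting addresses per region."""
    def __init__(self):
        self._counts = Counter()

    def added(self, address, region):
        self._counts[region] += 1

    def removed(self, address, region):
        self._counts[region] -= 1
        if self._counts[region] == 0:
            del self._counts[region]  # last address in this region is gone

    def regions(self):
        return set(self._counts)

emails = EmailList()
index = RegionIndex()
emails.subscribe(index)
emails.add("a@example.com", "east")
emails.add("b@example.com", "east")
emails.add("c@example.com", "west")
emails.remove("c@example.com", "west")
print(index.regions())
```

It works, but it is hand-written bookkeeping of the kind a database index maintains for you, and every new derived set needs its own listener.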

Perhaps a more important point is that over a busy network, the relational solution only sends the answer through the network, while an application would likely have to read every single record, which would have to be transferred across the network. It adds up when you have thousands of clients on a network. Further, if the region is indexed, the RDBMS may be able to summarize them without reading each instance (full record). Databases can often find the optimum path to a computation so that you don't have to sweat the optimization details. Plus, SQL is a fairly standard and well-known language. C#'s query language requires one to learn new syntax. SQL is not perfect, but its imperfections are not nearly bad enough to justify reinventing a query language for each app language. (See EmbraceSql)

There is a wiki topic on this network efficiency issue, but I forgot the title.

I like Java, but stuff like that makes me crave lambdas in Java. -- WouterLievens

It is not just about reducing the volume of code, but also about letting the machine determine how best to go about it. If there is an index on region, for example, then maybe the DB can find a shortcut instead of iterating through each one. Further, later adding an index if one is needed does not change the query. Manual iteration makes it hard to insulate the code from such issues. You have to think on a pointer level. And, other languages and tools can use the same info from a DB without writing output classes.
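The "adding an index does not change the query" point can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table follows the email_addresses example earlier on this page, the index name idx_region and the generated rows are made up, and the wording of the plan output varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_addresses (address TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO email_addresses VALUES (?, ?)",
    [(f"user{i}@example.com", f"region{i % 4}") for i in range(1000)],
)

# The query text is fixed; it is never edited below.
QUERY = "SELECT DISTINCT region FROM email_addresses ORDER BY region"

def run_and_explain():
    """Return the query result and the planner's description of how it ran."""
    result = [row[0] for row in conn.execute(QUERY)]
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]
    return result, plan

before, plan_before = run_and_explain()
conn.execute("CREATE INDEX idx_region ON email_addresses (region)")
after, plan_after = run_and_explain()

print(plan_before)  # how the planner ran it without an index
print(plan_after)   # the planner is now free to use idx_region
```

The answers are identical either way. Whether the planner actually picks the index is its decision, which is the point being made above.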

The ObjectWeenie implementations are massively less efficient if the compiled bytecode acts as the source code reads. What if I've got a mailout to 1000 or 10,000 prospective clients? If the system isn't busy and this isn't a vital call then it wouldn't matter if I burn 10 to 1000 seconds on this. But the relational alternative would be so much faster and, given an ORM, doesn't break OO so far as I'm concerned. And the rule-evaluation system I'm working on now really needs that speed to be able to perform acceptably.

This reminds me a little of the various web presentation frameworks that try to hide HTML from developers. They're OK, but you can't make the interface do what you want it to, and the HTML is often rubbish. There's nothing wrong with admitting that web pages are written in HTML, and making the OO code work in HTML templates rather than shoehorning HTML into OO components. I'm a RelationalWeenie and an HtmlWeenie.

OO is not everything. Learn SQL to make efficient relational queries when you need that performance. Learn HTML and CSS to make nice web pages. Don't be scared; they're not hard.

Different needs; I use an ORM on top of an RDBMS because:

The good ones write to disk in a reliable way that I don't trust OO to manage without paying a bucketload for an esoteric server technology, e.g. JavaSpaces
I can do set/relational operations without burning CPU cycles on object iterations
I can do set/relational operations quickly
I can do arbitrary operations without messing with the domain model

I use OO because of all the well-known reasons for using OO.

Iterating over OO collections isn't efficient because, as pointed out above, you have to load all the objects in the collection into memory to iterate over them. Remember the problems with EntityBeans: not just the nasty coding conventions, but the performance problems that prompted topics like CoarseGrainedEntityBean and EntityBeansAreEvil. On the other hand, if you have servers big enough to keep your entire domain model local and in memory, it might be practical to forgo the RDBMS.

I like OO, I like databases, but I do find the mapping awkward even with ORM like Hibernate. I suspect that any OO system that can do set/relational operations efficiently and quickly would look very like an rdbms under the hood; you'd need it to be ACID to keep the object pool sane, and table-like storage of records and attributes on the disk makes searches and comparison so efficient. I'm not saying that that is a bad thing, but I suspect that you'd still be able to use something like SQL on the underlying storage of the objects.

Take a look at TheThirdManifesto for a deeper glance at the object/rdbms furor.

In my opinion, the power of relational is from the merging of two concepts: First is tables, which are easy to relate to and visualize (as 3D beings who use 2D writing systems), and second is SetTheory with its succinctness and "mathability".

See ObjectRelationalPsychologicalMismatch, TableOrientedProgramming, ChrisDate, OoLacksMathArgument, ReinventingTheDatabaseInApplication, SqlFlaws, DifferentReasonsPeopleLikeRelational, RelationalWithSideEffects

CategoryRant CategoryWeenie