Reinventing The Database In Application

We RelationalWeenies see a disturbing trend of reinventing the DataBase in application code.

Examples include:

Context:

Smells: (Typical database features listed under DatabaseDefinition)


Please, do NOT read about eBay's transactionless architecture http://martinfowler.com/bliki/Transactionless.html

EditHint: move to WhenAreDatabasesNotAppropriate. See also: WebStoresDiscussion. eBay's products are relatively isolated because the site mirrors individual bidders and their individual products. There is not much complex interaction, so the advantages of a DB would be relatively minimal. It lacks the multi-faceted relationships of most business applications (suppliers linked to sellers linked to shippers linked to marketers, etc.)


If something is getting reinvented again and again, it seems it should be turned into a template or generic of some sort. C++ can now create template tables, template join ops, template search ops, template index mechanisms, etc., even for objects that change over time (with a template observer pattern). Of course, a convenient Boost implementation is a ways off, but it can be done.

Reinventing the Database Once Per Language

Yes, but this is indeed "reinventing the database". If each application, shop, and/or language reinvents one, then there is not only duplication of effort (OnceAndOnlyOnce violation), but the interface/protocol/commands are all different because each will use its own flavor. There is no consistency, so you have to relearn each one every time you encounter it. You also tend to run into walls, such as when a single-user library suddenly needs concurrency management. With a DB those features come in the box. Writing all that from scratch is a waste of programmer time.

Everything needs to be reinvented when you design a language. Think about it... which general programming language have you seen that doesn't have its own linked lists, records, and basic IO procedures? If a project written in a language needs a database, you can't get by WITHOUT reinventing the database at least once per language. Even an interface to an existing database or query language needs to be reinvented at least once per language. The trick is reinventing it just once... and doing so in a manner that is powerful and flexible enough to fulfill the language users' needs. If you can't manage the latter, your users will feel the need to reinvent the database again and again.

You argue that "With a DB [certain] features come in the box"... and you're right. A DataBase isn't state-of-the-art if it doesn't provide atomic operations and concurrency management for the data access. But, of course, that's just one box... just one DB, just one language. You'll find that everything you just complained about in languages is true of databases... if each "box" reinvents the database, then there is not only duplication of effort (OnceAndOnlyOnce violation), but the interface/protocol/commands are all different because each will use its own flavor. There is no consistency, so you have to relearn each one every time you encounter it. Writing a new DB from scratch is a waste of programmer time.

The only means to prevent a OnceAndOnlyOnce violation on the basis of language is to invent a language that fulfills every role optimally... from peer-to-peer database management to optimized hardware drivers and printer specifications, and then invent it just once for this language. Unfortunately, that isn't likely to happen. Absent that, the best we can do is find a way to "reinvent the database" just once for each language that needs it... such that every application in that language requiring database functionality can leverage it.

In C++, this can be done with templates and virtual database/table objects (the latter allowing a single interface to support both in-application volatile databases and interfacing to persistent data stores or remote databases). And it should be done. Without this, every application that needs a database will need to reinvent the database and every database that C++ interfaces with will need to reinvent the interface (unless it deigns to use a common one, like ODBC)... which, as you'd say, is not only duplication of effort (OnceAndOnlyOnce violation), but the interface/protocol/commands would all be different because each will use its own flavor, forcing programmers to relearn each interface anew... a true waste of programmer time.
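To make the "single interface, several backends" idea concrete, here is a minimal sketch. It is written in Python rather than C++, purely for brevity, and it is not anyone's actual design: the names Table, MemoryTable, SqliteTable and names_in_city are hypothetical, and the SQLite-backed class assumes its table and column names are trusted identifiers.

```python
import sqlite3
from abc import ABC, abstractmethod


class Table(ABC):
    """Hypothetical common interface; the names are illustrative only."""

    @abstractmethod
    def insert(self, row: dict) -> None: ...

    @abstractmethod
    def select(self, **equals) -> list: ...


class MemoryTable(Table):
    """Volatile, in-application table backed by a plain list."""

    def __init__(self):
        self._rows = []

    def insert(self, row):
        self._rows.append(dict(row))

    def select(self, **equals):
        # Keep rows whose named columns equal the requested values.
        return [dict(r) for r in self._rows
                if all(r.get(k) == v for k, v in equals.items())]


class SqliteTable(Table):
    """Same interface, delegating to a SQLite connection (file or in-memory)."""

    def __init__(self, conn, name, columns):
        # Assumption: name and columns are trusted identifiers, not user input.
        self._conn, self._name, self._columns = conn, name, list(columns)
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {name} ({', '.join(self._columns)})")

    def insert(self, row):
        marks = ", ".join("?" for _ in self._columns)
        self._conn.execute(f"INSERT INTO {self._name} VALUES ({marks})",
                           [row.get(c) for c in self._columns])

    def select(self, **equals):
        sql = f"SELECT {', '.join(self._columns)} FROM {self._name}"
        if equals:
            sql += " WHERE " + " AND ".join(f"{k} = ?" for k in equals)
        cur = self._conn.execute(sql, list(equals.values()))
        return [dict(zip(self._columns, r)) for r in cur.fetchall()]


def names_in_city(table: Table, city: str) -> list:
    """Application code sees only the Table interface, not the backend."""
    return sorted(r["name"] for r in table.select(city=city))
```

Code like names_in_city does not care which backend it is handed, which is the point being argued: the interface gets invented once, and the storage behind it can change.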


This page assumes there's a database involved in the application to begin with. Believe it or not, some applications never use a database.

It is talking about database-like features in application code and not necessarily the presence of a database. Note that I agree that embedded systems and other niches may not have the horsepower for a database, but that is another topic.

Then how are you supposed to query data if you don't have a database and you can't write query methods?

Are you talking about not having the option of using a database?

No. I'm talking about apps that don't need databases. Pong doesn't need a database, but it does need query methods.

Perhaps this is an issue between "need" and "nice to have". We don't "need" compilers either since we could write in machine code directly. But compilers sure make life easier. In a similar way, I think databases make development easier. They provide abstractions that are common for managing state.


The issue here is being skirted. If a Database collection object were available in C++ or Java, I'm sure coders would jump up and use relational code as fast as they could. The problem is much simpler, and twofold:

Forces

Because of this, it takes a lot for a small file-based application to get big enough that someone sucks it up and admits that they need a DB. If it was as easy as saying
 FooDatabase foo = FooDatabase.Open("myFooName");
 myListBox.Data = FooDatabase.select * from foo where blah;
where everything's compile-time typechecked, builtin, etc. then all the arguments against OOP users reinventing the DB in application would make sense. But the fact is that, while relational theory is beautiful and pure and pleasant, and all the other features provided by databases are individually wonderful things we'd love to have in our apps, when viewed as the collective monstrosity of your average RDBMS, databases suck.
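For comparison, a language whose standard library bundles SQLite gets fairly close to that wish. This is a rough sketch in Python, not the C# or Java under discussion; the table foo, its columns, and the list_box_data variable are made up for illustration. Note that it does not meet the "compile-time typechecked" part of the wish: the query is still a string that is only checked when it runs.

```python
import sqlite3

# A file path such as "myFooName.db" would persist the data;
# ":memory:" keeps this sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo (id INTEGER PRIMARY KEY, label TEXT, blah INTEGER)")
conn.executemany("INSERT INTO foo (label, blah) VALUES (?, ?)",
                 [("first", 1), ("second", 0), ("third", 1)])

# The query is a plain string, checked at run time rather than compile time.
list_box_data = [label for (label,) in
                 conn.execute("SELECT label FROM foo WHERE blah = 1 ORDER BY id")]
```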

When C# 3.0 comes out and I can just plug SqLite into it to give me quick, painless DB access within my app, then I'll start using DBs in my hobbyhorse projects. Until then, my home-brewed tools will remain DBless and I will use a DB when my employer specifically tells me to.

SqLite is essentially not a database, nor is it properly relational; it's a kind of advanced fileOpen tool (the sqlite website even says this). I think SqLite needs to be replaced by a light, tight, small, proper relational database system. Someone could use the SqLite source code as a basis, or wrap SqLite and use it as the backend tool.


I'll be open to the idea that certain teams cross some threshold into rewriting a database (and that that's a bad thing, just like rewriting a servlet engine), if RelationalWeenies will admit that a requirement to persist data does not necessarily require a database. IOW, saving data to a file or using something like ThePrevayler doesn't constitute ReinventingTheDatabaseInApplication. -- SteveConover

I don't see why queries and persistence are linked. It's perfectly reasonable to query something that doesn't persist. -- EricHodges
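A small sketch of querying without persisting, using the Pong example from earlier on this page. Everything here is hypothetical (the paddle data and both function names); it only shows the same question answered by a hand-written query method and by a throwaway in-memory SQLite database that never touches disk.

```python
import sqlite3

# Hypothetical game state: each paddle occupies column x from y to y + height.
paddles = [
    {"name": "left", "x": 0, "y": 40, "height": 20},
    {"name": "right", "x": 100, "y": 10, "height": 20},
]


def paddles_hit(ball_x, ball_y):
    """Hand-written query method over plain in-memory objects."""
    return [p["name"] for p in paddles
            if p["x"] == ball_x and p["y"] <= ball_y <= p["y"] + p["height"]]


# The same data in an in-memory database; it vanishes when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE paddle (name TEXT, x INTEGER, y INTEGER, height INTEGER)")
conn.executemany("INSERT INTO paddle VALUES (?, ?, ?, ?)",
                 [(p["name"], p["x"], p["y"], p["height"]) for p in paddles])


def paddles_hit_sql(ball_x, ball_y):
    """The same query expressed declaratively."""
    rows = conn.execute(
        "SELECT name FROM paddle WHERE x = ? AND ? BETWEEN y AND y + height",
        (ball_x, ball_y))
    return [name for (name,) in rows]
```

Whether the second form is worth the extra machinery for something the size of Pong is exactly what is being argued here; the sketch only shows that querying and persistence are separable.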

There are about 9 or so features of databases listed under DatabaseDefinition. If your app only needs one of those (say persistence), then I could understand arrays and file-writes or whatever. But as you need more and more of those features, then you are walking into database territory. The more complex the application is or will grow, the more of those it will eventually cover, in my observation. Thus, to future-proof your app, a database is often the way to go.

Most pro-RDB arguments boil down to some vague future need. YouArentGonnaNeedIt.

Well, it is a statistical issue. If, say, 80 percent of all apps in a given shop will eventually end up needing at least 2 of the referenced database features, then the total cost of starting with a database will likely be less than that of reinventing those features one by one until you realize you need to gut the whole app to use a database instead. (Future replies to YagniAndDatabases)


"Thus, to future-proof your app, a database is often the way to go."

I just wrote a Java app that plots a set of mathematical functions. How will adding a database "future-proof" this app? It has two or three of the smells listed above (get/set seems to be listed twice), but using a database for storing the indexed color map (for instance) would slow it down for no reason. -- EricHodges

Well, not everybody works on math apps. Nobody ever suggested that everything under the sun should use databases. (Although if they get popular and cheap enough, it may be worth it.)

The top of this page doesn't qualify the smells. It reads as if any behavior that could possibly be implemented with a DBMS ought to be.

---

I see the original poster's point, but I think how you interpret the implications is a matter of perspective.

A RelationalDatabase tries to solve many data problems that are common to many applications - persistence, querying, etc. So a RelationalDatabase, to the database-minded, might be the natural way to solve it. Those who want to move away from RDBMSs aren't trying to move away from those fundamental qualities, but from other problems they see as inherited from the relational approach. (Usually having to do with the ObjectRelationalImpedanceMismatch.) So, sometimes there is a reinvention - ThePrevayler seems to me like it would almost qualify - and sometimes you have to reinvent things to make progress.

If it is relational that bothers you, then why not an ObjectDatabase? Because you don't need to readily share data with other apps?

Probably just laziness. MySQL is the RDBMS I use - if you think that even qualifies as a database - because it's free and it's everywhere. I do find myself often in multi-language environments where an extremely well-supported DB API is important. (In my day job, for example, I have a MySQL DB that's being regularly worked on by Perl, PHP, and Ruby. Yuck.) And although I can imagine certain benefits you might get by using an ObjectDatabase, I've never bothered to really investigate them because wrapping MySQL seems to me good enough.

Me, I'm an agnostic - I use a RelationalDatabase, and then I wrap the whole thing in my ObjectRelationalMappingLayer. Sometimes, I find it useful to muck around in the DB directly, but often I'd prefer not to. Of course, I could do almost everything I wanted to do in DB code - but it's not the most expressive way, for me. -- francis

Could you qualify "expressive", or is it a subjective thing? I find relational logic highly expressive (although SQL is less than the ideal RelationalLanguage).

"expressive" is a tremendously subjective thing. In a nutshell, I find that it's easiest for me to grasp logical concepts when they're wrapped in a dynamically typed OO language like RubyLanguage, so I make my world Ruby-centric. That's my world, not yours. You don't even have to step into it if you don't want to.


[This should eventually go on its own page... I'll do that in a bit]

What I want (and am currently working on building):

I've mentioned this a bit in ManyToManyChallenge. It's not the final usage, although it's close. I have written spikes for all of the above. Now I want to write a full implementation. Working on the AcceptanceTest's right now... anyone game?

-- WilliamUnderwood

Some of these are not very clear IMO. I am not sure what you mean by "use directly in my language", "arbitrary query", "Persistence, in whatever form is necessary" (what is meant by "form"?), or "distribution". Further, it is hard to make a universal protocol without using strings.


I would like to point out the way this thing (ReinventingTheDatabaseInApplication) is solved in Rails, in two different abstractions (model and controller). Rails has had ActiveRecord since its inception. The ActiveRecord class encapsulates the database, enabling its users (AKA: programmers) to do data manipulation with a better domain language than SQL (IMHO, YMMV). And they've started to develop another way to access the data (ActiveResource). This class fetches data via RESTful web services and provides a subset of the API provided by ActiveRecord.

But this is not all. The new Rails is RESTful. This means that the objects exposed by Rails can be used with the database verbs (create, retrieve, update, delete). But I'm pretty sure that DHH explains it in a much better way than me. If you are interested, watch the video at http://www.scribemedia.org/2006/07/09/dhh/ with the slides at http://www.loudthinking.com/lt-files/worldofresources.pdf. It's worth the time, even if you don't use Rails.

-- AurelianoCalvo.

ActiveRecord does not really free you from the table, for you must still reference the table fields explicitly.

ActiveRecord is an O/R mapper, and as said further up on this page: "headaches".


See Also: ModernDinosaur, AntiPattern


(last edited February 6, 2012)