Extreme Programming Challenge Thirteen

Lucky number, huh? Okay, I'm working with a gaggle of DBAs. I'm building a 3-tier MTS-style app, and I want to keep business logic in the middle, away from the stored procs in the back. Refactoring is just fine and dandy in the middle, but my DBAs have a database to normalize and tune, and it's going to have to serve a whole bunch of apps in the future that will need it to do things I can only imagine now. Sure, you aren't going to need it, but RDBs are expensive to refactor, right?

So how can I stay extreme when I am going to need things in the data model? How can I stay extreme when the RDB is truly expensive to refactor? Or are RDBs the death of XP?

(Thinks: I ain't heard a word about RDBs from these SmallTalk clowns. GemStone for DataWarehousing?? I can't wait to hear 'em explain how XP'll make that scale ...)

See RefactoringWithRelationalDatabases

Get your own "copy" of the RDB. Refactor it whenever you want, daily if you feel like it, up to first release. It's not really hard, you just recreate all the tables. Keep the cardinality small so it doesn't thrash. Make sure the released version is fully normalized. Thereafter you can always add fields by setting up auxiliary tables with the same primary key as the one you wish you could extend. Generate default records or deal with missing records with default values in the model, as seems more convenient. Rely on the fact that as the system stabilizes, changes to the RDB are less frequent, just as changes to the objects tend not to change the instance variables very often. Finally, when the RDB needs to change, change it.

DBAs have an unnatural fear of refactoring. It takes about one SQL statement to do most RDB refactorings. They should have called them something other than "Administrators", it got them off on the wrong foot.

Refactor that.

In moderately large and complex RDB environments, simple and obvious modifications to database objects can potentially cripple the performance of existing systems.

This could possibly be addressed using a TestingFramework; if one were to keep a storehouse of "queries that need to be fast", and run them all whenever changing the database schema, you could verify that you're not breaking anything. It might be difficult to maintain such a repository. -- BrettNeumeier

See CrossingChasms

DBAs have an unnatural fear of refactoring? Where I worked before had a DBA afraid to change ANYTHING. The usual excuse? "It'll take three months to test it [we only had two weeks to go]" 12 months later and we are STILL at it!

A DBA as above, lots of little weird C++ bugs, The C++ Guru Who Never Used CVS, A broken GUI system, broken middle-ware, Change Phobia ... and NO unit tests. In fact, no automatic testing of ANY kind.

I refactored my career instead. 10 doors down I found computing heaven.

Moral of my story: the price is high, but you need UnitTests for ALL parts of your system, including the RDB pieces.

-- NickBishop, bilingual C++ and Oracle Wannabe

The answer to most ExtremeProgramming hurdles is to be more Extreme:

Refactoring is just fine and dandy in the middle, but my DBAs have a database to normalize and tune...

Tuning is optimization. Put off optimization until you have a clean (refactored) system to work with, right before the next release. And, of course, why are you optimizing before you know where it's needed? ProfileBeforeOptimizing.

As for normalization: it seems to me to be the RDBMS equivalent of refactoring, so when you refactor, you're doing some of their normalization work for them.

...and it's going to have to serve a whole bunch of apps in the future that will need it to do things I can only imagine now.

Then the database can be changed in the future to accommodate those apps. What's more likely: that the future developers will curse you for not guessing at their requirements and tossing in a lot of speculative stuff, or that they'll thank you for leaving them with a clean design to start from? YouArentGonnaNeedIt.

Sure, you aren't going to need it, but RDBs are expensive to refactor, right?

Code's expensive to refactor, right? Well, both are common wisdom, and the latter is wrong when you're doing things the ExtremeProgramming way, so maybe it's time to take a long hard look at just how expensive it is to refactor a relational database:

The only expense that's different from that in the case of code refactoring, is migrating the data. And I'm no longer very sure that's all that expensive. If the working set of data is small, it doesn't take long. If the working set is large, you can refactor with a view instead of actually changing the table schema.

For more on how to refactor efficiently with RDBMSes, see RefactoringWithRelationalDatabases.

One final point, related to future applications using the same database, is existing applications using the same database. This is a difficulty, but it's just the old CodeOwnershipBoundary? problem again: you can't refactor stuff you don't own. Get around it the way you would any other such problem: e.g. adapters.

In fact, you can see the recalcitrant-Database-Administrator problem as a code ownership problem, too: the database schema is code, you need to work with it, but the DBA owns it. If you want to do ExtremeProgramming, make it so the DBA doesn't own the schema. Have the DBA PairProgram with the other developers when they're making schema changes if you want to leverage and spread his or her database knowledge. SpecializationIsForInsects.

-- GeorgePaci

Extract of a post I put under DbasGoneBad:

When someone mentions DbasGoneBad, I think we need to look at roles. In many application development scenarios, you end up with 2 sets of databases, production and development, with mutually exclusive functions. A development database exists for development and some testing work. Programmers often have a high level of control to make changes etc. DBAs provide tools and assistance, often also working as data analysts and modelers in smaller environments. Programmers are given wide latitude, though provisions are made to insure that developers do not trip over each other. In my situation, I often end up playing traffic cop which is sort of a SCM problem. In addition, I often spend time educating programmers on what a DB is and what our particular brand of DB can and cannot do. Now, a production DB is very different. Often, a company will have millions or even billions of USD invested in the data. Before a production move the programmer must prove that the code they are going to unleash (and any schema changes) is going to work correctly and maintain data integrity. Often, this is a complex task. But the stakes of a mistake are high enough that a good DBA should go nazi on anyone that may threaten the production DB. That includes having them provide a detailed work plan with fall-back positions. This may slow development down a bit but it is necessary. It does not provide an excuse for doing nothing, however. To me, DbasGoneBad do not properly support the development process and/or make things so complicated to move that nothing gets done.

So my point is that you can use XP at least for the first version and redesign the DB as needed. And a good DBA staff should support this. Also scale matters, the larger and more complex a DB, the harder it is to test changes and so to make changes.