Is it surprising that we have tools like ReShaper? or Ideal IntelliJ, or Eclipse that make it really easy to refactor our code to make it fit better the domain of our requirements, but there are almost no tools to do that for Databases?
A good DatabaseRefactoringTool should make it easy to:
- Change the multiplicity relation between tables (from OneToOne to OneToMany? or to ManyToMany? in a single click)
- Increase or decrease the normalization of a design without losing data, and without having to write a very complicated SQL Script
- Split a table into two tables, and automatically rewrite the referencing stored procedures, views, and so on.
- Join two tables in to one, and automatically rewrite the referencing stored procedures, views, and so on.
- Allow to remember the re-factorizations applied to the development database, and re-apply them, to the production database.
- Work for lots of different databases (not only for SqlServer or only for Oracle)
- Rename functions, views, stored procedures and tables, and updates all the references to these renamed objects.
- Rename parameters and columns renames parameters of stored procedures and functions, and columns of tables and views.
- Schema comparison and automatic (or semi automatic) script creation to convert from one schema design to the other
- Easily add or remove a column from a Table
- Easily change the type of a column (if the change doesn't damage data, it should be super easy to change from Integer to Long, or from Integer to Float, or form Integer to String, and if can damage data like for example from Float to Integer or from String to Integer it should automatically verify if the conversion is possible given the current values of the column, and if is not possible, generate a query that shows you which rows have the offending values)
- Encapsulate As New Stored Procedure turns your selection into a new stored procedure and, if requested, introduces a reference to it in your script.
- Expand Wildcards expands SELECT * statements to include a full list of columns in the select part.
- And the UltimateProblem?, propagating the effect of your refactorings to client code
- Easily drop tables (even if connected with other by serveral integrity constraints, it should also make it possible to delete those foreign keys if desired)
- Easily change the type of primary keys or foreign keys (from String to Integer or something else) and propagate the changes to referenced primary keys or foreign keys
Examples of these kind of tools:
A strange thing in the world of database design and admistration is that we have lots of tools to may lots of silly diagrams, but we do not have many tools to do DatabaseRefactoring?... why is that? (Maybe it is because bad Dbas do not want anyone to know that they could be replaced by a tool?)
As someone who has made several false starts at creating such a tool, I think the biggest obstacles are:
- As programming goes, this turns out to be particularly tedious, un-challenging, un-interesting, and un-sexy;
- un-challenging? you mean it is easy?
- Essentially, yes. Except for "schema comparison and automatic (or semi automatic) script creation to convert from one schema design to the other", almost all of the above are straightforward, except a few that strike me as nearly impossible to automate.
- So, database refactoring it is so easy that is impossible?... Mmmm... paradoxical concept
- Not really. Most of the operations are trivial. Some operations are relatively easy for a human to do, but very difficult if not impossible to automate, like auto-generating the SQL to morph one schema into another -- except in trivial cases. Keep in mind that this could involve re-writing stored procedures in non-trivial ways. Indeed, any operation that require significant alterations to stored procedures are unlikely to be automatable, except in trivial cases, but a human programmer can do this easily.
- tedious, un-interesting, and un-sexy? really you think those are the real motivators for lack of availability of this kind of tool?
- They're certainly responsible for my lack of motivation, but the fact that database refactoring occupies such a small percentage of development/maintenance time is probably a more significant factor. See below.
- If you're a database professional, there's always plenty of work around that's more profitable in the short term;
- In other words, you believe that in the world of databases the objective should be to JustMakeItWork? (You just don't care if it becomes a maintenance nightmare?)
- Such tools may expedite refactoring, but they don't do anything that can't be done manually.
- Yeah, there is nothing as exiting, challenging, interesting and sexy than having to manually fix all those functions, views, and stored procedures just because you need to rename some tables and fields, and after that there is nothing like the excitement of wondering:did I break anything? pure bliss... in fact, who cares if the name of the table/fields have nothing to do with the UbiquitousLanguage of the domain experts... only kids are afraid of maintenance nightmares
- I think you're ranting rather than making a point, but... Trivial renamings are just that -- trivial. They're easily testable on the development database, too. Non-trivial refactorings -- i.e., those that require changes to the functionality of stored procedures -- are not automatable anyway, so there's little gain to be had. There may, however, be some gain to be had, so feel free to write a database refactoring tool. I'll happily test it for you.
- It is not that I think that trivial renamings are not trivial, it is just that I do not like GruntWork, do you? It is also that I have seen too many databases with wrong names everywhere and developers or Dba that say that they "just don't have time" to fix them (they may be trivial, but they can not be done in trivial time)
- Fair enough. Like I said before, feel free to write a database refactoring tool; I'll be happy to test it for you. However, keep in mind that part (if not most) of the reluctance to changing the database may not lie with the database itself. There may be dozens of client-side applications dependent on that database. They'll have to be changed and re-deployed too.
- Well, that might be a case for LINQ, with something like that, your client code would be strongly tied to the database structure and it would be trivial to automatically propagate trivial renaming from the database to the client applications... and therefore ObjectRelationalMappers? are a GoodThing, because they make it possible to automate this kind of changes... (Believe me, a tool that could do this would save anyone lots of GruntWork, don't you think?). Or, if you do not like the Object part in ObjectRelationalMapper... maybe the answer is to create a ClientApplicationToDatabaseMapper that makes it possible to automate this kind of changes (and it should replace silly GruntWork prone database Apis like JDBC or ADO.NET)
- Since they're done relatively infrequently (see below), they're the kind of thing that when you have to do an individual task, you sometimes think "Gee, I wish there was a tool to do this." Then you finish the task (i.e., preserve the SQL that implements the refactoring) and spend the majority of your time on other things and simply forget about it. I've never found this sort of thing a maintenance nightmare. Annoying at times, yes. A nightmare, no. Since most such refactorings have to be deployed to production databases in the field, they're invariably expressed as SQL scripts rather than interactive changes. It's a matter, therefore, of writing some code.
- Database refactoring, manual or otherwise, occupies such a small percentage of the overall database-driven application development process that most of the time it seems not worth bothering.
- Really? or that is just some more of that I JustDontCareIfItBecomesaMaintenanceMightmare? philosophy?
- Really. >99% of my time is spent working on the application(s) side; <1% is spent working on the database, and that includes some big schemas that change relatively frequently. For every minute spent making a change to the database, there will probably be at least an hour spent accommodating the change on the application side, unless it's a trivial change. Having said that, I think such a tool is a great idea. I'm not deprecating that. I'm simply pointing out why I think it isn't a significant application category.
- Well, that maybe so, specially if you are like me and use an ObjectRelationalMapper (and unlike those that believe that DatabasesAreMoreThanJustStorage and do code in stored procedures ,views and user functions)
- I don't use an ObjectRelationalMapper, and I believe that DatabasesAreMoreThanJustStorage and a DatabaseIsRepresenterOfFacts.
- Well, then it is really amazing that >99% of your time is spent working on the application(s) side, if you do belive DatabasesAreMoreThanJustStorage, then, how can you spend such a little time in the database? you certainly can not have a lot of your code in stored procedures, functions, triggers and stuff like that.
- If you believe that DatabaseRefactoringTools will not help you... I JustDontGetIt... How can you believe that DatabasesAreMoreThanJustStorage and at the same time use them only for storage? (If you never write functions, triggers and stuff like that, and 99% of you code is on the application... and you don't like ObjectRelationalMappers?... I am starting to believe that you may be masochistic... Maybe I just don't get how you code...
- The volume of application-side code is proportionally vast compared to the database-side code, and there is typically one database shared by multiple applications. Stored procedures and triggers are typically used to handle specific core server-side functionality -- such as to support an employee scheduling engine, for example -- and change relatively rarely compared to the application-side. The need for column, table, or procedure renamings is rare. Deletions of columns or procedures is rare. Much, much more common is adding tables and columns, which result in a relatively small server-side change volume, but may effect multiple client-side applications, and may require changes throughout each client-side application. Also, changes to client-side applications or development of new applications that require no database changes are relatively common. However, I do recognise that this may be somewhat domain-specific. I can imagine domains where the database might require massive changes to reflect significant business changes, but that would still require significant -- and proportionally large -- changes to client-side applications. In that case, I would be inclined to deploy a new schema and database that imports the existing one, rather than refactor the existing database. Finally, I take an approach that does not mirror whole database entities as client-side objects. I build applications to process information, i.e., facts; I do not create business simulators. And, I'm not saying DatabaseRefactoringTools wouldn't help at all, just that they wouldn't help enough to motivate me to finish building one.
- I can think of a very common time when the database might require massive changes to reflect significant business changes, and that is when you are creating the system for the first time, and, as always, the requirements change every day.
- Yes, I've seen poorly-managed and/or poorly-analysed projects go haywire in that manner in the early stages, and a refactoring tool might help relieve the pain thereof. Still, the percentage of overall project time that will be spent on the database, as opposed to working on the application-side code, is likely to be very small once the database requirements have stabilised.
- Mmmm, what you say gives me the impression that you tend to get your database designs right on the first try (Something that in my opinion rarely happens, specially because Customers are not so good at explaining what do they expect from the software). Your customers usually have all their requirements correctly specified right from the beginning? Or... Do you handle this change in a way that wouldn't be helped by DatabaseRefactoringTools? if that is the case... can you explain your methodology?.
- On the first try, I get the database design 100% right 0% of the time, 99% right about 1% of the time, but (throwing arbitrary statistics around here), I get it 95% right almost 100% of the time. Most of the changes, however, tend to be minor -- adding columns, occasionally changing column type, and sometimes adding tables or turning an existing table into two in order to capture a one-to-many that escaped initial analysis. Wholesale changes that necessitate a new schema? Very rare. Some reasons for this:
- We primarily (but not exclusively) developed (I'm using the past tense because I'm mainly doing university lecturing now) financial accounting, employee scheduling, billing, payroll and inventory systems, which generally have clear database design requirements even before analysis. There are only so many ways to build a Canadian payroll system, for example, and the majority of core requirements are specified in government-provided documents.
- We employed a requirements-gathering methodology we developed in-house called the Forms-Based Analysis Method, which uses a combination of a diagramming methodology and a set of structured questions that are highly successful in reliably eliciting database requirements from the users. (Published, with a co-author, in the Proceedings of the Western Decision Sciences Conference, San Francisco, California. 1995)
- The above is essentially BigDesignUpFront, rather than an agile methodology. Subsequent client-side development is agile. This combination seems quite effective in minimising difficult changes.
- During development -- when little or no data need be retained -- the database schema can be trivially dumped, edited as a text file, and re-loaded.
- Sometimes changes are needed after deployment, but these are very rarely significant, as the users have generally been closely involved throughout the process. We used an inversion of OnsiteCustomer, in which the consultant developer spends significant time on the client's site -- sometimes even acting in an employee role -- in order to elicit and determine requirements, and obtain feedback on initial releases.
- While application requirements are often unclear and changeable, database requirements tend to be quite rigid. A given enterprise, department, or organisation tends to have numerous ways to clearly indicate exactly what it needs to record about its Customers, Employees, Products, etc. These are found on printed forms, existing computer systems, Post-It(tm) Notes, catalogues, or whatever the business currently uses to record and/or distribute relevant information. This becomes the basis of the database design. What tends to change or be poorly-understood (or poorly-communicated by the users) is what the business does with this information. That becomes the basis of client-side application functionality.
- And the idea of creating an ClientApplicationToDatabaseMapper to make it possible to do refactorings all the way from the database to the client application would tempt you?
- Yes, essentially. I've added a comment to that effect to ClientApplicationToDatabaseMapper, though my idea is broader in scope than just a mapping layer.
See also TableNormalizationIssues, ImproveDatabasesOrElse, RefactoringDatabasesBook
AprilZeroEight
CategoryRefactoring CategoryDatabase