This is a place to list and briefly describe issues involved when deciding how much to normalize relational tables. Please include only brief summaries of pros/cons in the list itself. Place long discussions below the entire list.
- Duplication Reduction (OnceAndOnlyOnce) - See NormalizationRepetitionAndFlexibility.
- DBA versus App Developer Availability - In some shops, DBA available is less or more available than app developer availability.
- I think developers who are worth their salt, can be the DBA's.. i.e. I don't see them as completely separate entities with completely different different knowledge. Most good developers should have strong knowledge of normalization or be able to read up on it (esp. if products offer proper documentation and education). I'd argue that a good DBA is not better than a good developer - because a good developer is also a DBA!
- Understanding of the Domain - The DBA's versus developer's understanding of the domain may be at different levels. This may affect whether the validation is best managed at the application level or database level.
- Cost of "Safety Errors" (bad data) - See SafetyGoldPlating
- Cost of Business Rule Change - When a business rule changes, it may impact different table designs in different ways.
- Availability of DatabaseRefactoringTools
- Availability of programmers and database engineers who know how to refactor manually, too - as normalization is something one must understand anyway. I hope the automated tools don't dumb down our community - but at the same time they would be nice to save time (if, normalization, can even be automated sensibly and reliably without human intervention - if at least partially).
- Cost of Confusion or Information Overload - What are the results of an error caused by being confused by complexity or volume. (Example: does the number of tables to grok affect the ability of developers or other IT staff to use them effectively and properly?)
- On the other hand, do hacks like multiple columns in one huge table cause IT staff to be confused and cause applications to go through complex kludges to work around the problems.. for example Image1, Image2, Image3 columns instead of a normalized table with the ability to have as many images as needed. What if the customer needs four images in the future? Our big wide table confuses are IT staff, now we have to change the schema of our big wide table to make room for Image4 and Image5! Think long term. Or, should we shove multiple images in one Image column using semi colon separators. A short term hack, leading to multi value database - confusing staff even more (can't query how many images customer has uploaded using database query language, etc.).
CategoryDatabase