Schema Design Is Modeling

I find that when I design a system (or wish to redesign one), I almost always first start with general schema design. I find that this helps me best bring out important details and/or questions about the "model". And, it serves as the framework in which to later add task-specific behavior too. Does anybody else feel this way? --top

To some extent, yes. On the largest schema design I worked on so far, an accounts receivable system, I used a variety of techniques. We actually started with use cases describing what users would want to do. We spent a lot of time building these, working with a domain expert and another developer who had more expertise in the domain. This helped a lot with schema relationships, because it made clear what could be deleted, voided, etc. We often thought in terms of representing 'documents': invoices, deposit slips, etc. Many data screens would be presented to the users in terms of these documents, because they're familiar to them. Finalized documents were considered to be contracts; these could never be truly deleted from the system (except by infrequent archiving operations). Sometimes, we had to think in terms of events that needed to be recorded.

In a sense, I think we had several models of what the system had to do and represent, and the relational schema had to reflect all of the data needed to represent elements of all these models as facts. It was hard work, and we didn't catch everything in advance, but we did get all the essentials right. No major changes needed to be made during implementation. Although I don't like a waterfall model of development, I'm skeptical that an incremental approach would have worked as smoothly. -- DanMuller

I start with failing tests. If the code required to satisfy those tests is simple or already written, then I can move to schema design very quickly. I find that if I start with schema design before I write tests I insert too many assumptions about what the data "ought" to look like. My assumptions (or the user's or analyst's assumptions) can easily lead to overly complex schema. With large systems it can be much more difficult to simplify later than to create only what is needed to fulfill the requirements. -- EricHodges

In short, yes. Even if the "schema design" turns out to be no more than ER diagrams sketched on napkins before diving into TestFirstDesign, it is a helpful way of mentally coming to grips with the persistent elements of the problem space. On some large, heavily data-driven systems I've worked on -- a health care billing and patient records system, for example -- where it was understood that the database would need to outlive the applications that manipulate it, schema design was strongly emphasised at the start of the development process and in any significant revisions. This reflected the fact that the BusinessValue was in the data itself, not in the relatively changeable applications that manipulate it. It also reflected a practical consideration: Any schema changes had to be deployed to the up to seventy geographically disparate locations that ran our system, and although this process was largely automated, it could never be undertaken lightly due to the crucial nature of the data. -- DaveVoorhis


Here is a snippet of mine from a usenet discussion where somebody claimed that RDBMS are mostly just about "persistence":

[...] the general philosophy behind most DBs: they model the world by associating attributes of things to a virtual model of that thing. There is only one you, and thus you have a unique record and key in the DB and attributes about you such as hair color or bank balance are stored in that record about you. It is a one-to-one match more or less. The "persistence" you often talk about is simply modeling the external world where things don't disappear just because you turn off the lights [or computer power]. In this view, "persistence" is a technique for modeling, and not really a mechanical/electronic issue. (Note that one does not have to use them for one-to-one modeling of external things.)

This to me is a large part about what DB's are about: Modeling domain nouns.

--top


CategoryModeling, CategoryDatabase, CategoryPlanning


EditText of this page (last edited April 8, 2006) or FindPage with title or text search