Approximately 25 rules-of-thumb for producing allegedly smoother software development and maintenance, initially proposed by WikiZen TopMind.
At the moment this rant neither summarizes the 25 rules implied by the title nor opens itself well to discussion. To whomever just wrote it: Please add one of these aspects, or consider it for deletion.
I also think the name needs a review, but I don't yet have any alternative suggestions. Maybe something like: CanDesignBeReducedToRules?. I don't think it really needs a number stated since there is no competition yet for different quantities of rules.
I agree. Some suggestions: DesignByRules? (neutral), FollowRulesBlindly? (negative), RulesMakeItEasy? (positive)
How about TableOrientedDesignRules?? Some of the items suggest a procedural or table-driven solution, where an OO approach would suggest something else. For example, "lots of similar case-statement conditions scattered thru-out code suggests that a table would be a better solution" typically suggests polymorphism in a strictly OO approach.
How about 'DesignByCheckList?'!
RulesForRelational? ?. It's usually to some advantage to pick something that is subject to accidental linkage.
Dunno... There are few things here that are rules for relational or SQL development per se. Those are better covered under DatabaseBestPractices. This seems more about client-side business application development using procedural languages like FoxPro.
What specifically is FoxPro-specific?
It isn't FoxPro-specific, it's procedural-specific. FoxPro is an example of a procedural language with named parameters and built-in local tables, hence my use of the phrase "procedural languages like FoxPro". The rules are procedural-centric, with a flavour of data-driven or table-driven development. As I noted above, some suggested solutions might be quite different when using an OO approach, even when working with SQL databases. The rules aren't relational, as the RelationalModel merely describes relations, tuples, and an algebra for manipulating these. It would make no more sense to call this page RulesForRelational than to call it RulesForArithmeticExpressions.
- I'd lean more toward "procedural rules when a database or attribute-and-collection-management-system is also used". I prefer relational for the DB, but as described, the rules would not change much if a different kind of DB was used. In OO, the app language builds the ACMS and it becomes melded into the app code (in an ugly way). --top
My understanding is that somebody is questioning that a handful of rules can define a software development process in general, not necessarily limiting it to a given paradigm or technique. In other words, they are saying that 25 odd rules are not enough to define good software practices, regardless of the paradigm. It is premature to focus on specific paradigm/tools. These rules listed are simply a specific sample to study.
I was merely focusing on the attempt to give this page a better name. I think RulesForRelational is worse, TableOrientedDesignRules? is better.
But do we want the topic to be about using rules, with table-orientation being a specific example to analyze, or about table-oriented rules?
I'd rather it be about rules in general. The original topic title suggests nothing about relational, table-oriented, etc. I'd favor DesignByRules? or FollowRulesBlindly?, perhaps with dedicated links to rules for specific domains.
My original issues when I wrote this rant were that:
- First, I don't believe such rules exist, because every system is a little different.
- Having rules is no guarantee of making something "easy". (It doesn't matter if the number of rules is big or small; rules are declarative, and that says nothing about how hard it is to follow them: "All you have to do to finish this software right is to calculate the last digit of Pi; that is the only rule".)
- With steps (an algorithmic solution) I think it would be possible to say if it is "hard" to follow the algorithm or not... (of course, then the hard part is to prove that following the algorithm always leads you to a good solution)
- I think the goal of this page should be to discuss if it is possible to build good software easily by following a small number of rules, and if it is possible to say that "a rule" makes "a process" easy or hard.
Mmmmm, maybe we should have both? Rules to test the correctness of the algorithm, and algorithm to measure effort?
Some developers... especially those that are RelationalWeenies, seem to believe that with TwentyFiveOrSoRules? it is possible to build good and easy-to-maintain software (at least in the biz domain). I believe it is harder than that... or perhaps it is possible to have such a small number of rules... but that doesn't make it "easy"... Why? Well, because rules are declarative, and that says nothing about the procedure required to fulfill them. (I could say, for example, that the only rule needed to go to the moon and back without a spaceship is "to have all the powers Superman has", but that doesn't make it easy; it doesn't even make it feasible.)
Now... if we were talking about the TwentyFiveOrSoStepsToBuildSoftwareThetWorksAndItIsEasyToMantain? (especially very detailed AlgoritmicSteps?), then we could say (and only after reading them) that those steps prove that building software following a particular recipe is easy (and we of course could only say that for the kind of software that falls inside the scope of those steps, in this case at least in the biz domain).
I'm not sure specifically what you'd like to see. I'd suggest picking an item from the list and focusing on that. For example, you may ask "What does a generic classification system or code template look like?" if that is an area you feel needs to be fleshed out. -t
Reminds me of a lecture back in college: "A pack of useful lies about the Renaissance"... Would everyone feel better if we said heuristics?
I believe I said something like that somewhere, but I don't remember where at the moment. But here is a modified copy of a list from ProceduralMethodologies that can be used as a starting point. Note that I never said such a list would be final, and amendments and adjustments would indeed need to be made as problems with them are found over time. --top
27 Database-centric Procedural Design Rules of Thumb
- Divide up code by "tasks" or "events" (EventDrivenProgramming) to be performed by the software
- Why do you think this is easy? Finding the right tasks or events to include in the scope and deciding what to do first is IMHO a hard job
- StepwiseRefinement comes fairly naturally to most people after they have a little experience. But in general, I would say, "let repetition be your guide" (see next). Draft it out in pseudocode and if you see lots of repetition, then factor that into routines/functions.
- Even without StepwiseRefinement, you'll find that most UserStories work similarly. A user story needs to be broken down into several EngineeringTasks? -- each of these tasks maps well to this. I've been a long-time OO programmer and huge XP advocate. I finally realized one very valuable reason why I love ForthLanguage so much, and it is because it strongly encourages TaskMajor, rather than ModuleMajor, program organization. But, it requires a whole new set of support tools to implement well. I'll be blogging on this soon, once I get my thoughts down. --SamuelFalvo?
- Study a list of proposed tasks to see if there is duplication that may suggest frameworks or libraries.
- Again not easy... but a good idea
- I think an initial draft is fairly easy for an experienced developer. But culling, tuning, and re-factoring the list indeed may be a lot of work. Good factoring is never easy, but spending some time to at least take note of duplication helps one to at least focus on the lowest-hanging-fruit or biggest perpetrators of duplication.
- Information about domain nouns goes mostly in the database, not code.
- Why?
- Primarily because it is easier to sift, search, sort, and filter facts about nouns in a database than in programming code.
- Most interactive tasks should be no more than about 100 lines. Batch tasks may be much longer.
- I personally use the limit of no larger than a screen, this rule (and yours) are easy to follow...
- In many cases that is too small in my opinion.
- Use the HeadlinesTechnique to comment each related unit of code.
- Follow relational normalization rules. When in doubt, make one larger table instead of two smaller ones. (Some interpretations of normalization rules lead to too many tables.) See DatabaseBestPractices for more tips. FearOfAddingTables gives an account that disagrees with this rule.
- For larger teams, assign an experienced developer to coordinate and oversee shared libraries and frameworks.
- Any hints on how to convince management of this? IMHO this is pretty hard...
- Use a DuplicationRefactoringThreshold of 3 to 4. Thus, if you see a pattern repeated 3 or 4 times, then consider factoring it to a subroutine or shared library. Eschew DontRepeatYourself louder.
- Lean toward using the database as the primary communication conduit between tasks instead of variables.
- Isn't this bad for performance... especially on slow networks?
- I don't see why it would in general unless you are doing something unusual. How about an example? But I agree that one may have to abandon some rules-of-thumb if performance is the driving factor.
- Hire a good DBA, or at least a good data modeler
- In my experience many good DBAs make the life of application developers unnecessarily complex, by forcing patterns like ComplexPrimary? instead of SurrogatePrimaryKey? as some kind of tradition... they don't care if it is harder to databind that to a combobox (or many other UI controls)... unless you use an ObjectRelationalMapper... but they don't like those either... so IMHO DBAs are good to maintain an existing database, but not to design a new one
- If they get in the way, they are not "good". Perhaps bring these up at DbasGoneBad.
- Only use stored procedures for the more common or most costly queries. Excessive use of stored-procedures results in more code change-points for changes such as new columns.
- I agree, but how do we convince everyone of this?
- Show them the extra steps a developer has to take to add or remove a column/parameter from an SP and multiply it out to show them the hours wasted per year.
- I think this problem is related to the lack of support for sending complex structures to the database... (object as a parameter support?) Is there something in relational theory that forbids complex structures as parameters? Or is this just one of SQL's flaws?
- How about a UseCase?
- Use languages with named parameters and built-in "local" tables if possible.
- Like?
- Unfortunately, named parameters fell out of style. Only VB.NET has them among the commonly-used languages.
- So... this one isn't just hard... it's just plain impossible with any commonly-used language?
- There are ways to emulate them, it's just more work. (Insert related topic when found.)
- What I use in my C coding is borrowed heavily from Smalltalk. If I want to associate a widget with a containing widget, I'll call a function associate_withContainer_(someControl, someParentWidget). Note how I use the underscore similarly to how Smalltalk uses the colon. I find it really improves legibility. --SamuelFalvo?
- Off topic perhaps, but what is "row R" in this example? Is "datum" an entire row or merely a specific column?
- I've altered the example to be a bit self-describing; I was hoping that my previous example was self-evident given the context, since the goal is to illustrate my function naming conventions, not to spend any attention to its parameters per se. --SamuelFalvo?
- Understand the users' backgrounds, not just the requirements. For example, would they prefer easy-to-use or reduced typing/mousing? Optimizing interfaces for newbie understanding and power-users generally creates ConflictingRequirements. If you need both, then perhaps create two screens per user task, one for newbies and one for power users. However, this complicates the system, so only consider it for high-volume or critical portions of the app.
- Use DeclarativeGuiFrameworks and EventDrivenProgramming when creating GUIs rather than hard-wiring GUI arrangements into code calls.
- Avoid too many parameters. Lots of parameters usually indicates a problem with the design or the particular language you are using. See TooManyParameters for tips.
- This is hard, especially when talking to stored procedures; they cannot receive complex structures... do you know of one able to receive complex structures?
- See above about stored procedures.
- Managing the number of parameters is easy. I suggest reading ThinkingForth for an excellent philosophy of coding that may result in a drastic reduction of parameter passing. See also ParameterObject. --SamuelFalvo?
- Dividing up by tasks should keep the name-space at any given (execution) time relatively small. If the name-space gets crowded such that naming collisions are common, then some refactoring may be needed to keep tasks small and relatively independent. Keeping the name-space small per task allows one to keep the variable names and scope management simple.
- Strive for HelpersInsteadOfWrappers. GenericBusinessFrameworkUnobtainable for the most part. Make it relatively easy to abandon a given framework if need be and feel free to copy-and-modify frameworks from other applications rather than try to centralize them. Break up big frameworks into smaller cooperating sub-frameworks if possible. Clean out any irrelevant stuff from copied frameworks/utilities.
- Lots of similar case-statement conditions scattered thru-out code suggests that a table would be a better solution.
- Or maybe use the StrategyPattern ?
- I've never really found a good use for it in my domain. How about we agree that lots of similar case lists suggest pondering alternatives.
- Document the reasons behind major design decisions so that one can trace the origin of problems in the future. The purpose is not to blame people, but to learn from everyone's mistakes. It also encourages one to review their decisions.
- So... we could say this means using DesignByContract languages? Or DesignByContract practices..., because the bad thing about documenting this stuff without using actual code to do it is that if it isn't compiled, it falls out of sync and becomes irrelevant pretty fast.
- There is a difference between documenting design decisions and documenting the final/running project.
- Name by frequency - Frequently-used variables and functions should have shorter names than rarer ones. Describe the meaning of any abbreviations at the declaration point.
- I am against abbreviations for team work. Stuff like "po" can be "public office", "public operations", "personal orders", "pretty old", etc. In my experience abbreviations lead to unmaintainable code.
- I don't think it's a problem if they follow the describe-at-declaration rule. It is a matter of enforcing DAD.
- Please explain this... what could be better to explain something than "not using abbreviations"? If I need to explain the abbreviation, then why not just "not use abbreviations"? (The text saved in the abbreviations is wasted in their description.)
- Somewhere there is another topic for this that avoids my memory right now. I find long names make code harder to follow in many cases and can also make formatting difficult because lines get too long to print or fit on the screen. True, it may be a personal preference. For my eyeballs personally, heavy use of long names makes code harder to read, risking misinterpretation and slowing one down. Using your example, if "public office" is very common in the domain, then don't be afraid of abbreviating it to "po". If it's "kind of" common, then consider "pubOffc". If it's rare, use the full name.
- Sets are often better than trees and graphs for managing "variations-on-a-theme" types of problems for non-trivial and non-local variations. Manage the sets via many-to-many tables and implement the features via IF statements where appropriate. (The relationship between features and IF statements will often not be one-to-one, but this is expected. This lack of 1-to-1 is one reason why the OO strategy pattern does not work well, along with its difficulty in reporting on.)
- {This is partly related to the "category manager" item below.}
- If many to many is so easy in the relational model... why do we always need an intermediate table? why isn't there a standard way to change multiplicity at the database level?
- Does any paradigm make the transition in multiplicity well? Changes in quantity-of-relationship fudge up or create big work in every paradigm/practice I've seen.
- If every paradigm makes it hard... what is the justification for saying that "Sets are often better than trees and graphs"?
- Perhaps I am not understanding the original complaint. Many-to-many tables are not that hard to create in most cases. Sets are more flexible than trees generally. If you err toward sets when a tree would have been fine, you can still do what the tree can, but with a bit more management complexity of the relationships. If you err toward trees and later need sets as requirements change, you have a relatively big overall job. Thus, it's safer to err toward sets. If you have tools, sample code, techniques, and experience with sets, then they are not as daunting as they may first seem.
- Consider a ConstantTable. It could also perhaps be used to keep the "feature list" for AccessControlLists if you believe you need fairly extensive security control.
- If dealing with input data or structures that are not reasonably close to the ideal for the specific task at hand, then if practical, consider converting it closer to the ideal form first, perhaps via an intermediate structure. Try not to mix heavy parsing or converting with processing. This is somewhat related to SeparateIoFromCalculation.
- Do the hardest or least-certain parts of the project first. For example, produce the most difficult screens, batch tasks, and reports first. This will do two things. First, it will provide a sample and style to use for the simpler parts. Scaling down in complexity (mirroring style) is usually easier than scaling up. Second, the difficult parts are more likely to expose any fundamental design flaws or change-needs earlier in the process. The earlier you discover the "gotchas", the better. See also WorstThingsFirst.
- Don't be afraid to make simplified copies of data (tables) for special needs. For example, marketing often needs only a subset of customer activity data. Querying through the data in raw form each time for every marketing study and campaign can be tedious and resource-intensive. Create an infrastructure for nightly or periodic batch jobs that create simplified tables and views for specific user types or departments. Almost any medium-large and large company will have a need for such an arrangement. The goal is not to eliminate the need to dig into raw data, but get the low-hanging fruit out of the way. [EditHint: ponder merging with the "intermediate structure" item above.]
- Let power-users manage category codes, not programmers, especially in larger companies with lots of products, services, and sales campaigns (coupons, etc.). It might be good job-security for a programmer to manage category codes (or conditional/polymorphic equivalents), but it is not a good use of resources. I've seen a lot of wasted time in this area. Create a nice set of category-editing tools up front. A variation of this is a "feature list". (Many-to-many tables are an effective way to implement these.)
- Use user-managed category codes (above) when possible instead of explicit columns. If you've already committed to building a category management system (above), then you might as well take advantage of it. This will reduce the need to add table columns in the future. (Some argue this slows down queries, but this is offset by the "simplified copy" rule above.) I agree that it's usually good to have some high-level category columns in the main tables, but these should not be a dumping ground for nitty-gritty or department-specific categories or classifications. For example, a customer service visit may be classified as one of: Sales, Installation, Service, or Other. In practice it's probably possible to mix these, but managers still like a single "primary classification". More detailed categories can be added (and mixed if such combos are allowed) using the category coding system mentioned. This technique was not used much in the past because it can be a performance hit. However, machines can usually handle it in a typical business setting these days and it's a great way to offload much of the domain complexity onto power-users instead of programmers. (Related: CompilingVersusMetaDataAid)
- Warnings are your friend. Don't flag everything as an error if there is uncertainty. Create a robust set of warning tools or conventions such that warnings can be stored, tracked, and logged if/as needed. The all-or-nothing approach of hard errors is often too inflexible. I've seen people waste days trying to perfect convoluted validation rules. Instead, make a warning and let a power-user or administrator inspect it further to determine if it needs fixing, perhaps with an inspection "check-off" (work-flow-like) system in some cases. But, avoid the temptation to over-complicate such work-flow with bells and whistles unless it's actually a domain requirement.
- Allow a description or note field/column in almost all non-trivial entities. Often the user needs to put extra info that a given form may not provide for. It can also serve as an interim field before formal fields can be created when there's an immediate change in domain environment. Similarly, place an "internal note" or "techie note" column in reference tables so that developers and DBAs can document info about rows, codes, etc. that one may otherwise forget. Think of it as a table-oriented code comment.
- Get the customer to sign-off on a features priority list. A priority list allows one to allocate resources better. Most likely deadlines will loom and it's best to tackle the most important items first. Even if the customer doesn't want to officially sign such a list, give them a draft and state something like, "Our development team will use these priorities to allocate and prioritize resources as the default. Please let us know if you want to change the ordering". This is a weak form of "sign-off". If they later come back and say, "Why didn't you get to feature X? We really need X!", then just bring up that email and say, "I didn't hear otherwise, we had to make choices because we can't do everything first." (CC as many people as possible to leave evidence that you sent it. Even if the customer never received it due to some glitch or denial, at least you have sufficient evidence of your good faith effort to communicate priorities. That's CoverYourAss in action.) And try to be diplomatic when going about this. It can be an interaction minefield if not done with care.
- Build and vet APIs up front for common services, such as security, database access, error/warning messages (with optional halting & logging), etc.
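To make the "lots of similar case-statement conditions suggest a table" item a bit more concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is only an illustration, not anybody's actual framework: the customer_type table, its columns, and the function names are all made up for the example. The idea is that facts about a domain noun live in one table (with a "techie note" column, per the note-field item), and each task looks them up instead of repeating the same case list.

```python
import sqlite3

# Hypothetical reference table (names invented for this sketch): one row per
# customer type, holding facts that would otherwise be repeated in
# case statements scattered through the code.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_type ("
    " code TEXT PRIMARY KEY,"
    " discount_pct REAL NOT NULL,"
    " needs_approval INTEGER NOT NULL,"
    " tech_note TEXT)"  # 'techie note' column for developers and DBAs
)
conn.executemany(
    "INSERT INTO customer_type VALUES (?, ?, ?, ?)",
    [
        ("RETAIL", 0.0, 0, "default walk-in customer"),
        ("WHOLESALE", 12.5, 0, "volume buyers"),
        ("GOVT", 5.0, 1, "orders need sign-off"),
    ],
)


def type_info(code):
    """Return the attributes for one customer type, or None if unknown."""
    row = conn.execute(
        "SELECT discount_pct, needs_approval FROM customer_type WHERE code = ?",
        (code,),
    ).fetchone()
    if row is None:
        return None
    return {"discount_pct": row[0], "needs_approval": bool(row[1])}


def price_after_discount(code, amount):
    """Pricing task: reads the table instead of a case statement on 'code'."""
    info = type_info(code)
    if info is None:
        raise ValueError(f"unknown customer type: {code}")
    return round(amount * (1 - info["discount_pct"] / 100), 2)


def needs_approval(code):
    """Order-entry task: same table, different column, no second case list."""
    info = type_info(code)
    if info is None:
        raise ValueError(f"unknown customer type: {code}")
    return info["needs_approval"]
```

Adding a new customer type is then an INSERT rather than a hunt through every case statement. Whether that beats polymorphism or a StrategyPattern is the disagreement already aired above; the sketch only shows what the table-driven side looks like.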
Dealing with Cutting Edge
A Notion: Don't stop trying to do it because someone or some rule says "You can't get there from here" - Even if you have no idea how you will reach your destination or way-point, take a step that brings you closer!
Whilst that might be reasonable general advice -- though I'd suggest hiring in an expert to progress things when you get stalled, rather than fumbling aimlessly -- I'm not sure it has anything in particular to do with building software that works and is easy to maintain.
Maybe it doesn't, but when one is a programmer, while in the midst of an effort of programming, seeking a way to discover new ways to do things, the dependence on the "state of the art", rules, and experts to make a dream come true is not progress, but rather submission to "this is the way to do it" thinking. To make good art, one does not copy and paste structures and methods, one rather finds a way to enhance and express, create and actuate that work not only well, but excellently. If one is simply finding another way to count money or store information, to edit some text, or to communicate with one's peers, reliance on what already works well or dependence on others' expertise is certainly called for. But when you want to go where they say "You can't get there from here", you have to bravely and intelligently seek a way to get there that others can follow. You cannot get to new places following old paths, you must make your own path.
{I never say "can't be done", but rather give an estimate of how long it will take to complete, or if too new or "far out", then give an estimated amount of R & D time before the options are known. Generally one should produce a proof-of-concept before going full-bore into an app that depends on the "new thing" to work. Work out the big mysteries first before planning the details surrounding the thing. One shouldn't even start planning the details if there is a big fat question mark hovering around the project regarding new key technology. Don't start writing a FlyingCar Tracking System until you know that the car actually flies. The above context is where those involved generally know the domain and are not in Columbus Mode. Managing R&D is generally another topic. Note that there is already a rule above called "Do the hardest or least-certain parts of the project first." -t}
The tracking system already exists; one has but to put the geolocational component in it when it becomes available. Often people are found reinventing systems to do something that has already been done. What I like to do is keep track, not of flying cars, but of useful systems.
I do not provide estimates of how long, since what I am doing is not funded by anyone else and is not a project someone else owns. When done, what I do that can not be done may become useful for others, but for now I build what I build as a Lone Programmer and make it useful as another tool in doing more of the same. Thus far my storage requirements have exceeded several terabytes. All of which I continually review and refine. The Ten Terabyte System I spoke of years ago will soon become a normal minimum. What I am doing is making all that stuff into information which is UsefulUsableUsed.
The above generally assumes team-based projects.
CategoryMethodology, CategoryBusinessDomain, CategoryModelingLawsAndPrinciples, CategoryProcess
JulyZeroSeven