Software Design For Testing


Electrical engineers have a concept of "design for X": design for manufacturing, for maintenance, for test. Design the (sub)system, keeping in mind the things you'll need to do later. (Cope, back me up here?)

So, isn't it true that SoftwareDesignForTestingViolatesYagni?

(Later: I've started capturing some of these under TestEnvironments.)

The discussion of ExtremeProgramming and ExtremeTesting made me think of this. (So did the disastrous upgrade we just made to one of our servers.) Software modules are often so dependent on other modules that when you test module A, it's hard to even know whether you should look for the results in module A or B or Z.

FunctionalProgramming is one solution; every software module is just a computation on its inputs to produce an output. I'm skeptical this approach works well with a GUI, or a file system, or a system that (logically) moves money from one account to another. (Even if FunctionalProgramming can scale up this way, most of us deal with systems that weren't built this way. Non-functional, if you will. :-)

Another approach is a sort of hierarchical design. Ensure there is a strictly acyclic "depends-on" graph among all your modules. Then test them all bottom-up. (Shlaer and Mellor espouse this, but not explicitly for ease of testing.)

Anyway, either you use a design approach that happens to give you easy-to-test systems, or you add "software design for testing" to your design approach, or you have problems. (You have particular problems when testing on-line systems, where creating a TestEnvironment? is expensive, hard, or both.) ExtremeProgramming sounds as if it doesn't have that particular kind of problem, because everything gets tested early and often. How's that managed?

-- PaulChisholm


"How's that managed?"

Every class must have UnitTests. Ideally these tests are written before the code that supports them. (Usually one feature at a time.) Code cannot be released unless all unit tests in the system run at 100%. Code must be released approximately daily. Whenever something breaks in the real system, unit tests (and FunctionalTests) must be enhanced to catch that error and any other errors it makes you think of.

You just do it. Every time. --RonJeffries (see also TestingFrameworks)

In other words, writing tests ahead of time inherently includes "design for testability."
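
For example, a minimal sketch of what such a test-first unit test might look like, assuming a JUnit-style TestingFramework; the Account class and its methods are hypothetical names used only for illustration:

 import org.junit.Test;
 import static org.junit.Assert.assertEquals;

 // The test is written first; the class is then coded until it passes.
 public class AccountTest {
     @Test
     public void depositIncreasesBalance() {
         Account account = new Account();
         account.deposit(100);
         assertEquals(100, account.getBalance());
     }
 }

 // Written afterwards, just enough to make the test pass.
 class Account {
     private int balance = 0;
     void deposit(int amount) { balance += amount; }
     int getBalance() { return balance; }
 }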


I think a substantial fraction of my bugs are due to errors in the tests. By "bugs" here I mean assertion failures that happen during early testing. Sometimes it's a trivial assert which I thought should always be true but isn't, but as you get into more and more elaborate cross-checks and consistency checks, the checking code grows until working on it takes up a huge amount of time.

This isn't a bad thing; it's just what I find happens. 95% of the checks never fail - just as 50% of advertising spend is wasted. If only we knew in advance which part was needed. I am still surprised every time an assertion fails because I never expected that to happen.

-- DaveHarris


As a point of comparison, our tests aren't much like assertions as I understand the concept. Our tests are more like: if you put these 10 things into a Set, the number of things in the set should be 5. In other words, they check whether the class works, not whether it is being used correctly.

And our testing code grows linearly with real code - if you implement a feature, you implement N tests for it. 1 <= N <= many. We rarely if ever "work" on it. Once a test is in place, there is rarely any need to change it.

I'd like to understand the difference from what you (Dave) describe - it is as if you are writing some different kind of test, one that requires geometric volume and lots of maintenance.

--RonJeffries
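
A minimal sketch of that sort of test, again assuming JUnit; the particular values are invented:

 import java.util.HashSet;
 import java.util.Set;
 import org.junit.Test;
 import static org.junit.Assert.assertEquals;

 public class SetTest {
     @Test
     public void duplicatesCollapse() {
         Set<Integer> set = new HashSet<Integer>();
         // Ten additions, but only five distinct values.
         int[] values = {1, 2, 3, 4, 5, 1, 2, 3, 4, 5};
         for (int v : values) {
             set.add(v);
         }
         assertEquals(5, set.size());
     }
 }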

You put 10 things into a Set, and then you check that it contains 5. That final check is what I called an assertion. Making sure the state of the system is what you expected. You are placing more emphasis on generating test data, which gives you more opportunities for assertions because you have more knowledge of what is going on, but I'd still call them assertions.

Do you perform checks on live data too, or only test data? Do you have PreCondition(s), PostCondition(s) and ClassInvariant(s) that are verified as the code runs, independent of whether it was a UnitTest that generated the data?

-- DaveHarris
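
To make the PreCondition / PostCondition / ClassInvariant question concrete, here is a hedged sketch with plain Java assert statements standing in for a full DesignByContract facility; BoundedStack and its methods are hypothetical names:

 // These checks run on live data as well as test data whenever the
 // JVM is started with assertions enabled (-ea). A sketch only.
 public class BoundedStack {
     private final int[] items;
     private int size;

     public BoundedStack(int capacity) {
         items = new int[capacity];
         size = 0;
         assert invariant();
     }

     public void push(int value) {
         assert size < items.length : "precondition: stack not full";
         int oldSize = size;
         items[size++] = value;
         assert size == oldSize + 1 : "postcondition: size grew by one";
         assert invariant();
     }

     // ClassInvariant: size always stays within bounds.
     private boolean invariant() {
         return size >= 0 && size <= items.length;
     }
 }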


Generally speaking, one line of code has as good a chance of containing a defect as any other. It follows that filling one's program with, say, as much check code as working code doubles the chance of defects. I think I'm serious about this. What say y'all? --RonJeffries

I usually find most bugs aren't due to mistyped lines of code; they are due to a bad sequence of code (or a missing sequence of code). Given that test code often involves simpler sequences than the code under test, it is much less likely to contain bugs. -- WayneMack

Yes, that's what I was referring to. I was deliberately vague about the amount of extra work, but didn't mean to imply geometric growth.

An example from one of the Microsoft books: the Excel spreadsheet has two independent calculation engines, a fast one and a simple one. During testing they run both and require them to give the same answer. The simple engine might have as much code as the fast engine, or more, and needs to be updated whenever the fast engine is extended. (The final product just uses the fast engine.)

This is not quite the same as having a UnitTest which sends "2+2" and expects the answer "4" back, because it's more general; it applies to live data where the answer isn't known in advance. The UnitTest does most of its job just by picking test data that exercises every path; the engines' own internal code does most of the actual checking.

-- DaveHarris
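
A hedged sketch of that cross-checking arrangement; CalcEngine and CheckedEngine are invented names, and the point is simply that the fast engine is verified against the simple one on whatever data flows through:

 // Two independent implementations of the same calculation run side
 // by side during testing; any disagreement flags a defect in one of
 // them. Only the fast engine ships.
 interface CalcEngine {
     double evaluate(String expression);
 }

 class CheckedEngine implements CalcEngine {
     private final CalcEngine fast;    // optimized engine that ships
     private final CalcEngine simple;  // slow, straightforward oracle

     CheckedEngine(CalcEngine fast, CalcEngine simple) {
         this.fast = fast;
         this.simple = simple;
     }

     public double evaluate(String expression) {
         double fastResult = fast.evaluate(expression);
         double simpleResult = simple.evaluate(expression);
         // Live data: the expected answer isn't known in advance,
         // so the two engines check each other.
         assert Math.abs(fastResult - simpleResult) < 1e-9
             : "engines disagree on: " + expression;
         return fastResult;
     }
 }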


Please see DesignByContract for one critical aspect of designing software for test. A method or class contract specifies what the method or class should do. If you don't have this clearly specified, how can you "white box" test it? And if you don't have this clearly specified, how can the class be subclassed correctly, and how can that subclass be tested? -- PatrickLogan


UnitTests are one way to specify a class's contract; assertions (i.e. preconditions etc) are another way. The first is "by example", the second is more declarative. In practice you need both.

However, I suppose someone philosophically in favour of dynamic type checking might prefer UnitTests, and someone who favours static type checking might prefer assertions. -- DaveHarris


I wholeheartedly agree. I use DesignByContract (asserts) as sanity checks. I'd rather have a single test at the top of a method saying "I can't work if this reference is null" than if (myString != null) scattered throughout the code.

My unit tests, I suppose, are a bit like Postconditions. They're black box checks that ensure my class is doing what I think it should be. Like you, sometimes my asserts or UnitTest(s) fire when I think they shouldn't. This usually highlights a false assumption that I've made, rather than a bug in the code...

-- AlanFrancis
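
A small hedged illustration of that style; TextUtils and countWords are hypothetical names, and the single assertion at the top replaces null checks scattered through the body:

 public class TextUtils {
     public int countWords(String text) {
         // "I can't work if this reference is null."
         assert text != null : "precondition: text must not be null";
         String trimmed = text.trim();
         return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
     }
 }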


One of the ways I develop differently when I am testing is that I am careful to make instantiating families of objects as simple as possible. I recently profiled one of the LifeTech unit tests. It requires three special builder classes (used only for testing) and generates almost a megabyte of objects for each test run. This is huge, but it's already considerably simpler than it was six months ago.

Writing unit tests creates a force for teasing apart communities of highly interconnected objects, making parts of the logic more autonomous. If you go into the writing of a unit test with the possibility of a SimplifyingRefactoring? in mind, you get yet another reason to gently improve the design. I never really considered this effect before, but I think it is nearly as important as the SimplifyingRefactoring? that happens before adding functionality.

-- KentBeck
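
A hedged sketch of the kind of test-only builder classes mentioned above; Customer, Order, and TestOrderBuilder are invented names:

 // Tiny stand-in domain classes, for illustration only.
 class Customer {
     final String name;
     Customer(String name) { this.name = name; }
 }

 class Order {
     final Customer customer;
     final int quantity;
     Order(Customer customer, int quantity) {
         this.customer = customer;
         this.quantity = quantity;
     }
 }

 // A builder used only by tests, so that assembling a small family
 // of related objects stays a one-liner in each test, e.g.
 // new TestOrderBuilder().forCustomer("alice").withQuantity(3).build()
 class TestOrderBuilder {
     private String customerName = "default customer";
     private int quantity = 1;

     TestOrderBuilder forCustomer(String name) {
         this.customerName = name;
         return this;
     }

     TestOrderBuilder withQuantity(int quantity) {
         this.quantity = quantity;
         return this;
     }

     Order build() {
         return new Order(new Customer(customerName), quantity);
     }
 }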


I've recently started writing some unit tests for classes which already exist. While it seems that the primary benefit comes from writing the test before the class, I am seeing some benefit in that I am forced to confront some coupling issues that I hadn't noticed before. Migrating the system to unit testing is helping me notice areas to refactor. -- MichaelFeathers


Just as one can manage up, down, or across, one can have assertions that test upwards, downwards, or sideways:

Up
Am I being called with the proper parameters / protocols? Helps find problems in client code, and perhaps undocumented interface assumptions.

Down
Did the subsystem I called return reasonable results? Helps find integration problems, and gives you 'for free' a new test case for the underlying library. This is a bit inefficient because you write N test cases for a library when one perfect one would do.

Across
Am I in a valid state? Is the result I just calculated reasonable? Finds problems in the module itself, and takes the most advantage of white-box information, but at the same time any faulty assumptions in the design are less likely to be caught by this type of test.

I seem to find it useful to think about which direction a particular assertion tests. Most seem to be across, many are up, and a few are down. -- MartinPool
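
One hypothetical illustration of the three directions inside a single method; InterestCalculator and RateTable are invented names:

 interface RateTable {
     double monthlyFactor(double annualRate);
 }

 public class InterestCalculator {
     public double monthlyInterest(double balance, double annualRate, RateTable table) {
         // Up: am I being called with the proper parameters?
         assert balance >= 0 : "caller passed a negative balance";
         assert annualRate >= 0 && annualRate < 1 : "rate should be a fraction";

         // Down: did the subsystem I called return reasonable results?
         double factor = table.monthlyFactor(annualRate);
         assert factor >= 0 && factor < 1 : "rate table returned a nonsensical factor";

         // Across: is the result I just calculated reasonable?
         double interest = balance * factor;
         assert interest <= balance : "plausibility check on my own result";
         return interest;
     }
 }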


Does anybody want to talk about Design for Testability at the system or subsystem level? -- NickBishop

I have strong suspicions that I could generalize Design for Testability to DesignForDevelopment. Really soon now :) -- PavelPerikov


When a system consists of several processes communicating with one another, it seems sensible to test each process on its own, but some system architectures make it exceptionally hard. One of the hardest things to test is a C++ process that connects via Pro*C to a database that (rightly) has many constraints. You wind up testing your process by hooking the WHOLE system together and using the GUI!

My initial thought is to ask the DBA to disable the constraints, since you already know which data to feed into which tables of the database.

My 2 cents on the issue. -- NickBishop

Some quick suggestions (we have a similar environment).

  1. Write a separate test interface. This can either be a different stand-alone project that shares source files, except for the GUI, or a special run mode, e.g., start the program with a command-line /test switch or have a semi-hidden test screen in the main program.
  2. Initially, you can capture the database constraint violations when you write to the database as part of your tests. Later, add up-front methods to catch these constraints prior to the write to the database or, even better, prevent the user from entering invalid data.
  3. Encapsulate your database accesses into classes and write stub classes to replace them (see the sketch below this list). These can be either separate project builds or done with conditional compilation and compiler switches. Take care to always do a full rebuild for production code if you choose the latter.
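
A hedged sketch of suggestion 3, shown in Java for brevity even though the original context is C++ with Pro*C; AccountStore, InMemoryAccountStore, and TransferService are invented names:

 import java.util.HashMap;
 import java.util.Map;

 // Database access hidden behind an interface; a production class
 // would wrap the real Pro*C / SQL calls.
 interface AccountStore {
     int balanceOf(String accountId);
     void setBalance(String accountId, int balance);
 }

 // Test stub: keeps everything in memory, so the class under test
 // never needs the real database, its constraints, or the GUI.
 class InMemoryAccountStore implements AccountStore {
     private final Map<String, Integer> balances = new HashMap<String, Integer>();

     public int balanceOf(String accountId) {
         Integer balance = balances.get(accountId);
         return balance == null ? 0 : balance;
     }

     public void setBalance(String accountId, int balance) {
         balances.put(accountId, balance);
     }
 }

 // The business logic depends only on the interface, so it can be
 // unit tested against the stub without wiring up the whole system.
 class TransferService {
     private final AccountStore store;

     TransferService(AccountStore store) {
         this.store = store;
     }

     void transfer(String from, String to, int amount) {
         store.setBalance(from, store.balanceOf(from) - amount);
         store.setBalance(to, store.balanceOf(to) + amount);
     }
 }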


CategoryTesting

