Unit Testing Question

BrianMarick asks:

One thing that struck me about XP: In my book, The Craft of Software Testing, I advocate designing "unit" tests by thinking about the code but implementing them at a conveniently stable interface. That is, I might look at the code for a particular class to figure out what inputs need to be tried against it. But instead of writing a class driver that exercises the class directly, I might implement the test from the user interface by figuring out what values to give there to cause the target class's methods to be invoked the way I want.

Why be indirect?

(Note: the user interface isn't necessarily the most stable interface. That's just an example.)

XP is a process of constant perfective maintenance (refactoring). Hence you must have figured out a different way to deal with the maintenance problem than mine. Do you have ways of reducing the maintenance burden of UnitTests broken by changes to the code they test (or other code they use)?

-- Brian Marick, Testing Foundations (a training and consulting company) mailto:marick@testing.com, http://www.stlabs.com/marick/root.htm


Brian, you must mean something quite different by the term UnitTests than XP does. We should explore further, but your tests sound like what we would call AcceptanceTests.

To us, each class must be supported by UnitTests. You must test anything that could possibly not work. Since most classes have nothing to do with GUI, the UnitTests for most classes can't possibly be GUI-oriented tests.

Whenever any developer releases code to the configuration (which they generally do at least daily), all UnitTests must run at 100%. Not my UnitTests, all UnitTests. We never release code without running all the UnitTests, so they never stop getting updated. Used well, UnitTests are not a "maintenance burden": they make development go faster, not slower.

Here's an example: yesterday, one of the developers called me over to show me some code that was working, but that she thought was ugly. I helped with the refactoring from two angles: on the detail side, we reworked a bunch of boolean logic to make it clearer; on the conceptual side, we renamed and rearranged the methods. As we made these changes, which were very substantial, we stopped every few minutes and ran the tests for the class. When from time to time we screwed up the logic or failed to make the method names consistent, the tests failed. Since we had only made a change or two, we knew just what was wrong and fixed it immediately. In half an hour or so we had restructured a very ugly and hard-to-maintain part of the system into a clean and clear one. There's no way we could have been sure our changes were correct without those UnitTests.

Of course, sometimes we need to replace a class or radically change one. If it's a replacement, the new class of course has UnitTests. And it probably has the same interface as the old class, so maybe the old UnitTests will be just fine.

If the old class is removed, its UnitTests won't run. We "discover" this fact, and remove the tests as redundant. Sometimes some other tests won't run, usually because the replacement class doesn't quite work as advertised. We fix the class or the test, as appropriate.

ChryslerComprehensiveCompensation has about 1300 model classes and 25,000 methods. It has around 750 UnitTesting classes and, I forget the exact number, around 15,000 testing methods. All those tests run at 100% all the time, and they have run at 100% every day for the last 2 1/2 years. We couldn't live without our tests: they let us go fast, they let us change anything that needs it with impunity, and most important of all, they tell us when we're done!

One more thing ... you mention "designing" UnitTests by looking at the code. The best XP programmers write their UnitTests before they write the code. It's amazing how much we can learn about how the code should be written by using it in the tests. And of course we get a much less biased test if we haven't already fallen in love with the code.

More questions, please ... -- RonJeffries


"Now I have to make the system handle a co-signer in a different country than the customer. In that case the risk premium should be 14.3412 marks." - This is the moment at which you write a UnitTest. You don't write them from the code, you write them before the code.

The discipline to do this is the hardest thing I have done in programming for years. I love jumping to the implementation. But when I don't, when I write the test first, I code faster, cleaner, simpler, and with lots less stress.

Try it, just once, and tell us your experience here (this addressed to all WikiDom). -- KentBeck


I've been practicing JustInTimeProgramming with UnitTests on and off for a few months. In the beginning, it was hard to justify. I felt like I was wasting time. But then I noticed a few things. A couple of times I would say to myself, "Well, this is obviously going to work when I code it. Why write a test?" Then I'd capitulate, write the test, then the code, and discover that the test fails. Another scenario: I'd get one test working and several others would fail. I got immediate feedback. The first couple of times that happened, it was shocking. Then it felt wonderful, because I realized that JustInTimeProgramming was leading me to higher quality code, faster.

I also notice that the code I am writing now is pretty trim. When I write a test, I am working on the basis of what I need, not what would be cool to have. On the other hand, I don't know whether this is a pure side effect of UnitTesting or a result of picking up the XP mindset.

Another fun thing: when I first started using UnitTests, I got part of the way through the development of two classes when I discovered that some of the methods on one class really belonged on the other class. The first thing I did was move one test from the TestCase for the first class to the TestCase for the second, discover that it failed, and then move the methods that supported the test. I did that a couple of times until the change was complete. Then I sat back, amazed at how quickly it went and, more importantly, how sure I was that it was right.
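
A rough Python sketch of that move, with hypothetical Invoice and TaxCalculator classes standing in for the real pair. The test for the tax calculation is moved to TaxCalculatorTest first; it fails until the method follows it, which tells you exactly what has to move:

  import unittest

  # Hypothetical pair of classes; the tax calculation started life on
  # Invoice and was moved here once its test moved to TaxCalculatorTest.
  class TaxCalculator:
      def __init__(self, rate):
          self.rate = rate

      def tax_for(self, amount):
          return amount * self.rate

  class Invoice:
      def __init__(self, amount, calculator):
          self.amount = amount
          self.calculator = calculator

      def total(self):
          return self.amount + self.calculator.tax_for(self.amount)

  class TaxCalculatorTest(unittest.TestCase):
      # This test used to live in InvoiceTest. Moving it first made it
      # fail (TaxCalculator had no tax_for yet), pointing straight at
      # the method that had to follow it.
      def test_tax_for(self):
          self.assertAlmostEqual(TaxCalculator(0.1).tax_for(100), 10.0)

  class InvoiceTest(unittest.TestCase):
      def test_total_includes_tax(self):
          invoice = Invoice(100, TaxCalculator(0.1))
          self.assertAlmostEqual(invoice.total(), 110.0)

  if __name__ == "__main__":
      unittest.main()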

That is the big thing. With the tests hanging around, I feel much more comfortable. You always know what shape your code is in. Stress is reduced dramatically. I can leave my computer without nagging doubts. To knock off an old public service announcement: "It's 11PM, do you know what your code does?" Now I know. :-) I now have much less tolerance for code in an unknown state. To me, it seems like the single biggest reducible risk in development.

I wish I could use UnitTests in all my development, but the fact is, I do some machine control and graphical programming. It is possible to support some of this, but I think that it would be more at the level of acceptance tests than UnitTests.

-- MichaelFeathers

This posting makes me smile with pleasure. -- RonJeffries

I think the original question was somewhat misinterpreted. When he says testing at stable interfaces, I don't believe he means user interfaces, but rather stable things which are well-defined in the code. Now maybe there aren't any of those in XP, but there could be. This is similar to what DaveThomson? of OTI talks about in testing. You are better off testing a known, defined API (e.g. Common Widgets) than testing the internal code that implements them, since that code changes more, and both sets of tests exercise the same thing.

Maybe those are considered AcceptanceTests in XP?

Also, the fact that there are 'only' 750 UnitTest classes for 1300 classes suggests that the UnitTests aren't quite one-to-one with domain classes.

-- AlanKnight

Every API goes through PunctuatedEquilibrium (at least that's what I think evolutionary biologists call it). At first it is radically unstable. Then it settles down for a while. Then every once in a while it gets wonky. You could reduce the cost of writing tests by waiting to write them until that first calm period. What I discovered is that when you wait for the dust to settle, you reduce the benefit of testing far more than you reduce the cost. Put the other way around, you can make testing much more valuable to development, looking long term and short term, if you test very early and at a detailed level and pay the inevitable price in increased maintenance from time to time. Note also that you still have to pay to maintain tests written just after the code, you just avoid that first payment. Maybe.

And the unit test classes aren't one-to-one with domain classes. Every time you are about to write logic that could possibly break, you write a new test. Each test case class creates a unique TestFixture (a predictable configuration of objects). If you can make an existing fixture work, you do so, avoiding the need to create a new test class. So the number of test classes has more to do with the factoring of the TestFixture creation code than with any direct mapping to domain classes.
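
A rough sketch of that fixture idea in Python's unittest, with a hypothetical Employee class: several test methods share one predictable configuration built in setUp, and a new test class is only warranted when no existing fixture fits:

  import unittest

  # Hypothetical domain object standing in for a real model class.
  class Employee:
      def __init__(self, name, hourly_rate):
          self.name = name
          self.hourly_rate = hourly_rate

      def pay_for(self, hours):
          return self.hourly_rate * hours

  class StandardEmployeeFixtureTest(unittest.TestCase):
      # One predictable configuration of objects, shared by every test
      # method in this class. Tests that can reuse this fixture go here;
      # only a genuinely different configuration earns a new test class.
      def setUp(self):
          self.employee = Employee("Pat", hourly_rate=20)

      def test_regular_week(self):
          self.assertEqual(self.employee.pay_for(40), 800)

      def test_no_hours_no_pay(self):
          self.assertEqual(self.employee.pay_for(0), 0)

  if __name__ == "__main__":
      unittest.main()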

-- KentBeck


XP's approach to testing is orthodox. It's the way most testers think things should be done, and the way most testing books say things should be done. (See, for example, Hetzel's The Complete Guide to Software Testing, first published in '84.) The only thing odd about XP is that the orthodoxy actually gets put into practice. (Not a small thing!)

In particular, everyone in testing thinks you should write the tests before the code. I emphatically agree. Sorry that I (BrianMarick) wrote sloppily.

But I vary the orthodoxy when it comes to UnitTests. I make a strong distinction between test design and test implementation. Test design is asking what inputs should be used to give reasonable confidence that (1) the code, when written, works as intended and that (2) behavior-preserving changes really do preserve behavior. So you're thinking about what inputs should get stuffed into the interface you care about. In the case of a programmer, that interface is a class or method boundary.

But once you've picked the inputs, it's time to implement the test. How do you get inputs to that interface? The obvious way is to create the inputs and stuff them directly into the interface you're testing. That's what people usually mean by "UnitTesting".

But test implementation has different goals than test design. You're no longer trying to figure out what the right inputs are. You're trying to use those inputs as cheaply as possible (both today and next year). The obvious way might not be cheapest.

For example, suppose a certain method is supposed to prevent creation of records with a duplicate name field. Here's one test:

  1. create a record with name "foo".
  2. create a record with name "foo". (All other fields different.)
  3. expect the second creation to fail.

Here's another:
  1. create a record with name "foo".
  2. create a record with name "food". (All other fields *identical*.)
  3. expect the second creation to succeed.

Suppose there are three obvious ways to implement this test.
  1. I could build all the contents of the fields and call the create method directly. (Conventional UnitTesting.)
  2. I might notice that someone's already written a handy utilityCreate method that does a lot of the normal field-building work for me, then calls the create method. So I can use a different interface to test the create method indirectly.
  3. If there's a convenient command-line interface, I could use it and not write any code at all.

Does it matter which I choose?

Well, it might. Maybe utilityCreate redundantly checks for duplicates. (Yes, you should refactor that out, but you haven't noticed it yet.) In that case, versions 2 and 3 don't really test what they're intended to test.

On the other hand, version 1 has you doing more work, which costs money. (You may end up redoing some of utilityCreate, which ought to grate on a refactoring-sensitive person.) Moreover, it's more sensitive to change. Suppose you have 36 tests that create records, together with their associated fields. If something about one of the fields changes, you may have to fix up all 36 tests. In version 2, utilityCreate might protect you from the change: no test need change. That saves money while still getting the benefit of UnitTesting.
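
For concreteness, here's a rough Python sketch of versions 1 and 2. The record store, its create method, and utility_create are all invented for illustration: version 1 builds every field and calls create directly; version 2 routes through a utility that fills in the routine fields, so a change to those fields touches one helper instead of every test:

  import unittest

  # Hypothetical record store with the duplicate-name rule under test.
  class RecordStore:
      def __init__(self):
          self.records = []

      def create(self, name, owner, notes):
          if any(r["name"] == name for r in self.records):
              return False  # duplicate name rejected
          self.records.append({"name": name, "owner": owner, "notes": notes})
          return True

  def utility_create(store, name, **overrides):
      # Fills in routine fields so callers only specify what matters to them.
      fields = {"owner": "nobody", "notes": ""}
      fields.update(overrides)
      return store.create(name, **fields)

  class DuplicateNameTest(unittest.TestCase):
      def test_duplicate_name_rejected_version_1(self):
          # Version 1: build all the fields and call create directly.
          store = RecordStore()
          self.assertTrue(store.create("foo", owner="alice", notes="first"))
          self.assertFalse(store.create("foo", owner="bob", notes="second"))

      def test_different_name_accepted_version_2(self):
          # Version 2: go through the utility; if the field-building
          # changes, only utility_create changes, not the 36 tests.
          store = RecordStore()
          self.assertTrue(utility_create(store, "foo"))
          self.assertTrue(utility_create(store, "food"))

  if __name__ == "__main__":
      unittest.main()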

(There are other things to think about; see my book.)

My experience is that conventional per-class UnitTests are unmaintainable. No, let me qualify that: my experience is that they're unmaintainED. Given that the tests die soon, and incur such a large maintenance cost while they live, I wonder if they're worth the money. A more change-resistant implementation approach seems better. (But design stays the same.)

Kent points out that APIs are also unstable. My point is that larger-grain APIs are less unstable than class APIs. If you have to bite the bullet of updating tests, maybe that's where to do it.

So, my question is whether you, in XP, make these kinds of implementation tradeoffs? Or do you always test directly at the class interface? If so, how do you avoid maintenance problems? Could you get the same effect with lower cost?

KentBeck: could you explain more about TestFixtures? We may be getting at the same thing. -- BrianMarick

"Given that the tests die soon..." Is that a given?

Not in my experience. Most of the tests are stable for months if not longer now. Earlier on, when the system was evolving, they changed more rapidly. But that's when you need them all and when it pays to keep them up to date. Effective refactoring requires solid tests. -- RonJeffries


As with everything else, we apply YouArentGonnaNeedIt to testing. For example, we begin by just coding the tests (in our TestingFramework). Then, when there get to be redundancies in the code, duplicate setups and so on, we refactor (OnceAndOnlyOnce). The tests get cleaner, as do the real classes.

If the API changes, it changes. In Smalltalk (in particular) this is just not a problem. You get it to show you all senders of the old API, edit them to the new one, and you're done. Often, for testing purposes, you don't actually need to provide some new parameter that the API uses, so you can build a new method that defaults that parameter. It is legal to put methods in a class that are just there to facilitate testing. Whatever makes the code simplest.
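
The point above is about Smalltalk; as a rough Python transliteration of the same move (the Account class and its currency parameter are invented for illustration), a method that exists only to default the parameter the tests don't care about might look like this:

  import unittest

  # Hypothetical: the API grew a new 'currency' parameter. Production
  # callers pass it; tests that don't care use a defaulting helper.
  class Account:
      def __init__(self, balance, currency):
          self.balance = balance
          self.currency = currency

      @classmethod
      def for_testing(cls, balance=0):
          # A method that is just there to facilitate testing: it
          # defaults the parameter the tests don't need to vary.
          return cls(balance, currency="USD")

      def deposit(self, amount):
          self.balance += amount

  class AccountTest(unittest.TestCase):
      def test_deposit(self):
          account = Account.for_testing(balance=100)
          account.deposit(50)
          self.assertEqual(account.balance, 150)

  if __name__ == "__main__":
      unittest.main()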

ExtremeProgramming isn't afraid of change, and we do not defend ourselves against change. Instead, we embrace the fact of change, positioning ourselves to handle it immediately and readily. The results are that we implement what is really needed (not what they told us six months ago), and we build it to the best design we know today (not what we thought six months ago). I see upon reading that this sounds pretentious, but it's true. (I'll look for some pointers to XP's views on change and hook them in here. Colleagues, you can help with that as well.)

When you build your process around change, it turns out that lots of things go better ... and so far as we've found ... nothing goes worse. -- RonJeffries


SteveJorgensen asks:

Regarding unit isolation and general-purpose utilities...

Let's say I'm working on a project, and certain reusable utilities arise for string handling, collection handling, etc. They are astoundingly useful to me, and I use them idiomatically throughout my code. Unit testing philosophy says that I should test every unit in isolation, so taken to the extreme, that means I need to mock every one of these utilities when testing other code that uses it. Otherwise, a bug in the utility could result in failures of multiple tests.

The trouble is, when a general-purpose utility is ubiquitously used like this, it becomes very inconvenient to mock it everywhere, and furthermore, the fact that the utility is being used under the hood should be an implementation detail that the test should not have to be "aware" of in order to mock it.

What is the best practice in this case?

  1. Pragmatically allow tests of higher-level functionality to rely on internal use of utility code defined within the same project, without mocking?
  2. Move general-purpose utilities into a separate library project, all of whose tests must pass independently before it can be referenced from the business application?
  3. Other...?

We don't do UnitTests here - we do DeveloperTests. That means we don't work too hard to isolate the _unit_; we work to isolate the _developmental edit_.

If you have common utilities, they have their own low-level tests, and they are cross-tested when they support abilities that higher-level tests check for.

So instead of isolating any unit, you write your IncrementalTesting server to run all tests on all projects whenever the shared code changes. Basic SoftwareProductLines.
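
A small Python sketch of that layering (slugify and ReportNamer are made up for the example): the utility has its own low-level tests, and the higher-level test uses it unmocked, so the utility gets cross-tested for free and the last edit made is always the suspect when something breaks:

  import unittest

  # Hypothetical shared utility with its own low-level tests.
  def slugify(text):
      return "-".join(text.lower().split())

  class SlugifyTest(unittest.TestCase):
      def test_spaces_become_dashes(self):
          self.assertEqual(slugify("Quarterly Report"), "quarterly-report")

  # Hypothetical higher-level code that uses the utility under the hood.
  class ReportNamer:
      def filename_for(self, title):
          return slugify(title) + ".pdf"

  class ReportNamerTest(unittest.TestCase):
      # No mock of slugify: it is trusted because it has its own tests,
      # and it is exercised again here as a side effect.
      def test_filename(self):
          self.assertEqual(ReportNamer().filename_for("Quarterly Report"),
                           "quarterly-report.pdf")

  if __name__ == "__main__":
      unittest.main()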

I see. So it's not really a problem that breaking a utility can cause multiple tests to fail when doing TDD, because whatever changed last was the cause, right? I can buy that.

I would like to point out, though, that this answer clearly shouldn't be extended to all the local resources a "unit of work" interacts with. When, for instance, I've tried to test the functionality of a service without mocking the model(s), I've had test cases that overlap significantly with the model tests, require an excessive number of tests for adequate coverage, and tend to break when model-layer behaviors are changed.


CategoryTesting

