Unit Tests Require Perfect Developers

UnitTests require developer perfection in both their initial construction and their maintenance. Without static typing, you are requiring developers to write extensive and nearly perfect UnitTests. If that were possible, then UnitTests would be largely unnecessary. Static type checking formalizes, automates, and makes complete a great deal of checking. Some languages, like Eiffel, go even further with DesignByContract.

See StaticTypingRepelsElephants for more.


Regardless of the static typing issue, UnitTests are written line by line in direct collaboration with the features they test. Per CodeUnitTestFirst, never change a line of code without a failing test. Therefore, add just enough testage to ensure failure before adding the tested ability. This tests the tests (sketched below).

UnitTests are not another version of BigDesignUpFront. -- PhlIp
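A minimal sketch of the red/green cycle PhlIp describes, in JUnit-style Java. Person, getName(), and the name field are hypothetical example names, not code from this page:

 import junit.framework.TestCase;

 public class PersonTest extends TestCase {
     // Step 1 (red): write just enough test to fail.
     public void testGetName() {
         Person bob = new Person("Bob Smith");
         assertEquals("Bob Smith", bob.getName());  // fails until getName() exists and works
     }
 }

 class Person {
     private final String name;
     Person(String name) { this.name = name; }
     // Step 2 (green): add just enough production code to make the test pass.
     String getName() { return name; }
 }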


With or without static typing, you need to write good UnitTests [cf. CleanroomSoftwareEngineering]. Either way, though, you don't need to be perfect. Write the tests as best you can, and then beef them up every time you learn how. When you find a hole in the tests, close it up.

If a team of imperfect developers does this for X months (where the value of X depends on how quickly they learn, not how perfect they were at the beginning), I'm confident that they'll end up with good enough tests that they won't care about StaticTypeSafety anymore.

-- AdamSpitz

I don't think good UnitTests are good enough. And waiting for them to be perfected is very painful. I think even good UnitTests are an ambitious goal for a large percentage of developers, especially if they must replace all the checks a compiler performs. -- AnonymousDonor

You might be right, but static typing doesn't solve the problem. You need to write the tests anyway (or have some other method of knowing correctness other than the compiler). Does the compiler give you enough confidence that you'd be willing to ship an untested system? -- AdamSpitz

Actually, I don't find any difference in the kind or number of UnitTests I write in languages with or without static typing. If you've never worked in a dynamically typed language like Smalltalk, then you can make a number of assumptions about how things might work, but you'll probably be wrong. Personally, I think that it's a RedHerring. The two are not related. -- KyleBrown

Not Smalltalk, but I do use Perl extensively. If I don't care about catching certain errors, then I won't write the UnitTests. But to catch the "I do not understand" type of errors before runtime takes a very extensive set of tests. -- AnonymousDonor

Perhaps you're imagining that people working in dynamic languages are writing tests like:

 assert(Person.hasMethodNamed("getName"))

We're not. We just write tests like:

 assert(bob.getName() == "Bob Smith")

If that kind of test (which would be necessary in both Java and Smalltalk) passes, then we can be confident that the Person class has the getName() method. We don't need a compiler to confirm it.

-- AdamSpitz

Then your UnitTests are bad. They are assuming a level of functionality is working, that is, a name was successfully added. Personally I would prefer you have tests for everything that can fail, not "system" level tests that fail at a higher level. -- AnonymousDonor

Look at Adam's test again. What exactly is it testing at too high a level? Calling bob.getName() when the method has not been defined will throw an error, so that "compiler error" shows up anyway. If the result doesn't equal "Bob Smith", then a problem with setName() is revealed. Unit tests often manage to test more than what you see on the surface.
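To spell that out, a hypothetical sketch reusing the Person example names from above:

 Person bob = new Person("Bob Smith");
 assertEquals("Bob Smith", bob.getName());
 // If getName() doesn't exist, this fails with a NoSuchMethod /
 // "does not understand" error when the test runs: the same defect
 // a compiler would report. If the constructor or setName() is broken,
 // the assertion itself fails. One behavioral test, both checks.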


The original statement above is bizarre in the extreme. I can write terrible UnitTests (e.g. assertFalse(collection.elements().isEmpty()) when I really mean that the collection has one element "ABC"). Therefore, UnitTests don't require perfection.
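For instance, here is that weak assertion next to what was presumably meant. This is a sketch; collection is assumed here to be a java.util.List<String>, standing in for whatever collection type the original quote used:

 // The terrible test: passes for any non-empty collection, "ABC" or not.
 assertFalse(collection.isEmpty());

 // The intended test: exactly one element, and it is "ABC".
 assertEquals(1, collection.size());
 assertEquals("ABC", collection.get(0));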

[A supporting statement...] I'm sure I put as many bugs into the tests as in the production code. When a test fails, I look at it and determine if the test code or the production code is wrong. I fix the one that's wrong. -- JeffGrigg

Well, you're my hero. ;) I'm sure I write more bugs in my UnitTests than in the production code. Fortunately, I rarely write a UnitTest and production code wrong in the same way; in fact, I've only done that once. So, when the test fails, I have the occasion to step through the code to figure out what's wrong. This is the main advantage of writing tests immediately: they are a convenient place to set a breakpoint so you can step through the code.

But > 90% of the time, when a test fails, it's due to a bad test. That really annoys me. I wish I knew a way to reduce this ratio, but I've found that good testing is complex. -- SunirShah

Sunir, not that I know anything about your programming abilities, but personally I find that when my UnitTests are getting convoluted and difficult, it's because the classes they're testing are too complex and could stand to be refactored. UnitTests, while they ignore the implementation of classes, put a tremendous amount of pressure on their interfaces. (In fact, I find I write much better, more deliberate interfaces now that I'm using UnitTests.) -- FrancisHwang

You should see the UnitTests for this: http://research.bitflash.com/sdvg/palette [BrokenLink]. I'm not going to bloatify it by breaking it into a thousand methods and classes. That would make it slow and big. -- SunirShah

Yes and no. You create another class and that requires a whole new set of unit tests. Then you need a set of UnitTests to test the interaction of the new class and the existing class. To a large extent UnitTests are preserved, they just move to testing interactions of classes instead of individual classes. -- AnonymousDonor

I've observed the same thing that Sunir has, but with 98% of failures due to problems with the tests. And it's not because the tests themselves were difficult. One reason is that test cases receive much less care than the main code, since bugs have less impact. Changing functionality and environments break tests much more often than they break the deliverables. The rare occasions when tests reveal code defects do justify the tests, but tracking down the other test failures was one of the more annoying aspects of my last job.

Though the example Sunir links to does seem to be a special case ... writing graphics libraries for portable devices has a pretty high need for optimization, and I can see the argument for not breaking down some of the classes. My gut still tells me that in the general case -- most code doesn't need to be optimized -- difficulty in writing UnitTests may be a CodeSmell.

(I wonder if there's a page somewhere in this wiki about UnitTestsVsOptimization?.) -- FrancisHwang


I'm confident that they'll end up with good enough tests that they won't care about StaticTypeSafety anymore.

I'm confident that assertions are not nearly good enough in this regard. Do you happen to know of any serious research on it?


You might be right, but static typing doesn't solve the problem. You need to write the tests anyway (or have some other method of knowing correctness other than the compiler). Does the compiler give you enough confidence that you'd be willing to ship an untested system? -- AdamSpitz

No, but I'm not saying UnitTest is unnecessary if one has a compiler. I want both. -- AnonymousDonor

Hmmm... This statement seems to be self-contradictory: Suppose you have a compiler with strict type checking. You say you want both that and UnitTests. Given that you have a strict type checking compiler, why do you want UnitTests? Didn't you just say they're unnecessary? (Puzzled. -- JeffGrigg ;-)

It's not contradictory. The compiler catches one large category of errors. UnitTests catch another category. System tests check another. Acceptance tests catch another. And the field catches another. Making sure a system works requires a layered approach. If the question is whether UnitTests can replace a compiler, then the answer is clearly yes. But it is very difficult, and I don't think most people will do it correctly, just like the UnitTest in the earlier example. I simply don't trust developers to write that detailed a UnitTest for everything all the time. I have years of actual experience with real-life developers behind this opinion, not an ItCouldBeThisWayIfEveryOneBelievedAndBehaved? sort of approach. Too many will not do a good enough job. I would like to move even more checking into the tools and into the infrastructure, because I do not believe in the PerfectabilityOfDevelopers?. I still use locks on my car and house even though, if everyone behaved, I would not have to. In one place I never locked my car or house and it worked, but that's rare. Usually the world is more cruel and disappointing. -- AnonymousDonor

Suppose that without UnitTests, 10% of your typical programmer's code was buggy. Now suppose they write tests that cover 20% (at random) of your code. 10% of those tests are wrong too, from how you rated your coders originally. But now you have 90%-correct tests covering 20% of the code, including 20% of the buggy 10%. In other words, you have found 18% of the bugs that were in your production code (20% * 90%). A big win! Clearly there is much wrong with this argument, like code coverage being equated to coverage of 'things that need testing', and so on. But the point is that imperfect coders can use imperfect UnitTests to reduce their bug rate. To paraphrase a current ad: tools that automate testing are priceless, but for everything else, there's writing tests. -- BrianEwins
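A minimal sketch of that arithmetic in Java (the percentages are BrianEwins' illustrative figures, not measurements):

 double buggyFraction = 0.10;  // 10% of production code is buggy
 double coverage      = 0.20;  // tests cover 20% of the code, at random
 double testAccuracy  = 0.90;  // 90% of the tests are themselves correct
 double bugsFound     = coverage * testAccuracy;  // 0.18: 18% of the bugs get caught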

You will never find a bigger advocate of UnitTesting than me. It is a big win. Add to your calculations the improvement a compiler makes in the accuracy and completeness of tests, and you can see why a compiler is far better than a buggy human at writing this class of tests, especially because the compiler is good at the repetitive, boring checks that humans are not good at. I fail to see why it has to be one or the other and not both. It's clear to me a human will not do as good a job as a compiler. Let the humans spend their time on the part of the tests that relates to the semantics of the application, and now you have a good division of labor that should yield the best results. -- AnonymousDonor

This phrase confuses me: "the part of the tests that relate to the semantics of the application." All of my tests relate to the semantics of the application. There are zero tests that could be replaced by a compiler. Any test that ensures semantic correctness also ensures type correctness, so there is no need for separate type-correctness-tests. Can you give me an example of a test that a Smalltalk programmer would need to write that a Java programmer wouldn't? -- AdamSpitz

I think it depends on the completeness of your tests. To me, making sure all methods exist, making sure only existing methods are being called, having the correct parameters and types passed, and having the interfaces called correctly are not part of semantics. It's bookkeeping. Yet a complete set of UnitTests will need to test all these things. Of course you can hack around this with system-style tests, but I don't think that's a correct form of UnitTesting. -- AnonymousDonor

I don't know what to say that will help. Does anybody else know how to explain this better? -- AdamSpitz

I'll give it a shot. I used to be a big fan of ManifestTyping, for the same reason AnonymousDonor is. Once I started writing a lot of UnitTests and doing a lot of refactoring, though, I realized a few things. First, good UnitTests exercise the type system pretty thoroughly as a side effect. You don't need any extra tests. Second, manifest types make refactoring more difficult.

My conclusion is that ImplicitTyping is less work for the same quality, if you're already doing a lot of UnitTesting and refactoring. I haven't had the pleasure of testing this theory yet, but it matches the experiences of programmers I trust. -- JimLittle

It's not that I'm ill-informed and just need a better explanation. I do not agree with your position, for reasons which to me are solid and valid. -- AnonymousDonor


A few points:

I see all of these techniques as complementary, but UnitTesting as a core technique that you can apply when none of the others is available. And I would certainly argue that DBC+static types won't give you all the coverage you can get by adding some UnitTests. -- BrianEwins.


I think we've got two separate pages trying to live on this page. One of them is about whether imperfect developers can produce good enough UnitTests. The other is about whether a type-checker adds any value to a system that has UnitTests. (UnitTestsMakeTypeCheckersUnnecessary?, or some shorter name that I can't think of. :) The topics are related, because we're mainly trying to figure out whether imperfect UnitTests need a type-checker to supplement them, but I think things will be a lot clearer if we put them on separate pages. I'll do it myself, when I can find the time, but that probably won't be for a while. Maybe some kind soul will do it for me. :) -- AdamSpitz


I'll give it a shot. I used to be a big fan of ManifestTyping, for the same reason AnonymousDonor is. Once I started writing a lot of UnitTests and doing a lot of refactoring, though, I realized a few things. First, good UnitTests exercise the type system pretty thoroughly as a side effect. -- JimLittle

Sorry, that's crappy testing. You need to test everything that can fail. Why bother with UnitTests at all, then, when system tests will do the exact same thing for far less effort? I'm beginning to think the UnitTests people are talking about are pretty thin and incomplete. -- AnonymousDonor

I don't know who came up with the idea that you have to test everything that can fail, but it's wrong. It's pointless to test against things like disk crashes or faulty memory for most applications. There's a cost curve here. Some tests aren't worth the time it takes to write them, or the time to maintain them, or the time to debug them. -- SunirShah

Maybe even more to the point, why does AnonymousDonor think you have to test everything that can fail explicitly? Someone pointed out that in Adam's example, you do get the test for the method's existence and the test for the setter's proper functioning, you just get them implicitly. Why are implicit tests considered crappy testing? -- Richard Rapp

You don't need any extra tests. Second, ManifestTyping makes refactoring more difficult. -- JimLittle

Then of course UnitTests also make refactoring more difficult, because you have to change them whenever code is updated. It is clearly the exact same issue. I know I often dread making some changes because of the number of tests that will have to be modified as well. Unless, of course, I did a very shallow set of tests; then I'm OK, but that would mean I did a poor job of testing in the first place. Yet shallow testing seems to be what you are advocating. -- AnonymousDonor

I'm not advocating shallow testing. Hmm. Take a look at the code in CommentingChallengeResponse. It's written in Java, and thus uses ManifestTyping, but pretend it doesn't for now. Do you see how all of the tests in that example test behavior, not types? Do you agree that the tests aren't shallow? Now look at the production code. Is there any way to incorrectly change types in the production code without breaking a test? -- JimLittle
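For illustration, a hypothetical instance of Jim's point, reusing the earlier Person example (this is not code from CommentingChallengeResponse):

 class Person {
     private final int id = 42;  // hypothetical field
     // A careless refactoring changed the return type (was: String getName()):
     public int getName() { return id; }
 }

 // The existing behavioral test still catches the mistake:
 //   assertEquals("Bob Smith", bob.getName());
 // In Java this stops compiling; in Smalltalk the assertion fails when the
 // test runs. Either way, the incorrect type change breaks a test.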


I am not sure I agree with the basic title of this page. UnitTests (or any tests for that matter) do not require perfect developers. False positives and false negatives are still possible, and the false negatives will be resolved. -- WayneMack

I would have to agree with you, Wayne. If UnitTestsRequirePerfectDevelopers, then Programming-System-X-That-Doesn't-Suck requires perfect developers. AnyXisBetterThanNone. And really, TheBestIsTheEnemyOfTheGood. -- JonathanArkell


Someone wrote, "98% of failures due to problems with the tests. And it's not because the tests themselves were difficult. One reason is that test cases receive much less care than the main code, since bugs have less impact."

It sounds to me like your build or version-control environment is broken. If you're serious about the tests, programmers should be unable to check in code that doesn't pass its tests. See AeGis. -- BillTrost

I believe this quote may be referring to after the fact testing, rather than TestFirstDesign.

Does that matter? If you're adding the test afterward, you can still tweak the build/checkin environment to declare the program broken if tests fail.

Bill, I don't understand "programmers should be unable to check in code that doesn't pass its tests." We agree that programmers write (untested) code, then tests are run, and only then -- after all the tests pass -- can the programmer check in that code. I thought that was exactly what the original poster was describing. But I thought the original poster was complaining about the situation where a test fails (not in the checked-in code, of course), and it turns out the bug is in the test code, not in the application code. (Buggy tests *can* be and often *are* checked in, right? If you have a build system that can somehow prevent that, *please* tell me how I can set up a system like that.) -- DavidCary


I've always held the view that unit tests aren't there to catch every single bug. Oh sure, they make sure that things work reasonably well, and may even improve quality by forcing you to explicitly handle boundary conditions rather than mumbling IllDoItLater?. The real benefit comes both from the instant gratification they provide (it at least doubles my productivity) and from the vastly improved design I end up with. There's a reason it's called TestDrivenDevelopment and not DefectFreeDueToUnitTests.


ProgrammerTests (nee UnitTests), when done with TestFirstDesign, do not require perfect developers; they increase the effectiveness of less-than-perfect developers, and their effectiveness increases as the skill level of the developer decreases.

TestFirstDesign forces a developer to think through and incrementally code and test software. The developer does not need to solve the entire problem all at once; he just needs to solve one tiny piece and then go on to the next. At any point in time, the developer is one rollback away from working code, so he will only lose a couple of hours of work even with a major mistake.

It is not ProgrammerTests that require perfect developers, it is writing code without them that requires perfect developers.


CategoryTesting

