Two key aspects of extreme programming (XP) are unit testing and merciless refactoring. Given the fact that the ideal test code / production code ratio approaches 1:1, it is not surprising that unit tests are being refactored. We found that refactoring test code is different from refactoring production code in two ways: (1) there is a distinct set of bad smells involved, and (2) improving test code involves additional test-specific refactorings.
In a paper written by ArieVanDeursen, LeonMoonen (CWI), AlexVanDenBergh and Gerard Kok (SoftwareImprovementGroup), presented at XpTwoThousandAndOne, we share our experiences with refactoring test code by discussing a set of bad smells that indicate trouble in test code, and by presenting a collection of test refactorings to remove these smells.
The paper is available online at http://www.cwi.nl/~leon/papers/xp2001/xp2001.pdf
We believe that the paper presents only a starting point for a larger catalog of test smells and test refactorings, one that should be based on a broader set of projects. Therefore, we invite readers interested in further discussion to add smells and refactorings to this page.
-- LeonMoonen
There is just such a collection now available online at http://xunitpatterns.com/TestSmells.html
Avoiding False Positives
The benefits of RefactoringTestCode seem self-evident to me, but the risks also appear to be higher than in refactoring non-test code. Normal code only has one kind of correctness -- it has to pass the tests you've written -- but test code has two. It has to warn you when the code is broken, and it has to be quiet when the code works. That is, correct test code gives neither false negatives nor false positives.
To take care of this while you're initially writing the test code, you write the test before the code and watch it fail, so you know it actually detects the breakage it is meant to catch. But once the code passes, how do you verify that a refactored test would still fail if the code were broken?
The only way I can think of to get around this is to disable the underlying code to ensure proper test failure, but this seems slow and cumbersome. Is there some other tactic?
MutationTesting springs to mind. But I'm not sure it would be a complete solution. -- JeffGrigg
I've also used MutationTesting for exactly this sort of thing. -- PaulTevis
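For what it's worth, here is a minimal, hand-rolled illustration of the idea (not what Jeff or Paul actually did, and the Apple class is just borrowed from the example further down this page): deliberately break the code, run the tests, and make sure they go red.

 require 'test/unit'

 class Apple
   def taste
     'tasty'   # mutate this to 'rotten' by hand; the test below must then fail
   end
 end

 class AppleTest < Test::Unit::TestCase
   def test_taste
     assert_equal('tasty', Apple.new.taste)
   end
 end

If the test still passes after the mutation, it is giving false positives and needs fixing; automated MutationTesting tools essentially make lots of such edits for you and check that the suite notices each one.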
Following advice somebody gave me on MockDatabase, I just took one big nasty MockDatabase and broke it up into a lot of tiny MockDatabases, one for each class. Things feel better already. But there's a little feeling of looseness here, because I can't refactor this code as far as I'd like. Then it occurs to me that maybe I should've written UnitTests for the MockDatabase itself. -- fh
I just had a discussion the other day with JeromeFillon where he mentioned that he does indeed UnitTest his MockObjects. Jeff, I think that random MutationTesting would definitely be incomplete but maybe you could come up with a range of changes to your tested code that would satisfactorily "test the tests". However I have always been under the impression that UnitTests are the "safety-net" that allow me to ReFactor with courage. -- IainLowe
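As an illustration only (the class names are made up, not from fh's project), a per-class mock can be small enough that a couple of UnitTests for the mock itself are cheap insurance against false positives:

 require 'test/unit'

 # One tiny mock per class under test, instead of one big MockDatabase.
 class MockAppleDatabase
   def initialize
     @rows = { 1 => 'tasty' }
   end

   def taste_for(id)
     @rows.fetch(id) { raise "no apple with id #{id}" }
   end
 end

 # Testing the mock itself: if the mock lies, every test that uses it lies too.
 class MockAppleDatabaseTest < Test::Unit::TestCase
   def test_returns_canned_taste
     assert_equal('tasty', MockAppleDatabase.new.taste_for(1))
   end

   def test_complains_about_unknown_ids
     assert_raise(RuntimeError) { MockAppleDatabase.new.taste_for(99) }
   end
 end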
moved from BeCarefulWhenChangingTests
The interface described in the UnitTests is a detailed description of what someone expected of their objects.
It always feels good to delete large chunks of code. However, it's probably good to place an extra amount of effort into preserving all tests for a given bit of functionality until you're sure that functionality isn't used anywhere.
I'm trying to figure out a way to automate the process of highlighting callers, but I can't wrap my head around the problem. It's probably a feature supported in any RefactoringBrowser, but I want to combine it with tests.
For example, if I have the following test:
 apple = Apple.new
 assert_equal('tasty', apple.taste)
If I want to change apple.taste to return 'rotten', I need to be sure that no other code depends on the 'tasty' value. A simple heuristic would be to grep for "taste" throughout the entire codebase. (This will return FalsePositives, but that's OK.)
We can and have done this sort of thing manually for years. What are some ways to automate it, though?
One would be to add some sort of comment, like this:
 assert_equal('tasty', apple.taste) # grepFor "taste"
Then, one could have a script that parses the output of "cvs diff" for changes that affect lines with a "grepFor" comment.
If I get around to spiking this I'll report back.
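A rough sketch of what such a spike might look like (assuming a unified diff on standard input, e.g. from "cvs diff -u", and the grepFor comment convention above; this is not a finished tool):

 ARGF.each_line do |line|
   # added or removed lines in a unified diff start with a single '+' or '-'
   next unless line =~ /\A[+-](?![+-])/
   if line =~ /grepFor\s+"([^"]+)"/
     puts "touched a line marked grepFor \"#{$1}\" -- grep the codebase for it"
   end
 end

Piping each diff through something like this before committing would at least flag the values whose callers need re-checking.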
If you can identify in advance that a line of code is likely to be broken in the future, there are probably better ways to isolate the risk. Have you grouped stuff by ChangeVelocity?
Merged from ExtremeProgrammingUnitTestCode
Should UnitTest source code be treated with the same importance as production code, or should it be treated as "second-class"? If not properly maintained, unit test code will degrade just as any code does. In that case, developers will pay the price in the long run and be forced to spend time on rewriting the test code.
Related Egroup threads:
http://groups.yahoo.com/group/extremeprogramming/message/15913
http://groups.yahoo.com/group/extremeprogramming/message/16463
http://groups.yahoo.com/group/extremeprogramming/message/18375
Convert Test Code Into Features
I have worked on a product which accepted up to 2000 dial-in ISDN connections and converted them to Ethernet traffic. We didn't have 2000 ISDN routers to test the device with, but we did have a great big 'load tester'. This was a bundle of our own poorly maintained spaghetti test code. It ran on very similar hardware. Crucially, it had dial-out functionality and could dial out over 2000 ISDN connections. The test code was very ropey. Any time there was a problem of any kind, everyone blamed the 'load tester'. No real progress was being made on eliminating the bugs. I was handed the project to 'maintain'.
The bug list (once I'd amalgamated and rationalised it) was just shy of a thousand items long; about a quarter of the bugs looked hard. I decided to tackle the test software first as top priority.
This may not be an option in other kinds of project. For this project it was a huge win. The product is stable. It was not stable before. Testing is much more rapid and is more thorough now than it had been before. The hardest bugs (which only showed under heavy load) have been cracked. The project has been handed back for onward development. There is even one customer who seems to want the dial-out functionality as a feature, not just as a test feature.
So from this ISDN war story I'd like to suggest that at least in some cases refactoring of test code can make it disappear altogether. One way to keep the volume of purely test code under control - turn it into features of the code under test. -- JamesCrook
See ResilienceVsAnticipation, ExtremeProgrammingImplementationIssues