Here's the problem. You have written unit tests, both to help you understand your design and to find bugs. In due time, you refactor your code, which causes tests to break. The interface they communicate with has changed, so they must change to match. Are there reasonable alternatives to RefactorBrokenUnitTests?
See also MovingBrokenUnitTests.
There can be a difference between the value of unit tests during original coding vs. during refactoring. In external "black box" testing, it's a commonplace that most tests find bugs only the first time they're run. (http://www.testingcraft.com/regression-test-bugs.html) If that applies to unit tests - and I believe it does - you can be justified in automating a test but then throwing it away when some refactoring breaks it. You're justified if you believe the value of the test now is less than the cost of refactoring it. I talk at length about these economics at <http://www.testing.com/writings/automate.pdf>. That paper's audience, though, is independent whole-product testers, not programmers.
What worries me about proposing this? If you believe that XP is a HighDisciplineMethodology - and I tend to believe it is - I've just provided a comfortable, steady slide toward sloppiness. Over time, you might judge more and more tests to be "not worth fixing". Then tests start losing their magic power to support rapid change, so they become even less worth fixing. Etc.
However, I worry more about the alternative: a rigid rule that all tests must be fixed when they break leads to what I call the TwoYearItch. I think that's worse than a gradual slide. --BrianMarick
I didn't observe this phenomenon in four years on the C3 project. That's the only XP project I've yet observed over a long period. But of course Brian didn't say it always happens.
What I don't understand is why it ever happens. In an evolving system, it seems to me that most of the classes tend to stabilize, performing some nice cohesive group of functions, and then never get edited again. When evolving any given class, it seems to me that I want its function to change, and if the class is well-tested, there is a small number of tests testing what needs changing, and some of them are now wrong. So I "fix" the tests, and then fix the code to make them work. Or, the class needs some new capability added on, in which case I just add a new test or two, then make them work.
So if changing code breaks more than a couple of tests per hour, I'd be surprised. What number of tests are you seeing change in the TwoYearItch mode, and what is it about the team, practices, or code, that makes things different from my paragraph just above? Thanks! --RonJeffries
No, only in environments that are similar to C3 will classes stabilize. You were working in a Fortune 500 company, codifying contracts that were meant to last a long time. The stability of the codebase is an artifact of the problem domain you were solving.
Instead, consider programming for the Internet as an Internet solutions company. In this field, the domain changes every two months at the longest. Here, only a few methods will remain stable; the utility classes, for instance. Everything else is liable to change quite frequently as the standards process moves forward, new competitors materialize, and customers demand features to deal with their own rapidly changing domains.
From experience, I find that XP is too slow to respond to these situations. -- SunirShah
Here's an example. I have a Java program that, among other things, evaluates boolean expressions like "a<b && c". In an early version of the program, expressions evaluated to a java.lang.Boolean. I was certain that later tasks would reveal that to be a wrong choice, but I'd decided to be strict about YouArentGonnaNeedIt, so I used Boolean.
In due course, I needed to distinguish between "true" and "barely true", and between "false" and "barely false". The easy thing to do would have been to subclass java.lang.Boolean, but you can't. It's declared as "final", which prevents that.
So I had to replace uses of java.lang.Boolean with my own ExprVal?. That was a fairly simple change, but it meant changing 91 lines of tests. Most of them were trivial, like changing this:
 assert(env.eval(and) == Boolean.TRUE);
to this:
 assert(env.eval(and).isTrue);
(I know that both the before and the after lines have an instance of questionable practice. Let's not get off on that.)
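For concreteness, here's roughly what such a replacement class might look like. This is a minimal sketch, not the actual ExprVal?: the four-valued constants and the public isTrue field are assumptions, chosen only to match the test line above.

 // A minimal sketch, not the actual ExprVal?: the four constants and
 // the public isTrue field are assumptions chosen to match the test
 // line above.
 public final class ExprVal {
     public static final ExprVal TRUE         = new ExprVal(true,  false);
     public static final ExprVal BARELY_TRUE  = new ExprVal(true,  true);
     public static final ExprVal BARELY_FALSE = new ExprVal(false, true);
     public static final ExprVal FALSE        = new ExprVal(false, false);

     public final boolean isTrue;   // supports env.eval(and).isTrue
     public final boolean isBarely; // distinguishes "barely" values

     private ExprVal(boolean isTrue, boolean isBarely) {
         this.isTrue = isTrue;
         this.isBarely = isBarely;
     }
 }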
The tests were important, so DeletingBrokenUnitTests seemed unwise. I considered MovingBrokenUnitTests briefly, but - due to ThePowerOfEmacs? - fixing them seemed easier. That's what I did.
Now, you could certainly argue that this is a problem in the language I was using. I don't know that I'd dispute that. Other cases I've seen could be assigned the same cause. For example, a common reason for broken tests is a change to a C/C++ header file. But people are, by and large, stuck with a particular language.
Moreover, I think there's a deeper cause. When you test a method, every test contains a call to that method. For a key method, and for what I consider thorough testing, there will be many calls to that method. There may well be many more test calls than production calls. Therefore, changing the interface of the method will break many more test calls than production calls. That makes refactoring broken tests seem to be a disproportionate part of the burden of change.
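To make the asymmetry concrete, here's a contrived sketch (all names invented): parse has one production call site but five test call sites, so changing its signature breaks five test lines for every production line.

 class Parser {
     boolean parse(String source) { return !source.isEmpty(); }
 }

 class Compiler {
     boolean compile(String source) {
         return new Parser().parse(source); // the sole production call
     }
 }

 class ParserTest {
     Parser p = new Parser();
     void testEmpty()      { assert !p.parse("");        }
     void testLiteral()    { assert p.parse("true");     }
     void testComparison() { assert p.parse("a<b");      }
     void testAnd()        { assert p.parse("a<b && c"); }
     void testNested()     { assert p.parse("(a && b)"); }
 }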
That's different from testing during initial coding. There may be more test calls than production calls, but the time spent writing that test code is smaller than the time spent writing the code being tested. So the perceived burden isn't high.
At least, that's my best guess about what's happening. -- BrianMarick
Because of the power of emacs, it seemed better to you to change the 91 calls than to delete the tests. If you had been stuck in a brain-dead GUI without regexp search & replace, it would have seemed too hard. As I see it, the lesson here is to make sure you have good tools and know how to use them. Someone who deletes all the tests is probably stuck in an environment too painful to endure, with or without tests.
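For what it's worth, the mechanical part of such a fix is a single regular expression. A sketch using Java's java.util.regex, assuming the change really was as uniform as the example above (the exact pattern is a guess):

 import java.util.regex.Pattern;

 public class FixTests {
     public static void main(String[] args) {
         // Rewrite ") == Boolean.TRUE" into ").isTrue", which covers
         // the bulk of the 91 changed lines described above.
         Pattern old = Pattern.compile("\\) == Boolean\\.TRUE");
         String before = "assert(env.eval(and) == Boolean.TRUE);";
         String after  = old.matcher(before).replaceAll(").isTrue");
         System.out.println(after); // assert(env.eval(and).isTrue);
     }
 }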
I see the situation, and it's the one I envisioned. Fixing the tests was the right thing to do ... and it might well have found a defect in the "barely true" region. It does take a tiny bit of discipline, or a tiny bit of habit, to always update the tests. But it's so much better. I think RefactorBrokenUnitTests dominates DeletingBrokenUnitTests or MovingBrokenUnitTests in quality, and probably - over the medium and long run - in development speed. -- RonJeffries
I don't really know how it happened. I was just developing along and writing unit tests and then I began commenting them out, or not including them in the master test. Now I am in a horrible, horrible mess because all this old code lost its unit tests and I don't have time to go back and see what they were supposed to do. I don't know how this could have happened. I am GungHo? about unit tests. Lord help us all. It's worse when you are doing high velocity and lose your unit tests, because then you have a lot more stuff to go back through and refactor when it comes time to PayThePiper?. -- Anonymous
I think it's pretty simple. New code must be unit tested. If there is already a unit test that covers the new code, bonus. If not, you have to write one. Migrating old tests is easier than writing new ones sometimes, but not always, just like migrating old code is easier sometimes, but not always. Pure refactoring is unimportant because often you are changing the behaviour of the system anyway (perhaps to fix bugs in the system that the unit tests didn't cover, or worse, covered incorrectly). But it is important not to JangIt. -- SunirShah