Unit Test Isolation

From the "StandardDefinitionOfUnitTest" discussion, where it's said...

: a. "In isolation" means separate and apart from the application, and all other units.

[From UnitTestsReconsidered discussion...]

The whole point of UnitTests is to reduce the scope of the system under test to a small subset that can be tested in isolation. Thus, by the ScientificMethod, we reduce the number of variables that may affect the results so we can derive useful information on whether the unit is actually correct. -- SunirShah

Hardly by anything as rigid as ScientificMethod, I'd say. Simple good judgment, informed by experience, is more like what most testers and programmers use. -- RonJeffries

The ScientificMethod is generally about isolating each experiment; carefully ensuring the control group matches the experiment group in every detail except the experimental one. That's what test isolation is for - boosting a signal out of a lot of noise.

This is impractical for most business systems, where a requirement is necessarily non-isolated. For example, testing the workflow: create account, modify account, report on account. Say you are testing each separately via stubbing out. A mistake in creating the account may not manifest itself until either the modification or reporting. Thus the workflow (lifecycle) of a business object is what we want to test. You do this by feeding the resulting objects of earlier tests into later tests. -- BrianG

Physically isolating test cases _assists_ the design of coherent objects. And then your IntegrationTests cover the end-to-end issues.

However, per ExpensiveSetUpSmell, if the only way to get an object in state B is to pipe it thru the test for state A, you have StrongCoupling. The first way to attack that problem is to write new test cases that create state B by any means necessary.

Isolation is laudable but pragmatically may be more than is necessary to get valuable tests anyway -

In XP you're testing all the time, so you know the cause of any test failure that was due to a code change - it was the last thing you changed
LargeScaleCppSoftwareDesign says you have "enough" isolation if you testee only calls tested objects - so if you don't have cyclic dependencies you probably have enough isolation to find the fault in any test failure caused by environment changes (new compiler, new libraries, new processor, etc) by testing the low-level units first and proceeding upwards

They say that one argument in favor of UnitTestIsolation is that without it, one change can cause many tests to fail. This is factually incorrect - a bug in your String class could cause all your tests to fail.

The correct argument in favor of UnitTestIsolation, even under TestInfected XP, is your test failures should never change just because you run tests in a different order. Everyone doing IncrementalTesting has broken the build, then discovered a test case passes when ran alone, and fails when ran in the complete batch.

Ideally, in a well-factored system, a change to the environment will cause only one (or a few) classes to fail. But in the absence of UnitTestIsolation, those failures will cause other classes that use them to fail. So, given that XX% of your UnitTests suddenly fail after making a drastic change to the system configuration, what do you do?

Here are some options:

Undo the drastic change to the system configuration (RevertDontDebug). Then go forward again in smaller steps so that the failures at each step are more manageable.
Assuming a good understanding of the system's overall structure... Look at the "lowest level" components that are failing, fix them, and then rerun the tests.

ExtremeProgramming practices do not seem to promote or discourage UnitTestIsolation: ExtremeProgramming books talk about using stubs when the classes you'd use are complex, slow, or not written yet.

But I'd suspect that the ExtremeProgramming response to the idea that "all UnitTests must test their classes in complete isolation from other system classes" is "YouArentGonnaNeedIt." -- JeffGrigg

It's true YouArentGonnaNeedIt, in the sense of YAGNI to XP. However, sometimes you do need it. The answer is not to be extremely paranoid about that, but to write down some patterns to deal with those cases. -- SunirShah

The largest problem with UnitTestIsolation is that often one test method may validate a large number of smaller abstractions. Consider private methods or "helper" macros that are only secondary to the method under test. However, decomposed functionality has a tendency to be reused. This means that these secondary functions aren't tested. It gets worse when you start to think that the underlying data structure of the system may be destroyed by a method under test without causing the method to fail. Since changes to global data structures are common in object-oriented programming (side effects), it becomes more and more intractable to isolate the unit tests.

To step out of my current role as critic, the answer to this problem is to not obsessively worry about it. While it is very common to leave much functionality uncovered by the tests, it's just a matter of adding more tests. Each test represents a ratchet up in knowledge after all, and thus you only have to debug a bug once in the worst case (still bad, but not terrible).

More specifically, for each bug, write a test to exhibit that failure if one doesn't already exist. When stepping in, if you discover untested abstractions like macros, HelperFunctions, or the data structure/object graph of the system, write more tests. In this way, your discovery of system failure will be encoded forever in the tests.

What this means for UnitTestsReconsidered is that it's not sufficient to just change some code and run the tests to know that the changes didn't cause any failures. Additional quality controls need to be added to ensure the tests deserve such confidence. That's where extreme peer review, and the OTI Lobster--you acquire this when you create a bug--come in to play. -- SunirShah

DanBarlow is entirely happy for the following discussion to go away, but wonders what DavidBarksdale? would like to do with it. Delete this italicized bit when you've acted

Dan: Sorry if I botched the attributions below when I joined in; I didn't mean to put words in David's mouth -- I thought it was an unsigned exchange. Personally, I'm interested in the changing-environment question, so I'd like to keep going for a bit longer. --GeorgePaci

It seems to me that the whole purpose for having "isolation" among unit tests is so that when you have a failure, you will know where the problem has been introduced. If you are testing all they time, you know where the failure is, though: It is in the code that you just wrote. - DavidBarksdale?

Unless you didn't just write any code. What if the last thing you changed was any of the target environment, the compiler version, the optimization settings, ... - DanBarlow

Then you still know where the failure is, since you're testing all the time (which means whenever you change any of these things). I don't quite see how the special case of a test failing due to misconfiguration rather than due to a code change is relevant to the "isolation" requirement, though. [--?DavidBarksdale?]

Perhaps I wasn't clear. I'm not talking about misconfiguration. I changed the target platform because there is a requirement that the system be ported to the new target. I upgraded the compiler version because there is a requirement that the system run with the newer compiler (perhaps support is ending on the old one, perhaps the new one is necessary for some other functionality).Running the unit tests after doing that tells me that there is indeed a problem, but as reverting to the old configuration is not an acceptable solution, it doesn't tell me anything about where to start looking.

If you've changed compilers, your chance of introducing multiple problems (as opposed to a single one) has gone up. Regardless, if you're doing XP, you have a full suite of UnitTests, with some tests for each unit.

Run them all. Look at the failed tests of the units at the lowest level first. Pick one; fix it; run all the UnitTests. You know you're done fixing the unit when its UnitTests all pass. You've probably eliminated some failures at a higher level now, too (that is, higher-level units might be failing not through any fault of their own, but because the lower-level units they depend on are failing). Now repeat from the top; you should gradually progress upward through the system.

I don't understand why you need to revert to the old configuration in order to localize the problem; is there something I'm missing? --GeorgePaci

I think I might start this discussion over from scratch, and it should be a lot shorter because I suspect that we basically agree with each other anyway. My position is that (some degree of) isolation is a Good Thing, because when I need to vary something that isn't the code itself, the tests aren't coupled so I am more easily able to tell where the problem is. If the system has a clearly pyramidical structure, then I'd agree that you have "enough" isolation if your testee only calls tested objects. All I was doing was giving a counter-example to the statement "you know where the failure is, though: It is in the code that you just wrote".

Toy example for purposes of argument: I have portable socket glue that runs on two Lisp implementations and several unix-like OSes. It's more likely that it gets ported to a new system or one of the Lisp compilers changes than it is that the actual code itself changes. Unit tests still help me.

I guess the proper generalization of David's statement would be, "You know where the failure is, though: it is in the thing that you just did." In other words, if all you do is change the compiler, and you suddenly get UnitTests failing, you can be pretty sure that a) the failures are due to differences in the compiler and b) the code you need to change (if you can't change the compiler) is in one of the Units whose UnitTests failed.

This isn't quite as nice as knowing exactly where the code-vs-compiler incompatibilities are, but it does narrow things down quite a bit, and the strategy of starting from the lowest-level failures narrows things down even more. The key here is cutting down bug-finding effort, even if sometimes you can't eliminate it entirely. --GeorgePaci

Even if you don't know what the lowest level of the tests are, it's usually not too hard to make progress.

Pick a failing test at random.
Start the failing test in a debugger.
Walk through the CodeUnderTest?, and determine where the failure appears. The failure will either be in this component, or in another component. Determine which is the case, and (in the second scenario) which component to look in.
If the failure is in this component, fix it.
If the failure is in another component, find the test for that component. Return to step 2.
Once you've fixed the bug, run all your tests again.
If you still have failures, return to step 1.

You may dive through a few test suites while working out where a given bug is being generated, but you'll learn a lot about the system in the process. In the meantime, you're focusing in rapidly on the problem.