Several kinds of AlmostCorrect.
I am a little concerned, actually, about the success of XP / unit test disciplines, because some kinds of behavior that deserve to count as spectacular failures are hard to test for this way, especially in systems that are large, distributed, and stateful. One example would be a data management service that manipulates customer data (which is complex and stateful) on behalf of a number of systems, or even within one large system. "Unit tests" on this service's exposed API and/or internal methods need quite a lot of test harness and infrastructure to make them go. "Functional tests" against this service, implementing some kind of story or use case, require even greater support.
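To make that concrete, here is a minimal sketch (assuming JUnit 4) of a unit test against such a service. Every name in it (CustomerDataService, InMemoryCustomerStore, CustomerRecord) is hypothetical, and the in-memory store stands in for what would really be a database or other systems. The point is the ratio of harness to assertion.

  import org.junit.Before;
  import org.junit.Test;
  import java.util.HashMap;
  import java.util.Map;
  import static org.junit.Assert.assertEquals;

  // All names here are hypothetical stand-ins, not any real system's API.
  public class CustomerDataServiceTest {

      // --- Harness: a fake of the stateful world the service lives in ---
      static class CustomerRecord {
          final String id, status;
          CustomerRecord(String id, String status) { this.id = id; this.status = status; }
      }

      static class InMemoryCustomerStore {
          final Map<String, CustomerRecord> rows = new HashMap<>();
          void put(CustomerRecord r) { rows.put(r.id, r); }
      }

      static class CustomerDataService {
          final InMemoryCustomerStore store;
          CustomerDataService(InMemoryCustomerStore store) { this.store = store; }
          long countByStatus(String status) {
              return store.rows.values().stream()
                      .filter(r -> r.status.equals(status)).count();
          }
      }
      // --- End of harness ---

      private CustomerDataService service;

      @Before
      public void setUp() {
          // Seeding representative state is most of the work; a real
          // service would need far more of this before any test means much.
          InMemoryCustomerStore store = new InMemoryCustomerStore();
          store.put(new CustomerRecord("c-42", "ACTIVE"));
          store.put(new CustomerRecord("c-43", "SUSPENDED"));
          service = new CustomerDataService(store);
      }

      @Test
      public void suspendedCustomersAreExcludedFromActiveCount() {
          // The assertion is one line; the harness above is where the cost lives.
          assertEquals(1, service.countByStatus("ACTIVE"));
      }
  }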
Testing such a service for correct operation requires, for both methods and stories, considerations beyond functional correctness: at a minimum, the state the service carries, the number of systems it serves, and the distribution among them.
I am wondering if, in systems that are some combination of large, distributed, and stateful, the correctness that is testable with unit tests plus functional tests doesn't count as its own kind of AlmostCorrect. This question relates to AlistairCockburn's idea of different degrees of refinement of use cases. I am wondering if the testing methods that will be sufficient (XP-defined unit tests and functional tests are one example) won't vary with the degree of refinement in the requirements the system must meet.
Designing to escalate AlmostCorrect.
Here is a thought experiment about design principles that might provide some mitigation of AlmostCorrect:
"What if we made systems vulnerable to some kinds of AlmostCorrect behavior . . ." I'm not advocating making systems fragile, but this thought experiment has me intrigued because it seems as if something organizationally useful happens if you do.
Working with Weak Policy
The XP requirement for unit tests and 100% test success seems to be a policy to me: any test failure, or any failure to execute the existing tests, is declared unacceptable by the GoalDonor, the GoldOwner, or both. I have found that the sticking point in implementing XP-style unit tests (many testing practices, actually) is often the unwillingness to establish the policy, not the implementation details. So selecting automated, automatic unit tests plus XP-style functional tests seems like three things to me: a choice of testing practice, an implementation effort, and a policy commitment.
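Note how small the implementation side really is. As a sketch, assuming JUnit 4 and the hypothetical CustomerDataServiceTest from earlier, the whole mechanical enforcement of "any red bar stops the line" is a few lines; everything around it is the policy decision.

  import org.junit.runner.JUnitCore;
  import org.junit.runner.Result;

  // A sketch of the policy made mechanical: any failure, or any failure
  // to run the tests at all, rejects the build. The suite it runs
  // (CustomerDataServiceTest) is the hypothetical one sketched above.
  public class TestGate {
      public static void main(String[] args) {
          Result result = JUnitCore.runClasses(CustomerDataServiceTest.class);
          if (!result.wasSuccessful()) {
              System.err.println(result.getFailureCount() + " failure(s); build rejected.");
              System.exit(1); // the policy, enforced rather than merely asserted
          }
          System.out.println(result.getRunCount() + " tests passed.");
      }
  }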
So what do you do if that policy isn't in place from the GoalDonor or the GoldOwner? Designing to escalate AlmostCorrect would be another way to get a more robust form of correctness. Rather than escalate any test failure to unacceptable by policy, it escalates observable instances of AlmostCorrect to failures that are already unacceptable. This sounds really cynical. But isn't one aspect of "design for testability" to design so failures are visible, obvious, and unambiguous?
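Here is one shape that design could take. This is a minimal sketch, assuming the service can check its own invariants; CustomerRecord and AlmostCorrectError are hypothetical names. Instead of quietly patching over a record that is only mostly consistent, the design makes the gap loud.

  // A sketch of escalating AlmostCorrect into a visible, unambiguous failure.
  public class EscalatingLookup {

      static class CustomerRecord {
          final String id;
          final String status; // invariant: never null or empty
          CustomerRecord(String id, String status) { this.id = id; this.status = status; }
      }

      // A deliberately loud failure type: visible, obvious, unambiguous.
      static class AlmostCorrectError extends IllegalStateException {
          AlmostCorrectError(String detail) {
              super("AlmostCorrect escalated to hard failure: " + detail);
          }
      }

      static CustomerRecord validated(CustomerRecord r, String id) {
          // The AlmostCorrect alternative: return a default record and let
          // the inconsistency propagate silently into other systems.
          if (r == null)
              throw new AlmostCorrectError("no record for " + id);
          if (r.status == null || r.status.isEmpty())
              throw new AlmostCorrectError("record " + id + " has no status");
          return r;
      }
  }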
There's a variation on this question where the policy requiring all tests to pass all the time is owned by the folks producing the artifact rather than the GoalDonor or the GoldOwner; then the management of the production process establishes this policy, or not. Another variation has the technical team assert this policy on its own, independent of sponsorship from the people responsible for the performance of the development function or from the sponsors of the system.
It seems to me that if running all the tests all the time pays off, whether for development as a function or for the customers of development, the case ought to be pretty easy to make. What's going on that the case isn't so easy to make?
Clearly, "Several kinds of AlmostCorrect" and "Designing to escalate AlmostCorrect" don't work together. That's what got me typing. They both make sense to me in some circumstances. They seem to pivot on how AlmostCorrect gets treated as acceptable. But I can't reconcile them.