Do Both Unit And Acceptance Tests

Would someone care to expand on why thorough AcceptanceTests would not be superior to UnitTests? Why do both? (I may be missing what is implied by that statement, "XP UnitTests belong to the programmers, and AcceptanceTests to the customer."). -- WayneMack

My take on it is this: AcceptanceTests (nee FunctionalTests) for a system make the customer confident that the system does what he wants. UnitTests for a class make the developer who uses the class confident that the class does what she wants. There are some differences, of course (e.g. the developer can change the class; the developer might even be the one who wrote the class; etc.) But the principle is as old as Simon and Garfunkel: "One man's ceiling is another man's floor."

This goes for black-box vs. white-box, too: a test for a component that just uses its interface is a black-box test for the component, but a white-box test for the system as a whole. (I still don't know what a GrayBoxTest is supposed to be.)

So "Why do both?" You do the AcceptanceTests to make sure you give the customer what he needs. You do the UnitTests to make sure you give the developers what they need.

-- GeorgePaci

A simple answer to the "Why do both?" question is: Because a combination of UnitTests and AcceptanceTests will yield a better overall benefit-to-cost ratio, than AcceptanceTests alone. Depending on the size/type of the project/organization, the ratio will be different. AcceptanceTests alone are not viable due to exponential explosion of the number of test cases as the system grows: each new feature will multiply the cost, but only add to benefit. Plus, UnitTests (which are also frequently ProgrammerTests) provide unique benefits, such as liberating developers, and allowing to test parts of the system hidden by design (think caching). -- AnonymousDonor

I write unit tests because they help me program faster, with less stress, and they improve my designs. AcceptanceTests are too big to help with detailed design. -- KentBeck

But how do UnitTests help? It's unclear.

Unit testing, to me, is the obvious and logical extension of Divide and Conquer. Without unit testing, we're reduced to testing all 67,934 lines of code in a project. I've been in the situation of working with a developer who didn't unit test; during integration, I was constantly having to diagnose bugs in his code. I think time spent building and running unit tests will always (or almost always) be made up with faster acceptance testing and less rework at the end. -- BillyChambless

I disagree with the phrasing. You still have to test all 67,934 lines of code. What you avoid, perhaps, is having to test all of the possible paths/combinations - 5 + 5 tests rather than 5 * 5 -- DanilSuits

Well.... I agree with your disagreement with my phrasing. I should have said "...all 67,934 lines of code at once...". If I know for certain that the Do_Stuff() method does what it's supposed to, I must be calling it wrong. -- BillyChambless

[Discussion moved from UnitTestsReconsidered follows. SunirShah is probably the author of most of the unsigned comments.]

: "UnitTesting = friction" turns out not to be the case. Programmers who use test first programming find that they go faster, not slower, over the course of the project. In addition, they get better interfaces and simpler implementations. Saying it isn't so doesn't make it not so. (Of course, saying it is so doesn't make it so either, but we're reporting observations, not just making this up. -- RonJeffries

This might be true in some people's experiences. It isn't in mine. Every time I write UnitTests, they get in my way. Now, perhaps this has to do with the difference between what I define UnitTests and what XP defines UnitTests. In which case, I think we probably agree. AcceptanceTests make life good.

: Look, it's really very simple. If your developers don't CodeUnitTestFirst, then they won't really do unit tests at all. After the fact never works. That leaves you with just AcceptanceTests. AcceptanceTests don't enable RefactorMercilessly. Without RefactorMercilessly you're far from XP. -- PeterMerel

I agree people won't write UnitTests. They're boring and they don't improve people's lives. AcceptanceTests allow RefactorMercilessly at a subsystem level, not a class level. Here's a hint: your user doesn't care what classes you use, just whether they work. Your testing shouldn't constrain your ability to improve the class tree of the system design to improve the business value of the system of the whole.

SOME people wont write UnitTestss. I wite them all the time because they are fun and they have improved my life a huge amount. They are fun because I get continual positive feed back for very little code. They have improved my life because I know my code works and if I change something I have a safety net. And when someone sends me a bug report it takes me ten minutes to prove it's not my code. Here's a hint: Your customer doesn't care about anything as long as your application meets their expectations. What's a class? Nicely formatted code? Design Patterns? Smalltalk? They don't know what this stuff is and even if they did they wouldn't care. You could have the worlds most beatiful code or a complete pile of crap, they don't care as long as the app works. Only programmers care and that's what make these sorts of arguments pointless. - hjm

[AnonymousCoward points out that a good suite of UnitTests helps when you transition code between developers. It's good documentation, and it may help people collaborate.]

[...] My interest in software process is entirely about coping with software development given the stresses of the industries I work in. I have to move fast, and I prefer to not write bugs (they show up by the LawOfDemos). So, developing optimal strategies is more interesting than developing provably correct strategies. Strictly speaking, writing a lot of tests is better than not to improve the confidence of a correct system, but it's much slower. It's also annoying, so people (including myself) don't do a good job of it. These are real concerns.

There's a real difference between CowboyCoding, and the real situation of many companies who work on "internet time." This is why I get annoyed at "speedy" three week iterations. I have to work faster than that because potential investors don't like to wait.

Could it be that UnitTests more easily automated than AcceptanceTests? And isn't the presence of AcceptanceTests generally required for confident, cost-acceptable refactoring?

I would think that automation is a huge reason to use UnitTests more often than AcceptanceTests. However, automation is why UnitTests are easier to use, not the reason for which you use them. From the standpoint of the customer, they don't care about UnitTests. UnitTests are for the programmer; to insure that when he/she finishes coding that the code accomplishes the task designed.

This is especially useful when one is writing large groups of classes that are interdependent. Without UnitTests you cannot be sure a piece is functioning until you have written all the dependent parts. Then you end up running your AcceptanceTests to make sure the new code works? Then you end up spending a good deal of time pulling apart the pieces to track down the problems. UnitTests allow you to be sure the code is functioning before completing the other pieces, and writing them takes much less time then debugging 20 different classes at the same time. -- ToddEdman?

I've been using developer tests at the executable unit level rather than the class level for development and I haven't seen this problem. I have been following the basic TestFirstDesign approach where I add things one at a time and I find I usually do not have to change a lot of things at once. I have also used this approach for code refactoring. The only times I have gotten into trouble is when I try to do multiple things at once, and I am not sure UnitTests would provide much more help in this case. The tests are programmer written, so I guess they don't qualify as AcceptanceTests, but they are also at a much higher level (and quite fewer) than UnitTests, so I am not sure where they fall into the naming conventions. -- WayneMack

Lately I've been writing a lot of UnitTests (out of my own free will!). They haven't helped me find a single bug, and they are quite extensive. I followed the traditional route of checking all the boundary cases, plus normative cases, random cases, erroroneous cases. They have made my project take about five times longer than it could have taken (one class per day, versus one class an hour). I also have XML files and an XML parser that will flex the entire interface at a higher level, one that's much easier to write for.

I don't regret writing the tests. The major reason the UnitTests exist is to make the compiler complain when the interface changes as this is a public API. The second reason they exist was because the XML part was written by someone else, so I needed a way to execute my own code, not that it helped. I think my code was too simple to screw up anyway (get, set, get, set, get, set.).

But now The XML is sufficient. I've even done some significant reworking. The UnitTests didn't find any problems (except in themselves), but the XML test files did (errors in graphical output). It comes down to using the right size screwdriver for the job. Sure, we need to use a screwdriver, but picking a jeweler's screwdriver for a 1/16" slotted screw is not appropriate. Of course, the observant reader will notice that they are both screwdrivers; i.e. unit tests and acceptance tests are basically the same thing at different levels of abstraction.

Thus, since I had to use a higher level test anyway, if that higher level test gives you sufficient confidence in the system, then that's all you need. Why do more work? I think XP has a point that if you do all the UnitTests, you are more likely to do all the ones that count, even if you do more than necessary. But that's kind of overboard (extreme, you might say) for a lazy slob like myself. -- SunirShah

Sunir wrote: They have made my project take about five times longer than it could have taken (one class per day, versus one class an hour).

Sunir, my immediate reaction is that you're writing too many UnitTests. It's as if, in the old days of commenting code, someone said, "I tried commenting my code and now it takes me three times longer to write code"-- you would react that they must be writing too many comments, right?

In my experience of writing a single, isolated class, testing adds about 50% more time and roughly doubles the amount of code. I test only the things I think might break, plus a few extreme and degenerate cases. I do not test everything that could conceivably break.

On the TestEverythingThatCouldPossiblyBreak page, WayneConrad reminded us that the 80/20 rule applies to testing: "80% of the benefit of testing is gained with 20% of the effort that would go into the mythical "complete test" (if there is such a thing)."

Let me also quote an anonymous post on the UnitTest page in answer to, Are we Testing Too Much: "The purpose of unit testing is to increase developer mobility. It is not to verify the correctness of your program.

Are you testing privates?
Are you testing simple accessors?
Are you trying to make unit tests perform 100% code coverage?"

But what's even more important is what happens later in the project, when I have multiple, interacting classes. I find that unit tests greatly shorten my development time. They help me reconsider my design, reorganize my classes, and even sometimes eliminate classes. Any of my co-workers can look at the code, notice a simplification or refactoring, and do it, secure in the knowledge that the unit tests will prevent them from breaking anything. Pretty soon, a user comes to us with a new, unexpected feature, and we're able to add it in just an hour (rather than taking a whole day, as in the past) because the design is so modular and flexible. Remember, the goal is increasing developer mobility, not verifying correctness.

I don't have proof, but it's my experience that UnitTests add about 50% more time in the beginning but save "a lot" of time later in the project. If you know of any formal studies, please put them at UnitTestingCostsBenefits.

It sounds like you're working on your own and not pairing with an experienced XP developer. Take a look at the ObjectMentorBowlingGame for an example of writing only tests that matter. In that example, there are 104 lines of bowling-scoring code and 114 lines of test code. The session also has examples of refactorings inspired by unit tests.

The page CanYouHaveTooManyUnitTests cites an example with "30,000 lines of production code with about 5,000 lines of test code comprising 500 tests. In this case, because of heavy dependence upon a database, the tests would run for over 2 hours" which people felt was taking too long, which discourages programmers from running their tests frequently, which is bad.

-- JeanCzerlinski

ATs (or system or functional or whatever you call them) validate the system; in XP, these are the only things that matter for validation - if the customer hasn't made a test for a requirement - it's not a requirement.

UTs have two uses - driving the design (test-first) and isolating the code responsible for a defect (defect meaning an AT failing, regardless whether it's a new AT [new feature] or old one [bug]).

If we could put that burden on the customer (all required functionality covered by ATs), then we could even go so far as to say that if the ATs pass and the UTs fail, the problem is in the UTs, not the real code. In a chart:

|UTs|
 ATs | pass | fail |
 ----+------+------+
 pass|A  |B  |
 fail|C  |D  |
 ----+------+------+

Actions:

A - stop working, the system is up-to-date
B - fix your unit tests until they pass
C - fix your unit tests until they fail
D - fix the code until your unit tests pass

Transitions:

From A - get new AT, you should end up at C (but could be A again)
From B - you'll be at A when done
From C - you'll be at D when done
From D - you could end up at A or C, depending whether your code actually met the requirement

Note: There's no obvious transition to B; the only way to get to B is to remove or change dead code that is still under UTs but not under ATs. For example, an AT is removed, and you clobber the associated code but not the UTs.

''The most obvious transition from A to B is to refactor across classes. You will break your UnitTests but should not break your AcceptanceTests. -- WayneMack

NOTE: In my original comments, I specifically avoided the term "AcceptanceTest" and used "functional test" instead. This was deliberate in order to avoid the issue described below. Please take care when editing text not to change its meaning.''

For this to work, we have to have the ATs done test-first (yeah, that'll happen), and we need a way to add ATs to the suite one at a time. The ATs "in the suite" are like UTs - they all must pass (except the one you're currently working on).

I think the area that concerns me most is the level of effort required to keep unit test drivers, stubs, and the actual production code in sync. It seems the most problematic areas of the codes base I am currently working with involve complex and inappropriate interactions between classes. What I am tyring to do is preserve the existing system functionality while altering the internal classes and their interfaces. If I use a set of system level functional tests, I can make my class level changes and verify operation without changing the tests. If I use class level unit tests, I must change both the code and the test and when I'm done, it is still unclear whether I maintained the system level functionality. Is this assessment correct or am I missing something? -- WayneMack

Your assessment is correct. You need both kinds of test. -- KeithBraithwaite

See CanFunctionalTestsReplaceUnitTests