Can Functional Tests Replace Unit Tests

At issue here:

In most cases, functional tests can more than adequately replace UnitTests. Furthermore, in any software maintenance environment people are changing code they did not write, and doing it without UnitTests. UnitTests may be useful, but are certainly not required. -- Anonymous from CoreXpDependencies

(I branched this from CoreXpDependencies for I think this is a common misconception that is worth discussing on its own page.) -- DierkKoenig


(DierkKoenig) Well, for me, for my team and I guess for XP, that's not true.

Try this: let somebody throw a deliberate bug into your code and measure the time to find it - with and without UnitTests.

Put a bug in your code and let somebody else fix it - with and without UnitTests.

The difference in time between using FunctionalTests and UnitTests is the time needed to isolate to the class level. Is this a significant amount of time?

Measure the time that a new team member needs to be ready to explain the meaning and behavior of a class - with and without UnitTests. Do the same with yourself on old classes that you haven't touched for some time.

Why is it important to understand the meaning and behavior of a class? Is this level of understanding even possible within a large project? Can the class be really understood within the environment of the UnitTest as opposed to its actual operating environment?


I suppose that this was probably meant as a serious experiment, not just rhetoric. Either way, it really would be nice to try it, with a decent-sized sample set. -- DanielKnapp


I was in a discussion about just this topic last week, at our XpNewYorkCity meeting. Someone was advocating that the FunctionalTests are all the functionality the client wants, in the spirit of DoTheSimplestThingThatCouldPossiblyWork. But it felt, to me, that UnitTests are still necessary. Isolating all the ways a class could go wrong seems important, and to try to write FunctionalTests that captured all the individual class behaviors would seem to lead to a CombinatorialExplosion. Do you really want to spend all that time writing FunctionalTests?

If you follow DoTheSimplestThingThatCouldPossiblyWork and refactoring, any class level change should be driven by a functional level change. Why have to create multiple UnitTests to verify a change, when a single FunctionalTest will suffice?

The argument for just FunctionalTests, though, is in part that UnitTests are a way that programmers let their own requirements creep into the code. If the client didn't ask for it, maybe it shouldn't be in there. Maybe that's too extreme for me? I dunno.


Unit tests are indeed a way that the programmers let their own requirements into the code: their requirements for internal quality. If the programmers' requirements for internal quality are not met, then pretty soon the customers' requirements for external quality (which is what functional tests measure) will not be met either, other than at exorbitant cost. Functional tests cannot replace unit tests because they address fundamentally different aspects of the development process. They are not even the same kind of artifact, despite appearances. Unit tests of the kind that XP developers write are design documents. They are an aid to design, and a record of design decisions. They are the enablers of design improvement through refactoring. Functional tests are requirements documents: they record scope decisions and the interpretation of requirements. They are an aid to specification and tracking. -- KeithBraithwaite


Functional test and Unit test serve different purposes. One gives the developer confidence when refactoring, the other gives the customer confidence when planning. Neither require particularly complicated framework support unless you are trying to make one framework serve both needs. For example, one framework must operate in the developer's environment while the other framework has to work in the customer's. I have been able to make one framework serve both purposes, but I relied heavily on interoperability between our application's table/report framework and the development environment's browser/debugger/inspector framework, all of which ran at the same time. -- WardCunningham


Unit tests are indeed a way that the programmers let their own requirements into the code: their requirements for internal quality.

Isn't it about delivering functionality to customers? Internal quality is not something involving customers, so resources should not be spent on it. Or at least, why is the customer not asked whether they want to spend resources on unit tests? This is at least the line of reasoning XP uses frequently in other scenarios. Why it doesn't apply to unit tests is somewhat mysterious, unless unit tests are an axiom and cannot be questioned.


The internal quality is the building block of external quality. Specifically, it's how the software engineer controls the behavior that will be observed in function tests. In large enough systems, this control becomes the critical factor in external quality. Without the added leverage of fine grained testing, you're sunk. With smaller systems, you may be left wondering "Why do I need all these fine grained tests?".

But this same argument can be used for frameworks and many, many other practices that XP says are not directly customer-related or approved by the customer, so they should not be performed. Unit tests are no different, and it's hard to see why they are in a privileged position. Why isn't the customer even given the option?

To give the customer the option of not paying for unit tests would be to give them the option of not getting a working system quickly and cheaply. You could do that, and the funny thing is that some customers might very well accept that option. An XP team could negotiate the use of each of the twelve practices, and teams seeking to introduce XP into a new environment do do that. One of the interesting features of XP that doesn't get talked about so much any more is the claim that the twelve are one possible highly optimized and minimal set of practices for the domain that XP addresses (whatever that turns out to be). If that's even close to being true, then no one of the twelve is optional.

Unit testing is indeed not customer-related. Neither is refactoring. Nor, for that matter, is writing code. An XP pair writes its code by writing unit tests, making them pass, then refactoring. To not do unit testing would hamper refactoring, which would make the code less resilient, which would ultimately increase cost for the customer, both through increased re-work and through having to do other things, like write documents and review them, instead. Unit testing is not an axiom, it is an enabler that has been observed to enhance quality and productivity, and developer motivation. There's no reason, a priori, to think that there can't be a process that reaps the same kinds of benefits as XP, but uses frameworks or whatever, and not XP style unit testing. It just wouldn't be XP. Which would be ok. -- KeithBraithwaite

My understanding of an XP project is that nearly every decision is customer-related because it is all about giving customer value. Even the simplest of decisions, say adding a logging framework, must be approved by the customer because it is the customer that decides what has value to them. Given the effort required to create and maintain unit tests it would be consistent to ask the customer if this practice has value for them and if they are willing to pay for it. To not even ask seems very inconsistent with the very deep rule of customer value. -- AnonymousDonor

Every decision about what functionality to add, certainly. About what development tools to deploy and when? Maybe not. Imagine you had commissioned a master carpenter to build some furniture for you. You might very well expect to make the call on whether the carcass of the piece be solid rosewood, versus oak with a rosewood veneer, either being adequate for your purpose. Would you expect to go into the carpenter's workshop and start choosing the tools they should use? Only in the most unusual circumstances, I'd suggest.


Well, this is the essential tension of engineering. There are things of value to be built and building them is sufficiently challenging so as to create a separation between customer and engineer. When software is really simple, then customers are their own engineers. Witness the spreadsheet phenomenon. Otherwise, there is a division of roles and a trust that customers are capable of knowing what they value and engineers are capable of building that at low cost. Each role has its value system, and then there is a third region in value space where customer and engineering values overlap. The best we can do is nudge people back into role when they're off in the weeds, and tolerate lost wanderers (especially if they have money).

Unit tests are valued by engineers when they help keep costs low and engineering effectiveness high, and are valued (through value composition) by savvy customers, too.

Frameworks are different. They can add value, but harkening back to the essential control problem, frameworks are expendable while fine grained testing really isn't.

-- WaldenMathews


What is "internal quality" and how does this differ from external quality?

If we take Weinberg's definition of quality as "value to some person", then internal quality is what makes the developer value the code, external: the customer. The customer needs the developers to value the code.

Careful, "being of value" is different than "being valued." Should the code be of value to the developer?


Functional test and Unit test serve different purposes. One gives the developer confidence when refactoring, the other gives the customer confidence when planning.

XP Unit tests are devised, written and executed by developers as an aid to design, quality control, and an enabler for refactoring. They are written before, and are a driver for, the code that makes them pass. They are white-ish box tests. They prove the existence of features meaningful to the developers, and prevent errors becoming defects. XP "unit tests" are probably not tests at all.

Functional tests are devised by the customer, written and executed by at least the developers with a different hat on if not a separate team, as an aid to requirements gathering, progress tracking and quality assurance. They may be written some time after the code intended to make them pass. They are black-ish box tests. They prove the existence of features meaningful to the customer, and find failures from which defects may be diagnosed. Functional tests are definitely tests.
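
(To make the distinction concrete, here is a minimal sketch in Java, JUnit 3 style. The names -- Money, Store -- are hypothetical, invented only for illustration, not taken from any project discussed here.)

  // MoneyTest.java -- a unit test in the sense above: written first by the
  // developer, pinning down a design decision about one small class.
  import junit.framework.TestCase;

  public class MoneyTest extends TestCase {
      public void testAddingRespectsCurrency() {
          Money five = new Money(5, "USD");
          assertEquals(new Money(10, "USD"), five.add(new Money(5, "USD")));
      }
  }

  // PostSaleTest.java -- a functional test in the sense above: black-ish box,
  // phrased in the customer's vocabulary, exercising the system from outside.
  import junit.framework.TestCase;

  public class PostSaleTest extends TestCase {
      public void testPostingASaleUpdatesTheAccount() {
          Store store = Store.connectToTestInstance();   // hypothetical system entry point
          store.postSale("account-42", new Money(10, "USD"));
          assertEquals(new Money(10, "USD"), store.balanceOf("account-42"));
      }
  }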

I'll agree that the XP implementation of UnitTests and AcceptanceTests differ, but my question is should they differ? How does "requirements gathering" differ from being "an aid to design?" Should "features meaningful to the customer" differ from "features meaningful to the developer?" Isn't FunctionalTesting an enabler (at least) for refactoring across classes (as opposed to within classes)?

[Aside: Before the QaIsNotQc bunch gets riled, you may want to reverse the usage above. Quality Control is usually after the fact testing, while Quality Assurance would better fit with before the fact testing. This is probably not important to this discussion, I just want to prevent the discussion from becoming side-tracked.]

Just how strange XP-style unit tests are is discussed on TestFirstDesign. -- KeithBraithwaite


It's certainly up to the Customer to choose how to ensure quality. If they won't pay for unit tests, point out that without a credible alternative the project will take longer and cost more. (QualityIsFree). By definition this would take value away, not yield value.


Is there any proof for this assertion? If someone says years of experience show that without well-crafted frameworks a project will take longer and cost more, they are told in no uncertain terms that this is not so. You are told you cannot predict the future and the customer decides what has value. Without any proof, it seems reasonable that good system/functional tests will guarantee the quality the customer is interested in, not what the developer is interested in, because by definition the system test tests customer requirements. If these tests continue to pass then the customer has got value. Thus unit tests do not provide additional customer value and are not the simplest things that could possibly work. -- AnonymousDonor

Indeed unit tests do not provide additional customer value. They are an integral part of providing the customer with any value at all.

This is boring philosophy. Really, the question comes down to quantifying the cost of initial verification and regression. I'm sure there are papers on this. Anyone got them handy?

I'm sorry you find my comments boring. It's not (merely) philosophy, however:

The unit tests that XP developers write are not overhead. They also aren't tests, so a comparison with the cost of regression is neither here nor there. The comparison should be with the cost of doing code reviews on every line, plus producing design documentation, reviewing it, and updating it many times to reflect improvements in understanding throughout an iterative, evolutionary project.

Ok, then. The request from the "AnonymousDonor" was for data. Find the paper references. I'm sure they exist.

[ed: Note that I deleted my personal philosophy on unit testing because that's not germane to this section of the page. I think we ought to leave it blank until we can answer the question instead of redirecting it. Anything else is just redundant with the other hundred pages on unit testing on Wiki. Indeed, that's why it's boring. We've all read it before, done better, on pages like TestInfected which actually have traction, or even UnitTestsReconsidered which actually has some empirical results on it. Since this is Wiki, I think it's cheaper to link to it than to restate it.]


What is the rule for defining what is or what isn't overhead? What would be the justification for code reviews? Is the customer asked if they wish to pay for any of this? -- AnonymousDonor

Unit tests reduce cost. Should the customer be asked if he would prefer to pay more??

This is merely an assertion.


I think AnonymousDonor would like some EmpiricalEvidence for the value of XP-style Unit Testing in the presence of Functional Testing. It's a fairly reasonable request, and it's not surprising that the question comes down to contingent details, rather than a theoretical argument.

One key question is: "Do you get any benefit from having Unit Tests when you already have Functional Tests?" I think so, and I'll present some experience to that effect. It's not as good as a large-scale study, but it's much better than what I could produce by sitting around thinking.

We had some functional tests, which we started writing code to fulfill. We didn't write unit tests. We accumulated a modest-size heap of code. The functional tests turned up a problem. We found it difficult to figure out where the problem was - we knew where it manifested itself, naturally, but not where it started.

After a while, we thought we'd figured out where the problem was, and changed some code to fix it, but something else broke. One of our difficulties was in figuring out exactly what the correct behavior for some of the classes was supposed to be - not everything has an obvious dominant solution, and just who was supposed to do what in which weird case was (a) not blindingly obvious and thus (b) difficult to keep straight while thinking about the whole problem in context.

We wrote some unit tests. They nailed down the existing function of the classes to the point where we could easily see the main problem. They also revealed a couple other problems that the existing functional tests didn't catch, but that a user might actually encounter. The time spent to write the unit tests was significant -- there were some technical difficulties, which was why we skipped them in the first place. But it was much smaller than the time we spent (fruitlessly) searching for the root defect without them.

The benefits provided in this case by our Unit Tests over and above those provided by our Functional Tests were:

  * quick isolation of the root defect to a particular class, rather than a fruitless search through the whole heap of code;
  * a precise, checked record of what each class was supposed to do in the weird cases; and
  * detection of a couple of other problems that the functional tests missed but that a user might actually encounter.

Some more questions: Does anyone have any negative experiences with Unit Tests in the presence of Functional Tests? Or with Functional Tests that come later? If so, how and when were they written? Is the case where you have Functional Tests and you have the option (but not the mandate) to write Unit Tests common?

In the above example, is the problem one of FunctionalTests versus UnitTests or one of thorough tests versus non-thorough tests?

[Could everyone try to keep the attributions clear? If you don't sign or switch to italics, just set off your contribution with a horizontal rule. The anonymous texts have no relevant attributions because they are generic (cf. WikiNewspaperAnalogy). You can reduce most of this page to the SummaWay at this point, but really this page has gone nowhere further than the others. Perhaps it would be a good time to identify the UnitTesting pages and summarize the ideas in a logical and simple manner. Change of author is important to signal, especially since it usually accompanies a change in the direction of argument. Again, you don't have to sign, but do toss in a horizontal rule to make life easy on those of us who actually read. As for whether this page is worthwhile, you can always VoteWithYourFeet.]


One key question is: "Do you get any benefit from having Unit Tests when you already have Functional Tests?" I think so, and I'll present some experience to that effect.

There's a larger question of whether anecdotal evidence from experience is sufficient to justify pro arguments. In numerous XP-related conversations, arguments from experience have not been sufficient, so it seems reasonable that they are not sufficient to justify an XP practice either.


They also revealed a couple other problems that the existing functional tests didn't catch, but that a user might actually encounter.

Then the system test was incomplete? Theoretically, bugs not found by the system test are not important because they aren't directly related to customer functionality.


One of our difficulties was in figuring out exactly what the correct behavior for some of the classes was supposed to be - not everything has an obvious dominant solution, and just who was supposed to do what in which weird case was (a) not blindingly obvious and thus (b) difficult to keep straight while thinking about the whole problem in context.

One could say that more complete system tests, assertions/preconditions/postconditions, debug output, and a good debugger would counter these problems.
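
(For what it's worth, a minimal sketch in Java of the assertion/precondition/postcondition style mentioned above, using a hypothetical Account class; the expected behavior is recorded in the code itself rather than in a separate unit test.)

  // Hypothetical Account class illustrating preconditions and postconditions
  // via Java's built-in assert statement (enabled with the -ea JVM flag).
  public class Account {
      private long balanceInCents;

      public Account(long openingBalanceInCents) {
          assert openingBalanceInCents >= 0 : "precondition: opening balance must not be negative";
          this.balanceInCents = openingBalanceInCents;
      }

      public void withdraw(long amountInCents) {
          assert amountInCents > 0 : "precondition: amount must be positive";
          assert amountInCents <= balanceInCents : "precondition: cannot overdraw";
          long before = balanceInCents;
          balanceInCents -= amountInCents;
          assert balanceInCents == before - amountInCents : "postcondition: balance reduced by exactly the amount";
      }
  }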


The philosophy of RegressionTests to me is pretty simple - of course, this doesn't have much to do with XP, which has very different reasons for writing UnitTests than the usual case, but bear with me. I prefer driving tests, like FunctionalTests, because they accomplish more for less effort. However, if you find that you are frequently finding bugs that cause you to dig through the same many layers of abstraction repeatedly, it's cheaper to add the functional tests at a lower depth so you can reduce the amount of time it takes to locate the defect. This process reaches a limit at the point of a unit, at which point the FunctionalTests become UnitTests. Therefore, functional tests cannot replace unit tests because unit tests are functional tests. Moreover, testing at the level of a unit is more efficient if that unit is highly defective, frequently churning, or otherwise subject to frequent review. -- SunirShah
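
(By way of illustration of that sliding scale, a small hypothetical sketch in Java/JUnit 3; ReportServer and CurrencyFormatter are invented names. The first test drives the whole stack, and the second is the same idea pushed down to the unit that kept turning out to be the culprit.)

  import junit.framework.TestCase;

  public class ReportTests extends TestCase {
      // Driving test through the top of the stack: cheap to write, but a failure
      // here still has to be dug out of several layers of abstraction.
      public void testSalesReportEndToEnd() {
          ReportServer server = new ReportServer();
          String page = server.render("sales", "2006-10");
          assertTrue(page.indexOf("Total") >= 0);
      }

      // The same testing effort pushed down to the layer where defects kept
      // appearing; at this depth the functional test has become a unit test.
      public void testCurrencyFormatterDirectly() {
          CurrencyFormatter f = new CurrencyFormatter();
          assertEquals("$1,234.56", f.format(123456));
      }
  }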


The function in function(al) testing usually refers to functions that are user-accessible, such as "add an account", "rename account", "post sale". The use of this terminology is not uniform, and that's a problem we have to live with. The unit of unit testing is also not clearly defined. In an ironic twist, in procedural languages this unit is likely to be a function or procedure. So when you test that function in isolation, some would say you're doing function testing, others unit testing. Gawd.

I intuit that the moderator of this discussion is asking whether anything beneath the visibility of the customer warrants developer attention; specifically, should developers consume customer resources testing technical details that the customer cannot see? The moderator is invited to correct me if this misinterprets his/her concern.

In summary, the answer is yes, if developers are conscientious about delivering quality in its broadest sense: function, correctness, affordability and timeliness. Here is an analogy. When a house is built, the framing is inspected before the wall coverings are added. The customer doesn't care directly about where the studs are, but may well care if a wall buckles after a year, and may also care that the sheetrockers doubled their fee because the nailers were not where they were expected. Here's another analogy: when you vacuum the house, do you only remove the dirt that can be seen from normal trafficking? In both these cases, the apparent overkill isn't overkill in the bigger picture.

All software problems are problems of excessive complexity, i.e. things that a simple mental model cannot deal with. The problem with taking functional testing as your sole indicator of quality is that it offers no leverage against that complexity. In other words, functional testing is fine for verifying a system, not so hot for fixing it. Observable failures of a software system are likely to be highly tangled compositions of individual code behaviors. These can be backtracked, but not without analyzing code. You may think of unit tests as a pre-analysis, and as such, a preventive measure against that kind of tangling.

Without an appreciation of general systems concepts (see G.M. Weinberg for some excellent reading on the subject) it's hard to understand paradoxical advice such as SlowDownToSpeedUp. In this case, speaking of unit testing, it's SpendMoreToSpendLess?.

-- WaldenMathews


Walden just pointed out that the main problem we are having on this page centers around definition of terms, and it looks like the entire topic may be misguided. All the reading I've done regarding XP discusses the importance of both Unit Tests and Acceptance Tests, not "functional tests". They both test function, however unit tests are designed to test software function in a complete and regression-friendly way, while acceptance tests do the same for business functions. XP indicates that both are required to successfully carry out the project, and encourages them to be viewed as complementary, not exclusive artifacts. Unit tests are used to flatten the change curve by providing a fine-grained level of detail, allowing bugs introduced by changes to be found in the quickest way possible, while the acceptance tests ensure that the required business value is being delivered. If you only care about one of the two things, I suppose it's possible to use one without the other. My guess, however, is that if you did so, you would quite quickly find that you've caused yourself more trouble than it is worth. In addition, since unit tests are finer-grained entities, there are more of them. While you might be able to code up your acceptance tests at a later time, it may be very difficult to do this with the unit tests, as the code-base could be quite large. Perhaps the creator of this page might help with a clear definition of what he intended by Functional Test so we can move this discussion in a more constructive direction.

-- SeanMcNamara


I'm sorry for having caused so many misunderstandings. In the citation that is at issue here, I assumed that the AnonymousDonor used functional and acceptance tests as synonyms (as in FunctionalTests and EmbraceChange).
The contributions on this page were very helpful to me. Thank you everybody. -- DierkKoenig


I see functional tests as developer-written tests used for TestFirstDesign, but implemented at a higher level than the class level. Functional tests may be a hybrid of AcceptanceTests and UnitTests, serving as the basis of development and derived directly from user requirements. I believe the main assumptions leading to CanFunctionalTestsReplaceUnitTests are:

  1. Functional tests take significantly less time and effort to develop than UnitTests,
  2. Functional tests provide equal test coverage, and
  3. UnitTests do not significantly decrease fault isolation time.

Another, equally important, issue is:

Functional tests are not affected by refactorings across classes, while UnitTests may become broken.

I think the question is: do programmer-written tests need to be done at the individual class level? I find that isolation at the class level often requires far more effort than the benefit provided. Testing at a higher level verifies operation of the class within its target environment without having to emulate the target environment.
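
(A small hypothetical sketch of that trade-off, in Java/JUnit 3; TaxCalculator, RateSource and Invoice are invented names. Isolating the class means emulating its environment with a stub, while the higher-level test exercises it through its real collaborators.)

  import junit.framework.TestCase;

  public class TaxTests extends TestCase {
      // Class-level isolation: the test must emulate the target environment
      // by stubbing the collaborator the class normally depends on.
      public void testCalculatorInIsolation() {
          RateSource stubbedRates = new RateSource() {
              public double rateFor(String region) { return 0.05; }
          };
          assertEquals(5.0, new TaxCalculator(stubbedRates).taxOn(100.0, "NY"), 0.001);
      }

      // Higher-level test: the calculator is exercised through the real
      // Invoice code, so no emulation of the environment is needed.
      public void testCalculatorThroughInvoice() {
          Invoice invoice = new Invoice("NY");
          invoice.addLine("widget", 100.0);
          assertEquals(105.0, invoice.total(), 0.001);
      }
  }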


See DoBothUnitAndAcceptanceTests


Having used both and gone through the same kind of confusion of terms, my team came up with the following definitions:

1. Unit Test: tests the behavior of a single class, without dependencies on the external system environment. We did support limited use of external file resources and the like to allow data-driven unit tests.

2. Functional Test (or System Test): tests a function of the software either at an interface point (APIs, internal engines, etc.), or at a user-visible point (Web data entry page). Functional tests generally required system environment setup (database, application server, etc.), and often made assumptions about data in a particular state. This makes them more brittle than unit tests by far (as defined), but still automated.

3. Acceptance Test: a QA or customer level test (can be automated or manual) that exercises a part of the specification. Example "Add an account" or "Generate a billing report". When automated, these look just like functional tests.

-- DaveChurchville?
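
(A minimal hypothetical sketch of definitions 1 and 2 above, in Java/JUnit 3; LateFeePolicy, Database, ReportEngine and the fixture file are invented names. The unit test touches nothing outside the class under test, while the functional test needs a configured environment and seeded data, which is what makes it the more brittle of the two.)

  import junit.framework.TestCase;

  public class BillingTests extends TestCase {
      // Definition 1 -- unit test: a single class, no external environment.
      public void testLateFeeCalculation() {
          assertEquals(150, new LateFeePolicy().feeFor(30 /* days overdue */));
      }

      // Definition 2 -- functional test: needs a database and data in a known
      // state, so it is far more brittle, but still automated.
      public void testGenerateBillingReport() throws Exception {
          Database db = Database.connect("jdbc:hsqldb:mem:billingtest");
          db.load("fixtures/accounts.sql");
          BillingReport report = new ReportEngine(db).generate("2006-09");
          assertEquals(42, report.lineCount());
      }
  }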

