Pair Programming Costs Benefits

LaurieWilliams did some statistical experiments on her students as part of the doctoral thesis at the U of Utah. She and AlistairCockburn wrote a paper together on the PairProgramming costs and benefits, online at [http://members.aol.com/humansandt/papers/pairprogrammingcostbene/pairprogrammingcostbene.htm]. It will be presented at XP2000.

The short of the story from my perspective is that there are no notable costs and numerous notable benefits. Briefly, in her study, two people worked almost twice as fast as one (10-15% more work hours, 55% the elapsed time), but had 15% fewer bugs, and the code base was smaller, and the people enjoyed themselves more (statistically significant).

The 10% time overhead would easily be recovered in a test department, which she didn't simulate, or on the first defect reported from the field. Also, if you had 10 programmers and 10 projects to work on, you could get 5 of them out in about 55% of the time, creating significant market opportunity, while the other 5 would only come out 10-15% slower, so you could match your market opportunity gain of the first five to offset the lost opportunity cost of the second 5. --AlistairCockburn

I realize that's only a summary, Alistair, but could you elaborate a bit on the part where 5 of 10 projects come out in about 55% of the time (55% of what?) and how that makes up for the other 5 coming out 10-15% slower? It seems like all of them will come out slow meaning that you'd lose some market opportunity . But later projects would be less likely to be adversely affected by the quality of earlier projects. Isn't this a case where time is being traded in for quality? I've long suspected that being first to market was a bit overrated, but, in the cases where economics do actually indicate that it is advantageous to ship lower quality sooner, isn't pair-programming contra-indicated? -- PhilGoodwin

You'll probably want to follow the link [http://members.aol.com/humansandt/papers/pairprogrammingcostbene/pairprogrammingcostbene.htm] to read the words more slowly, but simply doing the math leads to an interesting result.

Say that PP has a total overhead cost of 10% work hours from the time the programmers start working until they say they are done. At that point the sw goes into external test, or gets shipped to the consumer, depending on the organization. Let's totally ignore any savings that come from reduced defects and just focus on the 10% extra work hours.

That 10% extra work hours means that the sw is ready in 55% of the calendar time compared to if it was developed by one person. Net gain, time to market, is 45%. Quite a huge time gain, with appropriately large market opportunities.

Let's say 2 programmers are assigned to projects (work-efforts, but let's say projects), each 10 months long. They start Jan 1. Projects A and B would each be delivered at the end of October.Enter PairProgramming.

Put the two together, focussing entirely on Project A. It now gets done mid-June, and the company starts realizing 4.5 months of income from its delivery. The two now start work on Project B. It now gets done at the end of November, instead of the end of October. The company loses 1 month's worth of income from it being late.

With a little bit of forethought, the company arranges for Project A to be the one that delivers more income in the 4.5 months it is ahead of due date, than Project B would lose for being 1 month behind.

Run this same exercise with a backlog of projects, say 10 people on 10 projects, and you get that 5 projects are bringing in income for 4.5 months, and 5 projects lose 1 month's income. A little bit of imagination should enable you or your executives to see how to tune this to maximize profit. I think this is very interesting, and would take advantage of it in a flash. --AlistairCockburn

Forgive me for not getting it, but it sounds like you are saying that you can get a job done more quickly if you put more people on it. Since that's not a particularly stunning result (except for how often it manages to not work in real life) I assume that there is something else that's going on that I'm missing. Also the strategy of choosing to work on projects serially rather than in parallel seems to be orthogonal to the issue of pair programming.

The key results seem to be that pair programming results in 10-15% longer lead times in development for similarly sized teams with about 15% fewer bugs, smaller code size (how much smaller, btw?) and significatly higher programmer morale. That seems to indicate that it is best suited for larger, marathon type, developments and that a more individualized style might still be more appropriate for sprint style developments especially when code quality (in terms of both bugs and maintenance) is of low importance. In other words small, panicked, startups that are about to lose their VC funding needn't try pair programming, but just about everyone else will probably benefit from it. -- PhilGoodwin

Phil, I have gone over the above paragraphs a couple of times and found some places where my fast typing was confusing. 10%-15% more work hours when two people program in pair programming is 45% less elapsed time.

"Since that's not a particularly stunning result (except for how often it manages to not work in real life)". This is a strange mixture of "that's not surprising, except it's not true." If it is true, given that you think it doesn't work in real life, then it is, indeed a stunning result.

"get a job done more quickly if you put more people on it" is a vaccuous generalization. I didn't make that generalization, nor do I agree with it. Being discussed here is PairProgramming, not Programming In Arbitrarily Large Groups.

"suited to marathon projects... startups needn't try it", that is actually the interesting part. Certainly, any company with two projects of differing priorities ought to do it, as you seem to accept. The question is, what about the .com companies in a crashing hurry?

The discussion seems to revolve around the following:

Program A needs to be demoed next Friday. If software were perfectly divisible, then two people could each be given half of Program A to develop, achieving maximum parallelism, and so they should be given separate parts of Program A. Repeat this subdivision, and you get the strategy that any number of people could be applied to make any project arbitrarily fast. This is the "more the merrier" strategy you refer to above. However, software is not perfectly divisible, as I think we agree on, therefore you do not get to keep subdividing. At some point the interfaces between the pieces starts costing time.

It's at that time that the serializing strategy I mention above comes into play. Put two people on the piece that really really needs to be done soonest. Count on it being done in 55% of the time one person could have done it, and decide which second piece can afford to be 10% later than it otherwise would. I'd further suggest that we can recoup the 10% based on reduced mods after the demo.

If you can't find those two pieces of software to stage in this manner, and can't recover the 10% time in reduced defect correction, then indeed, the extra time is lost time.I'm in favor of having this exact discussion on specific projects. --AlistairCockburn

The question above seems to be how efficient pair-programming is compared to two people collaborating in their usual ways. Suppose a task took a single person 40 hours. A pair-programming team might complete the task in 23 hours (46 person-hours). How long would it take an ordinary non-pair team of two people to complete the task? I would guess 25-30 hours (50-60 person-hours), depending on the task divisibility. This question does not seem to be addressed in the paper.

Good observation. I note, first of all, that in most companies I've visited, that people collaborate nearly not at all. Which means they are solo programming, and the experimental results apply.

: Putting on my PointyHairedBoss hat, this would mean that assigning two programmers as a non-pair would result in faster delivery. A 40-hour solo project would be doable by two people in 20 hours. Of course this fails badly for larger numbers of programmers, but it seems more reasonable for two people.

But... Here's a thought experiment. Divide up the program into two seemingly separable but interacting parts, put the two people sitting right next to each other, so they can look into each others' screens as needed. SideBySideProgramming, not PairProgramming. Now do the timings. Wonder what they'll show. --Alistair

: SideBySideProgramming seems like partial PairProgramming. I think a more useful or realistic experiment would place the non-pair programmers at separate desks in the same room. (One could also experiment with several distancing features like partitions, different rooms, or even telephone and email communication.)
: One of the benefits of PairProgramming might be reduced design time since the design does not have to be completely divided between the programmers. This can be a big factor if the job doesn't divide evenly--if one side of the split is much larger. After the smaller side-tasks are coded, the available programmer may not be able to help effectively on the partly-coded large task. Marginal designs are sometimes approved because they make it easier to divide effort. (We have three teams, so we'll do a three-tier architecture.)
: A similar benefit should be reduced integration time. Most programmers have experienced the sinking feeling from "I thought we were... (using a different structure, relying on the other side for error detection, etc.) Now I have to rewrite my code." PairProgramming should largely eliminate such surprises.
: Given the relatively small slowdown of pairing, and the common nasty effects of miscommunication, PairProgramming might be faster than non-pair groups, even before considering quality/debugging improvements.--CliffordAdams

I'm asking for help in understanding these ideas in FullyParallelProgramming. Help

"Being discussed here is PairProgramming, not Programming In Arbitrarily Large Groups."

Although the latter is interesting. I was quite surprised when XP was claimed to work with teams of 10 programmers, because 10 sounds like a lot to me. What I was missing is that they were working in pairs. They suffer roughly the same congestion as a conventional team with just 5 programmers. Using PairProgramming increases the maximum workable team size.

This is a significant result because some people would throw money at a project if it would help. The basic cause of the software crisis is that usually it doesn't help. -- DaveHarris

This is a very interesting study! Let me add a few comments.

If this experiment was re-run, could I suggest having the programmers take the task to completion, i.e., 0 detected faults? This would be a more real-world result. It would also take much of the speculation of the advantages of PairProgramming away; you would have actual numbers on the completion of development. I suspect the result might show fewer total hours for PairProgramming, but I can only guess.

I would be interesting to have someone observe the PairProgrammers and solo programmers in action to document how they spent their time. This would also go a long way in explaining why PairProgramming would be effective.

Finally, it would be interesting to see the raw data generated by this experiment. What was the distribution of time spent for the various programs? What was the distribution of errors for the programs? Were their certain errors which were only made by solo programmers? Errors only made by pair programmers? Errors common to both?

I truly appreciate the experiment to evaluate PairProgramming, and like all good experiments, I hope it will lead to additional experiments. Thanks for sharing this information!

WayneMack

Why does everyone say that the LaurieWilliams study shows that PairProgramming reduces bugs only 15%? If you look carefully at figure 2, you will see that the lone programmer's programs passed about 70% of the test cases, and the pair programmed programs passed about 85% of the test cases. For example, if there were 100 tests, the lone programmer would have 30 failures / bugs, while the pair programmers would have 15 failures / bugs. This is a 50% reduction in the number of bugs!

This is also supported in [http://www.pairprogramming.com/csed.pdf], in Table 1. There, the lone programmers' programs passed an average of 75% of tests, and the pairs' programs passed an average of 89.125% of tests. For every 100 tests, this means 25 failures for lone programmers, and 10.875 failures for pairs, which is a 56.5% reduction in the number of bugs!

DougKing