Gummi Bears Considered Harmful

During the ExtremeProgramming CommitmentSchedule, UserStories are estimated by developers. The estimates are adjusted by a LoadFactor, added up, divided by the number of developers, and used to estimate the project completion date.

During the IterationPlan, EngineeringTasks are generated from UserStories, and are estimated by developers. These are adjusted by a [different but related] LoadFactor, and used to determine what will be done in the iteration. Estimates are tracked by Tracker, and used to feed back into the LoadFactor.

At least two kinds of estimates have been used in Extreme Projects. Some teams estimate in IdealProgrammingTime or Perfect Engineering Days. Others estimate task size in arbitrary tokens such as Gummi Bears.

The primary advantage to Gummi Bears, aside from the obvious nutritional ones, is that the conversion factor between Gummi Bears and real time is arbitrary and harder for management to object to. They feel qualified to say "you should be able to get a real day's programming every day", but have no basis for saying "you should be able to do ten Gummi Bears a day, not just four".

This author wishes to challenge the token-based approach. Other Extremists and observers are invited to rebut and/or comment.

First, Gummi Bears are so arbitrary that they cannot be compared. Imagine that you and I are faced with a task. We do not yet know it, but we have in mind exactly the same implementation. We happen to program at exactly the same speed. Asked how many Gummi Bears the task is, you answer, of course, 7. I, however, answer that the task is 31 Gummi Bears.

It is important that the teams estimates all be comparable, because that's what makes the CommitmentSchedule estimates work.

It might be argued that the estimates will become comparable over time, but unless we prime the pump by saying something like "between you and me, a Gummi Bear is an Ideal Hour", we certainly won't start on the same value. If we do say that, management will become aware of it, and you might as well just use Ideal Time.

Second, Gummi Bears are inherently untrackable. When you track Ideal Time, you do it by asking developers how much real time they have in the effort so far, and how much they have to go. In their mind, Ideal Time means how long it would take if I were working on it. The developer will learn to do this part of the estimate better and better, and LoadFactor will converge on an estimate of how much time people spend programming, tending toward zero estimation error.

How do you track Gummi Bears? You can't ask how many bear-nibbles are in already, or how many there are to go, can you? Won't the developer just think "well, I'm about half done, I estimated 2 bears, therefore I have one in and one to go"? (This self-fulfilling behavior occurs in ideal tracking as well, but can be countered by asking developers to report actual time. What are actual Gummi Bears? They're little colored jelly candies, but that's not important right now.

Therefore, Gummi Bears: ConsideredHarmful.

You're right. I'm wrong. Back to explaining why we only do 3 hours of work a day. Sigh... It's basically true, but I hate explaining. OTOH, if they know someone who can do it faster than us, they should hire them. -- KentBeck

How about GummiBearSecrets?

Having been on a Gummi Bear project (and having put on some weight ;-), I notice an implicit assumption in Kent's arguments - that developers do not communicate with each other. Of course, each developer takes responsibility for her tasks, and is the only one allowed to estimate them. But, if developers talk to each other, they will probably talk about how difficult it is to estimate, exchange war stories, and tend to develop a consensus about how big a Gummi Bear is (here in Munich, they come in different sizes). IMHO, a Perfect Engineering Day is just as arbitrary a unit of measurement, in that it varies from developer to developer.

I also think that Gummi Bears are useful for their pedagogic value. Take a newbie, who's never done a hard day's work in his life, and ask him to estimate a task. Wouldn't it be easier if you told him to take an arbitrary unit that has no relation to time for him, and tell him to use that? Once he gets into a feedback loop of estimating and comparing, he will get a better feeling for the real time involved, and eventually the Gummi Bears can be taken out of the loop and used for their real purpose in life. -- JosephPelrine

Think about it, Joseph. At least the newbie knows what a day is. He hasn't a clue what a Techno-Gummi Bear is - because no one has! If it'd work for GBs, it'd certainly work for days, and then he would actually know how to estimate!

At least he knows that he doesn't know. The problem with using hours and days is that people think they know what they mean. You tell them something will take 3 days and they expect it 3 days later! You continually have to qualify yourself and say "Ideal Days, not real days". With the GB there's less opportunity for confusion. (PS I don't use GB myself.) -- DaveHarris

That would be the problem, but at least it's possible to get the estimate, whereas I haven't a clue on GBs. Maybe the experience at ChryslerComprehensiveCompensation is unique in some way? We just do the math and no one seems to complain. Wonder why ... -- RonJeffries

If management retains even a shred of a deliverable function, there's no reason not to simply turn the LoadFactor around. For example, if a manager often writes project proposals one could ask "how many perfect project proposal pages per day (P5D) can you produce on an ideal day when you are in the groove with no interruptions?" And then ask them why they can't produce five times that many pages in a single week. I don't see any reason why management shouldn't be able to relate to loadfactor if the "units" of work are appropriately mapped. -- AllanStokes?

Problem is, they will explain their LoadFactor is influenced by having to manage interruptions from people not doing the right thing (meaning: you), but don't see their interference in your activities as interruptions (they are after all doing their job, and once again it's your fault if it's to your detriment).

For the DoIt team, TrackRecord is the key to the credibility of estimates. When people object to (or question) the LoadFactor, DoIt can easily point to past successes where the estimates were bang-on. "In our experience, we have found that when we are working at our best it takes us approximately twice our IdealTime estimates to complete each item. If you require us to change the LoadFactor, we reserve the right to adjust our IdealTime estimates." It is usually a short conversation.

A more frequent and contentious issue is DoIt's preference for serializing the delivery of each ImpliedRequirement? (or group of UseCases). DoIt's customers frequently want to use a larger team and deliver several requirements in parallel. This pressure is always initially resisted, but it's a place where DoIt can give in. It means that DoIt has more risk (primarily integration and scheduling risk) to manage. But it's better than signing up for delivery times that aren't achievable. -- BillBarnett

How about using ActualGummiBearsForEstimation? -- KarlKnechtel

CategoryEvil