Large Scale Equals Failure

There are large problems, and then there are large solutions, and then there are large projects. It turns out there is not too strong a link between them.

At XP2002, JimJohnson? of the Standish Group showed a very interesting table (on page 7 of http://www.xp2003.org/xp2002/talksinfo/johnson.pdf) BrokenLink relating the likelihood of a project succeeding to the initial budget of the project, based on a bi-annual survey of the industry. The likelihood of success drops to zero at around 10 million dollars.

He gives an illustration (also in that presentation). The states of Florida and Minnesota set out to implement essentially identical information systems for their child welfare programmes. Minnesota hired 8 people who took two years to build the system and delivered on time and on budget (1.1 million dollars). Florida hired over 100 people, budgeted 32 million, still haven't finished twelve years later, and expect to deliver in another three years (2005) for a total cost of 230 million. Draw your own conclusions...
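
Some back-of-the-envelope arithmetic on those figures (the head counts, durations, and budgets are as quoted above; the per-programmer-year costs are my own rough derivation, and Florida's staffing is assumed flat at 100 people, which is a guess):

  # Rough comparison of the two projects, using the numbers quoted above.
  projects = {
      "Minnesota": {"people": 8,   "years": 2,  "cost": 1.1e6},
      "Florida":   {"people": 100, "years": 15, "cost": 230e6},  # "over 100" people, 12 + 3 years
  }

  for name, p in projects.items():
      programmer_years = p["people"] * p["years"]
      cost_per_py = p["cost"] / programmer_years
      print(f"{name}: {programmer_years} programmer-years, "
            f"${cost_per_py:,.0f} per programmer-year")

  # Minnesota: 16 programmer-years, $68,750 per programmer-year
  # Florida: 1500 programmer-years, $153,333 per programmer-year
  # -- roughly 200 times the total cost for essentially the same system.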

You're not seriously trying to compare Minnesota with Florida, are you?!? Is this a crude attempt at a joke?


On MethodsVsCodeFragments, SunirShah wrote:

I would contend that LargeScaleEqualsFailure. That is, if your project has reached the crossover point where higher compile time reduces quality of the product significantly (compared to maintenance time, refactor time, comprehension time), your project is toast. It has too much cruft, it isn't modular enough, it is too big. The largest project I worked on was 350 000 lines of code and only got that large due to failure.

... a pretty interesting thought, and one I haven't seen expressed clearly anywhere else here. What reasons would we have for this? The only thing I can think of, offhand, is that if a project gets really big and you can't break it up into smaller projects that interact with one another, then your design thinking probably isn't very clear for some other reason anyway. Sunir will no doubt have something much more insightful to add ...

It's not unimaginably impossible for one person (say, a Buddha) to singlehandedly maintain one billion lines of code. Like many pages on wiki, the title is untrue if we are strictly staying within the pointless bounds of true and false. However, we can assume that for any given team of people, there is some finite limit to the complexity of project they can handle. I contended above there is also a technological limit set by your tools; if your tools can't handle a billion lines of code (tractably), then the project will also fail. Thus, the question being asked is what you can do to balance the complexity that any one team handles so that no team is overwhelmed, and the entire system of projects proceeds efficiently--and optimally. -- SunirShah

Your tools are part of development. You can make and/or change tools as needed. They aren't a limit. I believe we have the hard disk space to handle a billion lines, and the build system can be scaled as needed because we made it that way. As for development, we understand what we need as we go, write tests, etc.

Clearly, since large projects have worked and are working, the human problem can be solved. But it's not easy. Never said it was easy. Working at larger scales takes better humans, much like the difference between an amateur team and a pro team. In a large project you need pros. -- AnonymousDonor


LargeScale and Monolithic programs will eventually collapse under their own weight. -- DirckBlaskey

I'd buy this, given a suitable definition of monolithic :-) --AnonymousDonor

  Main Entry: mono·lith·ic
  Pronunciation: "mä-n&l-'i-thik
  Function: adjective
  Date: 1825
  1 a : of, relating to, or resembling a monolith : HUGE, MASSIVE 
    b (1) : formed from a single crystal <a monolithic silicon chip> 
      (2) : produced in or on a monolithic chip <a monolithic circuit>
  2 a : cast as a single piece <a monolithic concrete wall> 
    b : formed or composed of material without joints or seams 
        <a monolithic floor covering> <a monolithic furnace lining> 
    c : consisting of or constituting a single unit
  3 a : constituting a massive undifferentiated and often rigid whole 
        <a monolithic society> 
    b : exhibiting or characterized by often rigidly fixed uniformity 
        <monolithic party unity>


LargeScale simply requires a different approach. You can't apply the same approaches or criteria you do in the small. 350,000 lines of code is small. If your project is failing at that scale then it's because you haven't managed it well; there's nothing inherent about the size. -- AnonymousDonor

Projects fail for two reasons. Either they are technically impossible to achieve (or intractable) or, in the vast majority of cases, because they aren't managed well. Obviously a project that failed because it was > N lines of code failed due to mismanagement. Hence, we reject the conclusion as tautologous.

On the other hand, for some N <= 350 000, N is not large. But have you ever graphed the number of lines of code against time for a project? It grows exponentially. And have you ever graphed the number of lines of code deleted against time for a project? It shrinks exponentially. Do you not think there is some value of N where the rate of project failure exceeds the median (*) failure rate of all software projects? -- SunirShah

(*) We can't say 50% because 90% of software projects fail already. If we did, the value of N would probably be around N=10.

Speaking of tautologous, since so many projects fail it doesn't really matter what criteria you pick for failure, because all will be highly correlated. Correlation is not causation. Your argument relies only on correlation; there's nothing meaningful in it. --AnonymousDonor

The 5ESS project at Bell Labs had 10,000,000 lines of code and several thousand programmers. They produced version after version and made a lot of money year after year. It was a very successful project.

MS Word, Excel, Windows 95/98/NT, etc are each a few million lines.

So, it is false that large scale projects are doomed to failure. Some people know how to make them succeed. -RalphJohnson

I can't speak about Bell (*), but I know that isn't true when it comes to Office and Windows 9x. Microsoft a long time ago discovered that it was having difficulty maintaining large gloms of code, so it went to components with COM. MS Office is a collection of individual COM objects. Collectively, the entire thing is millions of lines of code, but that doesn't mean that the entire system is one project. [It's worse too, because MS Word went from 2 MLOC to 25 MLOC in one version (but don't take my word on that; I need to find a reputable source).] It would be like claiming the Internet is a trillion lines of code with millions of developers, so clearly large projects can succeed. The argument here is not about how complex a system humans can develop, but how large a system a single person or team (**) can control. It's not the case that the development of MS Office and MS Windows is controlled and consistent--just look at the horror of the COMCTL32 version fiasco.

(*) The problem with analyzing monopolies like Bell and Microsoft is that they can afford to throw large amounts of resources at a problem to solve it, and then extract even larger amounts of resources from the marketplace to pay for it. While I don't doubt either company can manage large codebases, they are probably uniquely positioned to do so.

(**) There is a technical distinction between a team and a group. Teams are one cohesive organizational unit. They sit together, work together, and think together. They work with each other from one task to another. A group is just a conglomeration of people, defined by one task, and lasting only as long as that task exists.

-- SunirShah

[The reason Microsoft succeeds with very large projects (NT is currently well beyond 10 MLOC) has nothing to do with it being a "monopoly", or even just "large". Microsoft is able to maintain NT for several core reasons (in priority): test-driven "health" metrics (the entire code base is built, installed, and tested every single day), mature source control tools, and highly disciplined division of the product along functional lines. I was an NT engineer for 3 years -- the NT team understands very well the dangers of managing Very Large Projects. -- arlied@exmsft.com]

And perhaps this gets into how you define "failure", as well. Many would argue that Microsoft products do fail when looked at in terms of quality, as opposed to sales. The two are different, though of course quality is much harder to quantify.


moved from FactoringProjects?

Software projects, especially when they are large and complex, have to be divided into workable "chunks". These chunks are worked on by software developers, usually fairly small teams. If the coupling between the chunks is sufficiently weak, the chunk developments look like individual program developments; otherwise NOT.

I looked at the data used to generate COCOMO and studied it from the standpoint of programmer productivity measured in terms of delivered lines of source code at the end of the project. Nominally, a single programmer delivered about 10,000 DLOC per year. As team size grew, the productivity of the individual programmer fell, so that per-programmer productivity on the project fell roughly as the inverse square root of the number of programmers on the total project.

For rough estimation purposes, I concluded that if I wanted to model the effort on a project I could estimate programmer productivity as P/sqrt(n), where P is the productivity of a single programmer and n is the number of programmers on the project. This is the estimate without factoring/partitioning of the project's work.

If the work could be divided successfully (it can't; this is an ideal illustration) into n independent, weakly coupled chunks, then each programmer's productivity would be P in DLOC/yr. So the best-case estimate of project effort, in the most ideal case, would be DLOC-Project/P programmer-years. The worst case would be DLOC-Project * sqrt(n)/P. The actual figure will lie somewhere between, depending on the quality of the factoring. --RaySchneider
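
A minimal sketch of that rough model (my own illustration; the DLOC, P, and n values below are made up, not taken from the COCOMO data):

  from math import sqrt

  def effort_bounds(dloc, p, n):
      """Bounds on project effort in programmer-years under the model above.

      dloc -- delivered lines of source code for the whole project
      p    -- productivity of a single programmer (DLOC per year)
      n    -- number of programmers on the project
      """
      best = dloc / p                # ideal factoring: everyone works at P
      worst = dloc * sqrt(n) / p     # no factoring: each programmer works at P/sqrt(n)
      return best, worst

  # Illustrative numbers: 500,000 DLOC, 10,000 DLOC/yr per programmer, 25 programmers
  best, worst = effort_bounds(500_000, 10_000, 25)
  print(best, worst)   # 50.0 programmer-years best case, 250.0 worst case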

Wouldn't any calculation based on lines of code be essentially invalidated if you practice refactoring? As soon as you start refactoring, the lines of code in your source begin to vary dramatically. I would think that the use of lines of code as a measure is only valid if each line of code is written once.


It's not necessarily the scale that is the problem. It is usually the feature interaction that is the problem. A large system of isolated parts would be much simpler than a large system with a high degree of interaction. Unfortunately, over time, features tend to interact more and more, which leads to collapse.


There's a lot of begging the question on this page. For instance:

1) What is a "failure"? Any excess over planned schedule/budget? Significant excess (>20%, say)? Cancellation? Reduction in scope (and how much)? Commercial acceptance?

2) What is "large scale"? I've worked on numerous projects >350k LOC (though less than 1 million) that shipped and were commercially successful (some of them were even on time). Yet the industry routinely tackles projects with tens and hundreds of millions of LOC, and develops working software at that scale--the notion that 500k is "large scale" is patently laughable to many.


Examples:


See Also: SystemSizeMetrics

CategoryMetrics

