(from TimeForaParadigmShift)
A simple faith: Science has a simple faith, which transcends utility. Nearly all men of science, all men of learning for that matter, and men of simple ways too, have it in some form and in some degree. It is the faith that it is the privilege of man to learn to understand, and that this is his mission. If we abandon that mission under stress we shall abandon it forever, for stress will not cease. Knowledge for the sake of understanding, not merely to prevail, that is the essence of our being. None can define its limits, or set its ultimate boundaries. Vannevar Bush (1890 - 1974), U.S. electrical engineer, physicist. Science Is Not Enough, The Search for Understanding (1967).
Upon the shoulders: In science men have discovered an activity of the very highest value in which they are no longer, as in art, dependent for progress upon the appearance of continually greater genius, for in science the successors stand upon the shoulders of their predecessors; where one man of supreme genius has invented a method, a thousand lesser men can apply it. Bertrand Russell (1872 - 1970), British philosopher, mathematician. A Free Man's Worship and Other Essays, ch. 3 (1976).
What if it were not the field of software development, but the field of software development improvement, that needed a paradigm shift?
I know of two ways to improve software development tools and practices:
1. Conduct controlled, repeatable experiments and share the results.
2. Trial and error: try new things and keep what works.
Pro experiment comments
I like your use of the words "software development improvement"; it expresses the point I am trying to make with this page more precisely. It is about the whole way of thinking about improving the way we use computers, and about how we think about and acquire new knowledge. The extreme concept of WorkingInPairs is a form of collaboration, which I believe is a powerful idea, perhaps the most powerful to come out of the extreme community.
"To do experiments to determine how best to do this" - good suggestion - some steps may be underway - Would you consider the work in ExtremeProgramming a software development improvement experiment of sorts?
The existence and use of this Wiki: could this also be an experiment in communication and collaboration? While neither may fit a laboratory definition of an experiment, both may, in the broadest sense, be "experiments".
Certainly more directed and specific experiments are in order, but what form could they take? Who would do them, and who would pay for them?
Stan, I think you have some common misconceptions about what scientific experiments are for. First off, I believe your statements above form a false dichotomy: your first point addresses experimental method, while your second addresses one of the goals of research; all research is 'trying new things' to some degree. We do controlled, repeatable experiments because they are orders of magnitude more effective than anything else. However, experiments don't 'prove' anything; that is not their purpose. Without sharing of results, there is no science. A move away from good experimental method just means you will be much less likely to find true relationships: you may end up with effective models, or you may not, but you won't know why. If you refrain from sharing conclusions, you are taking a step back into the pre-scientific era of knowledge, not a step forward.
In my opinion, the failure to improve software development practices by leaps and bounds is primarily due to a single cause: trying to engineer without sufficient underlying theory. I don't believe that your 'third way' exists; systematic methods of knowledge improvement will always require reproducibility, and control is necessary for any successful experiment. But we are perhaps focusing too much on the word 'experiment'.
It seems that many view C.S. as a branch of mathematics, where no experimentation is needed. I believe this is false: some amount of experiment is needed. However, the vast majority of practitioners don't know how to conduct an experiment properly. On the other hand, we have the 'software as an engineering discipline' camp, who often seem completely oblivious to the massive gaps in their underlying knowledge. Some of these people have "cargo-cult engineering" stamped all over them: pretending that process alone will get you over gaps in fundamental knowledge.
So we have two problems. On one side, we have people pretending to be engineers in the same sense that, say, a mechanical engineer is one, which is not possible today. That doesn't mean there are no valuable lessons and techniques to be learned from the engineers, but they must be coupled with an understanding of what makes engineering work (and when it fails).
On the other side, we have the computer science community. There are huge gaps in the understanding of C.S.; these need to be delineated and communicated to the outside world. There are many areas of current C.S. that demand experiment, but many CS people either don't believe this or do it in a half-baked way, never learning how to do it properly (e.g., it isn't taught in many CS graduate programs, as it would be in, say, physics). Many people with little formal CS training (or perhaps a bachelor's) end up trying to conduct experiments where they see the need, but they often don't know how to go about it either. Furthermore, CS people have to start doing science again. In North America at least, many of them have stopped. The influx of university-as-a-business and similarly broken ideas, plus the waxing of IP laws, has led researchers to play their hands close to the chest. Every time I read a 'peer reviewed' article that does not contain enough information to reproduce the work, I shake my head. How can someone claim to have 'reviewed' it? In mathematics, the idea of leaving out details like this would be laughable.
Bottom line: much of the area I think you are covering with the statement 'where they don't apply' has a much simpler explanation: the work has never been attempted.
Pro trial and error or third way comments
You may want to look into the science of Sociology. A lot of Sociology is performed without controlled experiments, but rather by systematically observing social behaviour -- and using statistics to make and test hypotheses. Indeed, much of traditional software engineering research uses the methods of Sociology (e.g. the CMM, WattsHumphrey, and people of that ilk).
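As a concrete sketch of that observational approach: compare a characteristic recorded on projects that happened to differ in practice, with nothing controlled. The question and the numbers below are invented for illustration, and the sketch assumes Python with SciPy available; it is not a study, just what "statistics on observed behaviour" looks like in miniature.

  # Hypothetical observational comparison: defect densities recorded on
  # projects that happened to use code reviews vs. projects that did not.
  # Nothing was controlled; we only observed what teams already did.
  from scipy.stats import mannwhitneyu

  with_reviews = [2.1, 3.4, 1.8, 2.9, 2.2, 3.0, 1.5, 2.6]      # defects/KLOC
  without_reviews = [3.8, 2.9, 4.1, 3.5, 5.0, 3.2, 4.4, 3.9]

  # A rank-based (nonparametric) test, since small samples of defect
  # rates are unlikely to be normally distributed.
  stat, p = mannwhitneyu(with_reviews, without_reviews, alternative="two-sided")
  print(f"U = {stat}, p = {p:.4f}")

  # A small p says the groups differ; it does not say reviews *caused*
  # the difference -- team size, domain, and experience are all confounded.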
On the other hand, I don't think the paradigm shift here is in the methods of improving software development, but in the underlying assumptions about what is wrong with software development, and hence about what needs to be improved. Traditional methodologists assume that because the cost of change increases exponentially over time, we need to control change. XP pulls out the Aikido approach and says that instead we need to be able to roll with change as efficiently as possible.
Let me jump in and make a few comments.
First, I am not sure "controlled, repeatable" experiments are valuable in software. Once you make the experiment controlled and repeatable, it is no longer possible to generalize the result. Making a statistically significant measurement requires a large number of trials; trying to recall my college statistics, I believe you need 20 to 30 results each for the control and for each variation. To be repeatable, you must have a fixed population of people to draw from and a single or fixed set of software programs to run the test with. To be feasible to complete, you are restricted to testing single-person programs that can be finished in a couple of days at most. Will these results provide insights into multi-person development that goes on for years?
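That recollection can be checked with a standard power calculation. The sketch below assumes Python with statsmodels, and assumes a fairly large effect size; both are assumptions for illustration, but the answer lands in the same 20-30-per-group range.

  # Back-of-envelope sample size for comparing a control process against
  # one variation, via power analysis for a two-sample t-test.
  from statsmodels.stats.power import TTestIndPower

  n = TTestIndPower().solve_power(
      effect_size=0.8,  # Cohen's d; a "large" effect -- an assumption
      alpha=0.05,       # conventional significance threshold
      power=0.8,        # conventional 80% chance of detecting the effect
  )
  print(f"about {n:.0f} subjects per group")  # roughly 26 per group

  # Smaller, arguably more realistic, effects are far worse: d = 0.3
  # pushes the requirement to roughly 175 subjects per group.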
Second, most talk about software development improvement focuses on improving the management of software development, and rarely on improving the development itself.
Any thoughts?
In "controlled, repeatable" experiments, the degree of control is tailored to the situation. Think of how little you have to control to demonstrate that the boiling point of water is 100 degrees C at sea level. Is the reaction to "controlled, repeatable" on this page coming out of an assumption of over-control, I wonder?
Look at what is restricted in this experiment: atmospheric pressure (sea level), presumably the volume for expansion (near infinite), and the purity of the water, if you get 100 degrees C as the result. The boiling point of pure water is a complex function, and when you start throwing in impurities, the actual boiling point for a sample of water is almost impossible to predict. Using 100 degrees C as the boiling point of water is a useful approximation in many instances, but it is not helpful when cooking on mountain tops or in pressure cookers.
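To make that point concrete for just one of the restricted variables, here is a minimal sketch of the pressure dependence alone, using the Antoine equation for pure water (constants from standard tables; impurities are ignored entirely):

  import math

  def boiling_point_c(pressure_mmhg):
      """Boiling point of *pure* water via the Antoine equation.

      Constants are valid for roughly 1-100 degrees C; impurities and
      dissolved gases are not modeled at all.
      """
      A, B, C = 8.07131, 1730.63, 233.426  # standard Antoine constants
      return B / (A - math.log10(pressure_mmhg)) - C

  print(boiling_point_c(760))   # sea level: ~100.0 C
  print(boiling_point_c(630))   # ~1600 m up a mountain: ~95 C
  print(boiling_point_c(1500))  # pressure cooker: ~120 C (extrapolating
                                # past the constants' stated range)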
To relate this back to software, it is not just a matter of creating a "controlled, repeatable" experiment, but also of the degree to which the results of the experiment can be generalized. Results may not apply outside the bounds of the experiment.
Precisely, and yet too precisely at the same time. And hence the problem, and so the answer to my question, apparently, is YES.
The boiling point of water is presumably a continuous function of its (continuous) variables (the level of each type of impurity, the volume for expansion, the atmospheric pressure, etc.), so with a few controlled, or at least closely monitored and recorded, experiments we can use the results to predict outcomes in a range of cases between those tested (via interpolation) and beyond them (via extrapolation).
Software (and software development processes) are more complex: the important characteristics (performance, availability; productivity, quality) are functions of huge numbers of variables, the variables may be discrete rather than continuous, and the characteristic functions may not be continuous in those variables.
This makes interpolation harder, and extrapolation much harder.
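A tiny illustration of the contrast, with made-up numbers and assuming Python with NumPy: interpolation is trivial over a continuous variable like pressure, and meaningless over a discrete one like the implementation language.

  import numpy as np

  # Water: boiling point varies smoothly with pressure, so a handful of
  # measured points lets us estimate everything in between.
  pressures = np.array([600.0, 700.0, 760.0, 900.0])   # mmHg
  boil_temps = np.array([93.5, 97.7, 100.0, 104.8])    # degrees C
  print(np.interp(650.0, pressures, boil_temps))       # ~95.6 C, never measured

  # Software: a variable like the implementation language has no
  # in-between. These productivity figures are invented for illustration.
  productivity = {"C": 1.0, "Java": 1.3, "Python": 1.6}
  # There is no point "halfway between C and Java" to interpolate at,
  # so each combination must be measured (or argued) on its own.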
This entire page is founded on an unspoken premise: that the software development process is a repeatable, improvable discipline (like engineering, say).
If you think that what we do is more akin to writing great novels, or coming up with new proofs in mathematics, or tending a garden and helping it grow in the most aesthetically pleasing direction, then all this talk of "controlled experiments" immediately starts to sound like nonsense.
Do you believe that the software process is repeatable and improvable? I suspect not, but I'd be happy to discuss this point.
-- AlainPicard (p.s. I was a professional scientist for quite a while, which is why this page interests me. Now my business card says "system architect", and my bosses think I'm a "software engineer", but I don't really know what it is I do on a day to day basis. :-)
Other comments
I still haven't digested all of the above, so I don't have anything intelligent to say about it yet - but the bit about "standing on the shoulders" moved me to speak:
In software, we tend to repeatedly build on top of the systems produced by others - who, in turn, built their systems on top of previous systems. We think we get incremental improvement this way.
Radical computer scientist ChuckMoore would disagree; he'd say we get horribly slow, horribly wasteful, horribly complicated systems this way.
The important "shoulders" to stand on are not the programs written by our predecessors, but the knowledge gained from those systems. And sometimes that knowledge should lead us to toss those systems overboard and start anew. (Moore seems to always start anew; AlanKay is wont to BurnTheDiskpacks.)
If "BuildingOnlyUpward" is a paradigm of current software development, maybe we need to shift to a CreativeDestruction paradigm.
-- GeorgePaci
George, you might be interested in the book TheStructureOfScientificRevolutions by ThomasKuhn. Kuhn talks about "normal science" and "revolutionary science" (PleaseComment: I'm not sure if these are the exact words he uses). Normal science is the BuildingOnlyUpward phase where people use the understanding of what is known to see how far they can push it. Revolutionary science starts to happen when the younger crowd starts to see inconsistencies. They start to come up with new theories to explain the inconsistencies, and when a good one is found, the KuhnParadigmShift begins. This seems to be analogous to your CreativeDestruction phase.
Perhaps the majority of this page could be moved to a place for discussing the nature and merits of the scientific method and experimental techniques? Interesting topic in and of itself, but slightly divergent from a discussion of what a paradigm shift might look like. -- StevenNewton
A KuhnParadigmShift starts with a phase (normal science) in which the mainstream theory is explored in enough detail that its flaws start to become apparent. The established scientists try to explain the flaws away by making slight modifications or additions to the established theory, but because the flaws are fundamental to the theory, this doesn't work. I think this is what Stan is trying to point out with this page. Because of the nature of software development, improvement efforts based on the old models of process improvement are fundamentally flawed. Hence the need for (or rather, the impending probability of) a paradigm shift in software development improvement.
Paradigm Shifts (Please add to the list)
Hmmm. The first two proposed paradigm shifts seem to be wisdom gained from years of coding that they don't teach in school. In other words, the newbies repeat the same mistakes over and over until they, too, gain wisdom. So a question the third proposed paradigm shift brings up might be: how do we CodifyAcquiredSoftwareWisdom?
I would question the third item. I don't believe you can have a systematic and methodical paradigm shift. Yes, we need systematic and methodical improvements, but major shifts do tend to come "magically" from an outside source.