The Therac-25 accidents were among the first reported cases in which people died as a direct result of software errors.
The Therac-25 was a radiation therapy machine made by Atomic Energy of Canada Limited (AECL), whose medical division later became Theratronics International Limited. It was the successor to the Therac-6 and Therac-20, which were developed in conjunction with the French company CGR. Both the Therac-6 and Therac-20 had optional computer controls but were primarily designed to be operated manually, and as such had numerous hardware safety interlocks. The Therac-25 was designed to be entirely computer controlled, so AECL decided not to include the same hardware interlocks and relied on the software for all of this functionality.
Between June 1985 and January 1987, the Therac-25 was involved in six known accidents. Several were fatal, and even the non-fatal ones had gruesome effects, such as skin sloughing off and shoulders and hips becoming completely immobilized.
[I saw a photo of one of the victims who died from overexposure. On their back was a perfect circle of pure black skin (the patient was otherwise very light-skinned). The mind boggles at the dosage that would have been required to produce that effect.]
Someone want to summarize the major defects in the Therac-25 software?
The fundamental errors were caused by RaceConditions between various concurrent activities within the system. Basic techniques of concurrent programming were either not known or ignored. This was exacerbated by project management that did not put sufficient emphasis on analysis of the design or testing of the implementation.
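To make the race-condition point concrete, here is a minimal sketch of one standard defence. It is not a reconstruction of the Therac-25 software; the names (TreatmentState, Prescription, set_up_and_fire) and the "xray"/"electron" modes are assumptions made up for illustration. The idea is that the setup task works from a snapshot of the operator's entry, and the final check and the firing happen under the same lock that edits must take, so an edit made during the slow setup phase can never go unnoticed.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Prescription:
    mode: str      # hypothetical modes: "xray" or "electron"
    version: int   # bumped on every operator edit


class TreatmentState:
    """Hypothetical state shared by an operator-entry task and a setup task."""

    def __init__(self, mode):
        self._lock = threading.Lock()
        self._rx = Prescription(mode, 0)

    def edit(self, mode):
        # Every edit replaces the prescription and bumps the version atomically.
        with self._lock:
            self._rx = Prescription(mode, self._rx.version + 1)

    def snapshot(self):
        with self._lock:
            return self._rx

    def fire_if_unchanged(self, rx, fire):
        # The check and the action share one critical section, so no edit
        # can slip in between them. (fire must not call edit: the lock is
        # not reentrant.)
        with self._lock:
            if rx.version != self._rx.version:
                return False
            fire(rx.mode)
            return True


def set_up_and_fire(state, position_hardware, fire):
    rx = state.snapshot()
    position_hardware(rx.mode)  # slow step; the operator may edit meanwhile
    return state.fire_if_unchanged(rx, fire)
```

If the operator edits the entry while position_hardware is running, set_up_and_fire returns False and nothing is fired; the caller would then have to redo the setup from a fresh snapshot. The version counter is one design choice among several; holding the lock for the whole setup would also work, at the cost of blocking the operator's edits.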
Perhaps an even more fundamental error was the reliance on software interlocks (aka if statements) for what was previously done with hardware interlocks. The race conditions couldn't have killed without this extra level of trust in the software. Indeed, one of the bugs was also present in the Therac-20, but it didn't cause any damage there because a hardware interlock caught it. -- AdamBerger
[Additionally, the removal of hardware interlocks violates all recognized practices of electronics design. In any kind of control system you want sensors to tell the controller when an axis is near its limit. There also needs to be a switch or other cutout device that physically prevents the axis from crashing into the limit. For instance, a mechanical arm driven by a DC motor might have a switch with a diode across it at the end of travel. When the arm nears the end of travel, the controller slows the arm down and stops it before it hits the mechanical stop. If the controller fails, the switch cuts off the drive current, but the diode still allows the drive to be reversed so the arm can be driven away from its limit.
My first medical device gig was with H.G. Fischer (Fischer X-Ray), at the time they were developing their first computer-controlled X-ray system. The beta machine was at the University of Chicago Medical Center. The doctors wanted clean images regardless of the effect the imaging would have on the patients -- who were mainly indigents, homeless, and other poor folk. The beta machine had limits calculated for certain exposure values, and defaults of "none" once those bounds were exceeded. We couldn't believe it when the UCMC doctors complained that the machine was croaking at the high end of their studies, until we saw what kind of exposure values they were using. There were a few jokes circulating around Fischer about the X-ray imager/metal cutter we had provided them.
All mechanisms have limits. All mechanisms need those limits recognized by both the hardware and the software. If a mechanism's limits can't easily be translated into software, then the software needs safely low limits set into it before it is ever even tested.]
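The layered limit handling described above can be sketched roughly as follows. This is a toy model, not anyone's real controller: the Axis fields, the numeric limits, and both function names are assumptions chosen for illustration. The software layer slows and stops the arm short of the end of travel, and a separate function stands in for the limit switch with a diode across it, which blocks forward drive at the hard limit whatever the software asks for but still lets the arm be driven back.

```python
from dataclasses import dataclass


@dataclass
class Axis:
    position: float = 0.0
    soft_limit: float = 90.0   # software stops the arm here (assumed value)
    hard_limit: float = 100.0  # limit switch opens here (assumed value)
    slow_zone: float = 10.0    # distance before soft_limit where we slow down


def commanded_speed(axis, requested):
    """Software layer: ramp down near the soft limit, stop at it."""
    if requested <= 0:
        return requested  # moving away from the limit is always allowed
    remaining = axis.soft_limit - axis.position
    if remaining <= 0:
        return 0.0
    if remaining < axis.slow_zone:
        return requested * remaining / axis.slow_zone
    return requested


def drive_current(axis, speed):
    """Stand-in for the hardware: switch with a diode across it.

    At the hard limit the open switch blocks forward drive, while the
    diode still passes reverse drive so the arm can back away.
    """
    if axis.position >= axis.hard_limit and speed > 0:
        return 0.0
    return speed
```

In normal operation the arm never reaches hard_limit, because commanded_speed has already brought it to rest at soft_limit. drive_current only matters when the software layer is wrong or dead, which is exactly the case a software-only interlock cannot cover.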
For details, see:
Perhaps related to CategoryQuality, CategoryDesignIssues, CategoryLegal?
CategoryHardware, CategoryRealTime, TheCaseOfTheKillerRobot, FbiVirtualCaseFile