Richard Feynman’s observations on the reliability of the Space Shuttle.
19 Apr 2013
I’ve been reading a lot on Richard Feynman lately. I find his character and his unique approach to learning appealing.
In the book “What Do You Care What Other People Think?”, he reminisces about his time on the Rogers Commission, which investigated the Space Shuttle Challenger disaster. The book contains his appendix to the report, which records Feynman’s personal observations: Appendix F – Personal observations on the reliability of the Shuttle
A few key points stood out to me that are relevant to how we build software.
- Becoming immune to small failures. NASA ignored minor errors, and modified their quality certification to account for these errors. NASA did this without investigating the systemic failures behind the errors.
- Treating past success as proof of safety. It didn’t fail in the past, therefore it will keep on working.
- Difference in culture. During the Apollo program, there was shared responsibility. If there was a problem with an astronaut’s suit, everyone was involved till the problem was solved. In the Space Shuttle era, someone else designed the engines, another contractor built the engines and someone else was responsible for installing the engines. They lost sight of the common goal. It was someone else’s problem.
- The Space Shuttle was built in a top-down manner (big design up front?), instead of in a bottom-up manner using parts that were known and proven to work. There was constant firefighting to keep it all working. The engines were rebuilt each time.
- His observations do appreciate the efforts of the software engineering team, though. Their testing was rigorous, and he wonders why other teams were not as good as them.