But how much work the software does is not what makes it remarkable. What makes it remarkable is how well the software works. This software never crashes. It never needs to be re-booted. This software is bug-free. It is perfect, as perfect as human beings have achieved. Consider these stats: the last three versions of the program -- each 420,000 lines long -- had just one error each. The last 11 versions of this software had a total of 17 errors. Commercial programs of equivalent complexity would have 5,000 errors.
The article goes on to discuss what the team does to ensure the software is this good and why it's so absolutely critical that it is flawless. The answer to "why" is fairly obvious. Think of a multi-ton spacecraft lifting off and shooting into space. Think of a spacecraft moving at 17,500 miles per hour and needing to adjust course precisely to hit a re-entry point. Think of the crew sitting inside. Wow! It gives you a whole new perspective on what it could mean if you have bugs in your code.
I agree with the article that software development seems to be in the "cave man" phase of evolution. I see a lot (and I mean a lot) of buggy code and bad SQL. And the prevailing attitude is usually "log a defect... we'll prioritize it and get it fixed when we can." What if the code that controls the space shuttle were treated that way?
I get that having bugs in the software that runs our businesses doesn't mean life or death. I also understand that the speed with which new features are delivered is often more critical than delivering them "bug-free" or, for that matter, knowing they will perform well! But knowing that it is possible to create code of this quality makes me wish (even more than I already do) that everyone approached the software development process with greater care, not just the developers of code that must be bug-free or people could die.
What if a bug in the code you wrote meant someone died? What would you do differently? While I know it may be a poor analogy, this makes me think of the "2-minute offense" in football. The team has been playing poorly until the last 2 minutes of the game. Then they go into a hurry-up offense and really push themselves to score. My question has always been: why don't they just play all out for the rest of the game?
That seems similar to how software development is approached. The job gets done, often not very well, but everybody goes into 2-minute mode when the code goes into production and people start screaming about bugs and performance. If the screams are loud enough, things get fixed. It just seems that it could be done so much better to start with.
I understand the arguments for why it doesn't happen "the right way" first. But reading this article makes me marvel at how cool it would be to be part of a team that looks at its code as something that MUST NOT have bugs... ever. And they create the processes that support achieving perfect code. I'd think those folks really like what they do and take great pride in what they develop, knowing that their diligence and good work keep people alive and keep our space program running.
Cool. Very, very cool.