As discussion continued in the Operations Support Building, Mike Ryschkewitsch brought more than one hundred other engineers into the conversation by way of an e-mail thread. He recalls, “People thought I was doing my own e-mail, but of course I wasn't. I'm sending messages out to all of these guys, with things like ‘I just heard this; do you all agree?’ … I sent maybe two to three hundred messages in all. It allowed me to have my own equivalent of a back-room caucus … We are in the kitchen talking and having little arguments back and forth. Right at the very end of the very long day I put a poll out to everyone and said ‘Are we ready to go or not?’ ”
Commenting on the length and openness of the session, NASA chief safety and mission assurance officer Bryan O'Connor remarked, “Gerst [FRR chairman Bill Gerstenmaier] was absolutely open. He never tried to shut them down. Even though he could probably tell this is going to take a long time, he never let the clock … appear to be something that he was worried about.”
That is not to say that Gerstenmaier was indifferent to the launch schedule. Discovery was scheduled to deliver the final set of solar arrays needed to complete the International Space Station's electricity-generating solar panels, enabling the station to support an expanded crew of six. If STS-119 launched later than March 15, it would interfere with the March 26 mission of the Russian Soyuz vehicle to transport the Expedition 19 crew to the station, and would push back future U.S. launches. Late in the day, Gerstenmaier reminded the group of these risks to the Space Station program and the shuttle schedule. A few participants perceived his comments as pressure to approve the flight. Others saw them as appropriate context setting, making clear the broader issues that were part of their collective judgment. After speaking, he gave the groups forty minutes to “caucus,” to discuss what they had heard during the day and decide on their recommendations. When they came back, he polled the groups. The engineering and safety organizations and some center directors in attendance made it clear that they did not find adequate rationale to fly STS-119.
NASA manager Steve Altemus summed up the decision: “As a community we never really got our arms around the true risks.” Bill McArthur, safety and mission assurance manager for the space shuttle at the time, said, “The fact that people were willing to stand up and say ‘We just aren't ready yet’ is a real testament to the fact that our culture has evolved so that we weren't overwhelmed with launch fever.” In the past, that syndrome had led to dissenting voices being discouraged and, even worse, treated with disdain. As the participants filed out of the meeting, Joyce Seriale-Grush said to Mike Ryschkewitsch, “This was really hard and I'm disappointed that we didn't have the data today, but it feels so much better than it used to feel, because we had to say that we weren't ready and people listened to us. It didn't always used to be that way.”
Clearly, the culture and norms around open discussions and speaking truth to power had changed considerably from the days of Challenger.
Discovery Mission Success
After the FRR, Gerstenmaier had doubts about making a March 15 launch date, but he decided to “kick it back to the team, give them the action, see what they can go do and see how it comes out.” Approximately one thousand people worked intensely on the problem.
A breakthrough came when an eddy current system, typically used to test the integrity of bolts, was successfully adapted to check for cracks in valve poppets without damaging the hardware. Results of those tests gave engineers and managers confidence that the risks of another valve malfunction were acceptably small.
The third FRR, on March 6, led to a “go” decision. “By the time we eventually all got together on the last FRR the comfort level was very high,” said O'Connor. “For one thing, everybody understood this topic so well. You couldn't say, ‘I'm uncomfortable because I don't understand.’ We had a great deal of understanding of not only what we knew about, but what we didn't know about. We had a good understanding of the limits of our knowledge as much as possible, whereas before we didn't know what those were.”
In the final decision-making session, the astronauts slated to take the controls of Discovery were also in the room. Sitting before the scientists, engineers, and managers, they were a powerful and visible reminder of the stakes of the decision. They too had a vote for readiness. If ever there were an example of a decision-making process engaging—critically—the people who had to live with the judgment made, this would be it.
STS-119 was approved for launch on March 11. After delays due to a leak in a liquid hydrogen vent line (unrelated to the valve problem), Discovery lifted off on March 15, 2009, and safely and successfully completed its mission.
The Persistence of Good Judgment
Several elements of the process used to analyze, delay, and eventually approve the launch contributed to NASA's good judgment. Overriding all was the design of the FRR problem-solving process, which brought together many and varied experts and interested parties, both in one room and through a series of well-orchestrated offline working sessions. Discussions were artfully facilitated so stakeholders could listen to one another and discuss their findings and opinions in a “truth first, hierarchy later” kind of way. Widespread, “democratic” polling (rather than, say, providing information to a few senior managers who would make the decision themselves) was another hallmark of the process. The FRR built the presence and influence of multiple viewpoints and sets of expertise into the process—an important component of judgment, and one that study after study has shown to be critical to successful decision making.5
The extent and quality of testing and research, including having separate groups take different approaches to the same problem, were also important. The scientific culture of NASA, and its commitment to having the best possible factual information, was another critical aspect.
Understanding the importance and potential consequences of a decision—in this case, seeing the astronauts whose lives depended on the shuttle technology and, at the same time, understanding the legitimate need to fly as soon as it was possible to do so safely—was a living example of another invaluable dimension of building judgment into this process: tightly linking the stakes of the decision to accountability for what would be decided.
The leader's approach and style in managing this entire process was also fundamental.6 Decision-making processes are often subverted by a leader who pays lip service to consultation, going through the motions of openness while pushing the group toward the choice he's already made. Other attempts at such processes fail because they go to the opposite extreme—encouraging endless discussion and lowest-common-denominator consensus building.
NASA colleagues praised Gerstenmaier's openness to discussion and debate while he still kept his eye on the need to find the best possible answer within the project's constraints. Several of the participants in the FRRs commented on how NASA's culture had changed, learning from the tragedies that came before—away from “launch fever” and toward a better balance with safety; away from discouraging or disregarding dissent from engineers and others, and toward embracing scientific inquiry as a critical partner in final decision making.
The earlier, fatal errors in the shuttle program happened in part because of the agency's tendency to think of space flight as routine—operational rather than experimental—when in fact it remains a risky endeavor that tests the limits of complex technology designed to control immense forces. Gerstenmaier notes that the probability of failure for a shuttle mission with no obvious technical problems is about 1 in 77—not odds that any of us would accept in commercial aviation. In “Some Safety Lessons Learned,” Bryan O'Connor writes about the importance and difficulty of fighting complacency:
Countering complacency is arguably harder than recovering from a mishap. We have to find creative ways to counteract the common psychological tendency to assume that a string of successes means that we have somehow reached a state of engineering and operational perfection—and, therefore, immunity from failure.7
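A rough back-of-the-envelope calculation (our illustration, not a NASA figure, and one that assumes the 1-in-77 odds hold independently on every flight) shows why a string of successes is weak evidence of safety:

P(at least one failure in n flights) = 1 − (1 − 1/77)^n

For n = 20 consecutive missions, this comes to roughly 1 − (0.987)^20 ≈ 0.23. In other words, a program carrying nearly a one-in-four chance of a catastrophic loss over twenty flights would still, more often than not, fly all twenty safely and feel “proven.”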
Sustaining a commitment to safety—and to good judgment—requires constant, vigilant attention to both processes and culture.