A Matter of Millimeters: The story of Qantas flight 32
On the 4th of November 2010, a Qantas Airbus A380 was rocked by a catastrophic engine failure minutes after takeoff from Singapore, hurling fragments of a turbine disk through its wings and fuselage in multiple locations. The explosion damaged almost every major system on the airplane, from the flight controls and fuel tanks to hydraulics and pneumatics. Faced with a barrage of diverse failure warnings and an airplane of uncertain integrity, the flight crew worked together to make a series of critical decisions that would get their enormous airplane back on the ground. And in the end, despite one curveball after another — including landing gear problems, loss of braking power, and an engine that refused to shut down — they not only landed the plane, but did so without putting a scratch on any of the 469 passengers and crew.
The cause of the incident would ultimately be traced deep inside the number two engine to a single oil pipe that had been manufactured with a wall that was slightly too thin. How this seemingly tiny defect came about, and how it nearly brought down the world’s largest passenger plane, make for a story as fascinating as that of the flight itself, tracing back years to encompass questionable drawing board decisions, hidden flaws in the machining logic, and faulty assumptions about engine behavior. Time and time again, the problem slipped through the gaps in the system, tumbling down the long slope toward disaster, only to be stopped at the last moment, not only by the pilots themselves, but by a number of explicit protections built into the design of the A380, each of which played a crucial role in containing the fallout from a failure that exceeded the manufacturer’s worst expectations. The story of Qantas flight 32, as told herein, is therefore not only the tale of a dramatic emergency, but a testament to the safety of aviation today: a story that should make every reader feel a little less fearful of flight.
◊◊◊
Introduced to service in 2007, the double decker, four-engine A380 is by far the largest passenger aircraft in the world, exceeding the vaunted Boeing 747 in almost every measurement. Awesome to behold and pleasant to fly, the A380 offers little to dislike, unless you’re an airline: most major carriers dismissed the behemoth as too large for their operating models. Indeed, analysts’ assessment today is that the A380 was built for a market that had shrunk considerably by the time it actually entered service, and as a result, production ended in 2021 with only 254 built, some of which have already been scrapped. For an extraordinarily expensive aircraft equipped with some of the most advanced technology of any airliner, the collective result of immense effort and imagination, such a fate is unfortunate, but the reasons behind it, and the future of the type, are topics for a different article.
Despite the above, some airlines did find the A380 suitable for their operations, including Qantas, the flag carrier of Australia. Qantas ordered 12 A380s, ten of which are still flying today, with deliveries beginning in 2008. It was the very first of these, registered as VH-OQA and nicknamed Nancy-Bird Walton after the pioneering Australian aviator, that would come to be involved in the events of the 4th of November 2010.
On that date, VH-OQA arrived in Singapore for a scheduled stopover on a marathon London-to-Sydney trip, taking on fuel, passengers, and a new flight crew for the last leg to Australia. The plane was essentially full, with 440 of 450 passenger seats filled, plus a massive complement of 29 crewmembers, including no fewer than five pilots. Although the A380 is normally flown by only two pilots, Qantas had also rostered a second officer as a relief crewmember; a check airman was conducting a line check on the captain; and another check airman was training the first check airman, so the cockpit was certainly crowded.
In command, and under examination, was 53-year-old Captain Richard Champion de Crespigny, a veteran airman with over 15,000 hours of experience and 32 years in aviation. The other crewmembers consisted of First Officer Matt Hicks, Second Officer Mark Johnson, Check Captain Harry Wubben, and Senior Check Captain David Evans. The five crewmembers had a combined 140 years in aviation and 71,000 hours of flying experience, an incredible total that is rarely equaled.
At 9:56 a.m. local time, with Captain de Crespigny at the controls, Qantas flight 32 departed Singapore and proceeded southeast across the strait toward Indonesia, passing over the densely populated island of Batam. All parameters still appeared normal as the A380 climbed through 7,000 feet, four minutes after takeoff. There was no indication that a catastrophic failure was in fact just seconds away.
◊◊◊
The Airbus A380 is powered by four massive Rolls-Royce RB211 Trent 900-series high-bypass turbofan engines, each producing up to 84,000 lbf of thrust. Designed specifically for the A380, the Trent 900 was produced at several locations in the United Kingdom and was sold in competition with the American-built GP7200 jointly developed by Pratt & Whitney and General Electric.
Understanding what happened aboard Qantas flight 32 requires that I subject you to a description of the structure of certain very specific parts of the engine.
Like all high-bypass jet engines, the Trent 900 consists of four main sections: the fan, the compressors, the combustion chamber, and the turbines. During normal operation, air is forced backward and pressurized by a series of increasingly powerful compressors before being fed into the combustion chamber, where it is mixed with fuel and ignited. The combustion creates mechanical energy that spins a series of turbines, which in turn power the compressors, as well as the fan at the front of the engine, which accelerates large quantities of so-called bypass air around the outside of the engine core to generate most of the thrust output.
Most turbofan engines examined in my articles have a high pressure compressor and a low pressure compressor, which correspond to high- and low-pressure turbines. These turbines are connected to their respective compressor sections by concentric drive shafts. However, the Trent 900 differs from this layout slightly, because it also has an intermediate pressure compressor with a corresponding intermediate pressure turbine, in between the high and low pressure sections. It also differs in that the fan itself doubles as the low pressure compressor. (From now on, for brevity’s sake, the abbreviations LP, IP, and HP will be used for low pressure, intermediate pressure, and high pressure, respectively.)
On the Trent 900, the HP and IP turbine sections at the rear of the engine each consist of a single-stage turbine disk. Blades around the circumference of each disk capture mechanical energy from hot combustion gases flowing through what is known as the “annulus gas path.” These gases spin the disk, which is attached by a drive arm to its associated drive shaft. The shaft then transfers the turbine’s rotational energy forward to the corresponding compressor at the front of the engine.
In order to support the turbine disks and shafts while still allowing them to rotate freely, the engine features a complex system of bearings. The HP and IP turbine sections have a common bearing assembly, called the HP/IP bearing hub, which encircles the drive shafts and holds them in place while allowing free rotation of the shafts within.
To prevent wear on the bearings, the space inside the bearing hub, called the bearing chamber, is constantly filled with pressurized oil that keeps everything gliding smoothly. This oil is supplied primarily by an oil feed pipe that runs from the main engine oil supply, past the annulus gas path, and down into the HP/IP bearing hub, where it injects oil through a filter and into the bearing chamber.
The structure of the bearing hub consists of an inner and outer hub section, such that an empty buffer space exists between the bearing chamber and the rest of the engine. The oil feed pipe runs through both sections in order to reach the bearing chamber. The last segment of this pipe, a few centimeters in length, is welded in place during manufacture of the bearing hub, and the main portion of the pipe is fitted to it later. This fixed final segment that passes through the buffer space between the inner and outer sections of the HP/IP bearing hub is called the “stub pipe.”
For reasons that will be examined in detail later, this tiny oil feed stub pipe triggered an escalating sequence of events within the space of mere seconds as Qantas flight 32 climbed away from Singapore.
Within the №2 engine, located in the inboard position on the left wing, a crack in the oil feed stub pipe caused it to begin leaking oil less than four minutes after takeoff. The leak was small, but the oil within the pipe was highly pressurized, resulting in an atomized spray into the buffer space between the inner and outer sections of the HP/IP bearing hub. The temperature inside this buffer space was likely between 365 and 375˚C, well above the 280-degree auto-ignition temperature of the engine oil, so the spray immediately ignited.
At the forward end of the bearing hub, a triple seal is attached to the hub structure to isolate the buffer space, which has a low internal pressure, from the area behind the high pressure turbine disk, where pressure is higher. But as the fire in the buffer space expanded, it impinged upon this seal, causing it to fail. With the low-pressure area no longer sealed off, highly pressurized air from the annulus gas path was drawn toward it, bursting down past the high pressure turbine disk and into the buffer space. This rush of air coming in through the forward end of the buffer space blasted the fire aft, propelling it like a blowtorch against the corresponding seal at the other end, which also failed. The burning oil “blowtorch” was then able to make direct contact with the drive arm connecting the IP turbine disk to its drive shaft, located directly behind the HP/IP bearing hub. Already under considerable stress as a result of normal engine operation, the drive arm failed under the withering heat within a matter of seconds.
This entire sequence, from the initiation of the oil leak to the failure of the drive arm, lasted considerably less than one minute.
Now, with the drive arm broken, the IP turbine disk was not directly connected to anything, which is (to use the technical jargon) a really big problem. A turbine disk is subjected to considerable mechanical energy from the annulus gas path, and that doesn’t go away when the drive arm breaks, but what does go away is any ability to transfer that energy somewhere other than the turbine disk itself. During normal operations, this energy is used to turn the intermediate compressors, which requires considerable force. But this load path only exists when the turbine disk is connected to the drive shaft, so when the drive arm broke, the only way for the disk to absorb all that energy was to spin faster… and faster… and faster…
Within the space of four seconds, the energy from the annulus gas flow accelerated the IP turbine past its critical speed, until centrifugal forces exceeded the ultimate strength of the nickel alloy disk. The red-hot, wildly spinning disk instantly fractured into several sections, which rocketed outward in multiple directions at incomprehensible speed.
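To make the runaway concrete, here is a toy sketch in Python of the balance described above: the gas path delivers a torque, the compressor normally absorbs it, and once the drive arm breaks the whole torque goes into accelerating the disk. Every number in it is a hypothetical placeholder chosen only to give a timescale of a few seconds, not data from the Trent 900. The model also assumes a constant gas torque and a single fixed burst speed, which a real engine would not obey. It is an illustration of the reasoning rather than a reconstruction of the event.

```python
def seconds_to_burst(inertia, gas_torque, load_torque,
                     omega_start, omega_burst, dt=0.001, max_t=60.0):
    """Euler-integrate I * d(omega)/dt = gas_torque - load_torque.

    Returns the time in seconds at which omega first reaches omega_burst,
    or None if that never happens within max_t seconds.
    """
    omega, t = omega_start, 0.0
    accel = (gas_torque - load_torque) / inertia  # rad/s^2, constant in this toy model
    while t < max_t:
        if omega >= omega_burst:
            return t
        omega += accel * dt
        t += dt
    return None


# All values below are hypothetical, picked only to give a timescale of a few seconds.
INERTIA = 10.0        # kg*m^2, rotational inertia of the disk
GAS_TORQUE = 2000.0   # N*m delivered by the gas path
OMEGA_START = 800.0   # rad/s, normal running speed
OMEGA_BURST = 1600.0  # rad/s, speed at which the disk fails

# Drive arm intact: the compressor absorbs all the torque, so the speed holds steady.
intact = seconds_to_burst(INERTIA, GAS_TORQUE, GAS_TORQUE, OMEGA_START, OMEGA_BURST)

# Drive arm broken: the load torque vanishes and the disk accelerates until it bursts.
broken = seconds_to_burst(INERTIA, GAS_TORQUE, 0.0, OMEGA_START, OMEGA_BURST)

print("intact:", intact)
print("broken:", broken)
```

With the load attached, the sketch never reaches the burst speed; with the load removed, it gets there within seconds. That is the whole point: nothing about the gas flow changed, only where its energy could go.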
Engine designers do their best to make sure that the engine structure can contain flying debris like individual fan and turbine blades, but the amount of energy contained within those types of projectiles is a fraction of that within a piece of a burst disk. For engineering purposes, disk fragments are assumed to have infinite energy at the moment of release; they will cut through any reasonable material and cannot be contained.
When the IP turbine disk on Qantas flight 32 suddenly burst, nothing could stop the resulting fragments, which cut through the engine case and cowling like butter, cleaving a neat circumferential fissure all the way around the engine in the plane of rotation where the disk used to be. Within a fraction of a second, one of these disk fragments exited the engine downward and to the left, propelled toward the distant ground, but several more traveled in the opposite direction. One fragment shot to the right, entered the belly area below the cargo hold, plowed through several structural stringers and wire bundles, then exited out the other side, never to be seen again. Two more fragments rocketed upward, carving paths of destruction through the interior of the left wing before emerging from its upper surface, at which point they too disappeared into space. Numerous smaller pieces, such as dislodged turbine blades and fragments of engine structure, also peppered various parts of the aircraft, including the wings and fuselage.
On board the plane, the pilots and passengers heard two distinct bangs in very short succession, which were followed in the cockpit by a sudden, overwhelming cascade of warnings. The plane yawed slightly to the left and the autothrust system disconnected. Recognizing that a serious malfunction had occurred, Captain de Crespigny pressed the altitude hold button on the autopilot to level the plane, and then everyone’s attention turned to the ECAM.
The Electronic Centralized Aircraft Monitoring system, or ECAM, presents the crew with warnings, cautions, and advisories related to the degradation or failure of a huge range of aircraft systems, along with the associated abnormal or emergency procedures. A feature of most modern airliners, ECAM displays have transformed the way pilots deal with in-flight emergencies, helping ensure swift and correct action in response to almost any conceivable mechanical failure. Today, this system would be put to the test.
The first message to appear on the ECAM screen was an ENG 2 TURBINE OVERHEAT warning, which was followed within the next 20 seconds by another 34 messages of varying alert levels. The highest priority messages, displayed at the top of the list, all indicated a major problem with the №2 engine, but strangely enough, not its outright failure, because the engine (minus the IP turbine disk and everything in line with it) was in fact still turning. Following the indicated ECAM actions, Captain de Crespigny reduced power to the damaged engine, but this did not correct the problem; instead, an “ENG 2 FIRE” warning flashed on the screen for one or two seconds, then disappeared.
As First Officer Hicks made a pan-pan call (one level short of “mayday”) to air traffic control, the crew decided that the engine was likely severely damaged and elected to shut it down. As they moved to reduce power, an ENG 2 FAIL message finally appeared on the ECAM. The pilots also attempted to actuate both of the engine’s two built-in fire extinguishers, but only one of them actually fired, and neither of the confirmation lights illuminated. However, as they scrolled through the seemingly unending list of ECAM alerts, only one or two of which could be displayed at one time, it became clear that the engine and its fire extinguishers were hardly the only systems having issues.
When the fragments of the IP turbine disk passed through the wing and belly of the A380, they caused considerable secondary damage along the way, not only to the №2 engine but also to the left wing fuel tank, which sprang a leak; the leading edge wing slat extension mechanism, which took a direct hit; and the aircraft’s electrical system. Additionally, two wire looms in the leading edge of the wing and in the belly area were completely severed, collectively affecting about 650 wires that carried critical information to and from almost every conceivable aircraft system. A brief and hardly exhaustive summary of the damage to the aircraft included impact damage to the left wing upper and lower skin, front spar, internal ribs, and wing-to-fuselage joint; loss of control over the green (left) hydraulic pumps; degradation of the yellow (right) hydraulic system; loss of AC electrical power from engines 1 and 2; loss of one of four AC power distribution systems; loss of functionality of all leading edge slats; partial loss of functionality of the spoilers and ailerons; degradation of control over all remaining engines, resulting in loss of the autothrust system; loss of functionality of the left wing landing gear brakes and anti-skid on the right wing gear brakes; severe disruption of the pneumatic system; loss of both the №1 and №2 low fuel pressure shutoff valves; degradation of the fuel quantity indicating system; reduction in fuel transfer capability; loss of the fuel jettison system; loss of fire protection capability on engines 1 and 2; and much more besides.
One of the most serious malfunctions was the loss of the green hydraulic system. Although the system itself was not breached, the severed wires stopped all the green hydraulic pumps, resulting in a loss of pressure that rendered inoperative all systems reliant upon it. Unlike some other wide body aircraft, the A380 has only two hydraulic systems — green and yellow — which mainly supply the left and right sides of the aircraft, respectively. Instead of more redundant hydraulic systems, the A380 has independent backup hydraulic actuators on each individual flight control surface, ensuring that even a total loss of both hydraulic systems will result in minimal flight control difficulties. Nevertheless, direct damage had degraded the plane’s roll control, eliminating the left middle aileron (the A380 has three on each wing) and left wing spoilers 4 and 6. The loss of green hydraulic pressure also caused the failure of the outboard ailerons on both wings, spoilers 2 and 8 on each wing, and spoiler 4 on the right wing. However, the built-in redundancy in the A380 was so great that the remaining spoiler and aileron panels were sufficient to maneuver the airplane despite what amounted to a 65% loss of roll control capability.
Initially, the hydraulic system indications confused the crew. Their initial impression was that the damage was confined to the left side of the airplane, so they were surprised to see an indication related to the yellow (right) hydraulic system mixed in with all the other ECAM messages, and they briefly considered whether the messages might be spurious. The same possibility was considered by Qantas technicians on the ground, who were receiving live telemetry data from the aircraft. But by accessing detailed electronic status pages for each aircraft system and cross-checking what was not working, the pilots were able to come to the conclusion that all the failures were most likely real.
At that point the crew needed to make a decision: should they land immediately, or should they wait to action all of the abnormal procedures associated with the dozens of ECAM messages? Assessing that the aircraft was controllable with the autopilot both on and off, they eventually came to an agreement that it was safe to remain airborne, and that they would rather make sure all crewmembers had a complete understanding of the health of every aircraft system before coming in for a landing. The last thing they needed was a surprise on final approach that would force them to go around.
Having chosen this course of action, the crew requested clearance to enter a holding pattern, and ATC granted them permission to circle over the ocean northeast of Singapore. The crew estimated that it would take at least 30 minutes to work through all of the ECAM actions.
In the meantime, the cabin crew had been attempting to get the pilots’ attention using the emergency call button, but all the pilots were so focused on the failures that they initially didn’t notice. Only now did they send Second Officer Mark Johnson to assess the situation in the cabin, whereupon a Qantas pilot riding as a passenger in the upper deck drew his attention to the in-flight entertainment system, which featured a live view of the aircraft from a tail-mounted camera. The digital stream clearly showed a much more literal stream of fuel pouring from the left wing and into the aircraft’s wake, which was also visible to the naked eye from the lower deck. Johnson proceeded down to check for himself, at which point he also observed for the first time that there were two gaping holes in the top of the left wing, surrounded by jagged metal, where the turbine disk fragments had exited.
According to Check Captain David Evans, the cabin crew were concerned that so many passengers were watching the live feed from the tail camera, but in a collective decision, the pilots elected not to turn it off because, in their view, the feed suddenly cutting out would probably be more alarming than anything that could be seen on it.
◊◊◊
Meanwhile on the ground, events were taking an unexpected turn. On Batam Island in Indonesia, debris from the №2 engine plunged into a populated area shortly after the failure, resulting in surprise and alarm. Among the debris was a large portion of the failed IP turbine disk, which fell with such force that it cleaved straight through a building, razing a brick wall. Thankfully, no one on Batam was hurt by the debris. However, photographs of locals holding airplane wreckage in what appeared to be Qantas livery were soon posted to Twitter, where they were taken as indications that a Qantas airplane had actually crashed somewhere over Batam. Qantas engineers already knew that the plane was still flying, but they were unable to contact the crew to find out more information. And outside that bubble, the news that a Qantas A380 had possibly gone down spread so quickly that even investors reacted while the plane was still in the air. In fact, the first time Qantas’s CEO learned of the situation was when he received a call asking why the company’s stock price was dropping.
Up in the air, such concerns were far from the minds of the crew. Working through one ECAM procedure after another, they slowly stabilized various disrupted systems, but as they did so, one problem in particular was getting worse: their fuel situation. With fuel leaking prodigiously from the left wing tank, an imbalance had developed between the left and right wings, prompting an ECAM alert instructing them to open the fuel transfer valves to equalize the fuel levels. However, with a leak clearly occurring and other ECAM messages indicating damage to the fuel transfer system, after a detailed discussion the pilots concluded that following the computer’s instructions would be inadvisable — a great example of why experience and judgment are still necessary even on an aircraft as highly automated as the A380.
In the end, it took 55 minutes to clear all of the ECAM messages, an unprecedented length of time that was certainly far beyond anything any of the pilots had previously imagined. But even then, more considerations remained before they could land. Most notably, the airplane was still more than 40 tons over its maximum landing weight, and among the failures that had occurred was the loss of the fuel jettison system, so it would be impossible to reduce their weight by dumping fuel. The possibility of remaining in the air until they had burned the extra fuel was considered, but with an increasing fuel imbalance between the right and left wings and a 65% loss of roll control, the pilots determined that this would be irresponsible. Their conclusion was that they had to land soon, even if they were over the maximum weight. But would any of the runways at Singapore be long enough to accommodate a significantly overweight A380 with degraded brakes, several faulty spoilers, and no reverse thrust on engine 2? There was certainly reason to doubt.
In order to find out for sure, Captain de Crespigny instructed the two Check Captains to determine their required landing distance using the Airbus performance software installed on laptop computers stored in the cockpit. The detailed software allowed them to enter various parameters including the weather, runway condition, and any systems failures, and then calculate whether it was possible to land. But when everything had been entered, the program spit out an unhelpful answer: “no result.”
When calculating the landing distance, the software applied a generic “operational coefficient” to account conservatively for variations in pilot techniques that could result in less efficient deceleration. The problem, as investigators would later discover, was that the software applied the coefficient again whenever another system failure was added. With so many system failures on the aircraft, the coefficient was applied a total of 9 times, resulting in a calculated landing distance considerably greater than the length of the available runway. However, Check Captain Evans was able to fix the problem by manually entering their actual landing weight, overriding the program’s assumption of a maximum landing weight. By specifying a landing weight in excess of the maximum, the system logic changed to apply the operational coefficient only once — for unrelated and obscure reasons — and lo and behold, when he ran the numbers this time, the computer said they could just barely land on any of the 4,000-meter runways at Singapore Changi Airport, with only 100 meters to spare. It wasn’t much, but with no better runways anywhere nearby, it would have to do.
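The effect of that compounding is easy to underestimate, so here is a minimal sketch in Python of the arithmetic. The 4,000-meter runway length and the count of nine applications come from the account above; the base distance and the value of the coefficient are hypothetical placeholders of my own, since the real figures used by the Airbus software are not given here. The function is an illustration of repeated versus single application of a margin, not the manufacturer's algorithm.

```python
def landing_distance(base_m: float, coefficient: float, applications: int) -> float:
    """Scale a base landing distance by a margin factor applied `applications` times."""
    return base_m * coefficient ** applications


RUNWAY_M = 4000.0    # runway length quoted in the article
BASE_M = 3300.0      # hypothetical unfactored landing distance, for illustration only
COEFFICIENT = 1.15   # hypothetical operational coefficient, for illustration only

# The faulty logic: the margin is re-applied for each of nine system failures.
compounded = landing_distance(BASE_M, COEFFICIENT, 9)

# The intended logic: the margin is applied exactly once.
single = landing_distance(BASE_M, COEFFICIENT, 1)

print(f"applied 9 times: {compounded:.0f} m, fits runway: {compounded <= RUNWAY_M}")
print(f"applied once:    {single:.0f} m, fits runway: {single <= RUNWAY_M}")
```

Even a modest margin, raised to the ninth power, more than triples the base distance in this sketch, which is why a program working this way could find no runway long enough.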
Finally ready to commit, Captain de Crespigny now removed the aircraft from its holding pattern and began maneuvering for approach. The crew requested fire trucks on landing due to the fuel leak, and advised the cabin crew to prepare for an on-ground emergency if they overran the runway. They aligned with the runway from considerably farther away than usual in order to minimize maneuvering, since the controls felt rather sluggish — mainly the result of the degraded roll control. Captain de Crespigny subsequently carried out further manual control checks during the approach to make sure that the controllability characteristics remained consistent as First Officer Hicks progressively extended the flaps. The pilots also decided to leave engines 1 and 4 at a constant thrust level and adjust their airspeed using only engine 3, because that engine was least affected by the various electronic control system failures.
The problems only continued, however. Due to the failure of the green hydraulic system, the landing gear did not drop when the gear lever was selected “down,” forcing the pilots to use a backup system that lets the gear drop under gravity instead. Fortunately, this was successful. Captain de Crespigny also had to maintain a very narrow airspeed band, because there was only a fine margin between the slowest safe speed in the air, below which they would stall, and the highest safe speed on touchdown, above which they would overrun the runway. De Crespigny later recalled that the safe band between these speeds was probably only three or four knots. At one point, the airplane even generated an automatic low energy alert, warning that their airspeed was dropping too low for their present configuration, so de Crespigny increased power slightly on engine 3, and the alert went away.
Nevertheless, with the help of the Airbus’s still operational fly-by-wire system and mostly undamaged controls, de Crespigny was able to thread the needle, greasing it onto runway 20C at Changi Airport at 11:46, just shy of two hours after the flight took off. A stall warning briefly sounded just before touchdown, but a split second later they were on the ground and the matter was moot. The pilots applied those brakes that were still working, activated reverse thrust on engine 3 — the A380 only has reversers on its inboard engines — and hoped that it would be enough. The deceleration was not exactly exuberant, but considering how much it takes to stop something the size of an A380 even on a good day, it was impressive enough. Nevertheless, it was not until fairly late in the rollout that the pilots felt certain that they would stop on the runway, and ultimately the performance software was not far off the mark: by the time the plane ground to a halt, only 150 meters remained of the 4,000-meter runway.
And yet at that point, even if some of the passengers thought their ordeal was over, it turned out that there was more to come. As the pilots shut down the engines, they observed that the brake temperature on the left body gear brakes had risen alarmingly. These had been the only working brakes on the left side of the aircraft and had been subjected to considerable strain during the landing, causing them to overheat; the overheating in turn caused four tires to deflate. To make matters worse, fuel was still leaking from the wing, and there was genuine concern that a fire could erupt if spilled fuel contacted the hot brakes. And on top of that, when engines 3 and 4 were shut down, the plane lost electrical power, and when they tried to start the auxiliary power unit, it wouldn’t connect to the electrical system because of damage to the distribution infrastructure. Operating solely on emergency electrical power, the plane now had only one working VHF radio, and it took a few moments for the crew to figure out which one that was so that de Crespigny could contact the airport fire services.
Upon making contact with the fire crew, de Crespigny urged them to cool down the brakes, but the firefighter in charge replied with surprising news: the №1 engine was still running, even though the crew had already carried out the shutdown procedure. Damage to systems in the wing had rendered the №1 fuel shutoff valves inoperative, preventing the crew from shutting the engine down by normal means. Both fire extinguisher bottles in engine №1 were also inoperative, preventing the crew from shutting it down by pulling the emergency fire handle. Unable to resolve the issue, de Crespigny instead urged firefighters to approach the brakes while staying as far as possible from the inlet and exhaust ends of the №1 engine. If firefighters got too close to the inlet, they could be sucked into the fan; alternatively, approaching too close behind would result in severe burns, not to mention the jet blast. Nevertheless, the fire crews managed to get close enough to douse the brakes with foam, averting a conflagration.
Throughout this period, the crew also debated whether or not to evacuate the passengers. An evacuation is not always an easy call: statistics show that around 5 to 10% of passengers who evacuate by the emergency escape slides sustain serious injuries, and with 440 passengers on board, including elderly and disabled people, that was potentially a rather large number of people. Considering that the fire risk had been tamped down, the crew ultimately agreed that the safest place for the passengers was on the plane, at least for the moment. The flight attendants were asked to move to their stations on the right side of the aircraft to be ready for an emergency evacuation, should the calculus suddenly change, while the pilots attempted to call for a set of stairs to be brought to the aircraft. But with the only working radio being used to liaise with the firefighters, the only other working communications systems aboard the powered-down airplane were the pilots’ cell phones. It took quite some time, and several abortive attempts, before they managed to get through to someone at Qantas who could call the airport services company at Changi Airport and tell them to send a set of boarding stairs.
After 50 minutes aboard the increasingly sweltering airplane — without power, there was no air conditioning — the stairs finally arrived, and the disembarkation began. It ultimately took an hour to get everyone off through a single exit in an orderly manner, but in the end, all 440 passengers walked away without a single injury.
As for the still-running №1 engine, Qantas engineers eventually concluded that the only way to shut it down was to drown it with firefighting foam. Firefighters then poured huge amounts of water and foam into the intake, which proved successful, as the engine finally spooled down and stopped at 14:53, more than three hours after the aircraft landed.
◊◊◊
The news that the crippled Qantas A380 had landed safely in Singapore came as a great relief to all, especially those who were on board the airplane, who immediately heaped praise on the crew. Captain de Crespigny, in an exemplary act of professionalism, even handed out his personal phone number and stayed for two hours in the Qantas lounge in Singapore to answer passengers’ questions about the flight. He and his crew were hailed as heroes, and not without reason. The sheer number of failures aboard the A380 eclipsed almost any other incident, at least on an aircraft with modern failure monitoring technology. Levelheaded decision-making, teamwork, and crew resource management helped the crew collectively determine the course of action least likely to result in injury or loss of life, with perfect results.
The aircraft itself, however, helped quite a lot. The design of the flight control system ensured that the impact on controllability was limited, even with serious damage to the ailerons and the loss of one of two hydraulic systems. Despite its size, the A380 is known to respond very gracefully to control inputs thanks to the design of its fly-by-wire system, which during the incident remained fully intact, allowing the pilots to focus primarily on decision-making rather than handling the airplane. The ECAM and the detailed systems pages also helped the pilots take full stock of the extent of the failures in a way that was not necessarily possible two decades earlier. Without minimizing the actions of the crew in any way, it’s also fair to say that the design of the A380 incorporated such redundancy and such high safety margins that the risk of a catastrophic crash was probably very low, even with so much damage to the airplane.
However, there were a couple of places where the potential for worse damage existed. It goes without saying that if any of the turbine fragments had entered the passenger cabin, there would have been injuries, if not fatalities, even if the plane later landed safely. And perhaps even more worrying, investigators later found signs of a very brief flash fire inside the left wing fuel tank, which likely occurred when an extremely hot turbine fragment contacted fuel vapors in the tank ullage. Research conducted as part of the investigation into the 1996 crash of TWA flight 800, which was caused by an explosion of the center fuel tank, found that the wing tanks on commercial airliners contained flammable fuel-air mixtures around 7% of the time. However, on Qantas flight 32, the temperature in the tank was too low for the fuel-air mixture to reach a flammable concentration, and investigators determined that the brief ignition of vapors during passage of the turbine fragment likely failed to raise the temperature of the rest of the fuel sufficiently to sustain combustion. Had this fuel continued to burn, causing an explosion or sustained wing fire, then the outcome could have been very different.
Even though these worst case scenarios didn’t happen, the level of damage was still far beyond anything Airbus or Rolls-Royce had anticipated, and both companies were keen to know why — as were investigators with the Australian Transport Safety Bureau (ATSB), who were charged with finding out. Initial efforts established that an oil leak caused an internal fire that led to the failure of the IP turbine disk drive arm, as described previously. But that didn’t answer one of the manufacturers’ most burning questions, which was why the disk was allowed to overspeed until it burst — because according to the engine’s design philosophy, this should never have happened.
The risk posed by a burst engine disk, whether it’s a fan, compressor, or turbine, is well known in the industry. Numerous catastrophic accidents have occurred as a result of burst disks, either due to overspeed or material defects. Demonstrated side effects of burst disks include severe flight control damage (see United flight 232, Sioux City, Iowa, 1989, in which fan disk debris resulted in the failure of all hydraulic systems, a loss of control on landing, and 111 deaths); direct fatal impacts to passengers (see Delta flight 1288, Pensacola, Florida, 1996, in which a compressor disk entered the passenger cabin on takeoff, killing two passengers); and in-flight fires (see LOT Polish Airlines flight 5055, Warsaw, Poland, 1987, which crashed after a fragment of a burst turbine disk started a fire in the baggage compartment, killing 183). Accidents such as these inform aircraft certification guidelines, which classify a disk failure as a “hazardous” event whose probability must be “extremely remote” (defined as one event or less per 100 million operating hours). And just in case such an event were to happen anyway, rules introduced following the United accident in Sioux City also require manufacturers to minimize the potential secondary failures that could occur as a result, which was part of why the A380 had individual backup hydraulics for all critical control surfaces.
In general, the accident aboard flight 32 demonstrated that the requirements for damage minimization were met, with the exception of the resultant inability to cut fuel to the №1 engine, which was explicitly defined in certification guidelines as undesirable. This success alone distinguished it from previous, fatal disk release accidents. But Rolls-Royce was concerned that the disk burst at all. During development of the Trent 900, the company calculated that in the event that the IP turbine disk became disconnected from the drive shaft, the disk would not accelerate fast enough to burst. Essentially, it was believed that without the turbine to drive it, the IP compressor would decelerate until air no longer flowed smoothly over its blades, causing a compressor stall that would subsequently spread to the HP compressor behind it. This should lead to an engine surge, in which the disruption of airflow through the compressor section allows pressurized air from the combustion chamber to surge forward toward the front of the engine. In theory, this would reduce the amount of air flowing rearward through the annulus gas path and over the IP turbine, relieving the load on the turbine and preventing it from accelerating beyond its critical speed.
A key assumption here was that the IP compressor stall would happen faster than the advanced electronic engine control system could detect the surge and reduce fuel flow, which would bring down the combustion chamber pressure to clear the surge and restore normal airflow. In the actual event, investigators found that although the IP compressor stalled and a surge occurred, the automatic reduction in fuel flow came swiftly enough to enable the partial recovery of the HP compressor, which resumed forcing pressurized air into the combustion chamber and thence over the turbine. This unexpectedly robust airflow provided the mechanical energy required to accelerate the IP turbine disk beyond its critical speed. Rolls-Royce engineers were unable to conclusively determine why this occurred. Nevertheless, to prevent it from happening again, the company developed an IP turbine overspeed function for the electronic engine control that directly monitors the IP turbine disk and instantly cuts fuel flow to the engine if the disk starts spinning too fast.
◊◊◊
Of course, if you’ve gotten this far, then you’re probably aware that one critical question remains. This entire chain of events began when the oil feed stub pipe — remember that? — developed a leak. If you need a refresher, this was the short segment of pipe that passed through the outer and inner sections of the bearing hub to deliver oil to the bearing chamber. So why did this happen?
The answer, as it turns out, was visible to the naked eye. When investigators removed the faulty pipe, they found that one wall of the pipe was simply too thin. Unable to withstand the stresses of normal operation, it began suffering from metal fatigue and failed after only 677 flights.
The oil feed stub pipe was designed to have a slightly wider inner diameter at the bottom end in order to accommodate a filter. This required widening the interior diameter of the pipe by drilling a “counter bore” in from one end, as shown above. The center of the counter bore should be aligned with the center of the main bore. But on the stub pipe recovered from the failed Qantas engine, the counter bore was displaced to one side by approximately half a millimeter, resulting in an irregular wall thickness that varied from 1.42 mm on one side to only 0.35 mm on the other. It was this extra thin part of the pipe wall that failed on flight 32.
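For readers who like to check the arithmetic, here is a minimal sketch of how a sideways shift of the counter bore turns into uneven wall thickness. It assumes an idealized geometry (a perfectly circular pipe and a perfectly circular counter bore) and the function names are my own, not anything from the investigation. Under that assumption, the thinnest wall is the concentric wall thickness minus the offset, and the thickest wall is the concentric thickness plus the offset.

```python
def wall_thickness_range(nominal_wall_mm, offset_mm):
    """Return (thinnest, thickest) wall in mm for a circular bore shifted
    sideways by offset_mm inside a circular pipe whose wall would be
    nominal_wall_mm if the bore were perfectly centered.
    Idealized geometry: an assumption for illustration only."""
    if offset_mm < 0 or offset_mm >= nominal_wall_mm:
        raise ValueError("offset must be non-negative and smaller than the wall")
    return nominal_wall_mm - offset_mm, nominal_wall_mm + offset_mm


def infer_nominal_and_offset(thin_mm, thick_mm):
    """Work backwards from the thinnest and thickest measured walls to the
    concentric wall thickness and bore offset that the idealized model implies."""
    nominal = (thin_mm + thick_mm) / 2
    offset = (thick_mm - thin_mm) / 2
    return nominal, offset


# Wall thicknesses quoted in the text for the accident pipe.
nominal, offset = infer_nominal_and_offset(0.35, 1.42)
print(f"implied concentric wall: {nominal:.3f} mm")  # 0.885 mm
print(f"implied bore offset:     {offset:.3f} mm")   # 0.535 mm
```

The implied offset comes out at roughly half a millimeter, in line with the figure above. It should be read as a consistency check on the idealized model, not as a measured value.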
The story of how the counter bore became offset by half a millimeter has implications that far outstrip the physical size of the error. The following section is going to involve some fairly complex discussions of the design and manufacturing process, but I hope you’ll bear with me.
When the designs for the HP/IP bearing hub assembly were drawn up during the early 2000s, the design engineers followed standard practice by defining the position and dimensions of the oil feed stub pipe and associated features relative to a fixed point, referred to as a “datum.” This point, designated datum AA, was defined as the hole in the outer section of the bearing hub through which the oil feed stub pipe passes on its way to the bearing chamber. This hole will henceforth be known as the “outer clearance hole.” All other aspects of the stub pipe fitting were positioned with respect to the centerline of this hole.
In line with the outer clearance hole, the designs also called for an “interference bore” in the inner hub into which the bottom end of the stub pipe would fit. The interference bore was designed to be very slightly narrower than the stub pipe so that the pipe end, once tapped firmly into place within it, would be held inside by friction. This bore was required to be centered on datum AA so as to keep it perfectly aligned with the outer clearance hole.
Next, an “inner hub counter bore” would be drilled in from the inside of the inner hub, meeting in the middle with the interference bore. This was the hole through which the oil from the stub pipe would enter the bearing chamber.
Then, once these holes were machined, the oil feed stub pipe itself would be inserted through the outer clearance hole and into the interference bore, then welded in place.
Finally, working from the inside through the inner hub counter bore, the stub pipe counter bore would be drilled to a specified depth to accommodate the filter. (This was the bore that was later found to be offset by half a millimeter.) According to the design specifications, the stub pipe counter bore should be in line with datum AA, with a tolerance of Ø 0.10 mm. In plain English, that means that the center of the counter bore should lie within a 0.10-mm-diameter circle centered on datum AA. It does not mean that the bore can be 0.10 mm from the datum, but rather that the bore can lie within 0.05 mm of the datum in any direction, for a total range of possible positions measuring 0.10 mm across. (Hereinafter, the term “offset” refers to the distance of a given point from the datum, while the terms “tolerance” and “non-conformance,” indicated with the “Ø” symbol, refer to the diameter of a circle centered on the datum with its edge at the given point. As confusing as this may be for some readers, this is how tolerances are measured in real life engineering, so if engineers can deal with it, so can you (hopefully).)
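If the distinction between an offset and a “Ø” figure still feels slippery, the following sketch may help. It is only an illustration with hypothetical function names: a diametral figure is simply twice the radial offset, and a feature is within tolerance when its radial offset from the datum is no more than half the Ø value.

```python
import math


def offset_from_diametral(diametral_mm):
    """Convert a diametral ("Ø") figure to the radial offset it represents."""
    return diametral_mm / 2


def diametral_from_offset(offset_mm):
    """Convert a radial offset from the datum to a diametral ("Ø") figure."""
    return 2 * offset_mm


def within_position_tolerance(dx_mm, dy_mm, tolerance_dia_mm):
    """True if a feature whose center lies (dx, dy) from the datum falls
    inside a tolerance circle of the given diameter centered on the datum."""
    return math.hypot(dx_mm, dy_mm) <= tolerance_dia_mm / 2


# Design tolerance for the stub pipe counter bore: Ø 0.10 mm about datum AA.
print(offset_from_diametral(0.10))                  # 0.05 mm in any direction
print(within_position_tolerance(0.03, 0.03, 0.10))  # True: about 0.042 mm away
print(within_position_tolerance(0.10, 0.00, 0.10))  # False: 0.10 mm away
```

The last line is the trap described above: a bore sitting 0.10 mm from the datum does not satisfy a Ø 0.10 mm tolerance, because the permitted offset is only 0.05 mm.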
In any case, when it came time to plan the actual manufacture of the bearing hub assembly, some changes had to be made to this process. The basic problem was that once the oil feed stub pipe was inserted into the hub assembly, it would no longer be possible for the machining computer to find the location of datum AA, because the outer clearance hole by which it was defined would be too full of stub pipe. That meant that it would be impossible to determine exactly where the oil feed stub pipe counter bore should subsequently be drilled.
In order to fix this problem, Rolls-Royce manufacturing engineers decided to redefine the position of the stub pipe counter bore with relation to a new datum, named datum M, which corresponded to the center of the inner hub counter bore. At the same time, the tolerance for the stub pipe counter bore was changed from Ø 0.10 mm to Ø 0.20 mm for unknown reasons. But the bigger problem was that the position of the stub pipe itself was determined by the position of the interference bore, which was still defined by datum AA. Because the position of the inner hub counter bore did not have a specified tolerance relative to datum AA — in fact, in the original design plan, its position didn’t matter much at all — there was no direct assurance that datum M would line up with datum AA, and thus it could not be assured that the stub pipe would line up with its own counter bore either. At this point you might already be starting to see the problem.
The actual manufacturing process, based on the above specifications that were written into the manufacturing stage drawings, proceeded as follows. First, the basic hub assemblies were delivered to the Rolls-Royce machining plant in Hucknall, UK, with the outer clearance hole already drilled (and datum AA thus defined), but without the interference bore or the inner hub counter bore (datum M). Instead, a reference hole was drilled, with reference to datum AA, in the location that would later become the inner hub counter bore. A temporary timing pin was inserted into this reference hole, which was then used by the computerized machining equipment to orient the hub so that the machining arm aligned with datum AA. So far so good.
Subsequently, the interference bore was drilled, with reference to datum AA. Then came the most dastardly part: in order to drill the inner hub counter bore, the timing pin had to be removed, but the machine also needed to remember its location in order to know where to drill. Therefore, the machining computer used specialized probes to measure and record the position of the timing pin within three-dimensional space, allowing subsequent removal of the pin. At that point the only thing ensuring the correct alignment of the inner hub counter bore — and thus datum M, and thus the stub pipe counter bore — was the assumption that the recorded position of the timing pin remained accurate.
Unfortunately, that assumption proved incorrect. The problem was that while the interference bore had been drilled from the outside, the inner hub counter bore had to be drilled from the inside, which necessitated the reconfiguration of the clamps holding the hub assembly in place, in order to make room for the machining arm. During this process, the hub assembly sometimes — but not always — shifted imperceptibly.
If the hub shifted, then when the machining process resumed, the recorded location of the timing pin (and thus datum AA) would be slightly offset from its actual, new location, and the machine would subsequently drill the inner hub counter bore offset by an equal amount. That meant that datum M — again, defined as the center of the inner hub counter bore — would also be offset from datum AA by that amount.
Subsequently, the stub pipe was inserted through the outer clearance hole and into the interference bore, where it was welded in place. The stub pipe counter bore was then drilled into the end of the pipe with reference to datum M, which, again, would be offset if the hub had shifted. In the case of the components involved in the accident, the hub presumably shifted by just under half a millimeter, resulting in an equal offset of both the inner hub counter bore and the stub pipe counter bore relative to the pipe itself. As a side effect, one wall of the pipe was too thin.
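The chain of references can be sketched in a few lines of code. This is a deliberately simplified two-dimensional model with made-up names, and the 0.47 mm shift is an illustrative value chosen to sit “just under half a millimeter,” not a measured one. The point is that the shift passes straight through to the pipe wall, while the counter bore still looks perfectly placed relative to datum M.

```python
import math


def simulate_hub_shift(shift_mm):
    """Model the machining sequence for one hub.

    shift_mm is the hypothetical (dx, dy) movement of the hub during
    re-clamping, in mm. Returns the counter bore's offset from the pipe
    center and its offset from datum M.
    """
    datum_aa = (0.0, 0.0)

    # The interference bore is drilled on datum AA before the clamps are
    # changed, so the pipe that later sits in it is centered on AA.
    pipe_center = datum_aa

    # After re-clamping, the machine still trusts the recorded pin position,
    # so the inner hub counter bore lands displaced by the shift. The sign
    # depends on the reference frame; only the magnitude matters here.
    datum_m = (datum_aa[0] + shift_mm[0], datum_aa[1] + shift_mm[1])

    # The stub pipe counter bore is then drilled with reference to datum M.
    counter_bore_center = datum_m

    offset_vs_pipe = math.dist(counter_bore_center, pipe_center)
    offset_vs_datum_m = math.dist(counter_bore_center, datum_m)
    return offset_vs_pipe, offset_vs_datum_m


vs_pipe, vs_m = simulate_hub_shift((0.47, 0.0))
print(f"offset relative to the pipe: {vs_pipe:.2f} mm")  # 0.47 mm
print(f"offset relative to datum M:  {vs_m:.2f} mm")     # 0.00 mm
```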
In the original design, no stub pipe wall thickness was specified; instead, adequate wall thickness was ensured by the alignment of both the pipe (via the interference bore) and its counter bore with the same datum (AA). The fact that this assurance could be lost when using the reworked manufacturing process was not recognized at the time, nor were the subsequent inspections tailored to find such a defect.
In this regard, two inspections are of note, both of which involved the use of a coordinate measuring machine, or CMM. One of these, known as OP 230, involved only the measurement of the stub pipe counter bore position relative to datum M, which provided no useful information as to its position relative to the pipe itself. A visual inspection was also conducted at this stage, but it was not possible for an inspector to observe the stub pipe wall thickness at the counter bore because this end of the pipe was welded inside the interference bore, completely out of sight.
Another inspection, called OP 70, occurred prior to OP 230 and presented a better opportunity to notice the error. During this inspection, the CMM measured the position of the interference bore relative to datum M, even though it was machined with reference to datum AA. If datum M and datum AA were offset by more than the specified tolerance for the interference bore, this should have caused the CMM to report an error in the position of the interference bore. The tolerance for this bore was supposed to be Ø 0.05 mm according to the design drawings, but was changed to Ø 0.5 mm in the manufacturing drawings without explanation. Even so, the non-conformance on the accident hub was between Ø 0.90 mm and Ø 0.98 mm (an offset of 0.45–0.49 mm), which should have been flagged by the machine. The CMM records from the accident hub were not retained, so it was not possible for investigators to confirm that the error was actually registered. However, even if it was, a follow-up inspection might have concluded that the flag was a false alarm, because the manufacturing drawings specified the position of the interference bore relative to datum AA, and inspectors were generally unaware that the CMM was actually measuring the position of the bore relative to datum M. Therefore, if inspectors saw that the position of the interference bore was flagged as out of tolerance, they could refer to the manufacturing drawings, check the bore’s position with reference to datum AA, find everything to be normal, and give the hub a clean bill of health.
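To see how the same hub could both fail and pass OP 70, consider this rough sketch. The function name and structure are hypothetical, and it assumes for simplicity that the interference bore sits exactly on datum AA, so that the only error is the offset between datum AA and datum M.

```python
def op70_outcome(datum_offset_mm, tolerance_dia_mm=0.5):
    """Compare two ways of checking the interference bore position.

    datum_offset_mm is the radial distance between datum AA and datum M.
    The bore is assumed to be exactly on datum AA (a simplification).
    """
    # What the CMM actually did: measure the bore relative to datum M.
    nonconformance_vs_m = 2 * datum_offset_mm
    cmm_flags_error = nonconformance_vs_m > tolerance_dia_mm

    # What the manufacturing drawing asked for: position relative to datum AA.
    nonconformance_vs_aa = 0.0
    recheck_passes = nonconformance_vs_aa <= tolerance_dia_mm

    return nonconformance_vs_m, cmm_flags_error, recheck_passes


# Offsets quoted for the accident hub: 0.45 to 0.49 mm.
for offset in (0.45, 0.49):
    dia, flagged, passes = op70_outcome(offset)
    print(f"offset {offset} mm -> Ø {dia:.2f} mm, "
          f"CMM flags error: {flagged}, recheck against AA passes: {passes}")
```

In both cases the CMM reading exceeds the Ø 0.5 mm manufacturing tolerance, yet a recheck against the datum named on the drawing finds nothing wrong.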
For the above reasons, numerous HP/IP bearing assembly hubs were released for service with oil feed stub pipe walls that may or may not have been too thin. No one was aware of this until 2009, after the accident hub was manufactured, when Hucknall Casings and Structures decided to change the datum for the oil feed stub pipe counter bore in order to simplify the manufacturing process. The change called for the use of datum AF, which was defined as the center of the pipe’s main bore, to position the pipe’s counter bore. Subsequently, two previously manufactured stub pipe counter bores were measured against this new datum and found to have a non-conformance of Ø 0.5 mm (or an offset of 0.25 mm, about half that of the accident pipe). As a reminder, the tolerance for the stub pipe counter bore was supposed to be Ø 0.20 mm (for a maximum permissible offset of 0.1 mm).
This problem was soon called to the attention of a design engineer, who escalated it up the chain of command in order to figure out what to do about the approximately 100 oil feed stub pipes that had already been released for service. At issue was whether the error would have safety implications. If there were no safety implications, then the plant could issue what it termed a “retrospective concession,” allowing the improperly manufactured products to remain in service. But if there were safety implications, then there would need to be corrective actions, perhaps even a recall. In order to find out, a manufacturing engineer was assigned to conduct a statistical analysis of the likely distribution of oil feed stub pipe counter bore non-conformances based on measurements taken on nine previously manufactured hubs that were still at the facility. Using a statistical analysis program, the engineer found that the likely maximum non-conformance of any stub pipe counter bore was Ø 0.7 mm (an offset of 0.35 mm). Engineering calculations showed that the resulting wall thickness would not seriously alter the service life of the stub pipe, which meant that there were no safety implications.
In reality, this statistical analysis was flawed because of the low number of data points, and the lack of assurance that the dataset of nine hubs was representative of hubs manufactured in previous years. Therefore, the result should really have been read as “a maximum likely non-conformance of Ø 0.7 mm plus or minus an uncertainty factor of unspecified magnitude.” However, the engineer was unfamiliar with the statistical analysis program and failed to clearly convey this uncertainty in the report that was submitted to the Non-Conformance Authority, the engineer empowered to make decisions about the acceptability of non-conformances. The Non-Conformance Authority took the report to mean that there were no safety implications, and signed off on the retrospective concession allowing the affected pipes, including the accident pipe, to remain in service.
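The small-sample problem is easy to demonstrate with a quick simulation. To be clear, everything here is hypothetical: the population parameters are invented, the Gaussian model is a simplification, and “mean plus three standard deviations” is a common rule of thumb that I am assuming for illustration, not the method the engineer or the software actually used. The sketch draws many samples from the same made-up population and shows how widely the estimated “likely maximum” swings when each sample contains only nine hubs.

```python
import random
import statistics

# Invented population of non-conformances, in Ø mm. Not real data.
TRUE_MEAN = 0.40
TRUE_SD = 0.10


def naive_upper_estimate(sample):
    """Rule-of-thumb 'likely maximum': mean plus three sample std deviations."""
    return statistics.mean(sample) + 3 * statistics.stdev(sample)


def estimate_interval(sample_size, trials=1000, seed=1):
    """Repeat the estimate on many random samples of the given size and
    return the range that contains the middle 95% of the estimates."""
    rng = random.Random(seed)
    estimates = sorted(
        naive_upper_estimate(
            [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(sample_size)]
        )
        for _ in range(trials)
    )
    return estimates[int(0.025 * trials)], estimates[int(0.975 * trials)]


for n in (9, 400):
    low, high = estimate_interval(n)
    print(f"n = {n:3d}: 95% of estimates fall between "
          f"Ø {low:.2f} and Ø {high:.2f} mm")
```

With nine data points the spread between the low and high estimates is several times wider than with 400, even though every sample comes from the same population. And this only covers sampling noise; it says nothing about whether the nine hubs were representative of earlier production, which was the other flaw in the analysis.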
Investigators noted that according to Rolls-Royce’s internal procedures, a retrospective concession also required the signatures of the Business Quality Director and, more importantly, the Chief Engineer, who had the power to decide whether any fleet-wide actions were warranted. Neither of these signatures was obtained for the retrospective concession that was granted to the oil feed stub pipes. The ATSB observed that the Hucknall facility was using the same paperwork for retrospective concessions as it used for concessions on non-conforming parts that were caught in-house, which did not require these extra signatures, so no signature field for them was provided. There was also nothing on the paperwork to indicate whether a concession was retrospective or not. As such, the Non-Conformance Authority might have been unaware that they lacked the right to unilaterally approve the concession without the consent of a higher-ranking engineer.
◊◊◊
After the accident, measurements were taken on all in-service HP/IP bearing oil feed stub pipes to determine the alignment of their counter bores. The majority were outside the Ø 0.20 mm tolerance on the manufacturing drawings, and several of them had non-conformances greater than the Ø 0.7 mm predicted by the statistical analysis. Four of the stub pipes had non-conformances even greater than the accident pipe, and two were found to have a staggering non-conformance in the vicinity of Ø 1.2 mm. These pipes likely would have failed in service, potentially causing repeats of the Qantas incident, had they not been caught.
Investigators also criticized the culture within the Hucknall facility that manufactured the HP/IP bearing hubs, identifying signs of complacency and widespread procedural non-compliance. A paperwork review showed that the required signatures were missing from 131 out of 138 retrospective concessions issued between 2009 and 2011, and a large number of minor non-conformances had not been properly handled through the normal chain of command. An internal review in 2007 had also previously found that the Hucknall facility lacked a “strong focus on quality within the business.” This appeared to extend to the creation of the original manufacturing drawings, which had been altered from the design drawings without the consent of the design engineers. Furthermore, initial inspections at the start of the production run were supposed to verify that the manufacturing process was creating products that satisfied the “design intent,” but the initial products were checked against the manufacturing drawings, not the design drawings. This verification was circular in nature and did nothing to ensure that the design intent was actually met. Had the design drawings been used, as procedures demanded, engineers likely would have discovered that the process was not producing stub pipes with the required tolerances.
◊◊◊
Following the accident, numerous safety actions were taken to prevent a recurrence and incorporate lessons learned. A non-exhaustive list of these actions includes the following:
· Qantas temporarily grounded its A380 fleet from November 4 to November 27, 2010.
· The European Aviation Safety Agency issued an airworthiness directive mandating inspections of all Trent 900 oil feed stub pipes.
· Rolls-Royce developed an IP turbine overspeed protection system. Airbus issued a mandatory service bulletin requiring its installation on all A380s within 10 flights.
· The entire original production run of HP/IP bearing hubs was removed from service and scrapped. All other HP/IP bearing hubs with an oil feed stub pipe wall thickness less than 0.7 mm were also removed from service.
· Rolls-Royce revised its procedures to ensure consultation between manufacturing and design engineers over the design intent of newly introduced parts.
· A number of efforts were initiated to change the culture around non-conformances at the Hucknall facility.
· Rolls-Royce ended the practice of retrospective concessions and instituted a new program for dealing with non-conforming parts that escaped into service.
· Airbus modified the landing performance software to more accurately predict the actual performance of the airplane at all landing weights.
Collectively, these reforms have done much to ensure that Rolls-Royce continues to produce quality products, and to maintain the perfect safety record of the Airbus A380, which to date has never suffered an accident resulting in injury to passengers.
◊◊◊
If you have made it this far, I first of all commend your patience, and/or your nerdiness. And second, I will speculate that everything you’ve read thus far has probably left a positive impression of modern aviation safety. The sequence of events required to merely wound, not kill, this Airbus A380 was absurdly long, passing numerous gates at which the progress toward disaster could have been stopped. And yet the system still held its ground. According to the Swiss cheese model of safety, an accident happens when the holes in the stacked Swiss cheese slices align, allowing a hazard to pass straight through unhindered. The hazard in this case penetrated countless Swiss cheese slices, from the drawing board to the manufacturing floor to the inspection room and beyond. But the industry has put up so many slices of cheese that even this impressive run was insufficient to put a scratch on so much as a single passenger. Even the airplane itself ultimately survived: after a marathon repair that lasted 535 days and cost $139 million, the A380 Nancy-Bird Walton triumphantly returned to the skies in 2012.
A number of intangible lessons can be drawn from both the successes and failures along the road to Qantas 32, from the continued importance of experience and judgment in the cockpit, to the negative consequences of believing that a tiny non-conformance couldn’t possibly be consequential. The fact remains that a deviation of less than half a millimeter nearly brought down the world’s largest passenger airliner. Aviation is, and has always been, unforgiving of even the smallest flaws. The devices, designs, and decisions that kept Qantas flight 32 in the air didn’t appear from nothing, but are rather the collective result of rules, regulations, and forward-thinking policies imposed by people upon that unforgiving substrate. In that sense, the upshot of the drama aboard flight 32 is that at the end of the day, the system is working.