Into the Valley of Death: The crash of American Eagle flight 4184 and the ATR icing story

Admiral Cloudberg
Sep 2, 2024


The wreckage of American Eagle flight 4184 lies scattered in a fallow field outside Roselawn, Indiana. (Indianapolis Star)

On the 31st of October 1994, an American Eagle ATR 72 with 68 passengers and crew on board abruptly plummeted from the sky over northwestern Indiana, shattering the aircraft and its occupants against a muddy field near the rural town of Roselawn. There were no survivors.

The crash proved to be the darkest moment in a decade-long saga that put French-Italian manufacturer ATR under fire over the fundamental safety of its airplane. At issue was the turboprop commuter plane’s ability to handle atmospheric icing, a danger that has bedeviled aviators since the dawn of flight. In fact, American Eagle flight 4184 was ripped from the sky without warning after a buildup of ice induced a terrifying control anomaly that came to be known as “aileron hinge moment reversal,” which sent the aircraft rolling into an inverted dive in a matter of seconds. The questions raised were far-reaching, from the adequacy of aircraft certification, to the underappreciated hazards of freezing rain, to the controversial question of whether the accident could have been prevented, and by whom. Did ATR know that its plane had a hidden flaw? Did American and French authorities ignore warnings before the deadly crash? Could the pilots have avoided the disaster? And most of all, was the ATR safe? The debate over these issues formed the centerpiece of a far-reaching drama involving complex technical analysis, multiple fatal accidents, and countless close calls stretching from 1986 to the present day. Pieced together from official documents, books, and expert interviews, this is the story of what really happened in the years, hours, and minutes leading up to that fateful moment over Roselawn, Indiana — a moment whose echoes still reverberate three decades later.

◊◊◊

An American Eagle ATR 72–212, N288AM, sister ship of the accident airplane N401AM. The accident airplane would have appeared identical to the aircraft in this photo. (Thomas Brosson)

Part 1: Halloween, 1994

The 31st of October 1994 was a blustery, damp day over much of the Midwestern United States. Temperatures in Indiana and Illinois that afternoon hovered in the mid-40s F (upper single digits C) with steely gray skies and intermittent drizzle and rain, promising a mildly unpleasant evening for Halloween trick-or-treaters.

At Indianapolis International Airport, numerous airplanes sat parked beside the terminal, boarding and de-boarding the thousands of people who travel each day by air to and from the city that bills itself as the Crossroads of America. One of these planes was an unassuming 66-passenger ATR 72 twin turboprop, painted in the red, white, and blue color scheme of American Airlines, the world’s largest passenger air carrier. The name on the side of the plane was “American Eagle,” the brand name for American Airlines’ regional services.

American Eagle was not actually an airline, nor was it an American Airlines subsidiary. American Eagle, officially AMR Eagle, was essentially four different airlines in a trench coat, all of which were owned by the AMR Corporation, which also owned American Airlines. AMR Eagle didn’t have an air operator certificate; rather, flights under its brand were at that time operated by Flagship Airlines, Wings West Airlines, Executive Airlines of Puerto Rico, or Simmons Airlines, each of which had its own internal structure but was subject to certain standardized policies handed down from AMR Eagle. Most of the passengers boarding American Eagle flights were probably completely unaware of this arrangement.

Period Simmons Airlines timetables, from before they were acquired by American Eagle. (Northwest Airlines History Center)

Simmons Airlines was founded in 1979, one year after the deregulation of the US airline industry allowed the entry of new companies into the air travel market. The end of price controls caused legacy airlines to abandon many shorter routes and smaller cities that could not be profitably served by their all-jet fleets, which opened up a niche for new regional airlines that filled the gap with smaller, cheaper turboprops. Simmons Airlines got its start in the town of Marquette in Michigan’s sparsely populated Upper Peninsula, where it flew a fleet of 5-passenger Piper Aerostars. The airline grew rapidly, incorporating larger aircraft such as the Embraer EMB-120 and the Short 360, until in 1986 it acquired the brand new ATR 42, a 48-passenger high-wing turboprop manufactured by the newly created French-Italian consortium Avions de Transport Régional (ATR). In 1991, the airline upsized again, adding the updated ATR 72, a stretched version of the ATR 42 with a higher passenger capacity. By 1994, Simmons owned or leased 79 aircraft that collectively performed 565 flights a day, mostly out of Chicago and Dallas.

Under American Eagle branding, Simmons operated a regular ATR 72 service multiple times a day between Chicago and Indianapolis, the capital of neighboring Indiana. On Halloween 1994, the commander on the midday round trip was 29-year-old Captain Orlando Aguiar, who had over 7,800 flying hours including over 1,500 on the ATR 42 and 72. He was described by his coworkers as outgoing, personable, and easy to work with. Similar comments had been made about the second-in-command, 30-year-old First Officer Jeff Gagliano, who was actually more experienced on ATR aircraft, with over 3,600 of his 5,167 total hours having been spent on the French-Italian turboprop.

In the cabin, the passengers were under the care of senior flight attendant Sandi Modaff and junior flight attendant Amanda Holberg, who had just graduated from training and was enjoying her very first day on the job.

A map of the route of flight 4184 with relevant en route waypoints highlighted. (Own work, map by Google)

After picking up the plane in Chicago at 10:49, the crew flew to Indianapolis without incident, ate lunch, and prepared for the return journey, scheduled to depart at 14:10 with the flight number American Eagle 4184. As 64 passengers boarded the ATR 72, the pilots would have been reviewing the dispatch paperwork, which included the latest weather reports for their route and destination. Conditions were not substantially different than they had been three hours earlier. En route aerodromes were reporting various levels of rain, fog, low overcast, and gusty winds. The forecast summary prepared by American Airlines indicated that rain would continue into the night, associated with the passage of a cold front, and that scattered thunderstorms were possible. The freezing level was reported to be near 10,000 feet. The National Weather Service had not issued any SIGMETs, short for significant meteorological information, which would have indicated particularly adverse conditions.

The NWS had issued an AIRMET (airman’s meteorological information) forecasting light to moderate turbulence and icing in clouds below 19,000 feet in northern Indiana and Illinois, but Simmons Airlines didn’t normally include AIRMETs in its dispatch package. The justification for this policy is not stated in any of the documents that I was able to acquire. Alternatively, crews could acquire any applicable AIRMETs while underway by tuning in to the HIWAS (Hazardous Inflight Weather Advisory Service), but it is unknown whether that happened in this particular case because the cockpit voice recorder only captured the last 30 minutes of the flight.

The contents of the AIRMET were significant because icing is a serious hazard to aircraft, and especially turboprops. Snow, sleet, and other freezing precipitation with a high water content can adhere to the aircraft in flight if the temperature is within the “sticky” band stretching from approximately -20˚C to +5˚C. The resulting ice buildups can have a number of deleterious effects, including but not limited to reduced propeller efficiency, increased weight, and decreased lifting capacity of the wings. As a result, all transport aircraft are equipped with anti-icing systems that protect critical areas such as the windscreen, propellers, and external sensors. Turboprop aircraft are also typically equipped with rubber deicing boots on the leading edges of the wings, which inflate cyclically to crack and remove any ice accumulations. But while these systems are effective, they are imperfect, and flight in icing conditions requires increased crew awareness in order to identify any negative performance changes. The fact that the crew of flight 4184 wasn’t provided with any information about the forecast icing conditions was therefore less than ideal. However, “light to moderate” conditions are by definition within the capabilities of transport category aircraft, so even if the information had been provided, it probably wouldn’t have altered the flight crew’s decision-making as they prepared for departure.

◊◊◊

At 14:14, just a few minutes behind schedule, flight 4184 pushed back from the ramp, ready to taxi. Takeoff was, however, not imminent.

Due to gusty winds and poor visibility, Chicago O’Hare International Airport was operating below full capacity. The task of preventing the Chicago area from becoming oversaturated fell to the federal flow control office, which coordinates air traffic across the United States in order to optimize the use of national airspace. (For more information about flow control, you can check out my article on Avianca flight 052.) That day, the central flow control facility had implemented a ground hold program for Chicago-bound aircraft, effective from 12:00 to 18:00, in order to ensure that the number of aircraft arriving in Chicago remained at or below the number that the airport could handle. As a result, the local flow control branch in Chicago informed Indianapolis that flight 4184 would be held on the ground until at least 14:52.

The idea behind such a ground hold is that it’s safer and more efficient to prevent an aircraft from taking off than it is to place it into a holding pattern after it’s airborne. In fact, when flow control is working as intended, no aircraft should experience prolonged holding in the air. Of course, most experienced air travelers still have stories about circling for ages waiting to land, which simply reminds us that the system is imperfect.

As 14:52 approached, the Indianapolis controller asked Chicago flow control whether flight 4184 could be released for takeoff, or whether further delays would be required. Flow control replied that flight 4184 could depart but that they should expect “a little bit of holding in the air,” which was then relayed to the crew, who began preparing for takeoff. So much for the dream of perfect efficiency!

Flight 4184 finally took off from Indianapolis at 14:55, with a cleared cruising altitude of 14,000 feet. While First Officer Gagliano flew the airplane, Captain Aguiar contacted the Danville sector of the Chicago Air Route Traffic Control Center (ARTCC), which cleared them to the Chicago Heights radio beacon, southeast of their destination. Minutes later, flight 4184 requested and received clearance to climb to 16,000 feet, possibly to avoid moderate turbulence, and after reaching that altitude, the flight was transferred to the Boone sector of the Chicago ARTCC. The crew then requested a descent to 10,000 feet in order to comply with an altitude crossing restriction southeast of Chicago Heights, which was granted. The crew must have seen some ice accumulating on the airplane as they descended, because flight data shows that they activated all the anti-icing and de-icing systems at that point.

An airspace map of northwestern Indiana and northeastern Illinois shows the area where flight 4184 was instructed to hold. Note the “holding area” box, added by the NTSB, which shows the zone of operations for aircraft holding at LUCIT. (NTSB)

At 15:18, however, the flight crew received the news they knew was coming: they would have to hold in the air. A rush of traffic was approaching Chicago from the west, and controllers in eastern sectors, including Boone, had been ordered to hold inbound flights until more space was available. As a result, flight 4184 was informed that they could expect to hold around the LUCIT intersection in northwestern Indiana until 15:30. One minute later, this time was revised to 15:45, and the controller advised that the holding pattern would be performed with right-hand turns, ten-mile legs, and a speed no greater than 175 knots, which is a blanket maximum holding speed for turboprop aircraft.

At 15:24, flight 4184 reached the LUCIT intersection and entered the holding pattern at an altitude of 10,000 feet. Apparently the pilots observed that they were no longer picking up any ice, because at that time they turned the deicing boots off. However, their exact reasoning is unknown because the cockpit voice recording didn’t start until 15:28.

The crew of flight 4184. Reprinted in “Unheeded Warning” by Stephen Frederick.

As the recording began, the junior flight attendant Amanda Holberg could be heard entering the cockpit to learn about life in the pilot’s seat. A commercial music station was blaring over the radio. “Is that like stereo, radio?” Holberg exclaimed. “You don’t have a hard job at all! We’re back there slugging with these people…”

“Yeah you are,” said Captain Aguiar. “We do have it pretty easy. I was telling Jeff I don’t think I’d ever want to do anything else but this.”

The friendly conversation eventually shifted to the topic of their arrival time. “We already got two people that have already missed their flight,” Holberg reported. “Three fifteen is one of them.”

“Three fifteen, three fifteen?” Aguiar asked, incredulous.

“It’s all your fault!” Holberg joked. “Uh huh. We weren’t due into Chicago until three fifteen.” Obviously this passenger never would have made the connection even if the plane had been on time.

“She’s lying then,” said Aguiar.

“You know what we deal with out here?” Holberg vented.

“Ya, you should hit her,” Aguiar lightheartedly suggested.

Further non-pertinent conversations were redacted from the transcript, until 15:31, when Holberg asked, “What do you all do up here when autopiloting? Just hang out?”

“You still gotta tell it what to do,” said First Officer Gagliano, momentarily taking his attention away from flying the plane.

“If the autopilot didn’t work, he’d be one busy little bee right now,” said Aguiar. Gagliano laughed.

“So does the FO do a lot more work than you?” Holberg asked.

“Yep,” said Aguiar.

“Not really,” Gagliano contested.

A couple minutes later, as flight 4184 negotiated the turn at one end of the holding pattern, Captain Aguiar commented, “Man this thing gets a high deck angle in these turns. We’re just wallowing in the air right now.”

According to the flight data, the aircraft tended to lose speed in the turns, probably due to the use of the autopilot’s high bank angle setting with a heavy aircraft. In order to prevent a corresponding loss of altitude, the autopilot tended to compensate by raising the nose slightly to increase lift. The flight data doesn’t indicate that this had any significant effect that would have been noticeable to the occupants, but apparently Aguiar perceived that the nose was too high. The passengers would be more comfortable if it was a little bit lower, he thought.

“You want flaps fifteen?” Gagliano suggested.

Extending the flaps to fifteen degrees would increase the lift generated by the wings, allowing the autopilot to maintain altitude with a lower pitch angle. In cruise flight this could also be done by increasing their speed, but in the holding pattern they were limited to 175 knots.

“I’ll be ready for that stall procedure here pretty soon,” Aguiar said. The aircraft was not in danger of stalling but the comment revealed that he was at least paying attention. Regarding the flaps, he added, “Do you want to kick ’em in? It’ll bring the nose down.”

“Sure,” said Gagliano. Captain Aguiar responded by moving the flap handle to the 15 degree position, and the flaps deployed. The nose immediately dropped a couple degrees.

“Guess Sandi’s going ‘ooo,’” Aguiar joked.

How extending the flaps allows the wings to produce the same amount of lift with a lower AOA. (Boldmethod)
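
For readers who like numbers, the trade-off the pilots were discussing can be sketched with the standard lift equation, L = ½ρV²S·C_L. The short Python sketch below is only an illustration: the aircraft mass, wing area, 30-degree bank, lift-curve slope, and zero-angle lift coefficients are round-number assumptions of mine, not figures from the flight data or the NTSB report, and the function names are hypothetical. It shows two things: a banked turn raises the lift coefficient the wing must produce, and a flap setting that shifts the lift curve upward lets the wing reach that coefficient at a lower angle of attack.

```python
import math

RHO_SEA_LEVEL = 1.225   # kg/m^3, standard sea-level air density
G = 9.80665             # m/s^2, standard gravity
KNOTS_TO_MS = 0.514444  # metres per second in one knot


def required_lift_coefficient(mass_kg, airspeed_kt, wing_area_m2, bank_deg=0.0):
    """Lift coefficient needed to hold altitude at a given speed and bank.

    Airspeed is treated as equivalent airspeed, so sea-level density is
    used for dynamic pressure regardless of the actual altitude.
    """
    speed_ms = airspeed_kt * KNOTS_TO_MS
    dynamic_pressure = 0.5 * RHO_SEA_LEVEL * speed_ms ** 2
    load_factor = 1.0 / math.cos(math.radians(bank_deg))  # level, coordinated turn
    return load_factor * mass_kg * G / (dynamic_pressure * wing_area_m2)


def angle_of_attack_deg(cl_required, cl_at_zero_aoa, slope_per_deg=0.1):
    """Invert an assumed linear lift curve: C_L = cl_at_zero_aoa + slope * AOA."""
    return (cl_required - cl_at_zero_aoa) / slope_per_deg


# Illustrative assumptions only (not accident data):
MASS_KG = 20_000      # assumed aircraft mass
WING_AREA_M2 = 61.0   # assumed wing area
CL0_CLEAN = 0.3       # hypothetical lift coefficient at zero AOA, flaps up
CL0_FLAPS_15 = 0.6    # hypothetical value with flaps 15

cl_level = required_lift_coefficient(MASS_KG, 175, WING_AREA_M2)
cl_turn = required_lift_coefficient(MASS_KG, 175, WING_AREA_M2, bank_deg=30)

aoa_clean = angle_of_attack_deg(cl_turn, CL0_CLEAN)
aoa_flaps = angle_of_attack_deg(cl_turn, CL0_FLAPS_15)

print(f"C_L needed, wings level: {cl_level:.2f}")
print(f"C_L needed, 30 deg bank: {cl_turn:.2f}")
print(f"AOA in the turn, flaps up: {aoa_clean:.1f} deg")
print(f"AOA in the turn, flaps 15: {aoa_flaps:.1f} deg")
```

With any numbers of this general shape, the turn demands a higher lift coefficient than level flight, and the flapped wing meets that demand at a lower angle of attack, which is the effect Aguiar was after. Real lift curves are not linear near the stall, and ice changes them, so the sketch says nothing about the accident itself.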

At 15:33, Holberg began asking about aircraft systems, and Aguiar demonstrated all the different alarms that could be produced by the ground proximity warning system. A caution alert chime was also heard, which was not associated with anything Aguiar was doing, but it was not acknowledged by the crew. Most likely, it came from the aircraft’s anti-icing advisory system, a device that warns the crew when ice is detected on a wing-mounted sensor. The aircraft was flying intermittently into cloud tops that ranged from 9,000 to 10,600 feet, likely causing an accumulation of ice. Flight data showed a slight increase in drag on the aircraft at around the time the alert sounded, but there was no change in performance that would have been noticeable to the pilots.

Off-topic conversations continued until 15:38, when the Boone sector controller called to add another 15 minutes to their hold, with an estimated release at 16:00. With 22 minutes still to wait, the pilots resumed chatting with Holberg and listening to music. However, at 15:41, the anti-icing advisory system emitted another caution chime, and this time the pilots took notice. In accordance with the proper procedure, they responded by turning on the deicing boots. Moments later, Holberg left the cockpit.

At 15:43, the pilots began discussing whether to inform dispatch about their additional delays. Gagliano seemed to have some difficulty operating the aircraft communications addressing and reporting system, or ACARS, which pilots use to send text messages to company operations. While Gagliano fiddled with the system, Captain Aguiar turned off the music and made a lengthy passenger announcement apologizing for the delays.

At 15:48, Gagliano commented, “That’s much nicer, flaps fifteen.”

“Yeah,” said Aguiar. Seconds later, one of the two pilots said, “I’m showing some ice now.” Most likely they were looking at the icing probe mounted just outside the cockpit window, which was designed to give pilots a visual indication of the amount and type of ice that was probably building up on the wings. It was not determined which pilot made the comment, and it was not verbally acknowledged.

“I’m sure that once they let us out of the hold and forget they’re down we’ll get the overspeed,” Aguiar said, referring to the flaps. The maximum speed for flaps 15 was 185 knots, but the flap overspeed warning tended to sound closer to 180. As soon as they started descending from the hold, their speed would likely increase from 175 to 180 knots, triggering the warning, unless the flaps were retracted earlier. Aguiar felt that they were likely to forget.

Moments later, he added, “I can’t hold anymore man, that big cup needs out right now.”

“They’re gonna be giving you dirty looks, man,” Gagliano said, referring to the passengers’ reaction to seeing the captain coming back to use the toilet.

Over the intercom, the senior flight attendant Sandi Modaff called the cockpit to inquire about their fuel status and whether they would have to divert. Aguiar informed her that they had “plenty of gas,” then left the cockpit to use the bathroom. In his absence, Gagliano continued chatting with Modaff about the cabin temperature.

Arriving at the toilet, Aguiar found that it was occupied. Since the intercom was nearby, he took it from Sandi and called Gagliano. “Hey bro,” he said. “Getting busy with the ladies back here.”

“Oh,” said Gagliano. Modaff snickered.

“Yeah, so if I don’t make it up there within the next, say, fifteen or twenty minutes, you know why,” he joked.

“I’ll uh, when we get close to touchdown I’ll give you a ring,” Gagliano shot back.

Five minutes later, having done his business, Aguiar returned to the cockpit and announced, “We have a brand new hombre.”

The pilots settled again into a discussion of their delays and communications with dispatch. Aguiar searched for dispatch information about passengers’ connecting gates but couldn’t find it. “And you haven’t heard any more from this controller chick, huh?” he added.

“No, not a word,” Gagliano replied. Moments later he commented, “We still got ice.” Aguiar didn’t acknowledge him. The flight data indicated that there was still no sign that the ice was noticeably affecting the performance of the airplane.

Then, at 15:56, the controller called and said, “Eagle flight one eighty four, descend and maintain eight thousand.” A second later, having heard no reply, she asked, “Chicago, do you copy, forty one eighty four?”

“Forty one eighty four, go ahead,” said Gagliano. On his own radio, Aguiar was simultaneously reporting their delay and fuel status to the company.

“Eagle forty one eight four, descend and maintain eight thousand,” the controller repeated.

“Down to eight thousand, Eagle flight one eighty four,” Gagliano acknowledged. At the same time, their traffic collision avoidance system announced, “Traffic, traffic,” indicating that another airplane was nearby. This flight was passing over their heads and was in fact the reason why the controller had ordered them to descend. Since the airplanes were at different altitudes and diverging, the flight crew didn’t discuss or react to the “traffic” alert.

“Eagle flight one eighty four, uh, should be about ten minutes till you’re cleared in,” the controller added.

“Thank you,” said Gagliano. Turning to Aguiar, he continued, “They say ten more minutes.”

At 15:57, Aguiar signed off with the company and asked Gagliano, “Are we out of the hold?”

“Uh, no, we’re just going down to eight thousand,” said Gagliano.

“Okay.”

“And uh, ten more minutes she said,” Gagliano repeated.

Seconds later, as they descended toward 8,000 feet, their speed increased past 180 knots, and the flap overspeed warning sounded.

“Oop,” said Gagliano.

“We, I knew we’d do that,” said Aguiar, who immediately retracted the flaps.

“I was trying to keep it to one eighty,” Gagliano explained.

Then, just two seconds later, all hell broke loose.

◊◊◊

This animation of the loss of control and crash of flight 4184 appeared in Mayday season 7 episode 8: “Frozen in Flight.”

At 15:57 and 29 seconds, a series of loud thumps was heard, and almost instantly the pilots’ control wheels slammed with incredible force to the right-wing-down stop. The autopilot disconnected and the airplane banked dramatically to the right, rolling over at a rate of 50 degrees per second, far beyond the normal operating envelope.

Both pilots let out exclamations of surprise, and Gagliano grabbed the controls in an attempt to recover, managing to stop the roll at a dramatic 77 degrees right. With the airplane flying almost on a knife’s edge, the wings could no longer hold it up, and the nose dropped as the plane began to dive toward the ground. Captain Aguiar immediately reversed his last input, moving the flap handle back to 15, but since their airspeed was now above 185 knots, a computer blocked the flaps from extending.

Uttering expletives, Gagliano started banking back to the left and raised the nose, which appeared for several seconds to be effective. But he only got as far as 59 degrees right before the controls slammed again to the right-wing-down stop, ripping the yoke out of his hands. The plane rapidly banked past the vertical and simply kept going, rolling more than 450 degrees (one and a quarter full rotations) before Gagliano managed to stabilize the bank at 144 degrees right. Amid multiple shouts and expletives, he started forcing the wings back to the left again, decreasing the roll angle, but by now the airplane was pitched 60 degrees nose down and its airspeed was rapidly increasing. Acceleration forces reached three times the force of gravity.

“Mellow it out,” Aguiar exclaimed.

“Okay!”

“Autopilot’s disengaged!”

“Okay!”

“Nice and easy!” Aguiar urged. If they pulled up from the dive too quickly, they would increase the G-forces beyond the structural strength of the aircraft and all would be lost. But with the airplane descending through 6,000 feet with a vertical speed of -24,000 feet per minute, they were already doomed.

“TERRAIN!” blared the ground proximity warning system. “PULL UP! PULL UP!”

Letting loose another expletive, Gagliano increased his efforts to pull up. They were below the clouds now and the ground was rapidly approaching, fields and farms filling their windscreen. Staring into the face of death, the terrified first officer hauled back on his controls, until at 3.6 G’s the airplane broke. The tail and both wingtips separated in flight, a loud crunching sound was heard, and both flight recorders stopped at about 1,000 feet above the ground. The last recorded airspeed was a blistering 375 knots, with the wings level and a nose-down pitch of 38 degrees.

Two seconds later, the disintegrating hulk of the ATR 72 slammed into a field with incredible force. The only witness was a farmer, who caught a split-second glimpse of the airplane as it plummeted from the leaden sky, and then it was gone.

◊◊◊

Most of the ATR 72, except for the tail, was shattered into fragments generally smaller than a standard sheet of paper. (Indianapolis Star)

Part 2: All Eyes on Roselawn

American Eagle flight 4184 came down in a muddy, fallow field six kilometers southwest of Roselawn, Indiana. The unimaginable violence of the impact reduced most of the aircraft to a spray of mangled debris strewn for hundreds of meters beyond a smoking, riven crater. Only the tail remained intact, lying forlorn atop the barren furrows. By the time local residents and first responders reached the scene, it was obvious not only that none of the 68 people on board had survived, but that there would be no intact bodies to send home.

Within an hour of the crash, the news began to flash across television screens throughout America, and the National Transportation Safety Board was assembling a team to travel to the scene to begin an urgent investigation. Across the country and indeed the world, the families and friends of those on board also learned of the disaster in Indiana, and at airports and offices throughout the region, so too did the pilots and staff at American Eagle. Across the Atlantic in Europe, executives and engineers at ATR got the same dreaded phone call, as did officials at the Federal Aviation Administration, the French Bureau of Inquiry and Analysis, and the Directorate General of Civil Aviation. Each would approach the crash with their own motivations and preconceptions.

Teams of body recovery experts gather the remains of the victims for transportation to temporary morgues. (Lafayette Journal & Courier)

One of the first tentative conclusions reached by the NTSB on November 1st, the first full day of the investigation, was that the aircraft had broken up in flight at a low altitude. The structural failure did not appear to be causal, however, because radar data showed that the plane went out of control at a much higher altitude, well before it broke apart. No debris was found more than a couple hundred meters from the point of impact.

From the very beginning, there was suspicion that ice might have had something to do with the loss of control. Weather forecasts warned of light to moderate icing, but the forecasts can be wildly off, and much worse icing could have existed. Tracking down other pilots who were in the area, the NTSB learned that many of them had indeed reported ice near flight 4184’s last position, but their reports largely matched the forecast.

The top priority of the NTSB at that stage was to recover the flight recorders, popularly known as black boxes. The cockpit voice recorder and flight data recorder were found inside the intact tail cone, having survived the crash with superficial damage; both were carefully packaged and rushed to Washington, D.C. for analysis.

At the crash site, the scale of the human destruction forced authorities to declare the field a biohazard, and investigators working to recover evidence had to follow strict decontamination protocols. Recovery crews would ultimately find thousands of fragmented human remains. Those who were there remembered it as possibly the grisliest crash scene ever encountered by the NTSB. Officials tried, unsuccessfully, to keep this unpleasant information away from the victims’ families.

Pieces of the tail cone lie in the field where the flight came to rest. Red flags mark the location of fragmented human remains. (Lafayette Journal & Courier)

In Washington, investigators downloaded the flight data and listened to the cockpit voice recording. The tapes conveyed a casual cockpit atmosphere, as the pilots passed the time in the holding pattern with jokes, music, and conversations. There was no sign that the pilots identified anything amiss with the flight, other than a couple of idle comments about picking up some ice. But everyone flying in the Chicago region that day encountered ice, and none of those airplanes ran into trouble. The Great Lakes region of the United States has some of the most frequent and varied icing conditions in the world, and the crew had likely seen similar conditions many times before.

The FDR data suggested nothing abnormal until the last two seconds before the aircraft abruptly departed controlled flight. But what the NTSB saw at that point was stunning. Moments after Captain Aguiar retracted the flaps, the ailerons, which control roll, started to move in a right-wing-down direction all by themselves. At first, this movement was slight, but after a moment the ailerons slammed to the maximum right-wing-down deflection in less than one third of a second. This was well beyond the deflection rate that can be commanded by the autopilot, which was engaged at the time. It was also beyond the rate that could be achieved by a pilot moving the control wheel. Unable to prevent the deflection from occurring, the autopilot immediately disengaged, and the aircraft banked out of control, reaching 77 degrees right in 1.5 seconds.

The hinged panel on the outboard portion of this ATR 72 wing is the aileron. (Olivier Cleynen)

Clearly the ailerons had deflected under the influence of some force other than the autopilot or the pilot. But as for what could cause such a massive movement, there were few possibilities. A failure of the control cables would tend to lock the ailerons in the neutral position, and the ailerons didn’t have hydraulic actuators that could have driven them to maximum deflection. Furthermore, the ailerons were discovered mostly intact near the start of the debris field, having separated along with the wingtips during the in-flight breakup, and no sign of a jam or malfunction was found. Besides, the flight data showed that during the final stages of the dive, the ailerons responded normally to pilot inputs — so whatever went wrong, it was transient in nature.

The most powerful force acting on an airplane’s flight controls is the flow of air around the exterior of the aircraft. In fact, without this airflow, flight controls won’t do anything at all. But if this airflow starts to behave unusually, then the results can become unpredictable — perhaps, hypothetically, the airflow could even “snatch” an aileron and fling it to maximum deflection. And what else could so disrupt the airflow but ice? Was it really possible that ice, in the right shape and right location, could cause an aileron to deflect all by itself?

To understand that question, we need to go back to the beginning — to the way the ATR 72 was designed, and the way its engineers believed it would handle in icing conditions.

◊◊◊

Part 3: The Regional Airliner of the Future

An early 1980s advertisement for the upcoming ATR 42

By the late 1970s, aircraft manufacturers around the world had largely stopped producing large passenger turboprops with more than 20 seats. Just about the only Western aircraft in that category that remained in production during the 1970s were the Dutch-built Fokker F-27 and the British-built Hawker Siddeley HS 748. Both were reasonably popular within their particular niches, but neither had much of an impact in the United States, the world’s largest airline market. That only began to change with the deregulation of 1978, when major airlines were allowed to pivot their jet-heavy fleets away from shorter routes.

Several European and Canadian manufacturers saw this shift as an opportunity to bring back large passenger turboprops. Jet engines are more efficient than propellers only at high speeds and altitudes, which makes turboprops more economical on short flights. For a route like Indianapolis to Chicago — large population centers located in neighboring states, barely an hour’s flying time apart from runway to runway — a big turboprop made more sense than a jet. But in 1979, the only options available on the market were turboprops with 19 seats or less, or full size jets like the DC-9 and 737. What the airlines really needed was something with two big propellers and 40–60 seats. The F-27 and HS 748 were still in production, but these airplanes were 1950s technology, and were hardly worth considering for most carriers. No, what they wanted was something new — something like, say, the ATR 42.

In 1981, the French manufacturer Aérospatiale joined forces with Italian aerospace company Aeritalia (later Alenia) to fill this gap by producing a high wing twin turboprop with room for 40–50 passengers that would be cheap to operate, efficient, and equipped with the latest technology. Under the brand name Avions de Transport Régional (Italian: Aerei da Trasporto Regionale), or ATR for short, the consortium designed and built the aircraft that was ultimately christened the ATR 42. The plane first entered service in 1985 with France’s Air Littoral.

The ATR was groundbreaking in several respects, but for the purposes of this story, one of the most notable characteristics was its high wing aspect ratio. The wings were quite long in a spanwise direction (root to tip) relative to their chordwise width (leading edge to trailing edge), in comparison to other turboprops at the time. A higher aspect ratio wing is more efficient, but it’s also more difficult to design and build, which is why aspect ratios keep getting higher with every new generation of aircraft as technology advances. ATR did it in 1981 by building the airliner’s wings out of composite material, which was revolutionary at the time.

The ATR’s high aspect ratio wings can be more clearly seen in this photo from a high vantage point. (Menkor Aviation)
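For readers who like numbers: aspect ratio is conventionally defined as wingspan squared divided by wing area, which for a simple rectangular wing reduces to span divided by chord. The sketch below uses made-up dimensions purely for illustration; they are not real ATR figures.

```python
def aspect_ratio(span_m: float, wing_area_m2: float) -> float:
    """Aspect ratio = span squared / planform area (dimensionless)."""
    if span_m <= 0 or wing_area_m2 <= 0:
        raise ValueError("span and area must be positive")
    return span_m ** 2 / wing_area_m2


# Two hypothetical wings with the same 60 m^2 area (illustrative numbers only).
stubby = aspect_ratio(span_m=20.0, wing_area_m2=60.0)   # 400 / 60, roughly 6.7
slender = aspect_ratio(span_m=27.0, wing_area_m2=60.0)  # 729 / 60, roughly 12.2

print(f"stubby wing AR:  {stubby:.1f}")
print(f"slender wing AR: {slender:.1f}")
```

For the same area, the longer and narrower wing has nearly double the aspect ratio, which is the property the paragraph above describes.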

Just as critical as aspect ratio was airfoil shape. Most airfoil designs ultimately trace their ancestry back to a database of airfoil shapes developed in the 1940s by the NACA, the predecessor to NASA. The ATR 42’s wing was no exception. Starting with a NACA design, ATR engineers tweaked and modified the shape to achieve a much greater lifting efficiency. How exactly they managed that is well above my pay grade, but the long and short of it is that the ATR wing possessed miraculous lifting capability, but only as long as the wing remained perfectly uncontaminated. Any small airflow perturbations would cause the enhanced airfoil characteristics to revert very quickly. It should be noted that this is a feature of any high performance wing, not just the ATR’s.

It has been observed that this design gives the ATR rather abrupt stall behavior.

A stall occurs when airflow separates from the upper wing surface, creating a region of turbulent air above the wing that prevents it from generating lift. This separation occurs when the angle of attack, the angle of the wing into the oncoming air, becomes too high for the air to flow smoothly over the top. On most aircraft, increasing the angle of attack provides more lift up until the point of airflow separation, but as that point approaches, the relationship between angle of attack and lift becomes less efficient and heavy buffeting can occur. The ATR doesn’t experience any of that. As the stall approaches, there is very little buffeting and lift continues to increase steadily almost until the moment of airflow separation.

Like other transport aircraft, the ATR is equipped with a stall protection system that shakes the pilots’ control columns to warn them when the angle of attack (AOA) is too high. If the AOA continues to increase, then a stick pusher will intervene to physically force the nose down and reduce the angle of attack. Not all aircraft have stick pushers, but T-tail airplanes (those with the horizontal stabilizer and elevators mounted atop the vertical stabilizer rather than on the fuselage) are required to have them in order to prevent “deep stall,” which occurs when the plume of disrupted airflow over a stalled wing envelops the elevators, eliminating the pilot’s ability to control pitch. Recovery from a deep stall is essentially impossible.

On the left is a generic graph of lift coefficient vs. angle of attack (also sometimes called angle of incidence). On the right is a graph of lift coefficient vs angle of attack (incidence) for the ATR 72 specifically. Can you see how the stall is more abrupt on the ATR 72? (TU Delft, Magnar Nordal)

For the purposes of this story, however, the most important flight controls are the ailerons. Although the ATR has a hydraulic system, its ailerons are not hydraulically powered; rather, they’re connected directly to the pilots’ control wheels by a system of interconnected cables and pulleys, which is highly significant to this particular accident.

A control surface is, at the end of the day, just a big, flat panel that sticks into the airflow and harnesses the power of the rushing air to turn the plane. But every force induces an equal and opposite reaction, which means that as deflection increases, so does the force that reacts against it, striving to push the control surface back to the neutral position. On all but the tiniest, most docile airplanes, it would be unreasonable to expect the pilot to apply this force via brute strength alone.

One way to reduce control forces is to have hydraulically boosted controls that can vastly amplify the pilot’s strength. Hydraulically actuated controls are extremely powerful, but there are some drawbacks — mainly that they weigh a lot and are mechanically complicated. Because ATR was trying to design an extremely efficient airplane, they saw this extra weight as a significant drawback, and turned instead to simpler means of control force reduction.

On the ATR, control forces are reduced using devices called “horns.” These horns are attached to the outboard edges of the ailerons, are exposed at the wingtips, and extend significantly forward of the hinge line. When the aileron deflects up, for example, the horn — located on the opposite side of the hinge line — deflects down. While the airflow over the top of the wing tries to force the deflected aileron back to neutral, the airflow over the bottom of the wing catches the exposed horn and tries to push it forward, applying an opposite force about the hinge line. Although the force returning the aileron to neutral is necessarily greater, the assistance from the control horn ensures that the pilot can counteract it without exceptional strength.

A diagram of the ATR 72’s ailerons with the horn highlighted. Can you see how the horn moves in the opposite direction from the rest of the aileron because it’s on the opposite side of the hinge line? (NTSB)
How control horns actually function, very roughly. Not to scale. (Own work)

ATR was not alone in making these design decisions. Several other turboprop airliners designed around the same time, including the de Havilland Canada DHC-8 and the Fokker 50, also feature unpowered ailerons with horns to reduce control forces. It’s also worth noting that the DHC-8 has an even higher aspect ratio wing than the ATR does, so the ATR isn’t unique in that respect either.

Although I haven’t seen any specific data that would confirm this, it has been argued that one area where the ATR is unique is in the design of its deicing system. The ATR has inflatable deicing boots, just like every other turboprop, but it is unusual in that the individual inflation tubes are oriented chordwise instead of spanwise. I could not confirm whether that has any significance to this story. On the other hand, it is significant that the coverage of the ATR’s deicing boots was supposedly much smaller than average. A deicing boot is inherently inefficient in that it cannot have a perfectly smooth transition with the rest of the wing, which means that the highest performance wing is one without a deicing boot. As you hopefully recall, ATR had designed its aircraft with a high performance wing that was very sensitive to small disruptions, which presented a problem when it came to the deicing boots. Not having any obviously wasn’t an option, so ATR instead designed a deicing boot that protected only the forwardmost 5% of the wing chord, which apparently mitigated most of the negative aerodynamic effects. (On the stretched ATR 72, this was increased to 7%, which is still very low.) This was not anticipated to be a major issue because ice buildups produce more adverse effects the closer they are to the leading edge of the wing, and because that is precisely the area where the most ice tends to accumulate, since it points into the wind. Therefore, even a small deicing boot should be almost as effective as a much larger one, while also preserving the full aerodynamic capabilities of the wing — at least in theory. FAA experts Howard Greene and Gary Lium later wrote that they were surprised ATR was able to make the airfoil work with deicing boots at all.*

The black areas along the leading edges of the wings and horizontal stabilizers are the deicing boots. This diagram depicts the ATR 72 as it appeared in 1994, with deicing boots covering 7% of the wing chord. (NTSB)

In any case, ATR still had to prove that this design would pass muster in actual icing conditions, which brings us to the next major area of discussion required to understand this accident.

*From Unheeded Warning by Stephen Frederick, pg. 163, quoting Greene & Lium: “The wing section is a modified NASA 65xxx series, modified to spread leading edge negative pressure peak out chordwise, i.e. laminar flow. Any discontinuity will trip it back to ordinary 65xxx characteristics, with consequent large drop in CLmax. It is surprising that ATR could make this section so effective with a boot installation.”

◊◊◊

Part 4: Certified to Freeze

All transport aircraft have an obvious need to operate in icing conditions. Any time an aircraft flies in a cloud with measurable water content within the sticky temperature range, ice will accumulate, which is precisely why anti-icing and deicing systems are required. However, simply having these systems is not sufficient, because manufacturers must prove that they adequately protect the aircraft under the range of conditions it will be expected to encounter during routine operations. This necessarily raises the question of precisely what conditions those are.

For certification purposes, the envelope of “normal” icing conditions is defined by the Federal Aviation Administration in a regulation entitled 14 CFR Part 25, Appendix C; and Europe has an essentially identical statute. This regulation specifies a range of conditions as measured in terms of droplet size, temperature, and water content per unit volume of air under which a manufacturer must demonstrate that its aircraft can fly without experiencing any significant performance, control, or systems anomalies. However, if you’ve read my article on Comair flight 3272, you might remember that this regulation had a number of serious flaws, which we’re going to dive into again here.

At the time the ATR was developed, the range of conditions in the Appendix C envelope was based on a program of data collection undertaken by the NACA in the late 1940s and early 1950s, and had undergone essentially no revisions since. This program determined that the vast majority of icing conditions featured droplet sizes under 50 microns (50 millionths of a meter) with a range of water content levels, at altitudes between sea level and 22,000 feet, and temperatures between -30 and 0˚C. Graphs depicting the Appendix C envelope in terms of continuous and intermittent icing are shown below.

The extent of the Appendix C icing envelope at the time the ATR was originally certified. (NTSB)

The Appendix C envelope by no means contains all icing conditions known to exist. Larger drop sizes and water contents can be found, and were observed during the NACA program, although insufficient data about them was generated. By definition, aircraft are not certified to fly in these conditions, and nearly all accidents related to icing involving transport category airplanes occur while flying in conditions outside the Appendix C envelope, for obvious reasons.

In practice, pilots categorize icing conditions using a subjective scale that varies slightly from place to place but generally includes levels corresponding to the terms trace, light, moderate, and severe. Under the most commonly accepted definition, trace icing causes no issues even with the deicing equipment turned off; light icing causes no issues with the equipment turned on; moderate icing can be controlled with the equipment but still causes some negative effects; and severe icing cannot be controlled even with the deicing equipment. The subjective boundary between moderate and severe conditions does not correspond 1:1 with the edge of the Appendix C certification envelope, but as a general rule, transport aircraft can be safely flown for extended periods in trace and light icing, and for short periods in moderate icing. Severe icing is beyond the capabilities of the aircraft and must be exited immediately if encountered.

The NWS definitions of the four ice severity levels. Other agencies and other countries have definitions that might vary from this slightly. (National Weather Service)
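The four-level scale described above can be written down as a small decision procedure. This is only a sketch of the commonly accepted definition as paraphrased in this article; the function and parameter names are my own inventions, and real-world reporting relies on pilot judgment rather than clean yes/no answers.

```python
from enum import Enum


class IcingLevel(Enum):
    TRACE = "trace"
    LIGHT = "light"
    MODERATE = "moderate"
    SEVERE = "severe"


def classify_icing(issues_with_boots_off: bool,
                   issues_with_boots_on: bool,
                   boots_can_control: bool) -> IcingLevel:
    """Map the article's verbal definitions onto the four severity levels."""
    if not boots_can_control:
        return IcingLevel.SEVERE      # equipment cannot keep up at all
    if issues_with_boots_on:
        return IcingLevel.MODERATE    # controllable, but with negative effects
    if issues_with_boots_off:
        return IcingLevel.LIGHT       # fine as long as the equipment is on
    return IcingLevel.TRACE           # no issues even with equipment off


# General rule of thumb for transport aircraft, as stated in the text.
GUIDANCE = {
    IcingLevel.TRACE: "extended flight acceptable",
    IcingLevel.LIGHT: "extended flight acceptable",
    IcingLevel.MODERATE: "short periods only",
    IcingLevel.SEVERE: "exit immediately",
}

level = classify_icing(True, True, False)
print(level.value, "->", GUIDANCE[level])
```

The catch, as the rest of this story shows, is that the inputs to this procedure are exactly the things a crew cannot reliably observe from the cockpit.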

Severe icing generally means that ice builds up faster than the boots can remove it, but it does also include other phenomena, such as ice that builds up in an unusual location outside the areas protected by anti-icing and deicing equipment. The most likely reason for a buildup in unusual areas is an encounter with freezing drizzle or freezing rain, which will be crucial to this story.

Freezing drizzle or rain occurs when liquid water droplets in a cloud fall from an area above freezing into an area below freezing. Normally, temperature decreases with altitude, but sometimes regions of warmer air can be found above regions of colder air, which is called a temperature inversion, and this phenomenon is closely associated with freezing precipitation. As they fall into the freezing area, liquid droplets become supercooled — that is, they cool to below 0˚C without actually freezing, until they come into contact with a nucleation site that can initiate the crystallization process. Possible nucleation sites include aerosols such as dust or pollutants, or larger objects like the surface of an airplane.

There is no single definition of what constitutes freezing drizzle vs. freezing rain, but the most common definition holds that freezing drizzle droplets range from 50 to 500 microns in diameter, while freezing rain drops are over 500 microns in diameter. Collectively these are known as “supercooled large droplets,” or SLDs. SLDs were not included in the Appendix C envelope and at the time of the Roselawn accident no aircraft was certified to fly into them. A 1963 FAA handbook did state that, “In any aircraft design, the effect of freezing rain should be considered in addition to the current design procedures for normal (small droplet) icing conditions,” but this document was later superseded.

How freezing rain forms. (Skybrary)
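The size boundaries given above translate directly into a lookup. A minimal sketch, assuming the "most common definition" quoted in this article and assuming the droplet is already known to be supercooled; the function names are hypothetical.

```python
def classify_supercooled_droplet(diameter_um: float) -> str:
    """Classify a supercooled droplet by diameter in microns.

    Boundaries follow the article: under 50 is ordinary cloud droplet icing,
    50 to 500 is freezing drizzle, over 500 is freezing rain.
    """
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    if diameter_um < 50:
        return "cloud droplet"
    if diameter_um <= 500:
        return "freezing drizzle"
    return "freezing rain"


def is_sld(diameter_um: float) -> bool:
    """Supercooled large droplets: freezing drizzle and freezing rain together."""
    return classify_supercooled_droplet(diameter_um) != "cloud droplet"


for d in (22, 100, 2000):
    print(d, "microns:", classify_supercooled_droplet(d), "| SLD:", is_sld(d))
```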

The particular danger of SLDs, as opposed to smaller cloud drops that might also be supercooled, is that their relatively large mass allows them to slide across the surface of the aircraft before the nucleation process is finished. This can result in ice “runback” into areas where it’s not normally observed, and deicing boots might fail to remove SLD icing because the droplets can slide off the boot and freeze on the wing surface behind it. The Appendix C rules did not require manufacturers to test whether the aircraft behaves in a controllable manner with ice accumulating in this location.

The ATR 42 was tested with a variety of ice configurations within the boundaries of Appendix C and no anomalies were observed. This testing included flight with artificial ice shapes made of plastic attached to the wings; flights behind a tanker spewing droplets under 50 microns; wind tunnel tests; and flights into actual icing conditions, to the extent that they could be found. When the stretched ATR 72 was developed in the late 1980s, the French Directorate General of Civil Aviation (DGAC) also required ATR to identify any altered handling characteristics or performance changes associated with Appendix C icing, which included testing stalls and zero-G pushovers with ice shapes attached to the wings. No unexpected behavior was observed.

Although these tests were conducted with the deicing boots turned on, they did include an assumption that the pilots would allow one inch (2.5 cm) of ice to build up before activating the system, just to provide extra insurance against human error. The tests also required ATR to show that the aircraft could fly for at least 45 minutes in icing conditions with droplets measuring 22 microns in diameter. Both tests were passed and nothing unusual was noted, although ATR received a waiver to assume ¾ of an inch of buildup before boot activation instead of 1 inch.

◊◊◊

A Boeing 737 cockpit covered with clear ice as a result of freezing rain. (Alaska Airlines)

However, by the time these tests took place in the mid-1980s, the NTSB had already been warning for several years that the Appendix C requirements were inadequate to protect the traveling public. The definition of the Appendix C envelope assumed that conditions outside of it were both rare and easily identifiable — but neither of these was actually the case.

In a 1981 report on icing safety, the NTSB identified that freezing drizzle and rain were more common than the NACA observations would suggest. Furthermore, pilots could not be relied upon to avoid these conditions because there was no accepted method to determine when the airplane was in conditions outside the Appendix C envelope. Pilots have no way to tell whether the droplets impacting the plane are larger or smaller than 50 microns, and the rate of ice accumulation might appear to be moderate, even though the conditions are technically severe. Furthermore, freezing precipitation tends to produce clear ice rather than white ice, making it difficult to see whether ice is building up on the unprotected portions of the wings, especially on aircraft like the ATR, where the tops of the wings are not visible from the cockpit. To make matters even worse, weather forecasts didn’t specify the presence or absence of freezing precipitation, and in fact there was no reliable method to predict where and when it would occur. A temperature inversion is generally required, with precipitation present and two distinct freezing levels, but this is not a hard-and-fast rule, nor is it a sure sign that freezing rain will be encountered, because predictions based on these factors alone can be something of a crapshoot.

This situation alarmed the NTSB, prompting a recommendation that the FAA expand the Appendix C certification criteria to include freezing rain. An expert commissioned by the FAA confirmed in a 1983 report that the original NACA program findings contained no data to support the assumption that freezing rain was rare. This data hadn’t been updated in part because of the aforementioned lack of large turboprops in service in the US during the 1970s. Jets are not as vulnerable to structural icing because they fly higher and faster and spend less time in icing conditions, often only a few minutes on final approach. They also tend to be larger, reducing the effects of any given amount of ice, and they have heated wings that greatly reduce ice accumulation. Light aircraft like Cessnas and Pipers are extremely vulnerable to ice, but aren’t subject to the same regulations. Passenger turboprops, on the other hand, can spend almost the entire time from takeoff to touchdown at altitudes where icing is likely to occur, and they fly at lower speeds where ice builds up more readily. Furthermore, they lack the power to have heated wings and rely on deicing boots instead. For this reason, the NTSB became concerned that commuter turboprops entering the market in the early ’80s would not be adequately protected by the Appendix C envelope.

Unfortunately, the FAA responded to the NTSB’s warnings with indifference, verging on willful blindness. The agency’s position was that Appendix C already contained the “full icing envelope expected in nature,” so no further testing of aircraft was needed, even though the NTSB was trying to point out that that very assumption was wrong. The FAA also specifically dismissed the NTSB’s warnings about freezing rain, writing (incorrectly) that such conditions are not often encountered. The NTSB snappily replied that operations in freezing rain happen “often enough.”

The FAA ultimately did initiate a study in 1986 to determine whether Appendix C ought to be expanded, but the agency was seen as slow-walking the issue. Multiple studies had already been conducted and their findings were freely available. Nevertheless, after four years, the FAA closed the matter in 1990, telling the NTSB that its research had proven the Appendix C standards were “adequate.” Nine years had passed since the NTSB found that not to be the case.

◊◊◊

Following the crash of American Eagle flight 4184, the NTSB assembled a team of expert meteorologists from major institutions to determine the exact nature of the conditions encountered by the ill-fated ATR. Only then could the NTSB determine how much ice built up on the plane, where it accumulated, and whether the conditions were within the Appendix C envelope.

The forecast for that day was not exceptional, calling for “light to occasional moderate rime icing,” but a temperature inversion did exist in the area of the accident. Between 13:00 and 17:00, aircraft within 100 nautical miles of Roselawn filed 13 pilot reports (PIREPs) indicating icing conditions, but all of them characterized the icing as light to moderate, in accordance with the forecast. Flight 4184’s data recorder captured a temperature reading of -2˚C at 10,000 feet just before the crash, which was also consistent with the expected conditions.

However, when experts began to break down the more granular data, they discovered that the conditions might not have been as routine as they seemed. Using doppler weather radar data, atmospheric models, and other information, they estimated that the liquid water content at 10,000 feet inside the clouds near the accident site was between 0.59 and 0.74 grams per cubic meter, and that due to the temperature inversion, this water was likely supercooled. Furthermore, the doppler data showed that the likely drop sizes ranged from 100 to 2,000 microns, indicating mixed freezing drizzle and freezing rain well outside the Appendix C certification envelope.
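To make the comparison concrete, here is a rough sketch that checks a set of conditions against the three headline limits quoted earlier in this article (droplets under 50 microns, sea level to 22,000 feet, -30 to 0˚C). This is a deliberate simplification: the real Appendix C envelope is defined by curves like those in the graphs above and also depends on water content, which this check ignores. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Conditions:
    droplet_um: float    # droplet diameter in microns
    altitude_ft: float
    temp_c: float


def appendix_c_exceedances(c: Conditions) -> list[str]:
    """Return which of the simplified limits are exceeded (empty list = inside)."""
    reasons = []
    if c.droplet_um >= 50:
        reasons.append("droplet size")
    if not 0 <= c.altitude_ft <= 22_000:
        reasons.append("altitude")
    if not -30 <= c.temp_c <= 0:
        reasons.append("temperature")
    return reasons


# Low end of the estimated drop sizes near Roselawn, at the recorded
# altitude and temperature.
roselawn = Conditions(droplet_um=100, altitude_ft=10_000, temp_c=-2)
print(appendix_c_exceedances(roselawn))
```

Even using the smallest estimated drop size, the only limit exceeded is the one the pilots had no instrument to measure: altitude and temperature were entirely unremarkable.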

Although the danger was obvious in hindsight, it was not predicted at the time. In fact, when the known atmospheric data available from the period before the accident were plugged into the state-of-the-art forecast models developed by the National Oceanic and Atmospheric Administration and the National Center for Atmospheric Research, the models failed to predict any freezing precipitation near Roselawn. The NTSB noted that the FAA had been working on a program called the Advanced Weather Product Generator that would have been able to produce more accurate icing forecasts, but development was cancelled earlier in 1994 due to budget cuts.

In any case, it was apparent that flight 4184 flew into conditions that far exceeded what the ATR 72 was certified to withstand, and that the crew had received no documents in their dispatch package that would have indicated that icing was even present.

Based on the analysis of cloud cover and precipitation locations, the NTSB determined that flight 4184 was flying intermittently in and out of SLD conditions as it circled in the holding pattern over the LUCIT intersection. The aircraft would have periodically entered clear air between the cloud layers, at which point icing would have dropped off to trace levels, before picking back up again inside the clouds. The clouds themselves were fairly benign, with no gusty winds and only light to moderate turbulence.

The flight data recorder indicated that the deicing boots were turned on at 15:41 and were still active when the plane crashed at 15:58. Twice during that period, the pilots commented on an ice buildup, but did not discuss it further. In the NTSB’s view, this suggested that the conditions likely did not appear severe to the crew, underscoring the difficulty of detecting the danger posed by SLDs. Although the ATR 72 did have an automated ice detection system, which initially alerted the crew to the icing, the pilots still had to determine the severity of the icing subjectively by examining the ice collection probe mounted just outside the cockpit. But because SLDs “run back” before freezing, this type of ice can fail to accumulate on the small icing probe at the same rate as elsewhere, leading the pilots to believe the buildup is less serious than it actually is. If the deicing boots are correctly removing ice from the wing leading edges, then the crew also has virtually no way to tell whether ice is also building up further back on the unprotected portion of the wing.

Thus, the rate of ice buildup might have appeared to be within the realm of normal, even though meteorological studies concluded that with a liquid water content of 0.74 grams per cubic meter and droplet sizes over 100 microns, the icing was probably severe. A supplemental weather study conducted by France’s Bureau of Inquiry and Analysis (BEA) to support the NTSB investigation estimated that the airplane spent 24 minutes in icing conditions, during which time the total accumulation of ice was estimated to have been between 3.0 and 6.5 cm (1.2–2.6 in), mostly during a couple of distinct periods. A NASA icing training document available today provides ballpark ice accumulation rates associated with the subjective icing levels, under which the rate of ice accumulation during these periods would be considered “severe.”
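As a quick back-of-the-envelope check, the BEA's figures imply the following average accumulation rates over the 24 minutes in icing. This is simple division on the numbers quoted above, not a reconstruction of the BEA's or NASA's methodology.

```python
def average_rate_cm_per_hr(total_cm: float, minutes: float) -> float:
    """Average accumulation rate, converting minutes to hours."""
    if minutes <= 0:
        raise ValueError("duration must be positive")
    return total_cm / minutes * 60


MINUTES_IN_ICING = 24
low = average_rate_cm_per_hr(3.0, MINUTES_IN_ICING)    # 7.5 cm per hour
high = average_rate_cm_per_hr(6.5, MINUTES_IN_ICING)   # about 16.25 cm per hour

print(f"average rate: {low:.2f} to {high:.2f} cm/hr")
```

Because most of the ice built up during a couple of distinct periods rather than evenly, the rate during those periods would have been higher than these averages.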

This finding led the NTSB to conclude, for the second time in 15 years, that there was no reliable way for pilots to determine when they were flying in icing conditions that exceeded the certification envelope of the aircraft. A search for useful techniques revealed that some ATR pilots had been using the presence of ice on the cockpit side windows as an indication of freezing rain, but the appearance of this indicator was not guaranteed. In the NTSB’s view, the scale of the problem was too large to be solved with pro tips and rules of thumb. The only answer was to expand the Appendix C envelope to include freezing rain.

As for the connection between the freezing rain and the mysterious deflection of the ailerons — the findings in that area would prove even more disturbing.

◊◊◊

Part 5: Aileron Hinge Moment Reversal

Shortly after investigators identified the abnormal aileron behavior on flight 4184, having concluded that ice was the most likely cause, they sat down with ATR engineers to test a brute force investigation technique: could the manufacturer find a pattern of ice buildup that would cause a massive uncommanded aileron deflection in the ATR engineering simulator? In the end, after testing hundreds of ice shapes and locations, they found that the answer was yes.

To understand what they found, we first need to talk about some of the inherent properties of unpowered ailerons, and an undesirable side effect of their design, known as hinge moment reversal.

A properly balanced control surface should remain in the neutral position in the absence of inputs from the pilot or autopilot. When the forces acting on the control surface balance to keep it in this position, then the hinge moment is neutral. An aerodynamically perfect surface will balance itself as long as the airflow over both sides is equal and constant, but this is real life, so in practice additional stabilizing measures are needed. On hydraulic controls, the sheer power of the hydraulic actuator forces the control surface to remain in the neutral position when it’s not moving, which makes these types of surfaces extremely stable. However, because the ATR’s ailerons are unpowered, their stability comes from two main mechanical and aerodynamic sources instead. First, the control horns contain balance weights that help dampen any undesired movement of the ailerons. And second, ailerons are inherently quite well balanced due to their fundamental design. In order to turn the plane, the aileron on the “down” wing deflects up and the aileron on the “up” wing deflects down, and both ailerons are connected to each other by control cables that run through the pilots’ control wheels. Therefore, one aileron cannot deflect down without the other simultaneously deflecting up, and vice versa. This tends to cause forces on the ailerons to cancel each other out, returning both ailerons to neutral when no inputs are being made. It also means that a sudden force that simultaneously affects both ailerons in the same direction will deflect neither of them.

However, if a sufficiently powerful force affects one aileron and not the other, then the ailerons can experience hinge moment reversal. That means that instead of the forces balancing each other to return the ailerons to neutral, they become divergent, driving both ailerons to their respective stops. The ailerons will remain at the stops unless a force is applied to return them to neutral.

How aileron hinge moment reversal happens. Can you see why the phenomenon is much more dangerous on aircraft with horn-controlled ailerons? Not to scale. (Own work)

Airplanes with unpowered ailerons can experience hinge moment reversal when airflow is disrupted over one side of an aileron but not the other. This typically occurs during a stall. As I discussed earlier, a stall occurs when the angle of attack gets too high, leading to the separation of airflow over the top of the wing. This separation creates a turbulent low pressure area between the upper wing surface and the predominant airflow above. At the same time, airflow over the bottom of the wing remains attached to the lower surface, creating a pressure differential that draws the aileron upward. Just as the control horns make it easier for the pilot to move the aileron, they also assist the unbalanced airflow, amplifying its effect. The result is a hinge moment reversal.

During a symmetrical stall, airflow separates over both wings, which means that both ailerons are drawn upward. As long as the control cables are intact, it’s physically impossible to move both ailerons upward at the same time, so the forces cancel out and the ailerons stay in the neutral position, ensuring that the airplane remains controllable in the stall. However, if one wing stalls before the other, the forces might not be balanced and an uncommanded aileron deflection can occur in extreme cases. This is a known phenomenon associated with unpowered, horn assisted ailerons and anyone manufacturing such a system has to take it into account.
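The cancellation argument can be illustrated with a toy model. Treat each aileron as feeling an aerodynamic moment that tries to pull it upward, and remember that the cables force the two surfaces to move in opposite directions, so only the difference between the two moments reaches the linkage. Everything here (names, units, the "holding capacity" of whoever or whatever is holding the wheel) is an assumption made for illustration, not data from the investigation.

```python
def net_linkage_moment(left_up_moment: float, right_up_moment: float) -> float:
    """Net moment driving 'left aileron up, right aileron down'.

    The ailerons are cross-linked, so equal upward pulls cancel exactly.
    """
    return left_up_moment - right_up_moment


def uncommanded_roll(left_up_moment: float,
                     right_up_moment: float,
                     holding_capacity: float) -> str:
    """Direction of uncommanded roll, if the imbalance beats the restraint."""
    net = net_linkage_moment(left_up_moment, right_up_moment)
    if abs(net) <= holding_capacity:
        return "none"
    # An upward-deflected aileron reduces lift on its own wing.
    return "left wing down" if net > 0 else "right wing down"


# Arbitrary units: symmetric separation versus separation over one wing only.
print(uncommanded_roll(80.0, 80.0, holding_capacity=30.0))  # forces cancel
print(uncommanded_roll(10.0, 80.0, holding_capacity=30.0))  # right side wins
```

In the second case the suction acts mainly on the right aileron, pulling it up and driving the left one down, which is a right-wing-down roll.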

According to the NTSB, some early aileron configurations tested by ATR during the development of the ATR 42 experienced hinge moment reversal at relatively low angles of attack, before the stall. Some airflow separation occurs before the stall AOA is reached, and if the region of separated airflow encompasses the ailerons, then the result is a premature hinge moment reversal. However, ATR ultimately settled on a design that would not experience hinge moment reversal until the AOA reached 25 degrees. This is not only above the stall AOA, but above the stick pusher AOA, meaning that even if the airplane stalls, the stick pusher should intervene to reduce the angle of attack before the hinge moment reversal threshold is reached. At an AOA above 25 degrees, the airplane may enter a deep stall, precluding any possibility of recovery anyway.

The issue with this design was that it assumed a “clean wing” with predictable airflow patterns. But what would happen if the wing was covered in ice? Would the same assumptions still hold? It turned out that the answer was no, but this was only proven much later. When the ATR 42 was being developed, and indeed at the time of the crash, there was no requirement to test for ice-induced handling problems outside the Appendix C envelope. None of the Appendix C icing certification tests produced an aileron hinge moment reversal, so by all accounts the ATR 42 met the letter of the law. These tests were later repeated on the ATR 72 with the same result.

ATR did take into account the fact that ice on the wings can cause airflow separation at lower AOAs. As a result, when the anti-icing systems are turned on the stick shaker stall warning and the stick pusher will both activate at lower AOAs. Since the anti-icing systems (engine, sensors, propellers) must be activated during all icing conditions, regardless of whether ice is actually present, this should ensure that the lower stall protection thresholds are always used when there is a risk of icing. Furthermore, since aileron hinge moment reversals were associated with stalls, this feature would theoretically protect against the phenomenon even with ice on the wings. Although severe icing can cause a stall to occur at an AOA below even the reduced icing stick shaker threshold, this scenario was by definition outside the certification envelope and the ATR was not tested for handling anomalies under those conditions.

According to the NTSB, the issue of aileron hinge moment reversal cropped up again during the development of the stretched ATR 72. The ATR 72’s wingspan is larger than the ATR 42’s, and the design of its ailerons is necessarily slightly different. Apparently, ATR found that the -72 experienced the same low-AOA hinge moment reversals seen on the early ATR 42 aileron concepts. Test pilots even encountered aileron hinge moment reversals during stall demonstrations in some early developmental flight tests, although they found that the aileron deflection could be countered without exceptional effort.

Vortex generators on an aircraft wing. (AOPA)

This time, ATR solved the problem by installing simple devices called “vortex generators” upstream of the ailerons. Although the name makes it sound like they increase turbulence over the wing, they actually delay airflow separation downstream of their location, which ATR found was sufficient to delay the aileron hinge moment reversal to 27 degrees AOA on the ATR 72.

The NTSB pointed out that hydraulic controls would also have solved this issue, but that ATR never officially considered using hydraulically powered ailerons on the ATR 72. Some ATR engineers privately told the NTSB that they had discussed it informally during the early stages of the ATR 42 design process, but the possibility was not seriously explored. After all, the company believed it had found a cheaper, equally foolproof solution — but they were wrong.

◊◊◊

Using the company’s engineering simulator, ATR experts discovered that a sharp ridge of ice at least 1.9 cm (3/4 in.) in height, located ahead of the aileron but behind the deicing boot, about 8–10% of the way back from the leading edge, on one wing but not the other, would cause an aileron hinge moment reversal at a very low AOA. The simulations showed that this particular ice shape would cause the airflow over the aileron to separate exceptionally early, far earlier than the designers ever believed was possible. In fact, the aileron hinge moment reversal on flight 4184 occurred at just 5 degrees AOA, well within the normal operating range. That meant that the reversal could occur independent of any risk of a stall, and in any phase of flight — and to make matters even worse, the ice ridge did not produce any significant drag that would otherwise alert the crew to a developing performance problem.

How ice can induce a hinge moment reversal. Not to scale. (Own work)

Armed with this knowledge, the NTSB, ATR, and other stakeholders embarked on an ambitious program to determine whether such an ice ridge could actually form under the conditions in which flight 4184 was operating. The centerpiece of this effort was a series of real-world test flights behind a KC-135 tanker spewing freezing precipitation high above Edwards Air Force Base in the California desert. Investigators were well aware that tanker tests have certain limitations — for instance, the plume of freezing droplets doesn’t cover the entire airplane, and it can be hard to accurately control both the droplet size and water content. However, the tests were a useful starting point to determine whether the proposed accident scenario was realistic.

Tanker tests using an ATR 72 flown by experienced ATR test pilots revealed that when the aircraft was exposed to supercooled large droplets over 100 microns in diameter with the deicing boots active, the droplets tended to run back along the surface of the wing to 14–16% of chord, well beyond the deicing boot, which only protected the leading 7% of chord. After the aircraft had spent 17 minutes in these conditions, a ridge of ice began to form aft of the deicing boot between 8 and 10% of chord.*

*Note: Different sources do not agree on the percentages of chord. The NTSB report states that ice ran back to 14% of chord and that the ice ridge formed at 8–9%. The special report on the Edwards tanker tests, appended to the NTSB report, states that ice ran back 16% and the ridge formed at 10%.
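To see why the disagreement between sources does not change the conclusion, the quoted percentages can be compared against the boot coverage directly. The following is a minimal sketch that uses only the figures given above; the function and variable names are hypothetical and chosen for illustration.

```python
BOOT_COVERAGE_PCT = 7.0  # deicing boot protects the leading 7% of chord (from the text)


def is_unprotected(position_pct_chord: float, boot_pct: float = BOOT_COVERAGE_PCT) -> bool:
    """Return True if a chordwise position lies aft of the deicing boot."""
    return position_pct_chord > boot_pct


# Figures quoted above; the two sources disagree, so both sets are checked.
reported = {
    "ridge, forward value (NTSB report)": 8.0,
    "ridge (tanker test report)": 10.0,
    "runback limit (NTSB report)": 14.0,
    "runback limit (tanker test report)": 16.0,
}

for label, pct in reported.items():
    print(f"{label}: {pct}% of chord, "
          f"{pct - BOOT_COVERAGE_PCT:.0f} points aft of the boot, "
          f"unprotected={is_unprotected(pct)}")
```

Whichever source is used, every reported position falls behind the 7% boot limit, which is the point that matters for the accident scenario.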

With the ice ridges in place on one wing but not the other, the test pilots decelerated towards a stall, slowly increasing the angle of attack while white-knuckling the controls, ready for any anomaly. And indeed, it happened: with the flaps set to 15 degrees, an aileron hinge moment reversal occurred at 12 degrees AOA, shortly before stick pusher activation at 15.3 degrees. But that wasn’t what happened at Roselawn, when the reversal happened at 5 degrees with no prior loss of airspeed. Furthermore, the forces involved were not exceptionally high. The test pilots had to exert only 30–40 lbs of force (14–18 kg) to return the ailerons to the neutral position, although in one test the force briefly peaked at 77 lbs (35 kg). These values are well within the capabilities of any pilot.

However, when the test pilots allowed ice to accumulate with flaps 15 but then retracted the flaps before starting the AOA increase, the results were different. In that configuration, aileron hinge moment reversal occurred at AOAs as low as 7 degrees, far below even the stick shaker activation threshold, let alone the stick pusher. This was an enlightening discovery, considering that this was the exact sequence of configuration changes made by the pilots of flight 4184 before they lost control. The control forces weren’t any higher in this configuration, but the principle had been demonstrated.

A Saab 2000 undergoes an icing test while flying in formation with a KC-135 tanker. (Unknown author)

Because of the limitations of tanker testing with regard to the exact ice shapes that are produced, tests were also carried out in wind tunnels and on the ground using artificial ice shapes designed to represent a worst case scenario. ATR conducted high-speed taxi tests, taxiing at up to 100 knots with artificial ice shapes, and was able to observe aileron hinge moment reversal with transient control forces up to 60 lbs (27 kg). Wind tunnel tests produced more adverse results, with a continuous application of 56 lbs (25 kg) theoretically required just to keep the wings level.

The NTSB noted that while none of these tests produced such a hinge moment reversal at the 5-degree AOA observed on flight 4184, they did demonstrate that such an event could probably occur in real operations, after accounting for unknown variations in the sharpness of the ice ridge due to random shedding, and a more asymmetrical buildup due to the fact that flight 4184 was making only right turns for half an hour before the crash.

Subsequent analysis of the tanker test data revealed why the particular configuration changes that took place on flight 4184 produced such a dramatic result. While holding at 175 knots with flaps 15, the angle of attack actually reduced to a value slightly below zero degrees, exposing more of the upper wing surface and increasing the amount of runback behind the deicing boots. In this configuration, the ice ridge built up in the most adverse possible location. When the pilots retracted the flaps, the wings provided less lift, so the angle of attack had to increase to compensate and maintain the flight profile. At that point, the ice ridge caused the airflow to separate over the right aileron, and a hinge moment reversal occurred.
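The link between flap retraction and rising angle of attack can be illustrated with a simple linear lift curve. This is a rough sketch under stated assumptions, not a model of the ATR 72: it assumes that the lift coefficient varies linearly with AOA, that extending flaps only shifts the zero-lift AOA to a more negative value, and that speed and weight stay constant so the required lift coefficient does not change. Every number below is hypothetical and chosen only to show the direction of the effect.

```python
def required_aoa_deg(cl_required: float,
                     lift_slope_per_deg: float,
                     zero_lift_aoa_deg: float) -> float:
    """AOA needed for a given lift coefficient on a linear lift curve,
    where C_L = lift_slope * (AOA - zero_lift_AOA)."""
    return cl_required / lift_slope_per_deg + zero_lift_aoa_deg


# All values are hypothetical placeholders, not ATR 72 data.
CL_REQUIRED = 0.5               # lift coefficient needed to hold level flight
LIFT_SLOPE = 0.1                # lift coefficient gained per degree of AOA
ZERO_LIFT_AOA_FLAPS_15 = -6.0   # flaps extended: curve shifted to lower AOA
ZERO_LIFT_AOA_CLEAN = -1.0      # flaps retracted

aoa_flaps_15 = required_aoa_deg(CL_REQUIRED, LIFT_SLOPE, ZERO_LIFT_AOA_FLAPS_15)
aoa_clean = required_aoa_deg(CL_REQUIRED, LIFT_SLOPE, ZERO_LIFT_AOA_CLEAN)

print(f"flaps 15: {aoa_flaps_15:+.1f} deg")
print(f"flaps 0:  {aoa_clean:+.1f} deg")
```

With these placeholder values, the required AOA is slightly negative with flaps extended and rises by several degrees once they are retracted, which matches the qualitative sequence described above: ice accretes at a low AOA, and the wing is then asked to fly at a higher one.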

◊◊◊

Part 6: A Pattern of Behavior

From the earliest days after its entry into service, pilots understood that they needed to be careful with the ATR in icing conditions. The exact reason for this is difficult to pinpoint. In his book Unheeded Warning, American Eagle ATR pilot Stephen Frederick wrote that pilots were warned in ground school that the ATR handled ice poorly, but throughout the book he was unable to articulate a clear reason why. Some possibilities will be discussed later in this article. What is known is that the early service history of the ATR 42 was not without incident.

Frederick mentioned in his book that shortly after the ATR 42 entered service, an aircraft experienced pitch control difficulties with ice on the tail and the flaps fully extended to 45 degrees, probably due to aerodynamic interference with the horizontal stabilizer. ATR’s solution was to ban the use of flaps 45 in icing conditions, while the FAA preferred a mechanical solution rather than assuming pilots would correctly identify icing conditions every time. Ultimately, the use of flaps 45 was banned in all normal operations, and pilots must now lift a mechanical gate to access the setting, which is only approved for use in an emergency. Although this anecdote demonstrates some of the ATR’s less than desirable icing characteristics, the flaps 45 issue never led to any further in-service incidents.

The first notable ATR 42 incident occurred in December 1986, when an American Eagle ATR 42 on approach to Detroit entered icing conditions, and the pilots did not immediately activate the deicing boots. According to the NTSB, the pilots were afraid that they would experience deice boot bridging — a phenomenon in which ice rapidly accumulates around the shape of the inflated deicing boot, creating an ice “bridge” over the leading edge of the wing that cannot be dislodged by later inflation cycles. (If you recall my article on Comair flight 3272, “deice boot bridging” in turboprop aircraft is actually a myth, but this was not widely known at the time.) At 1,900 feet on approach with an airspeed of 146 knots, the autopilot disconnected and the airplane experienced sharp bank oscillations up to 44 degrees in either direction, leading to a loss of 500 feet of altitude before control was regained. The aircraft landed safely.

This incident was brought to my attention by Stephen Frederick, who wrote in his book that only ½ inch (1.25 cm) of ice had built up at the time of the upset, which is less than the buildup assumed prior to deicing boot activation in certification tests. However, I tracked down the NTSB summary of this incident and the actual ice buildup reported by the crew was an inch and a half (3.75 cm), not ½ inch. Given the reportedly rapid buildup of ice, the failure to use the deicing boots, and the comparatively low speed at the time of the upset, my view is that this incident was likely an ice-induced stall at a low AOA prior to stick shaker activation, and not an aileron hinge moment reversal. Neither Stephen Frederick’s book nor the one-page NTSB summary provides any data that would indicate whether any abnormal aileron behavior took place during the stall.

For his part, Frederick suggests that the roll oscillations may have been induced by the ATR’s roll assist spoilers. The hydraulically actuated spoilers deploy on the “down” wing to help the aircraft turn when the ailerons are sufficiently deflected. However, when pilots make large, sharp inputs in an attempt to correct a sudden bank, those inputs can become out of phase with the slightly slower-moving roll spoilers, leading to a prolonged series of oscillations as the pilots effectively fight against themselves.

It must be kept in mind that an uncommanded roll is not by itself evidence of abnormal aileron behavior. Any aircraft will roll during a stall if the airflow separates on one wing earlier than the other, simply because one wing has stopped flying while the other has not.

According to Frederick, just one hour and twenty minutes after the first incident, another American Eagle ATR 42 on approach to Detroit in the same conditions also experienced an ice buildup followed by a roll excursion, albeit less severe. In response to the twin incidents, the FAA briefly banned all ATR 42 aircraft in the US (only six airframes at the time) from flying into known icing conditions until it could be established that the incidents did not represent a flaw with the aircraft itself. The New York Times later reported that the French DGAC was unhappy about the grounding because it had not been consulted.

After a brief investigation, the prohibition was lifted, but ATR did increase the minimum speeds in icing conditions listed in the aircraft flight manual. The option for a 130-knot climb was also deemed too slow and was eliminated under any conditions. The company also lowered the stall warning threshold with anti-icing active. Unfortunately, this would not be enough to prevent a tragedy just 10 months later.

◊◊◊

I-ATRH, the ATR 42 involved in the crash of ATI flight 460. (Werner Fischdick)

On the 15th of October 1987, Aero Trasporti Italiani (ATI) flight 460 departed Linate Airport in Milan, Italy, bound for Cologne, Germany. The ATR 42, operated by the domestic branch of Italian flag carrier Alitalia, entered a sudden descent while climbing through 16,000 feet, disappeared from radar, and crashed into a forested mountainside near Lake Como in the Italian Alps. All 34 passengers and 3 crewmembers were killed.

Separate investigations into the accident were conducted by Italian air safety investigators, the Italian judiciary, and the French BEA, together with ATR. While researching for this article, I was unable to find any of their final reports on the crash. Some information about the crash is available in Stephen Frederick’s book and in the BEA’s parallel report on American Eagle flight 4184, although these two sources present wildly different accounts of what happened. As far as I have been able to piece together, the pilots of flight 460 apparently were not informed that ATR had eliminated the 130-knot climb speed option, and on the day of the accident they elected to use this speed. Italian weather forecasters had predicted moderate icing in clouds in the Milan area. As flight 460 climbed, it encountered a buildup of ice, and the pilots cycled the deicing boots at least once. However, they never discussed seeing ice on the plane, and the boots were not active during most of the climb. Subsequently, ice accumulated, drag increased, airspeed fell, and finally at 16,000 feet the aircraft stalled. The autopilot disconnected and the plane rolled sharply, first in one direction, then the other, increasing in amplitude up to 120 degrees of bank. The flight data recorder indicated that the pilots overrode the stick pusher three times, making the stall worse. The nose then fell through and the plane plunged toward the ground. Disoriented in dark clouds, fighting against terrifying aerodynamic forces, the pilots acted discordantly, making wild and sometimes conflicting inputs. In the end, they did not recover.

A rescue worker stands amid the wreckage of ATI flight 460. (Bureau of Aircraft Accidents Archives)

Stephen Frederick had access to the Italian investigation report, but his book doesn’t include specific information about how flight 460’s controls moved during the upset. His account broadly mentions unusual control movements but doesn’t say what, if anything, the ailerons did as the aircraft stalled and rolled out of control. Similarly, the New York Times reported that the judicial investigation found that ice caused uncommanded control movements, but the newspaper didn’t specify what these were. However, it is possible that an aileron hinge moment reversal may have occurred during the stall, contributing to the pilots’ difficulty recovering the airplane.

During the inquiry, Italian investigators commissioned icing tests from a British government laboratory. The tests concluded that under certain icing conditions, droplets could run back and freeze behind the deicing boots. The investigators wrote that the ATR’s deicing system was of “limited effectiveness,” but ATR and the BEA protested this characterization. As the Italian investigators freely admitted, the conditions in the area of the accident were well outside the certification envelope, and in fact four other nearby flights operated by multiple aircraft types also experienced control difficulties that day. Therefore, since the deicing systems were never intended or certified to operate in those conditions, ATR felt it was incorrect to describe them as ineffective.

According to the New York Times, ATR performed 26 flight tests with ice shapes behind the deicing boots in an attempt to induce a control anomaly, and was unable to do so. But in hindsight, the company’s mistake was simple: its ice shapes were only ½ inch high, when ¾ inch was required to produce an aileron hinge moment reversal. The newspaper quoted ATR’s vice president of US flight operations, Robert Briot, looking back at the matter from 1995: “We probably were very close,” he said. “If we had increased it a little bit, we would have found it.” Regardless of whether an aileron hinge moment reversal contributed to the crash of ATI flight 460, the later disaster at Roselawn might have been nipped in the bud right then. Unfortunately, it wasn’t.

One of the recommendations from the safety investigation was that ATR expand the size of the deicing boots. If the boots had been expanded to at least 10% of chord, then American Eagle flight 4184 probably wouldn’t have crashed. Unfortunately, ATR declined to do so. The company’s position was that modifications were unnecessary because the flight tests had demonstrated no adverse handling characteristics with ice behind the deicing boots.

In the end, the Italian safety investigation concluded that the pilots were primarily at fault for the ATI accident due to their failure to identify the icing conditions, activate the deicing boots, and maintain a safe airspeed. All of this was true, but it wasn’t the whole story. The judicial investigation focused on the other half: the unspecified control anomalies experienced during the ice-induced stall. Considering this aircraft behavior to be unreasonable, Italian prosecutors indicted seven people, including chief ATR designer Jean Rech, on charges of manslaughter for producing a defective airplane. All seven were acquitted.

◊◊◊

The NTSB’s summary report on the Mosinee incident. (NTSB)

Although Stephen Frederick identified some other minor icing-related control events in 1987, the next truly significant incident came on a dismal, gray day in central Wisconsin in December 1988. According to the NTSB summary, a Simmons Airlines (d.b.a. American Eagle) ATR 42 was on approach to Central Wisconsin Airport in Mosinee with 37 people on board amid heavy rain. Between 3,000 and 5,000 feet, a temperature inversion created below-freezing temperatures, causing the raindrops to become supercooled. The aircraft spent nearly 12 minutes within the freezing rain band, but the pilots did not perceive any ice accumulation on the aircraft and did not activate the deicing boots. As the pilots decelerated in preparation to extend the landing gear, the aircraft suddenly stalled without warning at an airspeed of 157 knots. The autopilot disconnected and the aircraft rolled 80 degrees to the left, then all the way back to 60 degrees right. According to the New York Times, the pilots recalled the recent ATI accident, correctly identified the upset as an icing-induced stall, and executed a successful stall recovery maneuver by pitching the nose down, leveling the wings, and increasing engine power. Control was regained after losing 600 feet of altitude, and the flight landed successfully a few minutes later.

Stephen Frederick, who later flew with the pilot involved in the incident, added some details in his book, although not all of them were accurate — for instance, he wrote that all the ice protection systems were active, even though the NTSB found that this was not the case. Nevertheless, according to his second-hand retelling, the aircraft was in a right turn to intercept the instrument landing system when the aircraft stalled and the ailerons deflected to near maximum left-wing-down. The aircraft initially did not respond to the pilots’ frantic inputs, but after their airspeed surpassed 190 knots, control became much easier and a recovery was effected. Only after the upset did they learn that the airport had received a report of freezing rain. At no point before, during, or after the stall did the stick shaker go off. The NTSB later determined that the aircraft stalled at an AOA of 11.5 degrees.

There was little doubt that an aileron hinge moment reversal occurred during the Mosinee incident, but here again it was associated with the onset of a stall. This might have again masked the threat that it posed under slightly different conditions. A detailed discussion of the debate over how much ATR should have learned from the incident can be found in Part 7.

After the Mosinee incident, ATR sent a letter to all ATR 42 operators that reiterated the existing icing procedures and reminded crews that the deicing boots are not certified for use in freezing rain. The FAA was not entirely satisfied with this response, so the agency worked with the company to draft a new procedure that they hoped would prevent a similar incident by banning the use of the autopilot in icing conditions. In principle, this rule is reasonably sound, because the autopilot can often mask deteriorating aircraft performance due to ice until the effect becomes so large that it can no longer compensate, at which point the autopilot clicks off and the pilots are suddenly handed a barely controllable airplane. This would later be identified as one of the major causes of the 1997 crash of Comair flight 3272. Pilots can avoid this undetected deterioration by hand-flying the aircraft, allowing them to feel changes in performance. However, some pilots, including Stephen Frederick, objected to the rule based on reasoning that’s worth considering. The issue was not that hand-flying in icing conditions was not the safer option — it is indeed safer — but that the procedure represented a band-aid fix that transferred all responsibility for managing the ATR’s scary icing behavior onto the pilots.

After mandating the non-use of the autopilot in icing conditions through an Airworthiness Directive (AD), the FAA took further steps. In February 1989, the agency approved the use of a simulator package developed by ATR and Flight Safety International that would help pilots learn about the behavior of the ATR in icing conditions. After the Roselawn crash, the NTSB pointed out that, in hindsight, the simulator package had modeled asymmetric stalls but not the abnormal aileron behavior.

During the summer of 1989, ATR separately developed the Anti-icing Advisory System (AAS), which is the automated icing alert system described in Part 1 of this article. To reiterate, the system consists of sensors on the wings that detect ice and trigger a warning chime and light in the cockpit, informing the pilots that the deicing boots need to be turned on even if ice is not visible. This improvement was directly inspired by the fact that the pilots in the Mosinee incident did not perceive any ice on their aircraft before the loss of control, even though ice was assuredly present. The system was mandated by the French DGAC with an installation deadline of October 1st, 1989, and the FAA followed suit with a corresponding AD. The ATR 72 was also fitted with an ice evidence probe outside the cockpit, described in Part 1, because the extended fuselage made it harder for the pilots to see ice on the wings and propellers relative to the ATR 42.

These measures were widely perceived by the pilot community as inadequate. In response to the notice of FAA rulemaking regarding the AAS, the Air Line Pilots Association (ALPA) issued a scathing letter accusing the FAA of not going far enough to correct a defective airplane. “Based on our knowledge of these occurrences, we are convinced that the airplane was not properly certified for flight into icing conditions,” the union wrote. “The aircraft, in our view, is equipped with an anti-icing/de-icing system which is unorthodox, ill-conceived, and inadequately designed.” In a second letter, ALPA further stated, “We are also concerned with the premise that the aircraft was not certified for flight into freezing rain. … Since freezing rain cannot be predicted with any reasonable certainty, should pilots refrain from flight into any icing conditions? How can pilots determine if their aircraft will be subjected to freezing rain? And if their aircraft are subjected to unexpected freezing rain, will the modifications proposed in the AD be effective in ensuring the continued safe flight of this aircraft? All other aircraft types were not certificated for flight into freezing rain as well, yet these same aircraft have not experienced the serious loss of control incidents [that] the ATR 42 has. Perhaps anti-ice/deice systems of other aircraft types have been more thoroughly designed to compensate for operations in all icing conditions thus recognizing the inability [to predict] freezing rain.”

The points raised by ALPA forcefully underscored the inherent fallacy of the freezing rain certification issue. Did it really matter that an aircraft wasn’t certified to fly into freezing rain if there was no reliable way to identify those conditions? As long as that remained true, blaming conditions outside the certification envelope came across as a copout. Despite this obviously glaring problem, the FAA obtusely responded that freezing rain is a “rare, low altitude phenomena, that is generally easy to forecast and therefore avoid,” ignoring a decade of NTSB research that indicated otherwise.

Meanwhile in France, the stretched ATR 72 was approaching final certification, and ATR had already installed vortex generators to raise the aileron hinge moment reversal AOA. The company identified this as a potentially inexpensive way to improve the stall characteristics of the ATR 42 in a tangible way that didn’t rely on correct pilot performance. Consequently, the company began manufacturing ATR 42s with similar vortex generators installed, but lacked the authority to mandate retrofitting of existing airframes. The FAA issued a Notice of Proposed Rulemaking for required installation of vortex generators on ATR 42 aircraft in October 1989, but the final AD would not come until 1992. Furthermore, the text of the AD appeared to falsely suggest that the vortex generators would eliminate aileron hinge moment reversal, rather than delaying it to a higher AOA. Nevertheless, once the vortex generators were installed, airlines were free to allow their pilots to once again fly in icing conditions with the autopilot engaged.

ATR also drafted a proposed revision to the Flight Crew Operations Manual (FCOM) that would have provided a procedure for handling a freezing rain encounter, which included the following provisions: 1) use of the autopilot is prohibited; 2) the minimum speed with flaps retracted is 180 knots; 3) the minimum speed with flaps extended is as close to the flap limit speed as possible; 4) excessive maneuvering should be avoided; and 5) leave the conditions as soon as possible. Authorities in Canada and Germany approved the modifications and the information was added to ATR FCOMs in those countries, but the FAA and DGAC rejected the changes in the US and France on the grounds that they could not support a procedure for flight in conditions the aircraft was not certified for, and because they considered the addition of vortex generators to have solved the problem.

In the meantime, ATRs continued to lose control in icing conditions.

◊◊◊

An Air Mauritius ATR 42. (Alan Lebeda)

On April 17th, 1991, an Air Mauritius ATR 42 was flying over the Indian Ocean off the coast of Mauritius when it encountered icing conditions. The autopilot was engaged and the deicing boots were not turned on. As ice accumulated on the aircraft, its airspeed steadily dropped, which was noticed but not corrected by the crew. Furthermore, the ice buildup was asymmetric, causing an increasing tendency to pull to one side, which was compensated by the autopilot. However, as the aircraft decelerated, more force was required to counteract this tendency, and the aircraft experienced two uncommanded rolls. The pilots decided to disconnect the autopilot, but when they did so, the AOA increased to 11 degrees and the aircraft stalled before reaching the stick shaker threshold. The aircraft rolled abruptly 40 degrees to the right, but the flight crew successfully leveled the wings and recovered from the stall. None of the available information from the NTSB or the BEA suggests that there was any abnormal aileron behavior during the incident.

Subsequently, on August 11th, a Ryanair ATR 42 encountered severe icing conditions while cruising at 18,000 feet over Ireland. The flight crew were late to activate the deicing boots, and ice built up rapidly on the airframe, causing a loss of airspeed and an increase in AOA. At 11.5 degrees, the stall warning activated and the autopilot disconnected. The pilots did not apply the stall recovery procedure, and in fact they pitched up for 12 seconds, making the situation worse. The aircraft stalled and lost 4,000 feet of altitude before the pilots managed to recover. There was no evidence of abnormal aileron movement in this incident either.

The cover and first page of ATR’s 1992 All Weather Operations brochure. (NTSB)

In December 1992, more than a year after the incidents, ATR published a lengthy brochure entitled “All-Weather Operations” and delivered it to all operators of the ATR 42 and 72, including Simmons Airlines. The brochure described types of icing and their various effects, and it covered the general dangers of freezing rain accurately, although it misleadingly characterized this icing as “rare.” The brochure went on to describe the handling difficulties that could result from icing of the horizontal stabilizer, with a single-sentence note that “Similar anomalies can affect other unpowered controls (such as ailerons) when ice accretion exists.” Moving into the topic of the ATR’s performance in ice, the brochure advertised that “all effects on handling are clearly described” in the manuals, although no manuals mentioned hinge moment reversal beyond vague references to “anomalies.”

In a chapter dedicated to atmospheric icing, the brochure reminded pilots to adhere to the published minimum airspeeds in icing conditions, regardless of whether ice could be seen. It also stated that “aileron forces are somewhat increased when ice accretion develops, but remain otherwise in the conventional sense,” which appears to exclude the possibility of hinge moment reversal (i.e. the ailerons ceasing to operate in the “conventional sense”).

A chapter entirely dedicated to freezing rain followed, which stated that “Ice accretion due to freezing rain may result in asymmetrical wing lift and associated increased aileron forces necessary to maintain coordinated flight before aerodynamic stall.” This did not refer to any aileron hinge moment modification, but rather the fact that as an asymmetric stall approaches, a force will have to be applied using the ailerons to keep the wings level, since one wing is generating less lift than the other. There was no discussion of hinge moment reversal in the freezing rain chapter. Furthermore, the chapter stated that “Freezing rain conditions are usually predictable, recognizable and avoidable,” echoing the FAA’s language, and contradicting years of warnings from the NTSB. The brochure suggested that pilots could predict freezing rain by monitoring AIRMET and SIGMET weather alerts, which did not contain forecasts of freezing rain, and by monitoring for a temperature inversion.

Quoting an FAA advisory circular, the brochure stated that “Flight in freezing rain should be avoided where practical.” Tips were also provided for detecting freezing rain after entering it — namely, by the presence of “heavy rain” under conditions that the pilots have already determined are conducive to icing. These are indeed surefire signs of freezing rain, but by the time they become apparent, the aircraft is already flying outside the certification envelope and will likely experience adverse effects.

Addressing the procedures to be followed in a freezing rain encounter, the brochure stated, “In case of roll axis anomaly, disconnect [autopilot] holding the control stick firmly. Possible abnormal roll will be felt better when piloting manually.” The possible nature or cause of such a “roll axis anomaly” was not discussed.

◊◊◊

A Continental Express ATR 42. (Bernd Oberschelp)

The release of the brochure did not stop further incidents from occurring. On March 4th, 1993, an ATR 42 operated on behalf of Continental Express was on approach to Newark, New Jersey when it encountered severe icing conditions with freezing rain. The aircraft was flying at 170 knots, well above the minimum icing speed, with all anti-icing and deicing systems active, when the autopilot disconnected and the ailerons deflected to about 50% of maximum right wing down. The aircraft rolled 52 degrees to the right before the pilots recovered with full left aileron. Several further oscillations of decreasing amplitude occurred before control was regained. The pilots reported that the controls felt “spongy,” and after recovering from the incident, the captain observed an incredible 3 inches (7.6 cm) of ice on the wings behind the deicing boots.

Although the NTSB investigated the incident, they never issued a probable cause, and most of the analysis was apparently done by ATR. The manufacturer determined that the roll excursion began at an AOA of only 7 degrees, but that any analysis of aircraft performance was made nearly impossible due to the aerodynamic effects of “severe turbulence” captured on the flight data recorder. Interestingly, the pilots didn’t mention any turbulence in their account of events, although the FDR data showed that there were continuous G-force excursions on the order of ±0.3 G’s. ATR determined that ice likely lowered the stall AOA, and then the turbulence caused a locally higher-than-measured AOA on the right wing, triggering an asymmetric stall. The manufacturer argued that the abnormal aileron movement could have been caused by the turbulence too.

However, looking back at the incident from after the Roselawn accident, the NTSB argued that turbulence of the magnitude recorded was not strong enough to overcome the normally neutral hinge moment and deflect the ailerons. In their retrospective view, the severe ice buildup on the wings behind the deicing boots likely reduced the hinge moment reversal AOA to 7 degrees. The pilots were able to overcome the probable hinge moment reversal thanks to their prompt and forceful application of full opposite aileron.

Just under one year later, on January 28th, 1994, another Continental Express flight experienced a similar incident near Burlington, Massachusetts. The incident was not reported to the NTSB, nor was it required to be, but ATR learned of the occurrence and conducted an investigation. According to ATR, the aircraft was flying at 16,000 feet when it encountered icing conditions, causing increased drag even though all deicing and anti-icing systems were active. The airspeed dropped to 144 knots and the angle of attack increased to 11.2 degrees, causing the stick shaker to activate, and the autopilot disconnected. The ailerons subsequently deflected to about 70% left wing down, and the aircraft rolled 54 degrees to the left. The pilots were able to recover by applying full opposite aileron, and the flight was safely continued.

ATR found that the high drag and associated deceleration could only be associated with severe icing conditions that overwhelmed the deicing boots. The loss of airspeed was not noticed by the crew and the aircraft stalled shortly after the stick shaker was triggered. During the stall, an aileron hinge moment reversal occurred, leading to the roll excursion.

◊◊◊

The incidents described above were those identified as significant by the NTSB, but there were actually at least 24 icing-related incidents involving ATR aircraft known to the agency. Many of these were clearly unrelated to the issues raised by the Roselawn investigation, although Stephen Frederick’s book describes some of them anyway. But it’s worth stepping back to analyze just how relevant to Roselawn these incidents were, or weren’t.

In all of the incidents, except for Newark, the upset was associated with a prolonged period of decreasing airspeed and/or increasing angle of attack. This type of incident occurs because ice alters the drag curve or, put another way, increases the minimum sustainable airspeed. Below a certain speed, an aircraft cannot maintain altitude because the angle of attack required to maintain lift at that speed is too high, resulting in higher induced drag as more of the fuselage is presented to the oncoming air. When this drag is higher than the available thrust, the airplane will keep decelerating until it stalls. If ice adds even more drag, that means drag will overcome thrust at a lower AOA and thus a higher airspeed.
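The arithmetic behind this can be sketched in a few lines of Python. This is a minimal illustration, not ATR performance data: it assumes a simple parabolic drag polar, treats available thrust as constant with speed (which is not true of a real turboprop), and every name and number in it (the weight, wing area, drag coefficients, and the function `min_sustainable_speed`) is a placeholder chosen for illustration.

```python
import math


def min_sustainable_speed(cd0, k=0.045, weight_n=200_000.0, wing_area_m2=61.0,
                          air_density=0.9, thrust_n=18_000.0):
    """Lowest level-flight speed (m/s) at which drag equals thrust.

    Assumed model (illustrative only): drag = cd0*x + k*W^2/x, where
    x = 0.5 * rho * V^2 * S. The first term is parasite drag, the second is
    induced drag, which grows as speed falls and AOA rises.
    Returns None if drag exceeds thrust at every speed.
    """
    # Setting drag equal to thrust gives cd0*x^2 - T*x + k*W^2 = 0.
    disc = thrust_n ** 2 - 4.0 * cd0 * k * weight_n ** 2
    if disc < 0:
        return None  # no speed at which level flight can be sustained
    # Lower root of the quadratic, written in a numerically stable form.
    x_low = 2.0 * k * weight_n ** 2 / (thrust_n + math.sqrt(disc))
    return math.sqrt(2.0 * x_low / (air_density * wing_area_m2))


# Hypothetical zero-lift drag coefficients for a clean and an iced airframe.
for label, cd0 in [("clean", 0.03), ("iced", 0.04), ("heavily iced", 0.06)]:
    v = min_sustainable_speed(cd0)
    if v is None:
        print(f"{label}: no sustainable level-flight speed")
    else:
        print(f"{label}: minimum sustainable speed ~{v:.0f} m/s")
```

With these placeholder values, adding drag raises the lowest speed at which thrust can still balance drag, and a large enough increase leaves no speed at which level flight can be held at all.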

Accidents and incidents resulting from flight “behind the drag curve,” both in icing conditions and in clear air, have featured repeatedly in my past articles. My breakdowns of Sol Líneas Aéreas flight 5428, Pinnacle Airlines flight 3701, West Caribbean Airways flight 708, and Air Algérie flight 5017 contain some of the clearest examples.

The risk of this type of incident is higher in icing conditions because airplanes contaminated by ice can fall behind the drag curve at airspeeds that are typically safe. This is precisely why higher minimum icing speeds must be adhered to. In some, but not all, of the incidents, these minimums were not observed. In all of the above-mentioned cases, except for Burlington, the propeller RPM was also set to 77% rather than the minimum of 86% for icing conditions. This tends to allow more ice to accumulate and decreases propeller efficiency, leading to more drag and faster airspeed loss. However, it should be noted that this factor was not significant in the Newark incident because there was no prolonged loss of airspeed before the upset. Nor was it relevant to Roselawn, where the pilots used the correct RPM.

All of the incidents were assessed to have happened in icing conditions more severe than the Appendix C certification envelope. This was the case in Roselawn as well, so this fact is sort of beside the point. A more important factor setting the incidents apart from Roselawn was the fact that in all of the incidents, except debatably Newark, the aircraft experienced a stall. Since aileron hinge moment reversal is normally associated with a stall, the fact that the ailerons experienced an uncommanded deflection in at least three of these incidents (Mosinee, Newark, Burlington) and possibly four (Italy) could be interpreted as the logical result of known aircraft behavior. All aircraft with unpowered, horn-assisted ailerons can experience hinge moment reversal at high AOAs, and much as ice will reduce the stall AOA, it will reduce the hinge moment reversal AOA too, increasing the probability that pilots will encounter the phenomenon.

The question of whether the occurrence of aileron hinge moment reversal contributed materially to the above incidents will be discussed in Parts 7, 8, and 10. But the NTSB also raised another, equally important question: given their knowledge of these previous incidents, could ATR have predicted that certain types of ice accumulation could induce aileron hinge moment reversal at low AOAs prior to stall? In the end, the NTSB and its French counterpart would come to wildly divergent conclusions.

◊◊◊

Part 7: What Did They Know, and When Did They Know It?

One of the NTSB’s central findings during the Roselawn investigation was that ATR had all the information it needed to predict an icing-induced aileron hinge moment reversal at low AOA, but did not do so, and in fact released misleading information in an effort to deflect criticism of its product.

According to the NTSB’s account of events, ATR knew from the earliest testing stages that their aileron design could experience hinge moment reversal, and a number of steps were taken to ensure that this would not happen during actual operations. The ATR’s stall protection systems were designed, like those of any aircraft, with activation thresholds intended to protect from adverse aircraft handling characteristics near, at, and above the stall AOA, including but not limited to aileron hinge moment reversal. However, the manufacturer was not required to inform the certificating authorities of the exact nature of any adverse characteristics that may be associated with high angles of attack, since this is outside the operating envelope of the aircraft. Ideally, the stick pusher should lower the AOA before these characteristics are encountered. But because hinge moment reversal is associated with airflow separation over the ailerons, it was possible to infer that ice on the wings could lower the hinge moment reversal AOA, in much the same way that ice reduces the stall AOA. The theoretical possibility of an aileron hinge moment reversal at an AOA below the stick pusher or stick shaker threshold in icing conditions probably would have drawn the attention of the DGAC or the FAA, had they been informed.

The first in-service occurrence that the NTSB felt should have drawn the manufacturer’s attention was the 1988 Mosinee, Wisconsin incident. The uncommanded deflection of the ailerons to almost full left-wing-down before the activation of the stick shaker showed that the theoretical possibility described in the previous paragraph could happen in real life if ice built up behind the deicing boots. Despite this discovery, ATR did not seriously consider increasing the depth of the deicing boots, which the NTSB believed was the most effective solution to the problem. Furthermore, the company did not inform operators or pilots that ice behind the deicing boot could cause an uncommanded aileron deflection and loss of control. Instead, various letters, bulletins, and brochures produced by ATR referred to vague, unspecified “anomalies,” “stiffness,” or “increased control forces.” In the NTSB’s view, none of these descriptions adequately conveyed what pilots could expect in the event of a hinge moment reversal.

The NTSB wrote that the steps ATR did take, including the installation of the vortex generators, development of the anti-icing advisory system, and creation of the icing simulator package, were positive but didn’t go far enough. Because the FAA was misinformed about the danger by ATR, the agency incorrectly believed that the vortex generators would “remove” the source of the abnormal aileron behavior, which was not true. Furthermore, while the simulator package did feature asymmetric stalls resulting in bank angles between 60 and 80 degrees, the package did not replicate aileron hinge moment reversal and did not represent the true speed and force of the roll upset.

According to the French BEA, Simmons Airlines never adopted the simulator package anyway.

◊◊◊

In 1989, in the aftermath of the Mosinee incident, the manager of the FAA’s Seattle Aircraft Evaluation Group wrote a letter to the manager of the Aircraft Evaluation Office, in which he stated that the ATR 42 “has an apparent inability to carry ice or at least perform reliably in icing conditions.” The letter mentioned 10 icing-related incidents in which “abnormal flight characteristics were demonstrated by the airplane,” and stated that there had been “a perceived reluctance on the part of the manufacturer to accept the fact there is an icing problem with the ATR 42. They have continually questioned the competence of the aircrews and the training programs in dealing with flight in icing conditions….” Speculating about the cause of the repeated difficulties, he added, “Intuitively, it seems that a high performance wing and boots do not go together,” and concluded by calling for flight tests with ice shapes simulating “runback.” As far as the NTSB could tell, the FAA ignored the letter. Speculating outside the scope of the NTSB report, the FAA may have been reassured by ATR that they had already flight tested such ice shapes after the Italian accident and found no anomalies.

The NTSB concluded that ATR, the FAA, and the French DGAC had enough information after the Mosinee incident to recognize a deeper safety issue and redesign the deicing boots. However, the company continued to put out inaccurate information. The 1992 All Weather Operations brochure contained a number of false or misleading statements that “minimized the known catastrophic potential” of an aileron hinge moment reversal and appeared to suggest that an encounter with freezing rain was easily avoidable by the pilot.

At the same time, the DGAC was receiving reports of other ATR 42 incidents around the world, but these were not forwarded to the FAA and the agency did not identify the scale of the ongoing trend. Under an arrangement called the Bilateral Airworthiness Agreement, US and French airworthiness authorities (that is, the FAA and DGAC) agree to accept each other’s certification of aircraft manufactured in their respective countries with minimal transatlantic participation, so long as proof of regulatory compliance is submitted and both parties are made aware of incidents that might be relevant to the “continuing airworthiness” of the airplane. Under this agreement, it was the responsibility of the DGAC to provide the FAA with information about incidents involving French aircraft types that also operate in the United States. The NTSB argued, however, that when ATR conducted detailed analyses of several post-Mosinee icing incidents, the DGAC did not forward this information to the FAA. The DGAC’s position was that ATR’s analysis of these incidents had revealed they were caused by pilot non-compliance with procedures, so there was no issue with “continuing airworthiness,” and thus nothing to tell the FAA about.

The NTSB wrote that the final Newark and Burlington incidents provided one last opportunity for ATR and the DGAC to recognize the danger of aileron hinge moment reversal. However, ATR’s analysis of the Newark incident improperly attributed the aileron movements to severe turbulence. In the NTSB’s view, the DGAC should have had the expertise to recognize that turbulence on the order of ±0.3 G’s would not cause such aileron deflection. At the time of the Roselawn crash, the NTSB investigation into that incident was still ongoing and a probable cause had not been determined. No investigation of the Burlington incident was ever conducted because Continental Express only informed ATR and not the NTSB. Even though ATR uncovered evidence of an aileron hinge moment reversal in that incident, the DGAC didn’t identify a “continuing airworthiness” issue and never informed the FAA.

◊◊◊

The BEA’s response to the final report was, at 341 pages, the same length as the report itself — although the BEA’s typeface was much larger. (BEA)

The NTSB’s French counterpart, the BEA, responded to these findings with unmitigated derision. In a scathing 341-page response, the BEA laid out its own very different analysis of the causes of the accident and its precursors, using some of the strongest language I have ever seen in an accident report. The opening paragraph of their report speaks for itself:

“The BEA strongly disagrees with substantial portions of the Factual, and with the Analysis, Conclusions, and Probable Cause sections of the report. In the BEA’s view, except for the Recommendations section, the present report is incomplete, inaccurate, and unbalanced. It appears to have been influenced by an a priori belief on the probable cause of this accident. The BEA strongly believes that [this] one-sided approach is detrimental to the cause of international aviation safety.”

Throughout its report, the BEA described various portions of the NTSB’s criticism of ATR and the DGAC as “excessive,” “not supported by the record of the investigation,” “incorrect,” “disturbing,” “erroneous,” “[lacking] factual basis,” “highly deficient,” and “outrageous and absolutely wrong.” In fact, the BEA basically only agreed that an aileron hinge moment reversal occurred, and disagreed about nearly everything else. The following section discusses their positions on previous incidents and the culpability of ATR, while Part 8 will dive into the dispute over the conduct of the accident flight itself.

Central to the BEA’s argument was its assertion that the “ice-induced aileron hinge moment reversal” phenomenon was not discovered until after the Roselawn accident, nor was it likely to have been discovered earlier.

The BEA report doesn’t really address the design phase, where the NTSB states that aileron hinge moment reversal was observed at low AOAs and that modifications were made specifically to increase the AOA at which it would occur. In fact, this information was central to the NTSB’s argument that an ice-induced hinge moment reversal at a low AOA was foreseeable. Instead, the BEA emphasized (correctly) that the handling characteristics of the ATR with and without ice met all applicable certification requirements, and thereafter focused on the in-service incidents. The BEA notes, again correctly, that no abnormal aileron deflection was detected in the Ryan Air and Air Mauritius incidents, which were the only incidents described by the NTSB that occurred outside that agency’s own jurisdiction. However, the BEA’s statements about the Mosinee, Newark, and Burlington incidents are harder to justify.

The BEA’s position on the Mosinee incident was that a “momentary modification of the aileron hinge moment” occurred after the stall had begun, but that this had “no effect on the incident.” Although this “momentary modification” was discovered during the initial investigation into the incident, the BEA still maintained that the “ice-induced aileron hinge moment reversal” phenomenon wasn’t discovered until Roselawn. This contradiction appears multiple times in their report. In that very document, the BEA stated that vortex generators were installed after the Mosinee incident to help maintain “lateral control and stability” with an asymmetric ice buildup, and in another location the report states, regarding Mosinee, that “At that time… it was concluded that such conditions could have been the origin of an aileron hinge moment modification which occurred about the stall threshold….” In fact the BEA freely acknowledged that a “hinge moment modification” was observed in Mosinee and Burlington at several points in the report. It’s therefore quite transparent that the BEA was engaging in semantics — there is no meaningful difference between discovering an ice-induced “hinge moment modification” and a “hinge moment reversal,” except for the name. One has to wonder whether the purpose of these blatantly obvious word games was to protect ATR from liability as a French state-owned corporation.

In its report, the BEA also characterized the hinge moment “modifications” in Mosinee and Burlington as a “marginal characteristic of an asymmetrical stall in severe icing conditions,” meaning that the phenomenon was a consequence of the departure from controlled flight and not its cause. On the one hand, an aileron hinge moment reversal by definition requires airflow to separate over one aileron before the other, so as a stall characteristic it can only occur during asymmetric stalls. However, the implications of the massive aileron deflection, insofar as it contributed to the severity of the upset in either case, were not explored by the NTSB or the BEA. It’s hard to imagine that the ailerons suddenly moving to near maximum deflection during a stall would not significantly complicate the recovery.

Regarding the Newark incident, the BEA rejected the assertion that any aileron hinge moment reversal occurred, citing ATR’s position, which was accepted by the NTSB at the time, that the abnormal behavior was caused by severe turbulence interacting with ice. The BEA argued that if the NTSB wanted to fault the DGAC for not seeing through ATR’s interpretation, then there was no reason why the NTSB, as the investigating authority, should not also have recognized the danger. If the NTSB could determine after Roselawn that the turbulence was insufficient to cause the recorded aileron deflection, then why couldn’t they have done so before the crash? And similarly, if the NTSB believed ATR should have taken more concrete action after the Mosinee incident, why didn’t the NTSB make any recommendations to that effect?

For its part, the NTSB stated that it didn’t receive ATR’s analysis of the Mosinee incident until after the Roselawn crash, which prevented them from understanding the true scope of the danger at the time. The BEA rebutted this claim, providing a handwritten document that appeared to show six NTSB investigators present at a meeting during which ATR presented its analysis. I was not able to resolve this contradiction.

The document that the BEA claims proved the NTSB did receive ATR’s analysis of the Mosinee incident. Also, as a side note, does anyone else find it bizarre that the BEA didn’t censor the telephone numbers? (BEA)

Regarding Newark, the New York Times reported that the NTSB did not send an aerodynamic expert to that investigation, which was probably why the NTSB failed to question ATR’s assertion that turbulence was responsible. This testifies to an unfortunate truth about the way the NTSB operates. In 1994, the NTSB’s aviation office had only 47 investigators, who were collectively responsible for investigating potentially hundreds of accidents and incidents every year. The NTSB is not actually required to investigate an incident unless it results in injuries to a person or substantial damage to the airplane, so resources are only allocated to lesser incidents if the NTSB has people to spare and believes the expenditure would be worthwhile. Today, this is less of an issue because safety improvements have greatly reduced the number of accidents requiring immediate attention. But in the 1990s, the NTSB had much less free bandwidth to devote to incidents like Mosinee and Newark. As a result, the investigators assigned to those cases mostly relied on analysis performed by the manufacturer, which possessed expertise that was outside the budget of a minor investigation.

Although this issue wasn’t discussed in the report, NTSB personnel were clear-eyed about their own limitations. In 1995, Timothy Forte, director of the NTSB’s Office of Aviation Safety, told the New York Times, “We missed it. It was there. When we looked back, we saw it. We wish we had caught it.”

◊◊◊

In the end, while the BEA did raise a few important points, the thesis of its argument ultimately fell flat. No one I spoke to while researching this article believes ATR didn’t know about ice-induced aileron hinge moment reversals before Roselawn.

Part 9 discusses how the NTSB reacted to the BEA’s accusations. But before we get there, we have to talk about the question that occupied the majority of the BEA report — that is, whether the pilots could also have prevented the crash.

◊◊◊

Part 8: The Blame Game

An aerial view of the crash site leaves it hard to imagine that the scattered fragments once formed an entire airliner. (NTSB)

The NTSB and BEA reports widely diverged on the question of whether the pilots and controllers bore any responsibility for the accident. The NTSB found that the actions of the pilots were broadly correct, and although some areas for improvement were noted, these did not contribute to the accident. However, the BEA disagreed in forceful terms, arguing that the pilots violated basic procedures and failed to exercise common sense. This chapter analyzes both perspectives, with additional context brought in from Stephen Frederick, who was better able to convey how pilots viewed these issues in real life.

Both the NTSB and the BEA noted that flight 4184 was cleared to depart Indianapolis even though flow control expected that they would have to hold en route, which is contrary to the purpose of flow control. The NTSB stated that flow control was permitted to release any aircraft so long as it would not place that aircraft into icing conditions severe enough to “adversely affect the safety of flight,” and thus the release of flight 4184 was proper because there were options for the pilots to exit the hold if such conditions developed. On the other hand, the BEA wrote that the decision to release the flight was “improper” and contradicted the provisions of the FAA Air Traffic Control Handbook, simply because a hold was expected, regardless of whether it was in icing conditions. Stephen Frederick also wrote that the release of the flight with an expected hold violated FAA rules, but he didn’t say which ones. I was not able to conclusively determine which of these parties was right.

Once flight 4184 was in the air, everything was essentially normal until the Boone sector controller instructed the flight to enter a holding pattern at the LUCIT intersection. Although icing conditions were forecast in this area, the Boone sector controller hadn’t received any recent pilot reports (PIREPs) indicating that such conditions actually existed. Furthermore, even if icing conditions had been reported, there was no FAA regulation or ATC policy forbidding controllers from instructing an aircraft to hold in known icing conditions, nor was there any regulation or company policy that would have forbidden the flight crew from accepting the hold.

Both the BEA and Stephen Frederick criticized the controllers for repeatedly extending the expected holding time by 15-minute increments. Frederick believes that this was a form of “structuring” intended to avoid paperwork, since controllers were instructed to report to flow control if an aircraft was assigned a hold greater than 15 minutes. However, the NTSB stated that the 36-minute assigned hold should have been reported regardless of how it was structured, although this had no effect on the accident. Frederick disagreed that this played no role — in his view, if the pilots had known from the start that they would be holding so long, they might have requested to hold away from the icing conditions.

Nevertheless, while holding at LUCIT, the flight crew never provided a PIREP to the controller, even after they identified icing conditions at 15:41. If they had done so, then the controller likely would have asked whether the pilots wanted to hold somewhere else. Controllers in Chicago were well aware of the risks of prolonged operation in icing conditions — in fact, the supervisor at the facility had posted a handwritten note that day reminding controllers to monitor the conditions because “icing kills.” There was no doubt that they would have been responsive to any icing-related request.

A photo of the handwritten reminder that “icing kills,” which was posted in the Chicago ARTCC. (BEA)

In its report, the BEA wrote that the controller should have solicited a PIREP from flight 4184, because FAA rules require controllers to acquire a PIREP from any flight operating in known icing conditions. However, the BEA doesn’t mention the fact that flight 4184 was the only aircraft holding in the Boone sector, and no other aircraft had reported icing conditions since the controller started her shift. Because icing forecasts are broad and the actual conditions are often localized, a forecast of icing was not by itself an adequate basis for controllers to assume that icing conditions existed unless a pilot reported them.

The NTSB noted that while Simmons Airlines procedures required the pilots to make a PIREP whenever icing conditions were encountered, in practice many pilots did so only if the conditions were worse than usual or different than forecast. In their view, the pilots’ failure to make a PIREP was probably because they didn’t perceive the ice buildup to be operationally significant. Furthermore, they argued, the failure to make a PIREP was unrelated to the accident because if the pilots had perceived a need to exit the conditions, they would have done so regardless.

The BEA disagreed with this position. In their view, if the pilots had submitted a PIREP, the controller likely would have cleared them to leave the icing conditions immediately. However, the BEA appears to assume that a PIREP would be associated with a request for an alternate holding location. If the pilots underestimated the severity of the ice, they might not have made such a request. On the other hand, the controller was aware of the risk posed by icing and might have offered an alternate clearance anyway, which the pilots would have been unlikely to reject.

Therefore, in my opinion there was some chance that a PIREP would have altered the course of events, but not enough to definitively state that the absence of one contributed to the accident.

A closer aerial view shows the remains of the tail section, and the impact crater. (Anderson Herald Bulletin)

A potentially more significant issue was the pilots’ apparent underestimation of the severity of the icing conditions. Even though the NTSB and the BEA agreed that the rate of ice buildup during the holding pattern was likely “severe,” the actions of the pilots don’t resemble those of a crew who knew they were in danger. The first time ice was mentioned was at 15:48, seven minutes after the deicing boots were activated and ten minutes before the crash. First Officer Gagliano later stated that they still had ice at 15:55. However, these were the only mentions of ice at any point during the flight, and neither comment prompted a conversation about the conditions or the performance of the airplane. If the pilots had been worried about the rate of ice buildup, they would have said so, and they didn’t.

The NTSB believed that the pilots underestimated the conditions due to a confluence of factors. Most importantly, even though the ice buildup was theoretically severe, calculations showed that the largest drag increase experienced by the aircraft during the holding pattern was 3%, which would not be noticeable to the crew. Although ice likely contributed to a loss of speed during a turn at 15:33, the pilots apparently attributed this to the heavy aircraft and steep bank angle, and in any case they regained speed after straightening out. The aircraft otherwise had no trouble maintaining speed during the holding pattern, nor was the autopilot struggling to maintain altitude. The lack of any apparent abnormalities would have strongly reassured the crew that the icing conditions were within the capabilities of the aircraft.

Neither the NTSB nor the BEA addressed why, if the icing was severe, no significant performance penalties were observed. My best answer to this conundrum is that the severe icing conditions were intermittent, so the deicing boots probably had an opportunity to clear away the existing ice every time the rate of accumulation slowed. All sources agree that the aircraft was flying in and out of clouds while in the holding pattern, supporting this assumption.

The NTSB also pointed to the difficulty of identifying freezing rain as a second reason for the pilots’ apparent complacency. The severity of the buildup might have been obscured if the freezing droplets failed to adhere to the ice detector probe, and what ice did build up might have been mostly transparent. Certainly the pilots in the Mosinee incident failed to notice an even larger buildup of clear ice in freezing rain.

After that incident, Simmons Airlines had issued a memo to its flight crews with some tips for detecting freezing rain, but the airline undermined this in 1991 by releasing a second memo falsely claiming that company aircraft were certified to fly in freezing drizzle and light freezing rain as long as all deicing equipment was functional. This assertion was completely untrue and the memo was quickly rescinded, but not before many Simmons pilots had read it. In general, the various manuals available to Simmons flight crews correctly warned against entry into conditions of freezing rain and explained some of the hazards associated with it. But the burden was always on the pilots to “exercise vigilance” and avoid such conditions “where possible,” despite the fact that the conditions could be hard to predict or detect. To make matters worse, many Simmons pilots stated that they had never received any formal training on identifying freezing rain.

Lastly, Stephen Frederick has also noted that aircraft are certified to fly for 45 minutes in continuous icing within the Appendix C envelope. If the pilots perceived that the conditions were unremarkable, they might have believed it was safe to remain in those conditions for 45 minutes before they would have to consider alternatives. At the time of the accident, the aircraft had been holding for only 34 minutes and the pilots were expecting clearance out of the hold at any time.

The BEA was, unsurprisingly, much less charitable in its interpretation. Whereas the NTSB interpreted the lack of comments about ice as evidence that the conditions didn’t appear unusual, the French investigators arrived at the opposite conclusion, that the conditions almost certainly appeared unusual and the pilots just weren’t paying attention.

The BEA noted that the aircraft was probably in icing conditions more often than not almost from the moment the holding pattern started at 15:24. By 15:33, enough ice had accumulated to produce a detectable effect on aircraft performance, and at this time a chime was heard that the NTSB believed most likely came from the anti-icing advisory system. At that point the pilots should have activated the deicing boots, but they didn’t hear the chime because they were engaged in off-topic conversation, and they didn’t react until the chime sounded again at 15:41.

The BEA also highlighted numerous ATR and Simmons Airlines publications and manual excerpts describing the dangers of freezing rain, some of which warned that freezing rain might be slow to trigger the automatic ice detection system, and that the anti-icing advisory system is “not a substitute for crew vigilance.” A 1993 letter distributed by the airline warned that “any encounter with severe ice — including freezing rain — for a prolonged period of time may cause control problems beyond that of the intended design,” and the aircraft operating manual — informed by recent incidents — stated that freezing rain “will flow aft on the wing and freeze, creating a potentially dangerous situation.” Overall, despite a few misleading publications mentioned by the NTSB, the BEA felt that the overwhelming message of the available guidance materials emphasized the hazards appropriately and demanded a level of crew vigilance that was not apparent on flight 4184.

Furthermore, the BEA pointed out, the available materials already mentioned unusual aileron forces and other hazardous control effects due to freezing rain, and there was no evidence that the flight crew would have acted any differently had ATR informed them of the details of the hinge moment reversal phenomenon.

I have to take issue with some aspects of these arguments. The significance of ATR’s alleged failure to disclose the danger of aileron hinge moment reversal was not that pilots would react differently, although they might have, but rather that the publicization of the phenomenon would have forced ATR to make design modifications to prevent it. Additionally, the difficulty of detecting freezing rain does not, to me, imply that greater crew vigilance is an appropriate solution, although it doesn’t hurt. From a systemic safety perspective, crew vigilance is the least reliable antidote to an insidious hazard, and blaming the crew for failing to detect such a hazard is a surefire way to ensure that a similar accident happens again. The NTSB was right to argue that the only real solution to the freezing rain problem was to certify aircraft to fly in those conditions.

The Indianapolis Star on February 27th, 1995 featured a front page article praising the pilots, based on information that came out during the NTSB’s public hearing. (Indianapolis Star)

Despite this, the BEA wrote that the crew were “oblivious” to the danger and that their actions bordered on negligence. In their view, the pilots failed to react appropriately to the icing conditions because they were distracted and complacent. They criticized the lengthy off-topic conversations with the flight attendants, the captain’s trip to the bathroom, and the decision to listen to a commercial music station — and not only did they find that these actions were unprofessional, they linked them directly to the accident.

The NTSB considered that these actions were debatable but technically complied with the letter of the law. The sterile cockpit rule, in force since 1981, demands that pilots refrain from any conversations or activities unrelated to the operation of the aircraft while below 10,000 feet, except in cruise flight. Since the holding pattern was conducted at 10,000 feet, the sterile cockpit rule didn’t apply, unless the captain declared that they were in a “critical phase of flight.” Most pilots didn’t consider circling in a holding pattern on autopilot to be a critical phase of flight. But the BEA disagreed, arguing that a holding pattern requires more attention from the crew than mere cruise flight, especially in icing conditions, where extra vigilance is required. Thus, in their view, Captain Aguiar should have declared that the sterile cockpit was in force, kicked out the flight attendant, and turned off the music. Not only was this his prerogative, they wrote, it was his obligation.

The BEA argued in its report that these distractions caused the crew to relax their vigilance and underestimate the severity of the icing conditions. Thus, if the pilots had exercised proper vigilance, they would have recognized that the conditions were severe, in which case they were obligated to leave the area immediately. This would have prevented the accident.

Furthermore, the conversation with the junior flight attendant caused the pilots to miss the first icing alert chime at 15:33, and they allowed ice to continue to build up for another eight minutes before they activated the deicing boots. The BEA made a point of highlighting this error, but they never articulated how it made any difference. Even though the deicing boots weren’t activated at that time, only minor, transient performance changes were noted between 15:33 and the activation of the boots at 15:41. Although the failure to immediately activate the boots in icing conditions was technically a violation of Simmons Airlines icing procedures, the NTSB rightly concluded that it did not contribute to the accident because the loss of control was caused by an ice accretion that the deicing boots could not remove or prevent.

Ultimately, the problem with the BEA’s distraction argument is that it doesn’t reflect what is known about the effect the conditions had on the aircraft. As I mentioned earlier, the lack of any noticeable increase in drag on the airframe during the hold appears to contradict the assertion that the pilots should have considered the icing “severe.” Consequently, I have to side with the NTSB’s conclusion that the pilots did observe the conditions and did not believe them to be extraordinary.

The BEA also used some arguments that I found insulting. The French investigators highlighted the fact that the flight attendants were female every time they were mentioned, and accused the NTSB of failing to explore the “safety risk” posed by “male-female crew interactions” in flight, as though flight crews are incapable of behaving themselves when a member of the opposite sex is present. Although the CVR did record a couple of comments that would have made me uncomfortable if they were directed at me, the BEA wasn’t making a point about harassment but rather appeared to voice regressive beliefs about the mere presence of women being detrimental to the performance of male professionals. Such arguments were among those used to exclude women from piloting roles in many countries as late as the 1990s. Furthermore, I was unable to make a plausible connection between the pilots’ comments and the accident.

◊◊◊

A front page New York Times article from February 26th, 1995, placed blame for the accident on the FAA. (New York Times)

In the end, although they did devote considerable time to the distraction issue, the most significant of the BEA’s arguments surrounding flight crew performance was related to the use of flaps 15 while holding. As a reminder, the post-accident tanker tests revealed that the ridge of ice behind the deicing boots had its most significant effect on the ailerons if the ice accumulated with flaps 15, followed by flap retraction.

The NTSB wrote in its report that there was no provision in any manual that prohibited holding with flaps 15, although there were no provisions supporting it either. Nevertheless, the practice appeared to be widespread at Simmons Airlines. One Simmons first officer stated that 65% of Simmons captains extended the flaps while holding in clear air and 100% would extend them in icing conditions. A line check captain explained to the NTSB that this was done to reduce the angle of attack, which would otherwise increase while holding due to the mandatory lower airspeed. A higher AOA was undesirable because it was uncomfortable for the passengers and increased the risk of an ice-induced stall.

The NTSB noted that in general, prolonged operation in icing conditions with the landing gear or flaps extended is undesirable because these devices provide an additional surface for ice accretion. However, the flaps also would have increased the available margin above the stall speed, improving safety. The fact that Captain Aguiar joked that he would be “ready for the stall procedure pretty soon” indicated that he was concerned about this issue before he extended the flaps.

On the basis of this information, the NTSB concluded that the pilots were within their rights to use flaps 15, likely considered it consistent with their training, and could not have predicted that it would be unsafe under the conditions that existed. The ATR manual stated that holding with the flaps retracted improves fuel economy, but did not indicate that there was a safety-related reason for this recommendation. Besides, even ATR engineers didn’t know that flying with flaps 15 would produce a more adverse ice buildup in freezing rain until after the accident — so how could the pilots possibly have inferred this?

The BEA, on the other hand, strongly criticized the pilots for extending the flaps. They questioned the first officer’s stated premise for doing so, writing that there was no evidence of “wallowing” despite his comments that they were “wallowing in the turns.” They also argued that because the aircraft flight manual only provided holding performance data with the flaps retracted, that meant holding with flaps 15 was implicitly prohibited. On that basis, they accused the pilots of a deliberate breach of standard operating procedures that directly created the conditions for the accident to occur. If the flaps hadn’t been extended, then the aileron hinge moment reversal probably wouldn’t have happened.

The BEA ignored many reasons why the pilots might have wanted to extend the flaps and why they would have thought holding with flaps was allowed. Many of these reasons were discussed in the NTSB report, but Frederick added some more in his book. In his view, extending the flaps while holding in icing conditions was not only allowed, but operationally necessary. He noted that according to the Simmons Airlines flight crew operating manual, a hold in icing conditions lasting more than a few minutes was required to be conducted at or above a speed called icing VmHBO, or high bank operations maneuvering speed. Basically, the ATR has two autopilot settings, high bank and low bank, which limit the bank angle the autopilot can command in a turn to 25 degrees or 15 degrees, respectively. The icing VmHBO is the minimum speed that must be maintained in order to safely use the high bank setting in icing conditions, accounting for the fact that ice and bank angle both increase the stall speed. Frederick indicates that at flight 4184’s reported weight, the icing VmHBO would have been 171 knots. This was only four knots below the maximum holding speed of 175 knots, leaving the pilots with an unacceptably narrow operating margin in either direction. One solution was to use the low bank autopilot setting, but this would cause the plane to make wider turns and overshoot the boundaries of the holding area, creating problems for air traffic control. A better solution, from a pilot’s perspective, was to extend the flaps, because the icing VmHBO with flaps 15 was 121 knots, providing a much wider operating margin.
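To put numbers on that margin, here is a minimal Python sketch of the arithmetic Frederick described. The speeds are the ones quoted above; the variable and function names are my own, and I am assuming that the same 175-knot holding speed limit applied in both flap configurations.

```python
# Speeds quoted in the text, in knots.
MAX_HOLDING_SPEED_KT = 175  # assumed to apply in both flap configurations

# Icing VmHBO at flight 4184's reported weight, per Frederick.
ICING_VMHBO_KT = {"flaps 0": 171, "flaps 15": 121}


def holding_margin_kt(min_speed_kt, max_speed_kt=MAX_HOLDING_SPEED_KT):
    """Width of the usable speed band between the minimum and the limit."""
    return max_speed_kt - min_speed_kt


# Usable speed band for each configuration.
margins = {
    config: holding_margin_kt(vmhbo) for config, vmhbo in ICING_VMHBO_KT.items()
}

for config, margin in margins.items():
    print(f"{config}: {margin} kt between icing VmHBO and the holding limit")
```

With the flaps retracted, the pilots had a 4-knot band to work with; with flaps 15, the same calculation gives 54 knots.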

The BEA suggested in its report that if the pilots needed a higher operating margin, they should have asked air traffic control for permission to use a faster holding speed, even if it caused the plane to exit the normal holding area. But I think the reality is most pilots would rather not explain to ATC why they want special treatment when they could solve the problem by making a simple, apparently routine configuration change instead.

◊◊◊

Recovery crews work at the scene of the disaster. (Chicago Tribune)

The final significant issue that caused serious division among the parties to the investigation was the question of whether the flight crew should have been able to recover from the upset.

In its final analysis, the NTSB accepted ATR’s report that the control forces during the aileron hinge moment reversal did not exceed 60 lbs (27 kg) in ground and wind tunnel testing. Notably, federal regulations require that lateral control systems produce control forces not greater than 40 lbs (18 kg) continuous and 60 lbs transient — numbers that happen to correspond quite closely to what ATR reported. I was not able to establish whether this correspondence was coincidental. However, the NTSB argued that even if the control forces didn’t exceed 60 lbs, a number of factors might have prevented the flight crew from regaining control before the flight path became unrecoverable.

The first factor was the element of surprise. The aircraft was on autopilot when the ailerons suddenly deflected, and there was no unusual behavior prior to the upset that would have put the pilots on alert. Reaction times to an extreme, unexpected upset vary, but two seconds is not unheard of. In this case, after two seconds, the bank angle was already 77 degrees, placing the aircraft far outside the normal operating envelope.

Today, pilots are taught to handle this type of upset in the simulator, using scenarios that place the aircraft in an unusual attitude and force the pilot to fly their way out. Pilots learn that to recover from a bank angle greater than 90 degrees, the first priority is to level the wings and avoid the instinctive urge to pull the nose up even if altitude is decreasing. Pulling up while upside down will cause the plane to dive almost vertically toward the ground, making the upset much worse and potentially unrecoverable. The point is that without this type of training, pilots thrust suddenly into an extreme attitude often fail to recover because the required techniques aren’t sufficiently familiar to be recalled during a moment of intense shock and adrenaline. This is true even when the ailerons aren’t also pinned at maximum deflection.

After the initial roll, First Officer Gagliano started to recover by pushing the nose down and applying significant force opposite to the roll. The angle of attack reduced below 5 degrees and the hinge moment returned to normal. He then managed to reduce the bank angle to 55 degrees, but he quickly stopped his nose down inputs, causing the angle of attack to increase back above 5 degrees, at which point the aileron hinge moment reversed again. Unable to comprehend this bizarre control behavior, Gagliano was not ready for this deflection either, and the plane embarked on a spectacular 450-degree right roll before he managed to regain lateral control a second time, holding the bank angle at 144 degrees. But as the pilots fought, disoriented and terrified, against control forces far outside anything they had previously encountered, they both pulled the nose up while upside down, sending the plane into a rapid dive toward the ground from 6,000 feet. The pitch angle peaked at 73 degrees nose down, at which point the flight path became unrecoverable. At that point the aileron hinge moment reversal ceased and never recurred, allowing Gagliano to roll back to wings level by about 2,500 feet. He also made a valiant effort to pull out of the dive, but their vertical speed and airspeed were both off the charts. Two seconds later, the G-forces exceeded the ultimate strength of the airframe and the plane disintegrated in midair.

The NTSB produced this animation depicting the loss of control of flight 4184. This animation was unexpectedly shown to the public for the first time, including to family members of the victims, at the hearing in February 1995. It apparently stirred quite an emotional reaction from those who were present. (NTSB)

It was evident that the rapid and extreme aircraft movements and high control forces involved in the upset overwhelmed the crew and rapidly placed the aircraft into an unrecoverable position. The NTSB noted that even with special training on upset recovery, the pilots might still have lost control because they couldn’t predict that the aileron hinge moment reversal would recur every time the AOA increased past 5 degrees. In fact, after the first uncommanded aileron deflection, Gagliano reacted correctly and appeared to be bringing the aircraft under control, only for the ailerons to deflect again.
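The recurrence the NTSB described can be illustrated with a short Python sketch that flags each moment an angle of attack trace climbs past the reversal threshold. The 5-degree figure comes from the text above; the function name and the sample trace are invented purely for illustration and are not flight recorder data.

```python
REVERSAL_AOA_DEG = 5.0  # threshold quoted in the text


def reversal_onsets(aoa_trace_deg, threshold_deg=REVERSAL_AOA_DEG):
    """Return the indices where AOA rises from at or below the threshold to above it."""
    onsets = []
    for i in range(1, len(aoa_trace_deg)):
        previous, current = aoa_trace_deg[i - 1], aoa_trace_deg[i]
        if previous <= threshold_deg < current:
            onsets.append(i)
    return onsets


# Hypothetical trace: AOA exceeds 5 degrees, is pushed back down, then rises again.
sample_trace = [4.0, 5.5, 6.0, 4.5, 4.0, 5.2, 7.0]
onsets = reversal_onsets(sample_trace)
print(onsets)
```

In this made-up trace the threshold is crossed twice, mirroring the sequence in which Gagliano's initial recovery was followed by a second uncommanded deflection as soon as the nose-down input was relaxed.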

During the upset, Captain Aguiar made control inputs, mostly in a nose up direction, and urged Gagliano to pull up “nice and easy” to avoid overstressing the airframe. Aguiar also quickly moved the flap lever back to 15 degrees, which would have helped reduce the AOA and decrease airspeed, but flap extension was inhibited at speeds above 185 knots and the flaps did not extend. The NTSB noted that this made the recovery more difficult. Also complicating the recovery was a similar inhibition on the use of the rudder. Because aileron control forces were high, the pilots could have used the rudder to assist in rolling wings level, but full rudder authority was not available at speeds over 185 knots in order to avoid overstressing the airframe.

But as you’ve probably guessed by now, the BEA did not believe the pilots attempted to recover to the best of their ability. The French investigators criticized the level of crew coordination during the upset, pointing to the lack of communication about the attitude of the aircraft and the fact that the pilots sometimes made opposite inputs. The BEA complained that the power levers were left at idle — an odd concern, since power at idle is part of the process of recovering from an overspeed condition or dive — and even the reduced rudder authority was not fully utilized. They also claimed that Gagliano’s aileron inputs were “not sustained” and stated that the flight path could have been controlled with a steady 6 degrees of left aileron beginning 3 seconds after the upset.

The report paid some lip service to the notion that the upset was extremely startling, but the BEA mostly blamed the pilots for their own disorientation. Their report cited the earlier distractions as the primary reason the pilots weren’t ready for the upset, suggesting that Captain Aguiar’s conversation with company operations, which ended 13 seconds before the loss of control, robbed them of their situational awareness. It remains unclear what part of the situation, had they been aware of it, was supposed to have helped.

However, before we move on, I want to come back to the issue of control forces, because this issue has left some unanswered questions. In his book, Stephen Frederick pointed out that ATR measured the control forces at a speed of 100 knots with a vertical acceleration of 1.0 G (nominal gravity), even though control forces tend to get stronger with increasing speed and G-forces. Furthermore, the BEA report acknowledged that its estimates of control forces were derived based on assumptions applicable only at the moment the upset began. With this in mind, Frederick argued that the aileron deflection might have been initially manageable with a sufficiently aggressive counter-input, but the forces required could have surpassed Gagliano’s strength within seconds as the airspeed surpassed 250 knots and the vertical acceleration exceeded 2.0 G.

The BEA report included this graph of aileron deflection and AOA on a common timescale that appears to support the hypothesis that the pilots could not overcome the hinge moment reversal. The red annotations were added by me. (BEA)

While Frederick stated that further testing would have been needed to prove this hypothesis, some interesting clues can be found elsewhere. According to the New York Times, as late as the summer of 1995 the NTSB believed the control forces were too strong for the pilots to overcome. This was after ATR had already made its control force calculations, and it’s unclear when during the investigation process the NTSB made an about-face and accepted the manufacturer’s findings. Additionally, the BEA included a graph of aileron deflection on a common timescale with angle of attack, and there is a remarkable correlation between the two. In general, the higher the AOA, the greater the right-wing-down aileron deflection, and the only time the ailerons were able to reach the neutral or left-wing-down positions was when the AOA was below 5 degrees — that is, below the hinge moment reversal AOA. Was this because Gagliano wasn’t making aileron inputs, or because he was unable to overpower the hinge moment reversal? It’s somewhat difficult to believe that it was the former. Unfortunately, this question has never been properly resolved.

In the end, the NTSB did not name any actions by the pilots as causes of or contributing factors to the accident. The BEA, on the other hand, determined that the failure to leave the freezing rain environment and the use of flaps 15 were both causal to the accident. They also cited a laundry list of pilot-related contributing factors, namely, “The failure of the flight crew to comply with basic procedures, to exercise proper situational awareness, cockpit resource management, and sterile cockpit procedures, in a known icing environment, which prevented them from exiting these conditions prior to the ice-induced roll event, and their lack of appropriate control inputs to recover the aircraft when the event occurred.”

As for where I stand on this matter, my heart is on my sleeve.

◊◊◊

Part 9: Clearing the Fallout

As the investigation gathered speed, a parallel story unfolded across America and the world as aviation authorities and pilots alike tried to understand the crash and anticipate its consequences.

In the chaotic first days after the accident, Simmons Airlines and ATR issued a head-spinning barrage of new flight envelope and performance restrictions in an attempt to preclude a recurrence, without knowing what had caused the crash in the first place. Many of these rules left pilots confused or angry, and a dreadful sense of uncertainty fell over American Eagle cockpits whenever icing was forecast. What scared them most, Frederick later wrote, was that “the threshold between controlled flight and uncontrolled flight was unclear.”

One week after the accident, the NTSB recommended that the FAA conduct a special airworthiness review of the ATR 42 and ATR 72 in icing conditions; ban ATR aircraft from entering any known icing conditions until the safety of the aircraft could be established; instruct controllers to expedite requests from ATR pilots to exit icing conditions; waive the 175-knot holding speed limit for ATRs; and provide clear guidance for ATR pilots to maintain an optimum speed, disconnect the autopilot, monitor lateral control forces, and reduce the angle of attack in the event of an icing encounter.

The FAA moved fast on some of these recommendations, and slower on others. As November progressed, ATRs continued to fly into icing conditions in the United States, even as pilots pushed back. Stephen Frederick recalled that one group of pilots distributed leaflets warning passengers about the ATR inside the Chicago O’Hare passenger terminal. Others decided in late November to arrange a coordinated walkout after being asked to fly in icing conditions. Some even went to the media, including Stephen Frederick, who was fired from Simmons Airlines after he criticized the ATR during an unconvincingly anonymous appearance on “Good Morning America.”

The front page of the Chicago Tribune the day after the crash. (Chicago Tribune)

In December, the FAA finally bowed to the pressure and issued an airworthiness directive banning any ATR aircraft in the United States from flying into known icing conditions. Mass flight cancellations ensued, and American Eagle moved its ATRs to southern routes where icing was less common. But the grounding lasted only 33 days, after which the FAA concluded that new operating procedures would provide sufficient assurance of safety even without changing the design of the aircraft. Fortunately for pilots and passengers, most US airlines were in no hurry to return their ATRs to colder climes.

In the meantime, a steady trickle of recommendations and modifications started to flow from the ever-expanding investigative findings. ATR issued a service bulletin instructing airlines to remove the flap inhibitor so that the pilots could extend the flaps at speeds above 185 knots in an emergency. The FAA assembled a working group to study ways to safeguard airplanes against supercooled large droplets and began a closer review of the handling characteristics of several turboprop aircraft in icing conditions. At the same time, ATR finally took the step that the NTSB felt they should have taken after Mosinee — they began working on an expanded deicing boot. Due to the construction of the wing, the depth of the boot couldn’t be increased beyond 12.5% of chord, which would not entirely solve the “runback” problem. This would, however, eliminate any chance of an ice ridge forming between 8 and 10% of chord, which was where investigators believed the ridge had to form in order to cause aileron hinge moment reversal at a low AOA. The FAA subsequently issued an airworthiness directive mandating installation of the new deicing boots by June 1995.

Another key NTSB recommendation was that the FAA expand the Appendix C certification envelope to include freezing rain. The FAA initially resisted this recommendation, and the BEA backed them up, arguing in their report that expanding Appendix C was unnecessary and energy would be better focused elsewhere. The FAA didn’t budge from this position until after the 1997 crash of Comair flight 3272, which was caused by a pattern of ice buildup that occurred within the Appendix C envelope but was not tested during certification. The process of expanding Appendix C lasted all the way until 2016 (better late than never!) and all aircraft designed since then are certified to fly in conditions of freezing rain without any adverse handling characteristics.

The NTSB’s recommendation that airlines provide upset recovery training to pilots also turned into a lengthy battle. Although American Eagle adopted such a program soon after the accident, the FAA dragged its feet on this issue because few simulators at that time could replicate the unusual attitudes needed for the advanced training. Upset recovery training was not provided for 100% of US airline pilots until 2019, but since then the requirement has spread throughout most of the world.

The three levels of warnings provided by the Performance Monitoring System on modern ATRs. (Magnar Nordal)

Another improvement, introduced in more recent versions of the ATR 42 and 72, is the performance monitoring system. This system detects when aircraft performance is being affected by ice and illuminates one of three warning lights (“cruise speed low,” “degraded performance,” or “increase speed”) depending on the severity of the degradation. This wouldn’t have helped in the Roselawn accident but could have prevented many of the precursor incidents that were associated with a loss of speed in icing conditions leading to stall.

Separately, ATR developed an official procedure for pilots to follow in the event of abnormal roll control, which calls for pushing the nose down firmly to reduce the AOA, setting flaps 15, and applying max power.

During and after the investigation, the NTSB also recommended that the FAA require airlines to provide all relevant AIRMETs to pilots before departure; reduce the subjectivity of icing severity levels; continue searching for ways to forecast freezing rain; require manufacturers to provide information about known undesirable handling characteristics outside the normal flight envelope; revise the process of accepting foreign aircraft certifications; improve sharing of accident and incident information between the US and other countries; and enforce the sterile cockpit rule while holding at any altitude in icing conditions, among many other suggestions.

Separately, widespread media coverage of American Airlines’ sometimes inept interactions with the victims’ families, and the lack of information about the investigation that those families received, led the United States Congress to tackle the issue of post-accident care. In 1996, Congress passed the Aviation Disaster Family Assistance Act, which designated the NTSB as the party responsible for coordinating family assistance after an air disaster, including provision of information and psychological support. The NTSB’s relative lack of conflicting interests and high public trust have helped guarantee more transparency for victims’ families ever since.

◊◊◊

The front cover of the NTSB’s 340-page final report on American Eagle flight 4184. (NTSB)

On July 9, 1996, the NTSB released its final report, which determined that the probable cause of the accident was an ice-induced aileron hinge moment reversal, which was made possible by ATR’s failure to disclose known information about adverse handling characteristics, the DGAC’s inadequate oversight of the ATR’s continuing airworthiness, and the DGAC’s failure to inform the FAA about airworthiness-related information derived from previous incidents. The FAA’s inadequate oversight of the continuing airworthiness of the ATR and the FAA’s failure to expand the Appendix C certification requirements were listed as contributing factors.

Although the BEA’s lengthy response was appended to the report, almost none of the BEA’s suggested revisions were adopted. This was not for lack of effort on their part, as the BEA and ATR apparently went so far as to contact the White House in an attempt to get the findings changed. Stephen Frederick wrote that they were unsuccessful largely because of the stubbornness of lead NTSB investigator Greg Feith, who stood by his conclusions under immense pressure. Feith himself credits NTSB chairman Jim Hall.

Nevertheless, the BEA, ATR, and the DGAC continued their efforts by submitting a petition for reconsideration by the Board. In 2002, the NTSB accepted some elements of their petition and slightly watered down the original findings. Whereas the original probable cause stated that ATR “fail[ed] to disclose previously known effects,” the revised version cited ATR’s “inadequate response to the continued occurrence of ATR 42 roll/icing upsets,” and the role of ATR was demoted to a contributing factor. The related finding was also changed in the following manner:

Original finding: “Prior to the Roselawn accident, ATR recognized the reason for the aileron behavior in the previous incidents and determined that ice accumulation behind the deice boots, at an [angle of attack] sufficient to cause an airflow separation, would cause the ailerons to become unstable. Therefore, ATR had sufficient basis to modify the airplane and/or provide operators and pilots with adequate, detailed information regarding this phenomenon.”

Amended finding: “Before the Roselawn accident, previous incidents demonstrated that ice accumulation behind the deice boots, at an [angle of attack] sufficient to cause an airflow separation, would cause the ailerons to become unstable. Therefore, it would have been prudent for ATR to examine the combinations of icing conditions and airplane configurations that could produce the performance, stability, and control characteristics (including aileron hinge moment shifts) exhibited in the prior incidents, and the possible repercussions of such aileron hinge moment shifts.”

These changes appeared to remove any suggestion that ATR withheld knowledge that an aileron hinge moment reversal could happen at low AOA, instead proposing that ATR could have identified the risk had they put in more effort.

In addition, the board demoted the role of the French DGAC from probable cause to a contributing factor, and removed several sentences that were based on statements ATR employees made off the record. However, the majority of the BEA’s requests were rejected, including every proposed BEA finding related to pilot performance.

In my opinion, the revised findings don’t diminish the significance of the NTSB’s conclusions, nor are they particularly charitable to the BEA’s sometimes dubious arguments. In the end, it is the NTSB’s version of events that has stood the test of time.

◊◊◊

Part 10: Cursed, or Just Unlucky?

If you’ve made it this far, you might be starting to look askance at any ATRs in your vicinity. So are the airplanes safe? Should I rebook my flight if I see an ATR lurking outside my gate? After the August 9th, 2024 crash of an ATR 72 near São Paulo, Brazil during suspected icing conditions, these questions are again being asked. In this final chapter, I will do my best to answer them.

On the one hand, despite everything, the design of the ATR’s ailerons is mostly the same as it was in 1994. If airflow separates over one aileron and not the other, hinge moment reversal can theoretically still occur. However, exhaustive testing failed to reveal any ice shapes other than a sharp ridge at 8–10% of chord that would cause hinge moment reversal prior to stall, nor has such a reversal occurred in the three decades since. Furthermore, the mechanism that transfers pilot commands from the cable into the aileron was changed from a servo tab to a spring tab. According to ATR pilot Magnar Nordal, who I asked about this, the spring tabs were introduced with the release of the updated ATR 72–500/600 in 1997, and they help ensure consistent control forces regardless of airspeed. (By the way, check out his channel!) This apparently makes the plane easier to fly in severe icing and should theoretically make it easier to recover from a hinge moment reversal.

Several accidents and incidents involving the ATR in icing conditions have nevertheless occurred during the last 30 years. For instance, in 2002 a TransAsia Airways ATR 72 crashed into the sea off Taiwan during a cargo flight, killing both pilots, after encountering severe icing conditions. The flight crew observed the severe icing but didn’t apply full power, didn’t maintain a high airspeed, and didn’t descend to warmer temperatures. As a result, their airspeed bled away until the aircraft stalled, banked sharply to the left, and nosed over into a nearly vertical dive. The airplane accelerated beyond its maximum operating speed, vertical acceleration surpassed 4.0 G, and the flight data recorder stopped at an altitude of 500 feet above the water.

B-22708, the aircraft involved in the 2002 TransAsia Airways accident. (Dennis HKG)

The Taiwanese accident report includes a table showing aileron deflection as a function of time. According to this table, the ailerons deflected to nearly full left-wing-down (-13.7˚) for several seconds during the stall, which was determined by the investigation to be an aileron hinge moment reversal. However, the Taiwanese investigators concluded that the aircraft would have rolled to the left anyway due to the asymmetric stall that was already occurring.

This table from the TransAsia Airways final report appears to show evidence of a hinge moment reversal. (Taiwan TSB)

At least three other fatal ATR accidents have occurred in icing conditions since the TransAsia crash. Two of these, UTAir flight 120 (Russia, 2012, 33 fatalities) and West Wind Aviation flight 282 (Canada, 2017, 1 fatality), were related to improper ground de-icing of the aircraft leading to a stall immediately after takeoff, which is an altogether different issue.

The third accident involved Aero Caribbean flight 883, an ATR 72 that crashed in Cuba in 2010, killing all 68 people on board. The final report on this accident is not available, but according to Magnar Nordal, it was yet another case of pilots encountering severe icing, failing to take evasive action, and allowing the airspeed to decay to a stall. There is no evidence that aileron hinge moment reversal occurred during this accident.

As for the most recent such accident, the August 9th, 2024 Voepass Linhas Aéreas crash in Brazil, it’s too early to draw any conclusions. Although severe icing was forecast in the area, no official information has been released at the time of this writing that would confirm whether the flight actually encountered such conditions before it apparently stalled and fell from the sky. However, as a rule of thumb, aileron hinge moment reversal tends to turn a stall into an inverted, high-speed dive, as was seen in Roselawn and possibly Taiwan, rather than the slow flat spin observed in Brazil.

This section of the article might be updated in the coming weeks depending on the information contained in the anticipated preliminary report on the Voepass crash.

Update: On September 6th, 2024, CENIPA released its preliminary report on the Voepass accident. The following were among the preliminary findings:

1. The aircraft encountered severe icing.

2. Earlier in the flight, an airframe deicing fault was registered and seen by the crew; however, they later turned the system back on and flew into icing conditions, contrary to the associated abnormal procedure.

3. As the aircraft began to accumulate ice, the airspeed decreased, and the “cruise speed low,” “degrade performance,” and “increase speed” alerts were successively triggered. The crew did not react to these alerts although the first officer did mention that there was a lot of ice on the airplane.

4. Shortly after the “increase speed” alert, the stick shaker activated and the aircraft stalled at a speed of 169 knots.

5. During the stall, the aircraft rolled 52 degrees left, switched to 94 degrees right, then spun counterclockwise and entered a flat spin.

The remains of Voepass Linhas Aéreas flight 2283, which crashed on August 9th, 2024, while I was working on this article. (Unknown author)

During research for this article, I also examined two non-fatal incidents in Norway in 2005 and 2016. In the 2005 incident, a Coast Air ATR 42 was climbing amid forecast moderate icing when the pilots experienced a reduction in climb performance, followed by a 45-degree roll to the right, then a similar roll to the left. However, the climb was subsequently stabilized and the flight continued to its destination. The Norwegian investigation found that the conditions became severe, but the pilots didn’t recognize the severity, allowing their speed to drop and the angle of attack to increase. Investigators believed the aircraft experienced a stall possibly accompanied by an aileron hinge moment reversal.

A second, similar incident followed in 2016 involving a Jet Time AS ATR 42, in which the aircraft stalled and banked sharply after an encounter with severe icing. In that case, Norwegian investigators identified the upset as an asymmetric stall and ruled out hinge moment reversal as a factor because no abnormal aileron behavior was observed.

In summary, then, aileron hinge moment reversals have probably happened a couple of times since the Roselawn crash, but only in connection with a stall in icing conditions. The significance of the phenomenon in such cases is debatable because an aircraft experiencing a stall will often enter an uncontrolled roll anyway.

So, with all this information in mind, is there an unsafe trend involving the ATR? And if so, what is causing it?

The fact that the ATR is at least a little bit sensitive in icing conditions, relative to some arbitrary baseline, is widely known to ATR pilots. There isn’t one reason why this is the case. The high-performance wing doubtless contributes, as do the wing’s abrupt stall characteristics and probably the small size of the ailerons. Prior to Roselawn, the small size of the deicing boots also likely contributed. Stephen Frederick has suggested, without evidence, that the composite materials used in the wing might be more susceptible to ice accretion because they have poorer heat transfer than aluminum and might get colder in flight. But most high-performance turboprops, such as the de Havilland Canada DHC-8, share most of these characteristics. So is the ATR uniquely vulnerable? As far as I can tell, the answer is… maybe only a little.

All high-performance turboprops with unpowered controls share a certain vulnerability, but so do other types of aircraft. For example, swept-wing, rear-engine jets without leading edge slats, such as the Douglas DC-9 and Fokker F28/F100, are notoriously vulnerable to trace amounts of ice on the wings during takeoff. At least five fatal accidents have occurred involving the Fokker F28 and its stretched successor, the Fokker 100, due to this vulnerability. In all cases, the aircraft was improperly deiced before departure, stalled on takeoff prior to stick shaker activation, lost roll stability, and crashed. In fact, this category of jets tends to roll wildly during a stall even without ice contamination, and this behavior has been observed in multiple accidents involving the McDonnell Douglas MD-80 and the Canadair CRJ series as well. Even jets with wing-mounted engines may experience large roll excursions during a stall if the pilot doesn’t hold the wings steady, although rear-engine jets will do so more readily because the weight of their engines is closer to the center of the roll axis.

The wreckage of Palair Macedonian Airlines flight 301, a Fokker 100 that crashed in Skopje, Macedonia in 1993 due to an ice-induced stall on takeoff and loss of control. (Rune Lind)

Because aileron hinge moment reversal is theoretically possible on any airplane with unpowered ailerons, I also looked for accidents involving other medium to large turboprops in which aileron hinge moment reversal may have occurred. Although my search was not exhaustive, I did find two cases.

First, in the final report on Sol Líneas Aéreas flight 5428, a Saab 340 turboprop which crashed in Argentina in 2011 due to a stall and loss of control in severe icing conditions, investigators wrote:

“The crew were not able to level the wings and regain [control]. This was probably due to the accumulation of ice on the surface of the wing (the area in front of the ailerons), which led to behaviour such as erratic, uncommanded rolls….”

This language appears to suggest that ice on the wings in front of the ailerons modified the hinge moment in a manner that might have interfered with the pilots’ ability to recover.

The second case was highlighted by the BEA, via the MAK, the agency that investigates accidents in most of the former Soviet Union. MAK files revealed that in 1971, an Antonov An-12 turboprop crashed in the USSR after encountering icing that induced an aileron hinge moment reversal. According to the accident narrative, the ailerons suddenly deflected to the left limit; the pilots countered and managed to return the control wheel to neutral with considerable force but were unable to keep it there. The ailerons subsequently moved past neutral, at which point the hinge moment reversed to the opposite extreme, and the ailerons slammed to full right-wing-down. The aircraft lost control and crashed, killing all 7 people on board. The available narrative does not mention the severity of the icing or the angle of attack at which the hinge moment reversal occurred.

The wreckage of Sol Líneas Aéreas flight 5428. (Diario Rio Negro)

These two incidents may not represent the full breadth of past aileron hinge moment reversals in non-ATR aircraft. It’s difficult to objectively determine the extent of the phenomenon because the crash of American Eagle flight 4184 led to the publicization of a large amount of information about ATR incidents that is not available for other aircraft types. For its part, ATR did claim that its engineers managed to replicate a less severe hinge moment reversal using ice ridges on the Saab 340 and Fokker F27.

Another way to look at the relative safety of the ATR is through the lens of accident rates. In total, there have been six fatal accidents involving the ATR 42 and 72 related to icing, not including the Voepass crash, which is more than any other comparable turboprop. But then again, more ATRs have been built than Saab 340s, or de Havilland Canada DHC-8s, or in fact anything else in the same category. If 500 Saab 340s and 2000s have been built, versus 1,700 ATR 42s and 72s, and the Saab has had one fatal icing accident and the ATR has had six, does that make the Saab safer in ice or is the outcome just random? With only one data point in Saab’s column, it’s not really scientifically possible to say.
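To put a rough number on that small-sample problem, here is a minimal back-of-the-envelope sketch in Python. It rests on assumptions that are mine and not part of any official analysis: the number of aircraft built stands in for exposure (it ignores flight hours, years in service, and how often each fleet actually meets icing), the accidents are treated as independent, and both types are supposed to share the same underlying accident rate per airframe. Under those assumptions, each of the seven accidents falls in the ATR column with a probability equal to the ATR’s share of the combined fleet. The function and variable names are purely illustrative.

```python
from math import comb

# Approximate figures quoted in the text above.
ATR_BUILT = 1700
SAAB_BUILT = 500
ATR_ACCIDENTS = 6
SAAB_ACCIDENTS = 1


def binomial_tail(k_min: int, n: int, p: float) -> float:
    """Probability of at least k_min successes in n independent trials,
    each with success probability p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# ASSUMPTION: fleet size is a crude proxy for exposure to icing risk.
total_accidents = ATR_ACCIDENTS + SAAB_ACCIDENTS
atr_share = ATR_BUILT / (ATR_BUILT + SAAB_BUILT)

# If both types were equally safe per airframe, this is how many of the
# accidents we would expect to land in the ATR column.
expected_atr = total_accidents * atr_share

# Chance of seeing at least as many ATR accidents as observed, purely by luck.
p_at_least_observed = binomial_tail(ATR_ACCIDENTS, total_accidents, atr_share)

if __name__ == "__main__":
    print(f"ATR share of combined fleet: {atr_share:.3f}")
    print(f"Expected ATR accidents if equally safe: {expected_atr:.2f}")
    print(f"P(at least {ATR_ACCIDENTS} ATR accidents by chance): {p_at_least_observed:.2f}")
```

With these inputs, the expected ATR count under equal rates comes out a little above five, and the chance of seeing six or more is about one in two. In other words, a six-to-one split is close to what fleet size alone would predict, so the raw counts can't separate "the ATR is worse in ice" from "the ATR is simply more numerous."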

It is definitely possible to say that, qualitatively, the ATR appears to be more severely affected. But then we’re back to the question of why, and it’s still not obvious what it is about the ATR that’s different from other comparable turboprops. And what if we acknowledge that the six ATR icing accidents have three different causes? Three of them were caused by stalls following a prolonged loss of speed in icing conditions. Two were caused by ground icing leading to a stall on takeoff, which can bring down (and has brought down) a wide range of aircraft. And the last one was Roselawn. So is there even one factor behind all these accidents, or is it all just a big coincidence?

One possibility is that the ATR is simply more popular among shoestring airlines with shakier training and lax company cultures. Out of all the large turboprops on the market today, the ATR is the cheapest to buy and the most cost-effective per seat, making it attractive for airlines that have limited cash. Many of these airlines are located in parts of the world where severe icing is less common, and their pilots are not always aware of the magnitude of the danger. A lack of respect for a multifaceted hazard could be a common factor behind multiple icing accidents with seemingly disparate causes. This hypothesis is supported by the fact that many ATRs routinely operate in cold environments without incident. Finnair, for instance, has operated ATRs practically since they were introduced and has never had a serious incident. Granted, temperatures in places like Finland are frequently too cold for ice to stick to an aircraft, but icing conditions are still relatively common there and the lack of incidents is probably because Finnish pilots are exposed often enough to be sufficiently afraid.

In an ideal world, no pilot or passenger should have to fear for their safety in icing conditions. The recent expansion of certification requirements to include freezing rain is one step toward that ideal. But for the foreseeable future, Mother Nature will continue to hold the upper hand. Some aircraft might withstand her onslaught slightly better than others, but her wrath ultimately spares no one.

◊◊◊

A memorial near the crash site honors passenger Patty Henry and her 4-year-old son Patrick, who lost their lives on flight 4184. (Chicago Tribune)

The legacy of American Eagle flight 4184 is not free of controversy, and probably never will be. The answers to some of the questions posed in this article about the safety of the ATR will still depend on who you ask, and although I’ve tried to approach every argument with an open mind, I obviously have my own opinions that not everyone will share. Readers are free to draw conclusions that differ from mine, but I hope this article provides a solid informational basis regardless.

At the very least we can say, all controversies aside, that the tragedy at Roselawn was caused by a failure of imagination, a failure to ask, “what would happen if things were a little bit different?” The pieces of the puzzle were there, and ATR even assembled some of them, but whether due to complacency, arrogance, or disinterest, no one ever quite went far enough. The miracle of flight does not forgive these qualities — not in pilots, not in manufacturers, and not in regulators. One cannot observe an anomaly, no matter how seemingly minor, and leave it unexplained simply because it didn’t matter this time — because next time, it might, and then it will be too late. And because of that careless indifference, flight 4184 rode headlong into the valley of death, and 68 souls were lost, having been ripped from the sky as though by an invisible hand, and yet it was not the hand of god, but the hands of real people who possessed the power to change some small thing for the better, and did not do so. Now it’s up to those who have come after them to ensure it never happens again.

_______________________________________________________________

Thank you for reading! Researching and writing this exhaustive article consumed a month and a half of my life. Please consider supporting me on Patreon (linked below) if you want to give something back. Supporters also gain access to a private Discord server where they can hang out with me.

_______________________________________________________________

Don’t forget to listen to Controlled Pod Into Terrain, my podcast (with slides!), where I discuss aerospace disasters with my cohosts Ariadne and J! Check out our channel here, and listen to our latest episode on the fiery demise of Swissair flight 111. Alternatively, download audio-only versions via RSS.com, or look us up on Spotify!


Admiral Cloudberg

Kyra Dempsey, analyzer of plane crashes. @Admiral_Cloudberg on Reddit, @KyraCloudy on Twitter and Bluesky. Email inquiries -> kyracloudy97@gmail.com.