Posts Tagged ‘safety’

Roots of Reliability-Centered Maintenance

Tuesday, February 11th, 2014

Last month, I discussed the pioneering WWII-era work of the eminent British scientist C.H. Waddington, who discovered that the scheduled preventive maintenance (PM) being performed on RAF B-24 bombers was actually doing more harm than good, and that drastically cutting back on such PM resulted in spectacular improvement in dispatch reliability of those aircraft. Two decades later, a pair of brilliant American engineers at United Airlines—Stan Nowlan and Howard Heap—independently rediscovered the utter wrongheadedness of traditional scheduled PM, and took things to the next level by formulating a rigorous engineering methodology for creating an optimal maintenance program to maximize safety and dispatch reliability while minimizing cost and downtime. Their approach became known as “Reliability-Centered Maintenance” (RCM), and revolutionized the way maintenance is done in the airline industry, military aviation, high-end bizjets, space flight, and numerous non-aviation applications from nuclear power plants to auto factories.

RCM wear-out curve

The traditional approach to PM assumes that most components start out reliable, and then at some point start becoming unreliable as they age.

The “useful life” fallacy

Nowlan and Heap showed the fallacy of two fundamental principles underlying traditional scheduled PM:

  • Components start off being reliable, but their reliability deteriorates with age.
  • The useful life of components can be established statistically, so components can be retired or overhauled before they fail.

It turns out that both of these principles are wrong. To quote Nowlan and Heap:

“One of the underlying assumptions of maintenance theory has always been that there is a fundamental cause-and-effect relationship between scheduled maintenance and operating reliability. This assumption was based on the intuitive belief that because mechanical parts wear out, the reliability of any equipment is directly related to operating age. It therefore followed that the more frequently equipment was overhauled, the better protected it was against the likelihood of failure. The only problem was in determining what age limit was necessary to assure reliable operation.

“In the case of aircraft it was also commonly assumed that all reliability problems were directly related to operating safety. Over the years, however, it was found that many types of failures could not be prevented no matter how intensive the maintenance activities. [Aircraft] designers were able to cope with this problem, not by preventing failures, but by preventing such failures from affecting safety. In most aircraft essential functions are protected by redundancy features which ensure that, in the event of a failure, the necessary function will still be available from some other source.

RCM six curves

RCM researchers found that only 2% of aircraft components have failures that are predominantly age-related (curve B), and that 68% have failures that are primarily infant mortality (curve F).

“Despite the time-honored belief that reliability was directly related to the intervals between scheduled overhauls, searching studies based on actuarial analysis of failure data suggested that the traditional hard-time policies were, apart from their expense, ineffective in controlling failure rates. This was not because the intervals were not short enough, and surely not because the tear down inspections were not sufficiently thorough. Rather, it was because, contrary to expectations, for many items the likelihood of failure did not in fact increase with increasing age. Consequently a maintenance policy based exclusively on some maximum operating age would, no matter what the age limit, have little or no effect on the failure rate.”

[F. Stanley Nowlan and Howard F. Heap, “Reliability-Centered Maintenance” 1978, DoD Report Number AD-A066579.]
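The argument in that passage can be illustrated with a standard reliability model that the quoted text does not itself mention: the Weibull hazard function. What follows is only a minimal sketch. The shape values, the scale, and the two ages compared are assumptions chosen for illustration, not figures from the Nowlan and Heap report.

```python
def weibull_hazard(age, shape, scale=1000.0):
    """Instantaneous failure rate h(t) = (k/s) * (t/s)**(k-1) for a Weibull model."""
    return (shape / scale) * (age / scale) ** (shape - 1)

# Illustrative shape values (assumptions, not data from the RCM report):
#   shape < 1  -> failure rate falls with age (infant mortality)
#   shape == 1 -> failure rate is constant (random failures)
#   shape > 1  -> failure rate rises with age (wear-out)
PATTERNS = {"infant mortality": 0.5, "random": 1.0, "wear-out": 3.0}

def hazard_ratio_old_vs_new(shape, new_age=50.0, old_age=1500.0):
    """Failure rate of an old part divided by that of a recently installed one."""
    return weibull_hazard(old_age, shape) / weibull_hazard(new_age, shape)

for name, shape in PATTERNS.items():
    ratio = hazard_ratio_old_vs_new(shape)
    print(f"{name:17s} old/new failure-rate ratio: {ratio:.2f}")
```

Under these assumptions, only the wear-out pattern gives a ratio above 1, so only there does swapping an old part for a new one lower the failure rate. For the constant pattern the swap changes nothing, and for the infant-mortality pattern it makes things worse, which is the sense in which an age limit has "little or no effect."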

Winning the war by picking our battles

Another traditional maintenance fallacy was the intuitive notion that aircraft component failures are dangerous and need to be prevented through PM. A major focus of RCM was to identify the ways that various components fail, and then evaluate the frequency and consequences of those failures. This is known as “Failure Modes and Effects Analysis” (FMEA). Researchers found that while certain failure modes have serious consequences that can compromise safety (e.g., a cracked wing spar), the overwhelming majority of component failures have no safety impact and have consequences that are quite acceptable (e.g., a failed #2 comm radio or #3 hydraulic pump). Under the RCM philosophy, it makes no sense whatsoever to perform PM on components whose failure has acceptable consequences; the optimal maintenance approach for such components is simply to leave them alone, wait until they fail, and then replace or repair them when they do. This strategy is known as “run to failure” and is a major tenet of RCM.

A maintenance revolution…

Jet airliner

The 747, DC-10 and L-1011 were the first airliners that had RCM-based maintenance programs.

As a direct result of this research, airline maintenance practices changed radically. RCM-inspired maintenance programs were developed for the Boeing 747, Douglas DC-10 and Lockheed L-1011, and for all subsequent airliners. The contrast with the traditional (pre-RCM) maintenance programs for the Boeing 707 and 727 and Douglas DC-8 was astonishing. The vast majority of component TBOs and life limits were abandoned in favor of an on-condition approach based on monitoring the actual condition of engines and other components and keeping them in service until their condition demonstrably deteriorated to an unacceptable degree. For example, the DC-8 had 339 components with TBOs or life limits, whereas the DC-10 had only seven, and none of them were engines. (Research showed clearly that overhauling engines at a specific TBO didn’t make them safer, and actually did the opposite.) In addition, the amount of scheduled maintenance was drastically reduced. For example, the DC-8 maintenance program required 4,000,000 labor hours of major structural inspections during the aircraft’s first 20,000 hours in service, while the 747 maintenance program called for only 66,000 labor hours, a reduction of nearly two orders of magnitude.
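For anyone who wants to check the "nearly two orders of magnitude" claim, here is a quick sketch of the arithmetic using only the two labor-hour figures quoted above. The variable names are mine.

```python
import math

dc8_hours = 4_000_000   # structural-inspection labor hours quoted for the DC-8
b747_hours = 66_000     # structural-inspection labor hours quoted for the 747

ratio = dc8_hours / b747_hours
orders_of_magnitude = math.log10(ratio)

print(f"reduction factor: {ratio:.1f}x")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

The reduction works out to roughly 60-fold, or a bit under 1.8 orders of magnitude.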

Greybeard AMTs.

Owner-flown GA, particularly piston GA, is the only remaining segment of aviation that does things the bad old-fashioned way.

Of course, these changes saved the airlines a king’s ransom in reduced maintenance costs and scheduled downtime. At the same time, the airplanes had far fewer maintenance squawks and much better dispatch reliability. (This was the same phenomenon that the RAF experienced during WWII when they followed Waddington’s advice to slash scheduled PM.)

…that hasn’t yet reached piston GA

Today, there’s only one segment of aviation that has NOT adopted the enlightened RCM approach to maintenance, and still does scheduled PM the bad old-fashioned way. Sadly, that segment is owner-flown GA—particularly piston GA—at the bottom of the aviation food chain where a lot of us hang out. I’ll offer some thoughts about that next month.

The Waddington Effect

Tuesday, January 14th, 2014
Conrad Hal (C.H.) Waddington

C.H. Waddington (1905-1975)

In 1943, a British scientist named Conrad Hal (C.H.) Waddington made a remarkable discovery about aircraft maintenance.  He was a most unlikely person to make this discovery, because he wasn’t an aeronautical engineer or an aircraft mechanic or even a pilot.  Actually, he was a gifted developmental biologist, paleontologist, geneticist, embryologist, philosopher, poet and painter who wasn’t particularly interested in aviation.  But like many other British scientists at that time, his career was interrupted by the outbreak of the Second World War and he found himself pressed into service with the Royal Air Force (RAF).

Waddington wound up reporting to the RAF Coastal Command, heading up a group of fellow scientists in the Coastal Command Operational Research Section.  Its job was to advise the British military on how it could more effectively combat the threat from German submarines.  In that capacity, Waddington and his colleagues developed a series of astonishing recommendations that defied military conventional wisdom of the time.

For example, the bombers used to hunt and kill U-boats were mostly painted black in order to make them difficult to see.  But Waddington’s group ran a series of experiments that proved that bombers painted white were not spotted by the U-boats until they were 20% closer, resulting in a 30% increase in successful sinkings. Waddington’s group also recommended that the depth charges dropped by the bombers be set to explode at a depth of 25 feet instead of 100 feet.  This recommendation—initially resisted strongly by RAF commanders—ultimately resulted in a sevenfold increase in the number of U-boats destroyed.

Consolidated B-24 “Liberator” bomber

Waddington subsequently turned his attention to the problem of “force readiness” of the bombers.  The Coastal Command’s B-24 “Liberator” bombers were spending an inordinate amount of time in the maintenance shop instead of hunting U-boats.  In July 1943, the two British Liberator squadrons located at Ballykelly, Northern Ireland, consisted of 40 aircraft, but at any given time only about 20 were flight-ready.  The other aircraft were down for any number of reasons, but mostly undergoing or awaiting maintenance—either scheduled or unscheduled—or waiting for replacement parts.

At that time, conventional wisdom held that if more preventive maintenance were performed on each aircraft, fewer problems would arise and more incipient problems would be caught and fixed—and thus fleet readiness would surely improve. It turned out that conventional wisdom was wrong. It would take C.H. Waddington and his Operational Research team to prove just how wrong.

Waddington and his team started gathering data about the scheduled and unscheduled maintenance of these aircraft, and began crunching and analyzing the numbers.  When he plotted the number of unscheduled aircraft repairs as a function of flight time, Waddington discovered something both unexpected and significant: The number of unscheduled repairs spiked sharply right after each aircraft underwent its regular 50-hour scheduled maintenance, and then declined steadily over time until the next scheduled 50-hour maintenance, at which time they spiked up once again.

Waddington Effect graph

When Waddington examined the plot of this repair data, he concluded that the scheduled maintenance (in Waddington’s own words) “tends to INCREASE breakdowns, and this can only be because it is doing positive harm by disturbing a relatively satisfactory state of affairs. There is no sign that the rate of breakdowns is starting to increase again after 40-50 flying hours when the aircraft is coming due for its next scheduled maintenance.” In other words, the observed pattern of unscheduled repairs demonstrated that the scheduled preventive maintenance was actually doing more harm than good, and that the 50-hour preventive maintenance interval was inappropriately short.
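The kind of tabulation behind that plot can be sketched in a few lines: count unscheduled repairs by how many flight hours had elapsed since the last scheduled service. This is only an illustration of the method. The function name, the bin width, and above all the example log are hypothetical and are not Waddington’s data.

```python
from collections import Counter

def repairs_by_hours_since_service(repair_hours, service_interval=50, bin_width=10):
    """Count unscheduled repairs in bins of flight hours since the last scheduled service.

    repair_hours: airframe flight hours at which each unscheduled repair occurred.
    Assumes scheduled services fall at exact multiples of service_interval.
    """
    counts = Counter()
    for hours in repair_hours:
        since_service = hours % service_interval
        bin_start = int(since_service // bin_width) * bin_width
        counts[bin_start] += 1
    return [(start, counts[start]) for start in range(0, service_interval, bin_width)]

# Hypothetical log, invented only to exercise the function (not historical data).
example_log = [3, 7, 12, 26, 52, 55, 58, 64, 71, 88, 101, 104, 109, 117, 133]

BIN_WIDTH = 10
for start, n in repairs_by_hours_since_service(example_log, bin_width=BIN_WIDTH):
    print(f"{start:2d}-{start + BIN_WIDTH:2d} h after service: {'#' * n}")
```

A real analysis would use each aircraft’s actual service dates rather than assuming exact 50-hour multiples. The pattern Waddington described would show up as the tallest bar in the first bin after each service.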

The solution proposed by Waddington’s team—and ultimately accepted by the RAF commanders over the howls of the maintenance personnel—was to increase the time interval between scheduled maintenance cycles, and to eliminate all preventive maintenance tasks that couldn’t be demonstrably proven to be beneficial. Once these recommendations were implemented, the number of effective flying hours of the RAF Coastal Command bomber fleet increased by 60 percent!

Fast forward two decades to the 1960s, when a pair of gifted scientists who worked for United Airlines—aeronautical engineer Stanley Nowlan and mathematician Howard Heap—independently rediscovered these principles in their pioneering research on optimizing maintenance that revolutionized the way maintenance is done in air transport, military aviation, high-end bizjets and many non-aviation industrial applications.  They were almost certainly unaware of the work of C.H. Waddington and his colleagues in Britain in the 1940s because that work remained classified until 1973, when Waddington’s meticulously-kept diary of his wartime research activities was declassified and published.

Next time, I’ll discuss the fascinating work of Nowlan and Heap on what came to be known as “Reliability Centered Maintenance.” But for now, I will leave you with the major takeaway from Waddington’s research during World War II: Maintenance isn’t an inherently good thing (like exercise); it’s a necessary evil (like surgery). We have to do it from time to time, but we sure don’t want to do more than absolutely necessary to keep our aircraft safe and reliable. Doing more maintenance than necessary actually degrades safety and reliability.

Join an Aircraft Type Club and Save Your Life

Tuesday, December 10th, 2013

Aircraft type clubs are General Aviation’s best-kept secret weapon. While there are more than a hundred of them, they fly stealthily below the radar of most pilots, who seem to be blissfully unaware of their existence and benefits. Only a fraction of pilots belong to any of them, yet they offer the best value proposition in aviation: they’re cheap and they could save your life.

No, I’m not talking about AOPA, EAA and the other large industry associations that have hundreds of thousands of members. Type clubs are smaller, usually only a few hundred or a few thousand members, and they play a very different role. While the large organizations champion industry-wide issues, type clubs are dedicated to helping owners and renters of specific aircraft makes and models.

Most type clubs offer a newsletter or magazine and many have a web site loaded with aircraft details. But no two clubs are alike; each seems to have a slightly different emphasis. For example, the Cessna Pilots Association (CPA) is focused heavily on maintenance. Each time I had a maintenance issue with the Cessna T210 I owned ten years ago, I phoned the CPA before seeing my mechanic. Invariably, their experts were able to narrow down the issue so I could point my mechanic to the specific problem that needed fixing. That saved hours of troubleshooting and lots of money.

Some clubs, like the Cirrus Owner and Pilots Association (COPA), have a strong emphasis on pilot training and safety. In addition to a very active online forum in which training and accidents are discussed in detail, they offer training at locations around the world in their weekend Cirrus Pilot Proficiency Programs (CPPP). Half of the weekend is spent in seminars on subjects like avionics and engine operation. The other half is spent in the air with a flight instructor, often factory trained, who specializes in teaching in Cirrus SR20 and SR22 aircraft.

The payoff is that the Cirrus fatal accident rate, which was originally higher than the GA fatal accident rate, has declined steadily in recent years and is now slightly lower than the overall GA fatal accident rate. Not surprisingly, COPA members have far fewer fatal Cirrus accidents than non-COPA members.

According to Rick Beach of COPA, the type club has over 3,700 members representing 2,900 Cirrus tail numbers, which is 55% of the 5,400 aircraft that have been produced. About 3,200 of the club’s members are certificated pilots, which is 40% of the total estimated 8,000 Cirrus pilots (including owners and renters).

Beach says, “In the history of the fleet, 25 COPA members were involved in the 103 fatal accidents or 24%. If Cirrus pilots were uniformly likely to be involved, then we would expect 40% to be COPA members.” Not only are COPA members about half as likely to be involved in a fatal accident, but active COPA members (those who participated in a CPPP or were active in online forums) are even less likely to have one. In the history of the fleet, 11 active COPA members were involved in fatal accidents or 11%, about one quarter of the accident rate for all Cirrus aircraft.

Beach continues “If we just look at the past 36 months, as fatal accident frequency dropped considerably, the results are more emphatic. Of the 36 fatal accidents in the past 36 months, 7 were COPA members (20%) and 3 were active COPA Members (8%) instead of 40%.”

On the flip side, COPA members are more likely to have pulled the Cirrus parachute handle and floated down to safety. “Over the lifetime of the fleet, there have been 38 CAPS [parachute] saves. Of those, 17 involved COPA members or 45%, slightly higher than our guesstimate of the proportion of COPA members in the Cirrus pilot community. In the past 36 months, there have been 16 CAPS saves. Of those, 6 involved COPA members or 38%, almost the same proportion of COPA members in the Cirrus pilot community, and certainly a higher percentage than in fatal accidents.”
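The "about half as likely" figure can be checked from the numbers Beach gives. The sketch below compares the fatal-accident rate of members with that of non-members, using the 40% membership estimate as a stand-in for exposure. That is a simplification, and the function name is my own. Active members are left out because their share of the pilot population isn’t given.

```python
def relative_accident_rate(member_accidents, total_accidents, member_share_of_pilots):
    """Accident rate of members divided by accident rate of non-members.

    Treats the share of pilots who are members as a proxy for exposure,
    which is a simplifying assumption (it ignores hours flown, for example).
    """
    member_share_of_accidents = member_accidents / total_accidents
    member_rate = member_share_of_accidents / member_share_of_pilots
    non_member_rate = (1 - member_share_of_accidents) / (1 - member_share_of_pilots)
    return member_rate / non_member_rate

MEMBER_SHARE = 0.40  # Beach's estimate: about 3,200 of 8,000 Cirrus pilots

lifetime = relative_accident_rate(25, 103, MEMBER_SHARE)  # history of the fleet
recent = relative_accident_rate(7, 36, MEMBER_SHARE)      # past 36 months

print(f"lifetime: members have {lifetime:.2f}x the non-member fatal accident rate")
print(f"recent:   members have {recent:.2f}x the non-member fatal accident rate")
```

On these assumptions the lifetime ratio comes out a little under one half, and the 36-month ratio lower still, consistent with Beach’s remark that the recent results are more emphatic.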

Lest you think COPA is unique in its safety results, look at LOBO, the Lancair Owners and Builders Organization. In 2008, the worst accident year in Lancair history, seven crashes resulted in 19 fatalities. In October 2008, LOBO was formed to address the high accident rate. In 2009, there were only four accidents with 7 fatalities and by 2010 there were only two fatalities, the lowest accident rate in ten years. Per their January 2011 newsletter, “since the inception of LOBO, there has only been one serious accident involving a LOBO member.”

Give yourself an early Christmas present: Join the type club for the aircraft you fly most frequently. But don’t just write a check; become an active participant. Whether you own or rent, you’re bound to learn more about the intricacies of that aircraft model. And if your family is lucky, what you learn as a type club member may someday save your life…and possibly their lives too.

Expectation Bias

Tuesday, December 3rd, 2013

I don’t know who first described flying as “hours of boredom punctuated by moments of sheer terror”, but it wouldn’t be shocking to discover the genesis was related to flying a long-haul jet. I was cogitating on that during a recent overnight flight to Brazil. While it was enjoyable, this red-eye brought to mind the complacency which can accompany endless hours of straight-and-level flying – especially when an autopilot is involved.

This post was halfway written when my inbox lit up with stories of a Boeing Dreamlifter (that’s a 747 modified to carry 787 fuselages) landing at the wrong airport in Wichita, Kansas. The filed destination was McConnell AFB, but the crew mistakenly landed at the smaller Jabara Airport about nine miles north. The radio exchanges between the Dreamlifter crew and the tower controller at McConnell show how disoriented the pilots were. Even five minutes after they had landed, the crew still thought they were at Cessna Aircraft Field (CEA) instead of Jabara.

McConnell AFB, the flight’s destination, is the Class D airport at the bottom of the chart, about nine miles south of the non-towered Jabara Airport.

As a pilot, by definition I live in a glass house and will therefore refrain from throwing stones. But the incident provides a good opportunity to review the perils of what’s known as “expectation bias”, the idea that we often see and hear what we expect to rather than what is actually happening.

Obviously this can be bad for any number of reasons. Expecting the gear to come down, a landing clearance to be issued, or that controller to clear you across a runway because that’s the way you’ve experienced it a thousand times before can lead to aircraft damage, landing without a clearance, a runway incursion, or worse.

I’d imagine this is particularly challenging for airline pilots, as they fly to a more limited number of airports than those of us who work for charter companies whose OpSpecs allow for worldwide operation. Flying the Gulfstream means my next destination could be literally anywhere: a tiny Midwestern airfield, an island in the middle of the Pacific, an ice runway in the Antarctic, or even someplace you’d really never expect to go. Pyongyang, anyone?

But that’s atypical for most general aviation, airline, and corporate pilots. Usually there is a familiar set of destinations for a company airplane and an established route network for Part 121 operators. Though private GA pilots can go pretty much anywhere, we tend to have our “regular” destinations, too: a favored spot for golfing, the proverbial $100 hamburger, a vacation, or that holiday visit with the family. It can take on a comfortable, been-there-done-that quality which sets us up for expectation bias. Familiarity may lead to contempt for ordinary mortals, but the consequences can be far worse for aviators.

One could make the case that the worst accident in aviation history – the Tenerife disaster – was caused, at least in part, by expectation bias. The captain of a KLM 747 expected a Pan Am jumbo jet would be clear of the runway even though he couldn’t see it due to fog. Unfortunately, the Clipper 747 had missed its turnoff. Result? Nearly six hundred dead.

“Put an airliner inside an airliner? Yeah, we can do that.” Boeing built four of these Dreamlifters to bring 787 fuselages to Seattle for final assembly. As you can imagine, this thing landing at a small airport would turn some heads.

The Dreamlifter incident brought to mind an eerily similar trip I made to Wichita a couple of years ago. It was a diminutive thirty-five mile hop from Hutchinson Municipal (HUT) to Jabara Airport (AAO) in the Gulfstream IV. We were unhurried, well-rested, and flying on a calm, cloudless day with just a bit of haze. The expectation was that we were in for a quick, easy flight.

We were cleared for the visual approach and told to change to the advisory frequency. Winds favored a left-hand pattern for runway 36. Looking out the left-hand window of the airplane revealed multiple airports, each with a single north-south runway. I knew they were there, but reviewing a chart didn’t prepare me for how easily Cessna, Beech, and Jabara airports could be mistaken for one another.

We did not land at the wrong airport, but the hair on the back of my neck went up. It was instantly clear that, like Indiana Jones, we were being presented a golden opportunity to “choose poorly”. We reverted back to basic VFR pilotage skills and carefully verified via multiple landmarks and the aircraft’s navigation display that this was, indeed, the correct airfield.

That sounds easy to do, but there’s pressure induced by the fact that this left downwind puts the airplane on a direct collision course with McConnell Air Force Base’s Class D airspace and also crosses the patterns of several other fields. In addition, Mid-Continent’s Class C airspace is nearby and vigilance is required in that direction as well. Wichita might not sound like the kind of place where a lovely VMC day would require you to bring your “A” game, but it is.

Pilots in the Southern California area have been known to mistake the former home of Top Gun, MCAS Miramar, for the smaller Montgomery Airport at the bottom of the map.

Expectation bias can be found almost anywhere. I’d bet a fair number of readers have experienced this phenomenon first-hand. In my neck of the woods, MCAS Miramar (NKX) is often mistaken for the nearby Montgomery Field (MYF). Both airports have two parallel runways and a single diagonal runway. Miramar is larger and therefore often visually acquired before Montgomery, and since it’s in the general vicinity of where an airfield of very similar configuration is expected, the pilot who trusts, but – in the words of President Reagan – does not verify, can find themselves on the receiving end of a free military escort upon arrival.

Landing safely at the wrong airport presents greater hazard to one’s certificate than to life-and-limb, but don’t let that fool you; expectation bias is always lurking and can bite hard if you let it. Stay alert, assume nothing, expect the unexpected. As the saying goes, you’re not paranoid if they really are out to get you!

A Future with More Government Shutdowns?

Monday, October 7th, 2013

As of this writing, the 2013 government shutdown, the first in 17 years, has been in effect for a week with no signs of ending. If it only continues for another week or two and doesn’t recur in the near future, the many people and organizations affected by it will give a collective sigh of relief and it will soon be forgotten. But what if government shutdowns become the new normal?

It wasn’t that long ago that filibusters in the Senate were rare, but since 2009 they’ve become routine, requiring 60 votes whereas in the past a simple majority vote was sufficient.  If government shutdowns become routine, we may be in uncharted territory.

From the important to the mundane, here’s what’s not happening at the FAA during the government shutdown:

  • The Aircraft Registry Branch is closed, so new aircraft sales have halted since the planes can’t be registered. A GAMA survey indicates that 12 deliveries were missed in the first two days, and that a total of 135 deliveries worth $1.38 billion will be missed if the shutdown lasts a couple of weeks. Interestingly, the Aircraft Registry Branch was deemed essential and left open during the shutdowns in the 1990s. Why not this time?
  • The Flight Standards Service is down from 5,000 people to fewer than 200 essential people, mostly managers. So the inspectors who provide safety oversight of maintenance and operations are mostly sidelined. Expect virtually no ramp checks, ferry permits, CFI renewals, or approval of applications, such as a new Part 135 certificate for a new charter operator. “Limited” certification work, such as on new aircraft under development, will continue according to the DOT.
  • Written knowledge tests have halted, an inconvenience for anyone who put off taking their written exam until just before a now-delayed checkride.
  • Major new initiatives are delayed. Remember Part 23 reform, which according to AOPA will “overhaul small-aircraft certification rules to double safety and cut costs in half”? Not happening right now. Development and testing of NextGen technologies is also halted. And if you’ve taken a written exam and wondered why you saw lots of questions about ADF receivers, but few on GPS, be aware that the current overhaul of knowledge tests has stopped.

Some things that are essential to protect life and property continue to be in place. That includes air traffic control facilities, the FSS services provided by Lockheed Martin and the aviationweather.gov web site (which is actually part of NOAA, not the FAA). And DOT reports that 2,490 employees from the Office of Aviation Safety will be incrementally recalled over a two-week period. FAA practical tests (checkrides) continue for now, except for those that require a ride with an FAA inspector, such as CFI checkrides in some FSDOs.

The 2013 FAA budget involved reductions of $486 million and the Fiscal Year 2014 target includes a reduction of $697 million. A future FAA with a shrinking budget is likely to take longer to implement new rules, to reduce the services it currently provides, and to outsource more of its functions. I expect it to also attempt to charge for previously free services (e.g. the $447,000 bill for ATC service at AirVenture).

So what are the near-term implications for General Aviation? For starters, people working in GA will need to start planning further ahead to minimize the impact of future government shutdowns. Some things will be easy, like encouraging flight students to take their written exams when they first start flight training. Others, like getting a new Part 135 charter certificate approved when the FAA is open, will be difficult because of backlogs.

Looking further down the road, GA should be involved in the dialog on how to restructure a changing FAA. If you have a good idea on how they can cut waste and improve efficiency, send it to the Administrator. Do you have an idea on how they could outsource a service, like the Flight Service Stations (FSS) that were outsourced to Lockheed Martin? Send them a note or a proposal.

The worst possible outcome would be if other, better-funded agencies step in to help the FAA with their mission. I can only imagine how awful GA flying would become if, for example, the TSA took primary responsibility for ramp checks. If government shutdowns ever become the new normal, many things will change. And it will be up to all of us to make sure that GA as we know it doesn’t get swept under the carpet in the process.