
Upset Recovery Training vs. Aerobatics

Tuesday, October 28th, 2014

Upset recovery training has been all the rage over the past couple of years. A Google search of that exact phrase returns more than 24,000 results. There’s a professional association dedicated to such training. ICAO even declared aircraft upsets to be the cause of “more fatalities in scheduled commercial operations than any other category of accidents over the last ten years.”

Nevertheless, I get the impression that some folks wonder if it isn’t more of a safety fad than an intrinsic imperative. It’s hard to blame them. You can hardly open a magazine or aviation newsletter these days without seeing slick advertisements for this stuff. When I was at recurrent training a couple of months ago, CAE was offering upset recovery training to corporate jet pilots there in Dallas. “If I wanted to fly aerobatics, I’d fly aerobatics!” one aviator groused.

He didn’t ask my opinion, but if he had, I’d remind him that 99% of pilots spend 99% of their time in straight and level flight — especially when the aircraft in question is a business jet. I’m not exaggerating much when I say that even your typical Skyhawk pilot is a virtual aerobat compared to the kind of flying we do on charter and corporate trips. For one thing, passengers pay the bills and they want the smoothest, most uneventful flight possible.

In addition, these jets fly at very high altitudes – typically in the mid-40s and even as high as 51,000 feet. Bank and pitch attitudes tend to stay within a narrow band. Yaw? There shouldn’t be any. The ball stays centered, period. We aim for a level of smoothness that exceeds even that of the airlines. Passengers and catering may move about the cabin frequently during a flight, but it shouldn’t be because of anything we’re doing up front.

Fly like that for a decade or two, logging thousands and thousands of uneventful, straight-and-level hours and the thought of all-attitude flying can become – to put it mildly – uncomfortable. I’ve even seen former fighter pilots become squeamish at the thought of high bank or pitch angles after twenty years of bizjet flying.

Unfortunately, there are a wide variety of things that can land a pilot in a thoroughly dangerous attitude: wind shear, wake turbulence, autopilot failure, mechanical malfunction (hydraulic hard-overs, asymmetric spoiler or flap deployment, etc.), inattention, and last but not least, plain old pilot error. Look at recent high-profile accidents and you’ll see some surprisingly basic flying blunders from the crew. Air France 447, Colgan 3407, and Asiana 214 are just three such examples. It may not happen often, but when it does it can bite hard.

So yes, I think there is a strong need for more manual flying exposure in general, and upset recovery training in particular. This isn’t specific to jet aircraft, because some light aircraft have surpassed their turbine-powered cousins in the avionics department. I only wish the 1980s-era FMS computer in my Gulfstream was as speedy as a modern G1000 installation.

Defining the Problem

To the best of my knowledge, neither the NTSB nor the FAA provides a standard definition for “upset”, but much like Supreme Court Justice Potter Stewart, we pretty much know it when we see it. The term has generally come to be defined as a flight path or aircraft attitude deviating significantly from that which was intended by the pilot. Upsets have led to loss of control, aircraft damage or destruction, and more than a few fatalities.

As automation proliferates, pilots receive less hands-on experience and a gradual but significant reduction in stick-and-rudder skill begins to occur. The change is a subtle one, and that’s part of what makes it so hazardous. A recent report by the FAA PARC rulemaking workgroup cites poor stick and rudder skills as the number two risk factor facing pilots today. The simple fact is that windshear, wake turbulence, and automation failures happen.

The purpose of upset recovery training is to give pilots the tools and experience necessary to recognize and prevent impending loss of control situations. As the saying goes, an ounce of prevention is worth a pound of cure, and that’s why teaching recovery strategies from the most common upset scenarios is actually a secondary (though important) goal.

What about simulators? They’ve proven to be an excellent tool in pilot training, but even the most high fidelity Level D sims fall short when it comes to deep stalls and loss of control scenarios. For one thing, stall recovery is typically initiated at the first indication of stall, so the techniques taught in the simulator may not apply to a full aerodynamic stall. Due to the incredibly complex and unpredictable nature of post-stall aerodynamics, simulators aren’t usually programmed to accurately emulate an aircraft in a deeply stalled condition. Thus the need for in-aircraft experience to supplement simulator training.

Upset Recovery vs. Aerobatics

It’s important to note that upset recovery training may involve aerobatic maneuvering, but it does not exist to teach aerobatics. Periodically over the years, discussions on the merits of this training will cause a co-worker to broach the subject of flying an aerobatic maneuver in an airplane which is not designed and built for that purpose. This happened just the other day. Typically they’ll ask me if, as an aerobatic pilot, I would ever consider performing a barrel or aileron roll in the aircraft.

I used to just give them the short answer: “no”. But over time I’ve started explaining why I think it’s such a bad idea, even for those of us who are trained to fly such maneuvers. I won’t touch on the regulations, because I think we are all familiar with those. I’m just talking about practical considerations.

Normal planes tend to have non-symmetrical airfoils which were not designed to fly aerobatics. They feature slower roll rates, lower structural integrity under high G loads, and considerably less control authority. You might have noticed that the control surfaces on aerobatic airplanes are pretty large — they are designed that way because they’re needed to get safely into and out of aerobatic maneuvers.

That’s not to say an airplane with small control surfaces like a business jet or light GA single cannot perform aerobatics without disaster striking. Clay Lacy flies an airshow sequence in his Learjet. Duane Cole flew a Bonanza. Bob Hoover used a Shrike Commander. Sean Tucker flew an acro sequence in a Columbia (now known as the Cessna TTx). However, the margins are lower, the aerobatics are far more difficult, and pilots not experienced and prepared enough for those things are much more likely to end up hurt or dead.

Sean Tucker will tell you that the Columbia may not recover from spins of more than one or two turns. Duane Cole said the Bonanza (in which he did inverted ribbon cuts) had barely enough elevator authority for the maneuver, and it required incredible strength to hold the nose up far enough for inverted level flight. Bob Hoover tailored his performance to maneuvers the Shrike could do — he’ll tell you he avoided some aerobatic maneuvers because of the airplane’s limitations.

Knowing those limitations and how to deal with them — that’s where being an experienced professional aerobatic pilot makes the difference. And I’m sure none of those guys took flying those GA airplanes upside down lightly. A lot of planning, consideration, training and practice went into their performances.

Now, consider the aircraft condition. Any negative Gs and stuff will be flying around the cabin. Dirt from the carpet. Manuals. Items from the cargo area. Floor mats. Passengers. EFBs. Drinks. Anything in the armrest or sidewall pockets. That could be a little distracting. Items could get lodged behind the rudder pedals, hit you in the head, or worse.

If the belts aren’t tight enough, your posterior will quickly separate from the seat it’s normally attached to. And I assure you, your belts are not tight enough. Getting them that way involves cinching the lap belt down until it literally hurts. How many people fly a standard or transport category aircraft that way?

Now consider that the engine is not set up for fuel and oil flow under negative Gs. Even in airplanes specifically designed for acro, the G loads move the entire engine on the engine mount. In the Decathlon you can always see the spinner move up an inch or two when pushing a few negative Gs. Who knows what that would do with the tighter clearances between the fan and engine cowl on an airplane like the Gulfstream?

Next, let’s consider trim. The jet flies around with an electric trim system which doesn’t move all that quickly. The aircraft are typically trimmed for upright flight. That trim setting works heavily against you when inverted, and might easily reach the point where even full control deflection wouldn’t be sufficient.

I could go on, but suffice it to say that the more I learn about aerobatics, the less I would want to do them in a non-aerobatic aircraft – and certainly not a swept wing jet! Sure, if performed perfectly, you might be just fine. But any unusual attitude is going to be far more difficult — if not outright impossible — to recover from.

Dang it, Tex!

Every time someone references Tex Johnston’s famous barrel roll in the Boeing 707 prototype, I can’t help but wish he hadn’t done that. Yes, it helped sell an airplane the company had staked its entire future on, but aerobatic instructors have been paying the price ever since.

Aerobatic and upset recovery training: good. Experimenting with normal category airplanes: bad. Very bad.

Carbon Monoxide, Silent Killer

Monday, October 20th, 2014

On January 17, 1997, a Piper Dakota departed Farmingdale, New York, on a planned two-hour VFR flight to Saranac Lake, New York. The pilot was experienced and instrument-rated; his 71-year-old mother, a low-time private pilot, occupied the right seat. Just over a half-hour into the flight, Boston Center got an emergency radio call from the mother, saying that the pilot (her son) had passed out.

The controller attempted a flight assist, and an Air National Guard helicopter joined up with the aircraft and participated in the talk-down attempt. Ultimately, however, the pilot’s mother also passed out.

The aircraft climbed into the clouds, apparently on autopilot, and continued to be tracked by ATC. About two hours into the flight, the airplane descended rapidly out of the clouds and crashed into the woods near Lake Winnipesaukee, New Hampshire. Both occupants died.

Toxicological tests revealed that the pilot’s blood had a CO saturation of 43%, sufficient to produce convulsions and coma, and that his mother’s was 69%.

On December 6 that same year, a physician was piloting his Piper Comanche 400 from his hometown of Hoisington, Kansas, to Topeka when he fell asleep at the controls. The airplane continued on course under autopilot control for 250 miles until it ran a tank dry and (still on autopilot) glided miraculously to a soft wings-level crash-landing in a hay field near Cairo, Missouri.

The pilot was only slightly injured, and walked to a nearby farmhouse for help. Toxicology tests on a blood sample taken from the lucky doc hours later revealed CO saturation of 27%. It was almost certainly higher at the time of the crash.

Just a few days later, a new 1997 Cessna 182S was being ferried from the Cessna factory in Independence, Kansas, to a buyer in Germany when the ferry pilot felt ill and suspected carbon monoxide poisoning. She landed successfully and examination of the muffler revealed that it had been manufactured with defective welds. Subsequent pressure tests by Cessna of new Cessna 172 and 182 mufflers in inventory revealed that 20% of them had leaky welds. The FAA issued an emergency Airworthiness Directive (AD 98-02-05) requiring muffler replacement on some 300 new Cessna 172s and 182s.

About 18 months later, the FAA issued AD 99-11-07 against brand new air-conditioned Mooney M20R Ovations when dangerous levels of CO were found in their cabins.

Sidebar: CO Primer

Not just in winter

A search of the NTSB accident database suggests that CO-related accidents and incidents occur far more frequently than most pilots believe. Counterintuitively, these aren’t confined to winter-time flying with the cabin heat on. Look at the months during which the following accidents and incidents occurred during the 15-year period from 1983 to 1997:

March 1983. The Piper PA-22-150 N1841P departed Tucumcari, N.M. After leveling at 9,600 feet, the right front seat passenger became nauseous, vomited, and fell asleep. The pilot began feeling sleepy and passed out. A 15-year-old passenger in the back seat took control of the aircraft by reaching between the seats, but the aircraft hit a fence during the emergency landing. None of the four occupants were injured. Multiple exhaust cracks and leaks were found in the muffler. The NTSB determined the probable cause of the accident to be incapacitation of the PIC from carbon monoxide poisoning. [FTW83LA156]

February 1984. The pilot of Beech Musketeer N6141N with four aboard reported that he was unsure of his position. ATC identified the aircraft and issued radar vectors toward Ocean Isle, N.C. Subsequently, a female passenger radioed that the pilot was unconscious. The aircraft crashed in a steep nose-down attitude, killing all occupants. Toxicological tests of the four victims revealed carboxyhemoglobin levels of 24%, 22%, 35% and 44%. [ATL84FA090]

November 1988. The Cessna 185 N20752 bounced several times while landing at Deadhorse, Alaska. The pilot collapsed shortly after getting out of the airplane. Blood samples taken from the pilot three hours after landing contained 22.1% carboxyhemoglobin. The left engine muffler overboard tube was broken loose from the muffler where the two are welded. The NTSB determined probable cause to be physical impairment of the pilot-in-command due to carbon monoxide poisoning. [ANC89IA019]

July 1990. While on a local flight, the homebuilt Olsen Pursuit N23GG crashed about three-tenths of a mile short of Runway 4 at Fowler, Colo. No one witnessed the crash, but post-crash investigation indicated that there was no apparent forward movement of the aircraft after its initial impact. The aircraft burned, and both occupants died. Toxicology tests of the pilot and passenger were positive for carboxyhemoglobin. [DEN90DTE04]

August 1990. About fifteen minutes into the local night flight in Cessna 150 N741MF, the aircraft crashed into Lake Michigan about one mile from the shoreline near Holland, Mich. Autopsies were negative for drowning, but toxicological tests were positive for carboxyhemoglobin, with the pilot’s blood testing at 21%. [CHI90DEM08]

July 1991. The student pilot and a passenger (!) were on a pleasure flight in Champion 7AC N3006E owned by the pilot. The aircraft was seen to turn into a valley in an area of mountainous terrain, where it subsequently collided with the ground near Burns, Ore., killing both occupants. A toxicology exam of the pilot’s blood showed a saturation of 20% carboxyhemoglobin, sufficient to cause headache, confusion, dizziness and visual disturbance. [SEA91FA156]

October 1992. The pilot of Cessna 150 N6402S was in radio contact with the control tower at Mt. Gilead, Ohio, and in a descent from 5,000 feet to 2,000 feet in preparation for landing. Radar contact was lost, and the aircraft crashed into a wooded area, seriously injuring the pilot. Toxicological tests on the pilot’s blood were positive for carbon monoxide. Examination of the left muffler revealed three cracks and progressive deterioration. The NTSB found probable cause of the accident to be pilot incapacitation due to carbon monoxide poisoning. [NYC93LA031]

April 1994. Fifteen minutes after takeoff from Long Beach, Calif., the Cessna 182 N9124G began deviating from headings, altitudes and ATC instructions. The aircraft did several 360- and 180-degree turns. The pilot reported blurred vision, headaches, nausea, labored breathing, and difficulty staying awake. The aircraft ultimately crashed in a vineyard near Kerman, Calif., and the owner/pilot was seriously injured. Post-crash inspection revealed numerous small leaks in the exhaust system. The pilot tested positive for carbon monoxide even after 11 hours of oxygen therapy. [LAX94LA184]

October 1994. A student pilot returned to Chesterfield, Mo., from a solo cross-country flight in Cessna 150 N7XC, complaining of headache, nausea, and difficulty walking. The pilot was hospitalized, and medical tests revealed elevated CO which required five and a half hours breathing 100% oxygen to reduce to normal levels. Post-flight inspection revealed a crack in an improperly repaired muffler that had been installed 18 hours earlier. [CHI95IA030]

March 1996. The pilot of Piper Cherokee 140 N95394 stated that she and her passenger became incapacitated after takeoff from Pittsburg, Kan. The airplane impacted the terrain, but the occupants were uninjured. Both were hospitalized, and toxicological tests for carbon monoxide were positive. A subsequent examination found holes in the muffler. [CHI96LA101]

August 1996. A Mankovich Revenge racer N7037J was #2 in a four-airplane ferry formation of Formula V Class racing airplanes. The #3 pilot said that the #2 pilot’s flying was erratic during the flight. The airplane crashed near Jeffersonville, Ind., killing the pilot. The results of FAA toxicology tests of the pilot’s blood revealed a 41% saturation of carboxyhemoglobin; loss of consciousness is attained at approximately 30%. Examination of the wreckage revealed that the adhesive resin that bound the rubber stripping forming the firewall lower seal was missing. The NTSB determined probable cause of the accident to be pilot incapacitation due to carbon monoxide poisoning. [CHI96FA322]

January 1997. The fatal crash of Piper Dakota N8263Y near Lake Winnipesaukee, N.H. (described previously). [IAD97FA043]

December 1997. Non-fatal crash of Piper Comanche 400 N8452P flying from Hoisington to Topeka, Kansas (described previously). [CHI98LA055]

December 1997. A new Cessna 182S was being ferried from the factory in Independence, Kan., to a buyer in Germany when the ferry pilot felt ill and suspected carbon monoxide poisoning (described previously). [Priority Letter AD 98-02-05]

Overall, deaths from unintentional carbon monoxide poisoning have dropped sharply since the mid-1970s thanks mainly to lower CO emissions from automobiles with catalytic converters (most CO deaths are motor vehicle-related) and safer heating and cooking appliances. But CO-related airplane accidents and incidents haven’t followed this trend. The ADs issued against Independence-built Cessna 172s and 182s and Mooney Ovations demonstrate that even brand new airplanes aren’t immune.

CO Checklist

Close calls

In addition to these events in the NTSB accident database where CO poisoning was clearly implicated, there were almost certainly scores of accidents, incidents, and close calls where CO was probably a factor.

In January 1999, for example, a Cessna 206 operated by the U.S. Customs Service was on a night training mission when it inexplicably crashed into Biscayne Bay a few miles off the south Florida coast. The experienced pilot survived the crash, but had no recollection of what happened. The NTSB called it simple pilot error and never mentioned CO as a possible contributing factor. However, enough carboxyhemoglobin was found in the pilot’s blood that the Customs Service suspected that CO poisoning might have been involved.

The agency purchased sensitive industrial electronic CO detectors for every single-engine Cessna in its fleet, and discovered that many of the planes had CO-in-the-cockpit problems. On-board CO detectors and CO checks during maintenance inspections have been standard operating procedure for the Customs Service ever since.

How much CO is too much?

It depends on whom you ask.

EPA calls for a health hazard alert when the outdoor concentration of CO rises above 9 parts per million (ppm) for eight hours, or above 35 ppm for one hour. OSHA originally established a maximum safe limit for exposure to CO in the workplace of 35 ppm, but later raised it to 50 ppm under pressure from industry.

The FAA requires that CO in the cabin not exceed 50 ppm during certification testing of new GA airplanes certified under FAR Part 23 (e.g. Cessna Corvallis, Cirrus SR22, Diamond DA-40). Legacy aircraft certified under older CAR 3 regs required no CO testing at all during certification.

Once certified, FAA requires no CO testing of individual aircraft by the factory, and no follow-up retesting during annual inspections. A March 2010 FAA SAIB (CE-10-19 R1) recommends checking CO levels with a hand-held electronic CO detector during ground runups at each annual and 100-hour inspection, but in my experience very few shops and mechanics do this.

UL-approved residential CO detectors are not permitted to alarm until the concentration rises to 70 ppm and stays there for four hours. (This was demanded by firefighters and utility companies to reduce the incidence of nuisance calls from homeowners.) Yet most fire departments require that firefighters put on their oxygen masks immediately when CO levels reach 25 ppm or higher.

It’s important to understand that low concentrations of CO are far more hazardous to pilots than to non-pilots. That’s because the effects of altitude hypoxia and CO poisoning are cumulative. For example, a COHb saturation of 10% (which is about what you’d get from chain-smoking cigarettes) would probably not be noticeable to someone on the ground. But at 10,000 feet, it could seriously degrade your night vision and judgment, and possibly cause a splitting headache.

After studying this hazard for many years and consulting with world-class aeromedical experts, I have come to the following conclusions:

  1. Every single-engine piston aircraft should carry a sensitive electronic CO detector.
  2. Any in-flight CO concentration above 10 ppm should be brought to the attention of an A&P for troubleshooting and resolution.
  3. Any in-flight CO concentration above 35 ppm should be grounds for going on supplemental oxygen (if available) and making a precautionary landing as soon as practicable.
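
Those conclusions boil down to a simple two-threshold rule. Here is a minimal Python sketch of it. The function name and the wording of the returned advice are made up for illustration; only the 10 ppm and 35 ppm thresholds come from the list above, and no actual detector is claimed to work this way.

```python
# Thresholds taken from the conclusions above (in-flight CO, parts per million).
TROUBLESHOOT_PPM = 10
LAND_PPM = 35


def co_action(ppm: float) -> str:
    """Map an in-flight CO reading to the action suggested in the list above.

    Hypothetical helper for illustration only.
    """
    if ppm < 0:
        raise ValueError("CO concentration cannot be negative")
    if ppm > LAND_PPM:
        return "go on oxygen if available; land as soon as practicable"
    if ppm > TROUBLESHOOT_PPM:
        return "have an A&P troubleshoot the source"
    return "no action"


if __name__ == "__main__":
    for reading in (5, 10, 20, 35, 60):
        print(f"{reading:>3} ppm: {co_action(reading)}")
```

Both thresholds are treated as "above", so a reading of exactly 10 ppm or exactly 35 ppm falls into the lower band.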

Smokers are far more vulnerable to both altitude hypoxia and CO poisoning, since they’re already in a partially poisoned state when they first get into the aircraft. Because of COHb’s long half-life, you’d do well to abstain from smoking for 8 to 12 hours prior to flight.
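
To get a feel for why the waiting period runs to many hours, you can treat COHb clearance as simple exponential decay. The sketch below does that in Python. The 5-hour half-life is my assumption for someone breathing ordinary air (commonly quoted figures run roughly 4 to 6 hours); it is not a number from this article, and real clearance varies with the individual, altitude and activity.

```python
# Assumed COHb half-life while breathing ordinary air, in hours.
# NOT a figure from the article; commonly quoted values are roughly 4 to 6 hours.
ASSUMED_HALF_LIFE_H = 5.0


def cohb_fraction_remaining(hours: float,
                            half_life_h: float = ASSUMED_HALF_LIFE_H) -> float:
    """Fraction of the starting COHb still present after `hours`.

    Simple exponential decay: remaining = 0.5 ** (hours / half_life).
    """
    if hours < 0 or half_life_h <= 0:
        raise ValueError("hours must be >= 0 and half-life must be > 0")
    return 0.5 ** (hours / half_life_h)


if __name__ == "__main__":
    for hours in (0, 4, 8, 12):
        pct = 100 * cohb_fraction_remaining(hours)
        print(f"after {hours:>2} h: about {pct:.0f}% of the starting COHb remains")
```

Under that assumed half-life, a noticeable share of the starting COHb is still on board even after 8 hours, which is consistent with the 8-to-12-hour advice.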

Choosing a CO detector

Five CO detectors (left to right): chemical spot, UL-compliant residential (Kidde), non-UL-compliant (CO Experts 2015), industrial (BW Honeywell), TSO’d panel-mounted (CO Guardian 551).

Chemical spot detectors: Stay away from those ubiquitous el-cheapo adhesive-backed cardboard chemical spot detectors that are commonly sold by pilot shops and mail-order outfits under trade names like “Dead Stop,” “Heads Up” and “Quantum Eye.” They have a very short useful life (about 30 days), and are extremely vulnerable to contamination from aromatic cleaners, solvents and other chemicals routinely used in aircraft maintenance.

These things often remain stuck on the instrument panel for years, providing a dangerous false sense of security. What’s worse, there’s no warning that the detector is outdated or has been contaminated—in some ways, that’s worse than not having a detector at all.

Even when fresh, chemical spot detectors are incapable of detecting low levels of CO. They’ll start turning color at 100 ppm, but so slowly and subtly that you’ll never notice it. For all practical purposes, you’ll get no warning until concentrations rise to the 200 to 400 ppm range, by which time you’re likely to be too impaired to notice the color change.

Residential electronic detectors: Although battery-powered residential electronic detectors are vastly superior to those worthless chemical spots, most are designed to be compliant with Underwriters Laboratories specification UL-2034 (revised 1998). This spec requires that:

(1)   The digital readout must not display any CO concentration less than 30 ppm.

(2)   The alarm will not sound until CO reaches 70 ppm and remains at or above that level for four hours.

(3)   Even at a concentration of 400 ppm, it may take as much as 15 minutes before the alarm sounds.
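
To see what those three requirements mean in a cockpit, here is a rough Python sketch of a detector that behaves exactly as summarized above. It models only the three points as I have stated them, not the text of the UL standard itself. The function names are hypothetical, and treating everything between 70 and 400 ppm as a four-hour wait is a simplifying assumption.

```python
from typing import Optional

# Figures from the three requirements summarized above.
DISPLAY_FLOOR_PPM = 30      # nothing below this may be displayed
ALARM_PPM = 70              # alarm threshold
ALARM_HOLD_MIN = 4 * 60     # must persist this long before alarming
HIGH_PPM = 400              # high-concentration case
HIGH_MAX_DELAY_MIN = 15     # alarm may still take this long at 400 ppm


def displayed_ppm(actual_ppm: float) -> Optional[int]:
    """Reading shown on the display, or None when nothing may be shown."""
    if actual_ppm < DISPLAY_FLOOR_PPM:
        return None
    return round(actual_ppm)


def worst_case_alarm_delay_min(steady_ppm: float) -> Optional[int]:
    """Worst-case minutes until the alarm at a steady concentration.

    Returns None if the alarm never sounds. Assumption: any level from
    70 up to (but not including) 400 ppm is treated as the four-hour case.
    """
    if steady_ppm < ALARM_PPM:
        return None
    if steady_ppm >= HIGH_PPM:
        return HIGH_MAX_DELAY_MIN
    return ALARM_HOLD_MIN


if __name__ == "__main__":
    for ppm in (25, 50, 70, 400):
        print(ppm, displayed_ppm(ppm), worst_case_alarm_delay_min(ppm))
```

In this model a steady 50 ppm in the cabin shows up on the display but never sounds the alarm, and 25 ppm does not show up at all.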

For aircraft use, you really want something much more sensitive and fast-acting. I like the non-UL-compliant CO Experts Model 2015 ($199 from www.aeromedix.com). It displays CO concentrations as low as 7 ppm and provides a loud audible alarm at concentrations above 25 ppm. It updates its display every 10 seconds (compared to once a minute for most residential detectors), which makes it quite useful as a “sniffer” for trying to figure out exactly where CO is entering the cabin.

Industrial electronic detectors: Industrial CO detectors cost between $400 and $1,000. A good choice for in-cockpit use is the BW Honeywell GasAlert Extreme CO ($410 from www.gassniffer.com). This unit displays CO concentrations from 0 to 1,000 ppm on its digital display and has a very loud audible alarm with dual trigger levels (35 and 200 ppm).

Purpose-built aviation electronic detectors: Tucson-based CO Guardian LLC makes a family of TSO’d panel-mount electronic CO detectors specifically designed for cockpit use. These detectors alarm at 50 ppm (after 10 minutes), or 70 ppm (after 5 minutes), and will alarm instantly if concentrations rise to 400 ppm. The digital display models ($599 and up) will show concentrations as low as 10 ppm. Available from www.coguardian.com. Obviously, panel-mount detectors cannot be used as a sniffer to locate the source of a CO leak.
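
The time-weighted alarm levels quoted for the panel-mount units can be sketched as a small rule table. This Python example is only an illustration of that idea. The function name, the evenly spaced sample format and the "at or above" comparison are my assumptions, and the manufacturer's actual algorithm may well differ.

```python
from typing import Sequence

# (threshold in ppm, minutes the reading must persist), as quoted above.
ALARM_RULES = [(400, 0), (70, 5), (50, 10)]


def alarm_triggered(readings_ppm: Sequence[float],
                    sample_interval_min: float = 1.0) -> bool:
    """Return True if any rule is met by evenly spaced readings (oldest first).

    Hypothetical helper: a rule is met when consecutive samples at or above
    its threshold span at least the required number of minutes.
    """
    for threshold, hold_min in ALARM_RULES:
        run = 0  # consecutive samples at or above this threshold
        for ppm in readings_ppm:
            run = run + 1 if ppm >= threshold else 0
            # A run of n samples spans (n - 1) sampling intervals.
            if run > 0 and (run - 1) * sample_interval_min >= hold_min:
                return True
    return False


if __name__ == "__main__":
    print(alarm_triggered([55] * 11))   # 55 ppm held for 10 minutes
    print(alarm_triggered([45] * 120))  # never reaches any threshold
    print(alarm_triggered([10, 450]))   # a single very high sample
```

Note that in this model a steady 45 ppm never alarms at all, which is worth keeping in mind next to the 35 ppm action level I suggested earlier.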

For more information…

There is an outstanding October 2009 research paper titled “Detection and Prevention of Carbon Monoxide Exposure in General Aviation Aircraft” authored by Wichita State University under sponsorship of the FAA Office of Research and Technology Development. The paper is 111 pages long, and discusses (among other things):

  • Characteristics of CO-related GA accidents
  • Evaluation of CO detectors, including specific makes and models
  • Placement of CO detectors in the cabin
  • Exhaust system maintenance and inspection

This research paper is available online at:

http://www.tc.faa.gov/its/worldpac/techrpt/ar0949.pdf

Judgment, and the Day

Monday, August 18th, 2014

It was windy yesterday, blowing hard out of the south and gusting to near 40 knots, according to the anemometer mounted on the top of the FBO building that sits midfield at our little airport tucked into the Mad River Valley, near Warren, Vermont. Weather was inbound. But for the day conditions were still high overcast, with just a few scattered, scraggly cumulus. Nothing towering. Maybe some wave action from the wind flowing over the undulating Green Mountains and White Mountains to the south and east.

Sometimes it is better to be on the ground than in the air.

Definitely some turbulence.

All that, and I wanted to fly. No, seriously, I was aching to fly. Just two days before I’d had the opportunity to get back into a Schleicher ASK-21 two-place fiberglass sailplane. A sexy ship if there ever was one, with an excellent 40:1 glide ratio and plenty of capability (even for aerobatics, if you are skilled in that realm).

Sunday’s flight with Rick Hanson (who has been with Sugarbush Soaring so long no one I know can remember the place without him and his wife, Ginny) was all about re-familiarization. I’d flown a ship just like her the year before, in Minden, Nevada. Vermont’s conditions, on that Sunday, at least, were tame compared to the way I’d gotten my butt kicked by rising thermals and developing dust devils in the high Nevada desert. This year staying behind the tow plane, even boxing its wake, was just an exercise, not a wrestling match.

Thermaling came back to me pretty quickly, too. Last year the thermals were leaning towers, tilting with the afternoon valley winds. This year, though they moved with the prevailing flow, they seemed a little wider. Finding that ball of rising air in the middle seemed easier, more intuitive. Maybe it is just that I’ve only let a year go by. Before Minden I’d had a two-year hiatus from soaring. It could be that two years is just too long, leaving me just too rusty and out of practice.

In any case, by Monday’s flight I was feeling competent. My instructors that day were John and Jen, and they were a dream to fly with (as they all have been, really). It was an excellent day for soaring, with light winds and towering cumulus streets of clouds that did not overdevelop. One expert soaring pilot riding a capable steed made his way to Stowe, Vermont, and back. And yes, someone else called (actually he had his wife call for him, hmmm…) to ask for an aero-retrieve from 40 miles east. The good news was that he’d landed at an airport.

Landing out. That’s soaring-speak for not making it back to your point of origin. An aero-retrieve means you pay the tow plane to fly to you, and then give you a tow home. Some pilots combat this problem by flying a motor glider, firing up the engine when they get to the point where they are too low to return to their home base, perhaps because they misjudged the lift conditions, or how long the lift would hold out at the end of the day. Other pilots use better judgment to make sure they get back to home base every time.

My instructors on Monday spent plenty of time helping me “see” all of the possible acceptable off-airport landing sites in the valley, and just beyond. We were high enough to see the Adirondacks looming over Lake Champlain, and hear the Québécois’ French chatter in Canada, which I could see clearly to the north with every circle as I climbed to cloud base, rolled out, pushed over for speed, and commenced to glide to the next decent thermal.

We crossed the valley practicing wing-overs, crazy-eights, stalls and steep turns, until they felt I knew all the possible quirks of the fine machine I’d chosen to master. Landings required another skill—understanding that I was much closer to the ground at flare than in my usual ride, the RV-10. That took a bit of coaching, too, but ultimately I got the visual picture and our touchdowns were smooth and on the mark. The thing about sailplanes: though you can control your trajectory to landing nicely with dive brakes, you don’t get to go around if you come up short or long. Making it back to home base from altitude is all about calculating your inertia, choosing your descent speed, setting your trajectory with your dive brakes, and making your initial pattern entry point, downwind, base, final and landing spots on speed and on altitude. Add airport traffic into the mix and you’ve got a great scenario for teaching any pilot great judgment skills.

By day’s end on Monday I’d thermaled, reviewed primary skills, proven my pattern, landing, and even emergency landing prowess, and received my sign-off for solo in the ASK-21. Tuesday’s conditions, however, were nowhere near what I’d proved myself in, and I knew it. The sailplane sat ready for me at the end of the runway, and the tow plane pilot, Steve, eyed me, waiting to know what I wanted to do. The wind was whistling through the gaps in the window frame of the not-ready-for-winter FBO. Sure, I’d flown in some gnarly winds in Minden. But not solo. In fact the last time I’d soloed a glider was in benign conditions over flat land.

“Um…no. I’m not going up today,” I said definitively.

Steve smiled. Good call.

That afternoon I hiked up a cliffside to sit on a sheltered hunk of granite that provided me a view of half the Champlain Valley. It wasn’t quite as splendid as my perch in the sailplane, but it did soothe. The clouds streamed by, harbingers of the rain that would follow. I was happy to be on terra firma, and ready to fly another day.

What Makes an Engine Airworthy?

Wednesday, July 2nd, 2014

If we’re going to disregard manufacturer’s TBO (as I have advocated in earlier blog posts), how do we assess whether a piston aircraft engine continues to be airworthy and when it’s time to do an on-condition top or major overhaul? Compression tests and oil consumption are part of the story, but a much smaller part than most owners and mechanics think.

James Robert “Bob” Moseley (1948-2011)

My late friend Bob Moseley was far too humble to call himself a guru, but he knew as much about piston aircraft engines as anyone I’ve ever met. That’s not surprising because he overhauled Continental and Lycoming engines for four decades; there’s not much about these engines that he hadn’t seen, done, and learned.

From 1993 to 1998, “Mose” (as his friends called him) worked for Continental Motors as a field technical representative. He was an airframe and powerplant mechanic (A&P) with inspection authorization (IA) and an FAA-designated airworthiness representative (DAR). He was generous to a fault when it came to sharing his expertise. In that vein, he was a frequent presenter at annual IA renewal seminars.

Which Engine Is Airworthy?

During these seminars, Mose would often challenge a roomful of hundreds of A&P/IA mechanics with a hypothetical scenario that went something like this:

Four good-looking fellows, coincidentally all named Bob, are hanging out at the local Starbucks near the airport one morning, enjoying their usual cappuccinos and biscotti. Remarkably enough, all four Bobs own identical Bonanzas, all with Continental IO-550 engines. Even more remarkable, all four engines have identical calendar times and operating hours.

While sipping their overpriced coffees, the four Bobs start comparing notes. Bob One brags that his engine only uses one quart of oil between 50-hour oil changes, and his compressions are all 75/80 or better. Bob Two says his engine uses a quart every 18 hours, and his compressions are in the low 60s. Bob Three says his engine uses a quart every 8 hours and his compressions are in the high 50s. Bob Four says his compressions are in the low 50s and he adds a quart every 4 hours.

Who has the most airworthy engine? And why?

Don’t place too much emphasis on compression test readings as a measure of engine airworthiness. An engine can have low compression readings while continuing to run smoothly and reliably and make full power to TBO and beyond. Oil consumption is an even less important factor. As long as you don’t run out of oil before you run out of fuel, you’re fine.

This invariably provoked a vigorous discussion among the IAs. One faction typically thought that Bob One’s engine was best. Another usually opined that Bobs Two and Three had the best engines, and that the ultra-low oil consumption of Bob One’s engine was indicative of insufficient upper cylinder lubrication and a likely precursor to premature cylinder wear. All the IAs agreed Bob Four’s was worst.

Mose took the position that with nothing more than the given information about compression readings and oil consumption, he considered all four engines equally airworthy. While many people think that ultra-low oil consumption may correlate with accelerated cylinder wear, Continental’s research doesn’t bear this out, and Mose knew of some engines that went to TBO with very low oil consumption all the way to the end.

While the low compressions and high oil consumption of Bob Four’s engine might suggest impending cylinder problems, Mose said that in his experience engines that exhibit a drop in compression and increase in oil consumption after several hundred hours may still make TBO without cylinder replacement. “There’s a Twin Bonanza that I take care of, one of whose engines lost compression within the first 300 hours after overhaul,” Mose once told me. “The engine is now at 900 hours and the best cylinder measures around 48/80. But the powerplant is running smooth, making full rated power, no leaks, and showing all indications of being a happy engine. It has never had a cylinder off, and I see no reason it shouldn’t make TBO.”
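To put the four Bobs' numbers on a common footing, here is a minimal Python sketch that converts each "quart every N hours" figure into a burn rate and checks the rule of thumb that you should not run out of oil before you run out of fuel. The sump capacity, minimum oil level, and fuel endurance below are illustrative assumptions, not figures from Mose or Continental; substitute the values for your own airplane.

```python
# Hours flown per quart of oil, as reported by each Bob in the scenario.
HOURS_PER_QUART = {"Bob One": 50, "Bob Two": 18, "Bob Three": 8, "Bob Four": 4}

# Assumed values for illustration only; check your own POH and engine manual.
SUMP_QUARTS = 12.0           # assumed full sump capacity
MIN_SAFE_QUARTS = 6.0        # assumed minimum oil level for safe operation
FUEL_ENDURANCE_HOURS = 5.0   # assumed fuel endurance with full tanks


def oil_endurance_hours(hours_per_quart, sump=SUMP_QUARTS, minimum=MIN_SAFE_QUARTS):
    """Hours of flight before oil drops from a full sump to the minimum level."""
    usable_quarts = sump - minimum
    return usable_quarts * hours_per_quart


def oil_outlasts_fuel(hours_per_quart, fuel_hours=FUEL_ENDURANCE_HOURS):
    """True if the usable oil lasts longer than the fuel on board."""
    return oil_endurance_hours(hours_per_quart) > fuel_hours


for name, hpq in HOURS_PER_QUART.items():
    print(f"{name}: {1 / hpq:.3f} qt/hr, "
          f"oil lasts {oil_endurance_hours(hpq):.0f} hr, "
          f"outlasts fuel: {oil_outlasts_fuel(hpq)}")
```

Under these assumed numbers, even Bob Four's quart every 4 hours leaves far more oil endurance than fuel endurance, which is the point of Mose's argument.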

Lesson of a Lawn Mower

To put these issues of compression and oil consumption in perspective, Mose liked to tell the story of an engine that was not from Continental or Lycoming but from Briggs & Stratton:

If this one-cylinder engine can perform well while using a quart of oil an hour, surely an aircraft engine with 50 times the displacement can, too.

Years ago, I had a Snapper lawn mower with an 8 horsepower Briggs on it. I purchased it used, so I don’t know anything about its prior history. But it ran good, and I used and abused it for about four years, mowing three acres of very hilly, rough ground every summer.

The fifth year I owned this mower, the engine started using oil. By the end of the summer, it was using about 1/2 quart in two hours of mowing. If I wasn’t careful, I could run out of oil before I ran out of gas, because the sump only held about a quart when full. The engine still ran great, mowed like new, although it did smoke a little each time I started it.

The sixth year, things got progressively worse, just as you might expect. By the end of the summer, it was obvious that this engine was getting really tired. It still ran okay, would pull the hills, and would mow at the same speed if the grass wasn’t too tall. But it got to the point that it was using a quart of oil every hour, and was becoming quite difficult to start. The compression during start was so low (essentially nil) that sometimes I had to spray ether into the carb to get the engine to start. It also started leaking combustion gases around the head bolts, and would blow bubbles if I sprayed soapy water on the head while it was running. In fact, the mower became somewhat useful as a fogger for controlling mosquitoes. But it still made power and would only foul its spark plug a couple of times during the season when things got really bad.

Now keep in mind that this engine was rated at just 8 horsepower and had just one cylinder with displacement roughly the size of a coffee cup, was using one quart of oil per hour, and had zilch compression. Compare that to an IO-550 with six cylinders, each with a 5.25-inch bore. Do you suppose that oil consumption of one quart per hour or compression of 40/80 would have any measurable effect on an IO-550’s power output or reliability—in other words, its airworthiness? Not likely.

In fact, Continental Motors actually ran a dynamometer test on an IO-550 whose compression ring gaps had been filed oversize to intentionally reduce compression on all cylinders to 40/80, and it made full rated power.

Let’s Use Common Sense

I really like Mose’s commonsense approach to aircraft engines. Whether we’re owners or mechanics (or both), we would do well to avoid getting preoccupied with arbitrary measurements like compression readings and oil consumption that have relatively little correlation with true airworthiness.

Instead, we should focus on the stuff that’s really important: Is the engine “making metal”? Are there any cracks in the cylinder heads or crankcase? Any exhaust leaks, fuel leaks, or serious oil leaks? Most importantly, does the engine seem to be running rough or falling short of making full rated power?

If the answer to all of those questions is no, then we can be reasonably sure that our engine is airworthy and we can fly behind it with well-deserved confidence.

On-Condition Maintenance

The smart way to deal with engine maintenance—including deciding when to overhaul—is to do it “on-condition” rather than on a fixed timetable. This means that we use all available condition-monitoring tools to monitor the engine’s health, and let the engine itself tell us when maintenance is required. This is how the airlines and military have been doing it for decades.

Digital borescope (Adrian Eichhorn)

Digital borescopes and digital engine monitors have revolutionized piston aircraft engine condition monitoring.

For our piston aircraft engines, we have a marvelous multiplicity of condition-monitoring tools at our disposal. They include:

  • Oil filter visual inspection
  • Oil filter scanning electron microscopy (SEM)
  • Spectrographic oil analysis programs (SOAP)
  • Digital engine monitor data analysis
  • Borescope inspection
  • Differential compression test
  • Visual crankcase inspection
  • Visual cylinder head inspection
  • Oil consumption trend analysis
  • Oil pressure trend analysis

If we use all these tools on an appropriately frequent basis and understand how to interpret the results, we can be confident that we know whether the engine is healthy or not—and if not, what kind of maintenance action is necessary to restore it to health.

The moment you abandon the TBO concept and decide to make your maintenance decisions on-condition, you take on an obligation to use these tools—all of them—and pay close attention to what they’re telling you. Unfortunately, many owners and mechanics don’t understand how to use these tools appropriately or to interpret the results properly.

When Is It Time to Overhaul?

It takes something pretty serious before it’s time to send the engine off to an engine shop for teardown—or to replace it with an exchange engine. Here’s a list of the sort of findings that would prompt me to recommend that “the time has come”:

Badly damaged cam lobe found during cylinder removal. “It’s time!”

  • An unacceptably large quantity of visible metal in the oil filter; unless the quantity is very large, we’ll often wait until we’ve seen metal in the filter for several shortened oil-change intervals.
  • A crankcase crack that exceeds acceptable limits, particularly if it’s leaking oil.
  • A serious oil leak (e.g., at the crankcase parting seam) that cannot be corrected without splitting the case.
  • An obviously unairworthy condition observed via direct visual inspection (e.g., a bad cam lobe observed during cylinder or lifter removal).
  • A prop strike, serious overspeed, or other similar event that clearly requires a teardown inspection in accordance with engine manufacturer’s guidance.

Avoid getting preoccupied with compression readings and oil consumption that have relatively little correlation with true airworthiness. Ignore published TBO (a thoroughly discredited concept), maintain your engine on-condition, make sure you use all the available condition-monitoring tools, make sure you know how to interpret the results (or consult with someone who does), and don’t overreact to a single bad oil report or a little metal in the filter.

Using this reliability-centered approach to engine maintenance, my Savvy team and I have helped hundreds of aircraft owners obtain the maximum useful life from their engines, saving them a great deal of money, downtime and hassle. And we haven’t had one fall out of the sky yet.

The Dark Side of Maintenance

Tuesday, June 10th, 2014

Have you ever put your airplane in the shop (perhaps for an annual inspection, a squawk, or a routine oil change) only to find when you fly it for the first time after maintenance that something that was working fine no longer does? Every aircraft owner has had this happen. I sure have.

Maintenance has a dark side that isn’t usually discussed in polite company: It sometimes breaks aircraft instead of fixing them.

When something in an aircraft fails because of something a mechanic did—or failed to do—we refer to it as a “maintenance-induced failure”…or “MIF” for short. Such MIFs occur a lot more often than anyone cares to admit.

Why do high-time engines fail?

I started thinking seriously about MIFs in 2007 while corresponding with Nathan Ulrich Ph.D. about his ground-breaking research into the causes of catastrophic piston aircraft engine failures (based on five years’ worth of NTSB accident data) that I discussed in an earlier post. Dr. Ulrich’s analysis showed conclusively that by far the highest risk of catastrophic engine failure occurs when the engine is young—during the first two years and 200 hours after it is built, rebuilt or overhauled—due to “infant-mortality failures.”

But the NTSB data was of little statistical value in analyzing the failure risk of high-time engines beyond TBO, simply because so few engines are operated past TBO; most are arbitrarily euthanized at TBO. We don’t have good data on how many engines are flying past TBO, but it’s a relatively small number. So it’s no surprise that the NTSB database contains very few accidents attributed to failures of over-TBO engines. Because there are so few, Ulrich and I decided to study all such NTSB reports for 2001 through 2005 to see if we could detect some pattern of what made these high-time engines fail. Sure enough, we did detect a pattern.

About half the reported failures of past-TBO engines stated that the reason for the engine failure could not be determined by investigators. Of the half where the cause could be determined, we found that about 80% were MIFs. In other words, those engines failed not because they were past TBO, but because mechanics worked on the engines and screwed something up!

Case in point: I received a call from an aircraft owner whose Bonanza was undergoing annual inspection. The shop convinced the owner to have his propeller and prop governor sent out for 6-year overhauls. (Had the owner asked my advice, I’d have urged him not to do this, but that’s another story for another blog post.)

The overhauled prop and governor came back from the prop shop and were reinstalled. The mechanic had trouble getting the prop to cycle properly, and he wound up removing and reinstalling the governor three times. During the third engine runup, the prop still wouldn’t cycle properly. The mechanic decided to take the airplane up on a test flight anyway (!) which resulted in an engine overspeed. The mechanic then removed the prop governor yet again and discovered that the governor drive wasn’t turning when the crankshaft was rotated.

I told the owner that I’d seen this before, and the cause was always the same: improper installation of the prop governor. If the splined drive and gears aren’t meshed properly before the governor is torqued, the camshaft gear is damaged, and the only fix is a teardown. (A couple of engine shops and a Continental tech rep all told the owner the same thing.)

This could turn out to be a $20,000 MIF. Ouch!

How often do MIFs happen?

They happen a lot. Hardly a day goes by that I don’t receive an email or a phone call from an exasperated owner complaining about some aircraft problem that is obviously a MIF.

A Cessna 182 owner emailed me that several months earlier, he’d put the plane in the shop for an oil change and installation of an STC’d exhaust fairing. A couple of months later, he decided to have a digital engine monitor installed. The new engine monitor revealed that the right bank of cylinders (#1, #3 and #5) all had very high CHTs well above 400°F. This had not shown up on the factory CHT gauge because its probe was installed on cylinder #2. (Every piston aircraft should have an engine monitor IMHO.) At the next annual inspection at a different shop, the IA found some induction airbox seals missing, apparently left off when the exhaust fairing was installed. The seals were installed and CHTs returned to normal.

Sadly, the problem wasn’t caught early enough to prevent serious heat-related damage to the right-bank cylinders. All three jugs had compressions down in the 30s with leakage past the rings, and damage to the cylinder bores was visible under the borescope. The owner was faced with replacing three cylinders, at a cost of around $6,000.

The next day, I heard from the owner of an older Cirrus SR22 complaining about intermittent heading errors on his Sandel SN3308 electronic HSI. These problems started occurring about three years earlier when the shop pulled the instrument for a scheduled 200-hour lamp replacement.

Coincidence?

I’ve seen this in my own Sandel-equipped Cessna 310, and it’s invariably due to inadequate engagement between the connectors on the back of the instrument and the mating connectors in the mounting tray. You must slide the instrument into the tray just as far as possible before tightening the clamp; otherwise, you’ve set the stage for flaky electrical problems. This poor Cirrus owner had been suffering the consequences for three years. It took five minutes to re-rack the instrument and cure the problem.

Not long after that, I got a panicked phone call from one of my managed-maintenance clients who’d departed into actual IMC in his Cessna 340 with his family on board on the first flight after some minor avionics work. (Not smart IMHO.) As he entered the clag and climbed through 3,000 feet, all three of his static instruments (airspeed, altimeter, VSI) quit cold. Switching to alternate static didn’t cure the problem. The pilot kept his cool, confessed his predicament to ATC, successfully shot an ILS back to his home airport, then called me.

The moment I heard the symptoms, I knew exactly what happened because I’d seen it before. “Take the airplane back to the avionics shop,” I told the owner, “and ask the tech to reconnect the static line that he disconnected.” A disconnected static line in a pressurized aircraft causes the static instruments to be referenced to cabin pressure. The moment the cabin pressurizes, those instruments stop working. MIF!

I know of at least three other similar incidents in pressurized singles and twins, all caused by failure of a mechanic to reconnect a disconnected static line. One resulted in a fatal accident, the others in underwear changes. The FARs require a static system leak test any time the static system is opened up, but clearly some technicians are not taking this seriously.

Why do MIFs happen?

Numerous studies indicate that three-quarters of accidents are the fault of the pilot. The remaining one-quarter are machine-caused, and those are just about evenly divided between ones caused by aircraft design flaws and ones caused by MIFs. That suggests one-eighth of accidents are maintenance-induced, a significant number.
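The one-eighth figure follows directly from the two fractions quoted above. As a quick check, here is a small Python sketch of that arithmetic; the variable names are mine, and the even split between design flaws and MIFs is treated as exactly one-half even though the text says "just about."

```python
from fractions import Fraction

# Shares quoted in the text; the 50/50 split is an approximation ("just about evenly").
pilot_caused = Fraction(3, 4)
machine_caused = 1 - pilot_caused            # the remaining one-quarter
mif_share_of_machine = Fraction(1, 2)        # assumed exactly half for this sketch

mif_caused = machine_caused * mif_share_of_machine
design_caused = machine_caused - mif_caused

print(f"Pilot-caused:        {pilot_caused}")
print(f"Design-flaw-caused:  {design_caused}")
print(f"Maintenance-induced: {mif_caused}")
```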

The lion’s share of MIFs are errors of omission. These include fasteners left uninstalled or untightened, inspection panels left loose, fuel and oil caps left off, things left disconnected (e.g., static lines), and other reassembly tasks left undone.

Distractions play a big part in many of these omissions. A mechanic installs some fasteners finger-tight, then gets a phone call or goes on lunch break and forgets to finish the job by torquing the fasteners. I have seen some of the best, most experienced mechanics I know fall victim to such seemingly rookie mistakes, and I know of several fatal accidents caused by such omissions.

Maintenance is invasive!

Whenever a mechanic takes something apart and puts it back together, there’s a risk that something won’t go back together quite right. Some procedures are more invasive than others, and invasive maintenance is especially risky.

Invasiveness is something we think about a lot in medicine. The standard treatment for gallstones used to be cholecystectomy (gall bladder removal), major abdominal surgery requiring a 5- to 8-inch incision. Recovery involved a week of hospitalization and several weeks of recovery at home. The risks were significant: My dad very nearly died as the result of complications following this procedure.

Nowadays there’s a far less invasive procedure, laparoscopic cholecystectomy, that involves three tiny incisions and is performed using a videoscope inserted through one incision and various microsurgery instruments inserted through the others. Recovery usually involves only one night in the hospital and a few days at home. The risk of complications is greatly reduced.

Similarly, some aircraft maintenance procedures are far more invasive than others. The more invasive the maintenance, the greater the risk of a MIF. When considering any maintenance task, we should always think carefully about how invasive it is, whether the benefit of performing the procedure is really worth the risk, and whether less invasive alternatives are available.

For example, I was contacted by an aircraft owner who said that he’d recently received an oil analysis report showing an alarming increase in iron. The oil filter on his Continental IO-520 showed no visible metal. The lab report suggested flying another 25 hours and then submitting another oil sample for analysis.

The owner showed the oil analysis report to his A&P, who expressed grave concern that the elevated iron might indicate that one or more cam lobes were coming apart. The mechanic suggested pulling one or two cylinders and inspecting the camshaft.

Yikes! What was this mechanic thinking? No airplane has ever fallen out of the sky because of a cam or lifter problem. Many have done so following cylinder removal, the second most invasive thing you can do to an engine. (Only teardown is more invasive.)

The owner wisely decided to seek a second opinion before authorizing this exploratory surgery. I told him the elevated iron was almost certainly NOT due to cam lobe spalling. A disintegrating cam lobe throws off fairly large steel particles or whiskers that are usually visible during oil filter inspection. The fact that the oil filter was clean suggested that the elevated iron was coming from microscopic metal particles less than 25 microns in diameter, too small to be detectable in a filter inspection, but easily detectable via oil analysis. Such tiny particles were probably coming either from light rust on the cylinder walls or from some very slow wear process.

I suggested the owner have a borescope inspection of his cylinders to see whether the bores showed evidence of rust. I also advised that no invasive procedure (like cylinder removal) should ever be undertaken solely on the basis of a single oil analysis report. The oil lab was spot-on in recommending that the aircraft be flown another 25 hours. The A&P wasn’t thinking clearly.

Even if a cam inspection was warranted, there’s a far less invasive method. Instead of a 10-hour cylinder removal, the mechanic could pull the intake and exhaust lifters, and then determine the condition of the cam by inspecting it with a borescope through the lifter boss and, if warranted, probing the cam lobe with a sharp pick. Not only would this procedure require just 15% as much labor, but the risk of a MIF would be nil.

Sometimes, less is more

Many owners believe, and many mechanics preach, that preventive maintenance is inherently a good thing, and the more of it you do the better. I consider this wrongheaded. Mechanics often do far more preventive maintenance than necessary and often do it using unnecessarily invasive procedures, thereby increasing the likelihood that their efforts will actually cause failures rather than preventing them.

Another of my earlier posts discussed Reliability-Centered Maintenance (RCM) developed at United Airlines in the late 1960s, and universally adopted by the airlines and the military during the 1970s. One of the major findings of RCM researchers was that preventive maintenance often does more harm than good, and that safety and reliability can often be improved dramatically by reducing the amount of PM and using minimally invasive techniques.

Unfortunately, this thinking doesn’t seem to have trickled down to piston GA, and is considered heresy by many GA mechanics because it contradicts everything they were taught in A&P school. The long-term solution is for GA mechanics to be trained in RCM principles, but that’s not likely to happen any time soon. In the short term, aircraft owners must think carefully before authorizing an A&P to perform invasive maintenance on their aircraft. When in doubt, get a second opinion.

The last line of defense

The most likely time for a mechanical failure to occur is the first flight after maintenance. Since the risk of such MIFs is substantial, it’s imperative that owners conduct a post-maintenance test flight (in VMC, without passengers, preferably close to the airport) before launching into the clag or putting passengers at risk. I think even the most innocuous maintenance task, even a routine oil change, deserves such a post-maintenance test flight. I do this any time I swing a wrench on my airplane.

You should, too.

Quest for a TBO-Free Engine

Tuesday, May 13th, 2014

“It just makes no sense,” Jimmy told me, the frustration evident in his voice. “It’s unfair. How can they do this?”

Jimmy Tubbs, ECi’s legendary VP of Engineering

I was on the phone with my friend Jimmy Tubbs, the legendary Vice President of Engineering for Engine Components Inc. (ECi) in San Antonio, Texas. ECi began its life in the 1940s as a cylinder electroplating firm and grew to dominate that business. Starting in the mid-1970s and accelerating in the late 1990s—largely under Jimmy’s technical stewardship—the company transformed itself into one of the two major manufacturers of new FAA/PMA engine parts for Continental, Lycoming and Pratt & Whitney engines (along with its rival Superior Air Parts).

By the mid-2000s, ECi had FAA approval to manufacture thousands of different PMA-approved engine parts, including virtually every component of four-cylinder Lycoming 320- and 360-series engines (other than the Lycoming data plate). So the company decided to take the next logical step: building complete engines. ECi’s engine program began modestly with the company offering engines in kit form for the Experimental/Amateur-Built (E-AB) market. They opened an engine-build facility where homebuilders could assemble their own ECi “Lycoming-style” engines under expert guidance and supervision. Then in 2013, with more than 1,600 kit-built engines flying, ECi began delivering fully-built engines to the E-AB market under the “Titan Engines” brand name.

Catch 22, FAA-style

A Titan engine for experimental airplanes.
What will it take to get the FAA to certify it?

Jimmy is now working on taking ECi’s Titan engine program to the next level by seeking FAA approval for these engines to be used in certificated aircraft. In theory, this ought to be relatively easy (as FAA certification efforts go) because the Titan engines are nearly identical in design to Lycoming 320 and 360 engines, and almost all the ECi-built parts are already PMA approved for use in Lycoming engines. In practice, nothing involving the FAA is as easy as it looks.

“They told me the FAA couldn’t approve an initial TBO for these engines longer than 1,000 hours,” Jimmy said to me with a sigh. He had just returned from a meeting with representatives from the FAA Aircraft Certification Office and the Engine & Propeller Directorate. “I explained that our engines are virtually identical in all critical design respects to Lycoming engines that have a 2,000-hour TBO, and that every critical part in our engines is PMA approved for use in those 2,000-hour engines.”

“But they said they could only approve a 1,000-hour TBO to begin with,” Jimmy continued, “and would consider incrementally increasing the TBO after the engines had proven themselves in the field. Problem is that nobody is going to buy one of our certified engines if it has only a 1,000-hour TBO, so the engines will never get to prove themselves. It makes no sense, Mike. It’s not reasonable. Not logical. Doesn’t seem fair.”

I certainly understood where Jimmy was coming from. But I also understood where the FAA was coming from.

A brief history of TBO

To quote a 1999 memorandum from the FAA Engine & Propeller Directorate:

The initial models of today’s horizontally opposed piston engines were certified in the late 1940s and 1950s. These engines initially entered service with recommended TBOs of 500 to 750 hours. Over the next 50 years, the designs of these engines have remained largely unchanged but the manufacturers have gradually increased their recommended TBOs for existing engine designs to intervals as long as 2,000 hours. FAA acceptance of these TBO increases was based on successful service, engineering design, and test experience. New engine designs, however, are still introduced with relatively short TBOs, in the range of 600 hours to 1,000 hours.

From the FAA’s perspective, ECi’s Titan engines are new engines, despite the fact that they are virtually clones of engines that have been flying for six decades, have a Lycoming-recommended TBO of 2,000 hours, and routinely make it to 4,000 or 5,000 hours between overhauls.

Is it any wonder we’re still flying behind engine technology designed in the ‘40s and ‘50s? If the FAA won’t grant a competitive TBO to a Lycoming clone, imagine the difficulties that would be faced by a company endeavoring to certify a new-technology engine. Catch 22.

Preparing for an engine test cell endurance run.

Incidentally, there’s a common misconception that engine TBOs are based on the results of endurance testing by the manufacturer. They aren’t. The regulations that govern certification of engines (FAR Part 33) require only that a new engine design be endurance tested for 150 hours in order to earn certification. Granted, the 150-hour endurance test is fairly brutal: About two-thirds of the 150 hours involves operating the engine at full takeoff power with CHT and oil temperature at red-line. (See FAR 33.49 for the gory details.) But once the engine survives its 150-hour endurance test, the FAA considers it good to go.

In essence, the only endurance testing for engine TBO occurs in the field. Whether we realize it or not, those of us who fly behind piston aircraft engines have been pressed into service as involuntary beta testers.

What about a TBO-free engine?

“Jimmy, this might be a bit radical,” I said, “but where exactly in FAR Part 33 does it state that a certificated engine has to have a recommended TBO?” (I didn’t know the answer, but I was sure Jimmy had Part 33 committed to memory.)

“Actually, it doesn’t,” Jimmy answered. “The only place TBO is addressed at all is in FAR 33.19, where it says that ‘engine design and construction must minimize the development of an unsafe condition of the engine between overhaul periods.’ But nowhere in Part 33 does it say that any specific overhaul interval must be prescribed.”

“So you’re saying that engine TBO is a matter of tradition rather than a requirement of regulation?”

“I suppose so,” Jimmy admitted.

“Well then how about trying to certify your Titan engines without any TBO?” I suggested. “If you could pull that off, you’d change our world, and help drag piston aircraft engine maintenance kicking and screaming into the 21st century.”

An FAA-inspired roadmap

I pointed out to Jimmy that there was already a precedent for this in FAR Part 23, the portion of the FARs that governs the certification of normal, utility, aerobatic and commuter category airplanes. In essence, Part 23 is to non-transport airplanes what Part 33 is to engines. On the subject of airframe longevity, Part 23 prescribes an approach that struck me as being also appropriate for dealing with engine longevity.

Since 1993, Part 23 has required that an applicant for an airplane Type Certificate must provide the FAA with a longevity evaluation of metallic wing, empennage and pressurized cabin structures. The applicant has the choice of three alternative methods for performing this evaluation. It’s up to the applicant to choose which of these methods to use:

  • “Safe-Life” —The applicant must define a “safe-life” (usually measured in either hours or cycles) after which the structure must be taken out of service. The safe-life is normally established by torture-testing the structure until it starts to fail, then dividing the time-to-failure by a safety factor (“scatter factor”) that is typically in the range of 3 to 5 to calculate the approved safe-life of the structure. For example, the Beech Baron 58TC wing structure has a life limit (safe-life) of 10,000 hours, after which the aircraft is grounded. This means that Beech probably had to torture-test the wing spar for at least 30,000 hours and demonstrate that it didn’t develop cracks.
  • “Fail-Safe” —The applicant must demonstrate that the structure has sufficient redundancy that it can still meet its ultimate strength requirements even after the complete failure of any one principal structural element. For example, a three-spar wing that can meet all certification requirements with any one of the three spars hacksawed in half would be considered fail-safe and would require no life limitation.
  • “Damage Tolerance” —The applicant must define a repetitive inspection program that can be shown with very high confidence to detect structural damage before catastrophic failure can occur. This inspection program must be incorporated into the Airworthiness Limitations section of the airplane’s Maintenance Manual or Instructions for Continued Airworthiness, and thereby becomes part of the aircraft’s certification basis.
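The safe-life arithmetic in the first bullet can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the scatter factors of 3 and 5 are simply the two ends of the typical range mentioned above, not figures from Beech’s actual certification data.

```python
def safe_life_hours(test_hours_to_failure: float, scatter_factor: float) -> float:
    """Approved safe-life: demonstrated test life divided by the scatter factor."""
    if scatter_factor <= 0:
        raise ValueError("scatter factor must be positive")
    return test_hours_to_failure / scatter_factor


def required_test_hours(safe_life: float, scatter_factor: float) -> float:
    """Test life that must be demonstrated to justify a given safe-life."""
    if scatter_factor <= 0:
        raise ValueError("scatter factor must be positive")
    return safe_life * scatter_factor


# A 10,000-hour life limit with an assumed scatter factor of 3 or 5:
print(required_test_hours(10_000, 3))  # 30000
print(required_test_hours(10_000, 5))  # 50000
```

With a scatter factor of 3, a 10,000-hour life limit implies at least 30,000 hours of testing, which is where the Baron estimate comes from.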

If we were to translate these Part 23 (airplane) concepts to the universe of FAR Part 33 (engines):

  • Safe-life would be the direct analog of TBO; i.e., prescribing a fixed interval between overhauls.
  • Fail-safe would probably be impractical, because an engine that included enough redundancy to meet all certification requirements despite the failure of any principal structural element (e.g., a crankcase half, cylinder head or piston) would almost surely be too heavy.
  • Damage tolerance would be the direct analog of overhauling the engine strictly on-condition (based on a prescribed inspection program) with no fixed life limit. (This is precisely what I have been practicing and preaching for decades.)

How would it work?

SavvyAnalysis chart

Engine monitor data would be uploaded regularly to a central repository for analysis.

Jimmy and I have had several follow-on conversations about this, and he’s starting to draft a detailed proposal for an inspection protocol that we hope the FAA might accept as grounds for certifying the Titan engines on the basis of damage tolerance, eliminating the need for any recommended TBO. This is still very much a work-in-progress, but here are some of the thoughts we have so far:

  • The engine installation would be required to include a digital engine monitor that records EGTs and CHTs for each cylinder plus various other critical engine parameters (e.g., oil pressure and temperature, fuel flow, RPM). The engine monitor data memory would be required to be dumped on a regular basis and uploaded via the Internet to a central repository prescribed by ECi for analysis. The uploaded data would be scanned automatically by software for evidence of abnormalities like high CHTs, low fuel flow, failing exhaust valves, non-firing spark plugs, improper ignition timing, clogged fuel nozzles, detonation and pre-ignition. The data would also be available online for analysis by mechanics and ECi technical specialists.
  • At each oil-change interval, the following would be required: (1) An oil sample would be taken for spectrographic analysis (SOAP) by a designated laboratory, and a copy of the SOAP reports would be transmitted electronically to ECi; and (2) The oil filter would be cut open for inspection, digital photos of the filter media would be taken, when appropriate the filter media would be sent for scanning electron microscope (SEM) evaluation by a designated laboratory, and the media photos and SEM reports would be transmitted electronically to ECi.
  • At each annual or 100-hour inspection, the following would be required: (1) Each cylinder would undergo a borescope inspection of the valves, cylinder bores and piston crowns using a borescope capable of capturing digital images, and the borescope images would be transmitted electronically to ECi; (2) Each cylinder rocker cover would be removed and digital photographs of the visible valve train components would be transmitted electronically to ECi; (3) The spark plugs would be removed for cleaning/gapping/rotation, and digital photographs of the electrode ends of the spark plugs would be taken and transmitted electronically to ECi; and (4) Each cylinder would undergo a hot compression test and the test results would be transmitted electronically to ECi.
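To make the automated scan in the first bullet a bit more concrete, here is a minimal Python sketch of one such check: flagging high CHTs in uploaded monitor data. Everything in it is an assumption for illustration. The record layout, the function name and the 400°F limit are placeholders, not anything ECi has specified, and spotting things like failing exhaust valves or pre-ignition would take pattern analysis well beyond a simple threshold.

```python
from dataclasses import dataclass
from typing import List

# Placeholder limit chosen only for illustration; NOT an ECi, FAA or
# manufacturer value.
CHT_LIMIT_F = 400.0


@dataclass
class MonitorSample:
    elapsed_s: int        # seconds since engine start
    cht_f: List[float]    # one CHT reading per cylinder, in degrees F


def scan_for_high_cht(samples: List[MonitorSample]) -> List[str]:
    """Return one human-readable finding per over-limit CHT reading."""
    findings = []
    for sample in samples:
        for cylinder, cht in enumerate(sample.cht_f, start=1):
            if cht > CHT_LIMIT_F:
                findings.append(
                    f"t={sample.elapsed_s}s: cylinder {cylinder} CHT "
                    f"{cht:.0f}F is above {CHT_LIMIT_F:.0f}F"
                )
    return findings


# Two made-up samples from a hypothetical four-cylinder engine.
flight = [
    MonitorSample(600, [372, 380, 365, 377]),
    MonitorSample(660, [375, 384, 421, 379]),
]
for finding in scan_for_high_cht(flight):
    print(finding)
```

In this made-up flight, only cylinder 3 in the second sample would be flagged. A real repository would presumably run many such checks and route the findings to mechanics and technical specialists for review.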

The details still need to be ironed out, but you get the drift. If such a protocol were implemented for these engines (and blessed by the FAA), ECi would have the ability to keep very close tabs on the mechanical condition and operating parameters of each of its engines (something that no piston aircraft engine manufacturer has ever been able to do before) and to provide advice to each individual Titan engine owner about when each individual engine is in need of an overhaul, teardown inspection, cylinder replacement, etc.

Jimmy even thinks that if such a protocol could be implemented and approved, ECi might be in a position to offer a warranty for these engines far beyond what engine manufacturers and overhaul shops have been able to offer in the past. That would be frosting on the cake.

I’ve got my fingers, toes and eyes crossed that the FAA will go along with this idea of an engine certified on the basis of damage tolerance rather than safe-life. It would be a total game-changer, a long overdue nail in the coffin of the whole misguided notion that fixed-interval TBOs for aircraft engines make sense. And if ECi succeeds in getting its Titan engine certified on the basis of condition monitoring rather than fixed TBO, maybe Continental and Lycoming might jump on the overhaul-on-condition bandwagon. Wouldn’t that be something?

How Do Piston Aircraft Engines Fail?

Wednesday, April 9th, 2014

Last month, I tried to make the case that piston aircraft engines should be overhauled strictly on-condition, not at some fixed TBO. If we’re going to do that, we need to understand how these engines fail and how we can protect ourselves against such failures. The RCM way of doing that is called Failure Modes and Effects Analysis (FMEA), and involves examining each critical component of these engines and looking at how they fail, what consequences those failures have, and what practical and cost-efficient maintenance actions we can take to prevent or mitigate those failures. Here’s my quick back-of-the-envelope attempt at doing that…

Crankshaft

There’s no more serious failure mode than crankshaft failure. If it fails, the engine quits.

Yet crankshafts are rarely replaced at overhaul. Lycoming did a study that showed their crankshafts often remain in service for more than 14,000 hours (that’s 7+ TBOs) and 50 years. Continental hasn’t published any data on this, but their crankshafts probably have similar longevity.

Crankshafts fail in three ways: (1) infant-mortality failures due to improper materials or manufacture; (2) failures following unreported prop strikes; and (3) failures secondary to oil starvation and/or bearing failure.

Over the past 15 years, we’ve seen a rash of infant-mortality failures of crankshafts. Both Continental and Lycoming have had major recalls of crankshafts that were either forged from bad steel or were damaged during manufacture. These failures invariably occurred within the first 200 hours after the new crankshaft entered service. If the crankshaft survived its first 200 hours, we can be confident that it was manufactured correctly and should perform reliably for numerous TBOs.

Unreported prop strikes seem to be getting rare because owners and mechanics are becoming smarter about the high risk of operating an engine after a prop strike. There’s now an AD mandating a post-prop-strike engine teardown for Lycoming engines, and a strongly worded service bulletin for Continental engines. Insurance will always pay for the teardown and any necessary repairs, so it’s a no-brainer.

That leaves failures due to oil starvation and/or bearing failure. I’ll address that shortly.

Crankcase

Crankcases are also rarely replaced at major overhaul. They are typically repaired as necessary, align-bored to restore critical fits and limits, and often provide reliable service for many TBOs. If the case remains in service long enough, it will eventually crack. The good news is that case cracks propagate slowly enough that a detailed visual inspection once a year is sufficient to detect such cracks before they pose a threat to safety. Engine failures caused by case cracks are extremely rare—so rare that I don’t think I ever remember hearing or reading about one.

Camshaft and Lifters

The cam/lifter interface endures more pressure and friction than any other moving parts in the engine. The cam lobes and lifter faces must be hard and smooth in order to function and survive. Even tiny corrosion pits (caused by disuse or acid buildup in the oil) can lead to rapid destruction (spalling) of the surfaces and dictate the need for a premature engine teardown. Cam and lifter spalling is the number one reason that engines fail to make TBO, and it’s becoming an epidemic in the owner-flown fleet where aircraft tend to fly irregularly and sit unflown for weeks at a time.

The good news is that cam and lifter problems almost never cause catastrophic engine failures. Even with a badly spalled cam lobe (like the one pictured at right), the engine continues to run and make good power. Typically, a problem like this is discovered at a routine oil change when the oil filter is cut open and found to contain a substantial quantity of ferrous metal, or else a cylinder is removed for some reason and the worn cam lobe can be inspected visually.

If the engine is flown regularly, the cam and lifters can remain in pristine condition for thousands of hours. At overhaul, the cam and lifters are often replaced with new ones, although a reground cam and reground lifters are sometimes used and can be just as reliable.

Gears

The engine has lots of gears: crankshaft and camshaft gears, oil pump gears, accessory drive gears for fuel pump, magnetos, prop governor, and sometimes alternator. These gears are made of case-hardened steel and typically have a very long useful life. They are not usually replaced at overhaul unless obvious damage is found. Engine gears rarely cause catastrophic engine failures.

Oil Pump

Failure of the oil pump is rarely responsible for catastrophic engine failures. If oil pressure is lost, the engine will seize quickly. But the oil pump is dead-simple, consisting of two steel gears inside a close-tolerance aluminum housing, and usually operates trouble free. The pump housing can get scored if a chunk of metal passes through the oil pump—although the oil pickup tube has a suction screen to make sure that doesn’t happen—but even if the pump housing is damaged, the pump normally has ample output to maintain adequate oil pressure in flight, and the problem is mainly noticeable during idle and taxi. If the pump output seems deficient at idle, the oil pump housing can be removed and replaced without tearing down the engine.

Bearings

Bearing failure is responsible for a significant number of catastrophic engine failures. Under normal circumstances, bearings have a long useful life. They are always replaced at major overhaul, but it’s not unusual for bearings removed at overhaul to be in pristine condition with little detectable wear.

Bearings fail prematurely for three reasons: (1) they become contaminated with metal from some other failure; (2) they become oil-starved when oil pressure is lost; or (3) main bearings become oil-starved because they shift in their crankcase supports to the point where their oil supply holes become misaligned (as with the “spun bearing” pictured at right).

Contamination failures can generally be prevented by using a full-flow oil filter and inspecting the filter for metal at every oil change. So long as the filter is changed before its filtering capacity is exceeded, metal particles will be caught by the filter and won’t get into the engine’s oil galleries and contaminate the bearings. If a significant quantity of metal is found in the filter, the aircraft should be grounded until the source of the metal is found and corrected.

Oil-starvation failures are fairly rare. Pilots tend to be well-trained to respond to decreasing oil pressure by reducing power and landing at the first opportunity. Bearings will continue to function properly at partial power even with fairly low oil pressure.

Spun bearings are usually infant-mortality failures that occur either shortly after an engine is overhauled (due to an assembly error) or shortly after cylinder replacement (due to lack of preload on the through bolts). Failures occasionally occur after a long period of crankcase fretting, but such fretting is usually detectable through oil filter inspection and oil analysis. They can also occur after extreme unpreheated cold starts, but that is quite rare.

Connecting Rods

Connecting rod failure is responsible for a significant number of catastrophic engine failures. When a rod fails in flight, it often punches a hole in the crankcase (“thrown rod”) and causes loss of engine oil and subsequent oil starvation. Rod failures have also been known to cause camshaft breakage. The result is invariably a rapid and often total loss of engine power.

Connecting rods usually have a long useful life and are not normally replaced at overhaul. (Rod bearings, like all bearings, are always replaced at overhaul.) Many rod failures are infant-mortality failures caused by improper tightening of the rod cap bolts during engine assembly. Rod failures can also be caused by the failure of the rod bearings, often due to oil starvation. Such failures are usually random failures unrelated to time since overhaul.

Pistons and Rings

Piston and ring failures usually cause only partial power loss, but in rare cases can cause complete power loss. Piston and ring failures are of two types: (1) infant-mortality failures due to improper manufacture or assembly; and (2) heat-distress failures caused by pre-ignition or destructive detonation events. Heat-distress failures can be caused by contaminated fuel (e.g., 100LL laced with Jet A), or by improper engine operation. They are generally unrelated to hours or years since overhaul. A digital engine monitor can alert the pilot to pre-ignition or destructive detonation events in time for the pilot to take corrective action before heat-distress damage is done.

Cylinders

Cylinder failures usually cause only partial power loss, but occasionally can cause complete power loss. A cylinder consists of a forged steel barrel mated to an aluminum alloy head casting. Cylinder barrels typically wear slowly, and excessive wear is detected at annual inspection by means of compression tests and borescope inspections. Cylinder heads can suffer fatigue failures, and occasionally the head can separate from the barrel. As dramatic as it sounds, a head separation causes only a partial loss of power; a six-cylinder engine with a head-to-barrel separation can still make better than 80% power. Cylinder failures can be infant-mortality failures (due to improper manufacture) or age-related failures (especially if the cylinder head remains in service for more than two or three TBOs). Nowadays, most major overhauls include new cylinders, so age-related cylinder failures have become quite rare.

Valves and Valve Guides

It is quite common for exhaust valves and valve guides to develop problems well short of TBO. Actual valve failures are becoming much less common nowadays because incipient problems can usually be detected by means of borescope inspections and digital engine monitor surveillance. Even if a valve fails completely, the result is usually only partial power loss and an on-airport emergency landing.

Rocker Arms and Pushrods

Rocker arms and pushrods (which operate the valves) typically have a long useful life and are not normally replaced at overhaul. (Rocker bushings, like all bearings, are always replaced at overhaul.) Rocker arm failure is quite rare. Pushrod failures are caused by stuck valves, and can almost always be avoided through regular borescope inspections. Even when they happen, such failures usually result in only partial power loss.

Magnetos and Other Ignition Components

Magneto failure is uncomfortably commonplace. Mags are full of plastic components that are less than robust; plastic is used because it’s non-conductive. Fortunately, our aircraft engines are equipped with dual magnetos for redundancy, and the probability of both magnetos failing simultaneously is extremely remote. Mag checks during preflight runup can detect gross ignition system failures, but in-flight mag checks are far better at detecting subtle or incipient failures. Digital engine monitors can reliably detect ignition system malfunctions in real time if the pilot is trained to interpret the data. Magnetos should religiously be disassembled, inspected and serviced every 500 hours; doing so drastically reduces the likelihood of an in-flight magneto failure.

The Bottom Line

The bottom-end components of our piston aircraft engines (crankcase, crankshaft, camshaft, bearings, gears, oil pump, etc.) are very robust. They normally exhibit long useful lives that are many multiples of published TBOs. Most of these bottom-end components (with the notable exception of bearings) are routinely reused at major overhaul and not replaced on a routine basis. When these items do fail prematurely, the failures are mostly infant-mortality failures that occur shortly after the engine is built, rebuilt or overhauled, or they are random failures unrelated to hours or years in service. The vast majority of random failures can be detected long before they get bad enough to cause an in-flight engine failure simply by means of routine oil-filter inspection and laboratory oil analysis.

The top-end components (pistons, cylinders, valves, etc.) are considerably less robust. It is not at all unusual for top-end components to fail prior to TBO. However, most of these failures can be prevented by regular borescope inspections and by use of modern digital engine monitors. Even when they happen, top-end failures usually result in only partial power loss and a successful on-airport landing, and they usually can be resolved without having to remove the engine from the aircraft and send it to an engine shop. Most top-end failures are infant-mortality or random failures that do not correlate with time since overhaul.

The bottom line is that a detailed FMEA of piston aircraft engines strongly suggests that the traditional practice of fixed-interval engine overhaul or replacement is unwarranted and counterproductive. A conscientiously applied program of condition monitoring that includes regular oil filter inspection, oil analysis, borescope inspections and digital engine monitor data analysis can yield improved reliability and much reduced expense and downtime.

Do Piston Engine TBOs Make Sense?

Thursday, March 13th, 2014

Last month, I discussed the pioneering work on Reliability-Centered Maintenance (RCM) done by United Airlines scientists Stan Nowlan and Howard Heap in the 1960s, and I bemoaned the fact that RCM has not trickled down the aviation food chain to piston GA. Even in the 21st century, maintenance of piston aircraft remains largely time-based rather than condition-based.

Most owners of piston GA aircraft dutifully overhaul their engines at TBO, overhaul their propellers every 5 to 7 years, and replace their alternators and vacuum pumps every 500 hours just as Continental, Lycoming, Hartzell, McCauley, HET and Parker Aerospace call for. Many Bonanza and Baron owners have their wing bolts pulled every five years, and most Cirrus owners have their batteries replaced every two years for no good reason (other than that it’s in the manufacturer’s maintenance manual).

Despite an overwhelming body of scientific research demonstrating that this sort of 1950s-vintage time-based preventive maintenance is counterproductive, worthless, unnecessary, wasteful and incredibly costly, we’re still doing it. Why?

Mostly, I think, because of fear of litigation. The manufacturers are afraid to change anything for fear of being sued (because if they change anything, that could be construed to mean that what they were doing before was wrong). Our shops and mechanics are afraid to deviate from what the manufacturers recommend for fear of being sued (because they deviated from manufacturers’ guidance).

Let’s face it: Neither the manufacturers nor the maintainers have any real incentive to change. The cost of doing all this counterproductive, worthless, unnecessary and wasteful preventive maintenance (that actually doesn’t prevent anything) is not coming out of their pockets. Actually, it’s going into their pockets.

If we’re going to drag piston GA maintenance kicking and screaming into the 21st century (or at least out of the 1950s and into the 1960s), it’s going to have to be aircraft owners who force the change. Owners are the ones with the incentive to change the way things are being done. Owners are the ones who can exert power over the manufacturers and maintainers by voting with their feet and their credit cards.

For this to happen, owners of piston GA aircraft need to understand the right way to do maintenance—the RCM way. Then they need to direct their shops and mechanics to maintain their aircraft that way, or take their maintenance business to someone who will. This means that owners need both knowledge and courage. Providing aircraft owners both of these things is precisely why I’m contributing to this AOPA Opinion Leaders Blog.

When are piston aircraft engines most likely to hurt you?

Fifty years ago, RCM researchers proved conclusively that overhauling turbine engines at a fixed TBO is counterproductive, and that engine overhauls should be done strictly on-condition. But how can we be sure that this also applies to piston aircraft engines?

In a perfect world, Continental and Lycoming would study this issue and publish their findings. But for reasons mentioned earlier, this ain’t gonna happen. Continental and Lycoming have consistently refused to release any data on engine failure history of their engines, and likewise have consistently refused to explain how they arrive at the TBOs that they publish. For years, one aggressive plaintiff lawyer after another has tried to compel Continental and Lycoming to answer these questions in court. All have failed miserably.

So if we’re going to get answers to these critical questions, we’re going to have to rely on engine failure data that we can get our hands on. The most obvious source of such data is the NTSB accident database. That’s precisely what brilliant mechanical engineer Nathan T. Ulrich Ph.D. of Lee NH did in 2007. (Dr. Ulrich also was a US Coast Guard Auxiliary pilot who was unhappy that USCGA policy forbade him from flying volunteer search-and-rescue missions if his Bonanza’s engine was past TBO.)

Dr. Ulrich analyzed five years’ worth of NTSB accident data for the period 2001-2005 inclusive, examining all accidents involving small piston-powered airplanes (under 12,500 lbs. gross weight) for which the NTSB identified “engine failure” as either the probable cause or a contributing factor. From this population of accidents, Dr. Ulrich eliminated those involving air-race and agricultural-application aircraft. Then he analyzed the relationship between the frequency of engine-failure accidents and the number of hours on the engine since it was last built, rebuilt or overhauled. He did a similar analysis based on the calendar age of the engine since it was last built, rebuilt or overhauled. The following histograms show the results of his study:

Ulrich study (hours)

Ulrich study (years)

If these histograms have a vaguely familiar look, it might be because they look an awful lot like the histograms generated by British scientist C.H. Waddington in 1943.

Now, we have to be careful about how we interpret Dr. Ulrich’s findings. Ulrich would be the first to agree that NTSB accident data can’t tell us much about the risk of engine failures beyond TBO, simply because most piston aircraft engines are voluntarily euthanized at or near TBO. So it shouldn’t be surprising that we don’t see very many engine failure accidents involving engines significantly past TBO, since there are so few of them flying. (The engines on my Cessna 310 are at more than 205% of TBO, but there just aren’t a lot of RCM true believers like me in the piston GA community…yet.)

What Dr. Ulrich’s research demonstrates unequivocally is the striking and disturbing frequency of “infant-mortality” engine-failure accidents during the first few years and first few hundred hours after an engine is built, rebuilt or overhauled. Ulrich’s findings make it indisputably clear that by far the most likely time for you to fall out of the sky due to a catastrophic engine failure is when the engine is young, not when it’s old.

(The next most likely time for you to fall out of the sky is shortly after invasive engine maintenance in the field, particularly cylinder replacement, but that’s a subject for a future blog post…stay tuned!)

So…Is there a good reason to overhaul your engine at TBO?

It doesn’t take a rocket scientist (or a Ph.D. in mechanical engineering) to figure out what all this means. If your engine reaches TBO and still gives every indication of being healthy (good performance, not making metal, healthy-looking oil analysis and borescope results, etc.), overhauling it will clearly degrade safety, not improve it. That’s simply because it will convert your low-risk old engine into a high-risk young engine. I don’t know about you, but that certainly strikes me as a remarkably dumb thing to do.

So why is overhauling on-condition such a tough sell to our mechanics and the engine manufacturers? The counter-argument goes something like this: “Since we have so little data about the reliability of past-TBO engines (because most engines are arbitrarily euthanized at TBO), how can we be sure that it’s safe to operate them beyond TBO?” RCM researchers refer to this as “the Resnikoff Conundrum” (after mathematician H.L. Resnikoff).

To me, it looks an awful lot like the same circular argument that was used for decades to justify arbitrarily euthanizing airline pilots at age 60, despite the fact that aeromedical experts were unanimous that this policy made no sense whatsoever. Think about it…

Roots of Reliability-Centered Maintenance

Tuesday, February 11th, 2014

Last month, I discussed the pioneering WWII-era work of the eminent British scientist C.H. Waddington, who discovered that the scheduled preventive maintenance (PM) being performed on RAF B-24 bombers was actually doing more harm than good, and that drastically cutting back on such PM resulted in spectacular improvement in dispatch reliability of those aircraft. Two decades later, a pair of brilliant American engineers at United Airlines—Stan Nowlan and Howard Heap—independently rediscovered the utter wrongheadedness of traditional scheduled PM, and took things to the next level by formulating a rigorous engineering methodology for creating an optimal maintenance program to maximize safety and dispatch reliability while minimizing cost and downtime. Their approach became known as “Reliability-Centered Maintenance” (RCM), and revolutionized the way maintenance is done in the airline industry, military aviation, high-end bizjets, space flight, and numerous non-aviation applications from nuclear power plants to auto factories.

RCM wear-out curve

The traditional approach to PM assumes that most components start out reliable, and then at some point start becoming unreliable as they age

The “useful life” fallacy

Nowlan and Heap showed the fallacy of two fundamental principles underlying traditional scheduled PM:

  • Components start off being reliable, but their reliability deteriorates with age.
  • The useful life of components can be established statistically, so components can be retired or overhauled before they fail.

It turns out that both of these principles are wrong. To quote Nowlan and Heap:

“One of the underlying assumptions of maintenance theory has always been that there is a fundamental cause-and-effect relationship between scheduled maintenance and operating reliability. This assumption was based on the intuitive belief that because mechanical parts wear out, the reliability of any equipment is directly related to operating age. It therefore followed that the more frequently equipment was overhauled, the better protected it was against the likelihood of failure. The only problem was in determining what age limit was necessary to assure reliable operation.

“In the case of aircraft it was also commonly assumed that all reliability problems were directly related to operating safety. Over the years, however, it was found that many types of failures could not be prevented no matter how intensive the maintenance activities. [Aircraft] designers were able to cope with this problem, not by preventing failures, but by preventing such failures from affecting safety. In most aircraft essential functions are protected by redundancy features which ensure that, in the event of a failure, the necessary function will still be available from some other source.

RCM six curves

RCM researchers found that only 2% of aircraft components have failures that are predominantly age-related (curve B), and that 68% have failures that are primarily infant mortality (curve F).

“Despite the time-honored belief that reliability was directly related to the intervals between scheduled overhauls, searching studies based on actuarial analysis of failure data suggested that the traditional hard-time policies were, apart from their expense, ineffective in controlling failure rates. This was not because the intervals were not short enough, and surely not because the tear down inspections were not sufficiently thorough. Rather, it was because, contrary to expectations, for many items the likelihood of failure did not in fact increase with increasing age. Consequently a maintenance policy based exclusively on some maximum operating age would, no matter what the age limit, have little or no effect on the failure rate.”

[F. Stanley Nowlan and Howard F. Heap, “Reliability-Centered Maintenance” 1978, DoD Report Number AD-A066579.]

Winning the war by picking our battles

Another traditional maintenance fallacy was the intuitive notion that aircraft component failures are dangerous and need to be prevented through PM. A major focus of RCM was to identify the ways that various components fail, and then evaluate the frequency and consequences of those failures. This is known as “Failure Modes and Effects Analysis” (FMEA). Researchers found that while certain failure modes have serious consequences that can compromise safety (e.g., a cracked wing spar), the overwhelming majority of component failures have no safety impact and have consequences that are quite acceptable (e.g., a failed #2 comm radio or #3 hydraulic pump). Under the RCM philosophy, it makes no sense whatsoever to perform PM on components whose failure has acceptable consequences; the optimal maintenance approach for such components is simply to leave them alone, wait until they fail, and then replace or repair them when they do. This strategy is known as “run to failure” and is a major tenet of RCM.

A maintenance revolution…

The 747, DC-10 and L-1011 were the first airliners that had RCM-based maintenance programs.

As a direct result of this research, airline maintenance practices changed radically. RCM-inspired maintenance programs were developed for the Boeing 747, Douglas DC-10 and Lockheed L-1011, and for all subsequent airliners. The contrast with the traditional (pre-RCM) maintenance programs for the Boeing 707 and 727 and Douglas DC-8 was astonishing. The vast majority of component TBOs and life limits were abandoned in favor of an on-condition approach based on monitoring the actual condition of engines and other components and keeping them in service until their condition demonstrably deteriorated to an unacceptable degree. For example, the DC-8 had 339 components with TBOs or life limits, whereas the DC-10 had only seven—and none of them were engines. (Research showed clearly that overhauling engines at a specific TBO didn’t make them safer, and actually did the opposite.) In addition, the amount of scheduled maintenance was drastically reduced. For example, the DC-8 maintenance program required 4,000,000 labor hours of major structural inspections during the aircraft’s first 20,000 hours in service, while the 747 maintenance program called for only 66,000 labor hours, a reduction of nearly two orders of magnitude.
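For readers who want to check the arithmetic behind those comparisons, here is a short Python sketch. The input figures are the ones quoted in the paragraph above; the variable names are mine and purely illustrative.

```python
import math

# Figures quoted in the paragraph above.
dc8_labor_hours = 4_000_000   # structural inspections, first 20,000 flight hours
b747_labor_hours = 66_000

dc8_hard_time_items = 339     # components with TBOs or life limits
dc10_hard_time_items = 7

labor_ratio = dc8_labor_hours / b747_labor_hours
orders_of_magnitude = math.log10(labor_ratio)
hard_time_reduction = 1 - dc10_hard_time_items / dc8_hard_time_items

print(f"Labor-hour ratio: about {labor_ratio:.0f} to 1")
print(f"Orders of magnitude: {orders_of_magnitude:.2f}")
print(f"Hard-time items eliminated: {hard_time_reduction:.1%}")
```

The labor-hour ratio works out to roughly 61 to 1, or about 1.8 orders of magnitude, which is what “nearly two orders of magnitude” refers to. The cut in hard-time components from 339 to seven is a reduction of about 98%.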

Owner-flown GA, particularly piston GA, is the only remaining segment of aviation that does things the bad old-fashioned way.

Of course, these changes saved the airlines a king’s ransom in reduced maintenance costs and scheduled downtime. At the same time, the airplanes had far fewer maintenance squawks and much better dispatch reliability. (This was the same phenomenon that the RAF experienced during WWII when they followed Waddington’s advice to slash scheduled PM.)

…that hasn’t yet reached piston GA

Today, there’s only one segment of aviation that has NOT adopted the enlightened RCM approach to maintenance, and still does scheduled PM the bad old-fashioned way. Sadly, that segment is owner-flown GA—particularly piston GA—at the bottom of the aviation food chain where a lot of us hang out. I’ll offer some thoughts about that next month.

The Waddington Effect

Tuesday, January 14th, 2014

C.H. Waddington (1905-1975)

In 1943, a British scientist named Conrad Hal (C.H.) Waddington made a remarkable discovery about aircraft maintenance.  He was a most unlikely person to make this discovery, because he wasn’t an aeronautical engineer or an aircraft mechanic or even a pilot.  Actually, he was a gifted developmental biologist, paleontologist, geneticist, embryologist, philosopher, poet and painter who wasn’t particularly interested in aviation.  But like many other British scientists at that time, his career was interrupted by the outbreak of the Second World War and he found himself pressed into service with the Royal Air Force (RAF).

Waddington wound up reporting to the RAF Coastal Command, heading up a group of fellow scientists in the Coastal Command Operational Research Section.  Its job was to advise the British military on how it could more effectively combat the threat from German submarines.  In that capacity, Waddington and his colleagues developed a series of astonishing recommendations that defied military conventional wisdom of the time.

For example, the bombers used to hunt and kill U-boats were mostly painted black in order to make them difficult to see.  But Waddington’s group ran a series of experiments that proved that bombers painted white were not spotted by the U-boats until they were 20% closer, resulting in a 30% increase in successful sinkings. Waddington’s group also recommended that the depth charges dropped by the bombers be set to explode at a depth of 25 feet instead of 100 feet.  This recommendation—initially resisted strongly by RAF commanders—ultimately resulted in a sevenfold increase in the number of U-boats destroyed.

Consolidated B-24 “Liberator” bomber

Waddington subsequently turned his attention to the problem of “force readiness” of the bombers.  The Coastal Command’s B-24 “Liberator” bombers were spending an inordinate amount of time in the maintenance shop instead of hunting U-boats.  In July 1943, the two British Liberator squadrons located at Ballykelly, Northern Ireland, consisted of 40 aircraft, but at any given time only about 20 were flight-ready.  The other aircraft were down for any number of reasons, but mostly undergoing or awaiting maintenance—either scheduled or unscheduled—or waiting for replacement parts.

At that time, conventional wisdom held that if more preventive maintenance were performed on each aircraft, fewer problems would arise and more incipient problems would be caught and fixed—and thus fleet readiness would surely improve. It turned out that conventional wisdom was wrong. It would take C.H. Waddington and his Operational Research team to prove just how wrong.

Waddington and his team started gathering data about the scheduled and unscheduled maintenance of these aircraft, and began crunching and analyzing the numbers.  When he plotted the number of unscheduled aircraft repairs as a function of flight time, Waddington discovered something both unexpected and significant: The number of unscheduled repairs spiked sharply right after each aircraft underwent its regular 50-hour scheduled maintenance, and then declined steadily over time until the next scheduled 50-hour maintenance, at which time they spiked up once again.

Waddington Effect graph

When Waddington examined the plot of this repair data, he concluded that the scheduled maintenance (in Waddington’s own words) “tends to INCREASE breakdowns, and this can only be because it is doing positive harm by disturbing a relatively satisfactory state of affairs. There is no sign that the rate of breakdowns is starting to increase again after 40-50 flying hours when the aircraft is coming due for its next scheduled maintenance.” In other words, the observed pattern of unscheduled repairs demonstrated that the scheduled preventive maintenance was actually doing more harm than good, and that the 50-hour preventive maintenance interval was inappropriately short.
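The kind of tabulation Waddington describes can be sketched in Python as follows. This is only an illustration of the idea: the repair log is made up for the example and is NOT Waddington’s data, the 10-hour bin width is an arbitrary choice, and the function name is hypothetical. It also assumes scheduled maintenance falls at exact multiples of 50 flight hours.

```python
from collections import Counter

SERVICE_INTERVAL = 50  # hours between scheduled maintenance, per the text
BIN_WIDTH = 10         # hypothetical bin size for the histogram


def repairs_by_hours_since_service(repair_hours, interval=SERVICE_INTERVAL,
                                   bin_width=BIN_WIDTH):
    """Count unscheduled repairs by hours elapsed since the last scheduled service.

    repair_hours: flight hours at which each unscheduled repair occurred,
    assuming scheduled maintenance happens at exact multiples of `interval`.
    """
    counts = Counter()
    for hours in repair_hours:
        since_service = hours % interval
        bin_start = int(since_service // bin_width) * bin_width
        counts[bin_start] += 1
    return [counts.get(start, 0) for start in range(0, interval, bin_width)]


# Made-up log for illustration only; these are NOT Waddington's data.
sample_log = [51, 52, 53, 55, 58, 61, 64, 67, 72, 78, 85, 96,
              101, 102, 104, 107, 109, 112, 115, 123, 129, 134, 148]

histogram = repairs_by_hours_since_service(sample_log)
for start, count in zip(range(0, SERVICE_INTERVAL, BIN_WIDTH), histogram):
    print(f"{start:2d}-{start + BIN_WIDTH:2d} h after service: {'#' * count}")
```

With a log shaped like this one, the first bin after each service is the tallest and the counts fall off toward the next service, which is the pattern the text describes. A wear-out problem would show the opposite shape, with repairs climbing as the next service approaches.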

The solution proposed by Waddington’s team—and ultimately accepted by the RAF commanders over the howls of the maintenance personnel—was to increase the time interval between scheduled maintenance cycles, and to eliminate all preventive maintenance tasks that couldn’t be proven beneficial. Once these recommendations were implemented, the number of effective flying hours of the RAF Coastal Command bomber fleet increased by 60 percent!

Fast forward two decades to the 1960s, when a pair of gifted scientists at United Airlines, aeronautical engineer Stanley Nowlan and mathematician Howard Heap, independently rediscovered these principles. Their pioneering research on optimizing maintenance revolutionized the way maintenance is done in air transport, military aviation, high-end bizjets and many non-aviation industrial applications. They were almost certainly unaware of the work of C.H. Waddington and his colleagues in Britain in the 1940s, because that work remained classified until 1973, when Waddington’s meticulously kept diary of his wartime research activities was declassified and published.

Next time, I’ll discuss the fascinating work of Nowlan and Heap on what came to be known as “Reliability Centered Maintenance.” But for now, I will leave you with the major takeaway from Waddington’s research during World War II: Maintenance isn’t an inherently good thing (like exercise); it’s a necessary evil (like surgery). We have to do it from time to time, but we sure don’t want to do more than absolutely necessary to keep our aircraft safe and reliable. Doing more maintenance than necessary actually degrades safety and reliability.