Have you ever put your airplane in the shop—perhaps for an annual inspection, a squawk, or a routine oil change—only to find when you fly it for the first time after maintenance that something that was working fine no longer does? Every aircraft owner has had this happen. I sure have.
Maintenance has a dark side that isn’t usually discussed in polite company: It sometimes breaks aircraft instead of fixing them.
When something in an aircraft fails because of something a mechanic did—or failed to do—we refer to it as a “maintenance-induced failure”…or “MIF” for short. Such MIFs occur a lot more often than anyone cares to admit.
Why do high-time engines fail?
I started thinking seriously about MIFs in 2007 while corresponding with Nathan Ulrich Ph.D. about his ground-breaking research into the causes of catastrophic piston aircraft engine failures (based on five years’ worth of NTSB accident data) that I discussed in an earlier post. Dr. Ulrich’s analysis showed conclusively that by far the highest risk of catastrophic engine failure occurs when the engine is young—during the first two years and 200 hours after it is built, rebuilt or overhauled—due to “infant-mortality failures.”
But the NTSB data was of little statistical value in analyzing the failure risk of high-time engines beyond TBO, simply because so few engines are operated past TBO; most are arbitrarily euthanized at TBO. We don’t have good data on how many engines are flying past TBO, but it’s a relatively small number. So it’s no surprise that the NTSB database contains very few accidents attributed to failures of over-TBO engines. Because there are so few, Ulrich and I decided to study all such NTSB reports for 2001 through 2005 to see if we could detect some pattern of what made these high-time engines fail. Sure enough, we did detect a pattern.
About half the reported failures of past-TBO engines stated that the reason for the engine failure could not be determined by investigators. Of the half where the cause could be determined, we found that about 80% were MIFs. In other words, those engines failed not because they were past TBO, but because mechanics worked on the engines and screwed something up!
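To put those two proportions on a common footing, here is a quick back-of-the-envelope sketch in Python. The variable names are mine, and the inputs are only the rough “about half” and “about 80%” figures quoted above, so treat the result as approximate.

```python
from fractions import Fraction

# Rough proportions from the text (approximations, not precise statistics).
cause_determined = Fraction(1, 2)      # about half of reports identified a cause
mif_given_determined = Fraction(4, 5)  # about 80% of those were MIFs

# MIFs as a share of ALL reported past-TBO engine failures,
# counting only the cases where investigators found a cause.
mif_of_all_reports = cause_determined * mif_given_determined

print(float(mif_of_all_reports))  # 0.4
```

In other words, roughly 40% of all the reported past-TBO failures were confirmed MIFs. That is a floor, since some of the undetermined half may have been maintenance-induced too.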
Case in point: I received a call from an aircraft owner whose Bonanza was undergoing annual inspection. The shop convinced the owner to have his propeller and prop governor sent out for 6-year overhauls. (Had the owner asked my advice, I’d have urged him not to do this, but that’s another story for another blog post.)
The overhauled prop and governor came back from the prop shop and were reinstalled. The mechanic had trouble getting the prop to cycle properly, and he wound up removing and reinstalling the governor three times. During the third engine runup, the prop still wouldn’t cycle properly. The mechanic decided to take the airplane up on a test flight anyway (!), which resulted in an engine overspeed. The mechanic then removed the prop governor yet again and discovered that the governor drive wasn’t turning when the crankshaft was rotated.
I told the owner that I’d seen this before, and the cause was always the same: improper installation of the prop governor. If the splined drive and gears aren’t meshed properly before the governor is torqued, the camshaft gear is damaged, and the only fix is a teardown. (A couple of engine shops and a Continental tech rep all told the owner the same thing.)
This could turn out to be a $20,000 MIF. Ouch!
How often do MIFs happen?
They happen a lot. Hardly a day goes by that I don’t receive an email or a phone call from an exasperated owner complaining about some aircraft problem that is obviously a MIF.
A Cessna 182 owner emailed me that several months earlier, he’d put the plane in the shop for an oil change and installation of an STC’d exhaust fairing. A couple of months later, he decided to have a digital engine monitor installed. The new engine monitor revealed that the right bank of cylinders (#1, #3 and #5) all had very high CHTs, well above 400°F. This had not shown up on the factory CHT gauge because its probe was installed on cylinder #2. (Every piston aircraft should have an engine monitor IMHO.) At the next annual inspection at a different shop, the IA found some induction airbox seals missing, apparently left off when the exhaust fairing was installed. The seals were installed and CHTs returned to normal.
Sadly, the problem wasn’t caught early enough to prevent serious heat-related damage to the right-bank cylinders. All three jugs had compressions down in the 30s with leakage past the rings, and damage to the cylinder bores was visible under the borescope. The owner was faced with replacing three cylinders, around $6,000.
The next day, I heard from the owner of an older Cirrus SR22 complaining about intermittent heading errors on his Sandel SN3308 electronic HSI. These problems started about three years earlier when the shop pulled the instrument for a scheduled 200-hour lamp replacement.
I’ve seen this in my own Sandel-equipped Cessna 310, and it’s invariably due to inadequate engagement between the connectors on the back of the instrument and the mating connectors in the mounting tray. You must slide the instrument into the tray just as far as possible before tightening the clamp; otherwise, you’ve set the stage for flaky electrical problems. This poor Cirrus owner had been suffering the consequences for three years. It took five minutes to re-rack the instrument and cure the problem.
Not long after that, I got a panicked phone call from one of my managed-maintenance clients who’d departed into actual IMC in his Cessna 340 with his family on board on the first flight after some minor avionics work. (Not smart IMHO.) As he entered the clag and climbed through 3,000 feet, all three of his static instruments—airspeed, altimeter, VSI—quit cold. Switching to alternate static didn’t cure the problem. The pilot kept his cool, confessed his predicament to ATC, successfully shot an ILS back to his home airport, then called me.
The moment I heard the symptoms, I knew exactly what happened because I’d seen it before. “Take the airplane back to the avionics shop,” I told the owner, “and ask the tech to reconnect the static line that he disconnected.” A disconnected static line in a pressurized aircraft causes the static instruments to be referenced to cabin pressure. The moment the cabin pressurizes, those instruments stop working. MIF!
I know of at least three other similar incidents in pressurized singles and twins, all caused by failure of a mechanic to reconnect a disconnected static line. One resulted in a fatal accident, the others in underwear changes. The FARs require a static system leak test any time the static system is opened up, but clearly some technicians are not taking this seriously.
Why do MIFs happen?
Numerous studies indicate that three-quarters of accidents are the fault of the pilot. The remaining one-quarter are machine-caused, and those are just about evenly divided between ones caused by aircraft design flaws and ones caused by MIFs. That suggests one-eighth of accidents are maintenance-induced, a significant number.
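The arithmetic behind that one-eighth figure can be checked in a few lines of Python. This is only a sketch: the variable names are mine, and the fractions are the rough ones quoted above rather than precise statistics.

```python
from fractions import Fraction

# Rough shares quoted in the text (approximations, not precise statistics).
pilot_share = Fraction(3, 4)           # accidents attributed to the pilot
machine_share = 1 - pilot_share        # the remaining machine-caused accidents
mif_share_of_machine = Fraction(1, 2)  # "about evenly divided" with design flaws

# Share of all accidents that are maintenance-induced.
mif_share = machine_share * mif_share_of_machine

print(mif_share)         # 1/8
print(float(mif_share))  # 0.125
```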
The lion’s share of MIFs are errors of omission. These include fasteners left uninstalled or untightened, inspection panels left loose, fuel and oil caps left off, things left disconnected (e.g., static lines), and other reassembly tasks left undone.
Distractions play a big part in many of these omissions. A mechanic installs some fasteners finger-tight, then gets a phone call or goes on lunch break and forgets to finish the job by torqueing the fasteners. I have seen some of the best, most experienced mechanics I know fall victim to such seemingly rookie mistakes, and I know of several fatal accidents caused by such omissions.
Maintenance is invasive!
Whenever a mechanic takes something apart and puts it back together, there’s a risk that something won’t go back together quite right. Some procedures are more invasive than others, and invasive maintenance is especially risky.
Invasiveness is something we think about a lot in medicine. The standard treatment for gallstones used to be cholecystectomy (gall bladder removal), major abdominal surgery requiring a 5- to 8-inch incision. Recovery involved a week of hospitalization and several weeks of recovery at home. The risks were significant: My dad very nearly died as the result of complications following this procedure.
Nowadays there’s a far less invasive procedure, laparoscopic cholecystectomy, that involves three tiny incisions and is performed using a videoscope inserted through one incision and various microsurgery instruments inserted through the others. Recovery usually involves only one night in the hospital and a few days at home. The risk of complications is greatly reduced.
Similarly, some aircraft maintenance procedures are far more invasive than others. The more invasive the maintenance, the greater the risk of a MIF. When considering any maintenance task, we should always think carefully about how invasive it is, whether the benefit of performing the procedure is really worth the risk, and whether less invasive alternatives are available.
For example, I was contacted by an aircraft owner who said that he’d recently received an oil analysis report showing an alarming increase in iron. The oil filter on his Continental IO-520 showed no visible metal. The lab report suggested flying another 25 hours and then submitting another oil sample for analysis.
The owner showed the oil analysis report to his A&P, who expressed grave concern that the elevated iron might indicate that one or more cam lobes were coming apart. The mechanic suggested pulling one or two cylinders and inspecting the camshaft.
Yikes! What was this mechanic thinking? No airplane has ever fallen out of the sky because of a cam or lifter problem. Many have done so following cylinder removal, the second most invasive thing you can do to an engine. (Only teardown is more invasive.)
The owner wisely decided to seek a second opinion before authorizing this exploratory surgery. I told him the elevated iron was almost certainly NOT due to cam lobe spalling. A disintegrating cam lobe throws off fairly large steel particles or whiskers that are usually visible during oil filter inspection. The fact that the oil filter was clean suggested that the elevated iron was coming from microscopic metal particles less than 25 microns in diameter, too small to be detectable in a filter inspection, but easily detectable via oil analysis. Such tiny particles were probably coming either from light rust on the cylinder walls or from some very slow wear process.
I suggested the owner have a borescope inspection of his cylinders to see whether the bores showed evidence of rust. I also advised that no invasive procedure (like cylinder removal) should ever be undertaken solely on the basis of a single oil analysis report. The oil lab was spot-on in recommending that the aircraft be flown another 25 hours. The A&P wasn’t thinking clearly.
Even if a cam inspection were warranted, there’s a far less invasive method. Instead of a 10-hour cylinder removal, the mechanic could pull the intake and exhaust lifters, and then determine the condition of the cam by inspecting it with a borescope through the lifter boss and, if warranted, probing the cam lobe with a sharp pick. Not only would this procedure require just 15% as much labor (about an hour and a half instead of 10), but the risk of a MIF would be nil.
Sometimes, less is more
Many owners believe—and many mechanics preach—that preventive maintenance is inherently a good thing, and the more of it you do the better. I consider this wrongheaded. Mechanics often do far more preventive maintenance than necessary and often do it using unnecessarily invasive procedures, thereby increasing the likelihood that their efforts will actually cause failures rather than preventing them.
Another of my earlier posts discussed Reliability-Centered Maintenance (RCM), developed at United Airlines in the late 1960s and universally adopted by the airlines and the military during the 1970s. One of the major findings of RCM researchers was that preventive maintenance often does more harm than good, and that safety and reliability can often be improved dramatically by reducing the amount of PM and using minimally invasive techniques.
Unfortunately, this thinking doesn’t seem to have trickled down to piston GA, and is considered heresy by many GA mechanics because it contradicts everything they were taught in A&P school. The long-term solution is for GA mechanics to be trained in RCM principles, but that’s not likely to happen any time soon. In the short term, aircraft owners must think carefully before authorizing an A&P to perform invasive maintenance on their aircraft. When in doubt, get a second opinion.
The last line of defense
The most likely time for a mechanical failure to occur is the first flight after maintenance. Since the risk of such MIFs is substantial, it’s imperative that owners conduct a post-maintenance test flight (in VMC, without passengers, preferably close to the airport) before launching into the clag or putting passengers at risk. I think even the most innocuous maintenance task, even a routine oil change, deserves such a post-maintenance test flight. I do this any time I swing a wrench on my airplane.
You should, too.