Failure Mode and Effects Analysis (FMEA) in Manufacturing: A Practical Guide

Every unplanned equipment failure in a manufacturing plant had a precursor — a failure mode that existed in the machine before it stopped the line. Failure Mode and Effects Analysis (FMEA) is the structured methodology that finds those failure modes before they become production events, quantifies the risk each one carries, and prioritises the corrective actions that reduce it most efficiently. Sign up for Oxmaint to link your FMEA findings directly to maintenance work orders, PM schedules, and asset records — turning your analysis into operational action rather than a document that sits on a shelf.

FMEA is one of the most powerful tools in reliability engineering — when it is applied correctly, linked to real maintenance data, and used to change what the plant actually does. This guide covers the complete process from first session to CMMS integration.

RPN (Risk Priority Number)

RPN = Severity (1–10) × Occurrence (1–10) × Detection (1–10), giving a value from 1 to 1000

RPN < 80: Monitor

RPN 80–200: Action Plan Required

RPN > 200: Immediate Action

What FMEA Actually Is

FMEA Explained: More Than a Risk Register

FMEA is a structured, team-based methodology for identifying the ways a process, product, or piece of equipment can fail — and understanding the consequences of each failure before it occurs. Unlike a generic risk register, FMEA produces quantified, prioritised action lists that directly drive maintenance programme design, spare parts strategy, and engineering changes.

Design FMEA

DFMEA

Applied during the product or equipment design phase. Identifies failure modes inherent to the design before the asset is built or installed, so that failures are designed out up front rather than inspected out later.

Process FMEA

PFMEA

Applied to manufacturing or maintenance processes. Identifies failure modes in how work is performed — incorrect assembly sequences, inadequate controls, training gaps — and the product or equipment defects they produce.

System FMEA

SFMEA

Applied at system or subsystem level to identify interactions between components that create failure modes not visible at the individual part level. Most relevant for complex production lines and integrated automation systems.

The FMEA Process

How to Conduct an FMEA in Five Structured Steps

A well-run FMEA is a team exercise — not a document completed by one engineer in isolation. It requires cross-functional input from maintenance, operations, engineering, and quality. The five steps below are the complete FMEA process from scope definition to corrective action closure.

Define the Scope and Assemble the Team

Define the system, subsystem, or process boundary for the analysis. Assemble a cross-functional team with direct knowledge of how the equipment or process actually behaves in operation. The most valuable contributors are experienced maintenance technicians and operators — not just engineers. Document the function of each component within the scope boundary before identifying failure modes.

Identify All Failure Modes

For each function within scope, the team systematically identifies every way that function could fail to perform as intended. Failure modes are described in physical terms (bearing seizes, seal leaks, sensor reads low, coupling fractures), not in consequence terms. At this stage the only question is how the component could fail; likelihood and severity are scored later. Completeness is the goal: use historical work order data, failure records, and OEM documentation to ensure nothing is missed.

Assess Effects, Causes, and Current Controls

For each failure mode, document: the effect on the system and on production output; the potential causes or mechanisms; and the current controls in place — preventive maintenance tasks, inspection intervals, alarms, or redundant systems. This stage builds the complete picture of what is known and what is assumed about each failure mode. Many FMEA teams discover at this stage that assumed controls either do not exist in practice or are not effective.

Score Severity, Occurrence, and Detection — Calculate RPN

Each failure mode receives three scores from 1 to 10: Severity (how bad the effect is if the failure occurs), Occurrence (how often the cause happens), and Detection (how likely current controls are to catch the failure before it causes an effect, where a higher score means detection is less likely). The RPN (Risk Priority Number) is the product of all three scores: RPN = Severity × Occurrence × Detection. High-RPN failure modes receive priority action; however, failure modes with Severity scores of 9 or 10 require action regardless of their RPN because the consequence is unacceptable even at low frequency.

Define Corrective Actions, Assign Owners, and Update Controls

For every failure mode above the RPN threshold, and for all high-severity items, the team defines a corrective action with an owner and a completion date. Actions may include new or revised PM tasks, condition monitoring additions, engineering modifications, spare parts provisioning, or operator training changes. After implementation, the affected failure modes are re-scored to verify that the action has reduced the RPN to an acceptable level. The FMEA is a living document: it is updated whenever plant conditions, failure history, or maintenance strategy change.
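
The scoring and prioritisation logic in steps four and five can be sketched in a few lines of Python. This is a minimal illustration, not part of any FMEA standard or of Oxmaint: the `FailureMode` class, the `needs_action` function, the default threshold of 80 (taken from the RPN bands earlier in this guide), and every example score are assumptions chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FailureMode:
    """One FMEA worksheet row (hypothetical structure for illustration)."""
    name: str
    severity: int    # 1-10, higher = worse effect
    occurrence: int  # 1-10, higher = more frequent cause
    detection: int   # 1-10, higher = harder to detect in time

    def __post_init__(self):
        for score in (self.severity, self.occurrence, self.detection):
            if not 1 <= score <= 10:
                raise ValueError(f"{self.name}: scores must be between 1 and 10")

    @property
    def rpn(self) -> int:
        # RPN is the product of the three scores, so it ranges from 1 to 1000.
        return self.severity * self.occurrence * self.detection


def needs_action(fm: FailureMode, rpn_threshold: int = 80) -> bool:
    """Action is required at or above the threshold, or whenever Severity is 9 or 10."""
    return fm.rpn >= rpn_threshold or fm.severity >= 9


# Example scores are invented for illustration only.
worksheet = [
    FailureMode("Gearbox bearing seizes", severity=8, occurrence=5, detection=6),
    FailureMode("Guard interlock fails", severity=10, occurrence=2, detection=2),
    FailureMode("Conveyor belt mistracks", severity=4, occurrence=4, detection=3),
]

# Rank the items that need a corrective action, highest RPN first.
action_list = sorted((fm for fm in worksheet if needs_action(fm)),
                     key=lambda fm: fm.rpn, reverse=True)
for fm in action_list:
    print(f"{fm.name}: RPN {fm.rpn}")

# Re-score after a corrective action, here assuming condition monitoring
# and a revised PM task lower both Occurrence and Detection.
before = worksheet[0]
after = replace(before, occurrence=3, detection=3)
print(f"{before.name}: RPN {before.rpn} -> {after.rpn}, action still needed: {needs_action(after)}")
```

The interlock failure has an RPN of only 40 but still appears on the action list because of its Severity score, which is the override described in step four. Re-scoring is modelled as a new record rather than an in-place change so that the before and after values both remain available for the audit trail.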

Turn Your FMEA Findings into Maintenance Work Orders in Oxmaint

Oxmaint lets you link FMEA corrective actions directly to PM work orders, inspection tasks, and asset records — so your FMEA output drives real operational change rather than becoming another folder in the reliability engineer's cabinet.

RPN Scoring Reference

FMEA Scoring Criteria: How to Score Severity, Occurrence, and Detection Consistently

Inconsistent scoring is the most common source of unreliable FMEA results. The same team must use the same criteria for every failure mode. The tables below provide standardised scoring guidance adapted for manufacturing environments.

Severity (S) — Effect of the Failure Mode

9–10: Safety hazard or regulatory non-compliance. Failure endangers personnel or violates statutory requirements.

7–8: Major production loss. Full line stop or product scrapped. Customer delivery impacted.

4–6: Significant production degradation. Output rate reduced or quality impaired but line continues.

1–3: Minor effect. Slight inconvenience to operation. No production loss. Detected by operator.

Occurrence (O) — Frequency of the Cause

9–10: Failure occurs almost inevitably. Happens multiple times per shift or daily in current conditions.

7–8: Repeated failures. Occurs weekly or monthly. Well-known problem in current operation.

4–6: Occasional failures. Occurs several times per year. Appears in maintenance history records.

1–3: Rare failures. Isolated historical occurrence or theoretical failure mode with no current evidence.

Detection (D) — Ability to Detect Before Effect

9–10: No current detection method. Failure reaches the customer or causes production impact without warning.

7–8: Detection is difficult. Existing controls have low probability of catching the failure before it causes effect.

4–6: Moderate detection. Controls will likely detect failure but are not reliable across all occurrences.

1–3: High detection certainty. Controls almost always detect failure before it reaches the next process step.
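
One way to keep Occurrence scoring consistent is to derive the band from work order history rather than from memory. The sketch below maps a failure count over an observation period to the four Occurrence bands in the criteria above. The numeric cut-offs (roughly daily, monthly, and more than once per year) are assumptions made to turn the wording of the criteria into numbers, and `occurrence_band` is a hypothetical helper; a team should agree its own cut-offs before scoring.

```python
def occurrence_band(failure_count: int, period_days: float) -> tuple[int, int]:
    """Return the (low, high) Occurrence score band for an observed failure rate.

    Cut-offs are illustrative assumptions, not part of any standard.
    """
    if failure_count < 0 or period_days <= 0:
        raise ValueError("failure_count must be >= 0 and period_days must be > 0")

    failures_per_year = failure_count * 365.0 / period_days

    if failures_per_year >= 365:   # daily or more often
        return (9, 10)
    if failures_per_year >= 12:    # monthly to weekly
        return (7, 8)
    if failures_per_year > 1:      # several times per year
        return (4, 6)
    return (1, 3)                  # isolated or no recorded occurrence


# Example: 30 bearing failures logged over two years of work orders
# (15 per year, which falls in the weekly-to-monthly band).
print(occurrence_band(30, 730))
```

The function returns a band rather than a single score because the team still has to choose the exact value within it, using knowledge of current operating conditions that the failure count alone does not capture.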

Common Mistakes

Why FMEA Exercises Fail — and How to Avoid the Most Costly Mistakes

Many FMEA programmes produce good analysis documents that change very little about actual maintenance practice. The reasons are predictable and avoidable when teams know what to watch for from the start of the process.

Mistake 01

Completed by one engineer, not a cross-functional team

An FMEA completed by a reliability engineer without input from maintenance technicians and operators systematically misses the failure modes that occur most frequently in actual operation — because those failure modes only become visible through direct field experience, not engineering drawings.

Mistake 02

RPN threshold used as the only action trigger

Failure modes with Severity 9 or 10 require action regardless of their RPN. A failure mode that occurs rarely (low Occurrence score) and is easily detected (low Detection score) can have a low RPN but still carries an unacceptable consequence when it does occur. For example, Severity 10, Occurrence 2, and Detection 2 give an RPN of only 40, yet the failure would endanger personnel. Teams that use RPN as the sole trigger consistently under-address high-severity items.

Mistake 03

No connection between FMEA output and the CMMS

FMEA corrective actions that are not translated into CMMS work orders, PM tasks, or inspection items within weeks of the analysis session rarely get implemented. The FMEA document becomes a compliance artefact rather than an operational driver. Every action with an owner and a date should immediately generate a corresponding task in the maintenance management system.

Mistake 04

The FMEA is treated as a one-time exercise

Plant conditions change: new equipment is installed, production rates increase, materials change, and operating hours accumulate. An FMEA that was accurate when first completed becomes progressively less accurate as conditions evolve. World-class reliability teams review and update critical FMEAs annually and trigger unscheduled updates after every significant unplanned failure event.
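
The checks behind Mistakes 03 and 04 are simple enough to automate. The following sketch flags corrective actions that have no linked maintenance task and FMEAs that are due for review. It assumes a generic, hypothetical record layout (`CorrectiveAction`, `work_order_id`, `last_reviewed`) and a 365-day review interval; the names, IDs, and dates are invented, and nothing here represents the Oxmaint data model or API.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class CorrectiveAction:
    """A corrective action from the FMEA worksheet (hypothetical fields)."""
    description: str
    owner: str
    due: date
    work_order_id: Optional[str] = None  # reference to the CMMS task, if one exists


def unlinked_actions(actions: list[CorrectiveAction]) -> list[CorrectiveAction]:
    """Actions that exist on paper but have no corresponding CMMS task."""
    return [a for a in actions if a.work_order_id is None]


def review_due(last_reviewed: date, today: date,
               last_unplanned_failure: Optional[date] = None,
               interval: timedelta = timedelta(days=365)) -> bool:
    """True if the review interval has lapsed or a failure occurred since the last review."""
    if last_unplanned_failure is not None and last_unplanned_failure > last_reviewed:
        return True
    return today - last_reviewed >= interval


# Invented example data.
actions = [
    CorrectiveAction("Add monthly vibration check", "J. Smith", date(2025, 3, 1), "WO-1042"),
    CorrectiveAction("Stock spare coupling", "A. Lee", date(2025, 4, 15)),
]
print([a.description for a in unlinked_actions(actions)])

# Reviewed about seven months ago with no failures since: not yet due.
print(review_due(date(2024, 6, 1), today=date(2025, 1, 10)))

# Same review date, but an unplanned failure has occurred since: due now.
print(review_due(date(2024, 6, 1), today=date(2025, 1, 10),
                 last_unplanned_failure=date(2024, 11, 20)))
```

Passing `today` explicitly instead of calling `date.today()` inside the function keeps the check deterministic, which makes it easier to test and to run against historical data.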

FAQ

FMEA in Manufacturing — Questions Reliability and Maintenance Engineers Ask

What is an acceptable RPN threshold for triggering corrective action in manufacturing?

Many manufacturing reliability programmes use an RPN threshold of 80–100 as the corrective action trigger. Any failure mode scoring above this threshold requires a documented action with an owner and completion date. However, all failure modes with a Severity score of 9 or 10 require action regardless of their total RPN, because the consequence is unacceptable even when the failure is rare or easily detected. Sign up for Oxmaint to track corrective action closure against your FMEA findings.

How is FMEA different from RCM (Reliability-Centred Maintenance)?

FMEA and RCM use overlapping methodology but serve different purposes. FMEA systematically identifies and prioritises failure modes to drive risk reduction actions. RCM uses failure mode analysis as one input into a broader methodology that determines the most appropriate maintenance strategy — preventive, predictive, condition-based, or run-to-failure — for each failure mode based on its consequences and cost. Many organisations use FMEA as a starting point that feeds their RCM programme. Book a demo to see how Oxmaint supports both approaches.

How many people should be in an FMEA team for a manufacturing plant?

Effective FMEA teams typically have four to seven participants — enough for cross-functional expertise without becoming unmanageable. A typical team for equipment FMEA includes: a reliability engineer or facilitator, two experienced maintenance technicians with direct knowledge of the equipment, one production operator or supervisor, and a quality or process engineer where relevant. Larger teams tend to produce slower, lower-quality analysis because reaching consensus becomes progressively harder.

How long does an FMEA typically take to complete for a single production asset?

A well-prepared FMEA for a single production asset — where the facilitator has gathered failure history, current maintenance records, and equipment documentation in advance — typically requires two to four half-day team sessions. Teams that arrive without preparation and attempt to gather data during the analysis session take significantly longer and produce lower-quality output. The preparation done before the team meets is what determines the quality of the FMEA that results from it. Sign up for Oxmaint to access the asset history data your FMEA team needs, before your first session.

Can FMEA output be used to justify changes to a PM schedule in Oxmaint?

Yes. FMEA corrective actions frequently result in changes to PM intervals, new inspection tasks, or condition monitoring additions. In Oxmaint, updated PM tasks can be linked to asset records with notes referencing the FMEA finding that generated them, providing a documented audit trail from risk analysis to maintenance programme change. This traceability supports ISO 9001 and ISO 55001 audit requirements for evidence-based maintenance programme management.

An FMEA That Doesn't Change Your Maintenance Programme Is Just a Document. Make Yours Drive Action.

Oxmaint connects your FMEA corrective actions to live maintenance work orders, PM schedules, and asset records — so every risk identified in the analysis room translates into real work completed on the plant floor, tracked to closure, and visible in your reliability performance data.
