How to Build a Preventive Maintenance Program

Key takeaways

Inventory and criticality come first: rank assets A/B/C by consequence of failure, and accept deliberate run-to-failure on cheap, redundant kit.
Set intervals from three sources: the manufacturer manual as a baseline, your own failure history, and the condition your technicians actually observe.
Match the trigger to the wear mode: usage-based for wear driven by running, calendar-based for degradation that happens whether the machine runs or not.
PM compliance measures effort, not effect: judge the program on unplanned downtime and repeat failures, and prune tasks that never find anything.

A preventive maintenance (PM) program is a scheduled system of inspections, servicing and planned replacements designed to catch equipment failures before they stop production. You build one in five steps: inventory your assets and rank them by criticality; choose PM tasks and intervals from the manual, your failure history and observed condition; pick time-based or usage-based triggers; write tasks a technician can actually follow; then review and prune the list relentlessly. This guide walks through each step from zero.

Step 1: Inventory your assets and rank them by criticality

You cannot schedule maintenance on equipment you have not listed. Start with a physical walk of the plant and record every maintainable asset: machines, lines, sub-systems such as compressors and chillers, and the utilities that feed them. For each one, capture the make, model, serial number, install date and where the manual lives. A spreadsheet is fine at this stage. The goal is a complete list, not a perfect one, and a rough register you actually finish beats an elegant database you abandon halfway down the line.

Next, rank by criticality, because not every asset deserves preventive maintenance. Ask four questions of each item: does a failure stop the whole line or just one station, does it create a safety or environmental risk, is there a redundant unit that can take the load, and how long do spares take to arrive? A simple A/B/C classification is enough. A-critical assets get the full PM treatment, B assets get a lighter routine, and for many C assets a deliberate run-to-failure decision is the honest answer. Running a cheap, redundant transfer pump to failure is a strategy; ignoring a bottleneck press until it seizes is negligence. Write the classification down so the whole organisation knows which is which.
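
The four questions can be turned into a few explicit rules so that two people ranking the same asset reach the same answer. The sketch below is one possible rule set, not a standard: the Asset fields, the classify function and the 14-day lead-time threshold are all assumptions made for illustration, and you should replace them with rules your own team agrees on.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    stops_whole_line: bool       # failure halts the line, not just one station
    safety_or_env_risk: bool     # failure creates a safety or environmental risk
    has_redundant_unit: bool     # a standby unit can take the load
    spares_lead_time_days: int   # how long spares take to arrive


# Assumed threshold for a "long" lead time; tune it to your own supply chain.
LONG_LEAD_TIME_DAYS = 14


def classify(asset: Asset) -> str:
    """Return 'A', 'B' or 'C' from the four criticality questions."""
    if asset.safety_or_env_risk:
        return "A"
    if asset.stops_whole_line and not asset.has_redundant_unit:
        return "A"
    if asset.stops_whole_line or asset.spares_lead_time_days >= LONG_LEAD_TIME_DAYS:
        return "B"
    return "C"


press = Asset("Bottleneck press", True, False, False, 30)
pump = Asset("Transfer pump", False, False, True, 2)
print(press.name, classify(press))
print(pump.name, classify(pump))
```

With these assumed rules the bottleneck press lands in class A and the cheap, redundant transfer pump in class C, which matches the run-to-failure reasoning above.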

Step 2: Choose PM tasks and intervals from three sources

The manufacturer manual is your starting point, not your finishing point. OEM schedules are written to protect warranties and assume a generic duty cycle, so they tend to be conservative and occasionally irrelevant to how you actually run the machine. Load them in first anyway: they encode failure modes the designers know about and you do not, and they are the defensible baseline while you gather your own evidence.

Your own failure history is the second and better source. Pull the recent breakdown records for each A-critical asset and look for repeaters. If a particular bearing, seal or sensor fails again and again, there is a PM task hiding in that pattern: an inspection, lubrication or planned replacement set comfortably inside the observed life. Where you have enough data to calculate mean time between failures (MTBF), use it to sanity-check the interval. The principle is simple: most failures announce themselves before they happen, and the job of a PM task is to look during that window, not after it.
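
As a rough illustration of that sanity check, the sketch below estimates MTBF from the run-hour meter readings logged at each failure of one component, then proposes an interval well inside the shortest life seen. The function names, the example readings and the 0.5 safety factor are assumptions for illustration, not recommended values, and three or four failures give only a crude estimate.

```python
from statistics import mean


def hours_between_failures(readings_at_failure: list[float]) -> list[float]:
    """Run hours between consecutive failures, from meter readings at each failure."""
    ordered = sorted(readings_at_failure)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def suggest_interval(readings_at_failure: list[float], safety_factor: float = 0.5):
    """Return (mtbf_hours, suggested_pm_interval_hours).

    The interval is set inside the SHORTEST observed life, not the average,
    so the task lands before the earliest failure seen so far.
    """
    gaps = hours_between_failures(readings_at_failure)
    if len(gaps) < 2:
        raise ValueError("need at least three failures to say anything useful")
    return mean(gaps), min(gaps) * safety_factor


# Hypothetical run-hour readings at four failures of the same bearing.
mtbf, interval = suggest_interval([1200, 2300, 3500, 4500])
print(f"MTBF about {mtbf} run hours, inspect every {interval} run hours")
```

Because the readings are run hours rather than calendar dates, repair time and idle time are already excluded from the gaps.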

The third source is condition: what your technicians and operators can see, hear and feel. Heat, noise, vibration, leaks and drifting cycle times are all early warnings, and the people running the machine every shift usually spot them first. Build simple condition checks into the tasks and let the findings tune your intervals. A task that keeps finding problems is scheduled too late; a task that never finds anything is a candidate for stretching.

Step 3: Pick time-based or usage-based triggers

A time-based (calendar) trigger fires every fixed period: weekly, monthly, quarterly. It is easy to plan, easy to staff, and completely blind to how hard the machine actually worked. A press that ran three shifts flat out and an identical press that sat idle for a month get the same service on the same day, which means one is under-maintained and the other receives service it did not need.

A usage-based trigger fires on actual work done: run hours, cycles, strokes or units produced. An injection moulding machine serviced every set number of shots, or a compressor serviced on run hours, gets maintenance proportional to wear. The rule of thumb is to match the trigger to the degradation mode. Wear caused by running (bearings, tooling, filters, belts) suits usage triggers. Degradation caused by time and environment (perishing seals, corrosion, calibration drift) suits calendar triggers, because it happens whether the machine runs or not. Most plants end up with a mix. The practical constraint is data: usage triggers need counters or run-time readings, so start them on the machines where you can get those numbers reliably and keep the rest on the calendar until you can.
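
The difference between the two trigger types is easy to see in a small sketch. The class names, intervals and meter readings below are hypothetical; the point is only that a calendar trigger compares dates while a usage trigger compares counter readings.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class CalendarTrigger:
    interval_days: int
    last_done: date

    def is_due(self, today: date) -> bool:
        # Fires on elapsed time, regardless of how hard the machine worked.
        return today >= self.last_done + timedelta(days=self.interval_days)


@dataclass
class UsageTrigger:
    interval_units: float            # run hours, cycles, strokes or units
    reading_at_last_service: float   # counter value when the task was last done

    def is_due(self, current_reading: float) -> bool:
        # Fires on work done since the last service.
        return current_reading - self.reading_at_last_service >= self.interval_units


today = date(2024, 1, 31)
monthly = CalendarTrigger(interval_days=30, last_done=date(2024, 1, 1))
every_500_hours = UsageTrigger(interval_units=500, reading_at_last_service=10_000)

# Two identical presses: one ran flat out, one sat almost idle.
busy_press_hours, idle_press_hours = 10_620, 10_040

print("calendar trigger due:", monthly.is_due(today))
print("busy press due on usage:", every_500_hours.is_due(busy_press_hours))
print("idle press due on usage:", every_500_hours.is_due(idle_press_hours))
```

The calendar trigger treats both presses the same, while the usage trigger calls only the busy press in for service.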

Step 4: Write PM tasks people can actually follow

'Check conveyor' is not a PM task, it is a wish. A usable task tells the technician what to do, where, how, and what good looks like, plus what to record and what to raise if the check fails. Write for the newest person on the crew, not the veteran who could do it blindfolded. Photos of the correct condition, the specified grease and quantity, and the torque figure belong in the task, not in someone's head.

Standardise the format across every asset so a technician moving between machines never has to decode a new layout. The findings field matters more than it looks: a PM system that only records 'done' teaches you nothing, while one that records what was found becomes the evidence base for step five.

One action per line: 'inspect, clean, lubricate' hides three jobs; split them so nothing gets skipped silently.
A measurable pass/fail: state the acceptable range in numbers, not adjectives; a deflection figure beats 'check tension'.
Safety and access built in: lockout points, guards to remove and permits required, so the safe way is the written way.
Parts, tools and time: list the consumables and the expected duration, so planning is realistic and kitting is possible.
A findings field: every task should ask 'what did you find?', because those answers are the data you will prune with later.
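
One way to enforce that checklist is to make it the shape of the task record itself. The sketch below is a minimal, assumed data model (PMStep, PMTask and record_step are illustrative names, and the belt deflection limits are made up): each step holds one action and a numeric range, and a result cannot be saved without findings.

```python
from dataclasses import dataclass, field


@dataclass
class PMStep:
    action: str     # one action only, e.g. "Measure belt deflection at mid-span"
    unit: str
    min_ok: float   # acceptable range in numbers, not adjectives
    max_ok: float

    def passes(self, measured: float) -> bool:
        return self.min_ok <= measured <= self.max_ok


@dataclass
class PMTask:
    asset: str
    lockout_points: list[str]     # safety and access, written into the task
    parts_and_tools: list[str]    # so kitting is possible
    expected_minutes: int         # so planning is realistic
    steps: list[PMStep] = field(default_factory=list)


def record_step(step: PMStep, measured: float, findings: str) -> dict:
    """Build one result row; refuse a bare 'done' with no findings."""
    if not findings.strip():
        raise ValueError("findings are required, even if it is 'nothing abnormal'")
    return {
        "action": step.action,
        "measured": f"{measured} {step.unit}",
        "result": "pass" if step.passes(measured) else "fail",
        "findings": findings,
    }


# Illustrative limits only; take real figures from the manual or your own standards.
tension = PMStep("Measure belt deflection at mid-span", "mm", 10.0, 15.0)
task = PMTask("Conveyor 3", ["Isolator Q1"], ["Deflection gauge"], 20, [tension])
row = record_step(tension, 18.0, "Belt slack, edge fraying on drive side")
print(row)
```

A spreadsheet with the same columns does the same job; what matters is that the range and the findings are mandatory fields rather than optional notes.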

Step 5: Avoid the compliance trap, then review and prune

PM compliance, the share of scheduled tasks completed on time, is the most gamed number in maintenance. It measures effort, not effect. A team can hit 100% compliance on a list full of low-value inspections while the machines that actually stop production keep stopping production. If compliance is high and unplanned downtime is not falling, the discipline is fine and the task list is wrong. Judge the program on outcomes: unplanned downtime on critical assets, repeat failures, and MTBF trending the right way. Compliance is a health check on execution, nothing more.
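
The arithmetic is simple enough to sketch. The snippet below computes compliance as on-time tasks over scheduled tasks and flags the pattern described above: high compliance with unplanned downtime that is not falling. The function names, the 90% threshold and the sample figures are assumptions for illustration, and comparing only the first and last period is a deliberately crude trend test.

```python
def pm_compliance(scheduled: int, completed_on_time: int) -> float:
    """Share of scheduled PM tasks completed on time, as a percentage."""
    if scheduled <= 0:
        raise ValueError("no scheduled tasks in the period")
    return 100.0 * completed_on_time / scheduled


def task_list_needs_rework(
    compliance_pct: float,
    unplanned_downtime_hours: list[float],   # one value per period, oldest first
    high_compliance_pct: float = 90.0,       # assumed threshold, not a standard
) -> bool:
    """True when execution looks fine but downtime is not falling."""
    first, last = unplanned_downtime_hours[0], unplanned_downtime_hours[-1]
    return compliance_pct >= high_compliance_pct and last >= first


# Hypothetical quarter: 114 of 120 tasks on time, downtime flat to rising.
compliance = pm_compliance(scheduled=120, completed_on_time=114)
print(f"PM compliance: {compliance:.1f}%")
print("Rework the task list:", task_list_needs_rework(compliance, [42.0, 40.0, 44.0]))
```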

That is why review and pruning are part of the program, not an afterthought. At least once a year, sit down with the findings history and interrogate every task: has this check ever found anything, has the failure it guards against ever actually occurred here, and did the last few services change anything measurable? Stretch the intervals on tasks that never find problems, tighten the ones that keep catching things late, and delete the ones nobody can justify. Every useless PM you remove hands wrench time back to the assets that need it, and a short list done properly beats a long list done on paper.
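
The three review questions can be applied as a first-pass filter before the humans argue over the results. This is a sketch under stated assumptions: the TaskHistory fields, the review function and the minimum of six completed runs are all illustrative, and the output is a prompt for discussion rather than a decision.

```python
from dataclasses import dataclass


@dataclass
class TaskHistory:
    name: str
    times_done: int
    times_found_problem: int       # from the findings field
    failures_despite_pm: int       # guarded failure still happened: caught too late
    failure_mode_seen_here: bool   # has this failure ever actually occurred on site?


def review(history: TaskHistory, min_runs: int = 6) -> str:
    """Suggest 'keep', 'stretch', 'tighten' or 'delete' for one PM task."""
    if history.times_done < min_runs:
        return "keep"       # not enough evidence yet (min_runs is an assumption)
    if history.failures_despite_pm > 0:
        return "tighten"    # the check is arriving after the damage
    if history.times_found_problem == 0 and not history.failure_mode_seen_here:
        return "delete"     # never finds anything, guards a failure never seen here
    if history.times_found_problem == 0:
        return "stretch"    # real failure mode, but the interval looks too short
    return "keep"


# Hypothetical findings history for four tasks.
tasks = [
    TaskHistory("Gearbox oil sample", 12, 3, 0, True),
    TaskHistory("Guard bolt visual check", 12, 0, 0, False),
    TaskHistory("Drive belt tension", 12, 0, 0, True),
    TaskHistory("Spindle bearing grease", 12, 2, 2, True),
]
for t in tasks:
    print(f"{t.name}: {review(t)}")
```

Safety-critical or legally required checks should bypass a filter like this; a statutory inspection that never finds anything is still mandatory.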

The whole loop runs on knowing what your machines actually did: real run hours for usage triggers, honest stop records for the failure history, and downtime numbers you can trust when you judge the results. Gathering that by hand is possible, but it is the first thing that slips when the plant gets busy. Fabrico, a partner we recommend, reads stops directly from the machines and routes the resulting work orders to the right technician, which closes exactly that gap. The tools here are free regardless.

Put numbers on it

Size the prize with the free OEE and downtime calculators.

FAQ

How many PM tasks should a new program start with?

Fewer than you think. Cover the A-critical assets properly first: the machines whose failure stops the line, creates a safety risk, or has long spare-part lead times. It is better to run a short list well and expand it from findings than to launch hundreds of tasks nobody has time to complete.

Should PMs be time-based or usage-based?

Match the trigger to the wear mode. Use usage-based triggers (run hours, cycles, units) where wear comes from running, and calendar triggers where degradation happens over time regardless of use, such as corrosion or calibration drift. Most plants run a mix, limited mainly by which machines can report usage data.

What does PM compliance actually tell you?

Only whether scheduled work got done on time. It says nothing about whether the schedule contains the right work. Track compliance alongside unplanned downtime and repeat failures: high compliance with flat downtime means the task list needs rework, not the technicians.

How often should PM intervals be reviewed?

At least annually, and whenever an asset's duty changes. Use the findings history: stretch tasks that never find anything, tighten tasks that keep catching problems late, and delete tasks nobody can justify against a real failure mode.