LET’S TALK ABOUT OEE

Francisco Requena Alcaraz

OEE, TPM and the Digital Factory: From Equipment Losses to Strategic Decision-Making

For years, many factories have talked about availability, breakdowns, productivity, quality, and efficiency as if they were separate conversations.

Production talked about units built.
Maintenance talked about downtime.
Quality talked about defects.
Engineering talked about theoretical cycle times.
And management usually asked a much simpler question:

Why are we not producing what we should be producing?

That is where OEE starts to make sense.

OEE, or Overall Equipment Effectiveness, measures the overall effectiveness of a piece of equipment, a station, or a line. The formula is well known:

OEE = Availability × Performance × Quality

But the important part is not the formula. The important part is what the formula forces us to look at.

Availability asks a simple question:

Was the equipment available when it was supposed to produce?

Performance asks another one:

When the equipment was producing, did it run at the expected speed?

And Quality asks the third:

What the equipment produced, was it right the first time?

That combination turns OEE into much more than a percentage. It turns it into a structured way of seeing industrial losses.

In an automotive assembly line, this is especially powerful. Losses do not always appear as spectacular breakdowns. Sometimes the loss is five seconds of micro-stoppage repeated hundreds of times during a shift. Sometimes it is a robot moving slightly slower than expected. Sometimes it is a filling station repeating a check. Sometimes it is a tightening system rejecting operations because torque and angle curves are out of tolerance. Sometimes it is an automatic station that produces, yes, but with too many adjustments, too many waits, and too many small human interventions.

OEE gives a name to all of that.

And that is its greatest strength: it makes visible the losses that used to remain hidden behind excuses, impressions, and comfortable averages.

However, one thing must be said clearly from the beginning: OEE does not improve anything by itself.

A KPI does not tighten bolts.
A KPI does not clean sensors.
A KPI does not adjust robots.
A KPI does not train operators.
A KPI does not remove root causes.

OEE only shows where effectiveness is being lost. After that, a working system is needed to act on those losses.

That is why OEE makes so much sense within TPM, or Total Productive Maintenance. TPM is not just about maintenance. It is about maximizing equipment effectiveness by involving production, maintenance, quality, engineering, logistics, and leadership.

OEE is the thermometer.
TPM is the system that acts on the fever.

OEE in automated automotive stations

OEE is truly understood when it stops being a formula and starts smelling like the plant.

Think about a robotic windshield bonding and fitting station. From the outside, it may look like a highly controlled process: the robot applies adhesive, the glass is positioned, the operation is verified, and the vehicle moves forward. But inside that cycle, many losses may be hidden.

There can be downtime caused by nozzle blockage, pressure alarms, dosing failures, communication loss with the robot, vision system problems, poor glass position detection, or traceability issues with the adhesive batch. All of that affects availability.

There can also be performance losses if the robot follows a slower path than expected, if the cycle includes position corrections, if checks are repeated, or if speed has been reduced to avoid application defects.

And there can be quality losses if the adhesive bead is not continuous, if there is surface contamination, if the glass is not correctly positioned, or if water-tightness problems appear later.

The same happens in a cockpit bonding and insertion station. Losses may come from manipulator misalignment, variant errors, presence sensors not detecting correctly, interferences during insertion, centering problems, cosmetic damage, connectors not properly positioned, or rework.

A station can technically be “running” and still destroy OEE through small adjustments, cycle repetitions, and manual corrections.

In liquid filling stations, the situation is equally interesting. Losses may come from pumps, valves, flowmeters, level sensors, purging issues, false alarms, leaks, bubbles, or model identification errors. If filling is slowed down to avoid defects, performance drops. If tests must be repeated or levels corrected, quality drops. If the station is blocked by recurring alarms, availability drops.

In the powertrain and body marriage station, losses are usually especially critical. These are complex operations involving synchronization between body, lifts, conveyors, AGVs, guiding tooling, and fastening elements. A small misalignment can stop the station. A cycle repetition can compromise takt time. An interference can create damage. And a traceability error can block the unit.

In automatic tightening stations, OEE is also very revealing. A spindle failure, an unstable screw feeder, a worn nozzle, a torque and angle curve out of tolerance, a crossed thread, a tightening repetition, or a poor association between operation and vehicle can generate availability, performance, and quality losses at the same time.

The key is to understand that, in an automated line, losses rarely belong to one clean category.

A breakdown can end up creating rework.
A quality issue can create availability loss.
A speed reduction can hide an unresolved technical problem.
A repetitive micro-stoppage can be more damaging than a long but isolated breakdown.

That is why OEE is so useful: it forces us to break down the loss.

It is not enough to say:

The station is not performing well.

We need to know whether it loses because it stops, because it runs slowly, or because it produces defects.

And then we must go one level deeper:

Why does it stop?
Why does it run slowly?
Why does it produce defects?
Since when?
On which shift?
With which model?
With which variant?
With which operator?
After which intervention?
With which supplier?
With which material?
With which parameter?

That is where the real analysis begins.

In automotive manufacturing, there is another difficulty: many losses become normalized. The station “has always done that.” The sensor “always causes some trouble.” The robot “always needs that adjustment.” The filling station “is always checked twice.” The tightening system “always rejects some curves.”

And when a loss becomes normal, it stops being seen as a loss.

A well-used OEE breaks that normalization.

It shows that five seconds repeated a thousand times are lost capacity. That rework accepted as routine is lost quality. That small wait before each cycle is lost performance. That manual intervention everyone considers normal is a sign that the process is not robust enough.

In automated stations, OEE has an almost therapeutic effect: it forces the organization to stop living with its losses as if they were part of the natural landscape.

The OEE trap: if the data is poor, the decision will be poor too

Once we decide to measure OEE, a very common trap appears:

Believing that having a number means having the truth.

It does not.

An OEE can have two decimals, appear on a beautiful dashboard, and update in real time. But if the assumptions are poorly defined, if causes are misclassified, or if data is captured inconsistently, that OEE does not help management.

It only professionalizes confusion.

Before measuring, we must agree on what measuring means.

The first decision is the scope. Are we measuring a station, a cell, a line, an area, or a group of linked equipment? This is not a minor detail. If a station stops because the previous one does not deliver bodies, is that loss assigned to the station or considered an external blockage? If it waits because the next station is saturated, is it its own loss or a flow loss?

Without a clear definition, every area will defend a different interpretation.

The second decision is the planned production time. What is included in the calculation and what is excluded? Breaks, meetings, preventive maintenance, model changes, cleaning, engineering trials, lack of schedule, lack of material, safety stops, start-up adjustments. All of this must be defined upfront.

Otherwise, OEE becomes an elastic indicator that changes depending on convenience.

The third decision is the ideal cycle time. This is one of the most delicate. If we use an impossible theoretical cycle, we unfairly punish the team. If we use a cycle that is too comfortable, we hide performance losses.

The ideal cycle must be technical, achievable, documented, and governed. It cannot change every week to make the indicator look better, but it also cannot remain frozen if the process has changed.

The fourth decision is what we mean by a good unit. In automotive manufacturing, this is critical. A unit that is right the first time should not be treated the same as a unit that needs rework, reinspection, or containment.

If everything that eventually becomes conforming is considered “good,” we are hiding quality losses.

The fifth decision is the loss taxonomy. In other words, the cause catalogue. If there are too few categories, the analysis will be poor. If there are too many, people will not use them properly.

Balance is essential.

And there is a clear warning sign: when the “other” category grows too much, the system is not explaining reality.

We must also decide which data is captured automatically and which is entered manually. PLC signals, machine states, cycles, alarms, tightening curves, or traceability events can be captured automatically. But some causes require human input. And that is where the risk begins.

If an operator must classify a stoppage under pressure, with the line waiting and twenty unclear codes available, they will choose the fastest option. Not necessarily the most accurate one.

If a supervisor knows that some codes penalize their area more than others, defensive classification may appear.

If maintenance and production do not share the same criteria, the same stoppage will be classified differently depending on who records it.

And if data is used to blame, the data will be contaminated.

This is one of the great truths of OEE:

Data quality depends as much on technical architecture as on organizational trust.

We can have excellent automatic capture, but if the rules are unclear, the indicator will be debatable. We can have a strong improvement culture, but if systems are not connected, the analysis will be slow and incomplete.

OEE requires both: technology and governance.

That is why data review routines are so important. It is not enough to publish the indicator. We must review whether causes make sense, whether losses are assigned correctly, whether micro-stoppages are captured, whether times match, whether systems are synchronized, and whether teams trust the information.

The question should not only be:

What was the OEE?

It should also be:

Do we trust the data?
Does it explain reality?
Does it support decisions?
Does it help us act?

If the answer is no, the problem is not only in the machine. It is in the measurement system.

A badly measured OEE can generate wrong decisions: investing in the wrong place, blaming the wrong area, prioritizing minor losses, ignoring chronic problems, or launching actions that do not attack the real cause.

That is why, before becoming obsessed with improving the percentage, we must improve the quality of the indicator.

OEE is not valuable because it is exact to the second decimal. It is valuable when it helps make better decisions.

OEE and TPM: the indicator improves nothing without a working system

Measuring losses does not eliminate losses.

A factory may have a well-calculated OEE, updated dashboards, automatic alarms, and daily meetings. But if there is no real discipline to act on the causes, the indicator becomes a ceremony.

And ceremonies alone do not improve availability, performance, or quality.

This is where TPM becomes essential.

Sometimes TPM is interpreted as “maintenance done by production” or as a collection of cleaning routines, tags, checklists, and visual activities. That view is too narrow. TPM, properly understood, is a management system to maximize equipment effectiveness by involving the whole organization.

That is why OEE and TPM fit so well together.

OEE shows where we lose.
TPM organizes how we act.

If OEE shows availability losses, TPM should help us understand whether the problem comes from repetitive breakdowns, insufficient preventive maintenance, critical spare parts, deteriorated basic conditions, weak autonomous inspection, obsolescence, poor accessibility, technical capability gaps, or design weaknesses.

If OEE shows performance losses, TPM should lead us to analyze micro-stoppages, reduced speed, actual cycle versus standard cycle, small adjustments, waiting time, blockages, unstable sensors, recovery modes, PLC sequences, robot paths, or part feeding issues.

If OEE shows quality losses, TPM should connect with process defects, rework, basic conditions, critical parameters, repeatability, traceability, poka-yokes, tooling validation, and training.

TPM turns OEE into structured work.

For example, if an automatic tightening station loses OEE because of repeated torque and angle curve rejects, it is not enough to say:

Tightening is not working well.

We need to analyze the spindle, nozzle, screw feeding, joint geometry, part condition, tightening strategy, traceability, maintenance condition, and acceptance criteria.

If a windshield bonding station loses availability because of pressure alarms, we need to review adhesive conditions, temperature, nozzles, pumps, cleaning, preventive maintenance, calibration, sensors, and intervention standards.

If a filling station loses performance because flow has been reduced to avoid defects, we need to ask whether this is a temporary measure that became permanent, whether the process can be optimized, or whether the current technology limits capacity.

That is the key: TPM must prevent the factory from getting used to living with eternal temporary solutions.

OEE also helps prioritize. Not all problems deserve the same energy. A TPM team must attack the losses with the greatest impact, not necessarily the ones that create the most noise. Sometimes the most visible breakdown is not the biggest loss. Sometimes the real enemy is small, repeated, normalized micro-stoppages.

That is why Pareto analysis is so important. Not to create a nice chart, but to decide where to act first.

However, TPM cannot work if OEE is used as a whip. If every review becomes a search for someone to blame, teams will become defensive. And when people defend themselves, they stop learning.

Leadership must protect one basic idea:

The problem is not who is guilty of the loss. The problem is which part of the system allows the loss to keep happening.

That change of question is fundamental.

It is also important to connect OEE with daily routines. The indicator cannot be reviewed only at the end of the month. It must be part of daily management: shift meetings, area meetings, loss analysis, action follow-up, problem escalation, and result review.

A good routine should answer simple questions:

What was yesterday’s main loss?
What cause explains it?
What action is open?
Who is responsible?
When will the result be checked?
Has the action reduced the loss, or has it only created activity?

The last question is uncomfortable, but necessary.

Continuous improvement is not about doing many actions. It is about eliminating losses.

A mature OEE implementation within TPM is not measured only by having data. It is measured by the organization’s ability to turn that data into decisions, actions, and learning.

OEE without TPM remains a diagnosis.
TPM without OEE can become scattered activity.
Together, they can become a serious system for improving industrial effectiveness.

OEE, OPEX and CAPEX: when to repair, when to improve, and when to invest

There is one interpretation of OEE that is often underused in factories: its value for economic decision-making.

OEE is not only useful for asking:

What loss do we have?

It can also help answer a much more strategic question:

Does it make sense to keep spending OPEX on this asset, or should we propose a CAPEX investment?

This conversation is especially important in automotive assembly lines, where new equipment coexists with old installations, modified robots, adapted tooling, obsolete vision systems, aging tightening stations, dosing systems requiring too many interventions, and temporary solutions that have been operating for years as if they were permanent.

At first, when an asset loses effectiveness, the right approach is usually TPM. Restore basic conditions, improve preventive maintenance, eliminate micro-stoppages, train teams, review standards, analyze root causes, and stabilize the process.

But at some point, the question changes.

If a station maintains low OEE for months, if losses are recurrent, if maintenance costs are increasing, if spare parts are difficult to obtain, if technology limits cycle time, if quality requires continuous rework, and if the asset is not ready for future models, maybe we are no longer facing a continuous improvement problem.

Maybe we are facing an investment decision.

Think about a robotic windshield bonding station. If losses come from repeated dosing failures, pressure alarms, adhesive bead variability, frequent cleaning, and limited traceability, the real cost is not only maintenance. It is downtime, rework, wasted material, quality risk, technical support hours, and lost capacity.

Or think about a cockpit insertion station. If the system requires constant adjustments, creates interferences, has positioning problems, and does not handle new product variants well, the question is not only how to maintain it. The question is whether it remains a valid solution for the future industrial strategy.

The same applies to automatic tightening stations. A system may still run, but run poorly: problematic spindles, unstable feeders, rejected curves, repeated cycles, weak traceability, and permanent technical support. On the surface, the station keeps producing. In reality, it is consuming OPEX and generating risk.

Here, a well-measured OEE becomes evidence.

It helps quantify how much time is lost, what part of the loss is recurrent, what approximate cost it has, what impact it creates on quality, which models are affected, and whether TPM actions are actually reducing the problem or only containing it.

But we must be careful: low OEE does not automatically justify CAPEX.

Sometimes a new machine is requested to solve problems that are actually management problems: lack of standards, weak maintenance, poor parameterization, insufficient training, poorly managed spare parts, or lack of daily discipline.

In those cases, buying a new asset only transfers disorder to a more expensive installation.

OEE should help avoid two mistakes.

The first is continuing to spend OPEX indefinitely on an asset that has no real future.

The second is requesting CAPEX to hide problems that should be solved through better management.

A mature decision appears when we combine OEE with other elements: total cost of ownership, asset criticality, obsolescence, safety, quality, customer risk, spare parts availability, energy consumption, flexibility, future capacity, cybersecurity, digital integration, and alignment with plant strategy.

From this perspective, OEE stops being only an operational KPI and becomes an asset management tool.

It helps decide whether to repair, redesign, improve, renew, add capacity, automate differently, or retire equipment.

It also helps build stronger business cases. Not based on generic statements such as “the machine causes many problems,” but on data: availability losses, performance losses, quality losses, rework cost, intervention hours, failure recurrence, throughput impact, and future risk.

That changes the conversation with leadership.

A weak argument says:

We need to invest because the station performs badly.

A stronger argument says:

We are losing capacity, quality, and cost due to structural causes; we have applied TPM actions; we understand the asset’s limits; and the investment is justified because it reduces current losses and prepares the plant for future needs.

That is a much more mature conversation.

OEE should have a stronger role in OPEX/CAPEX decisions. Not as the only criterion, but as a central piece of evidence.

Because a competitive factory must not only know how to maintain its assets. It must also know when to stop maintaining what no longer makes sense to maintain.

From manual OEE to digital OEE

For OEE to have strategic value, it cannot live in paper, Excel, or isolated systems. It needs a connected architecture from the shopfloor to the ERP.

Data cannot arrive late, broken, and disputed.
It must flow.

For a long time, OEE has been calculated using manual reports, shift sheets, paper records, intermediate Excel files, and meetings where people tried to reconstruct what had happened. That approach may be useful as a starting point. But it cannot be the foundation of a modern factory.

An automated automotive assembly line generates a huge amount of data: PLC states, alarms, cycles, micro-stoppages, tightening curves, robot parameters, dosing data, vision system results, flows, pressures, temperatures, vehicle traceability, rejects, rework, maintenance interventions, and quality results.

If all of that ends up summarized in a manual report or an Excel sheet, we are losing value.

A digital factory is not about filling the plant with screens. It is about making information flow from the shopfloor to management without breaking, without paper, without islands, and without constant reinterpretation.

The first idea is clear: no paper.

Paper may exist as a temporary support, but not as the nervous system of the factory. If a stoppage is written on paper, transcribed later, consolidated in Excel, and presented days after, the data arrives late and degraded.

The second idea is that the whole stack must be connected:

shopfloor, edge, SCADA/HMI, MES, CMMS, quality systems, ERP, and analytics layer.

This is not about connecting everything because it sounds modern. It is about avoiding a situation where every system tells a different version of reality.

At the base is the shopfloor: robots, PLCs, sensors, cameras, tightening systems, dosing systems, flowmeters, automated stations, assisted manual stations, and industrial devices. That is where data is born.

Above that sits the edge layer, which captures, filters, and contextualizes data close to the machine. This layer is key for micro-stoppages, fast signals, frequent alarms, process curves, and events that need structure before being sent to higher-level systems.

The SCADA/HMI layer allows teams to supervise and interact with the installation. It is essential for operation and maintenance, but it should not be the only place where information lives.

The MES connects real production with operations management: orders, sequence, traceability, cycles, confirmations, consumption, production states, and manufacturing events.

The CMMS manages breakdowns, preventive maintenance, spare parts, work orders, intervention times, and asset history.

Quality systems provide defects, rework, audits, containment actions, releases, test results, and root cause analysis.

The ERP provides economic and planning context: costs, materials, inventory, purchasing, planning, bill of materials, and financial data.

Finally, the analytics layer enables dashboards, advanced analysis, machine learning, executive reporting, simulations, and decision-support systems.

When this architecture works, OEE stops being an isolated number. We can see not only that a station lost availability, but how much it cost, which model was affected, which spare part was consumed, which intervention was performed, which defect appeared later, and whether the problem is growing.

That is where the concept of Single Pane of Glass, or SPOG, appears: a platform that provides a centralized and coherent view of critical information sources.

It is not about having a beautiful screen.
It is about having one shared reality on which production, maintenance, quality, engineering, and leadership can make decisions.

Because one of the great enemies of the digital factory is the island.

The production island.
The maintenance island.
The quality island.
The engineering island.
The Excel island.
The “my data is the right data” island.

The goal is not for everyone to look at exactly the same chart. The goal is for everyone to discuss the same truth.

This is also where IT/OT convergence comes in.

IT and OT come from different worlds. IT thinks about security, standardization, scalability, and governance. OT thinks about production continuity, robustness, cycle time, machines, and urgency.

Neither world is unnecessary. Both must converge.

If IT designs without understanding the plant, it will build elegant but impractical solutions.
If OT acts without IT, it will create fast solutions that are difficult to scale and protect.

The digital factory needs both languages.

There is also room for citizen developers and technology democratization. Low-code/no-code tools can allow people close to the process to build small solutions adapted to real needs. But with governance. Without governance, democratization becomes a new source of islands.

And finally, there is artificial intelligence.

A well-connected digital OEE can feed prediction models, anomaly detection, cause analysis, loss prioritization, and action recommendations. But it is worth remembering:

AI does not fix poor data architecture. It amplifies it.

If the data is poor, AI will only automate confusion.

That is why digital OEE must be understood as part of a common strategy. It is not an end. It is a means to achieve higher productivity, better quality, lower cost, lower risk, better use of capital, and better decisions.

A factory is not more digital because it has more screens.
It is more digital if it makes better decisions with better data.

Final reflection

OEE starts with a simple formula:

Availability × Performance × Quality

But when properly implemented, it connects much more than that.

It connects automated stations, TPM, data quality, leadership, OPEX, CAPEX, digital architecture, IT/OT convergence, machine learning, and strategy.

In the end, OEE is not only about measuring equipment.

It is about building a factory capable of seeing its losses, understanding them, prioritizing them, and acting on them.

It is about knowing when a problem requires maintenance, when it requires process discipline, when it requires engineering, when it requires investment, and when it requires a different management conversation.

The real value of OEE is not the number.

The real value is the quality of the decisions that the number makes possible.

And in an increasingly complex industrial environment, that is no longer optional.

It is a management necessity.