The Intelligence Gap in Industrial Process Control

Industrial automation has solved the execution problem, and it has solved it well.

PLCs execute deterministic control logic at millisecond scan rates. DCS platforms provide supervisory control with certified reliability. SCADA systems collect and visualise operational data across geographically distributed assets. Advanced Process Control — particularly Model Predictive Control — optimises steady-state operation across interacting process variables within individual process units. Safety Instrumented Systems provide independent, certified protection against catastrophic failures.

These represent substantial engineering achievement. They are not the problem.

The problem is that execution has matured and cross-boundary reasoning has not. No layer in the current automation stack can reason across unit boundaries, coordinate across organisational functions, adapt during the non-steady-state transitions when intelligence is most needed, or preserve the tacit operational knowledge that experienced operators carry and that no existing system captures.

This gap is structural. It cannot be closed by incremental improvements to any single layer of the existing stack.

The ISA-95 Boundary Problem

The ISA-95 enterprise-control hierarchy places four functional levels above the physical process itself (Level 0), each with distinct data scope and decision authority.

Level 4 (Business Planning and Logistics) holds business objectives, supply chain constraints, and market information. It has no visibility into real-time process conditions.

Level 3 (Manufacturing Operations Management) manages production schedules, maintenance plans, and quality records. It does not track process variables in real time.

Level 2 (Supervisory Control) monitors and controls the process through SCADA and DCS. It does not have access to maintenance schedules, market conditions, or business objectives.

Level 1 (Basic Control) executes regulatory control deterministically. It cannot reason about context at any higher level.

Each level operates within its defined scope. No system spans the boundaries between them with reasoning capability. The humans who see across boundaries — typically shift supervisors and operations managers — coordinate through periodic meetings and ad hoc communication. Conditions change continuously. The coordination cadence does not.

Nine Structural Gaps

1. APC Model Degradation

Model Predictive Control delivers measurable economic value when properly commissioned and maintained. The structural limitation is in the “maintained” qualifier.

Peer-reviewed control performance literature documents that a substantial fraction of MPC installations operate below design capability or are effectively inoperative in practice. Desborough and Miller (2002), reporting on large-scale industrial experience at Honeywell, found that significant portions of deployed MPC applications were constrained, clamped, or operating in degraded modes. Darby and Nikolaou (2012) confirm that sustainable MPC performance depends on continuous model maintenance, a requirement that is frequently underestimated at commissioning.

The primary driver of degradation is model mismatch. MPC encodes process behaviour at commissioning time. The process keeps moving: feed composition changes, equipment ages, catalyst deactivates, seasonal conditions shift. Model re-identification requires specialist control engineering knowledge that is increasingly scarce, creating a maintenance burden that compounds with workforce attrition. Industry estimates suggest 30-50% of installed APC controllers are in manual or degraded mode at any given time — not because the technology failed, but because the model maintenance burden exceeded available engineering capacity.
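One way to make model mismatch visible before a controller ends up in manual is to track how far the model's predictions drift from plant measurements. The sketch below is illustrative only: the function names, the default window length, and the idea of normalising against a prediction error recorded at commissioning are assumptions made for this example, not a method taken from the cited literature.

```python
import math
from typing import Sequence


def rms(values: Sequence[float]) -> float:
    """Root-mean-square of a non-empty sequence of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def mismatch_ratio(
    predicted: Sequence[float],
    measured: Sequence[float],
    baseline_rms: float,
    window: int = 60,
) -> float:
    """Recent prediction-error RMS divided by the RMS seen at commissioning.

    `baseline_rms` and `window` are hypothetical tuning inputs.
    """
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need equal-length, non-empty series")
    if baseline_rms <= 0 or window < 1:
        raise ValueError("baseline_rms and window must be positive")
    # Residual = what the plant did minus what the model expected.
    residuals = [m - p for p, m in zip(predicted, measured)]
    return rms(residuals[-window:]) / baseline_rms
```

A ratio near 1 would suggest the model still predicts about as well as it did at commissioning. A sustained rise is a prompt to consider re-identification. The level at which to act is site-specific and is not implied by this sketch.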

APC also assumes well-tuned base-layer control. Systematic studies of control loop performance in large process facilities report that significant fractions of PID loops demonstrate excessive, correctable process variability. If the regulatory control layer is oscillating or sluggish, the APC layer cannot optimise over it. This prerequisite is frequently unmet, compounding the maintenance challenge.

2. Unit-Level Scope with No Cross-Boundary Reasoning

APC optimises within a single process unit. The optimisation opportunity in most facilities lies across units.

A distillation column APC does not know what the upstream feed preparation unit is doing, what the downstream fractionation unit needs, or what the market requires from the product slate today. It has no knowledge of scheduled maintenance that will change hydraulics next week, no access to lab results that reveal actual product quality versus inferential estimates, and no connection to commercial signals that should shift the optimisation objective.

The coordination across these boundaries is handled by humans in a morning production meeting. That meeting happens once a day. Conditions change continuously.

3. Failure During Transitions When Intelligence Is Most Needed

MPC is designed for continuous steady-state operation. It handles gradual changes well. It handles discrete events poorly.

The U.S. Chemical Safety Board has documented that startup and shutdown periods concentrate hazardous conditions, citing multiple investigations where incidents occurred during non-routine operations. Industry estimates suggest that 20-30% of operating time involves non-steady-state conditions — precisely the periods when APC controllers are typically switched to manual because the process is far from the steady-state region where the linear model is valid.

During these periods, experienced operators apply heuristics developed over years of direct experience: sequences for warming up equipment, rules of thumb for managing grade transitions, judgement calls about when conditions are stable enough to return to automatic control. This operational knowledge exists almost entirely in human heads. It is not encoded in any control system, documented in any procedure manual with sufficient fidelity, or captured by any historian.
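The judgement about when conditions are stable enough to return to automatic control can be partly expressed as a check on recent measurements. The following is a minimal sketch, assuming a single process variable sampled over a recent window. The name `is_steady` and both limits are hypothetical, and real operator judgement weighs far more context than this.

```python
from statistics import pstdev
from typing import Sequence


def is_steady(values: Sequence[float], max_std: float, max_drift: float) -> bool:
    """True when spread and net drift over the window are both within limits.

    `max_std` bounds the scatter of the samples. `max_drift` bounds the
    difference between the last and first sample, which catches a slow ramp
    that a scatter check alone could miss.
    """
    if len(values) < 2:
        raise ValueError("need at least two samples")
    drift = abs(values[-1] - values[0])
    return pstdev(values) <= max_std and drift <= max_drift


# A temperature that has settled around 150 passes; a ramp from 120 does not.
settled = is_steady([150.0, 150.2, 149.9, 150.1], max_std=0.5, max_drift=0.5)
ramping = is_steady([120.0, 130.0, 140.0, 150.0], max_std=0.5, max_drift=0.5)
```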

4. Operational Opacity and Operator Trust

An MPC controller moves setpoints because the optimisation algorithm determines that the action minimises the objective function subject to constraints, given current model predictions. This is mathematically sound and operationally opaque.

Operators do not trust what they cannot understand. When conditions become unusual, operators who cannot follow the controller’s reasoning switch it to manual — not because the controller is wrong, but because they cannot verify that it is right. This is a primary driver of APC controllers being deactivated. The technology works. The human interface to that technology does not provide the explanatory transparency required for sustained operational trust.

5. No Learning from Operator Interventions

When an operator overrides an APC recommendation, that override contains valuable information. The operator knows something the model does not: a heuristic about how this specific column behaves under these specific conditions, a pattern recognised from a similar upset three years ago, a mechanical limitation observed but never formally documented.

APC systems have no mechanism to learn from overrides. The information is lost. Multiply this across thousands of overrides per year across an operating facility, and the accumulated operational wisdom evaporates.
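Retaining that information starts with recording each override together with the conditions under which it was made. The sketch below shows one possible record shape. Every field name, the controller tag, and the `summarise` helper are assumptions made for illustration, not part of any APC product.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Iterable


@dataclass(frozen=True)
class OverrideRecord:
    controller: str              # hypothetical controller identifier
    recommended: float           # value the controller wanted
    applied: float               # value the operator actually set
    context: Dict[str, float]    # process snapshot at the time of the override
    reason: str = ""             # free-text rationale, if the operator gives one
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def deviation(self) -> float:
        """How far the operator moved away from the recommendation."""
        return self.applied - self.recommended


def summarise(records: Iterable[OverrideRecord]) -> Dict[str, int]:
    """Count overrides per controller to show where trust is breaking down."""
    return dict(Counter(r.controller for r in records))


record = OverrideRecord(
    controller="C-101-MPC",
    recommended=152.0,
    applied=148.5,
    context={"feed_rate": 80.0},
    reason="reboiler fouling suspected",
)
```

Storing the process snapshot alongside the deviation is the design point: an override without its context cannot later be matched to similar conditions.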

6. Prohibitive Deployment Economics

A typical APC project for a single process unit costs $500K-$2M and takes 6-18 months. It requires process engineers for model development, control engineers for tuning, and vendor specialists for commissioning, followed by maintenance every 1-3 years.

The result: APC is deployed on the highest-value units and not deployed on the hundreds of smaller units that collectively represent significant optimisation opportunity. Most industrial plants operate at 70-85% of their theoretical optimum. APC captures some of this gap, but only on the units where it is deployed, only during steady-state operation, and only within the scope of its model. The rest is left on the table.

7. Alarm Flood Dynamics and Cognitive Overload

EEMUA Publication 191 and ISA-18.2 define a manageable alarm rate as approximately one alarm per ten minutes per operator under normal conditions. The flood threshold is generally recognised at ten or more alarms within a ten-minute window.
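The two figures quoted above can be turned into a simple rate check. This is a sketch under stated assumptions: timestamps are seconds for a single operator position, the window slides rather than being aligned to the clock, and the intermediate label "elevated" is invented here for rates between the two quoted figures.

```python
from typing import Iterable

WINDOW_S = 600      # ten-minute window
FLOOD_COUNT = 10    # ten or more alarms in the window counts as a flood


def peak_alarms_per_window(timestamps: Iterable[float], window_s: float = WINDOW_S) -> int:
    """Largest number of alarms falling inside any sliding window."""
    ts = sorted(timestamps)
    peak = 0
    start = 0
    for end, t in enumerate(ts):
        # Drop alarms that are a full window or more older than the current one.
        while t - ts[start] >= window_s:
            start += 1
        peak = max(peak, end - start + 1)
    return peak


def classify(timestamps: Iterable[float]) -> str:
    """Label an alarm history as 'manageable', 'elevated', or 'flood'."""
    peak = peak_alarms_per_window(timestamps)
    if peak >= FLOOD_COUNT:
        return "flood"
    if peak <= 1:
        return "manageable"
    return "elevated"
```

For example, three alarms spaced 700 seconds apart never share a window and are classed as manageable, while ten alarms within 100 seconds are classed as a flood.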

During abnormal situations, alarm rates routinely exceed these thresholds by an order of magnitude or more. Peer-reviewed studies of alarm flood dynamics report hundreds of alarms firing within the first minutes of a major process upset. At these rates, operators abandon systematic alarm response and revert to pattern recognition or selective attention, both of which degrade diagnostic accuracy.

The consequential alarm problem compounds this: a single root-cause event triggers cascading threshold breaches across downstream process variables. The operator must identify the root cause while managing multiple consequences simultaneously — a cognitive task that exceeds reliable human performance at sustained high alarm rates.

The Texas City Refinery incident (2005) and the Buncefield Oil Storage Depot incident (2005) both illustrate how alarm system deficiencies, instrumentation failures, and organisational factors can combine to produce conditions under which safe human decision-making is not reliably achievable. Alarm rationalisation is a human process trying to solve a machine-speed problem.

No system in the current automation stack can correlate alarms across systems, identify root causes in real time, suppress consequential alarms caused by a single upstream event, and present the operator with one actionable insight instead of dozens of symptoms.
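The suppression step can be illustrated with a small sketch. It assumes a causal map of which alarm tags sit directly upstream of which others is already available. That map, the tag names, and the function `root_causes` are all hypothetical, and building and maintaining such a map is the hard part in practice.

```python
from typing import Dict, Iterable, List, Set


def root_causes(active: Iterable[str], upstream_of: Dict[str, List[str]]) -> List[str]:
    """Return active alarms that have no active alarm anywhere upstream.

    `upstream_of` maps each tag to the tags directly upstream of it.
    Alarms with an active ancestor are treated as consequential.
    """
    active_set: Set[str] = set(active)

    def has_active_ancestor(tag: str, seen: Set[str]) -> bool:
        for parent in upstream_of.get(tag, ()):
            if parent in seen:
                continue  # guard against loops in the causal map
            seen.add(parent)
            if parent in active_set or has_active_ancestor(parent, seen):
                return True
        return False

    return sorted(t for t in active_set if not has_active_ancestor(t, {t}))


# Hypothetical map: a low feed flow drives a high temperature and a low level,
# and the high temperature in turn drives a high pressure.
causal_map = {
    "TI-201.HI": ["FI-101.LO"],
    "PI-305.HI": ["TI-201.HI"],
    "LI-410.LO": ["FI-101.LO"],
}
alarms = ["PI-305.HI", "TI-201.HI", "LI-410.LO", "FI-101.LO"]
roots = root_causes(alarms, causal_map)
```

With all four alarms active, only the feed flow alarm is reported as a root cause, and the other three would be presented as its consequences.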

8. The Workforce Knowledge Cliff

The demographic trajectory of the industrial workforce creates compounding urgency across every gap described above.

Industry workforce surveys consistently report that the average age of oil and gas sector workers is in the mid-50s. The APQC/ATD/Lightcast Knowledge Exodus survey projects that 52.4% of the current industrial products workforce will retire or leave within five years. Training a replacement to equivalent competency requires five to ten years of direct operational experience.

The operational consequences are already visible. Plants run more conservatively, further from economic optima, to give inexperienced operators wider margins for error. They experience more frequent incidents and near-misses, and more unplanned downtime. The knowledge that fills the gaps in automated systems (what to do during transitions, when to override the controller, how to interpret a pattern of alarms) is walking out the door and is encoded nowhere.

The workforce crisis is not a separate problem. It is a multiplier on every other problem.

9. Existing Alternatives Do Not Close the Gap

Data platforms and digital twins provide genuine value for visualisation, analytics, and simulation. Most production digital twins are fundamentally passive representations: they define what exists and how it behaves, but rarely encode action semantics — what actions are permissible, what preconditions must be met, what state changes will result. The gap between insight and action is still filled by a human operator who is increasingly unavailable, inexperienced, or cognitively overloaded.
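To show what encoding action semantics could mean in practice, the sketch below attaches a precondition and a resulting state change to a named action. The `Action` type, the state keys, and the bypass example are assumptions invented for illustration. They do not describe any particular digital twin product.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, float]


@dataclass(frozen=True)
class Action:
    name: str
    precondition: Callable[[State], bool]   # must hold before acting
    effect: Callable[[State], State]        # state change that results


def apply(action: Action, state: State) -> State:
    """Return the new state, or refuse if the precondition is not met."""
    if not action.precondition(state):
        raise ValueError(f"precondition not met for {action.name}")
    # Work on a copy so the caller's state is never modified in place.
    return action.effect(dict(state))


# Hypothetical action: the bypass may open only while the pump is running
# and suction pressure is at least 2.0 bar.
open_bypass = Action(
    name="open_bypass",
    precondition=lambda s: s["pump_running"] == 1.0
    and s["suction_pressure_bar"] >= 2.0,
    effect=lambda s: {**s, "bypass_open": 1.0},
)

before = {"pump_running": 1.0, "suction_pressure_bar": 2.5, "bypass_open": 0.0}
after = apply(open_bypass, before)
```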

Predictive maintenance generates predictions. The gap is in coordination. A bearing failure prediction does not create the work order, assign the crew, order the replacement part, notify the shift supervisor, or assess the production impact of taking the equipment offline.

AI assistants and copilots answer questions. They do not act. The industrial challenge is not a shortage of advice. It is a shortage of governed action that is safe, auditable, and coordinated across systems.

All of these approaches stop at the boundary of the system they operate within. None cross the boundaries — between units, between systems, between organisational functions — where the coordination opportunity lies.

The Structural Gap: A Summary

Mature Capability	Structural Gap
Deterministic execution (PLC)	Reasoning about what to execute and why
Supervisory control (DCS/SCADA)	Understanding root cause versus symptom
Unit-level optimisation (APC/MPC)	Cross-unit, cross-function optimisation
Threshold-based alarming (ISA-18.2)	Alarm prioritisation, correlation, and triage
Data recording (process historian)	Learning from recorded operational patterns
Procedure execution (SOPs)	Handling situations outside documented procedures
Steady-state optimisation (APC)	Intelligence during transitions and upsets
Individual system capability	Coordination across ISA-95 levels

The gap is not in execution. Execution is mature. The gap is in governed reasoning: the ability to reason, correlate, learn, and coordinate across the boundaries that the current automation stack was never designed to cross.

The Software-Defined Automation Transition

Software-Defined Automation (IEC 61499 function blocks, OPC UA as the common protocol layer, containerised control on standard edge compute) is separating control logic from hardware. This creates programmable surfaces that were previously physically wired and vendor-locked.

This solves important problems around flexibility and interoperability. But it also opens a new question: if control logic becomes software-configurable, what governs what changes are permissible? Who or what validates a proposed control structure change before it reaches the physical process? The existing safety architecture — SIS, PLC interlocks, DCS limits — remains in place, but the space between “software proposes a change” and “safety system prevents catastrophe” needs a governance layer that the current stack does not provide.
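The validation question can be made concrete with a small sketch of a check that sits between a proposed change and the control system. It is a simplified illustration, not a description of any product's implementation: the `Limits` and `Proposal` types, the tag name, and the rule that a tag without defined limits is rejected are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Limits:
    low: float        # lowest permitted setpoint
    high: float       # highest permitted setpoint
    max_step: float   # largest permitted single move


@dataclass(frozen=True)
class Proposal:
    tag: str
    current: float
    target: float


def validate(proposal: Proposal, limits: Dict[str, Limits]) -> List[str]:
    """Return a list of violations. An empty list means the change may proceed."""
    lim = limits.get(proposal.tag)
    if lim is None:
        # Fail closed: nothing is permitted unless limits were defined for it.
        return [f"{proposal.tag}: no limits defined"]
    violations: List[str] = []
    if not (lim.low <= proposal.target <= lim.high):
        violations.append(
            f"{proposal.tag}: target {proposal.target} outside [{lim.low}, {lim.high}]"
        )
    if abs(proposal.target - proposal.current) > lim.max_step:
        violations.append(
            f"{proposal.tag}: step exceeds max_step {lim.max_step}"
        )
    return violations


site_limits = {"TIC-101.SP": Limits(low=140.0, high=180.0, max_step=5.0)}
ok = validate(Proposal("TIC-101.SP", current=150.0, target=153.0), site_limits)
bad = validate(Proposal("TIC-101.SP", current=150.0, target=190.0), site_limits)
```

Returning every violation rather than stopping at the first one keeps the result auditable: a rejected change carries the full list of reasons it was rejected. A check like this would supplement the existing SIS, interlocks, and DCS limits, not replace them.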

The intelligence gap and the SDA transition are converging on the same requirement: a governed reasoning layer that operates at the ISA-95 Level 2/3 boundary, understands OT context, and can act through supervisory paths — with every action validated against formal constraints before it reaches a control system.

That is the problem IndustrialClaw is built to solve.

References available on request. Key sources include Desborough & Miller (2002), Darby & Nikolaou (2012), EEMUA Publication 191 (4th Ed., 2024), ISA-18.2, CSB investigation reports (2007, 2018), and the ATD/Lightcast Knowledge Exodus survey (2026).