LAKESHORELABS

Fortune 500 Manufacturing Company

Predictive maintenance platform

IoT-driven failure prediction for manufacturing equipment: streaming telemetry ingestion, time-series models, and operational dashboards across multiple facilities.

2024 | 6 months | 8 engineers | Completed

45% reduction in downtime
$8M saved annually
92% prediction accuracy
Fig. 01. Predictive maintenance: sensor to work order pipeline (Fortune 500 manufacturer, 10,000+ assets, ensemble failure prediction on Azure). 50,000+ IoT sensors across 15 facilities stream through edge gateways into an Azure pipeline, where ensemble models trained on 5 years of failure history drive risk-prioritized work orders and mobile alerts for field technicians.

Edge (x15 facilities): IoT sensor array with millisecond sampling; edge gateway with on-device anomaly detection and an immediate local alarm, no cloud round trip.
Cloud (Azure): Azure IoT Hub stream ingest; time-series store with 5 years of failure history; ensemble models tuned per equipment type (92% accuracy); risk engine with prioritized alerting (30 days lead time).
Field ops: live fleet health dashboards; auto-generated CMMS work orders; mobile alerts to field technicians.
Maintenance outcomes: 45% less unplanned downtime; 200+ failures prevented in year 1; $8M saved annually.

The challenge

A Fortune 500 manufacturer running 15 facilities worldwide was absorbing more than 2,000 hours of unplanned downtime a year. The maintenance budget exceeded $20M, but the spend was badly distributed: some equipment was serviced far more often than necessary while other assets ran until they broke. There was no centralized view of equipment health, so each facility made maintenance calls from local spreadsheets and operator intuition. The client asked us to turn five years of accumulated failure history and a fleet of 10,000+ assets into a system that predicts failures before they happen and tells technicians exactly what to fix first.

What we built

The architecture splits responsibility across three layers, matching the diagram above: edge, cloud, and field ops.

Edge layer and the local alarm path

We instrumented critical equipment with 50,000+ IoT sensors sampling at millisecond intervals, feeding an edge gateway at each of the 15 facilities. Each gateway runs anomaly detection on-device and can trip a local alarm with no cloud round trip. This was a deliberate design decision: when a bearing temperature spikes, the operator on the floor needs to know in milliseconds, not after a network hop to Azure and back. The edge path handles the urgent case; the cloud handles the predictive one. It also means a facility keeps its safety alarms even if the WAN link drops.
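To make the local alarm path concrete, here is a minimal sketch of on-device anomaly detection using a rolling z-score over recent readings. It is an illustration under stated assumptions, not the gateway code that shipped: the class name `RollingZScoreDetector`, the window size, and the threshold are all hypothetical, and the detection method used in production is not described here.

```python
from collections import deque
from math import sqrt


class RollingZScoreDetector:
    """Flags a reading that sits far outside the recent baseline.

    Hypothetical sketch: window, threshold, and min_samples are
    illustrative values, not tuned production settings.
    """

    def __init__(self, window=200, threshold=4.0, min_samples=30):
        self.window = deque(maxlen=window)  # recent normal readings
        self.threshold = threshold          # z-score that trips the alarm
        self.min_samples = min_samples      # warm-up before alarming

    def update(self, value):
        """Return True if `value` is anomalous against the current window."""
        is_anomaly = False
        n = len(self.window)
        if n >= self.min_samples:
            mean = sum(self.window) / n
            std = sqrt(sum((x - mean) ** 2 for x in self.window) / n)
            # A perfectly flat baseline (std == 0) never alarms in this sketch.
            if std > 0 and abs(value - mean) / std > self.threshold:
                is_anomaly = True
        if not is_anomaly:
            # Only normal readings extend the baseline, so a spike
            # does not drag the mean toward itself.
            self.window.append(value)
        return is_anomaly


# Simulated bearing temperature: steady around 70 degrees, then a spike.
detector = RollingZScoreDetector(window=50)
warmup_flags = [detector.update(70.0 + 0.5 * (i % 2)) for i in range(50)]
spike_flag = detector.update(95.0)  # evaluated locally, no network call
```

Everything in `update` runs in memory on the gateway, which is what lets the alarm fire without a cloud round trip. One trade-off of keeping anomalies out of the window is that a genuine, lasting shift in operating level would keep alarming until the baseline is reset.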

Cloud pipeline and time-series store

Gateways stream telemetry into Azure IoT Hub, our single ingest point for all facilities. Events land in a time-series store (InfluxDB) alongside five years of historical failure records the client already had. Consolidating live telemetry and failure history in one store is what makes the prediction layer possible: features are computed against the same data the models were trained on, and every new maintenance outcome flows back in to extend the training set.
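One way to keep live features consistent with training features is to route both through a single function. The sketch below assumes readings arrive as `(timestamp_seconds, value)` pairs already queried from the store; the function name `window_features` and the chosen features (mean, standard deviation, max, trend slope) are hypothetical examples rather than the platform's actual feature set.

```python
from statistics import mean, pstdev


def window_features(readings):
    """Summarize one sensor window as a flat feature dict.

    readings: list of (timestamp_seconds, value) tuples ordered by time.
    Hypothetical sketch: the same function would be called when building
    the training set and when scoring live telemetry.
    """
    if not readings:
        raise ValueError("window_features needs at least one reading")
    ts = [t for t, _ in readings]
    vs = [v for _, v in readings]
    t_mean, v_mean = mean(ts), mean(vs)
    # Least-squares slope of value against time: the trend in the window.
    denom = sum((t - t_mean) ** 2 for t in ts)
    slope = (
        sum((t - t_mean) * (v - v_mean) for t, v in readings) / denom
        if denom
        else 0.0
    )
    return {"mean": v_mean, "std": pstdev(vs), "max": max(vs), "slope": slope}


# A vibration reading rising by one unit per second.
features = window_features([(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)])
```

Because historical and live windows pass through identical code, a model never sees a feature at scoring time that was computed differently from how it was computed during training.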

Ensemble prediction and the risk engine

No single algorithm handles a stamping press and an HVAC compressor equally well, so we built ensemble models, combining multiple algorithms and tuning them per equipment type. The ensembles reached 92% prediction accuracy with failure warnings up to 30 days out, validated against the historical failure record before anything went live. Predictions feed a risk engine that ranks alerts by criticality rather than firing on every threshold crossing. The ranking is grounded in the equipment criticality analysis we did in month one, so a degrading asset on a single-point-of-failure line outranks a redundant pump showing the same signature.
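The per-equipment-type ensemble and criticality-based ranking can be sketched as follows. This is an assumption-laden illustration: the model names (`gbm`, `lstm`, `survival`), the weights, the weighted-average combination, and the rule that risk equals failure probability times criticality are all hypothetical stand-ins, since the actual algorithms and scoring formula are not specified here.

```python
from dataclasses import dataclass

# Hypothetical weights: each equipment type trusts its models differently.
ENSEMBLE_WEIGHTS = {
    "stamping_press": {"gbm": 0.5, "lstm": 0.3, "survival": 0.2},
    "hvac_compressor": {"gbm": 0.2, "lstm": 0.6, "survival": 0.2},
}


def ensemble_probability(equipment_type, model_probs):
    """Weighted average of per-model failure probabilities."""
    weights = ENSEMBLE_WEIGHTS[equipment_type]
    total = sum(weights.values())
    return sum(weights[name] * model_probs[name] for name in weights) / total


@dataclass
class Alert:
    asset_id: str
    failure_probability: float  # output of the ensemble, 0..1
    criticality: float          # from a criticality analysis, 0..1

    @property
    def risk(self):
        # Assumed scoring rule: likelihood scaled by consequence.
        return self.failure_probability * self.criticality


def rank_alerts(alerts, min_risk=0.1):
    """Drop low-risk alerts, then order the rest from highest risk down."""
    kept = [a for a in alerts if a.risk >= min_risk]
    return sorted(kept, key=lambda a: a.risk, reverse=True)


p = ensemble_probability(
    "stamping_press", {"gbm": 0.8, "lstm": 0.6, "survival": 0.4}
)
ranked = rank_alerts([
    Alert("redundant-pump-12", p, criticality=0.3),
    Alert("press-line-3", p, criticality=1.0),   # single point of failure
    Alert("fan-44", 0.1, criticality=0.5),       # below min_risk, dropped
])
```

With the same failure signature on both assets, `press-line-3` sorts ahead of `redundant-pump-12` purely because of its criticality, and `fan-44` never reaches a technician at all.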

Field ops delivery

Predictions only matter if someone acts on them. The risk engine drives three outputs: live health dashboards giving plant managers a fleet-wide view for the first time, auto-generated work orders pushed into the client’s CMMS, and mobile alerts to field technicians. Closed work orders feed maintenance outcomes back into the time-series store, so the models keep learning from what technicians actually found.
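A rough sketch of the last hop, turning a risk score into a work order and a closed work order into a training label, might look like this. All names, priority thresholds, and the rule of scheduling at half the predicted lead time are hypothetical; no real CMMS API is shown, and the payload shape would depend on the client's system.

```python
from dataclasses import dataclass, asdict
from datetime import date, timedelta


@dataclass
class WorkOrder:
    asset_id: str
    priority: str
    due_date: str      # ISO date string
    description: str


def build_work_order(asset_id, risk, predicted_days_to_failure, today):
    """Map a risk score and lead time to a schedulable work order."""
    # Hypothetical priority bands.
    if risk >= 0.6:
        priority = "urgent"
    elif risk >= 0.3:
        priority = "high"
    else:
        priority = "routine"
    # Assumed rule: schedule at half the predicted lead time as a margin.
    due = today + timedelta(days=max(1, predicted_days_to_failure // 2))
    return WorkOrder(
        asset_id=asset_id,
        priority=priority,
        due_date=due.isoformat(),
        description=f"Predicted failure within {predicted_days_to_failure} days",
    )


def outcome_to_training_row(features, fault_confirmed):
    """Attach the technician's finding to the features as a label."""
    return {**features, "label": 1 if fault_confirmed else 0}


order = build_work_order("press-line-3", 0.66, 30, today=date(2024, 1, 1))
payload = asdict(order)  # plain dict, ready to serialize for a CMMS
row = outcome_to_training_row({"slope": 0.2}, fault_confirmed=True)
```

The second function is the feedback loop in miniature: what the technician found on the closed work order becomes the label for the features that triggered it.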

How it was delivered

A team of 8 shipped the platform in six months, phased so each stage de-risked the next.

Rolling out facility by facility rather than all at once let us prove the models on one site’s equipment mix before committing the next.

What shipped

The platform moved the client from reacting to failures to scheduling around them, and the feedback loop means it gets more accurate with every work order closed.

Stack: Python, TensorFlow, Kafka, InfluxDB, Grafana, Kubernetes, Azure IoT Hub
