The roundtrip problem

Husqvarna built an AI system to help technicians troubleshoot broken machines.

On the production floor, the lag between a technician's question and a usable answer was long enough that people stopped waiting for it.

Before we get into what they changed, quick question:

Where does your plant currently run AI inference?

👋🏻 I'm Leonardo Ubbiali. This week we're looking at why cloud AI keeps failing on the factory floor, and why the manufacturers fixing it are doing it with smaller models.

A packaging line running at 300 units per minute processes 5 units every second.

The cloud-based vision system sends each image off-site, waits for a response, then acts. Round-trip latency on a good day runs 200 to 400 milliseconds.

By then, the flagged unit is three stations down the line.

Husqvarna's AI Factory Companion worked in controlled testing. On the floor, it didn't.

What they rebuilt

Husqvarna kept cloud AI for training, long-term analytics, and pattern recognition across their global facilities.

The troubleshooting moved to local edge hardware where answers come back in milliseconds.

Daniel Johansson, Husqvarna's manager of manufacturing digitalization and global operations, says technicians were spending two hours or more on some stoppages.

"In some cases, we'll reduce the time by a small amount, but in others, we'll reduce it by 50 to 60%."

Siemens found the same thing with predictive maintenance.

Data flowing to the cloud for analysis introduced latency that prevented real-time response.

Their edge AI now processes sensor data locally and flags equipment issues before they escalate.

Husqvarna and Siemens arrived at the same architecture independently. Edge for decisions that need to happen in seconds, and Cloud for everything else.

The architecture shift only works if the model fits the floor.

In late 2024, Rockwell Automation put Microsoft's Phi-3 directly onto automated packaging lines.

Phi-3 is a compact model built to run on hardware with limited compute, answer narrow questions accurately, and respond fast.

On a packaging line, that specificity is the point. Operators need to know why a fill sensor is misbehaving right now, not two seconds later after a roundtrip to a data center.

Neural processing units consume 10 to 20 times less power than GPUs while matching their response speed for narrow industrial use cases.

Real deployments are reporting 25 to 40% reductions in unplanned downtime.

Five things you can do this quarter

The problem: Your AI-assisted troubleshooting system is too slow and operators have stopped using it.

What you need: Current deployment architecture, actual round-trip latency under production load, and two or three examples where the delay caused a technician to bypass the system.

The Prompt (copy this):

❝

I'm a [YOUR ROLE] at a [FACILITY TYPE] plant. We deployed AI for [USE CASE] but operators bypass it because response time is too slow for our line speed of [UNITS PER MINUTE].

Current setup: [cloud / on-premise / edge / model name if known] Examples where latency caused a problem: [describe two or three]

Is this an architecture problem, a model selection problem, or a data pipeline problem? What would a faster deployment look like for this use case, and what hardware and model type should I evaluate?