January 21, 2025
Industry insights

Why moving data between OT and IT systems is harder than anyone expects

Discover the hidden complexities of OT-IT integration and anticipate the core challenges that you'll run into when starting your transformation journey.


When starting the process of digital transformation, many teams discover that one of the most fundamental challenges isn't with the new technology - it's getting their existing systems to share data effectively. The gap between operational technology (OT) and information technology (IT) systems runs deeper than most realize, because they’ve been evolving separately for decades. If you understand these challenges early, you’ll be better equipped to account for them in your project planning.

These are the challenges I’ll cover:

  1. Being “data-driven” often means increasing hardware costs
  2. Factory and IT teams often have opposing goals and priorities
  3. Factory systems use legacy protocols that aren’t easy to translate
  4. Factory and business networks have incompatible timing and network requirements
  5. Factory data is hard to interpret without specialist knowledge
  6. Factory data is siloed and hard to consolidate
  7. Factory systems can’t handle today’s data volumes
  8. Factory systems need to run 24/7 and can’t stop for updates
  9. Connecting OT to IT networks can create dangerous security holes

But before we get to the challenges, let’s take a closer look at the origins of each type of system.

OT and IT systems evolved for different purposes

Operational Technology (OT) originated in industrial control and safety applications, where even millisecond delays could result in equipment or product damage. Business systems (IT) grew out of data processing needs, where the priority was handling large volumes of information reliably and securely.

Imagine a robotic welding cell in an automotive plant. The factory system controlling the robot must coordinate precise movements with microsecond-level timing, monitor temperatures in real time, and react instantly to any anomalies. By contrast, the business system tracking production numbers only needs to know how many good welds were completed each hour—a delay of a few seconds in this reporting is perfectly acceptable.

This difference in requirements led these systems to develop radically different architectures. OT networks became an integral part of the control mechanism itself, like a nervous system, where every signal must arrive at exactly the right moment. Business networks evolved more like utilities, providing general-purpose connectivity without strict timing guarantees. These architectural differences ripple through every aspect of how these systems handle data, leading to the challenges we'll explore in the following sections.

Challenges in moving data from OT to IT systems

Being “data-driven” often means increasing hardware costs

Once you decide to be more data-driven, you might discover you need to measure more aspects of your machinery and production line. In an IT system, you can just install a new logging agent, but in the OT world this can require an upfront investment in more hardware. You’ll need specialized devices that can survive industrial environments. When prototyping, you might use cheap components like the Raspberry Pi, but production deployments need industrial-grade sensors and gateways that can handle extreme temperatures, vibration, and electromagnetic interference. This hardware typically costs 10-20 times more than its consumer-grade equivalents.

Even the seemingly simple approach of retrofitting existing machines with sensors brings significant costs.

Each data point requires:

  • An industrial sensor (typically $500-2000)
  • A compatible gateway device ($1000-5000)
  • Installation labor and certification
  • Network infrastructure modifications

This creates a practical constraint: you must carefully choose which data points are worth collecting. While cloud services allow you to scale your costs up and down, the fixed costs of physical hardware mean you need to identify specific valuable data sources before beginning collection.

For example, collecting vibration data from a critical machine might cost $3000 in hardware but prevent $50,000 in annual maintenance costs. By contrast, adding temperature sensors to every machine on a production line might cost $100,000 in hardware while providing little actionable information. The key challenge here is not the ongoing data handling costs, but rather the upfront cost of physically accessing the data in the first place.
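
The arithmetic behind this comparison can be sketched in a few lines. The vibration figures come from the example above; the function name, the simple payback formula (upfront cost divided by annual savings), and the $5,000 annual savings assigned to the temperature sensors are assumptions for illustration only, not a full cost model.

```python
from typing import Optional


def payback_years(hardware_cost: float, annual_savings: float) -> Optional[float]:
    """Simple payback period: upfront hardware cost divided by yearly savings."""
    if annual_savings <= 0:
        return None  # the investment never pays for itself
    return hardware_cost / annual_savings


# Vibration monitoring on one critical machine (figures from the example above).
vibration_payback = payback_years(3_000, 50_000)

# Temperature sensors on every machine. The savings figure is hypothetical:
# the text only says these sensors provide "little actionable information".
temperature_payback = payback_years(100_000, 5_000)

print(f"Vibration sensors pay back in {vibration_payback * 12:.1f} months")
print(f"Temperature sensors pay back in {temperature_payback:.0f} years")
```

Even a rough calculation like this helps rank candidate data points before any hardware is ordered.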

Once you’ve acquired the right hardware, you need to get your teams to work together to install and operate it. This is a separate challenge on its own.

Factory and IT teams often have opposing goals and priorities

OT teams run the factory equipment and production lines. For them, system reliability and real-time performance are non-negotiable requirements. A production line that stops for even a few minutes costs significant money and disrupts delivery schedules. They view any change as a potential risk to stability.

IT teams handle data systems, security, and modernization initiatives. They need to implement standardized data collection, enforce security policies, and integrate factory systems with business systems. Some changes they propose might cause brief system interruptions or add latency - which is exactly what OT teams cannot accept.

A 2022 IDC study quantified this problem: 37.2% of manufacturers cite expertise gaps between IT and OT teams as their top challenge, while 30.6% point to organizational complexity and conflicting decision-makers. The study included interviews with IT managers, one of whom described their OT counterparts as "old school," complaining that "they think they can run the plant with pencil and paper and don't see the value in what we're trying to do."

This isn't just a matter of different perspectives - it's a structural problem. OT teams are measured on production uptime and output. IT teams are measured on system security, data quality, and digital transformation progress. When these metrics conflict, collaboration breaks down. Neither side is wrong - they're simply optimizing for different things.

Factory systems use legacy protocols that aren’t easy to translate

Consider an older extrusion line using the Modbus protocol that needs to be integrated with a modern quality control system using OPC UA. At first glance, this might seem like a straightforward protocol conversion task. 

However, the reality quickly becomes more complex:

  • The temperature data in Modbus is stored as raw integer values in registers, but these values require different scaling factors depending on the operating mode of the extruder.
  • Some temperature readings are derived from multiple registers that must be read atomically—reading them separately could produce inconsistent results.
  • When the extruder enters diagnostic mode, the meaning of register values changes completely.
  • Critical alarms are encoded in complex bit patterns that have accumulated historical meaning over years of operation.

What initially looks like a simple matter of converting data formats becomes a complex reverse-engineering project requiring deep knowledge of how the equipment actually operates. Early industrial protocols like Modbus were designed with very specific constraints in mind. They needed to work reliably over noisy electrical connections, handle deterministic timing requirements, and operate on hardware with extremely limited memory and processing power. These constraints led to design decisions that still affect us today, such as compact 16-bit register maps that carry no units, data types, or other self-describing metadata.

More modern protocols like MQTT and OPC UA have evolved to address these issues, but there’s still a great deal of conversion involved before you can get the data into the cloud.
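
To make the register-decoding pitfalls listed above concrete, here is a minimal sketch in plain Python. Everything specific in it is hypothetical: the operating modes, the scaling factors, the big-endian word order, and the alarm bit assignments are invented for illustration, and a real extruder would document (or hide) its own. The sketch assumes the register values have already been read from the device in a single request.

```python
import struct
from enum import Enum
from typing import List


class Mode(Enum):
    PRODUCTION = "production"
    STARTUP = "startup"
    DIAGNOSTIC = "diagnostic"


# Hypothetical scaling factors (degrees C per raw count) for each mode.
SCALE_BY_MODE = {Mode.PRODUCTION: 0.1, Mode.STARTUP: 0.5}

# Hypothetical meaning of individual bits in an alarm status register.
ALARM_BITS = {0: "over_temperature", 1: "heater_fault", 4: "sensor_open_circuit"}


def scale_temperature(raw: int, mode: Mode) -> float:
    """Convert one raw 16-bit register value to degrees C for the given mode."""
    if mode is Mode.DIAGNOSTIC:
        # In diagnostic mode the register no longer holds a temperature at all.
        raise ValueError("register is not a temperature in diagnostic mode")
    return raw * SCALE_BY_MODE[mode]


def decode_float32(high_word: int, low_word: int) -> float:
    """Combine two 16-bit registers into one 32-bit float (big-endian assumed).

    Both words must come from the same read request; otherwise the value
    may mix halves of two different measurements.
    """
    return struct.unpack(">f", struct.pack(">HH", high_word, low_word))[0]


def decode_alarms(status_word: int) -> List[str]:
    """Return the names of all alarm bits that are set in the status word."""
    return [name for bit, name in sorted(ALARM_BITS.items())
            if status_word & (1 << bit)]


print(scale_temperature(1800, Mode.PRODUCTION))  # same raw value,
print(scale_temperature(1800, Mode.STARTUP))     # different meaning
print(decode_float32(0x4334, 0x8000))
print(decode_alarms(0b10001))
```

None of the lookup tables in this sketch can be derived from the protocol itself, which is why the translation work depends on people who know the equipment.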

Factory and business networks have incompatible timing and network requirements

Factory networks and business networks operate on very different timing scales. A robotic arm controller requires responses within 10 milliseconds, and a high-speed bottling line must coordinate actions down to the millisecond. Timing variations at this scale can cause immediate equipment damage or product waste. By contrast, business systems typically record timestamps to the nearest second and sample data every few minutes.

This timing mismatch has three practical consequences:

  1. Factory networks are designed for real-time control. These networks are inseparable from the physical production process - any network disruption means production stops. Business networks, on the other hand, can tolerate interruptions like any utility. This means standard IT practices like network scanning, which briefly interrupts network traffic, are safe for business systems but can halt production lines.
  2. Correlating data between systems becomes quite difficult. When tracking a product defect, you cannot establish clear cause-and-effect between high-speed manufacturing data (thousands of events per second) and quality inspection records (logged every few seconds).
  3. Each system's clock drifts at a different rate. Business systems can tolerate this drift, but in manufacturing, even small timing discrepancies between systems make it impossible to reconstruct a product's precise path through multiple processing steps. For example, when a quality problem affects 1% of products, accurate timestamps are essential to identify which specific items to recall. But if the packaging line's clock is off by a few seconds from the quality inspection system's clock, you might end up quarantining the wrong items.
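
The clock-offset problem in the last point can be illustrated with a small sketch. It assumes the offset between the two clocks is already known (measuring it is a separate problem), uses integer millisecond timestamps, and matches a single inspection record to the machine events in a short window before it. The function name, the sample timestamps, and the 2-second offset are all hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import List


def events_in_window(event_times_ms: List[int], inspection_time_ms: int,
                     clock_offset_ms: int, window_ms: int) -> List[int]:
    """Return indexes of machine events in the window before an inspection.

    event_times_ms must be sorted ascending. clock_offset_ms is how far the
    inspection system's clock runs ahead of the machine's clock.
    """
    corrected = inspection_time_ms - clock_offset_ms
    first = bisect_left(event_times_ms, corrected - window_ms)
    last = bisect_right(event_times_ms, corrected)
    return list(range(first, last))


# Machine events every 250 ms, stamped by the machine's own clock.
machine_events = [10_000, 10_250, 10_500, 10_750, 11_000, 11_250]

# The inspection system logs a defect at 13_000 ms on a clock that is
# 2_000 ms ahead of the machine's clock.
with_correction = events_in_window(machine_events, 13_000, 2_000, 500)
without_correction = events_in_window(machine_events, 13_000, 0, 500)

print(with_correction)
print(without_correction)
```

With the offset applied, the defect maps to three candidate events; without it, the same query finds none, which is how the wrong items end up in quarantine.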

Factory data is hard to interpret without specialist knowledge

Raw data without context is meaningless or, worse, misleading. Consider a temperature reading from a plastics extrusion line. A simple number like "180" tells us very little without knowing:

Basic Measurement Context:

  • Is this Celsius or Fahrenheit?
  • What's the sensor's accuracy range?
  • When was it last calibrated?
  • Is this a raw reading or has it been filtered?

Process Context:

  • Which phase of the extrusion process is active?
  • Is this temperature from the die entrance or exit?
  • What's the screw speed?
  • Has there been a recent material changeover?

Equipment Context:

  • Is the sensor showing signs of drift?
  • Has there been recent maintenance?
  • Are there known issues with this particular sensor position?
  • What's the normal operating range for this specific product?

Product Context:

  • Which product specification is being run?
  • What's the target temperature range for this material?
  • How critical is temperature control for this particular recipe?
  • What's the acceptable variance?

In the manufacturing environment, experienced operators carry much of this context in their heads. They know that sensor TE-101 always reads slightly high, or that temperature variations are more important during material transitions. When we try to move this data to business systems, this tribal knowledge often gets left behind.
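
One way to stop this knowledge from being lost is to attach it to each reading before it leaves the factory floor. The sketch below is a minimal illustration, not a recommended schema: the sensor registry, the 2-degree bias for TE-101, the second sensor TE-102, the product name, and its target range are all hypothetical values that operators would have to supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorContext:
    unit: str                # "C" or "F"
    position: str            # where on the line the sensor sits
    known_bias: float = 0.0  # in the sensor's own unit; positive = reads high


@dataclass(frozen=True)
class ContextualReading:
    tag: str
    value_celsius: float
    position: str
    product_spec: str
    in_spec: bool


# Hypothetical registry capturing what operators "just know".
SENSOR_REGISTRY = {
    "TE-101": SensorContext(unit="C", position="die exit", known_bias=2.0),
    "TE-102": SensorContext(unit="F", position="die entrance"),
}

# Hypothetical target temperature range (degrees C) per product specification.
PRODUCT_RANGES = {"PIPE-A": (170.0, 185.0)}


def enrich(tag: str, raw_value: float, product_spec: str) -> ContextualReading:
    """Turn a bare number into a reading that carries its own context."""
    ctx = SENSOR_REGISTRY[tag]
    corrected = raw_value - ctx.known_bias
    celsius = corrected if ctx.unit == "C" else (corrected - 32.0) * 5.0 / 9.0
    low, high = PRODUCT_RANGES[product_spec]
    return ContextualReading(tag, celsius, ctx.position, product_spec,
                             low <= celsius <= high)


# The same raw "180" means very different things on two sensors.
reading_a = enrich("TE-101", 180.0, "PIPE-A")
reading_b = enrich("TE-102", 180.0, "PIPE-A")
print(reading_a)
print(reading_b)
```

The point of the design is that the business system receives a value it can use directly, along with enough context to trace how that value was derived.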

Factory data is siloed and hard to consolidate

Industrial companies store operational data across many disconnected systems. A single manufacturing site might spread data across standalone data historians, spreadsheets maintained by different teams, and custom applications built over the years. When you multiply this across multiple facilities or divisions, it creates significant technical barriers to understanding operations at scale.

Even within a single facility, different departments often maintain their own separate datasets. Operations might track production metrics in one system while quality control uses another. Maintenance records live somewhere else entirely. This fragmentation makes it difficult to understand how different aspects of production influence each other.

Consider how this affects equipment maintenance. Each manufacturing line generates constant streams of sensor data about temperature, vibration, pressure, and other useful metrics. But when this information stays trapped in separate systems at each site, maintenance teams can't spot patterns that might warn of impending failures. A bearing showing early signs of wear at one facility can't help predict similar issues at another location.

Moving all this operational data into modern cloud platforms seems like an obvious solution. But industrial companies face unique constraints around data latency, security, and costs that make this more complex than a typical IT data migration. The sheer volume of sensor data generated by industrial operations makes matters worse because OT networks and storage solutions were never built to handle such vast quantities of data.

Factory systems can’t handle today’s data volumes

Nowadays, a single production line might have thousands of sensors sampling multiple times per second, high-resolution quality inspection systems, and real-time process monitoring equipment.

Yet, traditional OT networks prioritize predictable, low-latency delivery of small control messages, not high-bandwidth data transfer. Thus, when IT teams try to collect more high-frequency sensor data they quickly run into bandwidth limitations. The common workaround is to heavily downsample data or only collect it periodically in batches. This means losing potentially valuable high-resolution data.
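
A short sketch shows what gets lost. It downsamples a made-up signal containing a single spike into buckets of ten samples and reports the mean, minimum, and maximum of each bucket. The sample values and the function name are assumptions for illustration.

```python
from statistics import mean
from typing import List, Tuple


def downsample(samples: List[float], bucket_size: int) -> List[Tuple[float, float, float]]:
    """Summarize each run of bucket_size samples as (mean, min, max)."""
    buckets = []
    for start in range(0, len(samples), bucket_size):
        chunk = samples[start:start + bucket_size]
        buckets.append((mean(chunk), min(chunk), max(chunk)))
    return buckets


# Twenty readings with one short spike in the first ten.
samples = [1.0] * 9 + [11.0] + [1.0] * 10
summary = downsample(samples, 10)

for bucket_mean, bucket_min, bucket_max in summary:
    print(f"mean={bucket_mean} min={bucket_min} max={bucket_max}")
```

If only the mean is kept, the spike to 11.0 shows up as an unremarkable 2.0. Keeping the minimum and maximum per bucket preserves the fact that something happened, but the timing and shape of the event are still gone.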

The industrial protocols I mentioned earlier are another bottleneck because they were designed for reliability rather than throughput. While protocols like Modbus and PROFIBUS are good at ensuring control messages get through, they're inefficient for moving large volumes of raw data. Even when the physical network infrastructure could handle more traffic, the protocols themselves become a bottleneck.

Lastly, there’s the storage problem. This scalability hurdle also explains why data historians are siloed across an organization. No single historian solution can store all the data being collected. Plus, vendor licensing models make it prohibitively expensive to try storing high-resolution data anyway (for a deeper dive on this topic see my previous article “Are data historians getting in the way of Industry 4.0?”).

Factory systems need to run 24/7 and can’t stop for updates

Industrial production lines can't stop for system upgrades. Every minute of downtime has a direct cost in lost production. This creates a constant tension between the need to modernize systems and the imperative to keep production running.

Changes that might take days in an IT environment can take months or years in a manufacturing setting, simply because they have to be implemented without disrupting production. And when changes do occur, they often have to maintain compatibility with existing systems, leading to complex hybrid architectures that need to bridge old and new technologies.

For example, suppose a pharmaceutical manufacturer needs to upgrade their process control system to enable better data collection. They can't simply shut down production for a week while they install and test new systems - they have orders to fill and strict regulatory requirements around product availability. Any changes need to be carefully planned and executed in phases, often during scheduled maintenance windows that might only come once or twice a year.

Connecting OT to IT networks can create dangerous security holes

Traditionally, factory systems were secured through physical isolation—the famous "air gap." This wasn't just a security choice; these systems were designed to operate independently, with very different requirements from business networks. However, digital transformation initiatives require breaking this isolation.

This causes several issues:

  • Factory systems often lack basic security features common in IT systems
  • Standard IT security practices like network scanning can disrupt time-sensitive communications
  • Traditional IT security approaches like patching and updates conflict with the need for continuous availability
  • Each new connection between OT and IT requires carefully weighing operational benefits against potential security implications

What makes this particularly challenging is that OT systems often can't use the same granular access controls common in IT. A misconfigured security policy or compromised credential in an industrial system creates risks far beyond typical IT security concerns. When an attacker breaches industrial control systems, they can cause physical destruction - damaging equipment, disrupting essential services, or creating dangerous conditions.

Making OT and IT systems work together

Every factory today faces a basic challenge: how do you extract data from older equipment and systems that were never designed to share it? Most industrial machinery was built to run reliably and safely, not to feed data to cloud analytics or AI systems.

The typical workarounds all have problems:

  • Periodically copying data in batches means working with stale information
  • Trying to read data directly from control systems risks interfering with their primary job
  • Manual exports are time-consuming and error-prone

Rather than fighting these limitations, a more modern approach is to treat industrial data as a continuous stream of events. Instead of directly querying OT systems, you can:

  • Capture changes and readings as an append-only log of events
  • Process these streams to transform raw readings into useful formats
  • Let multiple systems consume this data without impacting the source
  • Build different views of the data for different purposes (real-time monitoring, historical analysis, etc.)
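
The pattern in the list above can be sketched with a toy in-memory log. This is only an illustration of the idea, not how Quix or any particular streaming platform implements it: the class names, event fields, and the 185-degree alert threshold are all hypothetical, and a production system would use a durable, distributed log instead of a Python list.

```python
from typing import Any, Callable, Dict, List


class EventLog:
    """In-memory append-only log: events are added, never modified."""

    def __init__(self) -> None:
        self._events: List[Dict[str, Any]] = []

    def append(self, event: Dict[str, Any]) -> int:
        self._events.append(event)
        return len(self._events) - 1  # offset of the new event

    def read_from(self, offset: int) -> List[Dict[str, Any]]:
        return self._events[offset:]


class Consumer:
    """Reads the log at its own pace and builds its own view of the data."""

    def __init__(self, log: EventLog, transform: Callable[[Dict[str, Any]], Any]) -> None:
        self._log = log
        self._transform = transform
        self.offset = 0  # each consumer tracks its own position

    def poll(self) -> List[Any]:
        new_events = self._log.read_from(self.offset)
        self.offset += len(new_events)
        return [self._transform(event) for event in new_events]


log = EventLog()
for ts, temp in [(1, 180.0), (2, 181.5), (3, 190.2)]:
    log.append({"ts": ts, "tag": "TE-101", "temp_c": temp})

# Two independent views of the same stream.
alerts = Consumer(log, lambda e: e["temp_c"] > 185.0)
history = Consumer(log, lambda e: (e["ts"], e["tag"], e["temp_c"]))

alert_flags = alerts.poll()
history_rows = history.poll()

log.append({"ts": 4, "tag": "TE-101", "temp_c": 175.0})
later_flags = alerts.poll()  # only the event added since the last poll

print(alert_flags, later_flags)
print(history_rows)
```

Because consumers only read from the log, adding a new one places no extra load on the control system that produced the readings.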

The tools for this exist. For example, Quix provides an efficient and reliable way to consolidate, store and process data at an industrial scale. The key is to build interfaces that respect both the OT and IT worlds: letting control systems focus on their principal tasks while providing the continuous flow of data needed for modern applications.

If this still sounds a bit “pie in the sky”, stay tuned. I’ll be giving you more detailed insight into how you can overcome many of these challenges using our modern cloud-native platform with our processing engine and connector frameworks.


Words by Mike Rosam