Future-proofing industrial R&D: Why you need to migrate from MATLAB to Python now
How moving from MATLAB to Python accelerates and simplifies digital transformation for industrial R&D teams while saving thousands on licensing costs.
Today’s industrial R&D engineers are drowning in data. A single hour of engine testing can generate terabytes of sensor data across thousands of channels at frequencies of 10 kHz and above. Previously, engineers might have dumped this data into storage and used tools like MATLAB to process it in batches. Now that digital twins are becoming mainstream, more R&D engineers are opting to run simulation models on their data in real time, a task that is easier to manage with Python frameworks.
The problem with MATLAB is that it struggles to process the data volumes collected from today’s industrial systems. It can be done, but it’s painful, both technically and financially. Back when I was head of innovation at McLaren, we moved most of our data pipeline from MATLAB to Python, which was cheaper and easier for everyone to learn. My goal is to convince you to do the same. Why? Because, in the long run, it will help you develop better products at a faster rate and at a fraction of the cost.
Before I make my case, I want to reiterate that I have absolutely nothing against MATLAB. I used to be an engineer and I’ve worked in product engineering for decades. I’m just saying that it’s not a great fit for modern Industry 4.0 use cases that require real-time data processing. MATLAB is still a tremendously powerful tool and I want to give it credit where credit is due. So let’s first take a quick look at how it became the de facto tool for scientific computing.
Why MATLAB is so popular
MATLAB started in academia but became a commercial software application in 1984 after the founding of MathWorks. Over time, MATLAB expanded beyond linear algebra to include toolboxes for signal processing, control systems, and many other domains. It also added Simulink in the 1990s, a graphical modeling tool for system simulation, making it popular in engineering and applied sciences.
Today, MATLAB is widely used in fields like robotics, machine learning, and computational finance. Engineers and scientists love its integrated development environment, which provides a unified platform for coding, testing, and analyzing data. MathWorks also provides excellent support, documentation, and training, and offers a vast library of specialized toolboxes tailored to specific industries.
The problem is… none of this comes cheap. In fact, it can be wildly expensive.
How much is innovation worth?
Suppose that you have a small team of 10 industrial R&D engineers using MATLAB. For the sake of simplicity, let’s suppose you license it annually. For a 10-person team, that’s about $8,600 per year, or $25,800 after 3 years. And that’s without Simulink, one of MathWorks’ most popular products. Simulink costs about $1,330 per user on an annual license, so after three years of using MATLAB and Simulink together, you would have spent about $65,700. Then there are the toolboxes. I won’t go into detail about these, but if you include a few of them, you can easily spend about $100,000 over 3 years.
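To make the arithmetic explicit, here is a quick back-of-the-envelope calculation. It assumes the figures above are per-seat annual prices ($8,600 split across 10 engineers, and $1,330 per engineer for Simulink), which is what makes the totals add up. Toolboxes are left out because their prices vary.

```python
# Rough three-year licence cost for a 10-person team, using the
# approximate figures quoted in the text (assumed to be per-seat prices).
TEAM_SIZE = 10
YEARS = 3
MATLAB_PER_SEAT = 860      # $8,600 per year / 10 engineers
SIMULINK_PER_SEAT = 1_330  # annual Simulink licence per engineer


def total_cost(per_seat_prices, team_size=TEAM_SIZE, years=YEARS):
    """Sum the annual per-seat prices, then scale by team size and years."""
    return sum(per_seat_prices) * team_size * years


matlab_only = total_cost([MATLAB_PER_SEAT])
matlab_and_simulink = total_cost([MATLAB_PER_SEAT, SIMULINK_PER_SEAT])

print(f"MATLAB only:       ${matlab_only:,}")          # $25,800
print(f"MATLAB + Simulink: ${matlab_and_simulink:,}")  # $65,700
```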
If you want to run MATLAB or Simulink as automated processes on a server, you’ll need yet another kind of license. And managing these server licenses is an utter pain in the ass, especially when trying to use MATLAB in the cloud. Just check out this license error cheat sheet as an example of what users often run into. If you want to run MATLAB in a virtualized environment (like Docker), your engineers will likely spend many, many hours troubleshooting these kinds of errors. I’m speaking from experience here.
Is this all worth the hassle and complexity? Well, no. Especially when you consider that many IDEs that support Python are also completely free (such as Visual Studio Code and PyCharm Community Edition). And standalone processes can be run and deployed in the cloud with free open source tools that rarely need to verify that a license exists on a proprietary licensing server.
Obviously, industrial R&D has its own unique complexities, such as incorporating hardware and running advanced mathematical algorithms. But I would argue that MATLAB does not scale well to the requirements of Industry 4.0, so it’s no longer worth the hefty price tag and operational cost of running it.
Digital twins need real-time data processing—which MATLAB wasn’t built for
More R&D teams are embracing the concept of the digital twin, which needs real-time data by design. For the uninitiated, a digital twin is a dynamic, virtual replica of a physical asset, system, or process (like a car’s engine or a machine on a production line). It updates in real time using data from its physical counterpart. Unlike static simulation models, digital twins let you simulate, monitor, and optimize performance on the fly.
This makes digital twins a fantastic R&D tool because you can use real-time data to simulate various "what if" scenarios. You can test new ideas and explore different options without disturbing the actual physical asset or process.
The challenge is that you need a certain kind of infrastructure to get this running properly, which can be a hassle if you rely too much on MATLAB. To see what I mean, let’s look at an example from an academic paper titled “Automated and Systematic Digital Twins Testing for Industrial Processes” (most companies don’t share the details of their internal R&D pipelines, so this is the next best option).
The paper describes a digital twin (DT) that simulates the complex process of induction heating of steel bars in a forging factory. It models the dynamic behavior of the heating furnace and the steel bars, as well as the control systems used to manage the process and the sensor data that monitors it. The goal is to help optimize and automate the production line by using real-time data.
I won't go into the technical details but the example is helpful for understanding how the components of a DT architecture are deployed. The following diagram shows how they use a Deep Reinforcement Learning (DRL) Model as part of a control mechanism that learns to optimize the production process by interacting with the Digital Twin environment:
The next diagram is my reinterpretation of the architecture they used. I've revised the diagram because the Python library they originally used for stream processing, Faust, has been forked and is no longer maintained by its original creator, Robinhood.
Instead, I've replaced Faust with Quix. All the other components remain the same. The OPC-UA server sends factory data to a ThingsBoard IoT gateway that converts the OPC-UA data to MQTT and sends it on to an MQTT broker. Quix Streams (running in Quix Cloud) uses an MQTT connector to read the messages, collect the data, and turn it into metrics. It creates regular snapshots of the processed data and compares them with the data being simulated by the digital twin. The simulation in the digital twin can then be updated in response to the "real life" factory data.
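To give a feel for the "snapshot and compare" step, here is a minimal sketch in plain Python. It is not the paper's code and it does not use the Quix Streams API. The function names, sensor name, window size, tolerance, and readings are all hypothetical, chosen only to illustrate the logic that a stream-processing service would run continuously.

```python
from collections import defaultdict
from statistics import mean


def snapshot_metrics(readings, window_s=10):
    """Group (timestamp_s, sensor, value) readings into fixed windows
    and return {(window_start, sensor): mean value}."""
    buckets = defaultdict(list)
    for ts, sensor, value in readings:
        window_start = int(ts // window_s) * window_s
        buckets[(window_start, sensor)].append(value)
    return {key: mean(vals) for key, vals in buckets.items()}


def find_drift(measured, simulated, tolerance):
    """Return the windows where the twin deviates from the plant by more
    than `tolerance`, with the signed error (measured - simulated)."""
    drift = {}
    for key, real_value in measured.items():
        if key not in simulated:
            continue  # the twin has no prediction for this window
        error = real_value - simulated[key]
        if abs(error) > tolerance:
            drift[key] = error
    return drift


# Hypothetical furnace readings: (seconds, sensor name, degrees C).
readings = [
    (0, "bar_temp", 1190.0),
    (4, "bar_temp", 1210.0),
    (12, "bar_temp", 1260.0),
    (17, "bar_temp", 1240.0),
]
measured = snapshot_metrics(readings)

# Hypothetical values predicted by the digital twin for the same windows.
simulated = {(0, "bar_temp"): 1195.0, (10, "bar_temp"): 1220.0}

drift = find_drift(measured, simulated, tolerance=20.0)
print(drift)  # {(10, 'bar_temp'): 30.0}
```

In a real deployment the readings would arrive from the MQTT broker rather than a list, and the drift result would feed back into the simulation instead of being printed.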
Now, you could run MATLAB in containers to process the data too, but you’ll quickly hit a major sticking point.
Running MATLAB in Docker is painful and expensive
I’ve dealt with many teams that have trouble running MATLAB in Docker. One example is a digitalization team at a renewable energy company. They wanted to simulate chemical processes within reactors using a digital twin. It was costing them too much to run MATLAB applications in the cloud, so they wanted to switch to Python. This is because MATLAB containers are far larger than Python containers, require more resources, and are more expensive to run on platforms like AWS.
Migrating to Python enables the team to work with smaller, more efficient containers that use fewer resources, thus lowering their cloud computing expenses. It also helps them avoid the technical hassle of validating MATLAB licenses from within containers. There are other operational wins too. Python is more streamlined and works better with containerization generally. This lets them work faster and makes it easier to deploy and manage their applications.
Note that the team isn’t abandoning MATLAB completely, as they want to reuse some of their existing MATLAB functions. They intend to build processing services that apply these functions to sensor data streams. The goal is more to extract valuable algorithms from MATLAB and deploy them as Python applications.
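As an illustration of what "extracting an algorithm" can look like, here is a small sketch. The MATLAB function is a made-up stand-in for a legacy routine (a trailing moving average built on movmean), and the Python class is one possible streaming port of it. The names smooth_signal, TrailingMean, and process_stream are hypothetical.

```python
from collections import deque

# Hypothetical legacy MATLAB function being ported:
#
#   function y = smooth_signal(x, n)
#       y = movmean(x, [n-1 0]);   % trailing mean, window shrinks at the start
#   end


class TrailingMean:
    """Streaming port: mean of the most recent `n` samples seen so far."""

    def __init__(self, n):
        # deque(maxlen=n) silently drops the oldest sample once it is full.
        self._window = deque(maxlen=n)

    def update(self, value):
        self._window.append(value)
        return sum(self._window) / len(self._window)


def process_stream(samples, n=3):
    """Apply the smoother to an iterable of sensor samples, one at a time."""
    smoother = TrailingMean(n)
    for sample in samples:
        yield smoother.update(sample)


smoothed = list(process_stream([1.0, 2.0, 3.0, 4.0, 5.0], n=3))
print(smoothed)  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

The batch version in MATLAB needs the whole signal in memory, whereas the port keeps only the last n samples, which is what makes it suitable for a service that processes a sensor stream message by message.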
The business case for technical standardization
Chances are, Python is already used in other parts of your organization. If Industry 4.0 is about connecting and consolidating data, then you want most of your tools to speak the same language.
Take the company I mentioned in the previous section: they already use Python for much of their API and back-end development. By incorporating more of their analytical and modeling components into Python, they hope to reduce friction in their wider development pipeline. The goal is to standardize their tools and improve the developer experience overall. They also want to incorporate real-time data into their web platform. Python can easily handle real-time data processing and has good libraries (such as Quix Streams) for building the necessary tools.
The shift from MATLAB to Python seems to be part of the company's broader direction to standardize their tech stack around Python. This initiative will make it easier to manage and maintain the company’s software infrastructure and allow for a more cohesive development environment. As the company grows, they need a technology stack that will scale easily and cost-effectively. Python-based applications are highly scalable and can be easily deployed on any cloud platform.
Migrating to Python is a sound strategic bet
Let's be honest - migrating from MATLAB to Python isn't a small undertaking. Your team will need time to learn new tools and port existing code. There will be grumbling. There will be growing pains. But here's the thing: this transition is inevitable for most industrial R&D teams, and the longer you wait, the more technical debt you'll accumulate.
The writing is on the wall. Python is becoming the de facto standard for real-time data processing, machine learning, and digital twin development. It's more cost-effective, has a larger talent pool, and integrates better with modern cloud infrastructure. While MATLAB remains an excellent tool for certain specialized tasks, it wasn't designed for the real-time, cloud-native requirements of Industry 4.0.
The good news is that you don't have to make the switch overnight. MATLAB has great interoperability with Python so you can use both together as a bridging strategy. Start small - perhaps with a new project or a specific component that would benefit from real-time processing. Use this as an opportunity to build expertise and confidence in Python within your team. Over time, you can gradually migrate more components while keeping your critical MATLAB functions running where they make sense.
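One way to bridge the two is the MATLAB Engine API for Python, which lets a Python process start MATLAB and call its functions. The sketch below is only an illustration under stated assumptions: it requires a local MATLAB installation with the engine package installed, and it uses MATLAB's built-in norm as a stand-in for one of your own legacy functions. The helper names rms_python and make_rms are hypothetical. If the engine is not importable, it falls back to a pure-Python port.

```python
def rms_python(values):
    """Pure-Python port: root mean square of a sequence of numbers."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


def make_rms():
    """Return an RMS function backed by MATLAB when the engine package is
    importable, otherwise by the pure-Python port."""
    try:
        import matlab.engine  # needs MATLAB and its Python engine installed
    except ImportError:
        return rms_python

    eng = matlab.engine.start_matlab()  # starts a MATLAB session (slow)

    def rms_matlab(values):
        data = matlab.double(list(values))
        # norm() stands in for a legacy MATLAB function you want to reuse.
        return float(eng.norm(data)) / len(values) ** 0.5

    return rms_matlab


rms = make_rms()
print(rms([3.0, 4.0]))
```

Hiding the choice behind a single function means the calling code does not change when a component is eventually ported, so you can migrate one function at a time.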
Remember, this isn't just about saving money on licenses or finding cheaper talent. It's about future-proofing your R&D capabilities. As digital twins and real-time processing become more central to industrial innovation, you'll want a technology stack that can keep up. Python isn't just a cost-cutting measure - it's a strategic investment in your organization's ability to innovate and compete in the age of Industry 4.0.
Make no mistake, even industrial R&D is gradually pivoting towards Python and away from proprietary frameworks. The teams who do it early will come out on top. The laggards will find out the hard way that it’s always harder to modernize your infrastructure when you’ve left it to the last minute.
Mike Rosam is Co-Founder and CEO at Quix, where he works at the intersection of business and technology to pioneer the world's first streaming data development platform. He was previously Head of Innovation at McLaren Applied, where he led the data analytics product line. Mike has a degree in Mechanical Engineering and an MBA from Imperial College London.