Explainer

Explainer
September 27, 2023
ActiveMQ vs. Kafka: A comparison of differences and use cases
The main difference between them is that Kafka is a distributed event streaming platform designed to ingest and process massive amounts of data, while ActiveMQ is a traditional message broker that supports multiple protocols and flexible messaging patterns.

Explainer
September 19, 2023
Apache Kafka vs. RabbitMQ: Comparing architectures, capabilities, and use cases
The main difference between them is that Kafka is an event streaming platform designed to ingest and process massive amounts of data, while RabbitMQ is a general-purpose message broker that supports flexible messaging patterns, multiple protocols, and complex routing.

Explainer
August 29, 2023
Apache Beam vs. Apache Spark: Big data processing solutions compared
The main difference between Spark and Beam is that the former enables you to both write and run data processing pipelines, while the latter allows you to write data processing pipelines, and then run them on various external execution environments (runners). But what are the other differences between Spark and Beam, and how are they similar?

Explainer
August 23, 2023
The anatomy of a machine learning pipeline
Explore the characteristics, challenges, and benefits of machine learning pipelines, and read about the steps involved in training and deploying ML models to production.

Explainer
July 19, 2023
Kafka vs Pulsar: Streaming Data Platforms Compared
An in-depth comparison of Kafka and Pulsar, covering criteria such as architectural differences, operational attributes, developer experience, ecosystems, deployment options, and security.

Explainer
July 14, 2023
The fundamentals of real-time machine learning
What is real-time machine learning? How is it different from batch ML? What are common real-time ML use cases? What are the challenges of building real-time ML capabilities? All these questions and more are answered in this article.

Explainer
July 13, 2023
Real-Time infrastructure tooling for data scientists
Explore how new tools for real-time pipelines are evolving to address a persistent problem: data scientists are expected to have more infrastructure expertise than they typically do.

Explainer
June 28, 2023
Feature engineering has a language problem
Should data scientists know Java? Java and Scala underpin many real-time, ML-based applications, yet data scientists usually work in Python. Someone has to port the Python code to Java or adapt it to use a Python wrapper. Neither of these options is ideal, so what are some better solutions?

Explainer
June 16, 2023
Time series analysis: a gentle introduction
Explore the fundamentals of time series analysis in this comprehensive article. Learn about key concepts, use cases, and types of time series analysis, and discover models, techniques, and methods to analyze time series data.

Explainer
June 8, 2023
Telemetry data explained
Gain a thorough understanding of telemetry data and how it works, learn about its benefits, challenges, and applications across different industries, and discover technologies you can use to operationalize telemetry.

Explainer
May 31, 2023
How to fix the unknown partition error in Kafka
A look at the most common causes of Kafka's "unknown topic or partition" error along with practical steps and solutions to help you fix it.

Explainer
May 31, 2023
Apache Kafka vs Apache Flink: friends or rivals?
Explore the unique features and limitations of Apache Kafka and Apache Flink and learn how these open source streaming titans can either join forces or operate independently.

Explainer
May 24, 2023
Bridging the gap between data scientists and engineers in machine learning workflows
Moving code from prototype to production can be tricky, especially for data scientists. There are many challenges in deploying code that needs to calculate features for ML models in real time. I look at potential solutions to ease the friction.

Explainer
May 24, 2023
The drawbacks of ksqlDB in machine learning workflows
Using ksqlDB for real-time feature transformations isn't as easy as it looks. I revisit the strategy to democratize stream processing and examine what's still missing.

Explainer
April 20, 2023
Quix as an Apache Flink alternative: a side-by-side comparison
Explore the differences between Quix and Apache Flink and find out when it's better to use Quix as a Flink alternative. If you’re searching for Apache Flink alternatives, this guide offers a detailed, fair comparison to help you make an informed decision.

Explainer
April 12, 2023
Kinesis vs Kafka - A comparison of streaming data platforms
A detailed comparison of Apache Kafka and Amazon Kinesis that covers categories such as operational attributes, pricing model, and time to production, and highlights their key differences and the use cases they typically address.

Explainer
March 28, 2023
Exploring real-time and batch analytics for e-bike telemetry with Quix and AWS
How Brompton's experiments with Quix and AWS technology are paving the way for an enhanced e-bike riding experience.

Explainer
November 9, 2022
Build a CDC pipeline with the Quix SQL Server connector
Create a CDC pipeline and publish data to Kafka topics in just a few minutes with our open source SQL Server connector.