
Managed Python stream processing with Aiven for Apache Kafka and Quix

Why use Quix with Aiven for Apache Kafka?

Quix empowers Aiven for Apache Kafka users with a complete Python stream processing platform, making it simple to develop, deploy, monitor, and scale applications that process Kafka data streams.

100% Python

No JVM, wrappers, server-side engine, DSL, or cross-language debugging. Quix provides a Python Streaming DataFrame API that treats data streams as continuously updating tables.

Rich processing capabilities

Quix supports stateless and stateful operations, aggregations over hopping and tumbling windows, custom data processing functions, and exactly-once processing.
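To make the two window types concrete, here is a minimal, library-free sketch of how events could be assigned to tumbling windows (fixed size, non-overlapping) and hopping windows (fixed size, overlapping, advancing by a step). The function names, the millisecond timestamps, and the sample events are illustrative assumptions. This is not the Quix Streams API and not necessarily how Quix implements windowing.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Window = Tuple[int, int]  # [start_ms, end_ms)


def tumbling_window(ts_ms: int, size_ms: int) -> Window:
    """Return the single tumbling window that contains ts_ms."""
    start = ts_ms - (ts_ms % size_ms)
    return (start, start + size_ms)


def hopping_windows(ts_ms: int, size_ms: int, step_ms: int) -> List[Window]:
    """Return every hopping window that contains ts_ms, oldest first."""
    windows = []
    # Latest window start aligned to the step that is not after the event.
    start = ts_ms - (ts_ms % step_ms)
    # Walk backwards while the window still covers the event.
    while start > ts_ms - size_ms and start >= 0:
        windows.append((start, start + size_ms))
        start -= step_ms
    return sorted(windows)


# Hypothetical (timestamp_ms, value) events.
events = [(500, 3), (1500, 4), (2500, 5)]

tumbling_sums: Dict[Window, int] = defaultdict(int)
hopping_sums: Dict[Window, int] = defaultdict(int)

for ts, value in events:
    # Each event lands in exactly one 2-second tumbling window.
    tumbling_sums[tumbling_window(ts, 2000)] += value
    # With a 1-second step, an event can land in two 2-second hopping windows.
    for window in hopping_windows(ts, 2000, 1000):
        hopping_sums[window] += value
```

With these sample events, the event at 1500 ms contributes to one tumbling window but to two hopping windows, which is the practical difference between the two.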

Dependable at scale

Quix is scalable, highly available, and fault tolerant. It’s optimized to process high-volume, high-velocity data with consistently low latencies.

How do Quix and Aiven for Apache Kafka work together?

Aiven for Apache Kafka acts as the streaming transport component. It has the following responsibilities:

Collects data from source systems (via Aiven for Apache Kafka Connect source connectors)
Acts as a streaming data source for Quix, the stream processing component
Receives the processed output from Quix and forwards it to downstream systems (via Aiven for Apache Kafka Connect sink connectors), where it can be stored and put to use

Quix is the Python stream processor that complements your Aiven Kafka cluster:

Ingests data from Aiven Kafka topics and transforms it in real time
Publishes the transformed data back to Aiven Kafka topics
Uses data from Aiven Kafka topics to power real-time capabilities

Together, Aiven for Apache Kafka and Quix offer a complete, end-to-end solution for handling data streams and extracting actionable insights from real-time data.

For details on how to integrate Aiven for Apache Kafka with Quix, follow this short guide.

Once you’ve configured the integration, you can implement your stream processing pipeline.

You develop the stream processing logic using Quix Streams, an open-source technology that combines an Apache Kafka client with a Python stream processing library. Quix Streams offers the following key capabilities:

Librdkafka-compatible. Access to low-level Kafka producer and Kafka consumer classes to read from and write to Kafka topics.
Easy to learn. Intuitive Streaming DataFrame API (similar to pandas DataFrame) for tabular data transformations.
Windowing. Aggregations over hopping and tumbling windows.
Stateless processing. This includes grouping, filtering, projections, and dropping columns.
Custom processing. The ability to transform data and implement custom, user-defined processing functions.
Stateful operations. Stateful processing uses RocksDB to persist state.
Simple output routing. The ability to succinctly produce data to an output Kafka topic using the to_topic() method.
Flexible processing guarantees. At-least-once and exactly-once processing semantics.
Versatile serialization. Support for various data serialization formats: bytes, string, integer, double, JSON, Protobuf, and Avro.
Simplified deployment. Seamless integration with Quix Cloud, a fully managed platform offering a frictionless environment for deploying and managing your stream processing pipelines.
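As a rough illustration of several capabilities listed above (stateless filtering, a custom processing function, and output routing with to_topic()), here is a minimal sketch of a Quix Streams pipeline. The topic names, consumer group, message fields, and broker address are hypothetical placeholders, and the security settings an Aiven for Apache Kafka service requires are omitted. The API calls reflect our understanding of recent Quix Streams releases, so check them against the version you have installed.

```python
def is_valid(reading: dict) -> bool:
    """Stateless filter: keep only readings that carry a temperature."""
    return reading.get("temperature_c") is not None


def to_fahrenheit(reading: dict) -> dict:
    """Custom processing function: add a Fahrenheit field to the reading."""
    return {**reading, "temperature_f": reading["temperature_c"] * 9 / 5 + 32}


def build_app():
    """Wire the functions above into a pipeline (requires `pip install quixstreams`)."""
    from quixstreams import Application

    app = Application(
        broker_address="localhost:9092",  # placeholder: use your Aiven service URI and security config
        consumer_group="temperature-converter",  # hypothetical group name
        auto_offset_reset="earliest",
    )
    # Hypothetical topic names; values are (de)serialized as JSON.
    input_topic = app.topic("sensor-readings", value_deserializer="json")
    output_topic = app.topic("sensor-readings-fahrenheit", value_serializer="json")

    sdf = app.dataframe(input_topic)   # Streaming DataFrame over the input topic
    sdf = sdf.filter(is_valid)         # drop readings without a temperature
    sdf = sdf.apply(to_fahrenheit)     # transform each remaining reading
    sdf = sdf.to_topic(output_topic)   # publish results to the output topic
    return app
```

Calling build_app().run() would start consuming and processing until the process is stopped (some older releases expect the dataframe to be passed to run()). Keeping the transformation logic in plain functions means it can be unit tested without a running Kafka cluster.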

To learn more about Quix Streams and how to use its features, see the Quix Streams tutorials.

What are the benefits of using Quix alongside Aiven for Apache Kafka?

Pure Python development experience

Aiven provides a managed stream processing solution, Aiven for Apache Flink®. However, this product is not tailored to Python developers; it requires you to process data using SQL statements.

Meanwhile, Quix is a stream processor specifically designed to serve Python developers.

Compared to Aiven’s managed Apache Flink service, Quix offers the following advantages:

Pure Python coding and debugging experience
Intuitive Streaming DataFrame API with a modern Python syntax and a gentle learning curve (especially if you’re familiar with pandas)
A straightforward way to integrate your Python libraries of choice into your workflow (scikit-learn, TensorFlow, PyTorch, etc.)

Flexible, comprehensive toolkit

Aiven for Apache Kafka reduces the complexity of managing Kafka clusters and data streams by offering user-friendly capabilities such as an intuitive UI, powerful CLI utilities, and seamless deployments.

Similarly, Quix offers everything you need to easily and conveniently build, deploy, and manage industrial-strength stream processing applications:

CI/CD support. Integrations with any Git provider (e.g., GitHub, Bitbucket, Azure DevOps) for seamless CI/CD processes.
Environment control. Multiple projects and environments (linked to Git) for streamlined environment management.
Team collaboration. Multi-user collaboration at project and environment levels through organization and permission management.
Infrastructure management. Infrastructure as code (IaC) using Quix YAML (similar to Helm charts) with automated synchronization.
Observability and monitoring. Real-time logs, metrics, data explorers, and waveform and table views.
Security. Securely manage secrets and sensitive information.
Dev tools. Online code editor, code templates, and pre-built connectors for various data sources and sinks (e.g., MQTT, InfluxDB, Redis).
Pipeline management. Functionality to scale resources, adjust replicas, and manage CPU and memory for your real-time data pipelines.
Rapid prototyping. A built-in Quix-hosted Apache Kafka broker for testing and fast prototyping.
Local development. CLI tool to create, debug, and run your pipeline locally, then deploy it to the cloud using only the command line.

Reduced costs and complexity, and faster time to market

Much like Aiven for Apache Kafka, Quix Cloud is a fully managed solution that brings several advantages:

No need for extensive infrastructure setup and ongoing maintenance
Predictable costs and significantly reduced DevOps, financial, and operational burden
Freedom to focus entirely on innovating, building, and releasing new features, products, and capabilities that rely on real-time data

Scalability and reliability

Aiven for Apache Kafka is designed to reliably move massive amounts of data with very low latency.

Built by Formula 1 engineers and used in production by Formula 1 teams, Quix is a robust solution that’s optimized to handle high-volume, high-velocity data:

Highly scalable, leveraging Kafka and Kubernetes under the hood to provide data partitioning, consumer groups, and state management
Reliable data delivery and failure recovery through exactly-once processing, data and service replication, changelogs, and checkpointing
Highly available — Quix Cloud guarantees 99.99% uptime
Able to process billions of messages per day, with consistently low latencies (in the double-digit millisecond range)
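To show why exactly-once processing and checkpointing matter for failure recovery, here is a toy, in-memory simulation. It is an assumption-laden sketch, not a description of how Quix or Kafka transactions work internally: a dictionary stands in for durable storage, and a crash is injected between updating state and committing the offset. When state and offset are saved separately, the message in flight is counted twice after a restart. When both are saved in one atomic checkpoint, it is counted once.

```python
class Crash(Exception):
    """Simulated process failure."""


def run(messages, store, atomic, crash_at=None):
    """Sum messages into store["total"], resuming from store["offset"]."""
    offset = store["offset"]
    while offset < len(messages):
        new_total = store["total"] + messages[offset]
        if atomic:
            if crash_at == offset:
                raise Crash  # nothing persisted yet, so the update is lost
            # State and offset are persisted together in one checkpoint.
            store.update(total=new_total, offset=offset + 1)
        else:
            store["total"] = new_total  # state is persisted first ...
            if crash_at == offset:
                raise Crash  # ... and the crash hits before the offset commit
            store["offset"] = offset + 1
        offset += 1
    return store["total"]


def process_with_one_crash(messages, atomic):
    """Run once with a crash on the second message, then restart and finish."""
    store = {"total": 0, "offset": 0}  # stands in for durable storage
    try:
        run(messages, store, atomic, crash_at=1)
    except Crash:
        pass  # the "process" restarts and resumes from the stored offset
    return run(messages, store, atomic)


messages = [10, 20, 30]  # hypothetical values; the correct total is 60
double_counted_total = process_with_one_crash(messages, atomic=False)
exactly_once_total = process_with_one_crash(messages, atomic=True)
```

In the non-atomic run the second message is applied both before the crash and after the restart, so the total is too high. The atomic run reprocesses the same message but ends with the correct total, because the lost update was never persisted.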

By pairing Aiven for Apache Kafka and Quix, you end up with a stable, future-proof solution that can process and stream data in real time at any scale.

What kind of use cases can I enable with Aiven for Apache Kafka and Quix?

By leveraging Aiven for Apache Kafka as your streaming data platform and Quix as your Python stream processing engine, you can build complex event-driven systems, real-time data pipelines, streaming applications, and AI/ML products.

Here are a few examples of real-time use cases you can deliver by pairing Aiven for Apache Kafka and Quix:

Fraud detection
Sentiment & clickstream analysis
Predictive maintenance
Motor racing analysis
Live dashboards
Streaming ETL and real-time ML pipelines

At InfluxData, we prioritize ‘time to awesome’ when developing InfluxDB, aiming to empower developers to quickly transition from beginners to experts and create impactful solutions. Quix perfectly aligns with this core value, offering an exceptional user experience and a seamlessly scalable platform right from the start.

One of the features I appreciate the most is the built-in pandas DataFrame support. This functionality is invaluable for efficiently handling bulk transformations and enrichments of time series data within a multistage data pipeline, providing both power and simplicity.

For those venturing into the realm of event streaming applications, or for anyone looking to construct a scalable task engine armed with Python’s power and flexibility for manipulating time series data, I wholeheartedly recommend Quix to the community.

Jay Clifford

Developer Advocate at InfluxData
