Quix Streams

The open source Python framework for tabular data processing on Apache Kafka.

from quixstreams import Application

app = Application(broker_address="localhost:9092")

input_topic = app.topic("game-telemetry")
output_topic = app.topic("latency-hopping-windows")

# Converts JSON to a Streaming DataFrame (SDF) tabular format
sdf = app.dataframe(input_topic)

# Calculate the mean ping over a 1s hopping window with 200ms steps
sdf = sdf.apply(lambda row: row["ping"]) \
      .hopping_window(1000, 200).mean().final()

# Send rows from SDF back to the output topic as JSON messages
sdf = sdf.to_topic(output_topic)

if __name__ == "__main__":
    app.run(sdf)
Learn more about windowing in Docs
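A hopping window of 1000 ms with a 200 ms step means each event lands in several overlapping windows. The assignment rule can be sketched in plain Python — this illustrates the windowing semantics, not Quix Streams internals:

```python
def hopping_windows(ts_ms, duration_ms, step_ms):
    """Return the [start, end) windows that contain a given timestamp."""
    # The latest window containing ts starts at ts rounded down to a step
    last_start = ts_ms // step_ms * step_ms
    # The earliest one starts (duration - step) before that, clamped at zero
    first_start = max(0, last_start - duration_ms + step_ms)
    return [(start, start + duration_ms)
            for start in range(first_start, last_start + 1, step_ms)]

# An event at t=1250ms falls into five overlapping 1-second windows
print(hopping_windows(1250, 1000, 200))
# [(400, 1400), (600, 1600), (800, 1800), (1000, 2000), (1200, 2200)]
```

With `.mean().final()`, each of those windows emits its average ping once it closes.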
from quixstreams import Application

app = Application(broker_address="localhost:9092")

input_topic = app.topic("game-telemetry")
output_topic = app.topic("low-latency-players")

# Converts JSON to a Streaming DataFrame (SDF) tabular format
sdf = app.dataframe(input_topic)

# Filter only rows where ping is less than 4ms
sdf = sdf[sdf["ping"] < 4]

# Send rows from SDF back to the output topic as JSON messages
sdf = sdf.to_topic(output_topic)

if __name__ == "__main__":
    app.run(sdf)
Learn more about filtering in Docs
from quixstreams import Application

app = Application(broker_address="localhost:9092")

input_topic = app.topic("game-telemetry")
output_topic = app.topic("game-telemetry-with-features")

# Converts JSON to a Streaming DataFrame (SDF) tabular format
sdf = app.dataframe(input_topic)

# Project the world position columns into a single nested column with 3 scalars
sdf["WorldPosition"] = sdf.apply(lambda row: {
    "X": row["Motion_WorldPositionX"],
    "Y": row["Motion_WorldPositionY"],
    "Z": row["Motion_WorldPositionZ"],
})

# Derive a new column from an existing one
sdf["is_low_latency"] = sdf["ping"] < 4

# Send rows from SDF back to the output topic as JSON messages
sdf = sdf.to_topic(output_topic)

if __name__ == "__main__":
    app.run(sdf)
Learn more about projection in Docs

Python-native processing and connectors

Pure Python, meaning no wrappers around Java and no cross-language debugging.

Sources & Sinks API for building custom connectors that integrate data with Kafka while handling retries and backpressure. JSON, Avro, Protobuf and schema registry support to keep your ever-changing data valid and clean.
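The retry handling a connector needs can be sketched in plain Python; `produce` here is a hypothetical stand-in for whatever delivery call a sink wraps:

```python
import time

def produce_with_retries(produce, record, max_attempts=5, base_delay=0.01):
    """Retry a flaky delivery call with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return produce(record)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * 2 ** attempt)  # back off before retrying
```

The Sources & Sinks API runs this kind of loop for you; the sketch only shows the backoff shape.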

Work with streaming data like it's in a database

Treat real-time data streams as continuously updating tables via a Streaming DataFrame API. Ideal for transitioning projects from Pandas or PySpark.

Use built-in operators for aggregation, windowing, filtering, group by, branching, merging and more to build stateful applications in fewer lines of code.
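For instance, a group-by-and-mean over keyed events, written out in plain Python to show roughly the state a stateful operator manages for you (the event shape here is hypothetical):

```python
from collections import defaultdict

def mean_ping_by_player(events):
    """events: iterable of (player_id, ping) pairs -> per-player mean ping."""
    totals, counts = defaultdict(float), defaultdict(int)
    for player, ping in events:
        totals[player] += ping   # the running state an operator would persist
        counts[player] += 1
    return {player: totals[player] / counts[player] for player in totals}

events = [("alice", 3.0), ("bob", 10.0), ("alice", 5.0)]
print(mean_ping_by_player(events))  # {'alice': 4.0, 'bob': 10.0}
```

In a Streaming DataFrame the equivalent is a one-liner, with the state stored durably rather than in a process-local dict.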

Flexible, scalable and fault tolerant

Clever checkpointing and exactly-once processing guarantees ensure your data pipelines are durable and fault tolerant through unpredictable infrastructure issues. 
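The core idea behind checkpointing is commit-after-process: an offset is only committed once a message's effects are durable, so a crash replays from the last checkpoint instead of losing data. A simplified model, not the library's implementation (exactly-once additionally deduplicates the replayed effects):

```python
def consume_with_checkpoints(messages, start_offset, process, commit):
    """Process messages in order, committing each offset only after success."""
    offset = start_offset
    for msg in messages[start_offset:]:
        process(msg)    # side effects happen first...
        offset += 1
        commit(offset)  # ...then the checkpoint advances past them
    return offset
```

On restart, a consumer resumes from the last committed offset: work after it is redone, never silently skipped.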

By leveraging Kafka, you’re building services that scale horizontally by adding partitions and consumers. Data replication ensures redundancy and high availability for your data consumers.
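That horizontal scaling comes from Kafka partitioning: messages with the same key always land on the same partition, so the same player's events stay ordered while consumers are added up to the partition count. A simplified keyed partitioner (Kafka's default uses murmur2 hashing; CRC32 stands in here for any stable hash):

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition deterministically (simplified)."""
    return zlib.crc32(key) % num_partitions

# The same key always hashes to the same partition,
# so one consumer sees all of that player's events, in order.
assert partition_for(b"player-42", 6) == partition_for(b"player-42", 6)
```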

Get started with streaming DataFrames

Learn how the Quix Streams library simplifies building real-time apps with Kafka and Python in our video walkthroughs and get up and running with streaming DataFrames in minutes.

Check out the playlist on our YouTube account


Open source Python client library

Support the project by starring the repo, or submit a PR to become a contributor.

Check out the repo

Start building your pipeline

Create, debug and run streaming data pipelines on your local machine using the Quix CLI.

Quickstart guide