Apache Iceberg

Sink data into Iceberg, quixly

Pre-process data before loading it into Apache Iceberg to simplify your lakehouse architecture.


Architecture diagram showing data sources, transformations and sinks

What can you build with Quix?

Integrate your data your way

Quix provides out-of-the-box connectors for many destinations, including databases, data lakes and data warehouses. Unlike alternatives such as Kafka Connect, they are not a black box: you can fork them into your own Git repository and customise them for your use case.

Pre-process data in a tabular data format

Quix’s open source Python library for stream processing, Quix Streams, enables you to transform your data in-stream using a tabular data format. You can also aggregate, join, downsample or enrich data from any cache or external system.

Pure Python

Both connectors and transformations are written in pure Python, so data engineers and scientists can easily customise data ingestion pipelines. Specialized Source, Processing and Sink APIs take care of the heavy lifting so you can get the job done with fewer headaches.

No throughput limits

Send as much data as you want; Quix’s serverless infrastructure can handle it. Quix connectors also handle backpressure and checkpointing to ensure no data is duplicated or lost and your systems aren’t overloaded.

No limits on how you structure your data

If your raw data has lots of nested layers that are not optimal for Iceberg, you can easily re-structure your data before sinking it to your cloud object storage.
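For instance, flattening nested JSON into top-level, column-friendly fields is often all the restructuring a table format needs. A minimal sketch in plain Python — the event shape and field names are hypothetical, not a Quix API:

```python
def flatten(record: dict, prefix: str = "") -> dict:
    """Recursively flatten nested dicts into dotted top-level keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat

# Example: a nested sensor event becomes a flat, column-friendly row
event = {"device": {"id": "d1", "loc": {"lat": 51.5, "lon": -0.1}}, "temp": 21.3}
row = flatten(event)
# row == {"device.id": "d1", "device.loc.lat": 51.5, "device.loc.lon": -0.1, "temp": 21.3}
```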

Transform your data efficiently for optimal ingestion to databases, lakes and warehouses

Integrate data from any source

Pre-process, transform and validate data

Sink it to Apache Iceberg

Source

Ingest data from any source, including popular streaming technologies like Apache Kafka, AWS MSK or AWS Kinesis. Use out-of-the-box connectors, or, when that’s not enough, quickly customise a connector by forking the nearest example. Building custom connectors is easy with Quix’s pure Python Source API.
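As a rough illustration of the pattern a custom source follows (poll an upstream system, hand records downstream, advance an offset), here is a plain-Python sketch; `fetch_batch` and `produce` are hypothetical stand-ins, not the Quix Source API:

```python
def run_source(fetch_batch, produce, start_offset=0):
    """Poll an upstream system in batches and hand each record downstream.

    fetch_batch(offset) -> (records, next_offset); an empty batch ends the loop.
    produce(record) pushes one record into the pipeline (e.g. onto a Kafka topic).
    """
    offset = start_offset
    while True:
        records, next_offset = fetch_batch(offset)
        if not records:
            break
        for record in records:
            produce(record)
        offset = next_offset  # advance only after the batch is handed off
    return offset

# Hypothetical upstream: two pages of records, then empty
pages = [[1, 2], [3]]

def fetch_batch(offset):
    if offset < len(pages):
        return pages[offset], offset + 1
    return [], offset

out = []
final = run_source(fetch_batch, out.append)
# out == [1, 2, 3], final == 2
```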

Transformation

Prepare your data with Quix Streams, an open source Python library for processing data with streaming DataFrames. Use built-in operators for aggregation, windowing, filtering, group-by, branching, merging and more. Integrate and enrich your data before loading it to Iceberg by connecting to caches and external systems.
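To make the windowed operators concrete, here is the kind of tumbling-window downsample such a pipeline performs, sketched in plain Python rather than the Quix Streams API (window size and field names are assumptions):

```python
from collections import defaultdict

def downsample(events, window_ms=1000):
    """Average the 'value' field per key over fixed (tumbling) windows."""
    windows = defaultdict(list)  # (key, window_start) -> values seen in that window
    for e in events:
        window_start = (e["ts"] // window_ms) * window_ms
        windows[(e["key"], window_start)].append(e["value"])
    return {k: sum(vals) / len(vals) for k, vals in windows.items()}

events = [
    {"key": "sensor-1", "ts": 100, "value": 10.0},
    {"key": "sensor-1", "ts": 900, "value": 20.0},
    {"key": "sensor-1", "ts": 1500, "value": 30.0},
]
agg = downsample(events)
# agg == {("sensor-1", 0): 15.0, ("sensor-1", 1000): 30.0}
```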

Destination

Sink data to cloud blob stores in Iceberg format, including AWS S3, GCS and Azure Blob Storage. Other databases, lake formats and warehouses are also supported. Quix sink connectors automatically handle backpressure and checkpointing to ensure no data is duplicated or lost and your database is not overloaded.
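The write-then-checkpoint pattern behind that guarantee can be sketched in plain Python; `write_batch` and `commit_offset` are illustrative stand-ins for the connector's internals, not Quix functions:

```python
def run_sink(records, write_batch, commit_offset, batch_size=2):
    """Buffer records, write in batches, and checkpoint only after each
    successful write: a crash replays at most one uncommitted batch, so
    nothing is lost and duplicates stay bounded."""
    buffer = []
    last_committed = None
    offset = -1
    for offset, record in enumerate(records):
        buffer.append(record)
        if len(buffer) >= batch_size:
            write_batch(buffer)    # e.g. append a data file to the Iceberg table
            commit_offset(offset)  # checkpoint only after the write succeeds
            last_committed = offset
            buffer = []
    if buffer:                     # flush the final partial batch
        write_batch(buffer)
        commit_offset(offset)
        last_committed = offset
    return last_committed

writes, commits = [], []
last = run_sink(["a", "b", "c"], writes.append, commits.append, batch_size=2)
# writes == [["a", "b"], ["c"]], commits == [1, 2]
```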

Better performance and lower TCO


Lower TCO

Integrate your data with your data lake for a fraction of the typical cost and with greater control, compared to popular streaming solutions such as AWS Kinesis Firehose or SaaS tools like Fivetran.

Get started

Get started quickly with our open source connectors.

Sink connector

Source connector

Apache Iceberg sink

Book a chat with us

Schedule a free 30 minute chat with a member of our team to discuss your use case and get all your questions answered.

Let's talk!
Quix Streams GitHub

