Sinks API

quixstreams.sinks.base.sink

BaseSink

class BaseSink(abc.ABC)

[VIEW SOURCE]

This is a base class for all sinks.

Subclass it and implement its methods to create your own sink.

Note that Sinks are currently in beta, and their design may change over time.

BaseSink.__init__

def __init__(on_client_connect_success: Optional[
    ClientConnectSuccessCallback] = None,
             on_client_connect_failure: Optional[
                 ClientConnectFailureCallback] = None)

[VIEW SOURCE]

Arguments:

on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.

BaseSink.flush

@abc.abstractmethod
def flush()

[VIEW SOURCE]

This method is triggered by the Checkpoint class when it commits.

You can use flush() to write the batched data to the destination (in case of a batching sink), or confirm the delivery of the previously sent messages (in case of a streaming sink).

If flush() fails, the checkpoint will be aborted.

BaseSink.add

@abc.abstractmethod
def add(value: Any, key: Any, timestamp: int, headers: HeadersTuples,
        topic: str, partition: int, offset: int)

[VIEW SOURCE]

This method is triggered on every new processed record being sent to this sink.

You can use it to accumulate batches of data before sending them outside, or to send results right away in a streaming manner and confirm a delivery later on flush().

BaseSink.setup

def setup()

[VIEW SOURCE]

When applicable, set up the client here along with any validation to affirm a valid/successful authentication/connection.

BaseSink.start

def start()

[VIEW SOURCE]

Called as part of Application.run() to initialize the sink's client. Allows using a callback pattern around the connection attempt.
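The callback pattern around the connection attempt can be sketched as follows. This is a stand-in, not the library's implementation: `run_setup_with_callbacks` is a hypothetical helper that only mirrors the documented behaviour (a success callback after a successful setup, a failure callback that receives the raised exception), and it assumes the success callback takes no arguments.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("sink-example")


def run_setup_with_callbacks(
    setup: Callable[[], None],
    on_success: Optional[Callable[[], None]] = None,
    on_failure: Optional[Callable[[Exception], None]] = None,
) -> None:
    """Hypothetical helper: wrap a connection attempt in the two callbacks."""
    try:
        setup()
    except Exception as exc:
        if on_failure is None:
            raise
        # The callback must either resolve the error or re-raise it.
        on_failure(exc)
    else:
        if on_success is not None:
            on_success()


def log_and_reraise(exc: Exception) -> None:
    """A failure callback that logs the error and propagates it."""
    logger.error("Sink client failed to connect: %s", exc)
    raise exc


def failing_setup() -> None:
    """Simulates a setup() whose connection attempt fails."""
    raise ConnectionError("destination unreachable")


events = []
# Successful setup: only the success callback fires.
run_setup_with_callbacks(lambda: None, on_success=lambda: events.append("connected"))
# Failed setup: the failure callback receives the exception and resolves it.
run_setup_with_callbacks(
    failing_setup, on_failure=lambda exc: events.append(type(exc).__name__)
)
```

A callback such as `log_and_reraise` would be passed as `on_client_connect_failure` when the error should stop the application instead of being swallowed.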

BaseSink.on_paused

def on_paused()

[VIEW SOURCE]

This method is triggered when the sink is paused due to backpressure, that is, when SinkBackpressureError is raised.

Here you can react to the backpressure events.

BatchingSink

class BatchingSink(BaseSink)

[VIEW SOURCE]

A base class for batching sinks that need to accumulate data before sending it to external destinations.

Examples: databases, object stores, and other destinations where writing every message individually is not optimal.

It automatically handles batching, keeping batches in memory per topic-partition.

You may subclass it and override the write() method to implement a custom batching sink.

BatchingSink.__init__

def __init__(on_client_connect_success: Optional[
    ClientConnectSuccessCallback] = None,
             on_client_connect_failure: Optional[
                 ClientConnectFailureCallback] = None)

[VIEW SOURCE]

Arguments:

on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.

BatchingSink.write

@abc.abstractmethod
def write(batch: SinkBatch)

[VIEW SOURCE]

This method implements actual writing to the external destination.

It may also raise SinkBackpressureError if the destination cannot accept new writes at the moment. When this happens, the accumulated batch is dropped and the app pauses the corresponding topic partition.
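A minimal sketch of how a write() body might translate an overloaded destination into a backpressure signal. Everything here is a stand-in so the snippet runs without quixstreams installed: the local `SinkBackpressureError` models only the documented `retry_after` argument, and `DestinationBusy`, `FlakyClient`, and `insert_many` are hypothetical names. In a real sink the same logic would live in the `write(batch)` method of a BatchingSink subclass.

```python
from typing import Iterable, List


class SinkBackpressureError(Exception):
    """Stand-in for the library exception; only retry_after is modelled."""

    def __init__(self, retry_after: float):
        super().__init__(f"retry after {retry_after}s")
        self.retry_after = retry_after


class DestinationBusy(Exception):
    """Hypothetical error raised by an overloaded destination client."""


class FlakyClient:
    """Hypothetical destination client that rejects the first write attempt."""

    def __init__(self) -> None:
        self.calls = 0
        self.stored: List[dict] = []

    def insert_many(self, rows: Iterable[dict]) -> None:
        self.calls += 1
        if self.calls == 1:
            raise DestinationBusy("too many requests")
        self.stored.extend(rows)


def write(client: FlakyClient, batch: List[dict]) -> None:
    """Send the batch, or signal backpressure if the destination is busy."""
    try:
        client.insert_many(batch)
    except DestinationBusy as exc:
        # Ask the application to pause the partition and try again later.
        raise SinkBackpressureError(retry_after=5.0) from exc


client = FlakyClient()
rows = [{"id": 1}, {"id": 2}]
try:
    write(client, rows)
    first_attempt = "ok"
except SinkBackpressureError as exc:
    first_attempt = f"paused for {exc.retry_after}s"

# A later attempt succeeds once the destination accepts writes again.
write(client, rows)
```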

BatchingSink.add

def add(value: Any, key: Any, timestamp: int, headers: HeadersTuples,
        topic: str, partition: int, offset: int)

[VIEW SOURCE]

Add a new record to the in-memory batch.

BatchingSink.flush

def flush()

[VIEW SOURCE]

Flush accumulated batches to the destination and drop them afterward.

BatchingSink.on_paused

def on_paused()

[VIEW SOURCE]

When the destination is already backpressured, drop the accumulated batches.

quixstreams.sinks.base.batch

SinkBatch

class SinkBatch()

[VIEW SOURCE]

A batch that BatchingSink uses to accumulate processed data between checkpoints.

Batches are created automatically by the implementations of BatchingSink.

Arguments:

topic: a topic name
partition: a partition number

SinkBatch.iter_chunks

def iter_chunks(n: int) -> Iterable[Iterable[SinkItem]]

[VIEW SOURCE]

Iterate over batch data in chunks of length n. The last chunk may be shorter.
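The chunking behaviour can be illustrated with a standalone helper. This is a sketch of the semantics described above using only the standard library, not the SinkBatch implementation; the free function `iter_chunks(items, n)` is a hypothetical equivalent that takes the items explicitly.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_chunks(items: Iterable[T], n: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most n items; the last may be shorter."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


# Five items in chunks of two: the final chunk holds the single leftover item.
chunk_sizes = [len(chunk) for chunk in iter_chunks(range(5), 2)]
```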

quixstreams.sinks.base.exceptions

SinkBackpressureError

class SinkBackpressureError(QuixException)

[VIEW SOURCE]

An exception to be raised by Sinks during a flush() call to signal a backpressure event to the application.

When raised, the app will drop the accumulated sink batches, pause all assigned topic partitions for the timeout specified in retry_after, and resume them once it has elapsed.

Arguments:

retry_after: a timeout in seconds to pause for

quixstreams.sinks.core.influxdb3

InfluxDB3Sink

class InfluxDB3Sink(BatchingSink)

[VIEW SOURCE]

InfluxDB3Sink.__init__

def __init__(token: str,
             host: str,
             organization_id: str,
             database: str,
             measurement: MeasurementSetter,
             fields_keys: FieldsSetter = (),
             tags_keys: TagsSetter = (),
             time_setter: Optional[TimeSetter] = None,
             time_precision: TimePrecision = "ms",
             allow_missing_fields: bool = False,
             include_metadata_tags: bool = False,
             convert_ints_to_floats: bool = False,
             batch_size: int = 1000,
             enable_gzip: bool = True,
             request_timeout_ms: int = 10_000,
             debug: bool = False,
             on_client_connect_success: Optional[
                 ClientConnectSuccessCallback] = None,
             on_client_connect_failure: Optional[
                 ClientConnectFailureCallback] = None)

[VIEW SOURCE]

A connector to sink processed data to InfluxDB v3.

It batches the processed records in memory per topic partition, converts them to the InfluxDB format, and flushes them to InfluxDB at the checkpoint.

The InfluxDB sink transparently handles backpressure if the destination instance cannot accept more data at the moment (e.g., when InfluxDB returns an HTTP 429 error with the "retry_after" header set). When this happens, the sink will notify the Application to pause consuming from the backpressured topic partition until the "retry_after" timeout elapses.

NOTE: InfluxDB3Sink can accept only dictionaries. If the record values are not dicts, you need to convert them to dicts before sinking.

Arguments:

token: InfluxDB access token
host: InfluxDB host in format "https://"
organization_id: InfluxDB organization_id
database: database name
measurement: measurement name as a string. Also accepts a single-argument callable that receives the current message data as a dict and returns a string.
fields_keys: an iterable (list) of strings used as InfluxDB "fields". Also accepts a single-argument callable that receives the current message data as a dict and returns an iterable of strings.
If present, it must not overlap with "tags_keys".
If empty, the whole record value will be used.

NOTE: The fields' values can only be strings, floats, integers, or booleans. Default - ().
tags_keys: an iterable (list) of strings used as InfluxDB "tags". Also accepts a single-argument callable that receives the current message data as a dict and returns an iterable of strings.
If present, it must not overlap with "fields_keys".
Given keys are popped from the value dictionary since the same key cannot be both a tag and field.
If empty, no tags will be sent.

NOTE: InfluxDB client always converts tag values to strings. Default - ().
time_setter: an optional column name to use as "time" for InfluxDB. Also accepts a callable which receives the current message data and returns either the desired time or None (use default). The time can be an int, string (RFC3339 format), or datetime. Unless it is a datetime object, the time must match the time_precision argument; otherwise an error is raised. By default, a record's Kafka timestamp with "ms" time precision is used.
time_precision: a time precision to use when writing to InfluxDB. Possible values: "ms", "ns", "us", "s". Default - "ms".
allow_missing_fields: if True, skip the missing fields keys, else raise KeyError. Default - False.
include_metadata_tags: if True, includes record's key, topic, and partition as tags. Default - False.
convert_ints_to_floats: if True, converts all integer values to floats. Default - False.
batch_size: how many records to write to InfluxDB in one request. Note that it only affects the size of one write request, and not the number of records flushed on each checkpoint. Default - 1000.
enable_gzip: if True, enables gzip compression for writes. Default - True.
request_timeout_ms: an HTTP request timeout in milliseconds. Default - 10000.
debug: if True, print debug logs from InfluxDB client. Default - False.
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.
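The callable forms of measurement, fields_keys, tags_keys, and time_setter are plain functions of the record value. The sketch below shows one possible set of setters; the record keys ("kind", "device_id", "site", "event_time_ms") are hypothetical, and the functions are exercised on a sample dict rather than through a live InfluxDB3Sink.

```python
from typing import Iterable, List, Optional

# Keys treated as tags in this example (hypothetical record layout).
TAG_KEYS = ("device_id", "site")


def measurement_setter(value: dict) -> str:
    """Route each record to a measurement named after its "kind" key."""
    return f"sensor_{value['kind']}"


def tags_setter(value: dict) -> List[str]:
    """Use the identifying keys that are present in the record as tags."""
    return [key for key in TAG_KEYS if key in value]


def fields_setter(value: dict) -> Iterable[str]:
    """Use every remaining key as a field so fields and tags never overlap."""
    excluded = {"kind", "event_time_ms", *tags_setter(value)}
    return [key for key in value if key not in excluded]


def time_setter(value: dict) -> Optional[int]:
    """Return the event time in milliseconds, or None to use the default."""
    return value.get("event_time_ms")


record = {
    "kind": "temperature",
    "device_id": "d-42",
    "site": "lab",
    "celsius": 21.5,
    "ok": True,
    "event_time_ms": 1_700_000_000_000,
}

measurement = measurement_setter(record)
tags = tags_setter(record)
fields = list(fields_setter(record))
timestamp = time_setter(record)

# These would be passed to the sink roughly as:
#   InfluxDB3Sink(..., measurement=measurement_setter,
#                 fields_keys=fields_setter, tags_keys=tags_setter,
#                 time_setter=time_setter, time_precision="ms")
```

Deriving the field list from the tag list, as done here, is one way to satisfy the rule that the two must not overlap.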

quixstreams.sinks.core.csv

CSVSink

class CSVSink(BatchingSink)

[VIEW SOURCE]

CSVSink.__init__

def __init__(path: str,
             dialect: str = "excel",
             key_serializer: Callable[[Any], str] = str,
             value_serializer: Callable[[Any], str] = json.dumps)

[VIEW SOURCE]

A base CSV sink that writes data from all assigned partitions to a single file.

It is best suited for local debugging.

Column format: (key, value, timestamp, topic, partition, offset)

Arguments:

path: a path to CSV file
dialect: a CSV dialect to use. It affects quoting and delimiters. See the "csv" module docs for more info. Default - "excel".
key_serializer: a callable to convert keys to strings. Default - str.
value_serializer: a callable to convert values to strings. Default - json.dumps.
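To see how the dialect and the serializers interact, the following sketch builds one row in the documented column order using only the standard csv module. `format_row` is a hypothetical helper for illustration and makes no claim about how CSVSink writes its file (for example, whether it emits a header).

```python
import csv
import io
import json
from typing import Any, Callable


def format_row(
    key: Any,
    value: Any,
    timestamp: int,
    topic: str,
    partition: int,
    offset: int,
    key_serializer: Callable[[Any], str] = str,
    value_serializer: Callable[[Any], str] = json.dumps,
    dialect: str = "excel",
) -> str:
    """Render one record as a CSV line: key, value, timestamp, topic, partition, offset."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, dialect=dialect)
    writer.writerow(
        [key_serializer(key), value_serializer(value), timestamp, topic, partition, offset]
    )
    return buffer.getvalue()


# The JSON value contains quotes, so the "excel" dialect quotes and escapes it.
row = format_row("k1", {"a": 1}, 1700000000000, "t", 0, 5)
```

With the "excel" dialect, embedded double quotes are doubled and rows end in "\r\n", which is why the choice of value_serializer affects how readable the value column is.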

quixstreams.sinks.community.file.sink

FileSink

class FileSink(BatchingSink)

[VIEW SOURCE]

A sink that writes data batches to files using configurable formats and destinations.

The sink groups messages by their topic and partition, ensuring data from the same source is stored together. Each batch is serialized using the specified format (e.g., JSON, Parquet) before being written to the configured destination.

The destination determines the storage location and write behavior. By default, it uses LocalDestination for writing to the local filesystem, but can be configured to use other storage backends (e.g., cloud storage).

FileSink.__init__

def __init__(
    directory: str = "",
    format: Union[FormatName, Format] = "json",
    destination: Optional[Destination] = None,
    on_client_connect_success: Optional[ClientConnectSuccessCallback] = None,
    on_client_connect_failure: Optional[ClientConnectFailureCallback] = None
) -> None

[VIEW SOURCE]

Initialize the FileSink with the specified configuration.

Arguments:

directory: Base directory path for storing files. Defaults to current directory.
format: Data serialization format, either as a string ("json", "parquet") or a Format instance.
destination: Storage destination handler. Defaults to LocalDestination if not specified.
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.

FileSink.write

def write(batch: SinkBatch) -> None

[VIEW SOURCE]

Write a batch of data using the configured format and destination.

The method performs the following steps:

1. Serializes the batch data using the configured format.
2. Writes the serialized data to the destination.
3. Handles any write failures by raising a backpressure error.

Arguments:

batch: The batch of data to write.

Raises:

SinkBackpressureError: If the write operation fails, indicating that the sink needs backpressure with a 5-second retry delay.

quixstreams.sinks.community.file.destinations.azure

AzureContainerNotFoundError

class AzureContainerNotFoundError(Exception)

[VIEW SOURCE]

Raised when the specified Azure File container does not exist.

AzureContainerAccessDeniedError

class AzureContainerAccessDeniedError(Exception)

[VIEW SOURCE]

Raised when the specified Azure File container access is denied.

AzureFileDestination

class AzureFileDestination(Destination)

[VIEW SOURCE]

A destination that writes data to Microsoft Azure File.

Handles writing data to Azure containers using the Azure Blob SDK. Credentials can be provided directly or via environment variables.

AzureFileDestination.__init__

def __init__(connection_string: str, container: str) -> None

[VIEW SOURCE]

Initialize the Azure File destination.

Arguments:

connection_string: Azure client authentication string.
container: Azure container name.

Raises:

AzureContainerNotFoundError: If the specified container doesn't exist.
AzureContainerAccessDeniedError: If access to the container is denied.

AzureFileDestination.write

def write(data: bytes, batch: SinkBatch) -> None

[VIEW SOURCE]

Write data to Azure.

Arguments:

data: The serialized data to write.
batch: The batch information containing topic and partition details.

quixstreams.sinks.community.file.destinations.base

Destination

class Destination(ABC)

[VIEW SOURCE]

Abstract base class for defining where and how data should be stored.

Destinations handle the storage of serialized data, whether that's to local disk, cloud storage, or other locations. They manage the physical writing of data while maintaining a consistent directory/path structure based on topics and partitions.

Destination.setup

@abstractmethod
def setup()

[VIEW SOURCE]

Authenticate and validate the connection here.

Destination.write

@abstractmethod
def write(data: bytes, batch: SinkBatch) -> None

[VIEW SOURCE]

Write the serialized data to storage.

Arguments:

data: The serialized data to write.
batch: The batch information containing topic, partition and offset details.

Destination.set_directory

def set_directory(directory: str) -> None

[VIEW SOURCE]

Configure the base directory for storing files.

Arguments:

directory: The base directory path where files will be stored.

Raises:

ValueError: If the directory path contains invalid characters. Only alphanumeric characters (a-zA-Z0-9), spaces, dots, slashes, and underscores are allowed.
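The character rule can be expressed as a regular expression. This is a sketch of the rule as worded above, not the library's code: `validate_directory` is a hypothetical helper, and it assumes "slashes" means forward slashes only.

```python
import re

# Alphanumerics, spaces, dots, forward slashes, and underscores.
_ALLOWED_PATH = re.compile(r"[a-zA-Z0-9 ./_]*")


def validate_directory(directory: str) -> str:
    """Return the directory unchanged, or raise ValueError on other characters."""
    if not _ALLOWED_PATH.fullmatch(directory):
        raise ValueError(f"Invalid characters in directory path: {directory!r}")
    return directory


accepted = validate_directory("data/output_v1.0")
```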

Destination.set_extension

def set_extension(format: Format) -> None

[VIEW SOURCE]

Set the file extension based on the format.

Arguments:

format: The Format instance that defines the file extension.

quixstreams.sinks.community.file.destinations.local

LocalDestination

class LocalDestination(Destination)

[VIEW SOURCE]

A destination that writes data to the local filesystem.

Handles writing data to local files with support for both creating new files and appending to existing ones.

LocalDestination.__init__

def __init__(append: bool = False) -> None

[VIEW SOURCE]

Initialize the local destination.

Arguments:

append: If True, append to existing files instead of creating new ones. Defaults to False.

LocalDestination.set_extension

def set_extension(format: Format) -> None

[VIEW SOURCE]

Set the file extension and validate append mode compatibility.

Arguments:

format: The Format instance that defines the file extension.

Raises:

ValueError: If append mode is enabled but the format doesn't support appending.

LocalDestination.write

def write(data: bytes, batch: SinkBatch) -> None

[VIEW SOURCE]

Write data to a local file.

Arguments:

data: The serialized data to write.
batch: The batch information containing topic and partition details.

quixstreams.sinks.community.file.destinations.s3

S3BucketNotFoundError

class S3BucketNotFoundError(Exception)

[VIEW SOURCE]

Raised when the specified S3 bucket does not exist.

S3BucketAccessDeniedError

class S3BucketAccessDeniedError(Exception)

[VIEW SOURCE]

Raised when the specified S3 bucket access is denied.

S3Destination

class S3Destination(Destination)

[VIEW SOURCE]

A destination that writes data to Amazon S3.

Handles writing data to S3 buckets using the AWS SDK. Credentials can be provided directly or via environment variables.

S3Destination.__init__

def __init__(bucket: str,
             aws_access_key_id: Optional[str] = getenv("AWS_ACCESS_KEY_ID"),
             aws_secret_access_key: Optional[str] = getenv(
                 "AWS_SECRET_ACCESS_KEY"),
             region_name: Optional[str] = getenv("AWS_REGION",
                                                 getenv("AWS_DEFAULT_REGION")),
             endpoint_url: Optional[str] = getenv("AWS_ENDPOINT_URL_S3"),
             **kwargs) -> None

[VIEW SOURCE]

Initialize the S3 destination.

Arguments:

bucket: Name of the S3 bucket to write to.
aws_access_key_id: AWS access key ID. Defaults to AWS_ACCESS_KEY_ID environment variable.
aws_secret_access_key: AWS secret access key. Defaults to AWS_SECRET_ACCESS_KEY environment variable.
region_name: AWS region name. Defaults to AWS_REGION or AWS_DEFAULT_REGION environment variable.
endpoint_url: the endpoint URL to use; only required for connecting to a locally hosted S3. NOTE: can alternatively set the AWS_ENDPOINT_URL_S3 environment variable
kwargs: Additional keyword arguments passed to boto3.client.

Raises:

S3BucketNotFoundError: If the specified bucket doesn't exist.
S3BucketAccessDeniedError: If access to the bucket is denied.
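One consequence worth noting, assuming the defaults are written as `getenv(...)` calls exactly as the signature above shows: Python evaluates default argument values once, when the function is defined, so the environment variables would need to be set before the module is imported. The sketch below demonstrates this language behaviour with a hypothetical `resolve_region` function; it does not call S3Destination.

```python
import os
from os import getenv

# Environment at "import time": only the fallback variable is set.
os.environ.pop("AWS_REGION", None)
os.environ["AWS_DEFAULT_REGION"] = "eu-west-1"


def resolve_region(
    region_name=getenv("AWS_REGION", getenv("AWS_DEFAULT_REGION")),
):
    """The default is computed once, when this def statement runs."""
    return region_name


first = resolve_region()

# Changing the environment afterwards does not affect the stored default.
os.environ["AWS_REGION"] = "us-east-1"
later = resolve_region()

# Passing the value explicitly picks up the current environment.
explicit = resolve_region(
    region_name=getenv("AWS_REGION", getenv("AWS_DEFAULT_REGION"))
)
```

If the variables may be set late (for example by a dotenv loader), passing the arguments explicitly sidesteps the issue.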

S3Destination.write

def write(data: bytes, batch: SinkBatch) -> None

[VIEW SOURCE]

Write data to S3.

Arguments:

data: The serialized data to write.
batch: The batch information containing topic and partition details.

quixstreams.sinks.community.file.formats.base

Format

class Format(ABC)

[VIEW SOURCE]

Base class for formatting batches in file sinks.

This abstract base class defines the interface for batch formatting in file sinks. Subclasses should implement the file_extension property and the serialize method to define how batches are formatted and saved.

Format.file_extension

@property
@abstractmethod
def file_extension() -> str

[VIEW SOURCE]

Returns the file extension used for output files.

Returns:

The file extension as a string.

Format.supports_append

@property
@abstractmethod
def supports_append() -> bool

[VIEW SOURCE]

Indicates if the format supports appending data to an existing file.

Returns:

True if appending is supported, otherwise False.

Format.serialize

@abstractmethod
def serialize(batch: SinkBatch) -> bytes

[VIEW SOURCE]

Serializes a batch of messages into bytes.

Arguments:

batch: The batch of messages to serialize.

Returns:

The serialized batch as bytes.
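A custom format implements the three members described above. The sketch below uses a local stand-in for Format so it runs without quixstreams installed, and it serializes a plain list of hypothetical (key, value, timestamp) tuples instead of a real SinkBatch; `TSVFormat` is an invented example class.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterable, Tuple

# Hypothetical stand-in for the items held by a batch.
Item = Tuple[Any, Any, int]  # (key, value, timestamp)


class Format(ABC):
    """Stand-in that mirrors the interface documented above."""

    @property
    @abstractmethod
    def file_extension(self) -> str: ...

    @property
    @abstractmethod
    def supports_append(self) -> bool: ...

    @abstractmethod
    def serialize(self, batch: Iterable[Item]) -> bytes: ...


class TSVFormat(Format):
    """Example format: one tab-separated line per item."""

    @property
    def file_extension(self) -> str:
        return ".tsv"

    @property
    def supports_append(self) -> bool:
        # Line-oriented output can be appended to an existing file.
        return True

    def serialize(self, batch: Iterable[Item]) -> bytes:
        lines = [f"{timestamp}\t{key}\t{value}\n" for key, value, timestamp in batch]
        return "".join(lines).encode("utf-8")


fmt = TSVFormat()
payload = fmt.serialize([("k1", "a", 1), ("k2", "b", 2)])
```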

quixstreams.sinks.community.file.formats.json

JSONFormat

class JSONFormat(Format)

[VIEW SOURCE]

Serializes batches of messages into JSON Lines format with optional gzip compression.

This class provides functionality to serialize a SinkBatch into bytes in JSON Lines format. It supports optional gzip compression and allows for custom JSON serialization through the dumps parameter.

This format supports appending to existing files.

JSONFormat.__init__

def __init__(file_extension: str = ".jsonl",
             compress: bool = False,
             dumps: Optional[Callable[[Any], str]] = None) -> None

[VIEW SOURCE]

Initializes the JSONFormat.

Arguments:

file_extension: The file extension to use for output files. Defaults to ".jsonl".
compress: If True, compresses the output using gzip and appends ".gz" to the file extension. Defaults to False.
dumps: A custom function to serialize objects to JSON-formatted strings. If provided, the compact option is ignored.

JSONFormat.file_extension

@property
def file_extension() -> str

[VIEW SOURCE]

Returns the file extension used for output files.

Returns:

The file extension as a string.

JSONFormat.serialize

def serialize(batch: SinkBatch) -> bytes

[VIEW SOURCE]

Serializes a SinkBatch into bytes in JSON Lines format.

Each item in the batch is converted into a JSON object with "_timestamp", "_key", and "_value" fields. If the message key is in bytes, it is decoded to a string.

Arguments:

batch: The SinkBatch to serialize.

Returns:

The serialized batch in JSON Lines format, optionally compressed with gzip.
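The record shape described above can be reproduced with the standard library. This is an illustrative sketch, not JSONFormat itself: `to_json_lines` is a hypothetical function, and items are modelled as (key, value, timestamp) tuples.

```python
import gzip
import json
from typing import Any, Iterable, Tuple

Item = Tuple[Any, Any, int]  # hypothetical (key, value, timestamp) triple


def to_json_lines(items: Iterable[Item], compress: bool = False) -> bytes:
    """Serialize items as JSON Lines, decoding bytes keys to strings."""
    lines = []
    for key, value, timestamp in items:
        if isinstance(key, bytes):
            key = key.decode()
        record = {"_timestamp": timestamp, "_key": key, "_value": value}
        lines.append(json.dumps(record) + "\n")
    data = "".join(lines).encode("utf-8")
    return gzip.compress(data) if compress else data


items = [(b"k1", {"x": 1}, 1), ("k2", [1, 2], 2)]
raw = to_json_lines(items)
packed = to_json_lines(items, compress=True)
```

Ending every record with a newline keeps the output valid when more lines are appended to the same file later.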

quixstreams.sinks.community.file.formats.parquet

ParquetFormat

class ParquetFormat(Format)

[VIEW SOURCE]

Serializes batches of messages into Parquet format.

This class provides functionality to serialize a SinkBatch into bytes in Parquet format using PyArrow. It allows setting the file extension and compression algorithm used for the Parquet files.

This format does not support appending to existing files.

ParquetFormat.__init__

def __init__(file_extension: str = ".parquet",
             compression: Compression = "snappy") -> None

[VIEW SOURCE]

Initializes the ParquetFormat.

Arguments:

file_extension: The file extension to use for output files. Defaults to ".parquet".
compression: The compression algorithm to use for Parquet files. Allowed values are "none", "snappy", "gzip", "brotli", "lz4", or "zstd". Defaults to "snappy".

ParquetFormat.file_extension

@property
def file_extension() -> str

[VIEW SOURCE]

Returns the file extension used for output files.

Returns:

The file extension as a string.

ParquetFormat.serialize

def serialize(batch: SinkBatch) -> bytes

[VIEW SOURCE]

Serializes a SinkBatch into bytes in Parquet format.

Each item in the batch is converted into a dictionary with "_timestamp", "_key", and the keys from the message value. If the message key is in bytes, it is decoded to a string.

Missing fields in messages are filled with None to ensure all rows have the same columns.

Arguments:

batch: The SinkBatch to serialize.

Returns:

The serialized batch as bytes in Parquet format.
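The row normalization step (flattening the value into columns and filling gaps with None) can be sketched without PyArrow. `normalize_rows` is a hypothetical helper operating on (key, value, timestamp) tuples; it shows only the column alignment, not the Parquet encoding.

```python
from typing import Any, Dict, Iterable, List, Tuple

Item = Tuple[Any, Dict[str, Any], int]  # hypothetical (key, value, timestamp)


def normalize_rows(items: Iterable[Item]) -> List[Dict[str, Any]]:
    """Build rows that all share the same columns, using None for missing ones."""
    rows: List[Dict[str, Any]] = []
    columns: List[str] = []  # first-seen order across the whole batch
    for key, value, timestamp in items:
        if isinstance(key, bytes):
            key = key.decode()
        row = {"_timestamp": timestamp, "_key": key, **value}
        rows.append(row)
        for column in row:
            if column not in columns:
                columns.append(column)
    # dict.get returns None for columns a row does not have.
    return [{column: row.get(column) for column in columns} for row in rows]


rows = normalize_rows([(b"k1", {"a": 1}, 1), ("k2", {"b": 2}, 2)])
```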

quixstreams.sinks.community.bigquery

BigQuerySink

class BigQuerySink(BatchingSink)

[VIEW SOURCE]

BigQuerySink.__init__

def __init__(project_id: str,
             location: str,
             dataset_id: str,
             table_name: str,
             service_account_json: Optional[str] = None,
             schema_auto_update: bool = True,
             ddl_timeout: float = 10.0,
             insert_timeout: float = 10.0,
             retry_timeout: float = 30.0,
             on_client_connect_success: Optional[
                 ClientConnectSuccessCallback] = None,
             on_client_connect_failure: Optional[
                 ClientConnectFailureCallback] = None,
             **kwargs)

[VIEW SOURCE]

A connector to sink processed data to Google Cloud BigQuery.

It batches the processed records in memory per topic partition, and flushes them to BigQuery at the checkpoint.

NOTE: BigQuerySink can accept only dictionaries. If the record values are not dicts, you need to convert them to dicts before sinking.

The column names and types are inferred from individual records. Each key in the record's dictionary will be inserted as a column to the resulting BigQuery table.

If the column is not present in the schema, the sink will try to add new nullable columns on the fly with types inferred from individual values. The existing columns will not be affected. To disable this behavior, pass schema_auto_update=False and define the necessary schema upfront. The minimal schema must define two columns: "timestamp" of type TIMESTAMP, and "__key" with a type of the expected message key.

Arguments:

project_id: a Google project id.
location: a BigQuery location.
dataset_id: a BigQuery dataset id. If the dataset does not exist, the sink will try to create it.
table_name: BigQuery table name. If the table does not exist, the sink will try to create it with a default schema.
service_account_json: an optional JSON string with service account credentials to connect to BigQuery. The internal google.cloud.bigquery.Client will use the Application Default Credentials if not provided. See https://cloud.google.com/docs/authentication/provide-credentials-adc for more info. Default - None.
schema_auto_update: if True, the sink will try to create a dataset and a table if they don't exist. It will also add missing columns on the fly with types inferred from individual values. Default - True.
ddl_timeout: a timeout for a single DDL operation (adding tables, columns, etc.). Default - 10s.
insert_timeout: a timeout for a single INSERT operation. Default - 10s.
retry_timeout: a total timeout for each request to BigQuery API. During this timeout, a request can be retried according to the client's default retrying policy. Default - 30s.
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.
kwargs: Additional keyword arguments passed to bigquery.Client.

quixstreams.sinks.community.elasticsearch

ElasticsearchSink

class ElasticsearchSink(BatchingSink)

[VIEW SOURCE]

Pushes data to an Elasticsearch index.

By default, uses the Kafka message key as the document ID, and dynamically generates the field types.

You can pass your own type mapping or document ID setter for custom behavior.

ElasticsearchSink.__init__

def __init__(url: str,
             index: str,
             mapping: Optional[dict] = None,
             document_id_setter: Optional[Callable[
                 [SinkItem], Optional[str]]] = _default_document_id_setter,
             batch_size: int = 500,
             max_bulk_retries: int = 3,
             ignore_bulk_upload_errors: bool = False,
             add_message_metadata: bool = False,
             add_topic_metadata: bool = False,
             on_client_connect_success: Optional[
                 ClientConnectSuccessCallback] = None,
             on_client_connect_failure: Optional[
                 ClientConnectFailureCallback] = None,
             **kwargs)

[VIEW SOURCE]

Arguments:

url: the Elasticsearch host URL
index: the Elasticsearch index name
mapping: a custom mapping; the default dynamically maps all field types
document_id_setter: how to select the document id; the default is the Kafka message key
batch_size: the number of documents in each bulk request chunk
max_bulk_retries: number of retry attempts for each bulk batch
ignore_bulk_upload_errors: ignore any errors that occur when attempting an upload
add_message_metadata: add key, timestamp, and headers as __{field}
add_topic_metadata: add topic, partition, and offset as __{field}
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.
kwargs: additional kwargs that are passed to the Elasticsearch client

quixstreams.sinks.community.iceberg

AWSIcebergConfig

class AWSIcebergConfig(BaseIcebergConfig)

[VIEW SOURCE]

AWSIcebergConfig.__init__

def __init__(aws_s3_uri: str,
             aws_region: Optional[str] = None,
             aws_access_key_id: Optional[str] = None,
             aws_secret_access_key: Optional[str] = None,
             aws_session_token: Optional[str] = None)

[VIEW SOURCE]

Configure IcebergSink to work with AWS Glue.

Arguments:

aws_s3_uri: The S3 URI where the table data will be stored (e.g., 's3://your-bucket/warehouse/').
aws_region: The AWS region for the S3 bucket and Glue catalog.
aws_access_key_id: the AWS access key ID. NOTE: can alternatively set the AWS_ACCESS_KEY_ID environment variable when using AWS Glue.
aws_secret_access_key: the AWS secret access key. NOTE: can alternatively set the AWS_SECRET_ACCESS_KEY environment variable when using AWS Glue.
aws_session_token: a session token (or will be generated for you). NOTE: can alternatively set the AWS_SESSION_TOKEN environment variable when using AWS Glue.

IcebergSink

class IcebergSink(BatchingSink)

[VIEW SOURCE]

IcebergSink writes batches of data to an Apache Iceberg table.

By default, the data includes the Kafka message key, value, and timestamp.

It serializes incoming data batches into Parquet format and appends them to the Iceberg table, updating the table schema as necessary.

Currently supports Apache Iceberg hosted in:

AWS

Supported data catalogs:

AWS Glue

Arguments:

table_name: The name of the Iceberg table.
config: An IcebergConfig with all the various connection parameters.
data_catalog_spec: the data catalog to use (e.g., "aws_glue" for AWS Glue).
schema: The Iceberg table schema. If None, a default schema is used.
partition_spec: The partition specification for the table. If None, a default is used.
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.

Example setup using an AWS-hosted Iceberg with AWS Glue:

from quixstreams import Application
from quixstreams.sinks.community.iceberg import IcebergSink, AWSIcebergConfig

# Configure S3 bucket credentials
iceberg_config = AWSIcebergConfig(
    aws_s3_uri="", aws_region="", aws_access_key_id="", aws_secret_access_key=""
)

# Configure the sink to write data to S3 with the AWS Glue catalog spec
iceberg_sink = IcebergSink(
    table_name="glue.sink-test",
    config=iceberg_config,
    data_catalog_spec="aws_glue",
)

app = Application(broker_address='localhost:9092', auto_offset_reset="earliest")
topic = app.topic('sink_topic')

# Do some processing here
sdf = app.dataframe(topic=topic).print(metadata=True)

# Sink results to the IcebergSink
sdf.sink(iceberg_sink)


if __name__ == "__main__":
    # Start the application
    app.run()

IcebergSink.write

def write(batch: SinkBatch)

[VIEW SOURCE]

Writes a batch of data to the Iceberg table.

Implements retry logic to handle concurrent write conflicts.

Arguments:

batch: The batch of data to write.

quixstreams.sinks.community.kinesis

KinesisStreamNotFoundError

class KinesisStreamNotFoundError(Exception)

[VIEW SOURCE]

Raised when the specified Kinesis stream does not exist.

KinesisSink

class KinesisSink(BaseSink)

[VIEW SOURCE]

KinesisSink.__init__

def __init__(
        stream_name: str,
        aws_access_key_id: Optional[str] = getenv("AWS_ACCESS_KEY_ID"),
        aws_secret_access_key: Optional[str] = getenv("AWS_SECRET_ACCESS_KEY"),
        region_name: Optional[str] = getenv("AWS_REGION",
                                            getenv("AWS_DEFAULT_REGION")),
        aws_endpoint_url: Optional[str] = getenv("AWS_ENDPOINT_URL_KINESIS"),
        value_serializer: Callable[[Any], str] = json.dumps,
        key_serializer: Callable[[Any], str] = bytes.decode,
        on_client_connect_success: Optional[
            ClientConnectSuccessCallback] = None,
        on_client_connect_failure: Optional[
            ClientConnectFailureCallback] = None,
        **kwargs) -> None

[VIEW SOURCE]

Initialize the KinesisSink.

Arguments:

stream_name: Kinesis stream name.
aws_access_key_id: AWS access key ID.
aws_secret_access_key: AWS secret access key.
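region_name: AWS region name (e.g., 'us-east-1').
aws_endpoint_url: the endpoint URL to use. Defaults to the AWS_ENDPOINT_URL_KINESIS environment variable.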
value_serializer: Function to serialize the value to string (defaults to json.dumps).
key_serializer: Function to serialize the key to string (defaults to bytes.decode).
kwargs: Additional keyword arguments passed to boto3.client.
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.
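
As a rough illustration of the serializer arguments: both are plain callables that return a string. The snippet below shows what the documented defaults produce, plus one possible custom pair. `compact_json` and `str_key` are hypothetical helpers, not part of the library.

```python
import json

# Documented defaults.
default_value_serializer = json.dumps
default_key_serializer = bytes.decode

default_value = default_value_serializer({"sensor": "a1", "temp": 21.5})
default_key = default_key_serializer(b"a1")


def compact_json(value):
    """Hypothetical value serializer: JSON without extra whitespace."""
    return json.dumps(value, separators=(",", ":"), sort_keys=True)


def str_key(key):
    """Hypothetical key serializer that accepts bytes or any other type."""
    return key.decode() if isinstance(key, bytes) else str(key)


custom_value = compact_json({"sensor": "a1", "temp": 21.5})
custom_key = str_key(42)
```

Either function could then be passed as value_serializer or key_serializer when constructing the sink. Note that the default bytes.decode only works when keys arrive as bytes; keys of another type would need a custom serializer along the lines of `str_key`.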

KinesisSink.add

def add(value: Any, key: Any, timestamp: int, headers: HeadersTuples,
        topic: str, partition: int, offset: int) -> None

[VIEW SOURCE]

Buffer a record for the Kinesis stream.

Records are buffered until the batch size reaches 500, at which point they are sent immediately. If the batch size is less than 500, records will be sent when the flush method is called.

KinesisSink.flush

def flush() -> None

[VIEW SOURCE]

Flush all records buffered so far.

This method sends any outstanding records that have not yet been sent because the batch size was less than 500. It waits for all futures to complete, ensuring that all records are successfully sent to the Kinesis stream.
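
The interplay between add() and flush() described above can be modelled with a toy class. This is only a sketch of the batching rule, not the KinesisSink code: `BufferedSender` and its `send_batch` callable are hypothetical, and the real sink sends asynchronously and waits on futures, which this synchronous model leaves out.

```python
class BufferedSender:
    """Toy model of add()/flush() batching."""

    def __init__(self, send_batch, batch_size=500):
        self._send_batch = send_batch  # hypothetical callable taking a list
        self._batch_size = batch_size
        self._buffer = []

    def add(self, record):
        self._buffer.append(record)
        # A full batch is sent right away.
        if len(self._buffer) >= self._batch_size:
            self._send()

    def flush(self):
        # Send whatever is left, even if the batch is not full.
        if self._buffer:
            self._send()

    def _send(self):
        batch, self._buffer = self._buffer, []
        self._send_batch(batch)


sent = []
sender = BufferedSender(send_batch=sent.append)
for i in range(1200):
    sender.add(i)
sizes_before_flush = [len(b) for b in sent]
sender.flush()
sizes_after_flush = [len(b) for b in sent]
```

With 1200 records, two full batches of 500 go out during add(), and the remaining 200 are only sent once flush() is called.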

quixstreams.sinks.community.mongodb

MongoDBSink

class MongoDBSink(BatchingSink)

[VIEW SOURCE]

MongoDBSink.init

def __init__(host: str,
             db: str,
             collection: str,
             username: Optional[str] = None,
             password: Optional[str] = None,
             port: int = 27017,
             document_matcher: Callable[
                 [SinkItem], MongoQueryFilter] = _default_document_matcher,
             update_method: Literal["UpdateOne", "UpdateMany",
                                    "ReplaceOne"] = "UpdateOne",
             upsert: bool = True,
             add_message_metadata: bool = False,
             add_topic_metadata: bool = False,
             authentication_timeout_ms: int = 15000,
             value_selector: Optional[Callable[[MongoValue],
                                               MongoValue]] = None,
             **kwargs) -> None

[VIEW SOURCE]

A connector to sink processed data to MongoDB in batches.

Arguments:

host: MongoDB hostname; example "localhost"
db: MongoDB database name
collection: MongoDB collection name
username: username, if authentication is required
password: password, if authentication is required
port: port used by MongoDB host if not using the default of 27017
document_matcher: How documents are selected to update. A callable that accepts a SinkItem and returns a MongoDB "query filter". If nothing matches and upsert=True, a new document is inserted; its _id is the included value if one is specified, otherwise a random ObjectId.
Default: matches on _id, with _id assumed to be the Kafka key.
upsert: Create documents if no matches with document_matcher.
update_method: How documents found with document_matcher are updated. 'Update' options will only update fields included in the Kafka message. 'Replace' option fully replaces the document with the contents of the Kafka message.
"UpdateOne": Updates the first matching document (usually based on _id).
"UpdateMany": Updates ALL matching documents (usually NOT based on _id).
"ReplaceOne": Replaces the first matching document (usually based on _id).
Default: "UpdateOne".
add_message_metadata: add key, timestamp, and headers as __{field}
add_topic_metadata: add topic, partition, and offset as __{field}
value_selector: An optional callable that allows final editing of the outgoing document (right before submitting it). Largely used when a field is necessary for document_matcher, but not otherwise. NOTE: metadata is added before this step, so don't accidentally exclude it here!
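
To make document_matcher and value_selector more concrete, here is a minimal sketch of what such callables might look like. `Item` is a hypothetical stand-in that exposes only `key` and `value` (assumed to mirror the attributes of the item the sink passes in), and the field names are invented for the example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Item:
    """Hypothetical stand-in exposing only the attributes used here."""
    key: Any
    value: Any


def default_like_matcher(item):
    # Mirrors the documented default: match on _id, taken from the Kafka key.
    return {"_id": item.key}


def match_on_user_and_day(item):
    # Hypothetical custom matcher built from fields in the message value.
    return {"user_id": item.value["user_id"], "day": item.value["day"]}


def drop_day(value):
    # Hypothetical value_selector: remove a field needed only for matching.
    return {k: v for k, v in value.items() if k != "day"}


item = Item(key="u-1", value={"user_id": "u-1", "day": "2024-01-01", "clicks": 3})
default_filter = default_like_matcher(item)
custom_filter = match_on_user_and_day(item)
outgoing_document = drop_day(item.value)
```

A matcher like `match_on_user_and_day` would be passed as document_matcher and `drop_day` as value_selector. If metadata fields are enabled, a selector written this way keeps them, because it only removes the one named field.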

MongoDBSink.write

def write(batch: SinkBatch) -> None

[VIEW SOURCE]

Note: Transactions could be an option here, but each record would then require a network call, and transactions have size limits, so bulk_write is used instead. The downside is that duplicate writes may occur if errors arise.

quixstreams.sinks.community.neo4j

Neo4jSink

class Neo4jSink(BatchingSink)

[VIEW SOURCE]

Neo4jSink.init

def __init__(host: str,
             port: int,
             username: str,
             password: str,
             cypher_query: str,
             chunk_size: int = 10000,
             **kwargs) -> None

[VIEW SOURCE]

A connector to sink processed data to Neo4j.

Arguments:

host: The Neo4j database hostname.
port: The Neo4j database port.
username: The Neo4j database username.
password: The Neo4j database password.
cypher_query: A Cypher Query to execute on each record. Behavior attempts to match other Neo4j connectors:
Uses "dot traversal" for (nested) dict key access; ex: "col_x.col_y.col_z"
Message value is bound to the alias "event"; ex: "event.field_a".
Message key, value, header and timestamp are bound to "__{attr}"; ex: "__key".
chunk_size: Adjust the size of a Neo4j transactional chunk.
This does NOT affect how many records can be written/flushed at once.
The chunks are committed only if ALL of them succeed.
Larger chunks are generally more efficient, but can encounter size issues.
This is only necessary to adjust when messages are especially large.
kwargs: Additional keyword arguments passed to the neo4j.GraphDatabase.driver instance.
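
The effect of chunk_size can be pictured as slicing one flushed batch into smaller pieces. The helper below is a hypothetical illustration of that slicing only; it does not show how the sink talks to Neo4j or how the transaction is committed.

```python
def iter_chunks(records, chunk_size):
    """Yield consecutive slices of at most `chunk_size` records."""
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]


batch = [{"id": i} for i in range(25)]
chunk_sizes = [len(chunk) for chunk in iter_chunks(batch, chunk_size=10)]
total_records = sum(chunk_sizes)
```

All 25 records are still part of the same flush; only the size of each piece changes.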

Example Usage:

from quixstreams import Application
from quixstreams.sinks.community.neo4j import Neo4jSink

app = Application(broker_address="localhost:9092")
topic = app.topic("topic-name")

# records structured as:
# {"name": {"first": "John", "last": "Doe"}, "age": 28, "city": "Los Angeles"}

# This assumes the given City nodes exist.
# Notice the use of "event" to reference the message value.
# Could also do things like __key, or __value.name.first.
cypher_query = '''
MERGE (p:Person {first_name: event.name.first, last_name: event.name.last})
SET p.age = event.age
MERGE (c:City {name: event.city})
MERGE (p)-[:LIVES_IN]->(c)
'''

# Configure the sink
neo4j_sink = Neo4jSink(
    host="localhost",
    port=7687,
    username="neo4j",
    password="local_password",
    cypher_query=cypher_query,
)

sdf = app.dataframe(topic=topic)
sdf.sink(neo4j_sink)

if __name__ == "__main__":
    app.run()

quixstreams.sinks.community.postgresql

PostgreSQLSink

class PostgreSQLSink(BatchingSink)

[VIEW SOURCE]

PostgreSQLSink.init

def __init__(host: str,
             port: int,
             dbname: str,
             user: str,
             password: str,
             table_name: TableName,
             schema_name: str = "public",
             schema_auto_update: bool = True,
             connection_timeout_seconds: int = 30,
             statement_timeout_seconds: int = 30,
             primary_key_columns: PrimaryKeyColumns = (),
             upsert_on_primary_key: bool = False,
             on_client_connect_success: Optional[
                 ClientConnectSuccessCallback] = None,
             on_client_connect_failure: Optional[
                 ClientConnectFailureCallback] = None,
             **kwargs)

[VIEW SOURCE]

A connector to sink topic data to PostgreSQL.

Arguments:

host: PostgreSQL server address.
port: PostgreSQL server port.
dbname: PostgreSQL database name.
user: Database username.
password: Database user password.
table_name: PostgreSQL table name as either a string or a callable which receives a SinkItem and returns a string.
schema_name: The schema name. Schemas are a way of organizing tables and are not related to the table data; tables are referenced as <schema_name>.<table_name>. PostgreSQL uses "public" by default under the hood.
schema_auto_update: Automatically update the schema when new columns are detected.
connection_timeout_seconds: Timeout for connection.
statement_timeout_seconds: Timeout for DDL operations such as table creation or schema updates.
primary_key_columns: An optional single (string) or list of primary key column(s); len>1 is a composite key, a non-empty str or len==1 is a primary key, and len<1 or empty string means no primary key. Can instead provide a callable, which uses the message value as input and returns a string or list of strings. Often paired with upsert_on_primary_key=True. It must include all currently defined primary key columns on a given table.
upsert_on_primary_key: Upsert based on the given primary_key_columns. If False, every message is treated as an independent entry, and any primary key collisions will consequently raise an exception.
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.
kwargs: Additional parameters for psycopg2.connect.
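
The accepted shapes of primary_key_columns can be summarised with a small sketch. `resolve_primary_keys` and `describe` are hypothetical helpers that restate the rules above in code; they are not the sink's own logic.

```python
def resolve_primary_keys(primary_key_columns, value=None):
    """Return primary key column names as a list, per the rules above."""
    if callable(primary_key_columns):
        # A callable receives the message value and returns a str or list.
        primary_key_columns = primary_key_columns(value)
    if isinstance(primary_key_columns, str):
        # An empty string means "no primary key".
        return [primary_key_columns] if primary_key_columns else []
    return list(primary_key_columns)


def describe(columns):
    if not columns:
        return "no primary key"
    return "primary key" if len(columns) == 1 else "composite key"


single = resolve_primary_keys("id")
composite = resolve_primary_keys(["tenant", "id"])
no_key = resolve_primary_keys(())
dynamic = resolve_primary_keys(
    lambda v: ["region", "id"] if "region" in v else "id",
    value={"id": 1, "region": "eu"},
)
kinds = [describe(c) for c in (single, composite, no_key, dynamic)]
```

A callable is useful when the key columns depend on the message, but as noted above the result must still include every primary key column already defined on the target table.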

quixstreams.sinks.community.pubsub

PubSubTopicNotFoundError

class PubSubTopicNotFoundError(Exception)

[VIEW SOURCE]

Raised when the specified topic does not exist.

PubSubSink

class PubSubSink(BaseSink)

[VIEW SOURCE]

A sink that publishes messages to Google Cloud Pub/Sub.

PubSubSink.init

def __init__(project_id: str,
             topic_id: str,
             service_account_json: Optional[str] = None,
             value_serializer: Callable[[Any], Union[bytes, str]] = json.dumps,
             key_serializer: Callable[[Any], str] = bytes.decode,
             flush_timeout: int = 5,
             on_client_connect_success: Optional[
                 ClientConnectSuccessCallback] = None,
             on_client_connect_failure: Optional[
                 ClientConnectFailureCallback] = None,
             **kwargs) -> None

[VIEW SOURCE]

Initialize the PubSubSink.

Arguments:

project_id: GCP project ID.
topic_id: Pub/Sub topic ID.
service_account_json: an optional JSON string with service account credentials to connect to Pub/Sub. The internal PublisherClient will use the Application Default Credentials if not provided. See https://cloud.google.com/docs/authentication/provide-credentials-adc for more info. Default - None.
value_serializer: Function to serialize the value to string or bytes (defaults to json.dumps).
key_serializer: Function to serialize the key to string (defaults to bytes.decode).
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.
kwargs: Additional keyword arguments passed to PublisherClient.
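
The two connection callbacks might look like the sketch below. It assumes the success callback takes no arguments (the text does not spell this out), and `start` is a hypothetical stand-in for however the sink wraps its connection attempt; only the callbacks themselves are what you would pass to the sink.

```python
import logging

logger = logging.getLogger("sink-example")
events = []


def on_success():
    # Assumed to take no arguments; used here only for logging.
    events.append("connected")
    logger.info("Sink client connected")


def on_failure(exc):
    # Receives the raised exception; log it, then propagate it.
    events.append(f"failed: {exc}")
    logger.error("Sink client failed to connect: %s", exc)
    raise exc


def start(connect, on_client_connect_success, on_client_connect_failure):
    """Hypothetical wrapper around a connection attempt."""
    try:
        connect()
    except Exception as exc:
        on_client_connect_failure(exc)
    else:
        on_client_connect_success()


def failing_connect():
    raise ConnectionError("bad credentials")


start(lambda: None, on_success, on_failure)
try:
    start(failing_connect, on_success, on_failure)
    propagated = False
except ConnectionError:
    propagated = True
```

Re-raising inside `on_failure` is one way to satisfy the requirement that the callback resolves or propagates the exception; a callback that handled the error fully would simply return instead.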

PubSubSink.add

def add(value: Any, key: Any, timestamp: int, headers: HeadersTuples,
        topic: str, partition: int, offset: int) -> None

[VIEW SOURCE]

Publish a message to Pub/Sub.

PubSubSink.flush

def flush() -> None

[VIEW SOURCE]

Wait for all publish operations to complete successfully.

quixstreams.sinks.community.redis

RedisSink

class RedisSink(BatchingSink)

[VIEW SOURCE]

RedisSink.init

def __init__(host: str,
             port: int,
             db: int,
             value_serializer: Callable[[Any], Union[bytes, str]] = json.dumps,
             key_serializer: Optional[Callable[[Any, Any], Union[bytes,
                                                                 str]]] = None,
             password: Optional[str] = None,
             socket_timeout: float = 30.0,
             on_client_connect_success: Optional[
                 ClientConnectSuccessCallback] = None,
             on_client_connect_failure: Optional[
                 ClientConnectFailureCallback] = None,
             **kwargs) -> None

[VIEW SOURCE]

A connector to sink processed data to Redis.

It batches the processed records in memory per topic partition, and flushes them to Redis at the checkpoint.

Arguments:

host: Redis host.
port: Redis port.
db: Redis DB number.
value_serializer: a callable to serialize the value to string or bytes (defaults to json.dumps).
key_serializer: an optional callable to serialize the key to string or bytes. If not provided, the Kafka message key will be used as is.
password: Redis password, optional.
socket_timeout: Redis socket timeout. Default - 30s.
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.
kwargs: Additional keyword arguments passed to the redis.Redis instance.
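
The key_serializer signature takes two positional arguments. The sketch below assumes these are the message key and the message value, which the type hint suggests but the text does not state. `namespaced_key` and the field names are hypothetical.

```python
import json


def namespaced_key(key, value):
    """Hypothetical key_serializer: build a Redis key from key and value."""
    raw = key.decode() if isinstance(key, bytes) else str(key)
    return f"{value['type']}:{raw}"


value = {"type": "order", "total": 12.5}
redis_key = namespaced_key(b"1001", value)
redis_value = json.dumps(value)  # the documented default value_serializer
```

Under that assumption, passing `namespaced_key` as key_serializer would store each record under a key prefixed by its type rather than under the raw Kafka key.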

quixstreams.sinks.community.influxdb1

InfluxDB1Sink

class InfluxDB1Sink(BatchingSink)

[VIEW SOURCE]

InfluxDB1Sink.init

def __init__(host: str,
             database: str,
             measurement: MeasurementSetter,
             port: int = 8086,
             username: Optional[str] = None,
             password: Optional[str] = None,
             fields_keys: FieldsSetter = (),
             tags_keys: TagsSetter = (),
             time_setter: Optional[TimeSetter] = None,
             time_precision: TimePrecision = "ms",
             allow_missing_fields: bool = False,
             include_metadata_tags: bool = False,
             convert_ints_to_floats: bool = False,
             batch_size: int = 1000,
             request_timeout_ms: int = 10_000,
             on_client_connect_success: Optional[
                 ClientConnectSuccessCallback] = None,
             on_client_connect_failure: Optional[
                 ClientConnectFailureCallback] = None)

[VIEW SOURCE]

A connector to sink processed data to InfluxDB v1.

It batches the processed records in memory per topic partition, converts them to the InfluxDB format, and flushes them to InfluxDB at the checkpoint.

The InfluxDB sink transparently handles backpressure if the destination instance cannot accept more data at the moment (e.g., when InfluxDB returns an HTTP 429 error with the "retry_after" header set). When this happens, the sink will notify the Application to pause consuming from the backpressured topic partition until the "retry_after" timeout elapses.

NOTE: InfluxDB1Sink can accept only dictionaries. If the record values are not dicts, you need to convert them to dicts before sinking.

Arguments:

host: InfluxDB host in format "https://"
database: database name
measurement: measurement name as a string. Also accepts a single-argument callable that receives the current message data as a dict and returns a string.
username: database username
password: database password
fields_keys: an iterable (list) of strings used as InfluxDB "fields". Also accepts a single-argument callable that receives the current message data as a dict and returns an iterable of strings.
If present, it must not overlap with "tags_keys".
If empty, the whole record value will be used.

NOTE: The fields' values can only be strings, floats, integers, or booleans. Default - ().
tags_keys: an iterable (list) of strings used as InfluxDB "tags". Also accepts a single-argument callable that receives the current message data as a dict and returns an iterable of strings.
If present, it must not overlap with "fields_keys".
Given keys are popped from the value dictionary since the same key cannot be both a tag and field.
If empty, no tags will be sent.

NOTE: InfluxDB client always converts tag values to strings. Default - ().
time_setter: an optional column name to use as "time" for InfluxDB. Also accepts a callable which receives the current message data and returns either the desired time or None (use default). The time can be an int, string (RFC3339 format), or datetime. The time must match the time_precision argument if not a datetime object, else raises. By default, a record's kafka timestamp with "ms" time precision is used.
time_precision: a time precision to use when writing to InfluxDB. Possible values: "ms", "ns", "us", "s". Default - "ms".
allow_missing_fields: if True, skip the missing field keys, else raise KeyError. Default - False.
include_metadata_tags: if True, includes record's key, topic, and partition as tags. Default - False.
convert_ints_to_floats: if True, converts all integer values to floats. Default - False.
batch_size: how many records to write to InfluxDB in one request. Note that it only affects the size of one write request, and not the number of records flushed on each checkpoint. Default - 1000.
request_timeout_ms: an HTTP request timeout in milliseconds. Default - 10000.
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.
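
The rules for fields_keys, tags_keys, and allow_missing_fields can be restated as a short sketch. `split_record` is a hypothetical helper written from the descriptions above, not the sink's real code; in particular, silently skipping a tag key that is absent from the record is an assumption.

```python
def split_record(value, fields_keys=(), tags_keys=(), allow_missing_fields=False):
    """Toy version of the field/tag rules above."""
    if set(fields_keys) & set(tags_keys):
        raise ValueError("fields_keys and tags_keys must not overlap")

    value = dict(value)  # work on a copy
    # Tag keys are popped from the value; tag values are sent as strings.
    tags = {k: str(value.pop(k)) for k in tags_keys if k in value}

    if not fields_keys:
        # Empty fields_keys: use what is left of the record value.
        fields = value
    elif allow_missing_fields:
        fields = {k: value[k] for k in fields_keys if k in value}
    else:
        fields = {k: value[k] for k in fields_keys}  # KeyError if missing
    return fields, tags


record = {"temp": 21.5, "humidity": 40, "room": 7}
fields, tags = split_record(record, tags_keys=("room",))
lenient_fields, _ = split_record(
    record, fields_keys=("temp", "pressure"), allow_missing_fields=True
)
try:
    split_record(record, fields_keys=("temp", "pressure"))
    strict_raised = False
except KeyError:
    strict_raised = True
```

Here "room" ends up only among the tags (as the string "7"), the lenient call quietly drops the missing "pressure" key, and the strict call raises KeyError.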

quixstreams.sinks.community.tdengine.sink

TDengineSink

class TDengineSink(BatchingSink)

[VIEW SOURCE]

TDengineSink.init

def __init__(host: str,
             database: str,
             supertable: SupertableSetter,
             subtable: SubtableNameSetter,
             fields_keys: FieldsSetter = (),
             tags_keys: TagsSetter = (),
             time_setter: Optional[TimeSetter] = None,
             time_precision: TimePrecision = "ms",
             allow_missing_fields: bool = False,
             include_metadata_tags: bool = False,
             convert_ints_to_floats: bool = False,
             batch_size: int = 1000,
             enable_gzip: bool = True,
             request_timeout_ms: int = 10_000,
             on_client_connect_success: Optional[
                 ClientConnectSuccessCallback] = None,
             on_client_connect_failure: Optional[
                 ClientConnectFailureCallback] = None,
             verify_ssl: bool = True,
             username: str = "",
             password: str = "",
             token: str = "",
             max_retries: int = 5,
             retry_backoff_factor: float = 1.0)

[VIEW SOURCE]

A connector to sink processed data to TDengine.

It batches the processed records in memory per topic partition, converts them to the InfluxDB line protocol, and flushes them to TDengine at the checkpoint.

NOTE: TDengineSink can accept only dictionaries. If the record values are not dicts, you need to convert them to dicts before sinking.

Arguments:

token: TDengine cloud token
host: TDengine host in format "http[s]://[:]".
username: TDengine username
password: TDengine password
verify_ssl: if True, verifies the SSL certificate. Default - True.
database: database name
supertable: supertable name as a string. Also accepts a single-argument callable that receives the current message data as a dict and returns a string.
subtable: subtable name as a string. Also accepts a single-argument callable that receives the current message data as a dict and returns a string. If the subtable name is empty string, a hash value will be generated from the data as the subtable name.
fields_keys: an iterable (list) of strings used as InfluxDB line protocol "fields". Also accepts a single argument callable that receives the current message data as a dict and returns an iterable of strings.
If present, it must not overlap with "tags_keys".
If empty, the whole record value will be used.

NOTE: The fields' values can only be strings, floats, integers, or booleans. Default - ().
tags_keys: an iterable (list) of strings used as InfluxDB line protocol "tags". Also accepts a single-argument callable that receives the current message data as a dict and returns an iterable of strings.
If present, it must not overlap with "fields_keys".
Given keys are popped from the value dictionary since the same key cannot be both a tag and field.
If empty, no tags will be sent.

NOTE: Tag values are always converted to strings. Default - ().
time_setter: an optional column name to use as "time" when converting to InfluxDB line protocol. Also accepts a callable which receives the current message data and returns either the desired time or None (use default). The time can be an int, string (RFC3339 format), or datetime. The time must match the time_precision argument if not a datetime object, else raises. By default, a record's Kafka timestamp with "ms" time precision is used.
time_precision: a time precision to use when converting to InfluxDB line protocol. Possible values: "ms", "ns", "us", "s". Default - "ms".
allow_missing_fields: if True, skip the missing field keys, else raise KeyError. Default - False.
include_metadata_tags: if True, includes record's key, topic, and partition as tags. Default - False.
convert_ints_to_floats: if True, converts all integer values to floats. Default - False.
batch_size: how many records to write to TDengine in one request. Note that it only affects the size of one write request, and not the number of records flushed on each checkpoint. Default - 1000.
enable_gzip: if True, enables gzip compression for writes. Default - True.
request_timeout_ms: an HTTP request timeout in milliseconds. Default - 10000.
on_client_connect_success: An optional callback made after successful client authentication, primarily for additional logging.
on_client_connect_failure: An optional callback made after failed client authentication (which should raise an Exception). Callback should accept the raised Exception as an argument. Callback must resolve (or propagate/re-raise) the Exception.
max_retries: maximum number of retries for failed requests. Default - 5.
retry_backoff_factor: a backoff factor applied between retry attempts starting from the second retry. The sleep duration between retries is calculated as {backoff factor} * (2 ** ({number of previous retries})) seconds. Default - 1.0.
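
The retry_backoff_factor formula can be worked through numerically. The sketch below applies the formula literally and assumes the first retry happens without any delay; `retry_delays` is a hypothetical helper, and the underlying HTTP client may cap or compute delays slightly differently.

```python
def retry_delays(max_retries=5, retry_backoff_factor=1.0):
    """Sleep (seconds) before each retry, following the formula above."""
    delays = []
    for previous_retries in range(max_retries):
        if previous_retries == 0:
            # No backoff before the first retry.
            delays.append(0.0)
        else:
            delays.append(retry_backoff_factor * (2 ** previous_retries))
    return delays


default_delays = retry_delays()
slower_start = retry_delays(max_retries=3, retry_backoff_factor=0.5)
total_default_wait = sum(default_delays)
```

With the defaults this gives waits of 0, 2, 4, 8, and 16 seconds, so a request that fails every time would spend about 30 seconds sleeping between attempts, on top of the time taken by the requests themselves.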

