How to optimize your Kafka producer for throughput

Question:

How can you optimize your Kafka producer application for throughput?

Example use case:

When optimizing for performance, you'll typically need to consider tradeoffs between throughput and latency. Kafka's design makes it easy to write large volumes of data into a cluster, but many of Kafka's configuration parameters have default settings that optimize for latency. If your use case calls for higher throughput, this tutorial walks you through how to use `kafka-producer-perf-test` to measure baseline performance and tune your producer for large volumes of data.

Hands-on code example:

Short Answer

Here are some producer configuration parameters you can set to increase throughput. The values shown below are for demonstration purposes, and you will need to further tune these for your environment.

  • batch.size: increase to 100000–200000 (default 16384)

  • linger.ms: increase to 10–100 (default 0)

  • compression.type=lz4 (default none, i.e., no compression)

  • acks=1 (default all, since Apache Kafka version 3.0)

For a detailed explanation of these and other configuration parameters, read these recommendations for Kafka developers.
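
As a minimal sketch, the same overrides could be collected in a producer properties file, mirroring the properties format used later in this tutorial (demonstration values, not a definitive recommendation):

# Throughput-oriented producer overrides (tune for your environment)
batch.size=200000
linger.ms=100
compression.type=lz4
acks=1

Note that acks=1 trades durability for throughput: the producer considers a write successful once the partition leader has it, before it is replicated. Confirm that trade-off is acceptable for your use case.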

Run it

Step 1: Provision your Kafka cluster

This tutorial requires access to an Apache Kafka cluster, and the quickest way to get started for free is on Confluent Cloud, which provides Kafka as a fully managed service.

  1. After you log in to Confluent Cloud, click Environments in the lefthand navigation, click on Add cloud environment, and name the environment learn-kafka. Using a new environment keeps your learning resources separate from your other Confluent Cloud resources.

  2. From the Billing & payment section in the menu, apply the promo code CC100KTS to receive an additional $100 of free usage on Confluent Cloud (details). To avoid entering a credit card, also apply the promo code CONFLUENTDEV1, which waives the credit card requirement for 30 days or until your credits run out.

  3. Click on LEARN and follow the instructions to launch a Kafka cluster and enable Schema Registry.

Step 2: Initialize the project

Make a local directory anywhere you’d like for this project:

mkdir optimize-producer-throughput && cd optimize-producer-throughput

Next, create a directory for configuration data:

mkdir configuration

Step 3: Write the cluster information into a local file

From the Confluent Cloud Console, navigate to your Kafka cluster and then select Clients in the lefthand navigation. From the Clients view, click Set up a new client and get the connection information customized to your cluster.

Create new credentials for your Kafka cluster, entering a description so that the key is easy to find and delete later. The Confluent Cloud Console will show a configuration similar to the one below, with your new credentials automatically populated (make sure Show API keys is checked).

Copy and paste it into a configuration/ccloud.properties file on your machine.

# Required connection configs for Kafka producer, consumer, and admin
bootstrap.servers={{ BOOTSTRAP_SERVERS }}
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='{{ CLUSTER_API_KEY }}' password='{{ CLUSTER_API_SECRET }}';
sasl.mechanism=PLAIN

Do not directly copy and paste the above configuration. You must copy it from the Confluent Cloud Console so that it includes your Confluent Cloud information and credentials.

Step 4: Download and set up the Confluent CLI

This tutorial includes steps for Kafka topic management and for producing and consuming events, which you can perform with either the Confluent Cloud Console or the Confluent CLI. Follow the instructions here to install the Confluent CLI, and then follow these steps to connect the CLI to your Confluent Cloud cluster.
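
For example, a typical connection sequence with the CLI looks like the following, where the environment and cluster IDs are placeholders for your own (find them with confluent environment list and confluent kafka cluster list):

confluent login
confluent environment use <env-id>
confluent kafka cluster use <cluster-id>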

Step 5: Create a topic

In this step, create the topic that you'll use during this tutorial:

confluent kafka topic create topic-perf

This creates a topic called topic-perf with the default number of six partitions. A topic partition is the unit of parallelism in Kafka: messages to different partitions can be sent in parallel by producers, written in parallel by different brokers, and read in parallel by different consumers. In general, more partitions yields higher throughput, and to maximize throughput you want enough partitions to spread load across the brokers in your cluster. Although it might seem tempting to create topics with a very large number of partitions, there are trade-offs to increasing the partition count, so choose it carefully after benchmarking producer and consumer throughput in your environment. Also consider the design of your data patterns and key assignments so that messages are distributed as evenly as possible across partitions, avoiding partition imbalance.
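
To confirm the partition count before benchmarking, you can describe the topic (output format varies by CLI version):

confluent kafka topic describe topic-perf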

Step 6: Run a baseline producer performance test

Run a performance test to capture a baseline measurement for your Kafka producer, using the default configuration parameters. This test sends 10,000 records of 8,000 bytes each, with --throughput -1 disabling client-side throttling so that the producer sends as fast as it can.

docker run -v $PWD/configuration/ccloud.properties:/etc/ccloud.properties confluentinc/cp-server:7.3.0 /usr/bin/kafka-producer-perf-test \
    --topic topic-perf \
    --num-records 10000 \
    --record-size 8000 \
    --throughput -1 \
    --producer.config /etc/ccloud.properties

Your results will vary depending on your connectivity and bandwidth to the Kafka cluster.

10000 records sent, 134.560525 records/sec (1.03 MB/sec), 25175.34 ms avg latency, 44637.00 ms max latency, 26171 ms 50th, 39656 ms 95th, 42469 ms 99th, 44377 ms 99.9th.

The key result to note is in the last line: a throughput of 134.560525 records/sec (1.03 MB/sec). This is the baseline producer performance with default configuration values.

Step 7: Run a producer performance test with optimized throughput

Run the Kafka producer performance test again, sending the same number of records of the same size as in the previous test, but this time with configuration values optimized for throughput.

This test applies the throughput-oriented settings from the Short Answer above: a larger batch.size (200000, up from the 16384 default), a longer linger.ms (100, up from the 0 default), lz4 compression (default none), and acks=1 (default all since Apache Kafka 3.0). As before, these values are for demonstration purposes, and you will need to tune them further for your environment.

docker run -v $PWD/configuration/ccloud.properties:/etc/ccloud.properties confluentinc/cp-server:7.3.0 /usr/bin/kafka-producer-perf-test \
    --topic topic-perf \
    --num-records 10000 \
    --record-size 8000 \
    --throughput -1 \
    --producer.config /etc/ccloud.properties \
    --producer-props \
        batch.size=200000 \
        linger.ms=100 \
        compression.type=lz4 \
        acks=1

Your results will vary depending on your connectivity and bandwidth to the Kafka cluster.

10000 records sent, 740.960285 records/sec (5.65 MB/sec), 3801.36 ms avg latency, 8198.00 ms max latency, 3297 ms 50th, 7525 ms 95th, 7949 ms 99th, 8130 ms 99.9th.

The key result to note is in the last line: a throughput of 740.960285 records/sec (5.65 MB/sec). For the test shown here, 5.65 MB/sec is roughly a 5.5x improvement over the 1.03 MB/sec baseline, but again, the improvement factor will vary depending on your environment.

This tutorial has demonstrated how to get started improving producer throughput, and you should do further testing in your environment. Continue to tune these configuration parameters, and test with your specific Kafka producer application, not just with kafka-producer-perf-test.
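
For example, one way to carry the tuned values into your own producer is to keep them in a separate properties file and append them to your connection config; the file names here are illustrative:

# Write the throughput overrides to their own file
cat > configuration/throughput.properties <<EOF
batch.size=200000
linger.ms=100
compression.type=lz4
acks=1
EOF

# Combine connection and tuning configs for your producer application
cat configuration/ccloud.properties configuration/throughput.properties > configuration/producer.properties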

Step 8: Tear down Confluent Cloud resources

You may want to try another tutorial, but if you don't plan on doing other tutorials, use the Confluent Cloud Console or CLI to destroy all of the resources you created, and verify that they are destroyed to avoid unexpected charges.
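
For example, deleting the learning environment with the CLI removes the cluster and all topics inside it; the environment ID is a placeholder for your own (find it with confluent environment list):

confluent environment delete <env-id>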