
Simplify development of Kafka applications with Redpanda and Testcontainers

October 4, 2022

Redpanda is a developer-first streaming data platform that is compatible with the Kafka API. It has several advantages: it is ZooKeeper-free and ships as a single native binary, which simplifies many Kubernetes-based deployment scenarios. Redpanda delivers up to 10x lower average latencies and up to 6x faster transactions than Kafka.

Testing is an essential part of the DevOps cycle, but without proper tools it can quickly become a chore and slow down your development process. In this article we look at how you can test applications that need Kafka, using Testcontainers and Redpanda for the best developer experience and efficiency. Redpanda offers a fast-starting single binary that is friendly to containerized workloads, and a powerful, easy-to-use CLI for integrating with other technologies. Testcontainers reinvents the developer experience of creating reliable functional tests.

Testcontainers provides a programmatic API to create, configure and manage the lifecycle of lightweight, throwaway instances of the common technologies applications rely on: databases, message brokers, or anything else that can run in a Docker container. This is an extremely versatile approach to creating integration tests with the real technologies used in production.

For example, you can use the Testcontainers-java library to ensure your Spring application works well against an instance of Redpanda running in a Docker container: the drivers work, the APIs your code uses return the expected responses, the data marshalling mechanisms are compatible, and so on.

Using Redpanda for the tests helps you avoid issues that might be incredibly hard to reproduce with mocks, embedded Kafka, or other means of simulating it. Testcontainers removes the complexity from that setup and makes tests reproducible on any team member's workstation and in CI.

Let’s take a look at how you can run integration tests against an instance of Redpanda using Testcontainers. There are three main things Testcontainers excels at: 

  • Container lifecycle & cleanup
  • Container & service configuration in the container
  • Integration with application or test frameworks

The following line is all you need to create an object representing the Redpanda container: 

RedpandaContainer kafka = new RedpandaContainer("docker.redpanda.com/vectorized/redpanda:v22.2.1");

The API exposes lifecycle methods, so you can start and stop the container. It also lets you programmatically configure both the container (for example, publishing the required ports or setting environment variables) and the service running inside it (for example, copying configuration files into the container or creating the schema in a freshly created database).

Testcontainers-java has different ways to manage the lifecycle, for example tying it to the lifecycle of the JUnit tests.

After the container is started, the last step in the test setup is to make sure your application knows where Redpanda is running. For Kafka-compatible technologies this is the location of the `bootstrap-servers` that clients connect to, which you can query directly from the container object you created:

kafka.getBootstrapServers();
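
For a plain Kafka client without Spring, that string typically ends up in the client configuration. The following is a minimal sketch, assuming you already hold the value returned by `getBootstrapServers()`. The class and method names (`KafkaClientProps`, `producerProps`) are hypothetical, and the address in `main` is only a placeholder. The property keys and serializer class names are the standard Apache Kafka client settings.

```java
import java.util.Properties;

class KafkaClientProps {

    // Hypothetical helper: builds producer settings for a broker address,
    // such as the value returned by kafka.getBootstrapServers().
    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address (assumption); in a test it comes from the container object.
        Properties props = producerProps("localhost:9092");
        System.out.println(props.getProperty("bootstrap.servers"));
    }
}
```

The resulting `Properties` object can then be passed to a Kafka producer constructor, so the test code never hard-codes the randomly mapped port.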

For a Spring Boot app you can use the @DynamicPropertySource mechanism to propagate this value to the Spring context in a very idiomatic way:

  @DynamicPropertySource
  public static void setupthings(DynamicPropertyRegistry registry) {
    registry.add("spring.kafka.bootstrap-servers", kafka::getBootstrapServers);
  }

After that you can run the tests normally. Testcontainers will pull the Docker image, create the container, configure it, run it, and clean everything up after all the tests so the environment can be used for the next test run.

Running the tests 

Let’s look at a sample run of the tests which include using a Redpanda container via Testcontainers. 

First, we add a dependency to the org.testcontainers:redpanda artifact:

<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>redpanda</artifactId>
    <version>1.17.5</version>
    <scope>test</scope>
</dependency>

Then we define the container instance exactly how we saw it above and call .start() on it. 

RedpandaContainer kafka = new RedpandaContainer("docker.redpanda.com/vectorized/redpanda:v22.2.1");
kafka.start();

Let’s look at the logs of the test run and check what’s happening with the Redpanda container:

2022-09-10 21:00:47.926  INFO 80778 --- [ers-lifecycle-1] ?.r.com/vectorized/redpanda:v22.2.1]     : Pulling docker image: docker.redpanda.com/vectorized/redpanda:v22.2.1. Please be patient; this may take some time but only needs to be done once.
2022-09-10 21:00:48.234  INFO 80778 --- [ers-lifecycle-1] o.t.utility.RegistryAuthLocator          : Credential helper/store (docker-credential-desktop) does not have credentials for docker.redpanda.com
2022-09-10 21:00:50.990  INFO 80778 --- [tream--85575341] ?.r.com/vectorized/redpanda:v22.2.1]     : Starting to pull image
2022-09-10 21:00:50.999  INFO 80778 --- [tream--85575341] ?.r.com/vectorized/redpanda:v22.2.1]     : Pulling image layers:  0 pending,  0 downloaded,  0 extracted, (0 bytes/0 bytes)
…
2022-09-10 21:01:03.636  INFO 80778 --- [tream--85575341] ?.r.com/vectorized/redpanda:v22.2.1]     : Pulling image layers:  0 pending,  7 downloaded,  7 extracted, (128 MB/128 MB)
2022-09-10 21:01:03.666  INFO 80778 --- [tream--85575341] ?.r.com/vectorized/redpanda:v22.2.1]     : Pull complete. 7 layers, pulled in 12s (downloaded 128 MB at 10 MB/s)
2022-09-10 21:01:03.692  INFO 80778 --- [ers-lifecycle-1] ?.r.com/vectorized/redpanda:v22.2.1]     : Creating container for image: docker.redpanda.com/vectorized/redpanda:v22.2.1
2022-09-10 21:01:04.223  INFO 80778 --- [ers-lifecycle-1] ?.r.com/vectorized/redpanda:v22.2.1]     : Container docker.redpanda.com/vectorized/redpanda:v22.2.1 is starting: cd842c5676f3a87c4ce749b04d725bb57a7efb0b78a4e4ca0edcc5dc46e8652b
2022-09-10 21:01:05.801  INFO 80778 --- [ers-lifecycle-1] ?.r.com/vectorized/redpanda:v22.2.1]     : Container docker.redpanda.com/vectorized/redpanda:v22.2.1 started in PT18.080461S

Testcontainers downloads the Docker image for the container. Luckily it's a fairly modest 128 MB, which can be pulled pretty quickly (pulled in 12s, downloaded 128 MB at 10 MB/s).

Subsequent runs will of course reuse the local cache and won't spend time pulling the image again, considerably improving the run time:

Container docker.redpanda.com/vectorized/redpanda:v22.2.1 started in PT3.244128S

And the tests pass normally: the application uses Redpanda via its Kafka-compatible API, sending and receiving messages as usual.

Another interesting observation is that, given the compatibility, you can easily replace the container definition with another Kafka implementation and rerun the tests. Substitute the Redpanda container instantiation with a more traditional one:

KafkaContainer kafka = new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:5.4.6"));

With that, we don't have to change another line in the tests, and they run against a Kafka instance in an ephemeral Docker container.

The tests of course pass, but interestingly, the logs show that the Redpanda container starts approximately twice as fast as Apache Kafka (about 3.2 seconds versus about 6.6 seconds):

2022-09-10 21:46:03.783  INFO 81186 --- [ers-lifecycle-0] 🐳 [confluentinc/cp-kafka:5.4.6]         : Container confluentinc/cp-kafka:5.4.6 started in PT6.561209S
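
The "started in" values in these logs are ISO-8601 durations, so `java.time.Duration` can parse them directly if you want to compare your own measurements. The sketch below uses the two warm-start values from the log lines in this article. The class and method names (`StartupTimes`, `ratio`) are hypothetical.

```java
import java.time.Duration;

class StartupTimes {

    // Hypothetical helper: ratio of two ISO-8601 durations,
    // in the form printed by the Testcontainers logs.
    static double ratio(String slower, String faster) {
        Duration a = Duration.parse(slower);
        Duration b = Duration.parse(faster);
        return (double) a.toNanos() / b.toNanos();
    }

    public static void main(String[] args) {
        // Values copied from the log lines above: Kafka vs. Redpanda, image already cached.
        double r = ratio("PT6.561209S", "PT3.244128S");
        System.out.printf("Kafka / Redpanda startup ratio: %.2f%n", r);
    }
}
```

For the numbers above this gives a ratio of roughly 2, which matches the "twice as fast" reading of the logs. A single run is only one sample, so repeat the measurement on your own machine before drawing conclusions.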

One detail worth noting is that having a dedicated Testcontainers module helps with this quite a bit. The RedpandaContainer class encapsulates best practices for both testing and using Redpanda efficiently. For example, the module starts Redpanda in dev-container mode:

command = command + "/usr/bin/rpk redpanda start --mode dev-container ";

The dev-container mode is an umbrella switch that configures Redpanda with the most sensible settings for tests. You can check the issue for more details, but in a nutshell, a container used for tests can make different default tradeoffs than a production deployment: accepting the risk of data corruption if everything crashes, running faster by limiting replication, mapping some files into an in-memory filesystem, and so on.

In the tests you can of course override any of the configuration, so if you want particular tests to run against Redpanda without the dev-container mode, that is possible.

This is a great way to further improve the experience of developers testing their apps with Testcontainers and Redpanda. What's also great is that tests will benefit from optimizations added in the future without any code changes. Note that other Testcontainers modules use similar optimizations, because the module is the most convenient integration point for test-specific configuration that would otherwise make the default production config more confusing.

Conclusion

In this article we looked at how simple it is to use Testcontainers for enabling integration tests against Redpanda running in Docker. We looked at a sample Spring Boot application running the tests and compared the startup times for Redpanda and Apache Kafka containers. In our tests, Redpanda starts roughly twice as fast as Apache Kafka, which is definitely an improvement for integration tests, especially if you want good levels of isolation and start containers more frequently.

One of the reasons for the better startup time is that the abstractions Testcontainers provides are the ideal place to enable test-specific optimizations and config, such as the dev-container mode in the Redpanda case, which enhances the developer experience of testing Java and Kafka applications.

You can look at the sample application, run the tests, and do your own measurements in the GitHub repo.

Happy testing!