This article assumes basic knowledge of Kafka and Docker Compose. For a development environment there are no strict requirements around high availability and durability, but we will try to make our cluster resemble the production and staging environments as closely as possible, without worrying about hardware failures.
Our infrastructure architecture is very simple: one virtual machine runs the application, and another virtual machine runs the database, caching engine, search engine, and so on. The application will connect to Kafka over a private IP address.
First, create a kafka-cluster directory where we will store the assets needed to run the Docker containers. Next, download and extract Kafka into that directory.
We will create configfiles and dockerfiles directories to store the properties files for zookeeper and kafka, and the Dockerfiles for them.
[root@localhost kafka-cluster]# mkdir configfiles dockerfiles
Let's prepare the config file for zookeeper. Our zookeeper is private by default and does not expose any port to the host. The file is located at configfiles/zookeeper.properties, and we will mount it into the container when it's running.
# the directory where the snapshot is stored.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=14000
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
# Disable the adminserver by default to avoid port conflicts.
# Set the port to something non-conflicting if choosing to enable this
admin.enableServer=false
# admin.serverPort=8080
Next, we will create a Dockerfile for zookeeper to build the zookeeper image. The file will be located at dockerfiles/ZookeeperDockerfile.
FROM openjdk:20-slim-buster
WORKDIR /app
COPY . .
CMD ./bin/zookeeper-server-start.sh config/zookeeper.properties
Next, we will create docker-compose.yml to set up the zookeeper service, declare its volume, and mount the configuration file into the container. This file will be located in the current directory.
version: "3.9"
networks:
  default:
    name: kafka
    driver: bridge
volumes:
  zookeeperdata:
services:
  zookeeper:
    build:
      context: kafka_2.13-3.3.1
      dockerfile: ../dockerfiles/ZookeeperDockerfile
    volumes:
      - zookeeperdata:/tmp/zookeeper
      - ./configfiles/zookeeper.properties:/app/config/zookeeper.properties
Now the current directory tree looks like this.
[root@localhost kafka-cluster]# tree .
.
├── configfiles
│ └── zookeeper.properties
├── docker-compose.yml
├── dockerfiles
│ └── ZookeeperDockerfile
├── kafka_2.13-3.3.1
│ ├── bin
│ │ ├── connect-distributed.sh
│ │ ├── connect-mirror-maker.sh
│ │ ├── connect-standalone.sh
│ │ ├── kafka-acls.sh
│ │ ├── kafka-broker-api-versions.sh
│ │ ├── kafka-cluster.sh
│ │ ├── kafka-configs.sh
│ │ ├── kafka-console-consumer.sh
│ │ ├── kafka-console-producer.sh
│ │ ├── kafka-consumer-groups.sh
│ │ ├── kafka-consumer-perf-test.sh
│ │ ├── kafka-delegation-tokens.sh
│ │ ├── kafka-delete-records.sh
│ │ ├── kafka-dump-log.sh
│ │ ├── kafka-features.sh
│ │ ├── kafka-get-offsets.sh
Now we are able to build and run the zookeeper container.
[root@localhost kafka-cluster]# docker compose up -d --build zookeeper
[+] Building 3.0s (8/8) FINISHED
[root@localhost kafka-cluster]#
You can check the logs to make sure it's running.
[root@localhost kafka-cluster]# docker compose ps
NAME COMMAND SERVICE
kafka-cluster-zookeeper-1 "/bin/sh -c './bin/z…" zookeeper
[root@localhost kafka-cluster]# docker compose logs -f zookeeper
Next we will set up three kafka brokers. Let's assume the private IP address of our virtual machine is 10.0.2.15. We will have three containers named leesin, garen, and temo; each one will expose a different port on the host machine: 4000, 5000, and 6000.
The Dockerfile for each broker will look like this; the file is located at dockerfiles/KafkaDockerfile.
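A minimal sketch of such a KafkaDockerfile, assuming the same layout as the ZookeeperDockerfile above (the entrypoint and config path are assumptions, not necessarily the author's exact file):

```dockerfile
# Assumed sketch, mirroring the ZookeeperDockerfile
FROM openjdk:20-slim-buster
WORKDIR /app
COPY . .
# config/server.properties is mounted per broker by docker-compose
CMD ./bin/kafka-server-start.sh config/server.properties
```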
For each broker, we will prepare a server.properties file defining the port it listens on for inter-broker communication and the port for receiving connections from external clients.
For the leesin broker, we put the server.properties into configfiles/leesin/server.properties and mount the configuration file into the container when it starts. The content looks like this.
Two things to keep an eye on are the listeners value and advertised.listeners.
listeners=INTERNAL://:9092,EXTERNAL://:4000
advertised.listeners=INTERNAL://leesin:9092,EXTERNAL://10.0.2.15:4000
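Putting it together, a minimal server.properties for leesin could look like the following sketch. The broker.id value, the security protocol map, the log directory, and the zookeeper address are assumptions based on the setup so far (the zookeeper service listens on clientPort 14000):

```properties
broker.id=1
listeners=INTERNAL://:9092,EXTERNAL://:4000
advertised.listeners=INTERNAL://leesin:9092,EXTERNAL://10.0.2.15:4000
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
inter.broker.listener.name=INTERNAL
log.dirs=/tmp/kafka-logs
zookeeper.connect=zookeeper:14000
```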
For now, our docker-compose.yml looks like this.
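A plausible leesin service entry, following the same pattern as the zookeeper service (the leesindata volume name is an assumption):

```yaml
# add leesindata under the top-level volumes: key as well
  leesin:
    build:
      context: kafka_2.13-3.3.1
      dockerfile: ../dockerfiles/KafkaDockerfile
    ports:
      - "4000:4000"
    volumes:
      - leesindata:/tmp/kafka-logs
      - ./configfiles/leesin/server.properties:/app/config/server.properties
    depends_on:
      - zookeeper
```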
Before doing that, if you have a firewall set up, make sure you allow incoming traffic on ports 4000, 5000, and 6000.
Now start the leesin broker.
[root@localhost kafka-cluster]# docker compose up -d leesin
You can check the logs and see that the leesin broker is working.
[root@localhost kafka-cluster]# docker compose ps
NAME COMMAND SERVICE STATUS PORTS
kafka-cluster-leesin-1 "/bin/sh -c './bin/k…" leesin running 0.0.0.0:4000->4000/tcp, :::4000->4000/tcp
kafka-cluster-zookeeper-1 "/bin/sh -c './bin/z…" zookeeper running
[root@localhost kafka-cluster]#
Continue creating server.properties files for the two remaining brokers, garen and temo.
garen's file looks like this; it is located at configfiles/garen/server.properties (the only differences from leesin's configuration file are broker.id, listeners, and advertised.listeners).
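For illustration, assuming leesin used broker.id=1, the differing lines for garen might be:

```properties
broker.id=2
listeners=INTERNAL://:9092,EXTERNAL://:5000
advertised.listeners=INTERNAL://garen:9092,EXTERNAL://10.0.2.15:5000
```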
temo's file looks like this; it is located at configfiles/temo/server.properties.
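Similarly, temo's differing lines might be (the broker.id value is an assumption):

```properties
broker.id=3
listeners=INTERNAL://:9092,EXTERNAL://:6000
advertised.listeners=INTERNAL://temo:9092,EXTERNAL://10.0.2.15:6000
```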
For now, the directory tree looks like this.
[root@localhost kafka-cluster]# tree
.
├── configfiles
│ ├── garen
│ │ └── server.properties
│ ├── leesin
│ │ └── server.properties
│ ├── temo
│ │ └── server.properties
│ └── zookeeper.properties
├── docker-compose.yml
├── dockerfiles
│ ├── KafkaDockerfile
│ └── ZookeeperDockerfile
├── kafka_2.13-3.3.1
│ ├── bin
│ │ ├── connect-distributed.sh
│ │ ├── connect-mirror-maker.sh
│ │ ├── kafka-server-start.sh
│ │ ├── zookeeper-security-migration.sh
│ │ ├── zookeeper-server-start.sh
│ │ ├── zookeeper-server-stop.sh
│ │ └── zookeeper-shell.sh
│ ├── config
│ │ ├── connect-console-sink.properties
│ │ ├── connect-console-source.properties
│ │ ├── connect-distributed.properties
│ │ ├── connect-file-sink.properties
Update docker-compose.yml and add the garen and temo services. The file's content looks like this.
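A sketch of the two new services, following the leesin pattern (the volume names are assumptions and would also be declared under the top-level volumes: key):

```yaml
  garen:
    build:
      context: kafka_2.13-3.3.1
      dockerfile: ../dockerfiles/KafkaDockerfile
    ports:
      - "5000:5000"
    volumes:
      - garendata:/tmp/kafka-logs
      - ./configfiles/garen/server.properties:/app/config/server.properties
    depends_on:
      - zookeeper
  temo:
    build:
      context: kafka_2.13-3.3.1
      dockerfile: ../dockerfiles/KafkaDockerfile
    ports:
      - "6000:6000"
    volumes:
      - temodata:/tmp/kafka-logs
      - ./configfiles/temo/server.properties:/app/config/server.properties
    depends_on:
      - zookeeper
```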
Time to start them.
[root@localhost kafka-cluster]# docker compose up -d garen temo
Verify these containers are running.
[root@localhost kafka-cluster]# docker compose ps
NAME COMMAND SERVICE STATUS PORTS
kafka-cluster-garen-1 "/bin/sh -c './bin/k…" garen running 0.0.0.0:5000->5000/tcp, :::5000->5000/tcp
kafka-cluster-leesin-1 "/bin/sh -c './bin/k…" leesin running 0.0.0.0:4000->4000/tcp, :::4000->4000/tcp
kafka-cluster-temo-1 "/bin/sh -c './bin/k…" temo running 0.0.0.0:6000->6000/tcp, :::6000->6000/tcp
kafka-cluster-zookeeper-1 "/bin/sh -c './bin/z…" zookeeper running
[root@localhost kafka-cluster]#
Now we should set up a user interface to check the current state of the kafka cluster. I will use an open-source tool, https://github.com/provectus/kafka-ui. In the docker-compose.yml file, I will add this service.
  ui:
    image: provectuslabs/kafka-ui:latest
    ports:
      - "8080:8080"
    env_file:
      - env-files/ui.env
The env file looks like this; it is located at env-files/ui.env.
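A plausible ui.env using kafka-ui's KAFKA_CLUSTERS_0_* environment variables (the cluster name is arbitrary; the brokers are reached over the internal listener since the ui container joins the same kafka network):

```properties
KAFKA_CLUSTERS_0_NAME=dev-cluster
KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS=leesin:9092,garen:9092,temo:9092
KAFKA_CLUSTERS_0_ZOOKEEPER=zookeeper:14000
```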
Start this service.
[root@localhost kafka-cluster]# docker compose up -d ui
Now visit 10.0.2.15:8080 to see the cluster status. Using the UI, you can also create some topics and push messages to them; these steps are to make sure our cluster works as expected.
I pushed the source code to GitHub; you can download the source code here.
Thanks for reading!