In our previous posts related to KAFKA, we learned what KAFKA is, its use cases, and how KAFKA complements ETL.
The purpose of the current article is to install KAFKA with Zookeeper in 2 popular ways:
- KAFKA binaries
- Docker compose
The docker-compose approach complements the KAFKA binary installation; however, installing from the binaries will help you gain a deeper understanding of KAFKA’s internal structure.
Installing via KAFKA binaries:
- Go to Apache Kafka
- Select “Download KAFKA” from the navigation
- Download the binaries built with the latest Scala version (the current release is kafka_2.13-3.5.1)
- Create a kafka folder on the C: drive and extract the downloaded archive directly into the C:/kafka folder (see the example command below)
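If you prefer the command line, recent Windows versions ship with a built-in tar command, so an extraction step along these lines should work (the download location below is an assumption):

rem create the target folder and extract the downloaded archive into it
mkdir C:\kafka
tar -xzf %USERPROFILE%\Downloads\kafka_2.13-3.5.1.tgz -C C:\kafka

The archive extracts into a folder named kafka_2.13-3.5.1; the rest of this post refers to it as kafka_your_version (or kafka_3.5.1 in the example commands).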
Before diving into the details, let’s take a moment to understand what ZooKeeper is.
In future posts, we will learn about KAFKA brokers, topics, and partitions.
For now, it is enough to understand that one of the ways to run Kafka is by utilizing ZooKeeper. For years, ZooKeeper was a fundamental component of Kafka's infrastructure, and Kafka was heavily reliant on it: it handled important tasks such as managing consumer offset information and metadata related to topics, partitions, and brokers. More recently, it has become possible to operate Kafka without ZooKeeper thanks to the introduction of an internal equivalent within Kafka itself, known as KRaft. In essence, this internal component now performs the role that ZooKeeper used to play: it manages critical information about Kafka's brokers, topics, and partitions, and it also plays a pivotal role in the leader election process for partitions.
Most companies still rely heavily on ZooKeeper in production, and for that reason we use Zookeeper in the current installation process.
We will use 2 configuration files under the config folder:
- zookeeper.properties - all zookeeper configuration lives here
- server.properties - all configuration related to the KAFKA server lives here
Before editing these files, we need a place to store the data/log files for Zookeeper and Kafka. For this purpose, let’s create a data folder under C:\kafka\kafka_your_version and create 2 folders under it (see the example below).
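The two folder names below are an assumption (the original screenshot is not reproduced here); any names work as long as the properties files in the next steps point at the same paths:

rem assumed layout: data\zookeeper for Zookeeper, data\kafka for KAFKA
mkdir C:\kafka\kafka_3.5.1\data\zookeeper
mkdir C:\kafka\kafka_3.5.1\data\kafka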
Now, open zookeeper.properties (using Notepad++ if possible) and edit it with the below changes:
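The edited file is shown as a screenshot in the original post; typically the only change needed is pointing dataDir at the folder created above (the path below assumes the folder names from the previous step; forward slashes are fine in properties files on Windows):

# directory where Zookeeper stores its snapshot data
dataDir=C:/kafka/kafka_3.5.1/data/zookeeper
# port clients connect to (default)
clientPort=2181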
Then let’s go and edit server.properties like below:
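Likewise, the relevant change in server.properties is log.dirs, while zookeeper.connect keeps its default pointing at the local Zookeeper instance:

# directory where KAFKA stores its partition data (log segments)
log.dirs=C:/kafka/kafka_3.5.1/data/kafka
# connection string for the local Zookeeper instance (default)
zookeeper.connect=localhost:2181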
Cool, now when KAFKA and Zookeeper are running, they will write their log/data files to our folders.
KAFKA stores all of its runnable scripts under the bin folder. You will find multiple .sh files there, which are meant for Linux/Unix systems. For Windows, there is a “windows” subfolder containing .bat files, which are the analogs of the .sh files.
We use “zookeeper-server-start.bat” to run Zookeeper and “kafka-server-start.bat” to run KAFKA.
We will use a command prompt (Win+R -> type cmd -> enter ) to execute these files.
To run Zookeeper properly follow the below steps:
- Open the command prompt.
- Navigate to C:/kafka/kafka_your_version folder
- Execute
bin\windows\zookeeper-server-start.bat config\zookeeper.properties
Zookeeper is ready! Remember that to run Zookeeper we need to pass the configuration file (zookeeper.properties) as an argument to the .bat file.
After running Zookeeper, it is time to run KAFKA:
- Open another command prompt instance
- Navigate to C:/kafka/kafka_your_version folder
- Execute
bin\windows\kafka-server-start.bat config\server.properties
KAFKA is also ready!
If you see startup logs being generated in the console, it means KAFKA has been started successfully!
Remember that to run KAFKA we need to pass the configuration file (server.properties) as an argument to the .bat file.
As you can guess, we can make these commands easier to run by using environment variables (specifically, the Path variable).
- Press the Win button
- Type environment variables and select the given item.
- Select the “Environment variables” button
Add KAFKA’s Windows scripts folder to the Path as a new row:
Sometimes adding/changing Path information requires logging off or restarting. After properly adding the scripts folder (in our case, C:\kafka\kafka_3.5.1\bin\windows) you should be able to run all .bat files from the command prompt without navigating to that folder first.
Let’s try to run KAFKA and Zookeeper again with the new Path configuration:
- Open the command prompt
- Without navigating to any specific folder just type
zookeeper-server-start C:/kafka/kafka_3.5.1/config/zookeeper.properties
and press enter.
- For kafka to run properly just type
kafka-server-start C:/kafka/kafka_3.5.1/config/server.properties
and press enter.
If you want to check whether Kafka and Zookeeper are running, just open another command prompt instance and type the following command:
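The original command is shown as a screenshot; a typical topic-creation command that produces the output mentioned below looks like this (the topic name is just a placeholder):

kafka-topics --create --topic your_topic_name --bootstrap-server localhost:9092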
If you receive “Created topic your_topic_name“ then everything is working just fine! There is no need to understand what this command does yet; we’ll dive into the details of topics in the next articles.
Installing KAFKA and Zookeeper via Docker:
One of the most popular development tools of recent years is, of course, Docker. Docker is a containerization system that helps isolate applications from various installation and conflict issues. If you want to understand the need for Docker and its installation process, you can follow this post.
Without diving into the details of docker-compose, you can just use the content below as your docker-compose.yml file.
version: "2"
services:
  zookeeper:
    image: docker.io/bitnami/zookeeper:3.8
    ports:
      - "2181:2181"
    volumes:
      - "zookeeper_data:/bitnami"
    environment:
      - ALLOW_ANONYMOUS_LOGIN=yes
  kafka:
    image: docker.io/bitnami/kafka:3.4
    ports:
      - "9092:9092"
    volumes:
      - "kafka_data:/bitnami"
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:9094
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092,EXTERNAL://localhost:9094
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,PLAINTEXT:PLAINTEXT
    depends_on:
      - zookeeper
volumes:
  zookeeper_data:
    driver: local
  kafka_data:
    driver: local
Go to the folder where you created the docker-compose.yml file and run the command below:
docker-compose up -d
PS: Docker should be installed and running before executing the above command; otherwise you’ll receive an error from the Docker CLI.
When the command succeeds, the output shows the zookeeper and kafka containers being created and started.
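You can also verify that both containers are up by running the following from the same folder:

docker-compose ps

Both the zookeeper and kafka services should be listed with a running/Up state.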
You can select the kafka container in Docker Desktop, click on it, and navigate to the directory that contains the .sh scripts.
After navigating, use commands like the ones below to test KAFKA from the Docker Desktop “Terminal” tab:
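The original commands are shown as a screenshot; assuming the Bitnami image’s default layout (scripts under /opt/bitnami/kafka/bin), a minimal test that creates and lists a topic could look like this (the topic name is just a placeholder):

cd /opt/bitnami/kafka/bin
# create a test topic inside the container
./kafka-topics.sh --create --topic docker_test_topic --bootstrap-server localhost:9092
# list existing topics to confirm the broker responds
./kafka-topics.sh --list --bootstrap-server localhost:9092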
KAFKA is command-line oriented by default. But if you want a UI for interacting with it, you can download Offset Explorer and connect it to KAFKA.
- Open Offset Explorer
- Right-click on Clusters and select “Add new connection”
- Configure “properties” as described below:
- Navigate to the Security item and configure it as described below:
- Click Add. If everything is ok, you’ll see the below screen: