In our last article, the discussion centered on “Topics and Partitions”. We learned that a topic is a logical container that acts as a wrapper over partitions, while partitions are the real physical containers that store messages. Topics are valuable because they isolate users from the internal structure and make producing and consuming easier to understand and configure. Partitions matter mostly for distribution, fault tolerance, and durability.
Get started with a practical approach to topics and partitions
To create a topic, we first need a running Kafka instance. Check this article for more details. We’ll use containerized Kafka because running the Kafka binaries directly on Windows still has unresolved topic-related problems; for example, deleting a topic still fails with an error on Windows. To avoid these problems, it is recommended not to run Kafka natively on Windows. On Windows you can run it easily via Docker (which runs containers inside a Linux VM), and on other operating systems you can run it directly.
After starting Kafka, click on the running Kafka container, go to the Terminal tab, and type:
cd /opt/bitnami/kafka/bin
The Kafka command-line tools live in this folder as .sh files. You can list them all with the ls command.
We will use the kafka-topics.sh command to create, delete, list, describe, and alter topics.
When called without any arguments, kafka-topics.sh prints the list of arguments it accepts.
The only argument that is always required is --bootstrap-server. We already described the concept of the bootstrap server in detail. In short, it gives the tool the address of a broker to connect to first; because every broker knows about the rest of the cluster, connecting to any one broker is equivalent to connecting to the whole cluster.
Let's try to create a topic:
kafka-topics.sh --create --topic myfirsttopic --bootstrap-server localhost:9092
The --create argument indicates that we’re going to create a topic. We must also pass the --topic argument to specify the topic name.
If you want to see the whole list of topics you have, run the command below:
kafka-topics.sh --list --bootstrap-server localhost:9092
But how about partitions? As you may have noticed, it is possible to create a topic without specifying a partition count, but this is not the recommended way of creating a topic. When the count is not specified, Kafka takes the default from the broker configuration (the num.partitions property). This configuration lives in the kafka_version/config/server.properties file.
In my case the default is 1, so every topic created so far contains only one partition, which gives none of the benefits of partitioning. To get those benefits, you should use a partition count greater than 1. If you plan to create all your topics with the same number of partitions, just change the default in the config file, and that is it. In production, though, this almost never happens: in almost all cases, we create topics with different numbers of partitions depending on business requirements. To specify the partition count for a particular topic, use the --partitions argument:
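As a quick check of that default, the sketch below reads the num.partitions property from a server.properties file. The helper name default_partitions and the Bitnami path are assumptions made for illustration; adjust the path to wherever your installation keeps its config.

```shell
# Hypothetical helper: print the default partition count from a server.properties file.
default_partitions() {
  # Keep only num.partitions lines; take the last one in case the key is repeated.
  grep -E '^num\.partitions=' "$1" | tail -n 1 | cut -d= -f2
}

# Assumed location inside the Bitnami container; change it for other setups.
CONFIG_FILE="/opt/bitnami/kafka/config/server.properties"
if [ -f "$CONFIG_FILE" ]; then
  default_partitions "$CONFIG_FILE"
fi
```

The helper assumes the plain key=value form without spaces around the equals sign, which is how the property usually appears in the shipped config file.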
kafka-topics.sh --create --topic mysecondtopic --partitions 5 --bootstrap-server localhost:9092
After creating a topic, you can see its internal structure (partition count, leader, in-sync replicas, etc.) using the command below:
kafka-topics.sh --describe --topic mysecondtopic --bootstrap-server localhost:9092
Let's describe the first topic we created. We use the --describe argument to see the topic details.
As we see, it contains only one partition because, when creating the topic, we didn't provide a partition count explicitly via the --partitions argument.
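The check above can be reproduced from the terminal. The sketch below describes myfirsttopic and pulls out just the partition count. The helper name partition_count is hypothetical, and it assumes the describe output contains a summary field of the form "PartitionCount: N".

```shell
# Hypothetical helper: extract the PartitionCount value from describe output on stdin.
partition_count() {
  sed -n 's/.*PartitionCount:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Query the broker only when the Kafka CLI is on PATH (as it is inside the container).
if command -v kafka-topics.sh >/dev/null 2>&1; then
  kafka-topics.sh --describe --topic myfirsttopic --bootstrap-server localhost:9092 | partition_count
fi
```

Swapping in mysecondtopic lets you compare the two topics side by side.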
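Changing the partition count after a topic is created is not recommended, since it can change which partition keyed messages are routed to, but it is possible using the --alter argument. Note that Kafka only allows the partition count to be increased, never decreased.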
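For illustration, here is a minimal sketch of raising the partition count of myfirsttopic. The target count of 3 is an arbitrary example value, and the broker address is the one used throughout this article. The command is printed first and executed only if the Kafka CLI is available.

```shell
# Assumptions: topic and broker address from this article; 3 is an illustrative target.
TOPIC="myfirsttopic"
BOOTSTRAP="localhost:9092"
NEW_PARTITIONS=3

# Build the command as a string so it can be reviewed before it runs.
ALTER_CMD="kafka-topics.sh --alter --topic $TOPIC --partitions $NEW_PARTITIONS --bootstrap-server $BOOTSTRAP"
echo "$ALTER_CMD"

# Execute only when the Kafka CLI is on PATH (as it is inside the container).
if command -v kafka-topics.sh >/dev/null 2>&1; then
  $ALTER_CMD
fi
```

Running --describe on the topic afterwards is a simple way to confirm the new count.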
To delete a topic, it is enough to pass the --delete argument together with the topic name.
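A minimal sketch of the delete call, assuming the same broker address as before and using myfirsttopic as the example topic. As with any destructive command, it is printed first and runs only if the Kafka CLI is available.

```shell
# Assumptions: topic and broker address from this article.
TOPIC="myfirsttopic"
BOOTSTRAP="localhost:9092"

# Build the command as a string so it can be reviewed before it runs.
DELETE_CMD="kafka-topics.sh --delete --topic $TOPIC --bootstrap-server $BOOTSTRAP"
echo "$DELETE_CMD"

# Execute only when the Kafka CLI is on PATH, then list the remaining topics.
if command -v kafka-topics.sh >/dev/null 2>&1; then
  $DELETE_CMD
  kafka-topics.sh --list --bootstrap-server "$BOOTSTRAP"
fi
```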