How to install Apache Kafka on Ubuntu 22.04

Apache Kafka is a distributed event store and stream-processing platform. It is an open-source system developed by the Apache Software Foundation and written in Java and Scala. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.

Thousands of companies use it for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.

# CORE CAPABILITIES

  • HIGH THROUGHPUT – Deliver messages at network-limited throughput using a cluster of machines with latencies as low as 2ms.
  • SCALABLE – Scale production clusters up to a thousand brokers, trillions of messages per day, petabytes of data, hundreds of thousands of partitions. Elastically expand and contract storage and processing.
  • PERMANENT STORAGE – Store streams of data safely in a distributed, durable, fault-tolerant cluster.
  • HIGH AVAILABILITY – Stretch clusters efficiently over availability zones or connect separate clusters across geographic regions.

Apache Kafka offers five main interfaces (APIs, or Application Programming Interfaces). You can learn more about each API on the official documentation page:

  • Admin API
  • Producer API
  • Consumer API
  • Streams API
  • Connect API

In this guide we will learn how to install Apache Kafka on an Ubuntu 22.04 server and test the installation.

# Step 1: Ensure that the system is up to date

Update the system so that all packages are up to date:

sudo apt update
sudo apt upgrade -y

# Step 2: Install Java

Apache Kafka needs Java to run, so we install that first. The Java version must be 8 or later. We do not need to add any third-party repository because OpenJDK is already available in the Ubuntu base repositories.

Let us install OpenJDK 11 with:

sudo apt install openjdk-11-jre

Type y and press enter when prompted to accept the installation.

You can confirm the version of Java installed using this command:

$ java -version
openjdk version "11.0.16" 2022-07-19
OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu122.04)
OpenJDK 64-Bit Server VM (build 11.0.16+8-post-Ubuntu-0ubuntu122.04, mixed mode, sharing)
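
Because Kafka needs Java 8 or later, the version check can also be scripted. The following is only a minimal sketch: the helper names `java_major`, `installed_java_version`, and `java_is_supported` are made up for illustration, and it assumes `java -version` prints a line containing `version "<number>"` as in the output above.

```shell
#!/bin/sh
# Extract the major Java version from a version string such as
# "11.0.16" (Java 9+) or "1.8.0_292" (legacy scheme used by Java 8).
java_major() {
    ver=$1
    case $ver in
        1.*) ver=${ver#1.} ;;    # legacy "1.8.0_292" -> "8.0_292"
    esac
    echo "${ver%%[!0-9]*}"       # keep the leading digits only
}

# Read the quoted version out of `java -version`, which prints to stderr.
installed_java_version() {
    java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeeds when the given version string is Java 8 or newer.
java_is_supported() {
    major=$(java_major "$1")
    [ -n "$major" ] && [ "$major" -ge 8 ]
}
```

A possible use is `java_is_supported "$(installed_java_version)" && echo "Java OK"`, which should succeed for the OpenJDK 11 build shown above.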

# Step 3: Get the latest Kafka

Apache Kafka is distributed as a tarball on the official website. Check the Kafka Downloads page to find the latest version. At the time of writing, the current stable version is 3.2.3.

We are going to download Kafka to the /opt directory. Use these commands to open a root shell, switch to that directory, and download the release:

sudo -i
cd /opt
curl -LO https://dlcdn.apache.org/kafka/3.2.3/kafka_2.13-3.2.3.tgz
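
Before extracting the archive, it can be worth checking the download against the SHA-512 digest that Apache publishes next to each release tarball. The sketch below is an illustration under assumptions: the helper names `normalize_digest` and `verify_sha512` are hypothetical, and it assumes the published digest uses the `file: ABCD 1234 ...` layout (uppercase hex groups, possibly wrapped over several lines).

```shell
#!/bin/sh
# Reduce a digest listing to bare lowercase hex. Assumes the
# "file: ABCD 1234 ..." layout: drop the "file:" prefix and all whitespace.
normalize_digest() {
    sed 's/^[^:]*://' | tr -d ' \t\r\n' | tr 'A-F' 'a-f'
}

# verify_sha512 FILE EXPECTED_TEXT
# Succeeds only if FILE's SHA-512 matches the published digest text.
verify_sha512() {
    actual=$(sha512sum "$1" | cut -d ' ' -f 1)
    expected=$(printf '%s\n' "$2" | normalize_digest)
    [ -n "$expected" ] && [ "$actual" = "$expected" ]
}
```

With the contents of the release's `.sha512` file saved in a variable (for example fetched with `curl -fsSL` from the link on the downloads page), `verify_sha512 kafka_2.13-3.2.3.tgz "$published" && echo "checksum OK"` reports whether the tarball is intact. The variable name `published` is a placeholder.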

Next, let us extract the archive, delete the tarball, and rename the extracted directory so that Kafka is available in /opt/kafka.

tar -xzvf kafka_2.13-3.2.3.tgz
rm -f kafka_2.13-3.2.3.tgz
mv kafka_2.13-3.2.3 kafka
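
The later steps rely on a handful of scripts and config files under /opt/kafka. A small sketch like the one below can confirm that the extraction and rename left them in place; the function name `kafka_layout_ok` is made up for this example.

```shell
#!/bin/sh
# kafka_layout_ok DIR
# Succeeds when DIR contains the scripts and config files this guide uses.
kafka_layout_ok() {
    for f in bin/kafka-server-start.sh bin/zookeeper-server-start.sh \
             config/server.properties config/zookeeper.properties; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f" >&2
            return 1
        fi
    done
}
```

Running `kafka_layout_ok /opt/kafka && echo "layout OK"` after the `mv` gives a quick yes or no before moving on.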

# Step 3 (continued): Starting Kafka and ZooKeeper

When testing, we can run the ZooKeeper and Kafka service scripts directly by hand.

Run the following commands to start the services in the correct order.

First, start the ZooKeeper service. Do this in the directory with the extracted Kafka files (/opt/kafka):

bin/zookeeper-server-start.sh config/zookeeper.properties

Open another terminal session and run this command to start the Kafka broker:

bin/kafka-server-start.sh config/server.properties

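Once both services have launched successfully, you will have a basic Kafka environment running and ready to use. When you are done testing, stop both with Ctrl+C before creating the systemd units in the next step, because those units use the same ports.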
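
While both processes are running in the foreground, a third terminal can confirm that they are actually listening. This is a sketch under assumptions: it relies on `ss` from iproute2 being installed and on the default ports (2181 for ZooKeeper in the shipped zookeeper.properties, 9092 for the broker); the helper name `port_listening` is hypothetical.

```shell
#!/bin/sh
# port_listening PORT
# Reads `ss -ltn` output on stdin and succeeds if a TCP socket is
# listening on PORT on any local address.
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

For example, `ss -ltn | port_listening 2181 && ss -ltn | port_listening 9092 && echo "both up"` checks ZooKeeper and then the broker.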

# Step 4: Create systemd services for Zookeeper and Kafka

On a production server Kafka has to run in the background, so we create systemd units for both scripts.

# Create the systemd unit file for ZooKeeper

ZooKeeper has to start first. Use this command to open its systemd unit file in your favourite text editor. I am using vim.

sudo vim /etc/systemd/system/zookeeper.service

Add this content to the file

[Unit]
Description=Apache Zookeeper server
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Finally, save and close the file.

# Create the systemd unit file for Kafka

Next, let us create a systemd unit file for the Kafka broker. Open the service file with this command:

sudo vim /etc/systemd/system/kafka.service

Add the following content to the file. Note: change JAVA_HOME if you are using a different Java version. To find the installed JDK directories you can use the command: sudo find /usr/lib/jvm -maxdepth 1 -name '*jdk*'

[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
# Requires= alone does not order startup; After= makes Kafka wait for ZooKeeper.
After=zookeeper.service

[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64"
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh

[Install]
WantedBy=multi-user.target

Save the file and exit.

Finally, reload systemd so that it picks up the newly added units:

sudo systemctl daemon-reload
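
In systemd, Requires= pulls a unit in but does not by itself decide start order; ordering comes from After=. Because Kafka must start after ZooKeeper, it can help to list any required units that have no matching After= entry. The helper below is an illustrative sketch only: the name `unordered_requires` is made up, and it handles just the simple `Key=value` lines used in this guide.

```shell
#!/bin/sh
# unordered_requires UNIT_FILE
# Prints every unit named in Requires= that is not also named in After=.
unordered_requires() {
    awk -F= '
        $1 == "Requires" { n = split($2, r, " "); for (i = 1; i <= n; i++) req[r[i]] = 1 }
        $1 == "After"    { n = split($2, a, " "); for (i = 1; i <= n; i++) aft[a[i]] = 1 }
        END { for (u in req) if (!(u in aft)) print u }
    ' "$1"
}
```

Running `unordered_requires /etc/systemd/system/kafka.service` prints nothing when every required unit is also ordered; any unit name it prints is a candidate for an After= line.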

# Step 5: Start and enable the Zookeeper and Kafka Services

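Now, let’s start both services and enable them so that they also come up after a system reboot.

sudo systemctl start zookeeper
sudo systemctl start kafka
sudo systemctl enable zookeeper kafka

Confirm the status of each service to ensure that both are running as expected. The sample output below was captured before the units were enabled, which is why its Loaded: line reports disabled: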

$ sudo systemctl status zookeeper
● zookeeper.service - Apache Zookeeper server
     Loaded: loaded (/etc/systemd/system/zookeeper.service; disabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-09-30 06:44:05 UTC; 6s ago
       Docs: http://zookeeper.apache.org
   Main PID: 7029 (java)
      Tasks: 28 (limit: 4392)
     Memory: 68.0M
        CPU: 2.196s
     CGroup: /system.slice/zookeeper.service
             └─7029 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20

And for Kafka:

$ sudo systemctl status kafka
● kafka.service - Apache Kafka Server
     Loaded: loaded (/etc/systemd/system/kafka.service; disabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-09-30 06:44:05 UTC; 1min 18s ago
       Docs: http://kafka.apache.org/documentation.html
   Main PID: 7043 (java)
      Tasks: 69 (limit: 4392)
     Memory: 319.0M
        CPU: 6.722s
     CGroup: /system.slice/kafka.service
             └─7043 /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx1G -Xms1G -server -XX:+UseG1GC
Sep 30 06:44:13 unstable-ubuntusrv kafka-server-start.sh[7043]: [2022-09-30 06:44:13,449] INFO [/config/changes-event-process-thread]: Starting (kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread)
Sep 30 06:44:13 unstable-ubuntusrv kafka-server-start.sh[7043]: [2022-09-30 06:44:13,468] INFO [SocketServer listenerType=ZK_BROKER, nodeId=0] Starting socket server acceptors and processors (kafka.network.SocketServer)
Sep 30 06:44:13 unstable-ubuntusrv kafka-server-start.sh[7043]: [2022-09-30 06:44:13,504] INFO [SocketServer listenerType=ZK_BROKER, nodeId=0] Started data-plane acceptor and processor(s) for endpoint : ListenerName(PLAINTEXT) (kafka.network.SocketServer)
Sep 30 06:44:13 unstable-ubuntusrv kafka-server-start.sh[7043]: [2022-09-30 06:44:13,505] INFO [SocketServer listenerType=ZK_BROKER, nodeId=0] Started socket server acceptors and processors (kafka.network.SocketServer)
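
In scripts, reading the status output by eye is not practical, so a polling loop around `systemctl is-active` can stand in for it. This is a minimal sketch: the function name `wait_active` and the 30-attempt default are arbitrary choices for illustration, and "active" only means the process is running, not that the broker already accepts connections.

```shell
#!/bin/sh
# wait_active UNIT [TRIES]
# Polls `systemctl is-active` once per second until UNIT is active,
# giving up after TRIES attempts (default 30).
wait_active() {
    unit=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        if systemctl is-active --quiet "$unit"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "$unit did not become active" >&2
    return 1
}
```

For example, `wait_active zookeeper && wait_active kafka && echo "services running"` blocks until both units report active or one of them times out.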

# Step 6: Create Test Topics on Kafka

Kafka allows us to read, write, store, and process events across many machines. Events are organized and stored in topics, which work much like folders in a filesystem. Create at least one topic from your server terminal with the command below; you can reuse it later to create as many topics as you want.

Let’s say our first topic is named loginevents.

Go to your Kafka directory:

cd /opt/kafka/

And use the Topics script:

./bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic loginevents

If it is successful, you should see this output: Created topic loginevents.

After creating your topics, you can list them with this command:

./bin/kafka-topics.sh --list --bootstrap-server localhost:9092
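
When topic creation is scripted, rerunning the plain `--create` command fails once the topic exists. The sketch below wraps the same script with its `--if-not-exists` option and adds a `--describe` helper. The function names `ensure_topic` and `describe_topic` and the `KAFKA_HOME` and `BOOTSTRAP` variables are assumptions made for this example; their defaults mirror the paths and address used in this guide.

```shell
#!/bin/sh
# Paths and addresses from this guide; override via the environment.
KAFKA_HOME=${KAFKA_HOME:-/opt/kafka}
BOOTSTRAP=${BOOTSTRAP:-localhost:9092}

# ensure_topic NAME [PARTITIONS] [REPLICATION_FACTOR]
# Creates the topic unless it already exists (--if-not-exists).
ensure_topic() {
    "$KAFKA_HOME/bin/kafka-topics.sh" --create --if-not-exists \
        --bootstrap-server "$BOOTSTRAP" \
        --topic "$1" \
        --partitions "${2:-1}" \
        --replication-factor "${3:-1}"
}

# describe_topic NAME
# Shows partition count, leader and replica assignment for the topic.
describe_topic() {
    "$KAFKA_HOME/bin/kafka-topics.sh" --describe \
        --bootstrap-server "$BOOTSTRAP" --topic "$1"
}
```

With these defined, `ensure_topic loginevents` can be run repeatedly, and `describe_topic loginevents` shows how the topic was laid out.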

# Step 7: Test publishing to and consuming from a topic

Finally, we can test publishing events to the topic and consuming them. Kafka offers a Producer API and a Consumer API, and ships a command-line client for each. The producer writes events to a topic, and the consumer reads and displays the events the producer wrote.

Open two terminal tabs or sessions to watch events being produced and read in real time.

In the first terminal, we will run the consumer so we can see what is being published to the topic. Run this command to display events as they arrive:

./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic loginevents --from-beginning

In the second terminal, we will produce an event and then watch the first terminal. Use this command (the older --broker-list option is deprecated in favour of --bootstrap-server):

./bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic loginevents

The command presents a prompt where you can type any text. Each line you enter is sent as an event and displayed in the first terminal.
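
The same round trip can be done without interactive prompts, which is handy for smoke tests. The following is a sketch, not a definitive recipe: the function names `produce_line` and `consume_first`, the `KAFKA_HOME` and `BOOTSTRAP` variables, and the 10-second timeout are choices made for this example, and it assumes a Kafka version whose console producer accepts `--bootstrap-server`.

```shell
#!/bin/sh
# Paths and addresses from this guide; override via the environment.
KAFKA_HOME=${KAFKA_HOME:-/opt/kafka}
BOOTSTRAP=${BOOTSTRAP:-localhost:9092}

# produce_line TOPIC MESSAGE
# Sends one message; the console producer reads records from stdin.
produce_line() {
    printf '%s\n' "$2" | "$KAFKA_HOME/bin/kafka-console-producer.sh" \
        --bootstrap-server "$BOOTSTRAP" --topic "$1"
}

# consume_first TOPIC
# Prints the oldest message in TOPIC and exits; gives up if nothing
# arrives within 10 seconds.
consume_first() {
    "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
        --bootstrap-server "$BOOTSTRAP" --topic "$1" \
        --from-beginning --max-messages 1 --timeout-ms 10000
}
```

For example, `produce_line loginevents "user42 logged in"` followed by `consume_first loginevents` sends one event and reads back the first event stored in the topic.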

# Conclusion

In this guide we learnt how to install and test Apache Kafka on an Ubuntu 22.04 server. You can now connect to it using a client library for the programming language of your choice.

Last updated on Mar 20, 2024 17:19 +0300
Citizix Ltd