How to Install and Configure Apache Cassandra 4.0 on Rocky/Alma Linux 9

In this guide we will go through the process of installing and setting up Apache Cassandra version 4 on Rocky Linux 9 and other RHEL 9 based distributions.

Apache Cassandra is a free and open source NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra was initially developed at Facebook, then released as open source, and later became an Apache Software Foundation project.

Apache Cassandra is suited for massive, exponentially-growing amounts of constantly transforming data.

# Cassandra Compared to RDBMS

Cassandra has close analogies to concepts in relational databases:

  • Keyspace – Similar to a database/schema in an RDBMS
  • Table – Similar to a table in an RDBMS
  • Row – Similar to a row in an RDBMS
  • Column – Similar to a column in an RDBMS
  • Primary key – Similar to a primary key in an RDBMS
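
To make these analogies concrete, here is a minimal sketch that writes a small CQL script touching each concept and runs it only if cqlsh is already available (it is installed in Step 4). The keyspace name demo_ks, the table users, and the sample values are hypothetical and used purely for illustration; the replication settings assume a single-node test setup.

```shell
#!/usr/bin/env bash
# Write a small CQL script that mirrors the RDBMS concepts listed above.
# demo_ks and users are hypothetical names used only for illustration.
cql_file=$(mktemp /tmp/cassandra-demo.XXXXXX)

cat > "$cql_file" <<'EOF'
-- Keyspace: comparable to a database/schema
CREATE KEYSPACE IF NOT EXISTS demo_ks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Table with columns and a primary key
CREATE TABLE IF NOT EXISTS demo_ks.users (
  user_id uuid PRIMARY KEY,
  name    text,
  email   text
);

-- Row: one record identified by its primary key
INSERT INTO demo_ks.users (user_id, name, email)
  VALUES (uuid(), 'Jane Doe', 'jane@example.com');

SELECT * FROM demo_ks.users;
EOF

echo "CQL script written to $cql_file"

# Run it only when cqlsh is installed and Cassandra is reachable.
if command -v cqlsh >/dev/null 2>&1; then
    cqlsh -f "$cql_file" || echo "cqlsh could not run the script; is Cassandra up?"
fi
```

The statements are kept idempotent with IF NOT EXISTS so the script can be re-run while experimenting.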

# Prerequisites

To follow along with this guide, you need:

  • A RHEL 9 based server such as Rocky Linux
  • Root access to the server or user with sudo access
  • Internet access to download the packages

These are the steps we will follow to install Cassandra:

  1. Ensure our system is up to date
  2. Install Java in the system
  3. Install Apache Cassandra in the system
  4. Install and configure Apache Cassandra Client (cqlsh)
  5. Configure Apache Cassandra

# Step 1. Ensure our system is up to date

Let’s make sure that the Rocky Linux 9 packages installed on the server are up to date. You can do this by running the following command:

sudo dnf -y update

# Step 2. Install Java in the system

Apache Cassandra 4.0 requires Java 8 or Java 11 to run. Check whether Java is already installed by typing this command:

java -version

If you see this output:

$ java -version
-bash: java: command not found

then Java is not installed. Let's install Java 11 with this command:

sudo dnf install -y java-11-openjdk

Once the installation is complete, confirm it by checking the version again:

$ java -version
openjdk version "11.0.16.1" 2022-08-12 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.16.1.1-1.el9_0) (build 11.0.16.1+1-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.16.1.1-1.el9_0) (build 11.0.16.1+1-LTS, mixed mode, sharing)
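
If you want to script this check, the following is a minimal sketch that extracts the major version from the first line of the java -version output and compares it with the versions Cassandra 4.0 accepts. The function name java_major_version is hypothetical, and the parsing assumes the version appears in double quotes as in the output above.

```shell
#!/usr/bin/env bash
# Extract the major Java version from the first line of `java -version` output.
# Handles both the legacy "1.8.0_345" scheme (major = 8) and the modern
# "11.0.16.1" scheme (major = 11).
java_major_version() {
    local line="$1" version
    # Pull out the quoted version string, e.g. 11.0.16.1 or 1.8.0_345
    version=$(printf '%s\n' "$line" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$version" in
        "")  return 1 ;;                                  # no version string found
        1.*) printf '%s\n' "$version" | cut -d. -f2 ;;    # legacy: 1.8.x -> 8
        *)   printf '%s\n' "$version" | cut -d. -f1 ;;    # modern: 11.x -> 11
    esac
}

# Only query Java when it is actually installed.
if command -v java >/dev/null 2>&1; then
    # `java -version` prints to stderr, so redirect it to stdout.
    major=$(java_major_version "$(java -version 2>&1 | head -n 1)")
    case "$major" in
        8|11) echo "Java $major found: supported by Cassandra 4.0" ;;
        *)    echo "Java ${major:-unknown} found: Cassandra 4.0 expects Java 8 or 11" ;;
    esac
else
    echo "java not found: install it with 'sudo dnf install -y java-11-openjdk'"
fi
```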

Now that Java 11 is installed in our system, let's install Cassandra.

# Step 3. Install Apache Cassandra in the system

Apache Cassandra is not available in the default Rocky Linux 9 repositories. Let's create a repo file pointing to the Cassandra repository.

Create the file /etc/yum.repos.d/cassandra.repo with the required content using this command. Writing under /etc requires root, so the content is passed through sudo tee:

sudo tee /etc/yum.repos.d/cassandra.repo > /dev/null <<EOF
[cassandra]
name=Apache Cassandra
baseurl=https://downloads.apache.org/cassandra/redhat/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF

Now that we have added the repo, let's install Cassandra:

sudo dnf install -y cassandra

Confirm that Cassandra has been installed using this command:

$ rpm -qi cassandra
Name        : cassandra
Version     : 4.0.5
Release     : 1
Architecture: noarch
Install Date: Sat 01 Oct 2022 09:44:55 AM UTC
Group       : Development/Libraries
Size        : 58087905
License     : Apache Software License 2.0
Signature   : RSA/SHA256, Tue 12 Jul 2022 10:48:07 AM UTC, Key ID e91335d77e3e87cb
Source RPM  : cassandra-4.0.5-1.src.rpm
Build Date  : Tue 12 Jul 2022 10:47:47 AM UTC
Build Host  : 65ec7b415fb7
URL         : http://cassandra.apache.org/
Summary     : Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store.
Description :
Cassandra is a distributed (peer-to-peer) system for the management and storage of structured data.

Once the package is installed, the cassandra binary is available at /usr/sbin/cassandra. Since Rocky Linux 9 manages services using systemd, let's create a systemd unit file at /etc/systemd/system/cassandra.service with the content required to manage the Cassandra service. We write it with sudo tee because a plain shell redirection after sudo cat would not run with root privileges:

sudo tee /etc/systemd/system/cassandra.service > /dev/null <<EOF
[Unit]
Description=Apache Cassandra 4.0
After=network.target

[Service]
Type=simple
PIDFile=/var/run/cassandra/cassandra.pid
User=cassandra
Group=cassandra

ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/cassandra.pid
Restart=always

[Install]
WantedBy=multi-user.target
EOF

Once the file is created, you can use systemd to manage the service.

Run this command to ensure our new systemd service is registered:

sudo systemctl daemon-reload

To start the service:

sudo systemctl start cassandra

Confirm that the service is running. Make sure the output of the following status command shows Active: active (running):

$ sudo systemctl status cassandra
● cassandra.service - Apache Cassandra 4.0
     Loaded: loaded (/etc/systemd/system/cassandra.service; disabled; vendor preset: disabled)
     Active: active (running) since Sat 2022-10-01 09:45:36 UTC; 5s ago
   Main PID: 210456 (java)
      Tasks: 15 (limit: 21385)
     Memory: 1.1G
        CPU: 2.474s
     CGroup: /system.slice/cassandra.service
             └─210456 /usr/bin/java -ea -da:net.openhft... -XX:+UseThreadPriorities -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+UseTLAB -XX:+ResizeTLAB -XX>

Oct 01 09:45:40 unstable-rockysrv cassandra[210456]: CompileCommand: inline org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
Oct 01 09:45:40 unstable-rockysrv cassandra[210456]: CompileCommand: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo(Ljava/lang/Object;JILjava/lang/Object;JI)I
Oct 01 09:45:40 unstable-rockysrv cassandra[210456]: CompileCommand: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo(Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
Oct 01 09:45:40 unstable-rockysrv cassandra[210456]: CompileCommand: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
Oct 01 09:45:40 unstable-rockysrv cassandra[210456]: CompileCommand: inline org/apache/cassandra/utils/memory/BufferPool$LocalPool.tryGetInternal(IZ)Ljava/nio/ByteBuffer;

Enable the cassandra service to always run on boot:

sudo systemctl enable cassandra

Use the nodetool status command to confirm the state of the current node. UN in the first column means the node is Up and Normal:

$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  69.08 KiB  16      100.0%            820af619-f0d0-4703-aa1d-90c23fb3c982  rack1
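
For scripted health checks, the status table can be summarized with a few lines of awk. This is a minimal sketch that assumes the column layout shown above, where the first field of each node line is a two-letter status code; the function names count_up_normal and count_nodes are hypothetical.

```shell
#!/usr/bin/env bash
# Count nodes reported as UN (Up/Normal) in `nodetool status` output read from stdin.
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Count every node line; status codes are two letters: U/D followed by N/L/J/M.
count_nodes() {
    awk '$1 ~ /^[UD][NLJM]$/ { n++ } END { print n + 0 }'
}

# Only call nodetool when it is installed.
if command -v nodetool >/dev/null 2>&1; then
    status_output=$(nodetool status 2>&1)
    up=$(printf '%s\n' "$status_output" | count_up_normal)
    total=$(printf '%s\n' "$status_output" | count_nodes)
    echo "$up of $total node(s) are Up/Normal"
fi
```

On the single-node setup in this guide, both counts should be 1 once Cassandra has finished starting.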

Now that the service is up and running, let's install the client.

# Step 4. Install and configure Apache Cassandra Client (cqlsh)

Now that the Apache Cassandra service has been installed and configured we need to connect to it.

The client tool used to access Cassandra (cqlsh) is written in Python, so before installing it we need to set up the Python environment.

Install Python 3 and pip:

sudo dnf install -y python3 python3-pip

Confirm the installation of Python and pip:

$ python3 -V
Python 3.9.10

$ pip3 -V
pip 21.2.3 from /usr/lib/python3.9/site-packages/pip (python 3.9)

Using pip, install cqlsh:

sudo pip3 install cqlsh

Now we can use cqlsh to connect to Cassandra. Since Cassandra is installed locally, we do not have to specify a host:

$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.5 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh>

Now that the client is set up, let’s configure Cassandra.

# Step 5. Configure Apache Cassandra

Let's configure Apache Cassandra. The main configuration file is /etc/cassandra/default.conf/cassandra.yaml

Notable configs to change:

  • cluster_name – the name of your cluster
  • seeds – a comma-separated list of the IP addresses of your cluster's seed nodes
  • listen_address – the IP address of this node, which other nodes use to communicate with it

On top of that, you can set the JVM_OPTS environment variable to pass additional JVM command line arguments to the Cassandra service when it starts.
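
If you prefer to script these changes instead of editing the file by hand, the sketch below rewrites a top-level key with sed. It works on a scratch copy so the live file is untouched until you have reviewed the result. The helper set_yaml_key is hypothetical and performs a plain text substitution, not real YAML parsing: it only handles uncommented top-level keys such as cluster_name and listen_address, and it does not handle seeds, which is nested under seed_provider in the stock file.

```shell
#!/usr/bin/env bash
# Replace the value of an uncommented top-level key in a YAML file.
# Plain text substitution, not a YAML parser: values containing "|" or "&"
# would need escaping, and nested keys such as seeds are not handled.
set_yaml_key() {
    local file="$1" key="$2" value="$3" tmp rc
    tmp=$(mktemp) || return 1
    # Write the edited content to a temp file, then copy it back in place
    # so the original file keeps its owner and permissions.
    sed "s|^${key}:.*|${key}: ${value}|" "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}

conf="/etc/cassandra/default.conf/cassandra.yaml"

# Work on a scratch copy so the live configuration is not modified.
if [ -r "$conf" ]; then
    draft=$(mktemp)
    cp "$conf" "$draft"
    set_yaml_key "$draft" cluster_name "'Citizix Cluster'"
    diff "$conf" "$draft"          # show what would change
    echo "Review $draft, then copy it over $conf with sudo"
fi
```

Keeping the edit on a draft file makes it easy to inspect the diff before any restart picks up the new values.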

By default, Cassandra’s cluster name is ‘Test Cluster’. You can change this to your preferred cluster name by logging in with cqlsh and running the command below:

UPDATE system.local 
SET cluster_name = 'Citizix Cluster' 
WHERE KEY = 'local';

After that update the config file /etc/cassandra/default.conf/cassandra.yaml with the new name:

sudo vim /etc/cassandra/default.conf/cassandra.yaml

Then update this line:

cluster_name: 'Citizix Cluster'

Save and exit.

Let's flush the system keyspace to disk so that the change is persisted before the restart:

nodetool flush system

Then restart the Cassandra service:

sudo systemctl restart cassandra

Log in again with cqlsh to confirm the change. The "Connected to" line of the banner should now show the new cluster name.
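
The same confirmation can be scripted. This is a minimal sketch that assumes nodetool describecluster prints the cluster name on a line starting with "Name:"; the function name cluster_name_from_describe and the expected value are illustrative.

```shell
#!/usr/bin/env bash
# Read `nodetool describecluster` output on stdin and print the cluster name.
# Assumes the name appears on a line of the form "Name: <cluster name>".
cluster_name_from_describe() {
    # Strip leading whitespace and the "Name:" label from the first matching line
    sed -n 's/^[[:space:]]*Name:[[:space:]]*//p' | head -n 1
}

expected="Citizix Cluster"

# Only call nodetool when it is installed.
if command -v nodetool >/dev/null 2>&1; then
    actual=$(nodetool describecluster 2>/dev/null | cluster_name_from_describe)
    if [ "$actual" = "$expected" ]; then
        echo "Cluster name is '$actual'"
    else
        echo "Expected '$expected' but found '${actual:-nothing}'"
    fi
fi
```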

# Conclusion

We managed to install and configure Cassandra in the above guide. Please note the following:

  • Cassandra stores its logs in the /var/log/cassandra/ directory, with the main log file at /var/log/cassandra/system.log
  • Since we created a systemd service, you can also follow the stdout and stderr logs using this command: sudo journalctl -fu cassandra
  • /var/lib/cassandra is the default data directory. You can change it in the config file
  • Cassandra configuration files are stored in the /etc/cassandra/ directory, including the main config file /etc/cassandra/default.conf/cassandra.yaml
  • To manage the cassandra service that we created:

      # Start the service
      sudo systemctl start cassandra

      # Check the service status
      sudo systemctl status cassandra

      # Stop the service
      sudo systemctl stop cassandra

      # Enable the service
      sudo systemctl enable cassandra

      # Restart the service
      sudo systemctl restart cassandra
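
As a quick way to use the log file mentioned above, the following sketch counts ERROR and WARN entries in system.log. It assumes the default log layout, in which each entry starts with its level; the function name count_log_level is hypothetical.

```shell
#!/usr/bin/env bash
# Print how many lines in a Cassandra log start with the given level.
# Assumes each entry begins with the level name followed by whitespace.
count_log_level() {
    local level="$1" file="$2"
    grep -c "^${level}[[:space:]]" "$file"
}

log="/var/log/cassandra/system.log"

# Only inspect the log when it exists and is readable.
if [ -r "$log" ]; then
    echo "ERROR lines: $(count_log_level ERROR "$log")"
    echo "WARN lines:  $(count_log_level WARN "$log")"
    # Show the most recent errors, if any
    grep "^ERROR[[:space:]]" "$log" | tail -n 5
fi
```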

Please also check the official Apache Cassandra documentation for more info.

Last updated on Mar 20, 2024 17:19 +0300
Citizix Ltd