In this guide, we will install and set up Apache Cassandra 4 on Rocky Linux 9 and RHEL 9.
Apache Cassandra is a free and open source NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. It was initially developed at Facebook and later donated to the Apache Software Foundation.
Apache Cassandra is suited for massive, exponentially-growing amounts of constantly transforming data.
Related content:
- How to run Cassandra 4 with Docker and Docker-Compose
- Install and Configure Apache Cassandra 4.0 in Rocky/Alma Linux 8
Cassandra Compared to RDBMS
Cassandra has close analogies to concepts in relational databases:
- Keyspace – similar to a database/schema in an RDBMS
- Table – similar to a table in an RDBMS
- Row – similar to a row in an RDBMS
- Column – similar to a column in an RDBMS
- Primary key – similar to a primary key in an RDBMS
Prerequisites
To follow along with this guide, you need:
- A RHEL 9 based server such as Rocky Linux
- Root access to the server or user with sudo access
- Internet access to download the packages
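The first prerequisite can be checked from a script before going further. The following is a minimal sketch, not part of the original guide: is_el9 is a hypothetical helper name, and it assumes the os-release file carries PLATFORM_ID="platform:el9", as stock Rocky Linux 9 and RHEL 9 installs do.

```shell
#!/usr/bin/env bash
# Sketch: report whether an os-release file describes an Enterprise Linux 9 system.
# is_el9 is a hypothetical helper; the PLATFORM_ID value is an assumption about
# stock Rocky Linux 9 / RHEL 9 installs.
is_el9() {
    local os_release_file="${1:-/etc/os-release}"
    [ -r "$os_release_file" ] || return 1
    # Source the file in a subshell so its variables do not leak into the caller.
    (
        . "$os_release_file"
        [ "${PLATFORM_ID:-}" = "platform:el9" ]
    )
}

if is_el9; then
    echo "Enterprise Linux 9 detected"
else
    echo "This host does not look like an EL9 system"
fi
```

Passing a path as the first argument lets you test the helper against a copy of the file from another machine.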
We will follow these steps to install Cassandra:
- Ensure our system is up to date
- Install Java in the system
- Install Apache Cassandra in the system
- Install and configure Apache Cassandra Client (cqlsh)
- Configure Apache Cassandra
Step 1. Ensure our system is up to date
Let’s make sure that the Rocky Linux 9 packages installed on the server are up to date. You can do this by running the following command:
sudo dnf -y update
Step 2. Install Java in the system
Apache Cassandra 4.0 needs Java to run and supports Java 8 and Java 11. This guide uses OpenJDK 11. First check whether Java is already installed by typing this command:
java -version
If you see this output:
# java -version
-bash: java: command not found
Then Java is not installed. Let's install it with this command:
sudo dnf install -y java-11-openjdk
Once the installation is complete, confirm it with this command:
$ java -version
openjdk version "11.0.16.1" 2022-08-12 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.16.1.1-1.el9_0) (build 11.0.16.1+1-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.16.1.1-1.el9_0) (build 11.0.16.1+1-LTS, mixed mode, sharing)
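If you want to script the check above, the major version can be pulled out of the first line of the java -version output. This is a minimal sketch under stated assumptions: java_major_version is a hypothetical helper (not part of Java or Cassandra), and it assumes the version appears in double quotes after the word "version", as in the output shown above.

```shell
#!/usr/bin/env bash
# Sketch: extract the major Java version from the first line of `java -version`.
# java_major_version is a hypothetical helper, not part of Java or Cassandra.
java_major_version() {
    local version
    # Pull out the quoted version string, e.g. 11.0.16.1 or 1.8.0_345.
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    [ -n "$version" ] || return 1
    case "$version" in
        1.*) version=${version#1.} ;;   # legacy scheme: 1.8.0_345 means Java 8
    esac
    # Keep everything before the first ".", "_" or "-".
    printf '%s\n' "${version%%[._-]*}"
}

if command -v java >/dev/null 2>&1; then
    # `java -version` writes to stderr, so redirect it into the pipe.
    major=$(java_major_version "$(java -version 2>&1 | head -n 1)") || major=unknown
    echo "Java major version: $major"
else
    echo "java not found; install it with: sudo dnf install -y java-11-openjdk"
fi
```

For the output shown above, the helper prints 11.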
Now that Java 11 is installed in our system, let's install Cassandra.
Step 3. Install Apache Cassandra in the system
Apache Cassandra is not available in the default Rocky Linux 9 repositories, so let's add a repo definition pointing to the Cassandra repository.
Create the /etc/yum.repos.d/cassandra.repo file with the required content using this command:
sudo tee /etc/yum.repos.d/cassandra.repo > /dev/null <<EOF
[cassandra]
name=Apache Cassandra
baseurl=https://downloads.apache.org/cassandra/redhat/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF
Now that we have added the repo, let's install Cassandra:
sudo dnf install -y cassandra
Confirm that Cassandra has been installed using this command:
$ rpm -qi cassandra
Name : cassandra
Version : 4.0.5
Release : 1
Architecture: noarch
Install Date: Sat 01 Oct 2022 09:44:55 AM UTC
Group : Development/Libraries
Size : 58087905
License : Apache Software License 2.0
Signature : RSA/SHA256, Tue 12 Jul 2022 10:48:07 AM UTC, Key ID e91335d77e3e87cb
Source RPM : cassandra-4.0.5-1.src.rpm
Build Date : Tue 12 Jul 2022 10:47:47 AM UTC
Build Host : 65ec7b415fb7
URL : http://cassandra.apache.org/
Summary : Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store.
Description :
Cassandra is a distributed (peer-to-peer) system for the management and storage of structured data.
Once the package is installed, the cassandra binary is available at /usr/sbin/cassandra. Since Rocky Linux 9 manages services using systemd, let's create a systemd unit file at /etc/systemd/system/cassandra.service with the content required to manage the cassandra service:
sudo tee /etc/systemd/system/cassandra.service > /dev/null <<EOF
[Unit]
Description=Apache Cassandra 4.0
After=network.target
[Service]
Type=simple
PIDFile=/var/run/cassandra/cassandra.pid
User=cassandra
Group=cassandra
ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/cassandra.pid
Restart=always
[Install]
WantedBy=multi-user.target
EOF
Once the file is created, you can use systemd to manage the service.
Run this command so that systemd picks up the new unit file:
sudo systemctl daemon-reload
To start the service:
sudo systemctl start cassandra
Confirm that the service is running. The output of the following status command should show Active: active (running):
$ sudo systemctl status cassandra
● cassandra.service - Apache Cassandra 4.0
Loaded: loaded (/etc/systemd/system/cassandra.service; disabled; vendor preset: disabled)
Active: active (running) since Sat 2022-10-01 09:45:36 UTC; 5s ago
Main PID: 210456 (java)
Tasks: 15 (limit: 21385)
Memory: 1.1G
CPU: 2.474s
CGroup: /system.slice/cassandra.service
└─210456 /usr/bin/java -ea -da:net.openhft... -XX:+UseThreadPriorities -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+UseTLAB -XX:+ResizeTLAB -XX>
Oct 01 09:45:40 unstable-rockysrv cassandra[210456]: CompileCommand: inline org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
Oct 01 09:45:40 unstable-rockysrv cassandra[210456]: CompileCommand: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo(Ljava/lang/Object;JILjava/lang/Object;JI)I
Oct 01 09:45:40 unstable-rockysrv cassandra[210456]: CompileCommand: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo(Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
Oct 01 09:45:40 unstable-rockysrv cassandra[210456]: CompileCommand: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
Oct 01 09:45:40 unstable-rockysrv cassandra[210456]: CompileCommand: inline org/apache/cassandra/utils/memory/BufferPool$LocalPool.tryGetInternal(IZ)Ljava/nio/ByteBuffer;
Enable the cassandra service to always run on boot:
sudo systemctl enable cassandra
Use the nodetool status command to confirm the state of the current node:
$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 69.08 KiB 16 100.0% 820af619-f0d0-4703-aa1d-90c23fb3c982 rack1
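In the output above, UN means the node is Up and in the Normal state. For scripted health checks, the same report can be parsed. The sketch below is an illustration only: all_nodes_up_normal is a hypothetical helper that reads nodetool status output on stdin and assumes node rows start with the two-letter status/state code shown in the legend.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if every node row in `nodetool status` output is UN.
# all_nodes_up_normal is a hypothetical helper; it reads the report on stdin.
all_nodes_up_normal() {
    awk '
        # Node rows start with a two-letter code: U/D (status), then N/L/J/M (state).
        $1 ~ /^[UD][NLJM]$/ {
            nodes++
            if ($1 != "UN") { bad++; print "not UN: " $2 " (" $1 ")" }
        }
        # Fail when no node rows were seen or when any node is not UN.
        END { exit (nodes > 0 && bad == 0) ? 0 : 1 }
    '
}

# Demo using the single-node report from this guide.
sample='Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 69.08 KiB 16 100.0% 820af619-f0d0-4703-aa1d-90c23fb3c982 rack1'

if printf '%s\n' "$sample" | all_nodes_up_normal; then
    echo "all nodes are Up/Normal"
fi
```

On a live node you would pipe the real report instead: nodetool status | all_nodes_up_normal.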
Now that the service is up and running, let's install the client.
Step 4. Install and configure Apache Cassandra Client (cqlsh)
Now that the Apache Cassandra service has been installed and configured we need to connect to it.
The client tool used to access Cassandra (cqlsh) is a Python client, so before installing it we need to set up a Python environment.
Install Python 3 and pip:
sudo dnf install -y python3 python3-pip
Confirm the installation of Python and pip:
$ python3 -V
Python 3.9.10
$ pip3 -V
pip 21.2.3 from /usr/lib/python3.9/site-packages/pip (python 3.9)
Using pip, install cqlsh:
sudo pip3 install cqlsh
Now we can use cqlsh to connect to Cassandra. Since Cassandra is installed locally, we do not have to specify a host:
$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.5 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh>
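If cqlsh reports a refused connection right after the service starts, the node may still be initializing. One way to handle that in a script is to poll the native transport port (9042 in the banner above) before connecting. This is a minimal sketch, assuming bash with /dev/tcp support and the coreutils timeout command; wait_for_port is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: poll until a TCP port accepts connections or the attempts run out.
# wait_for_port is a hypothetical helper; it relies on bash's /dev/tcp
# pseudo-device and the coreutils `timeout` command.
wait_for_port() {
    local host="$1" port="$2" max_attempts="${3:-120}"
    local attempt=0
    while [ "$attempt" -lt "$max_attempts" ]; do
        # Opening /dev/tcp/HOST/PORT succeeds only if something is listening.
        if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep 1
    done
    return 1
}

# Example (assumes Cassandra listens on 127.0.0.1:9042, as in the banner above):
#   wait_for_port 127.0.0.1 9042 120 && cqlsh
```

The helper returns a non-zero status when the port never opens, so it can gate later steps with &&.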
Now that the client is set up, let's configure Cassandra.
Step 5. Configure Apache Cassandra
The main configuration file is located at /etc/cassandra/default.conf/cassandra.yaml
Notable configs to change:
- cluster_name – the name of your cluster
- seeds – contains the comma-separated list of IP addresses of your cluster seeds
- listen_address – contains the IP address of your node; this is what allows other nodes to communicate with this node
In addition, you can set the JVM_OPTS environment variable to pass any additional JVM command line arguments to the cassandra service when it starts.
By default, Cassandra’s cluster name is ‘Test Cluster’. You can change this to your preferred cluster name by logging in using cqlsh and running the command below:
UPDATE system.local
SET cluster_name = 'Citizix Cluster'
WHERE KEY = 'local';
After that, update the config file /etc/cassandra/default.conf/cassandra.yaml with the new name:
sudo vim /etc/cassandra/default.conf/cassandra.yaml
Then update this line:
cluster_name: 'Citizix Cluster'
Save and exit.
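If you prefer not to edit the file by hand, the same change can be scripted. The following is a minimal sketch rather than an official tool: set_cluster_name is a hypothetical helper, it assumes cluster_name is a top-level key at the start of a line, and it assumes the new name contains no single quotes. The demo runs against a temporary file; on the server, run the helper from a root shell, since the config file is not writable by regular users.

```shell
#!/usr/bin/env bash
# Sketch: rewrite the top-level cluster_name entry of a cassandra.yaml file.
# set_cluster_name is a hypothetical helper; the name must not contain single quotes.
set_cluster_name() {
    local yaml_file="$1" new_name="$2" tmp_file status
    tmp_file=$(mktemp) || return 1
    # Pass the name through the environment so awk does not interpret backslashes.
    NEW_NAME="$new_name" awk '
        /^cluster_name:/ { print "cluster_name: \047" ENVIRON["NEW_NAME"] "\047"; next }
        { print }
    ' "$yaml_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Overwrite in place (instead of mv) to keep the original owner and permissions.
    cat "$tmp_file" > "$yaml_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# Demo on a throwaway file, not on the real configuration.
demo_file=$(mktemp)
printf '%s\n' "cluster_name: 'Test Cluster'" "num_tokens: 16" > "$demo_file"
set_cluster_name "$demo_file" "Citizix Cluster"
cat "$demo_file"
rm -f "$demo_file"
```

The demo prints the rewritten cluster_name line followed by the untouched num_tokens line.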
Flush the system keyspace to disk so that the change is persisted:
nodetool flush system
Then restart the cassandra service:
sudo systemctl restart cassandra
Log in again with cqlsh and confirm that the connection banner shows the new cluster name.
Conclusion
In this guide, we installed and configured Cassandra. Please note the following:
- Cassandra stores its logs in the /var/log/cassandra/ directory, with the main log file located at /var/log/cassandra/system.log
- Since we created a systemd service, you can also check stdout and stderr logs using this command: sudo journalctl -fu cassandra
- /var/lib/cassandra is set as the default data directory. You can update that in the config file
- Cassandra configuration files are stored in the /etc/cassandra/ directory, including the default config file /etc/cassandra/default.conf/cassandra.yaml
- To manage the cassandra service that we created:
# Start the service
sudo systemctl start cassandra
# Check the service status
sudo systemctl status cassandra
# Stop the service
sudo systemctl stop cassandra
# Enable the service
sudo systemctl enable cassandra
# Restart the service
sudo systemctl restart cassandra
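When checking the main log file mentioned above, a quick count of warnings and errors is often a useful first look. This is a minimal sketch under an assumption: count_log_problems is a hypothetical helper, and it assumes each log entry starts with its level (for example WARN or ERROR), which is how the default log layout is expected to look. Adjust the field test if your logging pattern differs.

```shell
#!/usr/bin/env bash
# Sketch: count WARN and ERROR entries in a Cassandra log file.
# count_log_problems is a hypothetical helper; it assumes the log level is the
# first whitespace-separated field of each entry.
count_log_problems() {
    local log_file="${1:-/var/log/cassandra/system.log}"
    awk '
        $1 == "WARN"  { warn++ }
        $1 == "ERROR" { error++ }
        END { printf "WARN=%d ERROR=%d\n", warn, error }
    ' "$log_file"
}

# Demo on a throwaway file with made-up entries (not real Cassandra output).
demo_log=$(mktemp)
printf '%s\n' \
    "INFO  [main] example informational entry" \
    "WARN  [main] example warning entry" \
    "ERROR [main] example error entry" > "$demo_log"
count_log_problems "$demo_log"
rm -f "$demo_log"
```

On the server, call count_log_problems with no argument to read /var/log/cassandra/system.log; reading it may require sudo depending on file permissions.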
Please also check the Cassandra Documentation page here for more info.