In this guide we will go through the process of installing and setting up Apache Cassandra 4 on Rocky Linux 8 and RHEL 8.
Apache Cassandra is a free and open source NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. It was initially developed at Facebook, which released it as open source, and it later became an Apache Software Foundation project.
Apache Cassandra is suited for massive, exponentially-growing amounts of constantly transforming data.
Cassandra Compared to RDBMS
Cassandra has close analogies to concepts in relational databases:
- Keyspace: similar to a database/schema in an RDBMS
- Table: similar to a table in an RDBMS
- Row: similar to a row in an RDBMS
- Column: similar to a column in an RDBMS
- Primary key: similar to a primary key in an RDBMS
Prerequisites
To follow along with this guide, you need:
- Rocky Linux 8 server
- Root access to the server or user with sudo access
- Internet access to download the packages
These are the steps we will follow to install Cassandra:
- Ensure our system is up to date
- Install Java in the system
- Install Apache Cassandra in the system
- Install and configure Apache Cassandra Client (cqlsh)
- Configure Apache Cassandra
Step 1. Ensure our system is up to date
Let’s make sure that the Rocky Linux 8 packages installed on the server are up to date. You can do this by running the following command:
sudo dnf -y update
Step 2. Install Java in the system
This guide runs Apache Cassandra on Java 8, so Java must be installed on the system first. Check whether Java is already installed by typing this command:
java -version
If you see this output:
# java -version
-bash: java: command not found
then Java is not installed. Let's install it with this command:
sudo dnf install -y java-1.8.0-openjdk
Once the installation is complete, confirm it with this command:
# java -version
openjdk version "1.8.0_302"
OpenJDK Runtime Environment (build 1.8.0_302-b08)
OpenJDK 64-Bit Server VM (build 25.302-b08, mixed mode)
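If you want to script the check above instead of reading the output by eye, here is a minimal sketch. The function name java_major_version is our own invention, and it assumes the version string appears in double quotes on the first line, as in the output shown above.

```shell
# java_major_version TEXT: print the major Java version found in the output
# of `java -version`, e.g. 'openjdk version "1.8.0_302"' -> 8.
# (Hypothetical helper, not part of Java or Cassandra.)
java_major_version() {
  # Pull out the quoted version string, e.g. 1.8.0_302 or 11.0.12.
  version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$version" in
    '')  return 1 ;;                                   # no version string found
    1.*) printf '%s\n' "$version" | cut -d. -f2 ;;     # legacy 1.x numbering
    *)   printf '%s\n' "$version" | cut -d. -f1 ;;     # modern numbering
  esac
}

# Only run the check when java is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  major=$(java_major_version "$(java -version 2>&1)" || true)
  if [ "$major" = "8" ]; then
    echo "Java 8 found"
  else
    echo "Java major version is '${major}', this guide uses Java 8"
  fi
fi
```

Note that java -version writes to stderr, which is why the sketch redirects it with 2>&1 before parsing.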
Now that Java 8 is installed in our system, let's install Cassandra.
Step 3. Install Apache Cassandra in the system
Apache Cassandra is not available in the default Rocky Linux 8 repositories, so let's add a repo that points to the Cassandra packages.
As root, create the file /etc/yum.repos.d/cassandra.repo with the required content using this command:
cat > /etc/yum.repos.d/cassandra.repo <<EOF
[cassandra]
name=Apache Cassandra
baseurl=https://downloads.apache.org/cassandra/redhat/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF
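Before installing, it can help to confirm that the repo file contains what we expect. The following is a small sketch; check_repo_file is a hypothetical helper name, and the patterns it looks for are simply the keys used in the file above.

```shell
# check_repo_file FILE: succeed only if FILE has a [cassandra] section with
# an https baseurl, an https gpgkey and gpgcheck=1.
# (Hypothetical helper written for this guide.)
check_repo_file() {
  file=$1
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  for pattern in '^\[cassandra\]$' '^baseurl=https://' '^gpgcheck=1$' '^gpgkey=https://'; do
    grep -q "$pattern" "$file" || { echo "missing line matching: $pattern" >&2; return 1; }
  done
  echo "$file looks complete"
}

# Example, using the path from this guide:
#   check_repo_file /etc/yum.repos.d/cassandra.repo
```

This only checks that the expected lines are present; it does not verify that the repository URL is reachable.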
Now that we have added the repo, let's install Cassandra:
sudo dnf install -y cassandra
Confirm that Cassandra has been installed using this command:
# rpm -qi cassandra
Name : cassandra
Version : 4.0.0
Release : 1
Architecture: noarch
Install Date: Tue 31 Aug 2021 08:59:00 AM UTC
Group : Development/Libraries
Size : 54941890
License : Apache Software License 2.0
Signature : RSA/SHA512, Thu 22 Jul 2021 10:22:35 PM UTC, Key ID 5e85b9ae0b84c041
Source RPM : cassandra-4.0.0-1.src.rpm
Build Date : Thu 22 Jul 2021 10:22:10 PM UTC
Build Host : 0b542adba94d
Relocations : (not relocatable)
URL : http://cassandra.apache.org/
Summary : Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store.
Description :
Cassandra is a distributed (peer-to-peer) system for the management and storage of structured data.
Once the package is installed, the cassandra binary is available at /usr/sbin/cassandra. Since Rocky Linux 8 manages services using systemd, let's create a systemd unit file at /etc/systemd/system/cassandra.service with the content required to manage the cassandra service. Run this command as root:
cat > /etc/systemd/system/cassandra.service <<EOF
[Unit]
Description=Apache Cassandra 4.0
After=network.target
[Service]
Type=simple
PIDFile=/var/run/cassandra/cassandra.pid
User=cassandra
Group=cassandra
ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/cassandra.pid
Restart=always
[Install]
WantedBy=multi-user.target
EOF
Once the file is created, you can use systemd to manage the service.
Run this command so that systemd registers the new service:
sudo systemctl daemon-reload
To start the service:
sudo systemctl start cassandra
Confirm that the service is running. The output of the following status command should show Active: active (running):
# sudo systemctl status cassandra
● cassandra.service - Apache Cassandra
Loaded: loaded (/etc/systemd/system/cassandra.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2021-08-31 15:50:07 UTC; 8s ago
Main PID: 100752 (java)
Tasks: 54 (limit: 23800)
Memory: 1.1G
CGroup: /system.slice/cassandra.service
└─100752 java -ea -da:net.openhft... -XX:+UseThreadPriorities -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+UseTLAB -XX:+ResizeTLAB -XX:+UseNUMA -XX:+PerfD>
Aug 31 15:50:13 ipa-server.citizix.light cassandra[100752]: INFO [main] 2021-08-31 15:50:13,284 NativeTransportService.java:68 - Netty using native Epoll event loop
Aug 31 15:50:13 ipa-server.citizix.light cassandra[100752]: INFO [CompactionExecutor:1] 2021-08-31 15:50:13,326 CompactionTask.java:150 - Compacting (20ffe200-0a73-11ec-a273-f980f7c7aa0a) [/var/lib/cassandra>
Aug 31 15:50:13 ipa-server.citizix.light cassandra[100752]: INFO [main] 2021-08-31 15:50:13,329 PipelineConfigurator.java:124 - Using Netty Version: [netty-buffer=netty-buffer-4.1.58.Final.10b03e6, netty-cod>
Aug 31 15:50:13 ipa-server.citizix.light cassandra[100752]: INFO [main] 2021-08-31 15:50:13,329 PipelineConfigurator.java:125 - Starting listening for CQL clients on localhost/127.0.0.1:9042 (unencrypted)...
Aug 31 15:50:13 ipa-server.citizix.light cassandra[100752]: INFO [main] 2021-08-31 15:50:13,334 CassandraDaemon.java:780 - Startup complete
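Cassandra takes a few seconds after systemctl start before it logs "Startup complete", as the timestamps above show. If you are scripting the install, a small retry loop avoids racing the startup. This is a sketch only: wait_until is a hypothetical helper, and the 120 second limit in the example is an arbitrary assumption.

```shell
# wait_until SECONDS COMMAND...: run COMMAND about once per second until it
# succeeds or SECONDS have passed. Returns 0 on success, 1 on timeout.
# (Hypothetical helper written for this guide.)
wait_until() {
  timeout_s=$1
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Example: wait up to 120 seconds for the node to answer nodetool.
#   wait_until 120 nodetool status && echo "Cassandra is up"
```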
Enable the cassandra service to always run on boot:
sudo systemctl enable cassandra
Use the nodetool status command to confirm the status of the current node:
# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 166.07 KiB 16 100.0% 2b8341f0-2638-46bb-a0e0-e20b86f96d0a rack1
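In the output above, the first column combines status and state, so UN means Up and Normal. For a quick health check in a script, the output can be filtered with awk. The function names below are our own, and the sketch assumes the column layout shown above.

```shell
# count_up_normal: read `nodetool status` output on stdin and print the
# number of nodes whose first column is UN (Up/Normal).
count_up_normal() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# nodes_not_up_normal: print the address of every node that is listed with
# a status/state code other than UN (for example DN or UJ).
nodes_not_up_normal() {
  awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { print $2 }'
}

# Example:
#   nodetool status | count_up_normal
#   nodetool status | nodes_not_up_normal
```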
Now that the service is up and running, let's install the client.
Step 4. Install and configure Apache Cassandra Client (cqlsh)
Now that the Apache Cassandra service has been installed and started, we need a client to connect to it.
The client tool used to access Cassandra (cqlsh) is a Python client, so before installing it we need to set up a Python environment.
Install Python 3 and pip:
sudo dnf install -y python39 python39-pip
Confirm the installation of Python and pip:
# python3 -V
Python 3.9.2
# pip3 -V
pip 20.2.4 from /usr/lib/python3.9/site-packages/pip (python 3.9)
Using pip, install cqlsh:
sudo pip3 install cqlsh
Now we can use cqlsh to connect to Cassandra. Since Cassandra is installed locally, we do not have to specify a host:
$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.5 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh>
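cqlsh can also run statements from a file with the -f option, which is handy for a first smoke test. The sketch below writes a small CQL file that touches each concept from the RDBMS comparison earlier (keyspace, table, column, primary key, row) and runs it only if cqlsh is found. The keyspace name demo, the table users, the sample row and the file path are all made up for illustration.

```shell
# Hypothetical file location; override by setting CQL_FILE beforehand.
CQL_FILE="${CQL_FILE:-/tmp/demo_schema.cql}"

cat > "$CQL_FILE" <<'EOF'
-- Keyspace: roughly a database/schema in an RDBMS.
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Table with three columns; user_id is the primary key.
CREATE TABLE IF NOT EXISTS demo.users (
  user_id uuid PRIMARY KEY,
  name text,
  email text
);

-- Row: insert one, then read it back.
INSERT INTO demo.users (user_id, name, email)
  VALUES (uuid(), 'Ada', 'ada@example.com');
SELECT user_id, name, email FROM demo.users;
EOF

# Run the file only when cqlsh is on the PATH.
if command -v cqlsh >/dev/null 2>&1; then
  cqlsh -f "$CQL_FILE" || echo "cqlsh could not run $CQL_FILE (is Cassandra up?)"
else
  echo "cqlsh not found; CQL written to $CQL_FILE"
fi
```

A replication factor of 1 is only appropriate for a single-node setup like the one in this guide.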
Now that the client is set up, let’s configure Cassandra.
Step 5. Configure Apache Cassandra
Let's configure Apache Cassandra. The main configuration file is located at /etc/cassandra/default.conf/cassandra.yaml
Notable configs to change:
- cluster_name: the name of your cluster
- seeds: a comma-separated list of the IP addresses of your cluster's seed nodes
- listen_address: the IP address of your node; this is what allows other nodes to communicate with this node
On top of that, you can also set the environment variable JVM_OPTS to add any additional JVM command-line arguments. These are passed to the Cassandra service when it starts.
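To see what the three settings above are currently set to without opening an editor, a grep over the config file is enough. This is a sketch: show_cluster_settings is a hypothetical helper, and the pattern assumes that seeds appears as a "- seeds:" list item, as it does in the stock cassandra.yaml.

```shell
# show_cluster_settings FILE: print the active (not commented out)
# cluster_name, seeds and listen_address lines from a cassandra.yaml.
# (Hypothetical helper written for this guide.)
show_cluster_settings() {
  grep -E '^[[:space:]]*(-[[:space:]]*)?(cluster_name|seeds|listen_address):' "$1"
}

# Example, using the path from this guide:
#   show_cluster_settings /etc/cassandra/default.conf/cassandra.yaml
```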
By default, Cassandra’s cluster name is ‘Test Cluster’. You can change this to your preferred cluster name by logging in using cqlsh and running the command below:
UPDATE system.local
SET cluster_name = 'Citizix Cluster'
WHERE KEY = 'local';
After that, update the config file /etc/cassandra/default.conf/cassandra.yaml with the new name:
sudo vim /etc/cassandra/default.conf/cassandra.yaml
Then update this line:
cluster_name: 'Citizix Cluster'
Save and exit.
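If you prefer not to edit the file by hand, the same change can be made with sed. The following is a sketch under a few assumptions: set_cluster_name is a hypothetical helper, it expects cluster_name to start at the beginning of a line, and it refuses names containing characters that would break the sed expression.

```shell
# set_cluster_name FILE NAME: rewrite the cluster_name line in FILE and keep
# a backup copy as FILE.bak. (Hypothetical helper written for this guide.)
set_cluster_name() {
  file=$1
  name=$2
  # Reject characters that are special in the sed replacement or in YAML quotes.
  case "$name" in
    */*|*\&*|*\'*|*\\*) echo "unsupported character in name: $name" >&2; return 1 ;;
  esac
  sed -i.bak "s/^cluster_name:.*/cluster_name: '$name'/" "$file"
}

# Example (run as root, since the file lives under /etc):
#   set_cluster_name /etc/cassandra/default.conf/cassandra.yaml 'Citizix Cluster'
```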
Let's flush the system keyspace to disk so that the change is persisted:
nodetool flush system
Then restart the cassandra service:
sudo systemctl restart cassandra
Log in again with cqlsh and confirm that the connection banner shows the new cluster name.
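The cluster name can also be read without opening an interactive session, since nodetool describecluster prints it on a "Name:" line. Here is a small sketch that extracts it; the function name is our own.

```shell
# cluster_name_from_describecluster: read `nodetool describecluster` output
# on stdin and print the value of its "Name:" line.
# (Hypothetical helper written for this guide.)
cluster_name_from_describecluster() {
  sed -n 's/^[[:space:]]*Name:[[:space:]]*//p' | head -n 1
}

# Example:
#   nodetool describecluster | cluster_name_from_describecluster
```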
Conclusion
We managed to install and configure Cassandra in this guide. Please note the following:
- Cassandra stores its logs in the directory /var/log/cassandra/, with the main log file located at /var/log/cassandra/system.log
- Since we created a systemd service, you can also check stdout and stderr logs using this command: sudo journalctl -fu cassandra
- /var/lib/cassandra is set as the default data directory. You can change that in the config file
- Cassandra configuration files are stored in the directory /etc/cassandra/, including the default config file /etc/cassandra/default.conf/cassandra.yaml
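When something goes wrong, a quick count of warnings and errors in the main log is a useful first look. This sketch assumes the default log layout, where each line starts with the level (INFO, WARN, ERROR) as in the journal output shown earlier; log_level_counts is a hypothetical helper.

```shell
# log_level_counts FILE: print how many lines in a Cassandra log start with
# ERROR or WARN. (Hypothetical helper written for this guide.)
log_level_counts() {
  awk '$1 == "ERROR" || $1 == "WARN" { c[$1]++ }
       END { printf "ERROR=%d WARN=%d\n", c["ERROR"], c["WARN"] }' "$1"
}

# Example, using the log path from this guide:
#   log_level_counts /var/log/cassandra/system.log
```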
To manage the cassandra service that we created:
Start the service
sudo systemctl start cassandra
Check the service status
sudo systemctl status cassandra
Stop the service
sudo systemctl stop cassandra
Enable the service
sudo systemctl enable cassandra
Restart the service
sudo systemctl restart cassandra
For more information, see the official Apache Cassandra documentation.