In this guide we will go through the process of installing and setting up Apache Cassandra version 4 on the Rocky Linux 8 and RHEL 8 distributions.
Apache Cassandra is a free and open source NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. It was initially developed at Facebook and later open-sourced and donated to the Apache Software Foundation.
Apache Cassandra is suited for massive, exponentially-growing amounts of constantly transforming data.
Cassandra Compared to RDBMS
Cassandra has close analogies to concepts in relational databases:
- Keyspace: similar to a database/schema in an RDBMS
- Table: similar to a table in an RDBMS
- Row: similar to a row in an RDBMS
- Column: similar to a column in an RDBMS
- Primary key: similar to a primary key in an RDBMS
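To make the mapping concrete, here is a minimal sketch that prints a few CQL statements touching each concept. The keyspace `demo`, the table `users`, and its columns are hypothetical names chosen for illustration, and the single-node `SimpleStrategy` replication is an assumption that only suits a test setup.

```shell
# Print illustrative CQL; nothing is sent to Cassandra by this function.
# All names (demo, users, user_id, name, email) are hypothetical.
demo_schema_cql() {
  cat <<'CQL'
-- Keyspace: plays the role of a database/schema
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Table with columns and a primary key
CREATE TABLE IF NOT EXISTS demo.users (
  user_id uuid PRIMARY KEY,
  name text,
  email text
);

-- Row: one entry identified by its primary key
INSERT INTO demo.users (user_id, name, email)
  VALUES (uuid(), 'Ada', 'ada@example.com');

SELECT user_id, name, email FROM demo.users;
CQL
}
```

Once Cassandra and cqlsh are installed (covered in the steps that follow), the statements could be saved with `demo_schema_cql > /tmp/demo.cql` and run with `cqlsh -f /tmp/demo.cql`.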
To follow along with this guide, you need:
- Rocky Linux 8 server
- Root access to the server or user with sudo access
- Internet access to download the packages
The following are the steps we will follow to install Cassandra:
- Ensure our system is up to date
- Install Java in the system
- Install Apache Cassandra in the system
- Install and configure Apache Cassandra Client (cqlsh)
- Configure Apache Cassandra
Step 1. Ensure our system is up to date
Let’s make sure that the Rocky Linux 8 packages installed on the server are up to date. You can do this by running the following command:
sudo dnf -y update
Step 2. Install Java in the system
Apache Cassandra needs Java to run, and this guide uses Java 8. Confirm whether Java is installed by typing this command:
java -version
If you see this output:
# java -version
-bash: java: command not found
Then Java is not installed. Let’s install it with this command:
sudo dnf install -y java-1.8.0-openjdk
Once the installation is complete, confirm it with this command:
# java -version
openjdk version "1.8.0_302"
OpenJDK Runtime Environment (build 1.8.0_302-b08)
OpenJDK 64-Bit Server VM (build 25.302-b08, mixed mode)
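The Java check can also be scripted. The sketch below is one possible way to do it, not part of the Cassandra tooling: the helper names `java_major` and `installed_java_version` are hypothetical, and it assumes `java -version` prints the version in double quotes on a line containing the word "version", as in the output shown here.

```shell
# Reduce a Java version string to its major version.
# Legacy strings such as 1.8.0_302 map to 8; modern ones such as 11.0.12 map to 11.
java_major() {
  jm_version="${1#1.}"        # drop the legacy "1." prefix if present
  echo "${jm_version%%.*}"    # keep everything before the first dot
}

# Read the quoted version from `java -version`, which prints to stderr.
installed_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}
```

For example, `[ "$(java_major "$(installed_java_version)")" = "8" ] && echo "Java 8 found"` prints a message only when the default `java` is version 8.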
Now that Java 8 is installed on our system, let’s install Cassandra.
Step 3. Install Apache Cassandra in the system
Apache Cassandra is not available in the default Rocky Linux 8 repositories. Let’s create the repository file /etc/yum.repos.d/cassandra.repo, pointing to the Cassandra repos, with the required content using this command:
sudo tee /etc/yum.repos.d/cassandra.repo > /dev/null <<EOF
[cassandra]
name=Apache Cassandra
baseurl=https://downloads.apache.org/cassandra/redhat/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF
Now that we have added the repo, let’s install Cassandra:
sudo dnf install -y cassandra
Confirm that Cassandra has been installed using this command:
# rpm -qi cassandra
Name        : cassandra
Version     : 4.0.0
Release     : 1
Architecture: noarch
Install Date: Tue 31 Aug 2021 08:59:00 AM UTC
Group       : Development/Libraries
Size        : 54941890
License     : Apache Software License 2.0
Signature   : RSA/SHA512, Thu 22 Jul 2021 10:22:35 PM UTC, Key ID 5e85b9ae0b84c041
Source RPM  : cassandra-4.0.0-1.src.rpm
Build Date  : Thu 22 Jul 2021 10:22:10 PM UTC
Build Host  : 0b542adba94d
Relocations : (not relocatable)
URL         : http://cassandra.apache.org/
Summary     : Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store.
Description :
Cassandra is a distributed (peer-to-peer) system for the management and storage of structured data.
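If you only need the version rather than the full package record, a small check like the following may be enough. This is a sketch: `is_cassandra_4` is a hypothetical helper name, and it simply tests whether a version string starts with `4.`.

```shell
# Succeeds when the given version string belongs to the 4.x series.
is_cassandra_4() {
  case "$1" in
    4.*) return 0 ;;
    *)   return 1 ;;
  esac
}
```

It can be fed from rpm's query format, for example `is_cassandra_4 "$(rpm -q --queryformat '%{VERSION}' cassandra)" && echo "Cassandra 4.x installed"`.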
Once the package is installed, the cassandra binary is available at /usr/sbin/cassandra. Since Rocky Linux 8 manages services using systemd, let’s create a systemd unit file at /etc/systemd/system/cassandra.service with the content required to manage the cassandra service:
sudo tee /etc/systemd/system/cassandra.service > /dev/null <<EOF
[Unit]
Description=Apache Cassandra 4.0
After=network.target
[Service]
Type=simple
PIDFile=/var/run/cassandra/cassandra.pid
User=cassandra
Group=cassandra
ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/cassandra.pid
Restart=always
[Install]
WantedBy=multi-user.target
EOF
Once the file is created, you can use systemd to manage the service:
Run this command to ensure our new systemd service is registered:
sudo systemctl daemon-reload
To start the service:
sudo systemctl start cassandra
Confirm that the service is running. Make sure the output of the following status command shows Active: active (running):
# sudo systemctl status cassandra
● cassandra.service - Apache Cassandra
   Loaded: loaded (/etc/systemd/system/cassandra.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-08-31 15:50:07 UTC; 8s ago
 Main PID: 100752 (java)
    Tasks: 54 (limit: 23800)
   Memory: 1.1G
   CGroup: /system.slice/cassandra.service
           └─100752 java -ea -da:net.openhft... -XX:+UseThreadPriorities -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+UseTLAB -XX:+ResizeTLAB -XX:+UseNUMA -XX:+PerfD>
Aug 31 15:50:13 ipa-server.citizix.light cassandra: INFO [main] 2021-08-31 15:50:13,284 NativeTransportService.java:68 - Netty using native Epoll event loop
Aug 31 15:50:13 ipa-server.citizix.light cassandra: INFO [CompactionExecutor:1] 2021-08-31 15:50:13,326 CompactionTask.java:150 - Compacting (20ffe200-0a73-11ec-a273-f980f7c7aa0a) [/var/lib/cassandra>
Aug 31 15:50:13 ipa-server.citizix.light cassandra: INFO [main] 2021-08-31 15:50:13,329 PipelineConfigurator.java:124 - Using Netty Version: [netty-buffer=netty-buffer-4.1.58.Final.10b03e6, netty-cod>
Aug 31 15:50:13 ipa-server.citizix.light cassandra: INFO [main] 2021-08-31 15:50:13,329 PipelineConfigurator.java:125 - Starting listening for CQL clients on localhost/127.0.0.1:9042 (unencrypted)...
Aug 31 15:50:13 ipa-server.citizix.light cassandra: INFO [main] 2021-08-31 15:50:13,334 CassandraDaemon.java:780 - Startup complete
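Cassandra can take several seconds after `systemctl start` before it accepts clients. As a rough sketch, a script could wait for the "Startup complete" line seen in the log output above. The helper names `cassandra_started` and `wait_for_cassandra` are hypothetical, and the marker text is taken from this guide's sample output, so it is an assumption that other versions log it in the same form. Note that the journal may still contain the marker from an earlier start; adding `--since` to `journalctl` would narrow the search.

```shell
# Succeeds when the log text on stdin contains Cassandra's startup marker.
cassandra_started() {
  grep -q 'CassandraDaemon.*Startup complete'
}

# Poll the journal until the marker appears or the attempts run out.
# $1: number of attempts (default 30), $2: seconds between attempts (default 2)
wait_for_cassandra() {
  wfc_left="${1:-30}"
  while [ "$wfc_left" -gt 0 ]; do
    if journalctl -u cassandra --no-pager 2>/dev/null | cassandra_started; then
      return 0
    fi
    wfc_left=$((wfc_left - 1))
    sleep "${2:-2}"
  done
  return 1
}
```

A typical call would be `sudo systemctl start cassandra && wait_for_cassandra 30 2 && echo "Cassandra is ready"`.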
Enable the cassandra service to always run on boot:
sudo systemctl enable cassandra
Use the nodetool status command to confirm the status of the current node:
# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load        Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  166.07 KiB  16      100.0%            2b8341f0-2638-46bb-a0e0-e20b86f96d0a  rack1
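In the output above, the first column combines status (U for Up, D for Down) and state (N, L, J, M), so a healthy node reads `UN`. The sketch below shows one way a script might check that; `all_nodes_un` is a hypothetical helper, and it assumes node lines begin with that two-letter code as in the sample output.

```shell
# Read `nodetool status` output on stdin.
# Succeeds only if at least one node line exists and every node is UN (Up/Normal).
all_nodes_un() {
  awk '
    /^[UD][NLJM][[:space:]]/ { nodes++; if ($1 != "UN") bad++ }
    END { exit (nodes > 0 && bad == 0) ? 0 : 1 }
  '
}
```

For example, `nodetool status | all_nodes_un && echo "all nodes Up/Normal"`.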
Now that the service is up and running, let’s install the client:
Step 4. Install and configure Apache Cassandra Client (cqlsh)
Now that the Apache Cassandra service has been installed and configured, we need to connect to it.
The client tool used to access Cassandra (cqlsh) is a Python client, so before installing it we need to set up the Python environment.
Install Python 3 and pip:
sudo dnf install -y python39 python39-pip
Confirm the installation of Python and pip:
# python3 -V
Python 3.9.2
# pip3 -V
pip 20.2.4 from /usr/lib/python3.9/site-packages/pip (python 3.9)
Using pip, install cqlsh:
sudo pip3 install cqlsh
Now we can use cqlsh to connect to Cassandra. Since Cassandra is installed locally, we do not have to specify a host:
$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.5 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh>
Now that the client is set up, let’s configure Cassandra.
Step 5. Configure Apache Cassandra
Let’s configure Apache Cassandra. The main configuration file is located at /etc/cassandra/default.conf/cassandra.yaml.
Notable configs to change:
- cluster_name: the name of your cluster
- seeds: a comma-separated list of the IP addresses of your cluster's seed nodes
- listen_address: the IP address of your node, which other nodes use to communicate with this node
On top of that, you can also set the environment variable JVM_OPTS to pass additional JVM command-line arguments. These are passed to the Cassandra service when it starts.
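Settings such as cluster_name and listen_address are plain `key: value` lines at the top level of cassandra.yaml, so they can be changed without opening an editor. The following is a minimal sketch, not an official tool: `set_top_level_key` is a hypothetical helper, it only handles unindented keys, and values containing `|`, `&`, or backslashes would need escaping first. In the stock file, seeds is typically nested under seed_provider, so this sketch does not cover it.

```shell
# Replace the value of a top-level "key: value" line in a YAML file.
# $1: file, $2: key, $3: new value (quote it yourself if YAML needs quotes)
# A .bak copy of the original file is kept next to it.
set_top_level_key() {
  sed -i.bak "s|^$2:.*|$2: $3|" "$1"
}
```

Run as a user that can write the file, a call might look like `set_top_level_key /etc/cassandra/default.conf/cassandra.yaml listen_address 10.0.0.5`, where 10.0.0.5 is a placeholder for your node's address. Commented-out lines are left alone because they do not start with the key.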
By default, Cassandra’s cluster name is ‘Test Cluster’. You can change this to your preferred cluster name by logging in with cqlsh and running the command below.
UPDATE system.local SET cluster_name = 'Citizix Cluster' WHERE KEY = 'local';
After that, update the config file /etc/cassandra/default.conf/cassandra.yaml with the new name:
sudo vim /etc/cassandra/default.conf/cassandra.yaml
Then update this line:
cluster_name: 'Citizix Cluster'
Save and exit.
Let’s flush the system keyspace to disk so that the change is persisted:
nodetool flush system
Then restart the cassandra service:
sudo systemctl restart cassandra
Log in again with cqlsh to confirm the new cluster name, which appears in the connection banner.
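The confirmation can also be done without an interactive session by querying system.local. This is a sketch under stated assumptions: the helper names `first_cql_value` and `current_cluster_name` are hypothetical, and the parsing assumes cqlsh prints a header row, a dashed separator line, and then the value rows.

```shell
# Extract the first value row from cqlsh's tabular output on stdin:
# the line that follows the dashed separator, with padding trimmed.
first_cql_value() {
  awk '
    /^ *-+ *$/ {
      if ((getline line) > 0) {
        gsub(/^ +| +$/, "", line)
        print line
      }
      exit
    }
  '
}

# Ask the local node for its cluster name (requires a running node and cqlsh).
current_cluster_name() {
  cqlsh -e "SELECT cluster_name FROM system.local;" | first_cql_value
}
```

For example, `[ "$(current_cluster_name)" = "Citizix Cluster" ] && echo "cluster renamed"`.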
We managed to install and configure Cassandra in the above guide. Please note the following:
- Cassandra stores its logs in the /var/log/cassandra/ directory, which also holds the main log file.
- Since we created a systemd service, you can also check the stderr logs using this command:
sudo journalctl -fu cassandra
- /var/lib/cassandra is set as the default data directory. You can change that in the config file.
- Cassandra configuration files are stored in the /etc/cassandra/ directory, including the default config file /etc/cassandra/default.conf/cassandra.yaml.
- To manage the cassandra service that we created:
# Start the service
sudo systemctl start cassandra
# Check the service status
sudo systemctl status cassandra
# Stop the service
sudo systemctl stop cassandra
# Enable the service
sudo systemctl enable cassandra
# Restart the service
sudo systemctl restart cassandra
Please also check the Cassandra documentation for more info.