How to Run YugabyteDB in Docker and Docker Compose

YugabyteDB is an open-source, PostgreSQL-compatible distributed SQL database. It adds horizontal scalability to applications built for PostgreSQL. It offers the benefits of a typical relational database (e.g., SQL, strong consistency, ACID transactions) together with the advantages of a globally distributed, auto-sharded database system of the kind usually associated with NoSQL databases.

Since YugabyteDB builds on top of PostgreSQL, most tools that work with PostgreSQL work with YugabyteDB as well. You can use pgAdmin to connect to YugabyteDB, and you can use software frameworks and libraries that work with the PostgreSQL drivers.

YugabyteDB is versatile when it comes to data and traffic volumes. Because it provides auto-scaling, auto-sharding, and auto-balancing, you won’t have to rearchitect your system the moment it becomes too successful for the initial architecture to cope with.

It is the best fit for cloud-native OLTP (i.e., real-time, business-critical) applications that need absolute data correctness and require at least one of the following: scalability, high tolerance to failures, or globally distributed deployments.

YugabyteDB exposes two APIs: YSQL, which is compatible with PostgreSQL, and YCQL, which is compatible with the Cassandra Query Language. This lets it combine strengths of the NoSQL Cassandra database and the PostgreSQL database.

In this guide, we will create a local cluster on a single host.

Prerequisites

Before proceeding with this guide to install YugabyteDB, ensure that you have the Docker runtime installed on your local machine. To download and install Docker, check out the Docker installation page.

Running YugabyteDB with Docker

To create a 1-node cluster with a replication factor (RF) of 1, first pull the Docker image:

$ docker pull yugabytedb/yugabyte:2.18.0.1-b4

2.18.0.1-b4: Pulling from yugabytedb/yugabyte
6cd55418c38e: Pull complete
8d173d46dac8: Pull complete
8e6321237433: Pull complete
20c6d227782e: Pull complete
061fc51f8956: Pull complete
cbdefd461ea5: Pull complete
64f07608ca74: Pull complete
866db348ded5: Pull complete
56035b96eb1b: Pull complete
7fa5a20081ec: Pull complete
77565a1d10fe: Pull complete
a34370038700: Pull complete
d033f067031c: Pull complete
04127a190dc6: Pull complete
e188da68db3d: Pull complete
8fdbb705c342: Pull complete
62006d2d98f3: Pull complete
Digest: sha256:bb5221992907b01377a21d9aa6849186e9a41d8d2e4a0c39df940259de671a97
Status: Downloaded newer image for yugabytedb/yugabyte:2.18.0.1-b4
docker.io/yugabytedb/yugabyte:2.18.0.1-b4

Once we have the docker image downloaded, we can create a new container using the following docker run command:

docker run -d --name yugabyte \
    -p 7000:7000 \
    -p 9000:9000 \
    -p 5433:5433 \
    -p 9042:9042 \
    yugabytedb/yugabyte:2.18.0.1-b4 bin/yugabyted start --daemon=false

If you are running macOS Monterey, replace -p 7000:7000 with -p 7001:7000. This is necessary because Monterey enables AirPlay receiving by default, and AirPlay listens on port 7000. This conflicts with YugabyteDB and causes yugabyted start to fail unless you forward the port as shown. With that mapping, the YB-Master Admin UI is reachable on host port 7001 instead of 7000. Alternatively, you can disable AirPlay receiving, start YugabyteDB normally, and then, optionally, re-enable AirPlay receiving.
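The port check described above can be scripted. The following is a minimal sketch, assuming bash (it relies on bash's /dev/tcp redirection); the function names port_in_use and master_ui_port are hypothetical helpers, not part of Docker or YugabyteDB.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if something already accepts TCP connections
# on the given local port. Uses bash's built-in /dev/tcp redirection, so no
# extra tools are needed. The subshell keeps the test connection from leaking.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Hypothetical helper: print the host port to publish the YB-Master UI on.
# Falls back to 7001 when 7000 is already taken (e.g. by AirPlay receiving).
master_ui_port() {
  if port_in_use 7000; then
    echo 7001
  else
    echo 7000
  fi
}

echo "Publish the master UI with: -p $(master_ui_port):7000"
```

The printed mapping can then be substituted for -p 7000:7000 in the docker run command.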

Below is a breakdown of the options:

-d: Runs the container detached, as a background process, and prints the container ID. This returns control of the shell, since the yugabyted process is intended to be long-lived.
--name yugabyte: Gives the container a user-friendly name that later commands can refer to.
-p 7000:7000 -p 9000:9000 -p 5433:5433 -p 9042:9042: Publish container ports on the host so they can be reached from outside the container. Ports 7000 and 9000 serve the YB-Master and YB-TServer Admin UIs, 5433 is the YSQL API, and 9042 is the YCQL API. They are discussed later in this guide.
yugabytedb/yugabyte:2.18.0.1-b4: The container image and version (tag) to run.
bin/yugabyted start --daemon=false: Starts yugabyted, the parent process for YugabyteDB, and directs it not to run in the background. Running in the background is the default behavior and would cause the container to stop.

It is important to note that YugabyteDB is a distributed SQL database and that the image used is only a single node deployment (i.e. a replication factor of 1). This is not typical for a production environment which would usually be RF=3 or even RF=5. Running a multi-node environment locally is possible but beyond the scope of this guide.

Run the following command to check that the container is up:

$ docker ps
CONTAINER ID   IMAGE                        COMMAND                  CREATED          STATUS          PORTS                                                                                                                                                                                                                                                 NAMES
a857794263fb   yugabytedb/yugabyte:latest   "/sbin/tini -- bin/y…"   16 minutes ago   Up 16 minutes   0.0.0.0:5433->5433/tcp, :::5433->5433/tcp, 6379/tcp, 7100/tcp, 0.0.0.0:7000->7000/tcp, :::7000->7000/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp, 7200/tcp, 9100/tcp, 10100/tcp, 11000/tcp, 0.0.0.0:9042->9042/tcp, :::9042->9042/tcp, 12000/tcp   yugabyte

Check logs

$ docker logs -f yugabyte
Starting yugabyted...
YugabyteDB Started
Data placement constraint successfully verified

Clients can now connect to the YSQL and YCQL APIs at localhost:5433 and localhost:9042 respectively. These are database wire-protocol endpoints, not HTTP URLs, so they are not meant to be opened in a browser.
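The container can report as running a little before YSQL accepts connections, so scripts that connect right after docker run may want to wait. The following is a minimal sketch; retry and wait_for_ysql are hypothetical helper names, and the attempt count and delay are arbitrary assumptions. The ysqlsh path and the container name yugabyte are the ones used elsewhere in this guide.

```shell
# Hypothetical helper: run a command until it succeeds.
# Usage: retry ATTEMPTS DELAY_SECONDS command [args...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0      # stop as soon as the command succeeds
    i=$((i + 1))
    sleep "$delay"
  done
  return 1                # every attempt failed
}

# Hypothetical helper: poll YSQL in the "yugabyte" container with a trivial
# query, up to 30 times, 2 seconds apart (assumed values, adjust as needed).
wait_for_ysql() {
  retry 30 2 docker exec yugabyte bash -c \
    '/home/yugabyte/bin/ysqlsh -h "$(hostname)" -p 5433 -c "SELECT 1" >/dev/null 2>&1'
}

# Example: wait_for_ysql && echo "YSQL is ready"
```

Keeping the retry loop separate from the Docker call makes it reusable for other readiness checks.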

Run YugabyteDB with Docker in a persistent volume

In the preceding docker run command, the data stored in YugabyteDB lives inside the container, so it is lost when the container is removed. To make YugabyteDB keep its data on the host, add a volume mount option to the docker run command, as follows:

Create a ~/yb_data directory by executing the following command:

mkdir ~/yb_data

Run Docker with the volume mount option by executing the following command:

docker run -d --name yugabyte \
  -p 7000:7000 -p 9000:9000 -p 5433:5433 -p 9042:9042 \
  -v ~/yb_data:/home/yugabyte/yb_data \
  yugabytedb/yugabyte:2.18.0.1-b4 bin/yugabyted start \
  --base_dir=/home/yugabyte/yb_data --daemon=false

Using docker-compose to run YugabyteDB

Developing with a Docker Compose file is usually faster and less error-prone than provisioning Docker containers individually.

Save the following as docker-compose.yaml:

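version: '3.9'

services:
  yugabyte:
    image: yugabytedb/yugabyte:2.18.0.1-b4
    restart: always
    # Keep data under the mounted volume, as in the docker run example above
    command: bin/yugabyted start --base_dir=/home/yugabyte/yb_data --daemon=false
    ports:
      - "7000:7000"
      - "5433:5433"
      - "9000:9000"
      - "9042:9042"
    volumes:
      - yugabyte_db:/home/yugabyte/yb_data
    environment:
      # POSTGRES_* names follow the postgres image convention. yugabyted is
      # documented to read YSQL_USER, YSQL_PASSWORD, and YSQL_DB instead, so
      # check these against the yugabyted reference for your version.
      - POSTGRES_PASSWORD=eutychus
      - POSTGRES_USER=eutychus
      - POSTGRES_DB=eutychus
    networks:
      - yugabyte_net

volumes:
  yugabyte_db:

networks:
  yugabyte_net: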

To run, use the following command:

docker-compose up -d

Confirm that it is running as expected:

docker-compose ps

Connecting to the Admin UI

The cluster you have created consists of two processes: YB-Master, which keeps track of metadata (tables, users, roles, permissions, and so on), and YB-TServer, which handles the actual end-user requests for data updates and queries.

Each of the processes exposes its own Admin UI that can be used to check the status of the corresponding process, as well as perform certain administrative operations. The yb-master Admin UI is available at http://localhost:7000 and the yb-tserver Admin UI is available at http://localhost:9000. To avoid port conflicts, you should make sure other processes on your machine do not have these ports mapped to localhost.
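Both Admin UIs can also be checked from the command line. The following is a minimal sketch, assuming curl is installed; admin_ui_url and check_admin_ui are hypothetical helper names, and MASTER_UI_PORT and TSERVER_UI_PORT are hypothetical environment variables for the case where a different host port was published (for example 7001 on macOS Monterey).

```shell
# Hypothetical helper: print the Admin UI URL for "master" or "tserver".
# MASTER_UI_PORT / TSERVER_UI_PORT are assumed override variables; the
# defaults match the port mappings used in this guide.
admin_ui_url() {
  case "$1" in
    master)  echo "http://localhost:${MASTER_UI_PORT:-7000}" ;;
    tserver) echo "http://localhost:${TSERVER_UI_PORT:-9000}" ;;
    *)       echo "unknown process: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: succeed if the UI answers with a non-error HTTP status.
# -f fails on HTTP errors, -sS hides progress but keeps error messages.
check_admin_ui() {
  curl -fsS -o /dev/null "$(admin_ui_url "$1")"
}

# Example: check_admin_ui master && check_admin_ui tserver
```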

Connect to the Yugabyte database

Using the YugabyteDB SQL shell, ysqlsh, you can connect to your cluster and interact with it using distributed SQL. ysqlsh is installed with YugabyteDB and is located in the bin directory of the YugabyteDB home directory.

To open the YSQL shell, run ysqlsh inside the container:

docker exec -it yugabyte /home/yugabyte/bin/ysqlsh

# Echo each query as it is sent to the server
docker exec -it yugabyte /home/yugabyte/bin/ysqlsh --echo-queries

# Connect using the container IP address reported by docker inspect
# (172.17.0.2 is an example; yours may differ)
docker exec -it yugabyte /home/yugabyte/bin/ysqlsh --echo-queries -h 172.17.0.2 -p 5433

# Or use the container hostname to avoid running docker inspect
docker exec -it yugabyte bash -c 'ysqlsh --echo-queries -h $(hostname) -p 5433'

You can also connect to YugabyteDB using any PostgreSQL-compatible tool. For example, the pgAdmin UI tool can connect to both a local PostgreSQL server and the YugabyteDB server running in Docker.

From your favorite programming language, you can connect to Yugabyte just like you’d do for PostgreSQL.
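Most PostgreSQL drivers accept a connection URI, so the main difference from a stock PostgreSQL setup is the port. The following is a minimal sketch that builds such a URI; ysql_uri is a hypothetical helper name, and the defaults assume the published port 5433 from this guide and the default yugabyte user and database.

```shell
# Hypothetical helper: print a PostgreSQL-style connection URI for YSQL.
# Usage: ysql_uri [HOST] [PORT] [USER] [DATABASE]
# The subshell body keeps the helper's variables out of the caller's shell.
ysql_uri() (
  host=${1:-localhost}
  port=${2:-5433}       # YSQL port published by the docker run command
  user=${3:-yugabyte}   # assumed default user
  db=${4:-yugabyte}     # assumed default database
  printf 'postgresql://%s@%s:%s/%s\n' "$user" "$host" "$port" "$db"
)

ysql_uri
```

If a PostgreSQL client such as psql is installed on the host, the result can be passed to it directly, for example psql "$(ysql_uri)".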

SQL Actions

Load Sample Dataset

CREATE DATABASE yb_demo;
\c yb_demo;

Import data

\i share/schema.sql
\i share/products.sql
\i share/users.sql
\i share/orders.sql
\i share/reviews.sql

Run queries:

SELECT users.id, users.name, users.email, orders.id, orders.total
FROM orders
INNER JOIN users ON orders.user_id = users.id
LIMIT 10;

Cleaning up

If you no longer want to run YugabyteDB, you can use these commands to clean up:

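# Stop the container
docker stop yugabyte

# Remove the container, along with any data stored inside it
docker rm yugabyte

# If you started YugabyteDB with docker-compose, use this instead
docker-compose down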

Conclusion

This should be enough information to get started using YugabyteDB locally in a Docker container.

Since YugabyteDB is PostgreSQL-compatible, it allows you to reuse many of the tools that you're already familiar with.

In many cases, you can migrate existing applications and workloads from PostgreSQL to YugabyteDB and benefit from features such as its auto-scaling capabilities.