In this guide, we will learn how to set up Prometheus for monitoring on a Kubernetes cluster. This setup collects node, pod, and service metrics automatically using Prometheus service discovery configurations.
# About Prometheus
Prometheus is a free, open source application used for event monitoring and alerting. It was originally built at SoundCloud. It is now a standalone open source project, maintained independently of any company.
Prometheus collects and stores its metrics as time series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels. Metrics are numeric measurements, and time series means that changes are recorded over time. What users want to measure differs from application to application. For a web server it might be request times; for a database it might be the number of active connections or active queries.
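To make the data model concrete, here is a minimal Python sketch of how one sample could be represented. The class, function, and field names are illustrative assumptions, not Prometheus internals, and the metric values are made up. The one property it mirrors is that a time series is identified by its metric name plus its label set.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

LabelSet = Tuple[Tuple[str, str], ...]


@dataclass(frozen=True)
class Sample:
    """One recorded measurement: a numeric value at a point in time."""
    name: str            # metric name, e.g. "http_requests_total"
    labels: LabelSet     # optional key-value pairs, stored sorted
    timestamp_ms: int    # when the value was recorded
    value: float         # the numeric measurement

    def series_key(self) -> Tuple[str, LabelSet]:
        # Samples with the same name and labels belong to the same series.
        return (self.name, self.labels)


def make_sample(name: str, labels: Dict[str, str], timestamp_ms: int, value: float) -> Sample:
    # Sort the labels so that their order does not affect series identity.
    return Sample(name, tuple(sorted(labels.items())), timestamp_ms, value)


# Two scrapes of the same hypothetical web server metric, 5 seconds apart,
# plus one sample that differs only in a label value.
first = make_sample("http_requests_total", {"method": "GET", "code": "200"}, 1_000, 41.0)
second = make_sample("http_requests_total", {"code": "200", "method": "GET"}, 6_000, 42.0)
other = make_sample("http_requests_total", {"method": "POST", "code": "200"}, 6_000, 7.0)
```

Here `first` and `second` share a series key, so they are two points of one time series, while `other` starts a separate series because its `method` label differs.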
- Metric Collection: Prometheus uses the pull model to retrieve metrics over HTTP. There is an option to push metrics to Prometheus using Pushgateway for use cases where Prometheus cannot scrape the metrics. One such example is collecting custom metrics from short-lived Kubernetes Jobs and CronJobs.
- Metric Endpoint: The systems that you want to monitor using Prometheus should expose the metrics on a /metrics endpoint. Prometheus uses this endpoint to pull the metrics at regular intervals.
- PromQL: Prometheus comes with PromQL, a very flexible query language that can be used to query the metrics in the Prometheus dashboard. PromQL queries are also used by the Prometheus UI and Grafana to visualize metrics.
- Prometheus Exporters: Exporters are libraries that convert existing metrics from third-party apps to the Prometheus metrics format. There are many official and community Prometheus exporters. One example is the Prometheus Node Exporter, which exposes Linux system-level metrics in Prometheus format.
- TSDB (time-series database): Prometheus uses TSDB for storing all the data efficiently. By default, all the data gets stored locally. However, to avoid a single point of failure, there are options to integrate remote storage for Prometheus TSDB.
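As a rough illustration of what Prometheus pulls from a /metrics endpoint, the Python sketch below parses a few lines of the text exposition format. The payload is a hypothetical sample and the parser is deliberately simplified: it assumes label values contain no escaped quotes and it ignores optional trailing timestamps. It is not how Prometheus itself parses scrapes.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical payload, shaped like the body a /metrics endpoint returns.
SAMPLE_METRICS = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="500"} 3
process_open_fds 12
"""

# metric name, optional {label block}, then the value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text: str) -> List[Tuple[str, Dict[str, str], float]]:
    """Parse simple exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank, HELP and TYPE lines
            continue
        match = LINE_RE.match(line)
        if match is None:                     # ignore lines this sketch cannot read
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


parsed = parse_metrics(SAMPLE_METRICS)
```

Each parsed tuple corresponds to one sample that Prometheus would store with the scrape timestamp attached.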
If you would like to run Prometheus on your local machine, check out How to run Prometheus with docker and docker-compose. If you are running Prometheus on a Linux server, check out How To Install and Configure Prometheus On a Linux Server instead.
Prometheus is often used in conjunction with Alertmanager to set up alerts and Grafana to graph the collected metrics.
# Prometheus Monitoring Setup on Kubernetes
I assume that you have a Kubernetes cluster up and running with kubectl set up on your workstation. If not, please check out these guides:
- How to set up Kubernetes Cluster on Ubuntu 20.04 with kubeadm and CRI-O
- How to set up Kubernetes Cluster on Debian 11 with kubeadm and CRI-O
- How to install and configure Prometheus AlertManager in Linux
- How to Setup a Kubernetes Cluster with K3S in Rocky Linux 8
- How to use Terraform to create a vpc network and a GKE in GCP
- How to Set up Prometheus Node exporter in Kubernetes
The latest Prometheus release is available as a Docker image from its official Docker Hub account. We will use that image for the setup.
# Prometheus Kubernetes Manifest Files
We can now create the manifests for our setup.
# Create a Namespace
First, we will create a Kubernetes namespace for all our monitoring components. If you don’t create a dedicated namespace, all the Prometheus Kubernetes deployment objects get deployed in the default namespace.
Save the following in namespace.yaml:
---
apiVersion: v1
kind: Namespace
metadata:
  name: prometheus
Then execute the following command to create a new namespace named prometheus.
kubectl apply -f namespace.yaml
# Create a ClusterRole
Prometheus uses Kubernetes APIs to read all the available metrics from Nodes, Pods, Deployments, etc. For this reason, we need to create an RBAC policy with read access to the required API groups and bind the policy to the default service account in the prometheus namespace.
First, create a file named clusterRole.yaml and copy the following RBAC role.
In the role given below, you can see that we have added get, list, and watch permissions to nodes, services, endpoints, pods, and ingresses. The role binding is bound to the default service account in the prometheus namespace. If you have any use case to retrieve metrics from any other object, you need to add that in this cluster role.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      # Ingress objects are served from networking.k8s.io on current clusters
      - networking.k8s.io
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: default
    namespace: prometheus
Next, create the role using the following command.
kubectl create -f clusterRole.yaml
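To see how such a role is evaluated, the following Python sketch models RBAC rule matching in a simplified way. Only the core API group rule from the ClusterRole is reproduced as plain data, the function name is an illustrative assumption, and wildcards and resourceNames are ignored, so treat it as a mental model and not as the API server's implementation.

```python
from typing import Dict, List

# The core-API-group rule from the ClusterRole above, written as plain data.
RULES: List[Dict[str, List[str]]] = [
    {
        "apiGroups": [""],
        "resources": ["nodes", "nodes/proxy", "services", "endpoints", "pods"],
        "verbs": ["get", "list", "watch"],
    },
]


def is_allowed(rules: List[Dict[str, List[str]]], verb: str, api_group: str, resource: str) -> bool:
    """Return True if at least one rule grants the verb on the resource.

    Simplification: wildcard entries ("*") and resourceNames are not handled.
    """
    return any(
        api_group in rule["apiGroups"]
        and resource in rule["resources"]
        and verb in rule["verbs"]
        for rule in rules
    )


can_list_pods = is_allowed(RULES, "list", "", "pods")        # read access is granted
can_delete_pods = is_allowed(RULES, "delete", "", "pods")    # write verbs are not listed
```

On a real cluster, `kubectl auth can-i list pods --as=system:serviceaccount:prometheus:default` asks the API server the same question for the service account bound above.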
# Create a Config Map To Externalize Prometheus Configurations
All configurations for Prometheus are part of the prometheus.yml file, and all the alert rules, whose alerts are sent to Alertmanager, are configured in prometheus.rules.
- prometheus.yml: This is the main Prometheus configuration, which holds the scrape configs, service discovery details, rule file locations, Alertmanager targets, etc. In this setup, the storage location and data retention are set through command-line flags in the deployment instead.
- prometheus.rules: This file contains all the Prometheus alerting rules.
By externalizing Prometheus configs to a Kubernetes config map, you don’t have to rebuild the Prometheus image whenever you need to add or remove a configuration. You only need to update the config map and restart the Prometheus pods to apply the new configuration.
The config map with all the Prometheus scrape config and alerting rules gets mounted to the Prometheus container in the /etc/prometheus location as prometheus.yml and prometheus.rules files.
Create a file called config-map.yaml and add the following file contents:
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus
  # same namespace as the deployment that mounts this config map
  namespace: prometheus
  labels:
    name: prometheus
data:
  prometheus.rules: |-
    groups:
      - name: citizix demo alert
        rules:
          - alert: High Pod Memory
            expr: sum(container_memory_usage_bytes) > 1
            for: 1m
            labels:
              severity: slack
            annotations:
              summary: High Memory Usage
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    alerting:
      alertmanagers:
        - scheme: http
          static_configs:
            - targets:
                - "alertmanager.monitoring.svc:9093"
    scrape_configs:
      - job_name: 'node-exporter'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_endpoints_name]
            regex: 'node-exporter'
            action: keep
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;kubernetes;https
      - job_name: 'kubernetes-nodes'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
      - job_name: 'kube-state-metrics'
        static_configs:
          - targets: ['kube-state-metrics.kube-system.svc.cluster.local:8080']
      - job_name: 'kubernetes-cadvisor'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      - job_name: 'kubernetes-service-endpoints'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name
Execute the following command to create the config map in Kubernetes.
kubectl create -f config-map.yaml --namespace=prometheus
Once the config map is mounted by the deployment, its two keys show up as two files inside the container.
Note: In Prometheus terms, the config for collecting metrics from a collection of endpoints is called a job.
The prometheus.yml file contains all the configurations to discover pods and services running in the Kubernetes cluster dynamically. We have the following scrape jobs in our Prometheus scrape configuration.
- kubernetes-apiservers: It gets all the metrics from the API servers.
- kubernetes-nodes: It collects all the Kubernetes node metrics.
- kubernetes-pods: All the pod metrics get discovered if the pod metadata is annotated with prometheus.io/scrape and prometheus.io/port annotations.
- kubernetes-cadvisor: Collects all cAdvisor metrics.
- kubernetes-service-endpoints: All the Service endpoints are scraped if the service metadata is annotated with prometheus.io/scrape and prometheus.io/port annotations. It can be used for black-box monitoring.
- node-exporter: It keeps only the endpoints named node-exporter, so its targets appear once Node Exporter is deployed.
- kube-state-metrics: It scrapes a static target, the kube-state-metrics service in the kube-system namespace on port 8080.
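The annotation-driven jobs rely on relabeling rules from the scrape configuration. The Python sketch below is a rough model of the keep and replace actions, assuming the default ";" separator and using Python's re module in place of the RE2 engine that Prometheus uses. The function names and the label values are hypothetical and serve only to show what the regexes from the config do.

```python
import re
from typing import Dict, List

SEPARATOR = ";"  # source label values are joined with ";" by default


def _joined(labels: Dict[str, str], source_labels: List[str]) -> str:
    # Labels that are not set count as empty strings.
    return SEPARATOR.join(labels.get(name, "") for name in source_labels)


def relabel_keep(labels: Dict[str, str], source_labels: List[str], regex: str) -> bool:
    """action: keep. The target survives only if the whole joined value matches."""
    return re.fullmatch(regex, _joined(labels, source_labels)) is not None


def relabel_replace(labels: Dict[str, str], source_labels: List[str], regex: str,
                    target_label: str, replacement: str) -> Dict[str, str]:
    """action: replace. On a match, write the expanded replacement to target_label."""
    match = re.fullmatch(regex, _joined(labels, source_labels))
    if match is None:
        return dict(labels)  # no match: the labels are left unchanged
    # Translate $1 and ${1} references into Python's \g<1> template syntax.
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    updated = dict(labels)
    updated[target_label] = match.expand(template)
    return updated


SCRAPE = "__meta_kubernetes_pod_annotation_prometheus_io_scrape"
PORT = "__meta_kubernetes_pod_annotation_prometheus_io_port"
ADDRESS_REGEX = r"([^:]+)(?::\d+)?;(\d+)"

# Hypothetical pod annotated with prometheus.io/scrape and prometheus.io/port.
pod = {"__address__": "10.1.2.3:8080", SCRAPE: "true", PORT: "9102"}
kept = relabel_keep(pod, [SCRAPE], "true")
rewritten = relabel_replace(pod, ["__address__", PORT], ADDRESS_REGEX, "__address__", "$1:$2")

# Hypothetical node, as seen by the kubernetes-nodes job.
node = {"__meta_kubernetes_node_name": "worker-1"}
node_path = relabel_replace(node, ["__meta_kubernetes_node_name"], r"(.+)",
                            "__metrics_path__", "/api/v1/nodes/${1}/proxy/metrics")
```

In this model the annotated pod is kept and its scrape address becomes 10.1.2.3:9102, while a pod without the port annotation keeps its discovered address because the regex does not match.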
prometheus.rules contains all the alert rules for sending alerts to the Alertmanager.
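The demo rule combines a condition with for: 1m while rules are evaluated every 5 seconds. As a rough model of that timing, the Python sketch below tracks the alert state across evaluations. The function name is an illustrative assumption, and real Prometheus tracks this per label set, which the sketch leaves out.

```python
from typing import List, Optional


def alert_states(condition_results: List[bool], eval_interval_s: int, for_s: int) -> List[str]:
    """Return the alert state after each rule evaluation.

    The alert is "pending" from the first evaluation where the expression
    holds and "firing" once it has held continuously for for_s seconds.
    """
    states = []
    active_since: Optional[int] = None  # time at which the condition started to hold
    for index, holds in enumerate(condition_results):
        now = index * eval_interval_s
        if not holds:
            active_since = None          # any gap resets the waiting period
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        states.append("firing" if now - active_since >= for_s else "pending")
    return states


# 13 evaluations at 5-second intervals cover t=0s through t=60s.
steady = alert_states([True] * 13, eval_interval_s=5, for_s=60)
# One failed evaluation in the middle restarts the one-minute wait.
interrupted = alert_states([True] * 6 + [False] + [True] * 6, eval_interval_s=5, for_s=60)
```

Since the demo expression only asks for total container memory above 1 byte, it should hold on practically any running cluster, so the alert is expected to start firing about a minute after the rule is first evaluated.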
# Create a Prometheus Deployment
Next, we create a Prometheus deployment. We are using the official Prometheus image from Docker Hub. We are also not using any persistent storage volumes for this basic setup, so the collected data is lost whenever the pod is replaced. Please consider persistent storage when setting up Prometheus for production use cases.
In this configuration, we are mounting the Prometheus config map as files inside /etc/prometheus, as explained in the previous section. Save the following content to deployment.yaml.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prometheus
  name: prometheus
  namespace: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - args:
            - --storage.tsdb.retention.time=24h
            - --config.file=/etc/prometheus/prometheus.yml
            - --storage.tsdb.path=/prometheus/
          image: prom/prometheus:v2.37.0
          name: prometheus
          ports:
            - containerPort: 9090
              name: http
              protocol: TCP
          resources:
            limits:
              cpu: 1
              memory: 1Gi
            requests:
              cpu: 500m
              memory: 500M
          volumeMounts:
            - mountPath: /etc/prometheus/
              name: prometheus-config-volume
            - mountPath: /prometheus/
              name: prometheus-storage-volume
      volumes:
        - configMap:
            defaultMode: 420
            name: prometheus
          name: prometheus-config-volume
        - emptyDir: {}
          name: prometheus-storage-volume
Create a deployment in the prometheus namespace using the above file.
kubectl create -f deployment.yaml
You can check the created deployment and its pods using the following commands.
kubectl get deployments --namespace=prometheus
kubectl get pods --namespace=prometheus
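The resource section of the deployment mixes a decimal suffix (500M) with a binary one (1Gi), and uses the milli suffix for CPU (500m). The Python sketch below converts such quantities so the actual numbers are easier to compare. It is a simplified helper with hypothetical function names that covers only the common suffixes, not the full Kubernetes quantity grammar.

```python
from decimal import Decimal

# Multipliers for the common Kubernetes memory suffixes.
DECIMAL_SUFFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as "500M" or "1Gi" to bytes."""
    # Check the two-letter binary suffixes first, then the decimal ones.
    for suffixes in (BINARY_SUFFIXES, DECIMAL_SUFFIXES):
        for suffix, factor in suffixes.items():
            if quantity.endswith(suffix):
                return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))  # no suffix: a plain number of bytes


def parse_cpu(quantity: str) -> Decimal:
    """Convert a CPU quantity such as "500m" or "1" to a number of cores."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000  # "m" means thousandths of a core
    return Decimal(quantity)


request_bytes = parse_memory("500M")   # decimal megabytes
limit_bytes = parse_memory("1Gi")      # binary gibibytes
request_cores = parse_cpu("500m")
```

With these values the memory request is 500,000,000 bytes, slightly less than 500Mi (524,288,000 bytes), and the limit is 1,073,741,824 bytes.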
# Connecting To Prometheus Dashboard
You can view the deployed Prometheus dashboard in three different ways.
- Using Kubectl port forwarding
- Exposing the Prometheus deployment as a service with NodePort or a Load Balancer.
- Adding an Ingress object if you have an Ingress controller deployed.
# Using Kubectl port forwarding
Using kubectl port forwarding, you can access a pod from your local workstation using a selected port on your localhost. This method is primarily used for debugging purposes.
First, get the Prometheus pod name.
kubectl get pods --namespace=prometheus
The output will look like the following.
➜ kubectl get pods --namespace=prometheus
NAME                          READY   STATUS    RESTARTS   AGE
prometheus-5bccbcfc94-rbd9g   1/1     Running   0          38m
Execute the following command with your pod name to access Prometheus from localhost port 8080.
Note: Replace prometheus-5bccbcfc94-rbd9g with your pod name.
kubectl port-forward prometheus-5bccbcfc94-rbd9g 8080:9090 -n prometheus
Now, if you access http://localhost:8080 in your browser, you will get the Prometheus home page.
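Besides the web UI, the forwarded port also serves the Prometheus HTTP API, which can be handy for a quick scripted check. The Python sketch below builds an instant-query URL for the /api/v1/query endpoint and extracts values from a response. The helper names are illustrative, and SAMPLE_RESPONSE is a hypothetical payload in the shape the API returns for an instant vector, not output captured from this cluster.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List, Tuple


def build_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against the HTTP API."""
    query_string = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


def extract_values(payload: dict) -> List[Tuple[Dict[str, str], float]]:
    """Pull (labels, value) pairs out of an instant-vector response."""
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in payload["data"]["result"]]


def run_query(base_url: str, promql: str) -> List[Tuple[Dict[str, str], float]]:
    """Send the query; this needs the port-forward from above to be running."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=10) as response:
        return extract_values(json.load(response))


# Hypothetical response body for the query "up".
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "kubernetes-pods"}, "value": [1700000000.0, "1"]},
        ],
    },
}

url = build_query_url("http://localhost:8080", "up")
values = extract_values(SAMPLE_RESPONSE)
```

With the port-forward active, calling `run_query("http://localhost:8080", "up")` should return one entry per scrape target, with 1.0 for targets that are up.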
# Exposing Prometheus as a Service [NodePort & LoadBalancer]
To access the Prometheus dashboard over an IP or a DNS name, you need to expose it as a Kubernetes service.
Create a file named service.yaml and copy the following contents. We will expose Prometheus on all Kubernetes node IPs on port 30000.
Note: If you are on AWS, Azure, or Google Cloud, you can use the LoadBalancer service type, which creates a load balancer and automatically points it to the Kubernetes service endpoint.
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '9090'
  labels:
    app: prometheus
  namespace: prometheus
spec:
  selector:
    app: prometheus
  type: NodePort
  ports:
    - name: prometheus
      protocol: TCP
      port: 9090
      targetPort: http
      nodePort: 30000
The annotations in the above service YAML make sure that the service endpoint is scraped by Prometheus. The prometheus.io/port annotation should always be the target port mentioned in the service YAML (here the port named http, which is 9090).
Create the service using the following command.
kubectl create -f service.yaml --namespace=prometheus
Once created, you can access the Prometheus dashboard using the IP of any Kubernetes node on port 30000. If you are on the cloud, make sure you have the right firewall rules to access port 30000 from your workstation.
Now if you browse to Status --> Targets, you will see all the Kubernetes endpoints connected to Prometheus automatically using service discovery.
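The information on the Targets page is also available from the /api/v1/targets endpoint of the HTTP API. As a hedged sketch, the Python snippet below summarizes target health per job from such a response. SAMPLE_TARGETS is a hypothetical, heavily trimmed payload (real entries carry more fields), and the function name is an assumption made for illustration.

```python
from collections import Counter
from typing import Dict

# Hypothetical, trimmed response body from /api/v1/targets.
SAMPLE_TARGETS = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "kubernetes-pods"}, "health": "up"},
            {"labels": {"job": "kubernetes-pods"}, "health": "down"},
            {"labels": {"job": "kubernetes-nodes"}, "health": "up"},
        ]
    },
}


def health_by_job(payload: dict) -> Dict[str, Counter]:
    """Count targets per health state, grouped by scrape job."""
    summary: Dict[str, Counter] = {}
    for target in payload["data"]["activeTargets"]:
        job = target["labels"].get("job", "unknown")
        summary.setdefault(job, Counter())[target["health"]] += 1
    return summary


summary = health_by_job(SAMPLE_TARGETS)
```

A job whose counter shows only "down" entries usually points at a relabeling, RBAC, or network issue for that job.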
You can head over to the homepage, select the metrics you need from the drop-down, and get the graph for the time range you mention, for example for container_cpu_usage_seconds_total.
# Exposing Prometheus Using Ingress
If you have an existing ingress controller setup, you can create an ingress object to route the Prometheus DNS to the Prometheus backend service.
Also, you can add SSL for Prometheus in the ingress layer. You can refer to the Kubernetes ingress TLS/SSL Certificate guide for more details.
Here is a sample ingress object.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
  labels:
    app.kubernetes.io/instance: prometheus
    app.kubernetes.io/name: prometheus
  name: prometheus
  namespace: prometheus
spec:
  rules:
    - host: prometheus.citizix.com
      http:
        paths:
          - backend:
              service:
                name: prometheus
                port:
                  number: 9090
            path: /
            pathType: ImplementationSpecific
# Setting Up Kube State Metrics
The kube-state-metrics service provides many metrics that are not available by default. Please make sure you deploy kube-state-metrics to monitor all your Kubernetes API objects like deployments, pods, jobs, cronjobs, etc.
# Setting Up Alertmanager
Alertmanager handles all the alerting mechanisms for Prometheus metrics. There are many integrations available to receive alerts from the Alertmanager (Slack, email, API endpoints, etc.).
# Setting Up Grafana
Using Grafana you can create dashboards from Prometheus metrics to monitor the kubernetes cluster.
The best part is, you don’t have to write all the PromQL queries for the dashboards. There are many community dashboard templates available for Kubernetes. You can import them and modify them as per your needs.
# Setting Up Node Exporter
Node Exporter will provide all the Linux system-level metrics of all Kubernetes nodes.
The scrape config for node-exporter is part of the Prometheus config map. Once you deploy the node-exporter, you should see node-exporter targets and metrics in Prometheus.
# Prometheus Production Setup Considerations
For the production Prometheus setup, there are more configurations and parameters that need to be considered for scaling, high availability, and storage. It all depends on your environment and data volume.
For example, the Prometheus Operator project makes it easy to automate the Prometheus setup and its configurations.
Also, the CNCF project Thanos helps you aggregate metrics from multiple Kubernetes Prometheus sources and have a highly available setup with scalable storage.
# Conclusion
In this article, we learnt how to set up Prometheus on Kubernetes.