Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts. For example, Prometheus can generate an alert when a target is unavailable and send it to Alertmanager, which then notifies you that the target is down. This is just one example: Prometheus can raise alerts based on any metric it collects.
In this guide we will learn how to install and set up Alertmanager on Linux. We will also learn how to configure Prometheus and Alertmanager to send you a Slack notification when a Prometheus target is down (unavailable).
Prerequisites
You need a working Prometheus setup before proceeding. Check out the following guides if you need help setting up Prometheus:
- How to Set up Prometheus Node exporter in Kubernetes
- How To Install and Configure Prometheus On a Linux Server
- How To Monitor Linux Servers Using Prometheus Node Exporter
- How to run Prometheus with docker and docker-compose
Installing Alertmanager
Alertmanager is distributed as a release tarball on the Prometheus downloads page. Head over there and pick the latest version. I am using a Linux server, so I download the Linux package from the command line.
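A minimal sketch of the download step, assuming curl is installed and that you want the linux-amd64 build. The version number 0.27.0 is an assumption for illustration only; check the downloads page for the current release and adjust VERSION to match.

```shell
# Assumed release and platform; adjust to what the downloads page lists.
VERSION="0.27.0"
ARCHIVE="alertmanager-${VERSION}.linux-amd64.tar.gz"
URL="https://github.com/prometheus/alertmanager/releases/download/v${VERSION}/${ARCHIVE}"

# Download into the current directory.
# -f turns HTTP errors into failures, -L follows redirects, -O keeps the remote file name.
if curl -fLO "${URL}"; then
  echo "Downloaded ${ARCHIVE}"
else
  echo "Download failed; check VERSION and your network connection" >&2
fi
```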
Once the download is complete, extract the archive and move the extracted directory to /opt/alertmanager.
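The extract-and-move step could look like the following sketch, assuming the downloaded archive is in the current directory and /opt/alertmanager does not exist yet. VERSION is an assumed value, and the extracted directory name follows the usual release naming.

```shell
# Assumed release; keep it in sync with the archive you downloaded.
VERSION="0.27.0"
ARCHIVE="alertmanager-${VERSION}.linux-amd64.tar.gz"
EXTRACTED="alertmanager-${VERSION}.linux-amd64"

if [ -f "${ARCHIVE}" ]; then
  # Unpack the tarball, then move the resulting directory into place.
  tar -xzf "${ARCHIVE}"
  sudo mv "${EXTRACTED}" /opt/alertmanager
  ls -l /opt/alertmanager
else
  echo "${ARCHIVE} not found; download it first" >&2
fi
```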
The directory contains two important files: the alertmanager binary and the alertmanager.yml configuration file with the initial configuration.
Since we will be running the application as the prometheus user (which was created as part of the Prometheus setup), make sure that user owns the directory, for example by running sudo chown -R prometheus:prometheus /opt/alertmanager.
Creating a Data Directory
Alertmanager needs a directory where it can store its data. Because you will run Alertmanager as the prometheus system user, that user must have read, write, and execute permissions on the data directory.
You can create the data/ directory inside /opt/alertmanager/ with sudo mkdir /opt/alertmanager/data and then hand it to the prometheus user with sudo chown prometheus:prometheus /opt/alertmanager/data.
Create a systemd Service unit for Alertmanager
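To manage the service, we will use systemd, which lets us start, stop, and restart the service and enable it at boot. Create a unit file for Alertmanager (administrator-defined units normally live under /etc/systemd/system/) and describe in it how the alertmanager binary is started: the user to run as, the configuration file, and the data directory.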
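A minimal sketch of such a unit, assuming the layout used in this guide (binary and configuration in /opt/alertmanager, data in /opt/alertmanager/data, service running as prometheus). The file name alertmanager.service, the description text, and the restart policy are assumptions. The file is staged in the current directory first so it can be reviewed before it is installed.

```shell
# Stage the unit file in the current directory.
cat > alertmanager.service <<'EOF'
[Unit]
Description=Alertmanager
Wants=network-online.target
After=network-online.target

[Service]
# Run as the user that owns /opt/alertmanager.
User=prometheus
Group=prometheus
Type=simple
Restart=on-failure
ExecStart=/opt/alertmanager/alertmanager \
  --config.file=/opt/alertmanager/alertmanager.yml \
  --storage.path=/opt/alertmanager/data

[Install]
WantedBy=multi-user.target
EOF

# Show the staged file for review.
cat alertmanager.service
```

Once it looks right, copy it into place with sudo cp alertmanager.service /etc/systemd/system/alertmanager.service.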
Save and exit the file.
For the systemd changes to take effect, reload the systemd configuration with sudo systemctl daemon-reload.
Now start the alertmanager service with sudo systemctl start alertmanager.
Confirm that the service is running as expected by checking its status with sudo systemctl status alertmanager.
Finally, add the alertmanager service to the system startup so that it starts automatically on boot: sudo systemctl enable alertmanager.
Configuring Prometheus
Now you have to configure Prometheus to use Alertmanager. You can also monitor Alertmanager itself with Prometheus. We will do both in this section.
To scrape Alertmanager metrics, add a job to the scrape_configs section of the Prometheus configuration file.
In my case Alertmanager and the Prometheus server run on the same host, so I use 127.0.0.1:9093 as the target. Otherwise, substitute 127.0.0.1 with the IP address of your Alertmanager host, which you can find by running hostname -I on that host.
After the change, the scrape_configs section of my prometheus.yml file contains an alertmanager job that targets 127.0.0.1:9093.
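As a sketch, the job could look like the following. The job name alertmanager is the name that later shows up on the targets page. The scratch file name is hypothetical; the snippet is staged there so you can merge the job by hand into the existing scrape_configs list of prometheus.yml.

```shell
# Stage the job definition; merge the list item under the existing scrape_configs key.
cat > alertmanager-scrape-job.yml <<'EOF'
scrape_configs:
  - job_name: alertmanager
    static_configs:
      - targets:
          - 127.0.0.1:9093
EOF

cat alertmanager-scrape-job.yml
```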
Also add the IP address and port number of Alertmanager to the alerting > alertmanagers section of the prometheus.yml file, so that Prometheus knows where to send its alerts.
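A sketch of that section, assuming Alertmanager listens on 127.0.0.1:9093 as above. The scratch file name is hypothetical; merge the block into prometheus.yml as a top-level key.

```shell
# Stage the alerting section; it belongs at the top level of prometheus.yml.
cat > alerting-snippet.yml <<'EOF'
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 127.0.0.1:9093
EOF

cat alerting-snippet.yml
```

If promtool is installed alongside Prometheus, promtool check config /opt/prometheus/prometheus.yml reports syntax errors in the merged file before you restart anything.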
For the changes to take effect, restart the prometheus service, for example with sudo systemctl restart prometheus.
Visit the URL http://<server_ip>:9090/targets from your favorite web browser, and you should see that alertmanager is in the UP state, which means that Prometheus can reach Alertmanager.
Creating a Prometheus Alert Rule
In the Prometheus expression browser (the Graph page), you can use the up expression to find the state of the targets added to Prometheus.
Targets that are in the UP state (running and accessible to Prometheus) have the value 1, and targets that are DOWN (not running or inaccessible to Prometheus) have the value 0.
Suppose you stop one of the targets, say node_exporter (for example with sudo systemctl stop node_exporter, if it runs as a systemd service under that name).
The up value of that target in Prometheus should then be 0. So, you can use the expression up == 0 to list only the targets that are not running or are inaccessible to Prometheus.
This expression can be used to create a Prometheus alert that is sent to Alertmanager when one or more targets are not running or are inaccessible to Prometheus.
To create a Prometheus alert, create a new file rules.yml in the /opt/prometheus/ directory and add the alert rule to it.
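A minimal sketch of the rule file. The alert name, expression, and duration come from this guide; the group name, the severity label, and the summary annotation are assumptions added for illustration. The file is staged in the current directory first.

```shell
# Stage rules.yml; the quoted 'EOF' keeps {{ $labels.instance }} literal.
cat > rules.yml <<'EOF'
groups:
  - name: instance-health
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
EOF

cat rules.yml
```

Copy it into place with sudo cp rules.yml /opt/prometheus/rules.yml. If promtool is available, promtool check rules rules.yml validates the syntax first.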
Here, the alert InstanceDown fires when a target has been down or inaccessible to Prometheus (that is, up == 0) for one minute (1m).
Now, update the Prometheus configuration file /opt/prometheus/prometheus.yml so that its rule_files section lists the new rules file.
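As a sketch, the section could look like this, using the absolute path of the rules file created above. The scratch file name is hypothetical; merge the block into prometheus.yml as a top-level key.

```shell
# Stage the rule_files section; it belongs at the top level of prometheus.yml.
cat > rule-files-snippet.yml <<'EOF'
rule_files:
  - /opt/prometheus/rules.yml
EOF

cat rule-files-snippet.yml
```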
Another important option in the prometheus.yml file is evaluation_interval. Prometheus evaluates its rules once every evaluation_interval. The sample configuration shipped with Prometheus sets it to 15s (15 seconds); if the option is omitted, the default is 1m. With 15s, the alert rules in the rules.yml file are evaluated every 15 seconds.
For the changes to take effect, restart the prometheus service again, for example with sudo systemctl restart prometheus.
Now, navigate to the URL http://<server_ip>:9090/rules from your favorite web browser, and you should see the rule InstanceDown that you’ve just added.
Navigate to the URL http://<server_ip>:9090/alerts, and you should see the state of the alert InstanceDown.
As you stopped node_exporter earlier, the alert is active. It first shows as PENDING: the condition up == 0 is true, but it has not yet held for the full minute, so nothing has been sent to Alertmanager.
After a minute has passed, the alert InstanceDown should be in the FIRING state, which means that it is being sent to Alertmanager.
Configuring a Slack Receiver on Alertmanager
In this section, I will show you how to configure Slack as the Alertmanager receiver so that you get a message in Slack from Alertmanager if a Prometheus target is DOWN.
To receive notifications via Slack, you need to be part of a Slack workspace. If you are not currently part of one, or you want to test this out in a separate workspace, you can quickly create one on the Slack website.
To set up alerting in your Slack workspace, you’re going to need a Slack API URL. Go to Slack -> Administration -> Manage apps.
In the Manage apps directory, search for Incoming WebHooks and add it to your Slack workspace.
Next, specify the channel in which you’d like to receive notifications from Alertmanager. (I’ve created the #citizix-alerts channel.) After you confirm and add the Incoming WebHooks integration, the webhook URL (which is your Slack API URL) is displayed. Copy it.
Then modify the alertmanager.yml file: set the URL that you have just copied as slack_api_url, define a receiver named slack-notifications that posts to your channel, and make it the receiver of the default route.
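A minimal sketch of such a configuration. The receiver name, the channel, and the slack_api_url key come from this guide; the webhook URL is a placeholder, and the grouping and timing values are assumptions to adapt. The file is staged in the current directory so it can be reviewed before it replaces /opt/alertmanager/alertmanager.yml.

```shell
# Stage the Alertmanager configuration; replace the placeholder webhook URL with your own.
cat > alertmanager.yml <<'EOF'
global:
  resolve_timeout: 5m
  slack_api_url: 'https://hooks.slack.com/services/REPLACE/WITH/YOUR-WEBHOOK'

route:
  receiver: slack-notifications
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h

receivers:
  - name: slack-notifications
    slack_configs:
      # The channel must be quoted, otherwise YAML treats '#' as a comment.
      - channel: '#citizix-alerts'
        send_resolved: true
EOF

cat alertmanager.yml
```

Copy it into place with sudo cp alertmanager.yml /opt/alertmanager/alertmanager.yml. The amtool binary that ships in the same tarball can validate the file with amtool check-config alertmanager.yml.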
With this change, the route’s receiver is slack-notifications, the receiver we have just defined, so Alertmanager sends its notifications through it from now on.
repeat_interval in the route configuration is also an important Alertmanager option. The sample alertmanager.yml sets repeat_interval to 1h (1 hour); if the option is omitted, the default is 4h. Once Alertmanager has successfully sent you a message on Slack for an alert, it waits that long before sending it again while the alert is still firing. If you don’t want to get messages that frequently, you can increase it.
Now, restart the alertmanager service for the changes to take effect, for example with sudo systemctl restart alertmanager.
You should get a message on Slack shortly, as you stopped node_exporter earlier.
That is it
In this article, we have learnt how to install and configure Alertmanager on a Linux server. We have also learnt how to configure Alertmanager and Prometheus to send Slack notifications when a Prometheus target is DOWN.