Monitoring with Prometheus and Grafana
What is Prometheus?
Prometheus is a monitoring tool created to monitor highly dynamic container environments such as Kubernetes and Docker Swarm. It can also monitor bare-metal servers where applications are deployed directly.
How Prometheus works
Prometheus collects and stores its metrics as time series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels.
More info on data model of metrics: https://prometheus.io/docs/concepts/data_model/
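To make the data model concrete, here is a minimal Python sketch of a single sample: a metric name, optional labels, a value, and a timestamp. The Sample class and the http_requests_total values are illustrative assumptions, not part of Prometheus itself.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One point in a time series: metric name, labels, value, timestamp."""
    name: str
    value: float
    timestamp_ms: int
    labels: dict = field(default_factory=dict)

    def series_id(self) -> str:
        """Render the series identity in Prometheus notation: name{k="v",...}."""
        if not self.labels:
            return self.name
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(self.labels.items()))
        return f"{self.name}{{{pairs}}}"


# Hypothetical sample, used only to illustrate the notation.
sample = Sample(
    name="http_requests_total",
    value=1027.0,
    timestamp_ms=1700000000000,
    labels={"method": "POST", "handler": "/api/tracks"},
)
print(sample.series_id(), sample.value, "@", sample.timestamp_ms)
```

Two samples belong to the same time series only when both the metric name and every label value match.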
Prometheus pulls (scrapes) metric data from its targets, stores it in a time series database on local disk, and answers queries written in PromQL, its query language.
Architecture diagram
Components of monitoring with Prometheus
1. Target: A machine, application server, microservice, containers, log services etc. We will monitor our own machine in the example.
2. Prometheus server
3. Exporter: Each target has an exporter that exposes an HTTP endpoint such as localhost:3030/metrics. The endpoint path is /metrics.
Prometheus pulls the metrics from this URL. Since we are monitoring our own machine in this example, we will use Node Exporter. For Windows, there is Windows Exporter.
Prometheus has a long list of exporters available for different targets: https://prometheus.io/docs/instrumenting/exporters/
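To show what an exporter does at its core, here is a minimal sketch built only on the Python standard library. It serves one made-up gauge in the Prometheus text format at /metrics. The metric name demo_temperature_celsius, its value, and the port 3030 are assumptions for illustration. Real exporters such as Node Exporter do far more.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(values: dict) -> str:
    """Render name -> value pairs in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(values.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the /metrics path is served, as Prometheus expects by default.
        if self.path != "/metrics":
            self.send_error(404)
            return
        # Hypothetical metric; a real exporter would read live values here.
        body = render_metrics({"demo_temperature_celsius": 21.5}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 3030) -> None:
    """Block forever, answering scrapes on http://localhost:<port>/metrics."""
    HTTPServer(("localhost", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint. Adding localhost:3030 as a target in a scrape job would then let Prometheus pull the gauge.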
Once Prometheus is collecting metrics, we will visualize them with Grafana, since Prometheus itself lacks a good visualization tool.
Setup
We can do the whole setup with Docker, or install and run everything directly on our machine.
Step 1: Install prometheus
Download the archive file for your system.
tar xvfz prometheus-*.tar.gz
cd prometheus-*
Prometheus ships with a default configuration file called prometheus.yml.
Open the file; it has three sections (global, rule_files, and scrape_configs), each with explanatory comments.
global:
  scrape_interval: 15s
  evaluation_interval: 15s
rule_files:
  # - "first.rules"
  # - "second.rules"
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
As you may have noticed, there is already a job called prometheus, which monitors the Prometheus server itself at localhost:9090.
rule_files: lists other YAML files that contain rules. These are either recording rules, which precompute frequently needed or computationally expensive expressions and save the result as a new set of time series, or alerting rules.
Run prometheus using the following command
./prometheus --config.file=prometheus.yml
By default, Prometheus listens on port 9090, so its own metrics are exposed at http://localhost:9090/metrics. Open http://localhost:9090/ in your browser to reach the Prometheus dashboard.
Open the metrics endpoint and check how the metrics appear.
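The /metrics page is plain text, one sample per line, with comment lines starting with #. As a rough sketch of how that text can be read programmatically, the parser below handles the common name{labels} value shape. The example text is a hand-written illustration rather than captured output, and the parser ignores optional timestamps.

```python
import re

# Matches: metric_name{label="value",...} 12.5   (the label block is optional)
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> list:
    """Return (name, labels, value) tuples, skipping comments and blank lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, label_block, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(label_block or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


# Illustrative text in the exposition format (values are made up).
example = '''# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 12
promhttp_metric_handler_requests_total{code="500"} 0
'''
for sample in parse_metrics(example):
    print(sample)
```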
Execute a few queries and check the data, for example:
promhttp_metric_handler_requests_total
Your results may differ from those shown above, which come from a setup with more targets.
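The same query can also be sent without the browser, through the Prometheus HTTP API at /api/v1/query. The sketch below assumes the server from this step is running on localhost:9090. The helper names build_query_url and instant_query are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url: str, promql: str) -> str:
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def instant_query(base_url: str, promql: str) -> list:
    """Run an instant query and return the result list (needs a running server)."""
    with urlopen(build_query_url(base_url, promql), timeout=5) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


print(build_query_url("http://localhost:9090/", "promhttp_metric_handler_requests_total"))
```

With Prometheus running, instant_query("http://localhost:9090", "promhttp_metric_handler_requests_total") should return one entry per matching time series.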
Step 2: Install the Exporter – node_exporter/windows_exporter
Node Exporter: https://prometheus.io/download/#node_exporter
Windows Exporter: https://github.com/prometheus-community/windows_exporter/releases
tar xvfz node_exporter-*.*-amd64.tar.gz
cd node_exporter-*.*-amd64
./node_exporter
Node Exporter runs on port 9100 by default, so we can expect the metrics at http://localhost:9100/metrics
Open prometheus.yml and add the Node Exporter job to scrape_configs.
  - job_name: my_machine
    static_configs:
      - targets: ["localhost:9100"]
Restart Prometheus with the new configuration.
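After the restart, it is worth confirming that the new job is actually being scraped. One way, sketched below, is to read the /api/v1/targets endpoint of the Prometheus HTTP API and list the health of each target. The sample payload is a trimmed, hand-written illustration of the response shape, not real output, and the helper names are assumptions.

```python
import json
from urllib.request import urlopen


def summarize_targets(payload: dict) -> dict:
    """Map "job/instance" to the health reported by Prometheus (e.g. "up")."""
    summary = {}
    for target in payload["data"]["activeTargets"]:
        labels = target["labels"]
        summary[f'{labels["job"]}/{labels["instance"]}'] = target["health"]
    return summary


def fetch_targets(base_url: str = "http://localhost:9090") -> dict:
    """Fetch /api/v1/targets from a running Prometheus server."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as response:
        return json.load(response)


# Trimmed example of the response shape (hand-written, not captured output).
sample_payload = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"instance": "localhost:9090", "job": "prometheus"}, "health": "up"},
            {"labels": {"instance": "localhost:9100", "job": "my_machine"}, "health": "up"},
        ]
    },
}
print(summarize_targets(sample_payload))
```

With the server running, summarize_targets(fetch_targets()) gives the live view. The same information is shown on the Targets page of the Prometheus web UI.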
Run the following query to monitor the average CPU time (in seconds) spent in system mode, per second, over the last minute:
rate(node_cpu_seconds_total{mode="system"}[1m])
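To build intuition for what rate() returns, here is a simplified sketch of the calculation for one counter series: the total increase across the window divided by the elapsed seconds. It accounts for counter resets but deliberately skips the extrapolation to the window boundaries that Prometheus applies, so treat it as an approximation. The sample values are made up.

```python
def simple_rate(samples: list) -> float:
    """Per-second increase of a counter over (timestamp_seconds, value) samples.

    Simplified: handles counter resets, but omits the extrapolation to the
    window boundaries that Prometheus performs.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        # A drop means the counter was reset; count from zero again.
        increase += current if current < previous else current - previous
    return increase / elapsed


# Hypothetical scrapes of a CPU-seconds counter, 15 seconds apart.
window = [(0, 100.0), (15, 101.5), (30, 103.0), (45, 104.5), (60, 106.0)]
print(simple_rate(window))
```

A result of about 0.1 would mean that the CPU spent roughly a tenth of each second in system mode during the window.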
Step 3: Grafana visualization
Download Grafana: https://grafana.com/grafana/download?pg=get&plcmt=selfmanaged-box1-cta1
Install Grafana and start the server.
sudo systemctl daemon-reload
sudo systemctl start grafana-server
sudo systemctl status grafana-server
sudo systemctl enable grafana-server
By default it runs on port 3000. To change the port, edit the http_port setting in the [server] section of the Grafana configuration file.
Open: http://localhost:3000/
The default credentials are admin for both the username and the password.
Grafana uses Prometheus as a data source and uses PromQL under the hood for its queries.
Adding a datasource
1. Go to Configuration to add a data source: http://localhost:3000/datasources
2. Click on Add Data Source and select Prometheus.
3. Enter the Prometheus URL, which is http://localhost:9090/
4. Click on Save & Test
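The same data source can also be created without the UI, through Grafana's HTTP API (POST /api/datasources). The sketch below only builds the request and does not send it. The data source name "Prometheus", the helper name, and the use of the default admin credentials are assumptions for this local setup.

```python
import base64
import json
from urllib.request import Request


def build_datasource_request(grafana_url: str, prometheus_url: str,
                             user: str, password: str) -> Request:
    """Build (but do not send) the request that registers Prometheus in Grafana."""
    payload = {
        "name": "Prometheus",      # display name, chosen for this example
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",         # Grafana's backend performs the queries
        "isDefault": True,
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_datasource_request(
    "http://localhost:3000", "http://localhost:9090", "admin", "admin"
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it to a running Grafana instance. Change the admin password before relying on this beyond a local test.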
Adding a dashboard
We have to create dashboards in order to visualize the data.
Instead of creating one from scratch, though, we can import dashboards that are already publicly available.
Node Exporter dashboard: https://grafana.com/grafana/dashboards/1860
Windows Exporter dashboard: https://grafana.com/grafana/dashboards/14694
1. Click on the plus icon and click on the Import option.
2. Copy the dashboard JSON or, more simply, the dashboard ID, paste it, and click on Load.
3. Select Prometheus as the data source and click on Import. You should be redirected to the newly created dashboard.
All dashboard and panel configurations are available as JSON. To learn how a panel is built, or which metric and formula a visualization uses, follow these steps.
1. Click on the down arrow on a panel.
2. Click on edit.
Alternatively, you can click on a panel and press
For this example, the node (host) and job (the value of the job label on the metric) variables are used as filters. They are label values of the metric node_cpu_seconds_total.
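Dashboard variables act as placeholders inside panel queries: whatever is picked in the drop-downs is substituted into the PromQL before it is sent to Prometheus. The sketch below mimics that substitution with Python's string.Template, which happens to use the same $name syntax. The panel query is a hypothetical example rather than one copied from the dashboard, and Grafana's real interpolation supports more formats than this.

```python
from string import Template

# Hypothetical panel query using the dashboard variables $node and $job.
panel_query = Template(
    'rate(node_cpu_seconds_total{instance="$node",job="$job",mode="system"}[1m])'
)

# Values a user might pick in the dashboard's drop-downs.
selected = {"node": "localhost:9100", "job": "my_machine"}
print(panel_query.substitute(selected))
```

Changing the selection changes only the label matchers, so one panel definition can serve every host and job.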
More on Grafana Visualization: https://grafana.com/docs/grafana/latest/visualizations/
Explore other dashboards and panels, check which values they use, and create your own dashboard to monitor your system.
Here we have only covered monitoring. A proper monitoring solution should also have an alerting system. As mentioned, Prometheus supports alerting, and so does Grafana.
Learn how to set up alerts from the following links:
- https://grafana.com/docs/grafana/latest/alerting/
- https://grafana.com/docs/grafana/latest/alerting/old-alerting/create-alerts/
- https://grafana.com/blog/2021/06/14/the-new-unified-alerting-system-for-grafana-everything-you-need-to-know/
- https://prometheus.io/docs/alerting/latest/alertmanager/ (Since we are using Grafana, we don't have to set up alerting in Prometheus.)
Posted By: Vikas Kyatannawar, Osmosee