In this blog post, I will look at the new dashboards in Percona Monitoring and Management (PMM) for Prometheus exporters.
PMM uses Prometheus exporters to capture metrics data from the systems it monitors. These exporters are an important part of your monitoring infrastructure, and understanding their performance and other operational details is critical for well-implemented monitoring.
To help you with this, we’ve added a number of new dashboards to PMM.
The Prometheus Exporters Overview dashboard provides a high-level overview of your installed Prometheus exporter infrastructure:
The summary shows you how many hosts are monitored and how many exporters you have running, as well as how much CPU and memory they are using.
Note that the CPU usage shown in this graph is only the CPU usage of the exporter itself. It does not include the additional resource usage that is required to produce metrics by the application or operating system.
Next, we have an overview of resource usage by the host:
These graphs let us compare resource usage across hosts and clearly see whether any host has unusually high CPU or memory usage by exporters.
You may notice some of the CPU usage reported on these graphs is very high. This is because we use very high-resolution sampling and very underpowered instances for this demonstration environment. CPU usage numbers like this are not typical.
The next graphs show resource usage by the type of exporter:
In this case, we measure CPU usage in “CPU Cores” rather than as a percentage, because it is more meaningful. Expressed as a percentage, the same amount of actual resource usage by the exporter would look very different on a system with one core versus a system with 64 cores. Core usage numbers, by contrast, have a fairly stable baseline.
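To make the difference concrete, here is a minimal sketch of the arithmetic. It assumes CPU usage is derived from a cumulative CPU-seconds counter, such as the `process_cpu_seconds_total` metric that Prometheus client libraries expose for a process. The function names and sample values are made up for illustration and are not taken from PMM.

```python
def cpu_cores_used(cpu_seconds_start, cpu_seconds_end, interval_seconds):
    """Average number of CPU cores used over an interval.

    Both counter readings are cumulative CPU seconds consumed by the
    exporter process; interval_seconds is the wall-clock time between them.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (cpu_seconds_end - cpu_seconds_start) / interval_seconds


def cpu_percent_of_host(cores_used, host_cores):
    """Express the same usage as a percentage of total host CPU capacity."""
    return 100.0 * cores_used / host_cores


# Hypothetical readings: 15 CPU-seconds consumed over 60 wall-clock seconds.
cores = cpu_cores_used(120.0, 135.0, 60.0)
print(cores)                            # 0.25 cores, regardless of host size
print(cpu_percent_of_host(cores, 1))    # 25.0 percent of a 1-core host
print(cpu_percent_of_host(cores, 64))   # 0.390625 percent of a 64-core host
```

The exporter does the same amount of work in both cases, but the percentage differs by a factor of 64, while the core figure stays at 0.25.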
Then there is a list of your monitored hosts and the exporters they are running:
This shows your CPU usage and memory usage per host, as well as the number of exporters running and system details.
The Prometheus Exporter Status dashboard allows you to investigate how specific exporters are performing on a given host. Each of the well-known exporters has its own row in this dashboard.
Node Exporter Status shows us the resource usage, uptime and performance of Node Exporter (the exporter responsible for capturing OS-level metrics):
The “Collector Scrape Successful” graph shows which node_exporter collectors (the modules that collect specific categories of information) have returned data reliably. If you see anything other than a flat line at “1” here, you need to check for problems.
“Collector Execution Time” shows how long, on average, it takes to execute each of your enabled collectors. This reveals which collectors are generally more expensive to run, or whether some of them are experiencing performance problems.
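If you want to check the same signals outside the dashboard, the sketch below parses exporter output in the Prometheus text format and lists failed and slow collectors. It assumes the per-collector metric names `node_scrape_collector_success` and `node_scrape_collector_duration_seconds` used by recent node_exporter releases; the exporter version bundled with your PMM installation may use different names. The sample text and helper functions are hypothetical.

```python
import re

# Made-up sample of exporter output in the Prometheus text format.
SAMPLE = """\
# TYPE node_scrape_collector_success gauge
node_scrape_collector_success{collector="cpu"} 1
node_scrape_collector_success{collector="diskstats"} 1
node_scrape_collector_success{collector="textfile"} 0
# TYPE node_scrape_collector_duration_seconds gauge
node_scrape_collector_duration_seconds{collector="cpu"} 0.002
node_scrape_collector_duration_seconds{collector="diskstats"} 0.015
node_scrape_collector_duration_seconds{collector="textfile"} 0.0004
"""

# Matches only simple lines with a single `collector` label.
LINE = re.compile(r'^(\w+)\{collector="([^"]+)"\}\s+(\S+)$')


def parse_collector_metrics(text):
    """Return {metric_name: {collector: value}} for per-collector samples."""
    metrics = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            name, collector, value = match.groups()
            metrics.setdefault(name, {})[collector] = float(value)
    return metrics


def failed_collectors(metrics):
    """Collectors whose success gauge is anything other than 1."""
    success = metrics.get("node_scrape_collector_success", {})
    return sorted(c for c, v in success.items() if v != 1.0)


def slowest_collectors(metrics, top=3):
    """Collector names ordered by execution time, slowest first."""
    durations = metrics.get("node_scrape_collector_duration_seconds", {})
    return sorted(durations, key=durations.get, reverse=True)[:top]


metrics = parse_collector_metrics(SAMPLE)
print(failed_collectors(metrics))      # ['textfile']
print(slowest_collectors(metrics, 2))  # ['diskstats', 'cpu']
```

The regular expression is deliberately narrow. For anything beyond a quick check, a full text-format parser would be the safer choice.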
MySQL Exporter Status shows us how the MySQL exporter is performing:
Additionally, in resource usage we see the rate of scrapes for High, Medium and Low resolution data.
Generally, you should see three flat lines here if everything is working well. This is not the case for this host, and we can see some scrapes are not successful: they either fail to complete, or are not triggered by Prometheus Server at all (due to overload or connectivity issues).
These graphs provide information about MySQL Exporter errors, such as permission errors and other issues. They also show whether MySQL Server was up during this time. Similar details are reported for the MongoDB and ProxySQL exporters if they are running on the host.
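The “flat line” expectation can be sketched as a simple check: with a fixed scrape interval, every time window should contain the same number of scrapes, and a window with fewer points to missed or failed scrapes. The function names, the 5-second interval, and the timestamps below are illustrative assumptions, not values read from PMM.

```python
def scrape_counts(timestamps, start, end, window):
    """Count scrapes in each consecutive `window`-second slice of [start, end)."""
    n_windows = (end - start) // window
    counts = [0] * n_windows
    for ts in timestamps:
        index = (ts - start) // window
        if 0 <= index < n_windows:
            counts[index] += 1
    return counts


def windows_below_expected(counts, interval, window):
    """Indexes of windows with fewer scrapes than a steady interval implies."""
    expected = window // interval
    return [i for i, count in enumerate(counts) if count < expected]


# Hypothetical 5-second scrape interval over one minute,
# with the scrapes at t=25, 30 and 35 missing.
timestamps = [t for t in range(0, 60, 5) if t not in (25, 30, 35)]

counts = scrape_counts(timestamps, start=0, end=60, window=20)
print(counts)                                                # [4, 1, 4]
print(windows_below_expected(counts, interval=5, window=20)) # [1]
```

A healthy exporter would produce `[4, 4, 4]` here, which is the flat line the graph shows. The dip in the middle window corresponds to the kind of gap visible for this host.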
I hope these new dashboards help you to understand your Prometheus exporter performance better!