PMM Alerting with Grafana: Working with Templated Dashboards
Peter Zaitsev
In this blog post, we will look into more intricate details of PMM alerting. More specifically, we’ll look at how to set up alerting based on templated dashboards.
Percona Monitoring and Management (PMM) 1.0.7 includes Grafana 4.0, which comes with the Alerting feature. Barrett Chambers shared how to enable alerting in general. This blog post looks at the specifics of setting up alerting based on the templated dashboards. Grafana 4.0 does not support alerting on templated dashboards out of the box.
This means if I try to set up an alert on the number of MySQL threads running, I get the error “Template variables are not supported in alert queries.”
What is the solution?
Until Grafana provides a better option, you need to base your alerts on graphs that don’t use templating. This is how to do it.
Click on “Create New” in the Dashboards list to create a basic dashboard for your alerts:
Click on “Add Panel” and select “Graph”:
Click on the title of the new panel, then on the menu sign, and then click on “Panel JSON”.
This shows you the JSON of the panel, which will look something like this:
Now go back to the other browser window, to the dashboard with the graph you want to alert on, and show the panel JSON for that graph. In our case, we go to “MySQL Overview” and show the JSON for the “MySQL Active Threads” panel.
Copy the JSON from the “MySQL Active Threads” panel and paste it into the new panel in the dashboard created for alerting.
Once we have done the copy/paste, click on the green Update button, and we’ll see the broken panel:
It’s broken because the panel’s expressions use templating variables, and none of them are set up in this dashboard, so the expressions won’t work. We must replace the template variables in the formulas with the actual hosts, instances, mount points, etc., that we want to alert on:
We need to change $host to the name of the host we want to alert on, and the $interval should align with the data capture interval (here we’ll set it to 5 seconds):
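If you have more than a couple of expressions to fix, the replacement can also be scripted on the copied panel JSON before pasting it. The following is only a minimal sketch: the panel dictionary is a trimmed-down, hypothetical stand-in for the real “MySQL Active Threads” JSON, the expressions and the host name "db1" are illustrative, and only the plain $name variable syntax is handled.

```python
import copy
import json
import re

# Hypothetical, trimmed-down panel JSON. A real PMM panel carries many more
# keys, and its expressions may differ from these illustrative ones.
panel = {
    "title": "MySQL Active Threads",
    "targets": [
        {"refId": "A",
         "expr": 'max_over_time(mysql_global_status_threads_connected{instance="$host"}[$interval])'},
        {"refId": "B",
         "expr": 'avg_over_time(mysql_global_status_threads_running{instance="$host"}[$interval])'},
    ],
}

# Matches the plain $name form only (an assumption made for this sketch).
VARIABLE = re.compile(r"\$(\w+)")


def fill_template_variables(panel, values):
    """Return a copy of the panel with $name placeholders replaced in each target."""
    filled = copy.deepcopy(panel)
    for target in filled.get("targets", []):
        # Unknown variables are left untouched so they can be spotted later.
        target["expr"] = VARIABLE.sub(
            lambda m: values.get(m.group(1), m.group(0)), target["expr"]
        )
    return filled


def remaining_variables(panel):
    """List the template variables still present in the panel's expressions."""
    found = set()
    for target in panel.get("targets", []):
        found.update(VARIABLE.findall(target["expr"]))
    return sorted(found)


# "db1" is a made-up host name; "5s" mirrors the 5 second capture interval.
fixed = fill_template_variables(panel, {"host": "db1", "interval": "5s"})
print(json.dumps(fixed, indent=2))
print("Variables left:", remaining_variables(fixed))
```

An empty “Variables left” list means the expressions should no longer trip the “Template variables are not supported in alert queries” check.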
If correctly set up, you should see the graph showing the data.
Finally, we can go to edit the graph. Click on the “Alert” tab and then on “Create Alert”.
Specify “Evaluate Every”, which sets the evaluation interval for the alert rule, as well as the alert conditions. The more often the alert evaluates the condition, the more quickly you get alerted if something goes wrong.
In our case, we want to get an alert if the number of running threads is sustained at a high level. To do this, we check whether the minimum number of running threads over the last minute is above 30:
Note that our query has two parameters: “A” is the number of threads connected, and “B” is the number of threads running. We’re choosing to Alert on “B”.
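To see why taking the minimum over the last minute catches only sustained load, here is a small sketch of the same logic outside Grafana. It is not how Grafana implements its evaluator; the function name, the sample format, and the choice to stay quiet when there is no data are all assumptions made for illustration.

```python
def should_alert(samples, now, window_seconds=60, threshold=30):
    """Return True if the minimum value within the window is above the threshold.

    samples is a list of (timestamp_in_seconds, value) pairs for query "B"
    (threads running). This mimics the condition described above; it is not
    Grafana's own code.
    """
    recent = [value for ts, value in samples if now - window_seconds <= ts <= now]
    if not recent:
        # Assumption for this sketch: no data means no alert.
        return False
    return min(recent) > threshold


now = 1000

# One sample every 5 seconds over the last minute, all well above 30.
sustained = [(now - offset, 40) for offset in range(0, 60, 5)]

# Same timestamps, but mostly idle with a single spike to 80.
spike = [(now - offset, 80 if offset == 20 else 5) for offset in range(0, 60, 5)]

print(should_alert(sustained, now))  # True: every sample is above 30
print(should_alert(spike, now))      # False: the minimum is 5
```

A short spike does not fire the alert because a single low sample in the window pulls the minimum below the threshold.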
A beautiful thing about Grafana is that it shows the alert threshold clearly on the graph and lets you edit the alert just by moving this alert line with the mouse:
You may want to click on the floppy disk icon at the top to save the dashboard (giving it whatever identifying name you want).
At this point, you should see the alert working. A little heart sign appears by the graph title, colored green (indicating the alert is not active) or red (indicating it is active). Additionally, you will see red and green vertical lines in the alert history. These show when the alert was triggered and when the system went back to normal.
You probably want to set up notifications as well as see alerts on the graphs.
To set up notifications, go to the Grafana Configuration menu and configure Alerting. Grafana supports Email, Slack, PagerDuty and general Webhook notification options (with more on the way, I’m sure).
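If you choose the general Webhook option, something on your side has to receive and interpret the JSON that Grafana posts. The sketch below turns such a body into a one-line summary. The field names used here ("state", "ruleName", "evalMatches" with "metric" and "value") are an assumption about the webhook payload, and the sample body is invented, so check the payload your Grafana version actually sends before relying on it.

```python
import json


def summarize_alert(body):
    """Build a one-line summary from a webhook body (a JSON string).

    The field names are assumed for this sketch; missing fields fall back
    to placeholders instead of raising.
    """
    payload = json.loads(body)
    state = payload.get("state", "unknown")
    name = payload.get("ruleName", "unnamed rule")
    matches = payload.get("evalMatches") or []
    details = ", ".join(
        "{}={}".format(m.get("metric"), m.get("value")) for m in matches
    )
    summary = "[{}] {}".format(state, name)
    return summary + ": " + details if details else summary


# Invented example body, shaped according to the assumptions above.
sample_body = json.dumps({
    "ruleName": "MySQL Active Threads alert",
    "state": "alerting",
    "evalMatches": [{"metric": "threads_running", "value": 42}],
})

print(summarize_alert(sample_body))
```

The function only parses and formats; wiring it into an HTTP endpoint is left to whatever web framework you already use.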
The same way you added the “Graph” panel to set up an alert, you can add the “Alert List” panel to see all the alerts you have set up (and their status):
As you can see, it is possible to set up alerts in PMM using the new Grafana 4.0 alerting feature. It is not very convenient or easy to do. This is the first release with alerting support for Grafana and PMM. As such, I’m sure it will become much easier and more convenient over time.