Percona Monitoring and Management, Meet Prometheus Alertmanager

One of the requests we get most often on the Percona Monitoring and Management (PMM) team is “Do you support alerting?” The answer has always been “Yes,” but the feedback on our native offering was that it was, well, not robust enough! We’ve been hard at work to change that and are excited to offer, starting with the newly released PMM version 2.3.0, a more dynamic alerting mechanism for your PMM installations: integration with Prometheus Alertmanager.

Prometheus Alertmanager

If you don’t know what Alertmanager is, you can read all about it on the Prometheus website, but the short version is that Alertmanager is a receiver, consolidator, and router of alerting messages that offers LOTS of flexibility when it comes to configuration. In my old days as a SysAdmin, the tools I used weren’t smart enough to deduplicate alerts, so I’d have my boss yelling, my coworkers emailing, and my phone (ok…Blackberry) battery depleting itself vibrating to the same alert over and over until I could put the alert in maintenance mode and the queue of alerts drained. Alertmanager is smart enough to deduplicate alerts, so you don’t get 50 pages telling you the disk is 90% full before you can grow the volume or purge files. It’s also extremely easy to group alerts so that you don’t get separate alerts for ‘Application Down’, ‘MySQL Down’, ‘CPU|RAM|Disk: Unavail’, etc. because someone rebooted the DB server without putting monitoring in maintenance mode. Alertmanager also offers many native integrations, so you can route alerts to email, SMS, PagerDuty, Slack, and more!
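To make the grouping and routing ideas a little more concrete, here is a minimal sketch of an Alertmanager configuration. Treat it as an illustration under stated assumptions rather than a recommended setup: the receiver name, the Slack channel, the placeholder webhook URL, and the choice of the `instance` label for grouping are all hypothetical and will differ in your environment.

```yaml
route:
  # Default receiver for every alert (hypothetical name).
  receiver: team-slack
  # Alerts sharing the same value of these labels are bundled into one
  # notification. Grouping on `instance` is an assumption; use whatever
  # label identifies a host in your setup.
  group_by: ['instance']
  # How long to wait before sending the first notification for a new group,
  # so related alerts have time to arrive and be batched together.
  group_wait: 30s
  # How long to wait before notifying about new alerts added to a group.
  group_interval: 5m
  # How long to wait before re-sending a notification that is still firing.
  repeat_interval: 4h

receivers:
  - name: team-slack
    slack_configs:
      # Placeholder webhook URL, not a real endpoint.
      - api_url: 'https://hooks.slack.com/services/XXX/YYY/ZZZ'
        channel: '#db-alerts'
```

With something like this in place, a rebooted database server would typically produce one grouped notification for that host instead of a separate page per symptom.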

Now, this is our first iteration of Alertmanager support, so at this point you will need your own working Alertmanager installation that your PMM server can communicate with. The only other things you’ll need are the rules you want to trigger alerts from. That’s basically it! You most likely already know how to create YAML-style rules, but for the curious, it looks something like this:

[Image: example alerting rule]

The above will trigger an alert to let you know which PostgreSQL instances monitored by PMM have been down for more than 5 minutes. Since this first pass targets experienced users, I’ll leave it to you to craft your own rules, but we’re really excited to be adding this sorely needed functionality!
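As a rough sketch, a rule along those lines might look like the following. The group name, alert name, labels, and annotations are hypothetical, and the use of the `pg_up` metric (1 when the exporter can reach PostgreSQL, 0 when it cannot) is an assumption; check the metric names in your own Prometheus instance before relying on it.

```yaml
groups:
  - name: postgresql-availability        # hypothetical group name
    rules:
      - alert: PostgreSQLDown            # hypothetical alert name
        # Assumes the exporter exposes pg_up (1 = reachable, 0 = down).
        expr: pg_up == 0
        # Only fire once the condition has held for 5 minutes.
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "PostgreSQL instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} has been unreachable for more than 5 minutes."
```

The `for: 5m` clause is what keeps a brief restart from paging anyone: the alert stays pending until the expression has been true for the whole window.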

 

For more information, you can read our Alertmanager integration documentation and FAQs. Update your instance today and let us know what you think; we would love to hear your feedback!


Comments (6)

  • Eugene

    I’m trying to add the following in the Alertmanager rule section:

    - alert: MongodbReplicationLag
    expr: (avg by (cluster,environment,set)(mongodb_mongod_replset_member_optime_date{state="PRIMARY"}) - min by (cluster,environment,set) (mongodb_mongod_replset_member_optime_date{state="SECONDARY"})) > 120
    for: 5m
    labels:
    severity: warning
    source: pmmprod

    Once I click the “Apply Alertmanager settings” button, an error pops up: Invalid Alert Manager rules.

    What am I doing wrong here?

    February 25, 2020 at 7:24 am
    • Eugene

      The text in the comment above did not preserve the YAML indentation, which was correct originally.

      February 25, 2020 at 7:28 am
    • Vadim Yalovets

      Could you check the expr part of your rule directly in the Prometheus UI?
      https:///prometheus/

      February 25, 2020 at 9:10 am
  • elisetta1984

    Hello.
    If I don’t want to use the new Prometheus Alertmanager, is it still possible to use the Grafana Alerting feature? I can no longer find the Alert tab on the dashboard graph panel in PMM 2.3.0.
    Thanks and Regards, Elisa

    March 2, 2020 at 7:55 am
    • Steve Hoffman

      elisetta1984,
      Grafana Alerting still works in PMM 2.3.0; I was just able to set one up on my test instance and it’s alerting me as expected. I’ll assume you have configured a notification channel. Once that’s done, you can create a new dashboard and add a panel to it. The option for alerts isn’t a tab anymore but a “bell” icon along the left side of the “edit panel” dialog (there are icons for Queries, Visualization, General, and finally Alert). If you’re looking to add alerts to the existing graphs, that’s possible but a little more involved, as many of our dashboards are templated. Here’s a blog post from PMM1, but the bulk of what’s outlined still holds true in PMM2: https://www.percona.com/blog/2017/02/02/pmm-alerting-with-grafana-working-with-templated-dashboards/

      We are also looking at incorporating the latest Grafana 6.6 as there’s been some significant work done with alerting but that’s going to be a few releases out for us.

      Hope this helps!
      Steve

      March 3, 2020 at 5:14 pm
  • Sai

    Very nice article.

    A couple of questions:

    1. Does this resolve the template variables issue that we have been facing up through Grafana 4.x?
    2. If we have 250 mongo servers, how can we configure and send alerts only for those servers where there is an issue?

    March 16, 2020 at 12:41 pm
