Emergency

PMM graphs look strange


  • PMM graphs look strange

    We have a PMM installation and the graphs look like this... I don't think there are any network issues, because those would probably affect our production and we would be notified about them very quickly.
    The server that runs the PMM Docker container is an old database server: a PowerEdge R720 with SSDs in RAID 10 and 127 GB of memory, so it should be able to handle the monitoring tool. The server is also on the same network segment as the servers it monitors.

    Is there any tuning that needs to be done, or what can I look for? The tool is useful, but the graphs are not when they look like this.

  • #2
    Hi Catoman, I would suggest you send us a screen capture of the Prometheus dashboard. Usually we see gaps in the graphs for the following reasons:
    • the DB server is overloaded and times out on metrics retrieval from mysqld_exporter
    • network round trips take longer than the scrape interval (1s by default); this is common
    • the PMM server is overloaded, although given the specs you mention you're well provisioned
    Also send us a screen capture of /prometheus/targets.



    • #3
      Hi Michael,

      I have attached files for config and targets. Not sure what you mean by "Prometheus dashboard".



      • #4
        More files



        • #5
          Running a test now so we can let this rest until next week.



          • #6
            Hi Catoman,

            As far as I can see, you have "context deadline exceeded" errors, which are the most common reason for gaps in graphs.
            A "context deadline exceeded" error means that mysqld_exporter takes longer than Prometheus expects (it cannot finish its work within 1 second).
            So mysqld_exporter puts long-running additional load on the database server.
            mysqld_exporter runs many queries against the database, so we can disable some checks to speed it up.
            Usually the slowest query is 'tablestat'; it can be disabled with the following commands.
            Code:
            pmm-admin remove mysql:metrics
            pmm-admin add mysql:metrics --disable-tablestats
            The options --disable-userstats, --disable-processlist and --disable-binlogstats are also available.
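
To check whether an exporter really needs more than the 1-second window mentioned above, one option is to time a manual fetch of its metrics and compare the result against the timeout. The following is only a rough sketch: the function name classify_scrape and the METRICS_URL variable are made up for illustration and are not part of PMM or pmm-admin, and the real endpoint, TLS settings and credentials depend on your installation.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a measured scrape duration fits inside the scrape timeout.
# The 1-second default comes from the thread; all names here are illustrative.

# classify_scrape DURATION_SECONDS [TIMEOUT_SECONDS]
# Prints "ok" if the duration is within the timeout, otherwise "too slow".
classify_scrape() {
    local duration="$1"
    local timeout="${2:-1}"
    # awk does the floating-point comparison that plain [ ] cannot do.
    awk -v d="$duration" -v t="$timeout" \
        'BEGIN { print ((d + 0 <= t + 0) ? "ok" : "too slow") }'
}

# Hypothetical usage: METRICS_URL is a placeholder for the exporter's
# metrics endpoint; add whatever TLS or authentication options you need.
#   elapsed=$(curl -s -o /dev/null -w '%{time_total}' "$METRICS_URL")
#   classify_scrape "$elapsed" 1
```

If the fetch repeatedly comes back as "too slow", that matches the "context deadline exceeded" symptom, and disabling the heavier collectors as described above is a reasonable next step.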



            • #7
              We can close this... I migrated (well, re-installed) the PMM server onto an HP DL385 instead, and now the graphs are solid. No gaps or anything.
              The problem was the Dell server.

