PMM Performance

  • PMM Performance

    Hi,

    I've been running PMM for several months on a Hyper-V guest. Data was hosted on a RAID10 array of four 10k disks. It struggled a bit with IO, but usually kept up and didn't lose data. In recent weeks it has been struggling and holes have been appearing in the data, so I took the opportunity to create a new server on dedicated tin.

    The new server has a RAID10 array of 6 disks. It is still a VM, however it has a direct LUN attachment to avoid disk alignment issues. It has 28GB of RAM assigned and all CPU cores. Prometheus only seems to be using a couple of gig.

    PMM is the latest build from Docker, installed on CentOS 7.

    There are six Percona servers reporting in, from two PxC clusters, all on the latest PMM-Client.

    Spec below:



    The new PMM ran OK for a couple of days, hitting write IO hard as before. However, after a couple of days the read IO has gone high as well, and now we are seeing data missing.


    Prometheus stats below



    I have removed mysql:queries, but this doesn't seem to have helped (slowlog was off anyway, so possibly this wasn't a factor). Here's an example of the client config for one cluster.

    Any suggestions on what I can do to get some performance back? Looking at some other threads, 157k may be a lot of "Time Series" for six servers. Is there a way to just make it collect less data?





  • #2
    I've potentially fixed this myself. Based on what I saw in two other articles:

    1) Increased the memory allocated to Prometheus (256MB by default, now raised to 4GB)
    2) Turned off table stats
    sudo pmm-admin remove mysql:metrics
    sudo pmm-admin add mysql:metrics --user pmm-mysql --password whateveryourpasswordis --disable-tablestats
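    A minimal sketch of step 1, assuming the memory limit is raised through the METRICS_MEMORY environment variable that PMM Server 1.x reads at container start (value in kilobytes). The post does not say which mechanism was actually used, and the container name, data container and image tag below are taken from the standard Docker install instructions rather than confirmed for this setup. The snippet only prints the command, because the existing pmm-server container would have to be stopped and removed before re-creating it.

```shell
# Assumption: PMM Server 1.x takes the Prometheus memory limit from
# METRICS_MEMORY, expressed in kilobytes.
metrics_memory_kb=$((4 * 1024 * 1024))   # 4 GB in KB

# Assumption: names and image tag follow the standard install instructions.
cmd="docker run -d -p 80:80 --volumes-from pmm-data --name pmm-server -e METRICS_MEMORY=${metrics_memory_kb} --restart always percona/pmm-server:latest"

# Print rather than execute: the old container must be removed first.
echo "$cmd"
```

    The pmm-data container holds the collected metrics, so re-creating pmm-server this way should keep existing data, but check that against your own deployment first.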

    Time Series is now a tenth of what it was, and disk IO has dropped to something much more manageable.


    I think the installation instructions for PMM-Client need to include --disable-tablestats, as I'm sure everyone will be doing the same thing. Or at least mention that this is recommended for large deployments (which I'm sure is largely the case with PxC users).

    PS - do we really need a CAPTCHA for every post and edit given we're all authenticated? (and apparently first post vetted?)
    Last edited by RichardGriffiths; 05-10-2017, 05:11 AM. Reason: screen shot sizing



    • #3
      Hi RichardGriffiths,
      1. Your tuning approach is correct: increasing memory for the PMM server container is the first step to ensure Prometheus can cache incoming scrapes in memory.
      2. Disabling table stats is an acceptable step to throttle the volume of data written to disk. Another approach would be to scrape less often by raising scrape_interval to perhaps 5s in /etc/prometheus.yml, so that you still collect the full set of metrics, just at a lower resolution.
      3. I've filed a request to our web team about the CAPTCHA requirement - as a moderator I'm also required to do this for each post, and yes, it is a little excessive. Watch for some improvement shortly.
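      To put rough numbers on point 2: the ingest rate is approximately the number of time series divided by the scrape interval. Here is a back-of-envelope sketch that assumes every series is currently scraped every 1s, which is an assumption for illustration only since the actual interval can differ per scrape job. The 157k figure and the "tenth of what it was" reduction come from the posts above.

```shell
series=157000      # "157k" time series reported in the first post
base_interval=1    # ASSUMED current scrape interval, in seconds
new_interval=5     # interval suggested in point 2

current_rate=$((series / base_interval))            # everything as-is
slower_rate=$((series / new_interval))              # same series, 5s interval
fewer_series_rate=$((series / 10 / base_interval))  # a tenth of the series, 1s interval

echo "current:              ${current_rate} samples/s"
echo "scrape_interval 5s:   ${slower_rate} samples/s"
echo "tablestats disabled:  ${fewer_series_rate} samples/s"
```

      Under these assumptions, disabling table stats cuts the write load more than moving to 5s would, and the two changes can be combined if disk IO is still a problem.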



      • #4
        Hello Richard Griffiths, Michael.
        Captcha is no longer required, keep posting!