
PMM linux:metrics service causing abnormal load on client machine


  • PMM linux:metrics service causing abnormal load on client machine

    Hi,

    PMM version: 1.9.0
    The PMM linux:metrics service is causing high load on a client machine; however, the SAR report does not indicate any such load.
    The load average goes up to 500-1000 (as reported by the "w" command).

    At the same time, the "Load Average" graph on the PMM "System Overview" dashboard shows the same load (900+), after which the "System Overview" graphs stopped populating.
    We have multiple machines configured for monitoring with PMM, and the problem occurs only on this one client.

    As soon as the PMM linux:metrics service is stopped, the load comes back down to normal.
    We started and stopped PMM linux:metrics multiple times to confirm the issue.

    Attached herewith are pmm-server.log and pmm-client.log.
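
    One general reason a load average and a CPU report can disagree: on Linux, the load average counts tasks in uninterruptible sleep (state D) as well as runnable tasks (state R), so it can climb while CPU usage stays flat. The following is a minimal sketch for checking this, assuming a Linux /proc filesystem; it is not part of PMM, the function names are hypothetical, and it counts processes rather than individual threads, so it only approximates what the kernel counts.

```python
import os
from collections import Counter
from pathlib import Path


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def task_state(stat_text):
    """Return the state letter (R, S, D, ...) from /proc/<pid>/stat text."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so the state is the first field after the last ')'.
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_task_states(proc_root="/proc"):
    """Count processes by state letter (per process, not per thread)."""
    states = Counter()
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue
        try:
            states[task_state((entry / "stat").read_text())] += 1
        except (OSError, IndexError):
            continue  # the process exited while we were reading it
    return states


if __name__ == "__main__" and Path("/proc/loadavg").exists():
    load1, _, _ = parse_loadavg(Path("/proc/loadavg").read_text())
    states = count_task_states()
    print(f"load1={load1} cpus={os.cpu_count()} R={states['R']} D={states['D']}")
```

    A large D count next to a low R count would point toward blocked I/O rather than CPU work, though that alone would not confirm the cause here.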



  • #2
    Hi vaibhav_upadhyay40

    Nothing jumps out from the command-line output you shared. However, I see that linux:metrics was down when you collected the data, so we don't have a full capture of the poor performance behaviour.

    Can I suggest that you start linux:metrics on one server and then, after a period of time, share a snapshot of the Prometheus Exporter Status dashboard for the host where linux:metrics is running? It should then show us which submodule is consuming the bulk of the CPU time and causing the high load.
    https://pmmdemo.percona.com/graph/dashboard/db/prometheus-exporter-status?refresh=1m&orgId=1&var-interval=$__auto_interval&var-host=ps57

    Snapshots are a new feature as of PMM 1.9; please see the documentation here: https://www.percona.com/doc/percona-...ting-snapshots
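
    As a rough alternative to the dashboard, the per-submodule comparison can be sketched from raw exporter output in the Prometheus text format. This is only an illustration under stated assumptions: the metric name node_exporter_scrape_duration_seconds_sum and its collector label are assumed and may differ by exporter version, the sample values are made up, and scrape duration is wall-clock time rather than CPU time.

```python
import re
from collections import defaultdict

# Matches sample lines such as: name{collector="cpu",result="success"} 1.5
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def collector_seconds(metrics_text,
                      metric="node_exporter_scrape_duration_seconds_sum"):
    """Sum the given metric per 'collector' label, largest total first.

    The default metric name is an assumption; pass the name your
    exporter actually exposes.
    """
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        match = SAMPLE_RE.match(line)
        if not match or match.group("name") != metric:
            continue  # comments, other metrics, unlabelled samples
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if "collector" in labels:
            totals[labels["collector"]] += float(match.group("value"))
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up sample input, for illustration only.
SAMPLE = """\
# TYPE node_exporter_scrape_duration_seconds summary
node_exporter_scrape_duration_seconds_sum{collector="cpu",result="success"} 1.5
node_exporter_scrape_duration_seconds_sum{collector="filesystem",result="success"} 42.0
node_exporter_scrape_duration_seconds_sum{collector="filesystem",result="error"} 3.0
"""

for name, seconds in collector_seconds(SAMPLE):
    print(f"{name}: {seconds:.1f}s")
```

    The collector at the top of the list is the first candidate to look at or to disable temporarily.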
