CPU% increases up to 80% after 1 week uptime with insignificant load


  • CPU% increases up to 80% after 1 week uptime with insignificant load


    We are experiencing problems with our Percona Server. After 1 week of uptime the CPU% increases up to 80% with insignificant load. After restarting the server, CPU% drops to less than 3%. This obviously worries us because we expect big growth in the upcoming days. Slowqueries.log does not show us anything significant.

    How can we debug this problem?

    These are our specs:
    Opsys: CentOS release 6.2 (Final)
    Kernel: 2.6.32-220.13.1.el6.x86_64 GNU/Linux
    Server version: 5.5.27-28.0-log Percona Server (GPL), Release rel28.0, Revision 291

    Thank you,

  • #2
    Please, are there any suggestions? Why does Percona spend 80% CPU under insignificant load? "Show processlist" shows only 3 or 4 queries, each handled within milliseconds. The slow query log does not show anything (even with a threshold value of 0.3 seconds).

    Why does Percona spend 80% CPU and after restart less than 3%? I am new to Percona (shifted from MySQL) and most /etc/my.cnf settings are reasonable, as far as I can see.

    Should we upgrade the version? Or downgrade? Anybody?

    See attachment.


    • #3
      Do you have trending graphs for memory usage and disk read/writes from before and after the migration to Percona? I'd be curious to see what is going on there, as if you are swapping or have increased disk access it could be causing the increase in CPU load.

      And to clarify, when you say you restarted the server, do you mean just MySQL (Percona), or the entire server?


      • #4

        Thank you for your reply!

        I checked I/O wait with mpstat and it seemed pretty normal during the 90% spikes. By restart, I meant only the Percona Server, which is what worries me. The Percona Server runs on a virtual machine hosted by some company.

        Attached you will find all the stats that we receive in graphs. We have been told that we should not worry about the swap graphs. Probably due to virtualization? Or maybe I am being naive. We have seen these spikes in the swap charts, but they never really influenced performance during our load test. Also, during the second peak of 60% CPU usage there is no swap, and we really have no significant load yet.

        The database is not that big; it should *definitely* fit in memory.

        Let me attach the charts.


        • #5
          Hard to tell much from the graphs without a longer history. The second smaller spike could have been due to a backup, so it might be unrelated.

          It could possibly be due to the leap second. I'd check that to at least rule it out:

          (use sudo where necessary)
          $ /etc/init.d/mysql stop
          $ /etc/init.d/ntpd stop
          $ date -s "`date`"
          $ /etc/init.d/ntpd start
          $ /etc/init.d/mysql start

          That aside, I'd watch it for another day to gather some more data. Your graphs show it staying pretty flat, so if it spikes back up again try to gather as much data as you can via "free -m" and "top" to see what is actually using the CPU.
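
A minimal sketch of how that data could be captured the next time the CPU climbs. It assumes a Linux host with the procps tools (`free`, `top`) and a server process named `mysqld`; the function name `snapshot` and the log path in the usage comment are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print a timestamped snapshot of memory usage and
# per-thread CPU usage for one process. Pass the PID to inspect.
snapshot() {
    local pid="$1"
    echo "=== snapshot $(date '+%Y-%m-%d %H:%M:%S') pid=$pid ==="
    # Memory and swap usage in megabytes
    free -m
    # One batch-mode iteration of top, listing the individual threads (-H)
    # of the given process so the busiest thread stands out
    top -b -H -n 1 -p "$pid" | head -n 20
}

# Usage (assumes the process is called mysqld):
#   snapshot "$(pidof -s mysqld)" >> /tmp/mysqld-cpu.log
```

Running this from cron or a loop during a spike would leave a record to compare against a snapshot taken right after a restart.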


          • #6

            Thanks for your reply.

            I gathered data at the peak and, with top, saw high CPU% only for the mysql process. I will check free -m next time, but response times were OK. No weirdness in the slow query log. Show processlist showed no hanging processes. We do not have a slave configuration. Very weird.

            It is not a backup problem, because even a mysqldump takes only a few seconds. The database is still very small. We haven't installed Percona's backup scripts yet, so they cannot be causing the high CPU%. Also, the high CPU% lasted a long time, until I restarted Percona Server.

            Is there any data in particular I should look out for when this happens again? Are there any tools for monitoring Percona itself? Where is all the CPU% going within the mysqld process?



            • #7
              The best free trend-monitoring solution we have used is Cacti with the Percona Monitoring Plugins. That will give you the majority of the info you need, and creates nice graphs.


              Did you try the steps for fixing the potential leap second issue? That can cause sustained high CPU usage by the mysql process, so I would definitely try that.


              • #8
                Thanks for your help. I will look into that when we are spiking again.

                What do you mean by the leap second? Is this a known problem with Percona? I know about leap years, but I have never heard of a leap second.

                $ /etc/init.d/mysql stop
                $ /etc/init.d/ntpd stop
                $ date -s "`date`"
                $ /etc/init.d/ntpd start
                $ /etc/init.d/mysql start

                What does this give me? What should I look for?

                NTP *should* be running synchronized on the app (2 of them) and db server.


                • #9
                  It's a general issue that is not specific to Percona, but MySQL / Percona can be affected by it.

                  Basically you are just re-setting the time, so the only output you should see is the date (other than messages for the services stopping / starting):

                  [user@server ~]$ sudo /etc/init.d/mysql stop
                  Shutting down MySQL (Percona Server).......... [ OK ]
                  [user@server ~]$ sudo /etc/init.d/ntpd stop
                  Shutting down ntpd: [ OK ]
                  [user@server ~]$ sudo date -s "`date`"
                  Fri Oct 5 11:34:09 PDT 2012
                  [user@server ~]$ sudo /etc/init.d/ntpd start
                  Starting ntpd: [ OK ]
                  [user@server ~]$ sudo /etc/init.d/mysql start
                  Starting MySQL (Percona Server)....... [ OK ]
                  [user@server ~]$

                  Below is a "bug" someone submitted, with similar circumstances, that turned out to be the leap second:

                  And below is a blog post talking about it:
                  http://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/
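
Before resetting the clock, it may help to check whether the host could have been affected at all. The sketch below is only a heuristic, under the assumption that the issue concerns kernels that were running when the leap second was inserted at the end of 2012-06-30 UTC. The function names are hypothetical, and `/proc/uptime` is Linux-specific.

```shell
#!/bin/bash
# 2012-07-01 00:00:00 UTC, the instant right after the inserted leap second
LEAP_END_EPOCH=1341100800

# Succeeds (exit 0) if the given boot time (Unix epoch seconds) is earlier
# than the leap second, i.e. the kernel was running through it.
up_through_leap_second() {
    local boot_epoch="$1"
    [ "$boot_epoch" -lt "$LEAP_END_EPOCH" ]
}

# Boot time of this host: current time minus the uptime from /proc/uptime
boot_epoch() {
    local now up
    now=$(date +%s)
    up=$(cut -d' ' -f1 /proc/uptime | cut -d. -f1)
    echo $(( now - up ))
}

# Usage:
#   if up_through_leap_second "$(boot_epoch)"; then
#       echo "host was up during the leap second; the clock reset is worth trying"
#   fi
#   dmesg | grep -i 'leap second'   # the kernel may have logged the insertion
```

If the host was booted after that date, the leap second is an unlikely explanation and the per-thread CPU data becomes more interesting.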


                  • #10
                    Haha, the leap second! OK, I will try. According to the post, only:

                    date -s "`date`"

                    should solve the issue. I would rather not restart the database...

                    Let's wait for the next spike, and thank you very much for your help!



                    • #11
                      I would at least stop ntpd first, then set date, then start ntpd.

                      I've read mixed things about what people have actually needed to restart to fix the issue. The few times I have run into this personally have been on non-critical servers, so I restarted MySQL just to make sure.
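
The three steps without the MySQL restart could be wrapped as in the sketch below. It assumes the same init script path for ntpd used earlier in this thread; the function name `reset_clock` and the `DRY_RUN` variable are hypothetical. By default it only prints what it would do.

```shell
#!/bin/bash
# Hypothetical wrapper: stop ntpd, re-set the clock to its current value,
# start ntpd again. Prints the commands unless DRY_RUN=0 is set, in which
# case it runs them (root required) and stops at the first failure.
reset_clock() {
    local step
    for step in \
        '/etc/init.d/ntpd stop' \
        'date -s "$(date)"' \
        '/etc/init.d/ntpd start'
    do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: $step"
        else
            eval "$step" || return 1
        fi
    done
}

# Usage:
#   reset_clock                  # dry run, prints the three commands
#   sudo DRY_RUN=0 bash -c "$(declare -f reset_clock); reset_clock"
```

Because the function returns at the first failing step, ntpd stays stopped if setting the date fails, so it is worth checking that ntpd is running again afterwards.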


                      • #12
                        When I restarted Percona (service mysqld restart), everything went back to normal (<3% CPU), so that definitely works. But I would rather not restart Percona in production, as it will result in failing queries and eventually failing HTTP requests from real people. There must be a better fix...


                        • #13
                          Yeah I'd just do all the steps minus restarting MySQL then and see how it goes. =)


                          • #14
                            Thanks for the guidance. I will post in the coming few days whether it helped.