Announcement

Announcement Module
Collapse
No announcement yet.

MySQL Performance Degrades after hitting ~100% CPU Load

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • MySQL Performance Degrades after hitting ~100% CPU Load

    Hi everyone,

    I am having some interesting performance issues with MySQL installation and looking to get some suggestions as I have exhausted any ways to troubleshoot I know.

    First of all, info:
    OS - Fedora 2.6.40.4-5.fc15.x86_64
    MySQL - mysql-5.5.27-linux2.6-x86_64
    Hardware - HP DL580g7, 2 x E74820,64Gb ram, Raid 10 storage with 1Gb flash buffer

    my.cnf:

    [mysqld]port = 3306socket = /tmp/mysql.sockdefault-storage-engine = myisamtmpdir = /var/tmp/tmpmysql #Logslog=/var/log/mysql.loggeneral_log = 0slow_query_log = 1long_query_time = 10skip-external-lockingconcurrent_insert = 2 thread_concurrency = 16thread_cache_size = 8key_buffer_size = 28Gmax_allowed_packet = 32Mtable_open_cache = 2048sort_buffer_size = 2Mread_buffer_size = 2Mjoin_buffer_size = 128Mread_rnd_buffer_size = 32Mmyisam_sort_buffer_size = 2Gmyisam_max_sort_file_size = 8Gbulk_insert_buffer_size = 64M query_cache_size = 128Mquery_cache_limit = 64Mtmp_table_size = 256Mmax_tmp_tables = 86max_heap_table_size = 2Gback_log = 70max_connections = 500max_connect_errors = 10#InnoDBinnodb_thread_concurrency = 16innodb_write_io_threads = 8innodb_read_io_threads = 8innodb_buffer_pool_size = 4Ginnodb_data_home_dir = /usr/local/mysql/datainnodb_data_file_path = ibdata1:2000M;ibdata2:10M:autoextendinnodb_log_gro up_home_dir = /usr/local/mysql/datainnodb_additional_mem_pool_size = 20Minnodb_log_file_size = 1Ginnodb_log_buffer_size = 8Minnodb_flush_method=O_DIRECTinnodb_flush_log_at_ trx_commit = 0innodb_lock_wait_timeout = 120


    ------------------------------------------------------

    So the issue I am having is that MySQL would suddenly lose performance executing queries after running in heavy load for couple of hours. The same queries start running 10 -100 times slower. After restart everything is back to normal.

    This behaviour got more noticeable with addition of load. When running 1K queries per second, this used to never happen, 1.5K qps - would happen once a week, 2K qps - happens every day or more often.
    With this load the server is often at 100%, but this doesn't not explain why performance does not recover after the load is removed.

    Now about query types:
    60% selects on myisam tables
    25% inserts/updates on myisam tables
    15% innodb tables

    Around 50-60 concurrent connections

    The most common query executed:

    SELECT ( 4 * HOUR( a.date ) + FLOOR( MINUTE( a.date ) /15 ) ) AS period, SUM( a.mv ) AS count, DATE( date ) AS dateFROM table aWHERE a.date > DATE_SUB( '2012-08-08 16:14:55', INTERVAL 10MINUTE )AND a.date <= '2012-08-08 16:44:58'AND location = 'some_location'AND ...GROUP BY period


    It looks like I am hitting a bottleneck somewhere, but can not exactly pinpoint the spot. Things i have checked:

    Proper index - yes
    Tmp tables - all in memory
    Key cache - always enough
    Query execution plan does not change when problem occurs
    iostat - storage is loaded 10% at most
    CPU - hits ~ 100%, but goes back to 10%-20% with mysql degraded performance
    Ram - 36G out of 64G used by MySQL, the rest is for OS, never goes to swap


    What else can I check or do? Any suggestions are appreciated!
    Thanks.

  • #2
    After some digging on internet and trying some things, I found that executing "FLUSH TABLES" seems to restore performance close to original levels temporarily. Any thoughts about that?

    Comment


    • #3
      Just wanted to add the screenshot of "top" to show the database load. The database is at that load level approximately 50% of the time. It shows 32 Cpu, but really its 2 - 8 core cpu's + hyper-threading.
      Is it possible MySQL does not have enough cpu cycles left to clean up some internal queues/caches ... and "flush tables" explicitly does that?

      Comment


      • #4
        Try decreasing the size of the table cache.

        http://www.mysqlperformanceblog.com/2009/11/16/table_cache-n egative-scalability/

        Comment


        • #5
          Thanks for reply Justin,

          I actually tried running the database with my-huge.cnf default setting for a day to eliminate issues in configuration and the issue i described was still occurring. table_open_cache was set to 512, so I know this is not it.

          I also found a bug report describing somewhat similar issues: http://bugs.mysql.com/bug.php?id=36556

          Any suggestions anyone?

          Comment


          • #6
            Ok, i think a actually found something. I noticed that "Key_blocks_not_flushed" variable keeps growing with time and at some point a significant degradation in performance occurs. After "FLUSH TABLES", 'Key_blocks_not_flushed' goes to a very small number and performance restores.

            I wrote a small script to monitor query performance and flush the tables when performance drops. Take a look what happened tonight:


            20120811 03:19:01 Key_blocks_not_flushed : 1615585 Test query run time : 0.3005740642547620120811 03:20:02 Key_blocks_not_flushed : 1615585 Test query run time : 0.5485630035400420120811 03:21:02 Key_blocks_not_flushed : 1620897 Test query run time : 0.5150871276855520120811 03:22:01 Key_blocks_not_flushed : 1626296 Test query run time : 0.3300859928131120120811 03:23:02 Key_blocks_not_flushed : 1629474 Test query run time : 0.5076937675476120120811 03:24:02 Key_blocks_not_flushed : 1615890 Test query run time : 0.5435559749603320120811 03:25:02 Key_blocks_not_flushed : 1615890 Test query run time : 0.3887138366699220120811 03:26:01 Key_blocks_not_flushed : 1617865 Test query run time : 0.3771240711212220120811 03:27:02 Key_blocks_not_flushed : 1617323 Test query run time : 0.3522229194641120120811 03:28:01 Key_blocks_not_flushed : 1616188 Test query run time : 0.2303678989410420120811 03:29:41 Key_blocks_not_flushed : 1616713 Test query run time : 39.959482908249!!! MySQL Maintenance multiple copiesFlush occured, time : 22.11358690261820120811 03:31:01 Key_blocks_not_flushed : 2088 Test query run time : 0.1137628555297920120811 03:32:02 Key_blocks_not_flushed : 25602 Test query run time : 0.1705498695373520120811 03:33:01 Key_blocks_not_flushed : 35473 Test query run time : 0.1589679718017620120811 03:34:01 Key_blocks_not_flushed : 51337 Test query run time : 0.1690609455108620120811 03:35:01 Key_blocks_not_flushed : 67919 Test query run time : 0.2325990200042720120811 03:36:01 Key_blocks_not_flushed : 8413 Test query run time : 0.1932499408721920120811 03:37:02 Key_blocks_not_flushed : 13171 Test query run time : 0.1894078254699720120811 03:38:01 Key_blocks_not_flushed : 12972 Test query run time : 0.167176961898820120811 03:39:02 Key_blocks_not_flushed : 12964 Test query run time : 0.1297600269317620120811 03:40:01 Key_blocks_not_flushed : 41718 Test query run time : 0.1579930782318120120811 03:41:01 Key_blocks_not_flushed : 53476 Test query run time : 0.1629450321197520120811 03:42:01 Key_blocks_not_flushed : 63676 Test query run time : 0.1416990756988520120811 03:43:01 Key_blocks_not_flushed : 81924 Test query run time : 0.1917388439178520120811 03:44:02 Key_blocks_not_flushed : 116203 Test query run time : 0.09872603416442920120811 03:45:01 Key_blocks_not_flushed : 134443 Test query run time : 0.158867120742820120811 03:46:02 Key_blocks_not_flushed : 147924 Test query run time : 0.1986248493194620120811 03:47:01 Key_blocks_not_flushed : 156297 Test query run time : 0.20126295089722


            So it looks like I have a workaround for now, but I would still like to understand why?

            I also found this bug: http://bugs.mysql.com/bug.php?id=36556 and I think it might be related to what I am experiencing.

            Comment


            • #7
              The next thing I am going to try is to split key cache into 3 separate caches and store key cache for tables with heavy inserts separately. The idea is that "Key_blocks_not_flushed" will be smaller and maybe mysql will be able to flush the keys by itself.
              As far as i know, there are no performance metrics for custom key caches, so I will not really know the whole picture. Does anyone know how to obtain stats for custom key caches?

              Comment


              • #8
                Just an update, from the logs I can see now that performance degradation occurs when I have approximately 1.6 million key blocks not flushed ( happened 3 times already).
                Why that number? Any thoughts?

                Comment


                • #9
                  We've also encountered this issue on one of our production machines (H/W is similar to the one, described by the topic starter, with the only difference that the OS is running inside ESX VM).

                  mysql> flush tables;
                  Query OK, 0 rows affected (15 min 57.33 sec)

                  The FLUSH TABLES workaround did work, the whole operation took ~15 mins, however the server returned to the normal condition almost immediately after issuing the command. As a temporary solution I've set global flush_time to 1800 (make full flush every 3 hours).

                  Any suggestions? I can provide any extra information required to troubleshoot this.

                  Thanks in advance.

                  Comment


                  • #10
                    An update.
                    After splitting key cache into 3 parts, the problem still persists and performance drops at 1.6 Million blocks


                    20120814 03:21:02 Key_blocks_not_flushed : 1635384 Test query run time : 0.1908519268035920120814 03:22:01 Key_blocks_not_flushed : 1656557 Test query run time : 0.1006259918212920120814 03:23:01 Key_blocks_not_flushed : 1624868 Test query run time : 0.1470339298248320120814 03:24:02 Key_blocks_not_flushed : 1624868 Test query run time : 0.4183540344238320120814 03:25:02 Key_blocks_not_flushed : 1626411 Test query run time : 0.1844708919525120120814 03:26:01 Key_blocks_not_flushed : 1640527 Test query run time : 0.1888408660888720120814 03:27:01 Key_blocks_not_flushed : 1630911 Test query run time : 0.1034309864044220120814 03:28:06 Key_blocks_not_flushed : 1625262 Test query run time : 4.9216568470001Flush occured, time : 27.56007099151620120814 03:29:02 Key_blocks_not_flushed : 0 Test query run time : 0.1564400196075420120814 03:30:01 Key_blocks_not_flushed : 0 Test query run time : 0.07496213912963920120814 03:31:02 Key_blocks_not_flushed : 0 Test query run time : 0.2308518886566220120814 03:32:01 Key_blocks_not_flushed : 6194 Test query run time : 0.1582000255584720120814 03:33:02 Key_blocks_not_flushed : 6194 Test query run time : 0.1361639499664320120814 03:34:01 Key_blocks_not_flushed : 6194 Test query run time : 0.1661710739135720120814 03:35:01 Key_blocks_not_flushed : 6194 Test query run time : 0.1131398677825920120814 03:36:01 Key_blocks_not_flushed : 6194 Test query run time : 0.07656216621398920120814 03:37:01 Key_blocks_not_flushed : 8218 Test query run time : 0.07390999794006320120814 03:38:02 Key_blocks_not_flushed : 8218 Test query run time : 0.1435379981994620120814 03:39:01 Key_blocks_not_flushed : 8218 Test query run time : 0.1360759735107420120814 03:40:02 Key_blocks_not_flushed : 8218 Test query run time : 0.08897209167480520120814 03:41:01 Key_blocks_not_flushed : 8218 Test query run time : 0.13148593902588

                    Comment


                    • #11
                      meph,

                      Maybe you could post your problem here as well: http://bugs.mysql.com/bug.php?id=36556
                      That way we might get more traction with this issue.

                      Comment


                      • #12
                        One more observation.

                        It looks like MySQL tends to flush tables when deletes occur. In my db deletes are very infrequent and usually occur in bulk. Can it be the type of workload that exposes this bug? - frequent inserts/updates, no deletes.

                        Let me know what do you guys think.

                        Comment


                        • #13
                          Ok, so now I know how to reproduce this problem.

                          This happens on MyISAM tables with very large index > 2Gb with delay_key_write = 1 query mix: select/insert/update, but no deletes. This in turn will cause the key unflushed blocks grow to "unknown" size when MySQL performance will crap out (probably due to inefficient key cache use). Right now I take consistent performance hits when Key_blocks_not_flushed ~2.2Gb (this changed from 1.6Gb after I created separate key caches). Solution is to: delay_key_write = 0 for the table and take a performance hit, or flush the table periodically.

                          Comment

                          Working...
                          X