September 21, 2014

More Gotchas with MySQL 5.0

Working on large upgrade of MySQL 4.1 running Innodb to MySQL 5.0 and doing oprofile analyzes we found very interesting issue of buf_get_latched_pages_number being responsible for most CPU usage. It did not look right.

The close look revealed this is the function which is used to compute number of latched pages in Innodb Buffer Pool, which is called when you run SHOW INNODB STATUS and SHOW STATUS. In this case for same of monitoring SHOW GLOBAL STATUS was run every couple of seconds being considered fast operation, as it is in MySQL 4.1

But why buf_get_latched_pages_number ? Because this function traverses whole buffer pool to compute number of latched pages, which is quite slow with buffer pools containing millions of pages. But even worse this is done while holding global buffer pool mutex.

Another function which we’ve seen being responsible for high CPU usage is ha_print_info which also may traverse large arrays.

But this is only one of the gotchas. The second issue we ran into is number of Threads_Running being much higher for MySQL 5.0 compared to MySQL 4.1 It was something like 2-3 for MySQL 4.1 vs 40-60 for MySQL 5.0
At the same time however there were virtually no disk IO and CPU being loaded less than 20% (out of 4 Cores) in all cases which points to some locks/latches being responsible for this high amount of threads.

It turns out this issue is side effect of the first issue described – if many queries are being run and buffer pool is large, a lot of queries pile up waiting on locked innodb buffer pool mutex, so SHOW STATUS shows high number of running threads.

This theory can be confirmed by the fact using mysqladmin processlist | grep -v Sleep shows much smaller value which is close in MySQL 5.0 and 4.1

Hopefully this issue will be fixed sooner or later (I would be even happy to give up Innodb_buffer_pool_pages_latched n SHOW STATUS output

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Paul Tuckfield says:

    I’ve seen this exact effect in our environment, but never attempted to explain it. nice analysis as always.

    I bet changing the function you pointed to will fix this problem, but it sure would be nice to be selective about what we’re looking at in some way. If “show variables like” did a more complete where clause implementation, this and future problems could be worked around.

    If it makes you feel any better, i can tell you that this happens in oracle too when examining various statistics (oracle style, select * from v$whatever) under high traffic. It starts to feel like the heisenburg principle. But like I said above, the more detailed oracle stats that result in more disruptive high traffic serialization can generally be avoided since they’re not all under one command.

  2. peter says:

    Thanks Paul,

    Yes unfortunately SHOW STATUS LIKE does not help here, while it surely could be implemented to lookup only variable(s) you’re trying to look at.

    There is similar problem with SHOW VARIABLES which are somewhat slow in 5.0 (and JDBC used to run it on every connection) but VARIABLES are at least accessible as SELECT @@key_buffer_size which is fast. Status variables are not.

  3. Bill says:

    Thanks for finding this issue. Has this been reported to Mysql in a bug report? I don’t seem to be able to find it. I was thinking of using SHOW STATUS for innodb in 5.0, and I’d like to monitor the progress of this issue.

  4. peter says:

    I have not submitted it as a Bug yet but sent Heikki email. Will see what he comments :)

  5. SHOW STATUS can provide an alternative to LIKE ‘innodb%’ using WHERE. For example:

    mysql> show global status where variable_name in (‘Com_insert’,’Innodb_buffer_pool_pages_latched’,’threads_running’);

    The downside is you loose the wildcard capability, however you can string commands together with a like

    mysql> show global status where variable_name like ‘innodb%’ or variable_name like ‘com%';

    mysql> select @@version;
    +———–+
    | @@version |
    +———–+
    | 5.0.37 |
    +———–+

  6. peter says:

    Ronald,

    How does this work internally ? If I look at threads_running is only that variable is being queried or does it populate everything and just throws the other stuff away ?

    Looking at a lot of stuff with information schema it is not always the way you would think it should be implemented.

  7. Peter,

    I’m not certain. I suspect you are correct, however I do need to know myself, I’ll add it to my long list to investigate just what’s happening under the covers (if only had more time)

    Ronald

  8. Vasil Dimov says:

    Peter,

    Can you change the body of buf_get_latched_pages_number() function to be something like

    buf_get_latched_pages_number()
    {
    return(13);
    }

    and see how much things improve? While this will come up with wrong result (obviously) it will give a good hint if it is worth having the number of latched pages in a “ready to read” variable, without having to really implement it.

    Thanks!

  9. peter says:

    Vasil,

    I think this is similar to what Google does in their patches – disabling this feature.

  10. wow gold says:

    Thanks for finding this issue. Has this been reported to Mysql in a bug report? I don’t seem to be able to find it. I was thinking of using SHOW STATUS for innodb in 5.0, and I’d like to monitor the progress of this issue.

  11. Wow. The bug is still there even in the 5.1.23 InnoDB plugin. I have filed a bug — http://bugs.mysql.com/bug.php?id=36600. I have changed my builds to not call buf_get_latched_pages_number. Otherwise, the CPU gets saturated when SHOW STATUS commands are run.

  12. peter says:

    Mark,

    I’m not at all surprised :)

    Innodb developers are very disconnected from MySQL user base and customer base. I do not think there are guys which deal with large Innodb installations on daily basics in Innodb team and Innodb/Oracle does not have Support or Consulting teams which could work close with developers on issues like this.

    I mean of course people report bugs like this but I do not think their importance is felt by hart unless you deal with it in production.

  13. Vasil Dimov says:

    Can someone give an estimate how much SHOW STATUS improves when buf_get_latched_pages_number() is disabled?

    Thanks!

  14. All,

    we are going to remove the loop to calculate buf_get_latched_pages_number from the standard build of mysqld. We may keep it in the UNIV_DEBUG build.

    Since SHOW INNODB STATUS\G prints all reserved latches, people can get the information, and in a much more detailed form, from SHOW INNODB STATUS\G.

    ###

    Concerning the “disconnectedness of InnoDB developers” from the MySQL user base, InnoDB developers look at bugs.mysql.com every day. And consulting companies like Percona exist, who give us feedback :).

    Regards,

    Heikki

  15. peter says:

    Vasil,

    I think it is not the time SHOW INNODB STATUS take which is important but the fact other queries starve waiting on innodb buffer pool which can be significant. I’ve seen hundreds of queries to be queued this way which then released at once and cause significant resource competition.

  16. peter says:

    Heikki,

    Thanks this is good change.
    I know you guys read bugs and listen to our feedback but it is not exactly the same as dealing with Innodb in production.

  17. Vasil Dimov says:

    Hmm,

    Peter, do you mean SHOW STATUS instead of SHOW INNODB STATUS?
    buf_get_latched_pages_number() is called when SHOW STATUS is executed, it is not called when SHOW INNODB STATUS is executed.

    I have prepared a simple patch that removes the Innodb_buffer_pool_pages_latched variable from SHOW STATUS.

  18. peter says:

    Oh yes indeed.

  19. Vasil Dimov says:

    Patch that removes the variable Innodb_buffer_pool_pages_latched from SHOW STATUS has been committed in InnoDB 5.0 and 5.1, follow the bug report: http://bugs.mysql.com/36600

  20. Laine says:

    Has this been addressed in the percona binaries?

  21. Vadim says:

    Laine,

    Yes, and actually it is fixed in mysql binaries also (5.0.80+)

Speak Your Mind

*