A few weeks ago I was asked to isolate some functionality from Mark Callaghan’s MySQL patch bundle: extensions adding per-user and per-table accounting to the database, accessible through a new set of commands such as SHOW TABLE_STATISTICS, SHOW INDEX_STATISTICS and SHOW USER_STATISTICS. The first two will interest anyone who wants to periodically check which data or indexes are the most active, or which are not used at all and could be candidates for dropping. All the patch features will surely be a great help to hosting providers producing detailed reports on how each customer utilizes the database.
Having many different users running their queries on a single database instance means they will constantly be “battling” for resources, each caring only about their own application and completely ignoring the fact that they chose shared hosting. Of course, this alone is not a problem; in theory, if you don’t over-allocate, the users should never become conscious of the struggle happening underneath.
Obviously, that only holds in an ideal world: it is either impossible or, at best, economically inefficient to allocate an entire machine to only a few customers simply because their SQL skills are limited.
Real-life customers generate very uneven load – some barely any, some a lot. SQL performance is tricky: it is very easy to produce large sets of data for the hardware to process, even from small tables, so the resulting database load is often not directly related to a customer’s web traffic. The load also changes over time, mostly growing, often spiking. One way or another, after a while every server becomes too tight for everyone. It is best to be pro-active: observe the changes as they happen and react before any user starts complaining. You can watch the load grow and, based on that, decide to move half of the users to newly purchased hardware. But how do you avoid merely sorting the least active customers away from the heavier ones?
It is probably not uncommon for someone to upload a really badly designed application that runs queries which use no indexes or examine a lot of rows. This can elevate the load, but it can just as easily bring the server down. You can enable the slow query log and review it occasionally, but doing that offline means you cannot react in real time to a critical situation. You can build some sophisticated mechanism to browse the process list or scan the slow log live, but that is a rather ugly solution.
That is where this patch can be very useful. With a simple MySQL SHOW command, it gives you basic statistics on database activity broken down by user, table or index. Polling those numbers every minute shows you changes in usage per customer and immediately catches any spikes as they happen, pinpointing the source so you can intervene. Looking at totals over a longer period lets you identify the users causing the most traffic, which helps with strategic decisions.
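A minimal sketch of that minute-by-minute monitoring idea: take two successive SHOW USER_STATISTICS snapshots (here hard-coded as dicts mirroring the command’s columns), compute the per-user counter increases, and flag anyone whose activity jumped. The threshold value and snapshot data are illustrative assumptions, not part of the patch.

```python
# Hypothetical sketch: detect per-user activity spikes between two
# SHOW USER_STATISTICS snapshots taken one polling interval apart.
# Snapshot dicts mirror the columns of the patched SHOW command;
# the threshold is an arbitrary example value.

def user_deltas(prev, curr):
    """Return per-user counter increases between two snapshots.

    Each snapshot maps user name -> dict of cumulative counters
    (e.g. Rows_fetched, Select_commands)."""
    deltas = {}
    for user, counters in curr.items():
        before = prev.get(user, {})
        deltas[user] = {k: v - before.get(k, 0) for k, v in counters.items()}
    return deltas

def spiking_users(deltas, counter="Rows_fetched", threshold=100000):
    """Users whose counter grew by more than threshold in one interval."""
    return [u for u, d in deltas.items() if d.get(counter, 0) > threshold]

prev = {"hosting2": {"Rows_fetched": 7861, "Select_commands": 1759}}
curr = {"hosting2": {"Rows_fetched": 250000, "Select_commands": 9000}}
deltas = user_deltas(prev, curr)
print(spiking_users(deltas))  # hosting2 fetched ~242k rows in one interval
```

In a real cron or daemon script, the snapshots would come from running the SHOW command through a MySQL client and parsing its output.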
Sample outputs:
mysql> SHOW USER_STATISTICS LIKE 'hosting2'\G
*************************** 1. row ***************************
                  User: hosting2
     Total_connections: 84
Concurrent_connections: 0
        Connected_time: 20
             Busy_time: 1
          Rows_fetched: 7861
          Rows_updated: 240
       Select_commands: 1759
       Update_commands: 254
        Other_commands: 85
   Commit_transactions: 0
 Rollback_transactions: 0
mysql> SHOW TABLE_STATISTICS LIKE 'hosting2%';
+--------------------------------+-----------+--------------+-------------------------+
| Table                          | Rows_read | Rows_changed | Rows_changed_x_#indexes |
+--------------------------------+-----------+--------------+-------------------------+
| hosting2.wp_terms              |       335 |            2 |                       4 |
| hosting2.wp_comments           |        12 |            1 |                       3 |
| hosting2.wp_options            |      6586 |          200 |                     400 |
| hosting2.wp_postmeta           |         5 |            2 |                       6 |
| hosting2.wp_posts              |       542 |            4 |                      12 |
| hosting2.wp_term_relationships |       720 |            9 |                      18 |
| hosting2.wp_users              |       109 |            1 |                       3 |
| hosting2.wp_links              |       910 |            7 |                      21 |
| hosting2.wp_term_taxonomy      |       729 |            3 |                       6 |
| hosting2.wp_usermeta           |       427 |            6 |                      18 |
+--------------------------------+-----------+--------------+-------------------------+
10 rows in set (0.00 sec)
mysql> SHOW INDEX_STATISTICS LIKE 'hosting2%';
+--------------------------------------------+-----------+
| Index                                      | Rows_read |
+--------------------------------------------+-----------+
| hosting2.wp_options.option_name            |       287 |
| hosting2.wp_usermeta.meta_key              |         8 |
| hosting2.wp_term_relationships.PRIMARY     |       720 |
| hosting2.wp_postmeta.post_id               |         1 |
| hosting2.wp_links.link_visible             |       455 |
| hosting2.wp_terms.PRIMARY                  |        67 |
| hosting2.wp_posts.type_status_date         |       459 |
| hosting2.wp_term_taxonomy.term_id_taxonomy |       204 |
| hosting2.wp_posts.PRIMARY                  |         9 |
| hosting2.wp_term_taxonomy.PRIMARY          |         5 |
| hosting2.wp_usermeta.user_id               |       413 |
+--------------------------------------------+-----------+
11 rows in set (0.00 sec)
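One practical use of the INDEX_STATISTICS output: any index defined in the schema that never shows up there has served no reads since the counters were last flushed, making it a candidate for dropping. A small sketch of that comparison, with illustrative index names (the `post_author` index below is a made-up example, not from the output above):

```python
# Hypothetical sketch: find indexes that never appear in the
# SHOW INDEX_STATISTICS output, i.e. have served no reads since
# the counters were last flushed. Index names are illustrative.

def unused_indexes(all_indexes, read_stats):
    """all_indexes: every index defined in the schema, as
    'schema.table.index' strings; read_stats: {index: rows_read}
    parsed from SHOW INDEX_STATISTICS."""
    return sorted(ix for ix in all_indexes if ix not in read_stats)

defined = {
    "hosting2.wp_posts.PRIMARY",
    "hosting2.wp_posts.type_status_date",
    "hosting2.wp_posts.post_author",   # assumed index with no reads
}
read = {
    "hosting2.wp_posts.PRIMARY": 9,
    "hosting2.wp_posts.type_status_date": 459,
}
print(unused_indexes(defined, read))  # ['hosting2.wp_posts.post_author']
```

To be meaningful, the check should of course run only after the counters have accumulated over a representative period.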
The statistics for tables and indexes can be reset with the corresponding FLUSH command (e.g. FLUSH INDEX_STATISTICS).
All of that can be fed into scripts that generate automated reports or perform automated actions.
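As a starting point for such scripts, here is a sketch that parses the ASCII table a mysql client prints (e.g. from `mysql -t -e "SHOW TABLE_STATISTICS LIKE 'hosting2%'"`) into dictionaries and picks out the busiest table. The parsing approach and sample data are my own illustration, not part of the patch:

```python
# Hypothetical sketch: parse the ASCII table printed by the mysql
# client into a list of dicts keyed by column name, then report
# the table with the most rows read.

def parse_mysql_table(text):
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ separator lines and trailing status
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells  # first |...| line is the column header
        else:
            rows.append(dict(zip(header, cells)))
    return rows

sample = """
+----------------------+-----------+--------------+
| Table                | Rows_read | Rows_changed |
+----------------------+-----------+--------------+
| hosting2.wp_options  |      6586 |          200 |
| hosting2.wp_links    |       910 |            7 |
+----------------------+-----------+--------------+
"""

stats = parse_mysql_table(sample)
busiest = max(stats, key=lambda r: int(r["Rows_read"]))
print(busiest["Table"])  # hosting2.wp_options
```

From there it is a small step to e-mailing a daily top-N report, or firing an alert when a counter exceeds some limit.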
Personally, I would like to see a few more things implemented in the patch.
Also, combining these new commands with our msl patch, which introduces advanced query logging, would give the administrator a powerful tool to almost effortlessly catch users doing bad things on the database and point them to the specific problems they have (e.g. a particular query using full joins).
If you want the statistics patch mentioned in this article, Bluehost.com CEO Matt Heaton has published it on his blog.