November 26, 2014

Using Multiple Key Caches for MyISAM Scalability

I have written before – MyISAM Does Not Scale, or it does quite well – two main things stopping you is table locks and global mutex on the KeyCache.

Table Locks are not the issue for Read Only workload and write intensive workloads can be dealt with by using with many tables but Key Cache Mutex will still hunt you. If you aware of MySQL history you may think Key Cache scalability was fixed with new Key Cache in MySQL 4.1, and indeed previously it did not even scale with one CPU as global lock was held during IO duration, In MySQL 4.1 the lock is held only when key block (1KB to 4KB) is being copied from Key Cache to thread local buffer, which is terrible contention spot in particular on systems with many CPU cores.

Happily there is solution, or at least half of it.

If you have chosen a way of using Multiple Tables to solve Table Locks problem you can also use multiple Key Caches to reduce or virtually eliminate key cache contention. Too bad you can only map single table to single key cache – it would be so much more helpful if you could use multiple key caches for the same table, for example caching even/odd key blocks or something similar, or actually just keep hash of locks instead of one.

When you decide to use Multiple Key Caches the question is how many to use, what sizes to allocate and how to map tables to them. One simple solution I use – create separate key cache for all actively accessed tables (assuming there are only few of them), allocating key_cache proportional to their size and load, but no more than the index size (assuming table sizes are relatively static)

To get accurate information about table usage I will use Percona Patches:

For table sizes we can use traditional TABLES table:

Now with a bit of INFORMATION_SCHEMA magic and a bit of waiting on “efficient” Information Schema Query Execution (as you may guess we just need to join two previous results sets here) we can get the information about relative table index sizes and their relative use activity. I just summed rows modified and updated but you can surely use different formula if you like.

A bit more query hacking and we get a query which will return statements to initialize key buffers according to table sizes and activity (in this case taken with 50-50 weight though you may use other formula), while maintaining the restriction on the sum key buffer size (4000000000 in this case) and actual index size:

Pass this via SELECT INTO OUTFILE or pipe it to mysql directly as explained here to create key caches.

Now you can use much more simple command to assign tables to the key caches:

So going through complex or not multiple key cache creation exercise you probably wonder how much performance gains should you expect. In fact it can be very significant.

For CPU bound workload with 16 Cores Inserting data to about 20 tables I’ve seen performance gains as much as 10x compared to using single shared key cache of the same size.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Thomas B. says:

    Hi peter,

    thank you very much for all your valuable tips! I did not even think the KeyCache could be a problem.

    But I wonder how to find out if I am hit by this problem or not. Does “For CPU bound workload” mean that I should have a look at this if the CPU usage on my database server is high?

    Maybe it would make the articles even more useful if they started with “You should look at this if you have this problem: …”.

    Best Regards

    Thomas B.

  2. peter says:

    Thomas,

    It is hard to explain all the cases when one or another problem may be affecting you. I try best I can when possible. You are also always welcome to hire us to help you to understand what problem you’re really running into.

  3. Abhishek says:

    Hi Peter,

    How can we monitor performance of multiple key caches, like for monitoring global key cache we use “show status like ‘key_read%'; ” but how i can see same information for new created key cache.

  4. John Dzilvelis says:

    To view the usage of the individual key buffers, I use “mysqladmin debug” and then view the error log.

  5. Allen D. says:

    Hi, Peter:

    First off, your blog is an AWESOME resource and I want to thank you for it. I have found the intel contained herein of great value.

    I just wanted to post and confirm that smart use of multiple key caches for MyISAM tables can *dramatically* increase performance in high write applications where you have very little reads (think master tables). The MySQL documentation is not quite as informative as I think it should be for this feature – they suggest creating 3 key caches without really talking about the flexibility you can have. While the MySQL docuementation appraoch may help some you can definitely squeeze more performance with N key caches as you describe.

    We have a very high-write environment that uses MyISAM for a number of reason (too much history to go into). Before tinkering with multiple key caches I was able to only get 6k – 7k QPS (really all writes). This was not utilizing the full capacity of the hardware we were running this particular master on. We took the time to examine and segment the tables that were candidates for giving their own key cache too. Our approach was not one cache per table but one per group of tables that are typically updated together by a single process. We choose this route really for manageability (we have a large warehouse with lots of daily tables). We were able to size key caches easily based on index sizes and we now routinely see 15k-20k QPS now! I have seen it go as high as 40k with the current configuration. The good news is the more ram and processor you can throw at your writes, the further you can segment your key caches. Obviously there will be upper bounds but we haven’t even come close to hammering them yet.

    As Peter suggests, take time and look at your operational posture and your index sizes and you will be shocked at how much more you can get out of MyISAM. It does not solve all the locking contention but if you have good table designs for your operational posture MyISAM can fly.

  6. Ted Osadchuk says:

    I experimented with separate key caches on a very busy instance and cpu intensive queries and did not see ANY improvement using separate key caches

    This is yet another tuning knob that’s completely useless!

  7. peter says:

    Ted,

    It did not work for you but it works in a lot of other cases. CPU bound load does not mean it it bottleneck on the key cache mutex.

    Also note there is another solution in the meanwhile. MariaDB has got fix to the key cache splitting this mutex to several hashes. You may want to try that.

  8. Allen D. says:

    Ted:

    I would agree with Peter. We do not have anything CPU-bound rather all concurrency and IO. The posture of the servers in question is 90% writes and NO deletes. Reads are filled by memcacxhe with the DB as backup if the cache for a record has expired. Without the key cahces performance is abysimal! In our scenario, having the right indexes (read primary and a unique key only) and memory allocated to each key cache helps concurrency a ton.

Speak Your Mind

*