Note: This blog post is part 1 of 4 on building our training workshop.
The Percona training workshop will not cover sharding. If you follow our blog, you’ll notice we don’t talk much about the subject; in some cases it makes sense, but in many we’ve seen that it causes architectures to be prematurely complicated.
So let me state it: You don’t want to shard.
Optimize everything else first, and then if performance still isn’t good enough, it’s time to take a very bitter medicine. The need to shard basically comes down to one of two reasons:
(Yes, I am simplifying some of the scalability issues with MySQL on big machines, but I have faith that Yasufumi is making this better.)
Despite my cautions, if you have established that you need to shard, there are quite a few options available to you:
(Tip: There are a few famous cases of both (a) bad hashing algorithms and (b) users becoming unequal all of a sudden. You don’t want to shard based on the first character of a username, as there will be a lot more ‘M’ than ‘Z’. For users becoming unequal all of a sudden, it’s always interesting to think of what scaling challenges Flickr would have had with the official Obama photographer in the lead-up to the ’08 election.)
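To make the tip about bad hashing concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the shard count, the function names, and the made-up usernames skewed toward ‘m’. It compares first-character sharding with hashing the whole username.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical shard count, chosen only for the example


def shard_by_first_char(username: str) -> int:
    """Naive scheme: map the first letter of the username onto a shard."""
    return (ord(username[0].lower()) - ord("a")) % NUM_SHARDS


def shard_by_hash(username: str) -> int:
    """Hash the whole key with a stable digest, then reduce modulo the shard count."""
    digest = hashlib.md5(username.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Made-up sample data: many more names starting with 'm' than with 'z'.
usernames = [f"m_user{i}" for i in range(800)] + [f"z_user{i}" for i in range(200)]

first_char_counts = Counter(shard_by_first_char(u) for u in usernames)
hash_counts = Counter(shard_by_hash(u) for u in usernames)

print("first character:", sorted(first_char_counts.items()))
print("whole-key hash: ", sorted(hash_counts.items()))
```

With this sample, the first-character scheme puts every ‘m’ user on one shard and leaves two shards empty, while the whole-key hash spreads users across all four. Note that hashing only evens out the number of keys per shard; it does nothing for case (b), where a single user suddenly generates far more load than the rest.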
(Note: I’ve left out some of the more complicated sharding architectures. For example, another solution is to have all shards store fragments of data, and to cross-backup those fragments across shards.)
The complexity comes down to two reasons:
I think that a lot of people remember (1), but (2) can be a real pain point. It can take a lot of work to build an application that works correctly when you are rolling through an upgrade where the schema will not be the same on all nodes. A lot of these tasks remain only semi-automated, so from an operations perspective there can often be a lot more work to be done.
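As a rough illustration of what “works correctly during a rolling upgrade” can mean for application code, here is a small Python sketch. It is not how any particular application does it: in-memory SQLite databases stand in for MySQL shards, and the `users` table, the `email` column, and the helper names are all hypothetical. On MySQL, the column check would more likely be a query against `information_schema.columns`.

```python
import sqlite3


def make_shard(upgraded: bool) -> sqlite3.Connection:
    """Create a stand-in shard; only upgraded shards have the new column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    if upgraded:
        conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
    return conn


def has_column(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Check the live schema; PRAGMA table_info returns the column name at index 1."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def insert_user(conn: sqlite3.Connection, user_id: int, name: str, email=None) -> str:
    """Write using whichever schema this shard currently has."""
    if has_column(conn, "users", "email"):
        conn.execute(
            "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
            (user_id, name, email),
        )
        return "new"
    # Shard not upgraded yet: drop the field rather than fail the write.
    conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))
    return "old"


# Mid-upgrade: one shard already migrated, one still on the old schema.
shards = [make_shard(upgraded=True), make_shard(upgraded=False)]
results = [insert_user(shard, 1, "alice", "alice@example.com") for shard in shards]
print(results)
```

Even this toy version shows the cost: every write path needs two branches for the duration of the upgrade, and data written to the old-schema shard has to be backfilled once that shard catches up.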
This concludes Part 1 – I hope I’ve justified why we are not covering sharding. In Part 2, I will write about something that is going to be in the course – “XtraDB: The top 10 enhancements”, and in Part 3 “XtraDB: The top 10 parameters”.