MySQL Sharding - Practical and Hands-On Lab

Trends in Architecture and Design
3 December 09:00 - 12:00 @ Kensington Suite

When a component of your organization grows from medium to large in size or volume, a series of performance problems must be addressed. Sharding the cluster of MySQL databases is almost always the proper solution, but there are many different ways of sharding. This tutorial goes over some of them with working examples from high-volume shops, and by the end you will be able to build your own large-scale MySQL cluster from scratch and administer it using industry-standard tools.

Some of the tools that will be useful in sharding:

  1. Vitess/Vtocc - from YouTube, this does query rewriting and multiplexing.
  2. Gizzard - a sharding software framework for rolling your own distributed database on existing infrastructure components.
  3. Jetpants - an automation toolkit for handling large/complex MySQL replication topologies.

In this tutorial, we'll build a large, realistically configured MySQL "cluster" made up of multiple pools and shards. We'll fill the shards with many gigabytes of data and perform a shard split under load. Participants will be given a step-by-step guide to doing this themselves, either during the tutorial along with the speaker or afterward in their own labs at home.
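
To make the idea of a shard split concrete, here is a minimal sketch in Python. It assumes range-based sharding on an integer key, which is only one of the many ways of sharding, and it is not the API of any tool covered in the lab. The names `Shard`, `ShardMap`, and the pool names are hypothetical. It models only the final change to the key-to-pool mapping; copying data to the new pools and cutting over while under load is the hard part that the lab itself walks through.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    min_id: int  # inclusive lower bound of the sharding key
    max_id: int  # inclusive upper bound of the sharding key
    pool: str    # hypothetical name of the pool that holds this key range


class ShardMap:
    """Maps a sharding key to the pool that owns it (range-based, illustrative only)."""

    def __init__(self, shards):
        self.shards = sorted(shards, key=lambda s: s.min_id)
        # Ranges must be contiguous so that every key has exactly one owner.
        for left, right in zip(self.shards, self.shards[1:]):
            if left.max_id + 1 != right.min_id:
                raise ValueError("shard ranges must be contiguous and non-overlapping")

    def lookup(self, key):
        """Return the shard whose range contains key."""
        starts = [s.min_id for s in self.shards]
        i = bisect_right(starts, key) - 1
        if i < 0 or key > self.shards[i].max_id:
            raise KeyError(key)
        return self.shards[i]

    def split(self, pool, split_at, child_pools):
        """Replace one shard with two children that cover the same key range."""
        matches = [s for s in self.shards if s.pool == pool]
        if not matches:
            raise KeyError(pool)
        parent = matches[0]
        if not parent.min_id < split_at <= parent.max_id:
            raise ValueError("split point must fall inside the parent's range")
        low = Shard(parent.min_id, split_at - 1, child_pools[0])
        high = Shard(split_at, parent.max_id, child_pools[1])
        i = self.shards.index(parent)
        self.shards[i:i + 1] = [low, high]
        return low, high


if __name__ == "__main__":
    shard_map = ShardMap([Shard(1, 1000, "pool-a"), Shard(1001, 2000, "pool-b")])
    print(shard_map.lookup(1500).pool)
    shard_map.split("pool-b", 1500, ("pool-b1", "pool-b2"))
    print(shard_map.lookup(1499).pool, shard_map.lookup(1500).pool)
```

Keeping the ranges contiguous means a split never changes which keys are covered, only which pool serves them, so application lookups keep working before and after the mapping is swapped.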

HAProxy, Vitess/Vtocc, and Gizzard will be covered at a high level, and Jetpants will be used for the hands-on portions of the lab.

Speakers

Tim Ellis
CTO, PalominoDB
Speaker Biography: 
Tim Ellis presided over some of the larger MySQL installations during the 2000-2011 timeframe at such places as Digg, Mozilla, Riot Games (League of Legends), and StumbleUpon. Running large-scale database installations using Opsdev methodologies was his specialty. In the last 5 years, he has begun specialising in building hybrid database clusters, using MySQL for relational data and various distributed databases (often called "NoSQL") to store key/value or sparse hash map data. He has always been a strong proponent of using the right tool for the job, which can sometimes be a surprising and radical notion.

