Scaling MySQL – A Good Problem to Have

September 23, 2022

Author

Vadim Tkachenko

When you develop an application, you expect success, and success often brings growth problems. These problems show up especially in data storage: the stateful layer is not as easy to scale as the stateless parts of the application.

There are several stages of approaching database scalability:

1. Configuration and query optimization. This step can help a lot, and I would recommend a recent book by Daniel Nichter, “Efficient MySQL Performance: Best Practices and Techniques”, which goes into this topic.

2. If #1 is done and you continue to push the limits of your database, the next step is to improve the hardware: adding extra memory, improving storage throughput (regular SSD, NVMe storage layer, etc.), or increasing the size of the cloud instances (this is what I call “optimization by credit card”). This typically helps, but only up to a point. There is only so much memory or storage you can put into a server before it becomes very expensive very quickly, never mind the physical limits of how much memory fits into a single server.

3. Distributing the workload across multiple servers once the limit of a single server is reached. When the workload is read-intensive, this can typically be solved with regular MySQL source->replica replication. However, if you need to scale writes, it becomes quite complicated.
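To make the read-scaling case concrete, here is a minimal sketch of read/write splitting on top of source->replica replication. The class name, host names, and the "starts with SELECT" rule are all illustrative assumptions rather than part of any MySQL client library. A real router would also need to handle replication lag, transactions, and statements such as SELECT ... FOR UPDATE, which must go to the source.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: writes go to the source, reads rotate over replicas."""

    def __init__(self, source: str, replicas: list[str]):
        self.source = source
        # Fall back to the source if no replicas are configured.
        self._replicas = itertools.cycle(replicas or [source])

    def target_for(self, sql: str) -> str:
        # Naive classification (an assumption): only statements that start
        # with SELECT count as reads. Everything else is sent to the source.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.source


# Placeholder host names, for illustration only.
router = ReadWriteRouter("source-db", ["replica-1", "replica-2"])
read_host = router.target_for("SELECT * FROM payments WHERE id = 1")
write_host = router.target_for("UPDATE payments SET amount = 10 WHERE id = 1")
```

Adding replicas increases read capacity, but every write still lands on the single source, which is why this pattern does not help with write scaling.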

I would like to provide an excerpt from the Square Cash Engineering blog post “Sharding Cash” (Square Corner Blog, squareup.com), which describes their experience:

“Cash was growing tremendously and struggled to stay up during peak traffic. We were running through the classic playbook for scaling out: caching, moving out historical data, replica reads, buying expensive hardware. But it wasn’t enough. Each one of these things bought us time, but we were also growing really fast and we needed a solution that would let us scale out infinitely. We had one final item left in the playbook.”

Before jumping to the solution that Square Cash is using, I should mention the traditional approach to sharding (our name for splitting a MySQL workload into multiple smaller pieces, each on its own server): sharding at the application level. Basically, the application decides which server to use to execute a query. However, there is a constantly increasing need to separate sharding logic from the application, so that developers can focus on creating business outcomes rather than solving database workload distribution problems again and again. Therefore, the “distribute workload” logic needs to move from the application level to middleware or even to the database level.
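As an illustration of what application-level sharding looks like in practice, here is a minimal sketch in which the application picks a server from a sharding key. The shard host names, the choice of `customer_id` as the key, and the hash-modulo scheme are assumptions made for this example; they do not describe how Square Cash or any particular application does it.

```python
import hashlib

# Hypothetical shard map: placeholder host names, not real servers.
SHARDS = [
    "mysql-shard-0.example.internal",
    "mysql-shard-1.example.internal",
    "mysql-shard-2.example.internal",
    "mysql-shard-3.example.internal",
]


def shard_for(customer_id: int, shards: list[str] = SHARDS) -> str:
    """Return the host assumed to own all rows for this customer_id."""
    # Use a stable hash so every application process maps a given
    # customer to the same shard.
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[bucket]


def route(customer_id: int, sql: str) -> tuple[str, str]:
    """Pair a query with the shard it should run on (no connection is opened)."""
    return shard_for(customer_id), sql


host, query = route(42, "SELECT balance FROM accounts WHERE customer_id = 42")
```

Even in this small form, the routing rule lives inside application code: every query path has to know the sharding key, and changing the number of shards remaps most keys. That is the kind of logic that benefits from moving into middleware.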

Middleware that helps with these problems is already available (and this is what Square Cash is using): Vitess, a database clustering system for horizontal scaling of MySQL.

Vitess is open source software that works like a proxy and helps to distribute MySQL workload across multiple servers. For more details about the Vitess architecture, I refer you to the Sugu Sougoumarane presentation (2019-sugu-highload.pdf, vitess.io) and to our introduction, “Introduction to Vitess on Kubernetes for MySQL – Part I of III” (Percona Database Performance Blog).

Originally, Vitess was developed to scale YouTube, although I’ve heard recent rumors that Google is migrating YouTube to a different backend. Later, it was adopted not only by Square Cash but also by companies like Slack, Pinterest, GitHub, and HubSpot. They are all looking to solve MySQL scalability problems.

I wrote this blog post to collect your opinions: