"Shard early, shard often"

November 16, 2009
Author
Morgan Tocker

I wrote a post a while back on why you don’t want to shard. In that post I tried to explain that hardware advances, such as 128G of RAM becoming cheap, are changing the point at which you need to shard, and that the (often omitted) operational issues created by sharding can be painful.

What I didn’t mention was this: if you’ve established that you will eventually need to shard, is it better to just get it out of the way early? My answer is almost always no. That is to say, I disagree with a statement I’ve been hearing recently: “shard early, shard often”. Here’s why:

  • Focusing on query/index/schema optimization can often yield an order of magnitude better performance. The gains from sharding are usually much lower.
  • If you shard first and then decide you want to tune queries, indexes, or schema to reduce server count, you find yourself in a more difficult position, since you have to apply your changes across all servers.

Or to phrase that another way:
I would never recommend sharding to a customer until I had at least reviewed their slow query log with mk-query-digest and understood exactly why each of the queries in that report was slow. While we have some customers who have managed to create their own tools for shard automation, it’s always easier to propose major changes to how data is stored before you have a cluster of 50+ servers.
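To make the slow-log review concrete, here is a minimal Python sketch of the kind of grouping a digest report gives you: queries that differ only in their literal values are collapsed into one pattern and ranked by total time. This is not how mk-query-digest works internally. The function names `fingerprint` and `digest` are hypothetical, the normalization is deliberately crude, and the sample log entries are invented for illustration.

```python
import re
from collections import defaultdict

# Matches the "# Query_time: N" header line of a MySQL slow query log entry.
QUERY_TIME = re.compile(r"^# Query_time: (\d+(?:\.\d+)?)")


def fingerprint(sql):
    """Collapse literals so queries differing only in values group together."""
    sql = sql.strip().rstrip(";").lower()
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)  # quoted strings
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "?", sql)  # numeric literals
    return re.sub(r"\s+", " ", sql)


def record(totals, query_time, statement):
    """Add one finished log entry to the running totals."""
    if query_time is not None and statement:
        entry = totals[fingerprint(" ".join(statement))]
        entry[0] += 1
        entry[1] += query_time


def digest(lines):
    """Return (pattern, count, total_seconds) tuples, slowest pattern first."""
    totals = defaultdict(lambda: [0, 0.0])
    query_time, statement = None, []
    for line in lines:
        line = line.strip()
        match = QUERY_TIME.match(line)
        if match:
            # A new header closes the previous entry.
            record(totals, query_time, statement)
            query_time, statement = float(match.group(1)), []
        elif (line and not line.startswith("#")
              and not line.lower().startswith(("set timestamp=", "use "))):
            statement.append(line)
    record(totals, query_time, statement)
    rows = [(fp, count, secs) for fp, (count, secs) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Invented sample entries, for illustration only.
sample = """\
# Time: 091116 10:00:00
# User@Host: app[app] @ localhost []
# Query_time: 2.5  Lock_time: 0.0 Rows_sent: 1  Rows_examined: 100000
SET timestamp=1258365600;
SELECT * FROM users WHERE id = 42;
# Query_time: 1.5  Lock_time: 0.0 Rows_sent: 1  Rows_examined: 100000
SELECT * FROM users WHERE id = 7;
# Query_time: 0.5  Lock_time: 0.0 Rows_sent: 10  Rows_examined: 10
SELECT name FROM tags WHERE label = 'mysql';
""".splitlines()

for pattern, count, seconds in digest(sample):
    print(f"{seconds:8.2f}s  {count:4d}x  {pattern}")
```

The patterns at the top of such a report are the ones worth understanding before any sharding decision. A real review should use the actual tool, which handles far more of the log format than this sketch does.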
