Online shard migration (OLM) at Facebook
Facebook has a large MySQL deployment with thousands of instances. Online shard migration (OLM) is a utility for moving a logical database from one MySQL master instance, A, to another master instance, B, with near-zero downtime. When the database moves from master A to master B, it also moves from the slaves of A to the slaves of B.
There are at least two use cases for OLM:
· Load balancing: To move a database from an overloaded instance to an underloaded instance.
· Consolidation: To move all the logical databases off an instance, thereby freeing up that instance and packing the logical databases onto fewer instances.
To use a utility like OLM at Facebook, we need not only the core OLM mechanism but also an OLM policy that determines which database to move to which destination, and an OLM runner that starts these migrations.
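To make the role of an OLM policy concrete, here is a minimal sketch of a load-balancing policy. It is illustrative only and is not Facebook's actual policy: the `Instance` class, the `plan_migration` function, the abstract "load units", and the 0.8 utilization threshold are all assumptions introduced for this example. An OLM runner would take the returned plan and invoke the OLM mechanism to carry it out.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical model of a MySQL master instance (not a real API)."""
    name: str
    capacity: float                              # assumed abstract load units
    shards: dict = field(default_factory=dict)   # logical database -> load

    @property
    def load(self):
        return sum(self.shards.values())

    @property
    def utilization(self):
        return self.load / self.capacity


def plan_migration(instances, high=0.8):
    """Pick one (database, source, destination) move, or None.

    Assumed policy: take the most utilized instance; if it is above the
    `high` threshold, move the smallest database that brings it back
    under the threshold to the least utilized instance.
    """
    source = max(instances, key=lambda i: i.utilization)
    if source.utilization <= high or not source.shards:
        return None  # nothing is overloaded

    dest = min(instances, key=lambda i: i.utilization)
    if dest is source:
        return None  # only one instance, nowhere to move

    # Load that must leave the source to get back under the threshold.
    excess = source.load - high * source.capacity
    by_load = sorted(source.shards.items(), key=lambda kv: kv[1])
    # Smallest database that covers the excess, else the largest one.
    shard, load = next(
        ((s, l) for s, l in by_load if l >= excess), by_load[-1]
    )

    # Do not create a new hotspot on the destination.
    if (dest.load + load) / dest.capacity > high:
        return None
    return shard, source.name, dest.name


hot = Instance("A", 100, {"db1": 50, "db2": 30, "db3": 15})
cold = Instance("B", 100, {"db4": 20})
plan = plan_migration([hot, cold])
```

A consolidation policy could follow the same shape, choosing the least utilized instance as the source and planning a move for every database on it.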
Most of the talk is about the OLM mechanism, but we also briefly discuss the OLM policy and the OLM runner.