In this blog post, we will discuss how to migrate from a replica set to a sharded cluster.
Before moving on to the migration, let me briefly explain replication and sharding, and why we need to shard a replica set.
Replication: It creates additional copies of the data and allows automatic failover to another node if the primary goes down. It can also help scale reads, provided the application can tolerate data that may not be the latest.
Sharding: It allows horizontal scaling of writes by partitioning data across multiple servers using a shard key. It is important to understand that the shard key is critical to distributing the data evenly across shards.
We need sharding when a single replica set can no longer keep up, typically because the data set outgrows the storage of a single server, the working set no longer fits in memory, or write throughput exceeds what a single primary can handle.
A sharded cluster includes two additional components: config servers and query routers (mongos).
Config servers: They store the metadata for the sharded cluster. The metadata comprises the list of chunks on each shard and the ranges that define those chunks, and it reflects the state of all the data and components within the cluster.
Query routers (mongos): They cache that metadata and use it to route read and write operations to the appropriate shards. They also refresh the cache whenever the cluster metadata changes, for example after a chunk split or a shard addition.
Note: Before starting the migration process it’s recommended that you perform a full backup (if you don’t have one already).
Note: Do not enable sharding on any database until the shard key is finalized.
Here, we assume a replica set with three nodes (one primary and two secondaries).
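Before making any changes, it is worth confirming the replica set is healthy. A quick sketch, run from a mongo shell connected to the primary:

```
> rs.status().members.map(function(m) { return m.name + " : " + m.stateStr })  // expect one PRIMARY, two SECONDARY
> db.version()  // all nodes should run the same MongoDB version
```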
Perform necessary OS, H/W, and disk-level tuning. To know more about it, please visit our blog on Tuning Linux for MongoDB.
If SELinux is in enforcing mode, set the appropriate contexts for the data directory and log path of the config servers, and allow their port:

```shell
sudo semanage fcontext -a -t mongod_var_lib_t '/dbPath/mongod.*'
sudo chcon -Rv -u system_u -t mongod_var_lib_t '/dbPath/mongod'
sudo restorecon -R -v '/dbPath/mongod'
sudo semanage fcontext -a -t mongod_log_t '/logPath/log.*'
sudo chcon -Rv -u system_u -t mongod_log_t '/logPath/log'
sudo restorecon -R -v '/logPath/log'
sudo semanage port -a -t mongod_port_t -p tcp 27019
```
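Each config server mongod needs a configuration file before it is started. A minimal sketch, where the dbPath, log path, and replica set name (configRS) are assumptions to adapt to your environment:

```yaml
sharding:
  clusterRole: configsvr
replication:
  replSetName: configRS    # assumed name for the config server replica set
net:
  port: 27019
storage:
  dbPath: /dbPath/mongod
systemLog:
  destination: file
  path: /logPath/log/mongod.log
```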
Start all the config server mongod instances and connect to any one of them. Initiate the replica set and create a temporary user on it.
```
> use admin
> rs.initiate()
> db.createUser({ user: "tempUser", pwd: "<password>", roles: [{ role: "root", db: "admin" }] })
```
Create a role with resource anyResource and action anyAction as well, and grant it to “tempUser“.
```
> db.getSiblingDB("admin").createRole({
    role: "pbmAnyAction",
    privileges: [
      { resource: { anyResource: true }, actions: [ "anyAction" ] }
    ],
    roles: []
  })
> db.grantRolesToUser("tempUser", [{ role: "pbmAnyAction", db: "admin" }])
> rs.add("config_host[2-3]:27019")
```
Now that our config server replica set is ready, let’s move on to deploying the query routers, i.e. mongos.
Also, restart each node of the existing replica set after adding the shard server role to the sharding section of its configuration file:

```yaml
sharding:
  clusterRole: shardsvr
```
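Each mongos needs to know where the config server replica set lives. A minimal configuration sketch, where the host names and the replica set name (configRS) are assumptions:

```yaml
sharding:
  configDB: configRS/config_host1:27019,config_host2:27019,config_host3:27019
net:
  port: 27017
```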
Log in to any mongos, authenticate as “tempUser”, and add the existing replica set as a shard.
```
> sh.addShard("replicaSetName/<URI of the replica set>")  // provide the URI of the replica set
```
Verify it with:
```
> sh.status()
> db.getSiblingDB("config")['shards'].find()  // either command works
```
Connect to the primary of the replica set and copy all the users and roles over to the cluster. To authenticate/authorize on the replica set side, use a replica set user.
```
> var mongos = new Mongo("mongodb://put MongoS URI string here/admin?authSource=admin")  // provide the URI of the mongos, authenticating as tempUser
> db.getSiblingDB("admin").system.roles.find().forEach(function(d) {
    mongos.getDB("admin").getCollection("system.roles").insert(d)
  })
> db.getSiblingDB("admin").system.users.find().forEach(function(d) {
    mongos.getDB("admin").getCollection("system.users").insert(d)
  })
```
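A quick way to confirm the copy succeeded is to compare the document counts on both sides; a sketch, reusing the mongos connection from above:

```
> db.getSiblingDB("admin").system.users.count()
> mongos.getDB("admin").getCollection("system.users").count()  // should match the count on the replica set
```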
Enable sharding on the database:
```
> sh.enableSharding("<db>")
```
Shard a collection with a hashed shard key:
```
> sh.shardCollection("<db>.<coll1>", { <shard key field>: "hashed" })
```
Shard a collection with a range-based shard key:
```
> sh.shardCollection("<db>.<coll1>", { <shard key field>: 1, ... })
```
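Keep in mind that sh.shardCollection() requires an index whose prefix is the shard key; on an empty collection it is created automatically, but for a non-empty collection you must create it first. A sketch, using the same placeholder names as above:

```
> db.getSiblingDB("<db>")["<coll1>"].createIndex({ <shard key field>: 1 })
> sh.shardCollection("<db>.<coll1>", { <shard key field>: 1 })
```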
Migrating a MongoDB replica set to a sharded cluster lets you scale horizontally, increasing read/write throughput and reducing the amount of data and operations each shard manages.
We encourage you to try our products like Percona Server for MongoDB, Percona Backup for MongoDB, or Percona Operator for MongoDB. You can also visit our site to learn “Why MongoDB Runs Better with Percona“.