In the age of agile methodologies and fast-paced development, database schema changes are as inevitable as the growth of data. We have long been familiar with the effectiveness of pt-online-schema-change, but it has one limitation: if the table is too hot to acquire the metadata locks needed to add or remove its triggers, an online schema change becomes almost impossible.
gh-ost overcomes this limitation by acting as a regular replication client: it reads the binary log stream to capture changes and applies them to a shadow table, while copying the existing rows in the background. On a regular Aurora cluster, we do not have the option of running the migration from a read replica, because Aurora read replicas are strictly read-only for us mortals. We can, however, still use gh-ost against the writer to accomplish this task, with one absolute requirement: the cluster's binlog_format must be set to ROW. Aside from that, just follow the procedure below and adapt the commands to your use case.
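On Aurora, binlog_format cannot be changed with SET GLOBAL; it lives in the cluster parameter group and takes effect after a reboot of the writer. A sketch of how this could be set and verified with the AWS CLI, assuming a hypothetical parameter group name of my-aurora-cluster-params:

```shell
# Set binlog_format=ROW in the cluster parameter group (group name is hypothetical)
aws rds modify-db-cluster-parameter-group \
    --db-cluster-parameter-group-name my-aurora-cluster-params \
    --parameters "ParameterName=binlog_format,ParameterValue=ROW,ApplyMethod=pending-reboot"

# After rebooting the writer instance, verify from any client session
mysql -h revin-aurora.can0nprz8rtd.us-east-1.rds.amazonaws.com -u percona -p \
    -e "SELECT @@binlog_format;"
```

The SELECT should return ROW before you attempt to run gh-ost.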
First, the environment we tested with for this post:
Our sysbench command:
sysbench --db-driver=mysql --mysql-user=percona --mysql-password=p3rc0na_ \
    --mysql-db=ghost --tables=1 --table-size=10000000 --auto-inc=off --threads=2 \
    --time=0 --rand-type=pareto --report-interval=1 \
    --mysql-host=revin-aurora.can0nprz8rtd.us-east-1.rds.amazonaws.com \
    oltp_update_index run 2>&1 | tee sysbench.log
Our gh-ost command:
gh-ost -allow-on-master -assume-rbr -exact-rowcount \
    -critical-load Threads_running=400 -critical-load-hibernate-seconds 60 \
    -database ghost -max-load Threads_running=100 -nice-ratio 0.1 \
    -chunk-size 5000 -ask-pass -table sbtest1 -user percona \
    -host revin-aurora.can0nprz8rtd.us-east-1.rds.amazonaws.com \
    -postpone-cut-over-flag-file /home/ubuntu/gh-ost-sentinel \
    -throttle-additional-flag-file /home/ubuntu/gh-ost-throttle_ghost.sbtest1 \
    -alter 'ADD COLUMN ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP' \
    -verbose -execute 2>&1 | tee gh-ost.log
Let's look at some of the interesting bits in this command:

- -allow-on-master runs the migration directly on the master, since Aurora read replicas are read-only and cannot host it.
- -assume-rbr tells gh-ost not to try to verify and impose ROW format itself (that requires privileges we do not have on RDS); we guarantee ROW at the cluster level instead.
- -exact-rowcount issues a COUNT(*) on the table for accurate progress reporting and ETA.
- -critical-load and -critical-load-hibernate-seconds make gh-ost hibernate for 60 seconds instead of bailing out when Threads_running exceeds 400, while -max-load throttles the copy when it exceeds 100.
- -nice-ratio 0.1 makes gh-ost sleep for 10% of each chunk's copy time after the chunk, reducing the impact on production traffic.
- -postpone-cut-over-flag-file holds off the final table swap for as long as the file exists, and -throttle-additional-flag-file lets us pause gh-ost simply by creating that file.
With the gh-ost command above on our test dataset, the process took about 6.5 minutes.
2018-05-30 14:19:40 INFO Row copy complete
Copy: 10000000/10000000 100.0%; Applied: 110170; Backlog: 0/1000; Time: 6m20s(total), 6m20s(copy); streamer: mysql-bin-changelog.000043:26623659; State: migrating; ETA: due
Copy: 10000000/10000000 100.0%; Applied: 111250; Backlog: 526/1000; Time: 6m25s(total), 6m20s(copy); streamer: mysql-bin-changelog.000043:49727477; State: postponing cut-over; ETA: due
There was a noticeable impact on the sysbench workload as soon as gh-ost started. The 390-second mark is when the copy process completed and only the applier process (the binlog stream) was still running; the 469-second mark is after the process fully completed (gh-ost killed).
[ 9s ] thds: 2 tps: 599.04 qps: 599.04 (r/w/o: 0.00/599.04/0.00) lat (ms,95%): 4.41 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 2 tps: 269.00 qps: 269.00 (r/w/o: 0.00/269.00/0.00) lat (ms,95%): 13.46 err/s: 0.00 reconn/s: 0.00
[ 11s ] thds: 2 tps: 229.93 qps: 229.93 (r/w/o: 0.00/229.93/0.00) lat (ms,95%): 19.65 err/s: 0.00 reconn/s: 0.00
...
[ 389s ] thds: 2 tps: 245.99 qps: 245.99 (r/w/o: 0.00/245.99/0.00) lat (ms,95%): 16.12 err/s: 0.00 reconn/s: 0.00
[ 390s ] thds: 2 tps: 358.01 qps: 358.01 (r/w/o: 0.00/358.01/0.00) lat (ms,95%): 10.27 err/s: 0.00 reconn/s: 0.00
[ 391s ] thds: 2 tps: 434.89 qps: 434.89 (r/w/o: 0.00/434.89/0.00) lat (ms,95%): 7.70 err/s: 0.00 reconn/s: 0.00
[ 392s ] thds: 2 tps: 470.10 qps: 470.10 (r/w/o: 0.00/470.10/0.00) lat (ms,95%): 6.21 err/s: 0.00 reconn/s: 0.00
...
[ 468s ] thds: 2 tps: 459.93 qps: 459.93 (r/w/o: 0.00/459.93/0.00) lat (ms,95%): 7.30 err/s: 0.00 reconn/s: 0.00
[ 469s ] thds: 2 tps: 553.09 qps: 553.09 (r/w/o: 0.00/553.09/0.00) lat (ms,95%): 4.91 err/s: 0.00 reconn/s: 0.00
[ 470s ] thds: 2 tps: 526.84 qps: 526.84 (r/w/o: 0.00/526.84/0.00) lat (ms,95%): 4.74 err/s: 0.00 reconn/s: 0.00
[ 471s ] thds: 2 tps: 592.18 qps: 592.18 (r/w/o: 0.00/592.18/0.00) lat (ms,95%): 4.57 err/s: 0.00 reconn/s: 0.00
With a -nice-ratio of 0.4, the performance impact is reduced, but the overall completion time increases, as expected:
[ 12s ] thds: 2 tps: 595.85 qps: 595.85 (r/w/o: 0.00/595.85/0.00) lat (ms,95%): 4.57 err/s: 0.00 reconn/s: 0.00
[ 13s ] thds: 2 tps: 586.14 qps: 586.14 (r/w/o: 0.00/586.14/0.00) lat (ms,95%): 4.82 err/s: 0.00 reconn/s: 0.00
[ 14s ] thds: 2 tps: 560.85 qps: 560.85 (r/w/o: 0.00/560.85/0.00) lat (ms,95%): 5.28 err/s: 0.00 reconn/s: 0.00
[ 15s ] thds: 2 tps: 271.09 qps: 271.09 (r/w/o: 0.00/271.09/0.00) lat (ms,95%): 16.12 err/s: 0.00 reconn/s: 0.00
[ 16s ] thds: 2 tps: 337.00 qps: 337.00 (r/w/o: 0.00/337.00/0.00) lat (ms,95%): 11.24 err/s: 0.00 reconn/s: 0.00
[ 17s ] thds: 2 tps: 285.02 qps: 285.02 (r/w/o: 0.00/285.02/0.00) lat (ms,95%): 17.32 err/s: 0.00 reconn/s: 0.00
[ 18s ] thds: 2 tps: 346.95 qps: 346.95 (r/w/o: 0.00/346.95/0.00) lat (ms,95%): 11.45 err/s: 0.00 reconn/s: 0.00
[ 19s ] thds: 2 tps: 305.01 qps: 305.01 (r/w/o: 0.00/305.01/0.00) lat (ms,95%): 15.27 err/s: 0.00 reconn/s: 0.00
Copy: 10000000/10000000 100.0%; Applied: 201060; Backlog: 0/1000; Time: 9m36s(total), 9m36s(copy); streamer: mysql-bin-changelog.000060:36359774; State: migrating; ETA: due
One interesting gh-ost feature is that, aside from throttling on load, replica lag, and custom queries/hooks, it also throttles automatically based on the queue of binlog events it has not yet applied (1,000 write events). When we increased the number of sysbench threads from two to four in our tests, this effectively froze the copy process: the high churn rate on the table slows the migration considerably, since every change must also be captured from the replication stream.
Copy: 430000/9872189 4.4%; Applied: 420423; Backlog: 1000/1000; Time: 9m0s(total), 9m0s(copy); streamer: mysql-bin-changelog.000020:95528700; State: migrating; ETA: 3h17m37s
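Conceptually, the backlog check can be sketched like this (a simplification for illustration, not gh-ost's actual code):

```python
def should_throttle(backlog: int, max_backlog: int = 1000,
                    threads_running: int = 0, max_load: int = 100) -> bool:
    """Simplified view of two of gh-ost's throttle checks: row copy pauses
    while the unapplied binlog-event queue is full, or while server load
    exceeds the -max-load threshold."""
    if backlog >= max_backlog:      # applier queue full: prioritize draining it
        return True
    if threads_running > max_load:  # -max-load exceeded
        return True
    return False

# In the stalled run above the backlog sat at 1000/1000, so the copy froze:
print(should_throttle(backlog=1000))                   # True
print(should_throttle(backlog=0, threads_running=50))  # False
```

Because the backlog never drains under sustained churn, the copy phase makes almost no progress, which is exactly what the 4.4% / ETA 3h17m status line shows.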
This is one of the reasons why running schema changes from an asynchronous replica makes sense where possible: the change and copy threads can run as fast as possible without affecting the production workload.
Increasing -dml-batch-size from the default of 10 to 100 helped speed up processing of the binlog event queue.
My favorite part of the process is that gh-ost accepts interactive commands: you can change a number of configuration options dynamically. For example, changing -dml-batch-size on the fly lets you tune for variable workloads without losing progress.
ubuntu@ip-10-1-2-95:~$ echo 'dml-batch-size=100' | nc -U /tmp/gh-ost.ghost.sbtest1.sock
# Migrating `ghost`.`sbtest1`; Ghost table is `ghost`.`_sbtest1_gho`
# Migrating ip-172-17-3-77:3306; inspecting ip-172-17-3-77:3306; executing on ip-10-1-2-95
# Migration started at Wed May 30 14:48:05 +0000 2018
# chunk-size: 5000; max-lag-millis: 1500ms; dml-batch-size: 100; max-load: Threads_running=100; critical-load: Threads_running=400; nice-ratio: 0.100000
# throttle-additional-flag-file: /home/ubuntu/gh-ost-throttle_ghost.sbtest1
# postpone-cut-over-flag-file: /home/ubuntu/gh-ost-sentinel [set]
# Serving on unix socket: /tmp/gh-ost.ghost.sbtest1.sock
Copy: 2975000/10000000 29.8%; Applied: 72353; Backlog: 1000/1000; Time: 2m15s(total), 2m15s(copy); streamer: mysql-bin-changelog.000067:30693074; State: migrating; ETA: 5m19s
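A few more interactive commands worth knowing, all sent to the same unix socket. `status` and `sup` report progress, `throttle`/`no-throttle` force throttling on and off regardless of the load checks, and most numeric options can be retuned on the fly:

```shell
# Progress reports
echo sup    | nc -U /tmp/gh-ost.ghost.sbtest1.sock
echo status | nc -U /tmp/gh-ost.ghost.sbtest1.sock

# Force throttling on and off
echo throttle    | nc -U /tmp/gh-ost.ghost.sbtest1.sock
echo no-throttle | nc -U /tmp/gh-ost.ghost.sbtest1.sock

# Retune the copy while it runs
echo 'chunk-size=2000' | nc -U /tmp/gh-ost.ghost.sbtest1.sock
echo 'nice-ratio=0.5'  | nc -U /tmp/gh-ost.ghost.sbtest1.sock
```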
Another nice touch: when -postpone-cut-over-flag-file is set, gh-ost automatically creates the file if it does not exist. This is intuitive, since specifying the option implies you do not want to cut over as soon as the copy and apply processes complete; creating the file for you achieves exactly the intended behavior.
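When you are ready, deleting the sentinel file is all it takes to trigger the cut-over; the throttle flag file can be used the same way to pause gh-ost while you wait for a quiet window. Using the paths from the command above:

```shell
# Optionally pause gh-ost activity by creating the throttle flag file
touch /home/ubuntu/gh-ost-throttle_ghost.sbtest1

# ...wait for a quiet window, then resume
rm /home/ubuntu/gh-ost-throttle_ghost.sbtest1

# Deleting the postpone flag file lets gh-ost proceed with the cut-over
rm /home/ubuntu/gh-ost-sentinel
```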