Recently, we received an alert for one of our Managed Services customers indicating that the auto_increment value for one of their tables had reached 80% of its maximum capacity. The column was INT UNSIGNED, which has a limit of 4,294,967,295.
At 80%, we have enough time to change it to BIGINT… right? Let’s see.
So we used pt-online-schema-change to perform the alter.
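The exact command isn’t reproduced here, but the invocation was along these lines (connection options omitted; the chunking and load-throttling values below are illustrative, not the ones actually used):

pt-online-schema-change \
  --alter "MODIFY COLUMN id bigint unsigned NOT NULL AUTO_INCREMENT" \
  --chunk-size 1000 \
  --max-load Threads_running=50 \
  --progress time,30 \
  --execute \
  D=myschema,t=mytable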
It started running at a good pace but slowed over time.
Why?
Well, let’s look at the definition of the table:
mysql> show create table myschema.mytable\G
*************************** 1. row ***************************
       Table: mytable
Create Table: CREATE TABLE `mytable` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `long_column` varchar(1000) NOT NULL,
  `state` tinyint unsigned NOT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `short_column` varchar(30) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_long_column` (`long_column`,`state`),
  KEY `idx_short_column` (`short_column`,`state`),
  KEY `idx_short_col2` (`short_column`)
) ENGINE=InnoDB AUTO_INCREMENT=4009973818 DEFAULT CHARSET=utf8mb3
NOTE 1: The index on long_column covers a varchar column with a length of 1000; it may not be required, and an index prefix may be more helpful here (a sketch follows these notes).
NOTE 2: The index idx_short_col2 is a duplicate, as it is covered by the index idx_short_column.
Those changes require testing and are out of scope for this emergency, but they are worth mentioning.
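Purely as an illustration of those notes (we did not apply this here), the cleanup could look like the following; the prefix length of 100 is a placeholder that would have to be validated against the column’s real selectivity:

ALTER TABLE myschema.mytable
  DROP INDEX idx_long_column,
  ADD INDEX idx_long_column (long_column(100), state),
  DROP INDEX idx_short_col2;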
Table size:
+---------------+------------+------------+---------+----------+---------+----------+--------+
| TABLE_SCHEMA  | TABLE_NAME | TABLE_ROWS | DATA_GB | INDEX_GB | FREE_GB | TOTAL_GB | ENGINE |
+---------------+------------+------------+---------+----------+---------+----------+--------+
| myschema      | mytable    | 3906921584 |    1118 |     1790 |       0 |     2907 | InnoDB |
+---------------+------------+------------+---------+----------+---------+----------+--------+
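For reference, a size summary like the one above can be produced from information_schema with a query along these lines (our own reconstruction, not necessarily the exact query used):

SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_ROWS,
       ROUND(DATA_LENGTH / POW(1024,3)) AS DATA_GB,
       ROUND(INDEX_LENGTH / POW(1024,3)) AS INDEX_GB,
       ROUND(DATA_FREE / POW(1024,3)) AS FREE_GB,
       ROUND((DATA_LENGTH + INDEX_LENGTH + DATA_FREE) / POW(1024,3)) AS TOTAL_GB,
       ENGINE
  FROM information_schema.TABLES
 WHERE TABLE_SCHEMA = 'myschema' AND TABLE_NAME = 'mytable';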
Note how the indexes are considerably bigger than the data itself.
mysql> SELECT database_name, table_name, index_name, ROUND(stat_value * @@innodb_page_size / 1024 / 1024, 2) AS size_in_mb FROM mysql.innodb_index_stats WHERE stat_name = 'size' AND index_name != 'PRIMARY' and database_name='myschema' and table_name='mytable' ORDER BY size_in_mb DESC;
+---------------+------------+-------------------+------------+
| database_name | table_name | index_name        | size_in_mb |
+---------------+------------+-------------------+------------+
| myschema      | mytable    | idx_long_column   | 1583538.95 |
| myschema      | mytable    | idx_short_column  |  126432.98 |
| myschema      | mytable    | idx_short_col2    |  122699.95 |
+---------------+------------+-------------------+------------+
3 rows in set (0.01 sec)
While pt-online-schema-change runs, it copies the data to a new table, and the secondary indexes on that new table must be maintained as the rows are copied.
Note the huge index on the varchar(1000) column, which is ~1.5 TB in size. Maintaining such an index becomes increasingly expensive as the data grows.
The pt-online-schema-change had been running for ~8 days, and its latest estimate was 53 more days, which we couldn’t afford, since the maximum value would be exceeded in ~15 days.
Copying `myschema`.`mytable`: 12% 53+16:48:01 remain
Copying `myschema`.`mytable`: 12% 53+16:48:30 remain
Copying `myschema`.`mytable`: 12% 53+16:48:59 remain
Copying `myschema`.`mytable`: 12% 53+16:49:26 remain
Copying `myschema`.`mytable`: 12% 53+16:49:53 remain
Copying `myschema`.`mytable`: 12% 53+16:50:19 remain
Copying `myschema`.`mytable`: 12% 53+16:50:49 remain
Copying `myschema`.`mytable`: 12% 53+16:51:17 remain
Copying `myschema`.`mytable`: 12% 53+16:51:45 remain
So what do we do now?
We suggested canceling the pt-online-schema-change and creating an Aurora blue/green deployment.
Then we would perform the direct ALTER on the green cluster and, finally, when ready, do the failover.
Sounds good, doesn’t it?
First, we needed to ensure that the new (green) cluster had the replica_type_conversions parameter in its cluster parameter group set to “ALL_NON_LOSSY,ALL_UNSIGNED” in order to be able to replicate from an int unsigned column to a bigint unsigned column.
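A quick sanity check on the green cluster before relying on replication (our own suggestion, not part of the original runbook):

mysql> SELECT @@global.replica_type_conversions;
-- expected: ALL_NON_LOSSY,ALL_UNSIGNED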
So we tried that. It started out fast, at ~0.036% per minute, which works out to roughly two days. That’s great!
We left the process running over the weekend but noticed it started to slow down again… By Monday, it was advancing at ~0.01% every five minutes, which gives an ETA of 34 days.
Why?
Again, with a direct ALTER, MySQL copies the data to a temporary table, and the bigger the data, the more expensive it is to maintain the secondary indexes during the copy.
Again, unacceptable.
Note that with the two approaches above, we lost ~12 days of precious time, and the deadline for auto_increment exhaustion was approaching.
Then we thought: What if we drop the secondary indexes, do the alter, and then add the indexes back?
In theory, it should be faster, as:
“Online DDL support for adding secondary indexes means that you can generally speed the overall process of creating and loading a table and associated indexes by creating the table without secondary indexes, then adding secondary indexes after the data is loaded.”
https://dev.mysql.com/doc/refman/8.4/en/innodb-online-ddl-operations.html
So let’s do this:
The deletion of the indexes was really quick, as expected (metadata-only operation):
mysql> ALTER TABLE myschema.mytable DROP INDEX idx_long_column, DROP INDEX idx_short_column, DROP INDEX idx_short_col2;
Query OK, 0 rows affected (49.40 sec)
Records: 0  Duplicates: 0  Warnings: 0
Then, the data type change:
mysql> ALTER TABLE myschema.mytable CHANGE COLUMN id id bigint unsigned NOT NULL AUTO_INCREMENT;
Query OK, 4058047205 rows affected (13 hours 9 min 10.62 sec)
Records: 4058047205  Duplicates: 0  Warnings: 0
Looks very promising!!!
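As a side note, the progress of a long table rebuild like this can be followed through Performance Schema stage events, assuming the relevant stage instruments and the events_stages_current consumer are enabled (an optional check, not something the process depends on):

SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED,
       ROUND(100 * WORK_COMPLETED / WORK_ESTIMATED, 2) AS pct_done
  FROM performance_schema.events_stages_current
 WHERE EVENT_NAME LIKE 'stage/sql/copy to tmp table'
    OR EVENT_NAME LIKE 'stage/innodb/alter table%';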
The final step was to add the indexes back:
mysql> ALTER TABLE myschema.mytable ADD INDEX `idx_long_column` (`long_column`,`state`), ADD INDEX `idx_short_column` (`short_column`,`state`), ADD INDEX `short_col2` (`short_column`);
ERROR 1878 (HY000): Temporary file write failure.
Why?
Well, the INPLACE operation uses the temporary directory to write its sort files, and in Aurora, there are limits on local temporary storage that depend on the instance type.
On a regular MySQL instance, we could point innodb_tmpdir to another location with enough disk space, which would have made the whole process easier; in Aurora, however, the parameter is not modifiable.
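For comparison, on a self-managed MySQL server that redirect is a one-liner; the path below is purely illustrative and must already exist with enough free space:

mysql> SET GLOBAL innodb_tmpdir = '/mnt/large_volume/mysql-tmp';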
Even with a larger instance type, it’s hard to build the ~1.5 TB index without breaking open the piggy bank.
As a last resort, we added the indexes back with the COPY algorithm:
mysql> ALTER TABLE myschema.mytable ALGORITHM=COPY, ADD INDEX `idx_long_column` (`long_column`,`state`), ADD INDEX `idx_short_column` (`short_column`,`state`), ADD INDEX `idx_short_col2` (`short_column`);
Query OK, 4147498819 rows affected (6 days 1 hour 55 min 57.00 sec)
Records: 4147498819  Duplicates: 0  Warnings: 0
Why does this work? Because an ALTER TABLE using the COPY algorithm builds the temporary table in the datadir and copies the rows there, so it is not subject to the temporary-directory limitation mentioned above.
We made it in time, about four days before auto_increment exhaustion, preventing downtime.
In retrospect, there may have been a simpler path that avoided the blue/green deployment altogether, but what initially looked like an easy task with pt-online-schema-change ended up being considerably more complex.
You need to review the table definition, the index sizes, the Aurora limits, and how the different ALTER algorithms work to decide on the best way to proceed, especially in situations like this one, where auto_increment exhaustion is looming and there is a risk of downtime if the change is not completed in time.
And, of course, monitor auto_increment exhaustion for your tables, and use a threshold that gives you enough time to plan and change the table definition. You can use Percona Monitoring and Management for this, specifically the MySQL > MySQL Table Details dashboard.
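If you are not using PMM, a rough check can also be run directly against information_schema. This is our own sketch, hard-coded to the INT UNSIGNED limit; note that the AUTO_INCREMENT value there may be cached for up to information_schema_stats_expiry seconds:

SELECT t.TABLE_SCHEMA, t.TABLE_NAME, c.COLUMN_NAME, c.COLUMN_TYPE, t.AUTO_INCREMENT,
       ROUND(100 * t.AUTO_INCREMENT / 4294967295, 2) AS pct_of_int_unsigned
  FROM information_schema.TABLES t
  JOIN information_schema.COLUMNS c
    ON c.TABLE_SCHEMA = t.TABLE_SCHEMA AND c.TABLE_NAME = t.TABLE_NAME
 WHERE c.EXTRA LIKE '%auto_increment%'
   AND c.COLUMN_TYPE = 'int unsigned'
 ORDER BY pct_of_int_unsigned DESC;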