Among many of the improvements you can enjoy in MySQL 5.6, there is one that addresses a huge operational problem that most DBAs and System Administrators encounter in their life: schema changes.
While it is usually not a problem for small tables or those in early stages of product life cycle, schema changes become a huge pain once your tables get a significant amount of data. Planning for maintenance is becoming more and more difficult, and your worldwide users want the service to be up and running 24/7, while on the other hand, your developers desire to introduce schema changes every week.
PITA
But what is the real problem here? Let me illustrate very typical case:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
Session1> ALTER TABLE revision ADD COLUMN mycol tinyint; Query OK, 1611193 rows affected (1 min 5.74 sec) Records: 1611193 Duplicates: 0 Warnings: 0 Session2> INSERT INTO revision SET rev_page=3,rev_comment="test",rev_text_id=1; Query OK, 1 row affected (48.30 sec) Session3 > show processlist; +----+------+-----------+------+---------+------+---------------------------------+----------------------------------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+------+-----------+------+---------+------+---------------------------------+----------------------------------------------------------------------+ | 1 | root | localhost | test | Query | 47 | copy to tmp table | ALTER TABLE revision ADD COLUMN mycol tinyint | | 2 | root | localhost | test | Query | 30 | Waiting for table metadata lock | INSERT INTO revision SET rev_page=3,rev_comment="test",rev_text_id=1 | | 3 | root | localhost | NULL | Query | 0 | init | show processlist | +----+------+-----------+------+---------+------+---------------------------------+----------------------------------------------------------------------+ 3 rows in set (0.01 sec) |
What you see above is how changing table’s structure works in MySQL in all pre-5.6 versions. Normally the example INSERT statement is done in matter of microseconds. But once a DBA runs this ALTER TABLE (session1), the application (session2) has to wait for the ALTER to complete before INSERT and other DML statements can succeed. As a result, application will stall on writes to this table until ALTER TABLE completes.
WORKAROUNDS
There are many less or more complicated approaches to the ALTER TABLE problem in MySQL, starting from just well planned downtime window, through master/slave switching techniques, ending on using advanced tools that let you do the change in less possible intrusive way. These tools are out there for years, yet it appears that still many MySQL users are not aware of them. Let me just name here pt-online-schema-change from Percona Toolkit, oak-online-alter-table from Openark Kit or Facebook OSC.
LONG STORY
Historically, MySQL had to perform full table copy for all DDL (Data Definition Language) operations, while you could not write to a table being altered during the process. This problem is even more painful when awaiting writes can also block following reads.
An important, though small step forward was made for InnoDB storage engine since new InnoDB version (aka InnoDB plugin) was introduced for MySQL 5.1. Since then, you can create and drop indexes without copying the whole table contents. So at least the operation that is practised very often for query optimizations, can be done much much quicker.
Next significant improvement (or perhaps I should say fix) in DDL operations area was introduced in MySQL 5.5 as Metadata Locking. Basically from now on schema changes are working properly and consistent with transactions. You will find this explained in details here.
ONLINE(!) DDL in MySQL 5.6
Fortunately, this is not where so much desired evolution in this area has stopped! With MySQL 5.6, this time a huge step forward was made: from now on most of the ALTER types won’t block writes to a table that is being changed!
Another improvement is that in addition to existing instant ALTER implementations (like change default value for a column), now you can perform also following operations without the need of full table copy:
- Change auto-increment value for a column
- Rename a column (if data type remains the same)*
- Add/Drop a foreign key constraint
As said before though, the main improvement in MySQL 5.6 is that during the ALTER operation a table being changed can still be fully accessible by clients, so both reads and writes (with few exceptions) are allowed! Complete statement matrix can be found here.
Let’s see some examples in practice
Example 1 – reset auto-increment value for a column
It can happen that wrong explicit insert sets the value for an auto-increment column higher then necessary. It may be even close to the data type limit and we want to make the table using lower values back again. We realize that and delete that high value row, but…
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
Session1> SELECT t.TABLE_NAME,c.COLUMN_NAME,c.COLUMN_KEY,t.AUTO_INCREMENT FROM information_schema.tables t JOIN information_schema.columns c USING (TABLE_SCHEMA,TABLE_NAME) WHERE t.TABLE_SCHEMA="test" AND t.TABLE_NAME="revision" AND c.EXTRA="auto_increment"; +------------+-------------+------------+----------------+ | TABLE_NAME | COLUMN_NAME | COLUMN_KEY | AUTO_INCREMENT | +------------+-------------+------------+----------------+ | revision | rev_id | PRI | 2000002 | +------------+-------------+------------+----------------+ 1 row in set (0.03 sec) Session1> select max(rev_id) from revision; +-------------+ | max(rev_id) | +-------------+ | 1700000 | +-------------+ 1 row in set (0.00 sec) Session1> insert into revision set rev_page=3,rev_comment="test",rev_text_id=1; Query OK, 1 row affected (0.02 sec) Session1> select max(rev_id) from revision; +-------------+ | max(rev_id) | +-------------+ | 2000002 | +-------------+ 1 row in set (0.00 sec) |
Not possible since the table has next AUTO_INCREMENT=2000002. So this alter is our only help:
1 2 3 4 5 6 7 8 9 10 11 |
Session1> ALTER TABLE revision AUTO_INCREMENT=1700001; Query OK, 0 rows affected (0.01 sec) Records: 0 Duplicates: 0 Warnings: 0 Session1> insert into revision set rev_page=4,rev_comment="test",rev_text_id=1; Query OK, 1 row affected (0.01 sec) Session1> select max(rev_id) from revision; +-------------+ | max(rev_id) | +-------------+ | 1700001 | +-------------+ |
Finally, this operation in MySQL 5.6 is instant! In previous MySQL versions such ALTER causes full table rewrite and blocks the table for writes for the whole process time. In version 5.5.31 the same ALTER on the same hardware looks like that:
1 2 3 |
Session1> ALTER TABLE revision AUTO_INCREMENT=1700001; Query OK, 1611226 rows affected (1 min 3.42 sec) Records: 1611226 Duplicates: 0 Warnings: 0 |
Example 2 – DROP COLUMN
1 2 3 |
Session1> ALTER TABLE revision DROP COLUMN rev_sha1; Query OK, 0 rows affected (1 min 39.24 sec) Records: 0 Duplicates: 0 Warnings: 0 |
During the ALTER is in progress:
1 2 |
Session2> insert into revision set rev_page=3,rev_comment="test",rev_text_id=1,rev_sha1=2; Query OK, 1 row affected (0.01 sec) |
After the ALTER finished:
1 2 |
Session2> insert into revision set rev_page=3,rev_comment="test",rev_text_id=1,rev_sha1=2; ERROR 1054 (42S22): Unknown column 'rev_sha1' in 'field list' |
Great! Drop table was non-blocking, we can use the table without interruption.
Example 3 – RENAME COLUMN
Original column definition was:
rev_deleted
tinyint(1) unsigned NOT NULL DEFAULT '0',
1 2 3 |
Session1> ALTER TABLE revision CHANGE COLUMN rev_deleted rev_deleted1 tinyint(1) unsigned NOT NULL DEFAULT '0'; Query OK, 0 rows affected (0.01 sec) Records: 0 Duplicates: 0 Warnings: 0 |
(The same column rename in MySQL 5.5 or earlier copies the whole table.)
But let’s try another column:
rev_timestamp
char(14) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL DEFAULT '',
1 2 3 |
Session1> ALTER TABLE revision CHANGE COLUMN rev_timestamp rev_date char(14) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL DEFAULT ''; Query OK, 1611226 rows affected (1 min 43.34 sec) Records: 1611226 Duplicates: 0 Warnings: 0 |
What the heck?
Hmm, let’s try another one:
rev_comment
tinyblob NOT NULL,
1 2 3 |
Session1> ALTER TABLE revision CHANGE COLUMN rev_comment rev_comment1 tinyblob NOT NULL; Query OK, 1611226 rows affected (2 min 7.91 sec) Records: 1611226 Duplicates: 0 Warnings: 0 |
So, by the way of writing this blog post I identified a new bug and reported it here: http://bugs.mysql.com/bug.php?id=69580
In short, Online DDL does not work as expected when you rename a column of binary data type, but also for a char type with binary collation. The bug not only leads to full table copy but also blocks a table for writes.
Example 4 – NEW ALTER TABLE OPTIONS
In case you are not sure if an ALTER TABLE will copy and/or block the table, and you want to make sure your DDL statement won’t cause such problems, you can add new alter specifications to the statement: ALGORITHM and LOCK.
1 2 |
Session1> ALTER TABLE revision ENGINE=InnoDB, LOCK=NONE; ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: COPY algorithm requires a lock. Try LOCK=SHARED. |
As we can see full table rebuild still needs locking.
1 2 |
Session1> ALTER TABLE revision ADD COLUMN mycol3 tinyint, ALGORITHM=COPY, LOCK=NONE; ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: COPY algorithm requires a lock. Try LOCK=SHARED. |
So, ALGORITHM=COPY and LOCK=NONE are mutually exclusive.
1 2 |
Session1> ALTER TABLE revision MODIFY COLUMN rev_id bigint(8) unsigned NOT NULL AUTO_INCREMENT, ALGORITHM=INPLACE; ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY. |
Data type change of a column is both locking and makes full table copy.
1 2 3 |
Session1> ALTER TABLE revision ADD COLUMN mycol3 tinyint, ALGORITHM=INPLACE, LOCK=NONE; Query OK, 0 rows affected (1 min 38.84 sec) Records: 0 Duplicates: 0 Warnings: 0 |
Above example is pretty interesting as the new method is used, so no full table copy takes place, however whole operation is still as much expensive as the old way. This is because significant data reorganization is needed inside the existing ibd tablespace.
ALGORITHM and LOCK options allow you also to force using old method with full table copy by adding ALGORITHM=COPY or protect a table from writes with LOCK=SHARED. The same effect can be achieved with SET old_alter_table=1 before the ALTER.
NEW DIAGNOSTICS
I would like to mention that MySQL 5.6 offers new monitoring tools that we can use to see more details also about ongoing ALTERs.
This is how it looks like during the ALTER with adding a column is in progress: