Improved InnoDB fast index creation

One of the serious limitations in the fast index creation feature introduced in the InnoDB plugin is that it only works when indexes are explicitly created using ALTER TABLE or CREATE INDEX. Peter has already blogged about it before, here I’ll just briefly reiterate other cases that might benefit from that feature:

  • when ALTER TABLE does require copying the data into a temporary table, secondary indexes are updated by inserting individual rows rather than sorting;
  • OPTIMIZE TABLE could be faster if secondary indexes were temporarily dropped and then recreated using fast index creation;
  • dumps produced by mysqldump first create tables with all secondary indexes and then load the data, which is also inefficient.

Percona Server as of versions 5.1.56 and 5.5.11 allows utilizing fast index creation for all of the above cases, which can potentially speed them up greatly. This feature is controlled by the expand_fast_index_creation system variable which is OFF by default.

Let’s look at each of the above cases in more detail.

 

ALTER TABLE

By temporarily dropping secondary indexes from the new table before copying the data, and then recreating them later, ALTER TABLE can take advantage of the fast index creation feature even when it has to copy the entire table.

To illustrate this, I have performed a number of simple benchmarks. Let’s start with a table containing 4 million rows and one secondary key:

Let’s trigger a table rebuild by adding a new column and see what execution time is like when the default method is used:

Now let’s see how performance is affected when turning expand_fast_index_creation on. Here and in later examples I’m extending the VARCHAR column to trigger table rebuilds without affecting the table size.

As seen from the SHOW PROFILE output, copying the data to a temporary table without updating indexes took 18 seconds, and about the same time was spent on rebuilding the index using fast index creation. So we have 36 seconds in total which is about 1.7 times faster than updating indexes by insertion.

Let’s see if having more secondary indexes in the table makes any difference: