InnoDB: look after fragmentation

November 5, 2009
Author
Vadim Tkachenko
Share this Post:

One problem made me puzzled for couple hours, but it was really interesting to figure out what’s going on.

So let me introduce problem at first. The table is

Table has 11864696 rows and takes Data_length: 698,351,616 bytes on disk

The problem is that after restoring table from mysqldump, the query that scans data by primary key was slow. How slow ? Let me show.

The query in question is (Q1):

On cold buffer_pool, it took:

However the query (again on cold buffer_pool) (Q2)

Difference is impressive. 4 min 13.61 sec vs 18.81 sec

If you want EXPLAIN plain, here it is:

For Q1:

For Q2:

Query Q1 is executed using Primary Key, and Query Q2 is using block_id key.

To get more details I ran both queries with our extended stats in slow.log (available in 5.0-percona releases)

So for query Q1:

And for query Q2:

As you see for Q1 IO read took 236.149003 sec vs 0.165124 for Q2. But Q1 is scan by primary key, which supposed to be
sequential!

Let’s see on another statistic, which available in innodb_check_fragmentation patch:

for Q1:

and for Q2:

So you see for Q1 it was not sequential scan, even it is primary key, but it is sequential for Q2.

So what’s the answer ? It’s fragmentation of primary key (and whole data table, as InnoDB data == primary key). But how it could happen with
primary key after mysqldump ? The answer here if we look on

EXPLAIN SELECT * FROM tracker;

We see that dump is taken in key “block_id” order, not in primary key order. And later when we load this table, INSERTS into primary key happens in random order, and that gives us the fragmentation we see here.

How to fix it in our case. It’s easy:

, it will force InnoDB to rebuild table in primary key order.

After that Q1:

You see that time returned to appropriate 17.72 sec.

You may ask what happens now with Q2 ? yes, it’s getting slow now, as we made key “block_id” inserted not in order.

As for mysqldump you may use

options to force dump in primary key order.

So notes to highlight:

  • InnoDB fragmentation may hurt your query significantly, especially when data is not in buffer_pool and execution goes to read from disk
  • Fragmentation by secondary key is much more likely than by primary key, and you cannot really control it (tough it is possible in XtraDB / InnoDB-plugin with FAST INDEX creation) so be careful with queries scan many records by secondary key
  • To check if you query affected by fragmentation you can use statistics in 5.0-percona releases (coming to 5.1-XtraDB soon)

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved