Recently I was working with the customer who need quick warmup – to get Innodb table fetched in memory as fast as possible to get good in memory access performance.
To do it I run the query: “SELECT count(*) FROM tbl WHERE non_idx_col=0” I use this particular form of query because it will do full table scan – running count(*) without where clause may pick to scan some small index instead.
If your table is not fragmented one of two things should happen – either you should be reading at your hard drive sequential read rate or you would see MySQL becoming CPU bound if IO subsystem is too fast.
In this case however I saw neither – The vmstat showed read speed less than 10MB/sec which is very low for this system which had 6 15K SAS hard drives in RAID10.
Another indication of bad fragmentation was average IO size seen in SHOW INNODB STATUS output. It was around 20KB which means most reads are single page (16K reads). In case of non fragmented table you would see Innodb sequential read-ahead kick in which does reads in 1MB blocks and so you would see average IO size in hundreds of KB.
Now it is worth to notice you can see poor sequential scan performance even if table is not logically fragmented and Innodb is reading data in large blocks – this can happen in case Innodb table file is itself fragmented.
To check if this is the case I usually do “cat table.ibd > /dev/null” and watch IO statistics. If you see small IO request sizes in iostat and simply read speed. Like for the customer in question I saw file read speed of about 50MB/sec which is of course much better than 10MB/sec but well below RAID array capacity.
To check if file fragmentation is the issue or it is poor or miss configured IO subsystem I do another check by running cat /dev/sdb1 > /dev/null – Physical hard drive should never suffer fragmentation so you can get as much sequential IO as you can get (using IO pattern “cat” uses). In this case I got about 300MB/sec which confirmed file fragmentation is also the issue.
Interesting enough the “cure” for both fragmentation issues is the same – OPTIMIZE TABLE tbl – this command recreates the table by writing the new .ibd file (if you’re using innodb_file_per_table=1) which normally would be much less fragmented because it is written at once. Too bad however it requires table to be locked while it is being rebuilt and also it really only defragments clustered key but not the index.
P.S It would be cool to get Innodb objects (data and Index) fragmentation statistics which actually should not be that hard to implement.