In the previous post I mentioned a way I use to preload Clustered Index (data) for Innodb tables. Though I thought this topic would benefit from a bit more information.
But lest first start with feature request for Innodb Team: All ways I mention here are hacks and they can’t be as efficient as native support. It would be great if Innodb would implement command to preload table to Innodb buffer pool, which would simply go through .ibd file sequentially and inject pages in the buffer pool. This would make preload done using sequential file scan even if indexed suffered a lot of page splits.
Now lets continue to the hacks
So As I mentioned you can load Innodb Table Clustered Index in the buffer pool pretty efficiently by using something like SELECT count(*) FROM tbl WHERE non_index_col=0 This works relatively well (though can be slow for fragmented tables) but it does not preload indexes in memory neither it does externally stored objects – BLOB and TEXT fields.
If you would like some non PRIMARY Indexes preloaded you can use something like SELECT count(*) from tbl WHERE index_col like “%0%” for each index. Only one such query per index is enough even if it is multiple column index.
To fetch BLOB/TEXT externally stored columns you can use similar query: SELECT count(*) from tbl WHERE blob_col like “%0%”. Note if you preloading BLOB/TEXT columns you do not need to use first query I mentioned because scanning potentially externally stored blobs will also scan Clustered key Anyway.
Now, say you have bunch of tables having few indexes – should you run multiple queries in parallel to get best preload speed ?
It depends – depending on key/clustered key fragmentation it may be faster to run queries one by one (keeping IO more sequential) or run multiple queries at once to get more outstanding requests at the same time – benchmark to find out.
If you just need to preload single large table you can chop it into several ranges and preload in parallel, such as SELECT count(*) FROM tbl WHERE id BETWEEN 1 and 10000000 AND non_index_col=0
Percona’s widely read Percona Data Performance blog highlights our expertise in enterprise-class software, support, consulting and managed services solutions for both MySQL® and MongoDB® across traditional and cloud-based platforms. The decades of experience represented by our consultants is found daily in numerous and relevant blog posts.
Besides specific database help, the blog also provides notices on upcoming events and webinars.
Want to get weekly updates listing the latest blog posts? Subscribe to our blog now! Submit your email address below and we’ll send you an update every Friday at 1pm ET.