In one of his recent posts Vadim already gave some information about the possible benefits of the new InnoDB file format, but in this post I'd like to share a real-life example of how compression in the InnoDB plugin can be useful for large data warehousing tasks.
One of our clients was having serious problems with one of his servers. It is a pretty powerful server (Intel Xeon E5430 / 8GB RAM / BBU-enabled RAID10 with fast SAS drives) with a really interesting dataset on it. The only table the customer has on this server is one huge InnoDB table with a set of TEXT fields. We had tried to optimize this server before by playing with various InnoDB parameters (including page size changes), but at the end of the day the server was dying under I/O load: the dataset was 60GB+ and reads from this table were essentially random, so the buffer pool didn't help.
A few days ago we decided to try converting the customer's data to the Barracuda file format using the new InnoDB plugin recently released by Oracle. The first thing I tried was the DYNAMIC row format. After 3.5 hours of conversion we had a 58GB data file and a 30% drop in I/O load. The reason is that with the DYNAMIC row format, TEXT and BLOB fields are stored entirely off-page, so InnoDB was able to keep the whole PRIMARY key index in memory and our lookups no longer required random I/O.
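For anyone wanting to try the same conversion, here is a sketch of the statements involved. The table and its name (`docs` here) are hypothetical, since the customer's schema wasn't shown; the settings are the standard InnoDB plugin knobs:

```sql
-- The Barracuda file format must be enabled first, and file-per-table
-- storage is required so the new format applies to the table's own .ibd file
SET GLOBAL innodb_file_format = 'Barracuda';
SET GLOBAL innodb_file_per_table = 1;

-- Rebuild the table with the DYNAMIC row format: long TEXT/BLOB values
-- move fully off-page, leaving only a 20-byte pointer in the clustered index
ALTER TABLE docs ROW_FORMAT=DYNAMIC;
```

The ALTER rebuilds the entire table, which is why the conversion took hours on a 60GB dataset.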
This 30% I/O reduction was nice, but we decided to push the optimization further, and I tried converting the table to the COMPRESSED row format. This time the conversion took 1.5 hours and the results were really surprising:
- Only a 5GB data file (down from 60GB)
- ~5% I/O load according to iostat (from 99%)
- ~5% CPU load according to top (from 80-100%, mostly waiting for I/O)
- 0.01 sec average lookup time by primary key (from 1-20 sec before the conversion)
These performance results come from the fact that, in addition to compressing index pages (by default it tries to compress 16KB pages down to 8KB), the new InnoDB plugin also compresses TEXT/BLOB/VARCHAR fields stored off-page, so every large TEXT field can shrink to just a few bytes if its compression ratio is good. One thing I'd like to mention here is that in our case we actually tried several compressed page sizes: 1KB, 4KB, and 8KB. The results were not very different, because most of our data was in TEXT fields and there weren't many indexes aside from the PRIMARY key, but in other cases it could make sense to try page sizes other than the default 8KB to make the dataset smaller.
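The compressed page size is set per table via KEY_BLOCK_SIZE. A sketch of what we tried (again using the hypothetical `docs` table name):

```sql
-- Rebuild with compression; KEY_BLOCK_SIZE is the compressed page size in KB.
-- ROW_FORMAT=COMPRESSED implies Barracuda and requires innodb_file_per_table.
ALTER TABLE docs ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;

-- Other sizes are worth benchmarking; the best choice depends on how
-- compressible the data is and how many secondary indexes the table has:
-- ALTER TABLE docs ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4;
-- ALTER TABLE docs ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=1;
```

If a page doesn't compress down to KEY_BLOCK_SIZE, InnoDB splits it and retries, so an overly small setting can waste CPU on compression failures; that's one reason to benchmark rather than just pick the smallest size.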
So far, so good: the customer is really happy with these changes, and we're really glad to have such a nice tool as the new InnoDB plugin, which has now proved its usefulness in data warehousing tasks.