I recently did a quick analysis of the distribution of writes to InnoDB’s log files. On a high-traffic commodity MySQL server running Percona XtraDB for a gaming workload (mostly inserts to the “moves” table), I used strace to gather statistics about how the log file writes are distributed in terms of write size. InnoDB writes to the log files in multiples of 512 bytes. Mark Callaghan explained this and some of its performance implications here. How many big writes does InnoDB do, and how many small writes?
First, I found out the file descriptor numbers of the log files:
<pre>
# lsof -p $(pidof mysqld) | grep ib_log
mysqld 29772 mysql 8uW REG 8,2 268435456 7143989 /var/lib/mysql/ib_logfile0
mysqld 29772 mysql 9uW REG 8,2 268435456 7143993 /var/lib/mysql/ib_logfile1
</pre>
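The descriptor numbers can also be pulled out of the lsof output mechanically, which helps when building the grep pattern for the next step. This is only a sketch: it assumes the fourth lsof column holds the descriptor followed by mode and lock letters (as in `8uW` above), and the function name `log_fds` is made up for illustration.

```shell
# Hypothetical helper: read lsof output on stdin and print the numeric file
# descriptor of every InnoDB log file, one per line.
# Assumes column 4 looks like "8uW" (fd number, then mode/lock letters).
log_fds() {
    awk '$NF ~ /ib_logfile[0-9]+$/ {
        fd = $4
        sub(/[^0-9].*$/, "", fd)   # strip the trailing mode/lock letters
        if (fd != "") print fd
    }'
}

# Usage against a live server (needs permission to inspect mysqld):
#   lsof -p "$(pidof mysqld)" | log_fds
```

Run against the two lines shown above, this should print 8 and 9.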
The file descriptors are 8 and 9. We’ll need to capture writes to both of those; InnoDB round-robins through them. The following grabs the write sizes out of 100k calls to pwrite() and aggregates them:
<pre>
# strace -f -p $(pidof mysqld) -e pwrite -s1 -xx 2>&1 \
  | grep 'pwrite([89],' | head -n 100000 \
  | awk '{writes[$5]++}END{for(w in writes){print w, " ", writes[w]}}'
</pre>
I could have done it better with a little more shell scripting, but the output from this was enough for me to massage into a decent format with another step or two! Here is the final result:
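For what it's worth, here is roughly what "a little more shell scripting" might look like. It is a sketch rather than the command I actually ran: the function name `summarize_log_writes` and its two arguments are invented here, it counts the return value of each call (bytes actually written) instead of the requested size, and it assumes strace prints lines shaped like the example in the comment. On some systems strace reports the call as `pwrite64`, so the pattern accepts both names.

```shell
# Hypothetical helper: read strace output on stdin, keep completed pwrite()
# calls on the given descriptors, and print "bytes count" sorted by size.
# Assumes lines shaped like:
#   [pid 29780] pwrite(8, "\x00"..., 512, 1048576) = 512
summarize_log_writes() {
    fds="${1:-8|9}"        # ERE alternation of descriptor numbers
    limit="${2:-100000}"   # number of completed writes to sample
    grep -E "pwrite(64)?\\((${fds}), " |
    # Keep only completed calls: second-to-last field "=", last field a byte count.
    awk '$(NF-1) == "=" && $NF ~ /^[0-9]+$/ { print $NF }' |
    head -n "$limit" |
    sort -n | uniq -c |
    awk '{ print $2, $1 }'   # uniq -c prints "count value"; swap the columns
}

# Usage against a live server:
#   strace -f -p "$(pidof mysqld)" -e pwrite -s1 -xx 2>&1 | summarize_log_writes '8|9'
```

Filtering on the `= N` suffix also drops the `<unfinished ...>` lines that strace emits when threads interleave, which would otherwise pollute the counts.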
<pre>
bytes count
512 44067
1024 30740
1536 15221
2048 7094
2560 1810
3072 570
3584 219
4096 112
4608 39
5120 23
5632 16
6144 15
6656 5
7168 3
7680 8
8192 2
8704 2
9216 1
9728 2
10240 1
10752 2
11264 1
11776 1
14848 1
15360 1
15872 2
16384 4
16896 4
17408 2
17920 2
18432 2
18944 8
19456 7
19968 4
20480 4
21504 1
22016 2
24064 1
40960 1
</pre>
So, in sum, about three-quarters (74.8%) of the InnoDB log file writes in this 100,000-write sample are 512 or 1024 bytes. (It might vary for other workloads.) Now, what does this actually mean? There are a lot of interesting and complex things to think about here and research further.
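The share of small writes can be checked directly from the table. The sketch below assumes the "bytes count" rows are fed in sorted by size, as they are above; the function name `cumulative_share` is made up for illustration.

```shell
# Hypothetical helper: read "bytes count" rows on stdin (a header line is
# skipped) and print each size with its share of all writes and the running
# cumulative share. Assumes rows arrive sorted by size.
cumulative_share() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
             size[++n] = $1; count[n] = $2; total += $2
         }
         END {
             for (i = 1; i <= n; i++) {
                 running += count[i]
                 printf "%d %.1f%% %.1f%%\n", size[i],
                        100 * count[i] / total, 100 * running / total
             }
         }'
}

# Usage, with the table above saved in a file (the name is a placeholder):
#   cumulative_share < write_sizes.txt
```

The counts in the table sum to exactly 100,000, so the percentages are simply the counts divided by 1,000: 44.1% of the writes are 512 bytes, and the cumulative share reaches 74.8% at 1024 bytes.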
In the end, the distribution is a simple observation to make, but the InnoDB redo log system is intricate. It’s not something to guess about, but rather something to measure and study more deeply. Perhaps we can follow this up with some more benchmarks or observations under different InnoDB settings and different workloads. Or then again, maybe Yasufumi will read this post when he returns from vacation and already know all the answers by heart!