November 24, 2014

SHOW INNODB STATUS walk through

Many people asked me to publish a walk through SHOW INNODB STATUS output, showing what you can learn from SHOW INNODB STATUS output and how to use this info to improve MySQL Performance.

To start with basics SHOW INNODB STATUS is command which prints out a lot of internal Innodb performance counters, statistics, information about transaction processing and all kinds of other things. In MySQL 5 number of Innodb performance counters were exported and now available in SHOW STATUS output. Most of them are same as you previously could find in SHOW INNODB STATUS, there are however few which were not available before.

In SHOW INNODB STATUS many values are per second. If you’re planning to use these values make sure they are sampled over decent period of time. In the very start of printout Innodb will print:

Make sure data is sampled for at least 20-30 seconds. If averages are calculated for last 0 or 1 second they are pretty much unusable.
To be honest I do not really like averages Innodb provides as it is hard to get average for interval you want to have, if you’re writing scripts to look at SHOW INNODB STATUS it is much better to use global counters and get averages manually. They are still however quite helpful if you’re just looking at output.

Next sections in Semaphores information:

There are two portions in this section. One is list of current waits. This section will only contain any entries if you’re running in high concurrency envinronment, so Innodb has to fall back to OS waits frequently. If wait was resolved via Spinlock it will not be seen in this section.

Looking at this section you can get an idea what might be hot spot in your workload. It however requires some knowledge of source code – you only get file names and lines (which are different in different versions), you get no information what this object is responsible for. You however can well guess from file names – in this case file is “buf0buf.ic” what means there is some buffer pool contention. However if you want to know more – you need to browse source.

You also see some details printed about wait. “lock var” is current value for the mutex object (locked=1/free=0) , “waiters flag” is current number of waiters, plus you can see wait status information “wait is ending” in this case which means mutex is already free for grabs but os has not yet scheduled thread so it could proceed with execution.

The second piece of information is event counters – “reservation count” and “signal count” show how actively innodb uses internal sync array – how frequently slots are allocated in it and how frequently threads are signaled using sync array. These counters can be used to represent frequency with which Innodb needs to fall back to OS Wait. There is direct information about OS waits as well – you can see “OS Waits” for
mutexes, as well as for read-write locks. For these information both for exclusive locks and for shared locks is displayed. OS Wait is not exactly the same as “reservation” – before falling back to complex wait using sync_array Innodb tries to “yield” to OS hoping when name thread is scheduled next time object will be free already. OS Waits are relatively slow, and if you get tens of thousands of OS waits per second it may be the problem. The other way to look at it is context switch rate in your OS stats.

The other important peice of information is number of “spin waits” and “spin rounds”. Spin locks are low cost wait, compared to OS wait, it is however active wait which wastes your CPU cycles, so if you see very large amount of spin waits and spin rounds significant CPU resources may be wasted. It should come to hundreds of thousands spin rounds per second to start really worry for most CPUs. innodb_sync_spin_loops can be used to ballance between wasting CPU time running spin locks and doing unneeded context switches.

The next section is about deadlock errors:

For last deadlock Innodb shows transactions which caused deadlocks, their state during deadlock, what locks they were holding and what they were waiting for, which of transactions Innodb decided to roll back to resolve deadlock. Note – Innodb only prints information about few of the locks which transaction is holding. Also only last statement from each transactions is displayed, while locks rows could be locked by one of previous statements. For complex deadlock investigations you might need to look at the log files to find truly conflicting statements. For most simple cases information from SHOW INNODB STATUS is good enough.

As for deadlock information we have similar information about last failed foreign key constraint:

Innodb will print statement which caused error. Definition of foreign key which failed as well as the closest match which was located in parent table. There is a lot of scary information in hex but it is not important for most diagnostic cases – It is left by Innodb developers so SHOW INNODB STATUS can also be used as debugging tool.

Next section you will find in SHOW INNODB STATUS is information about currently active transactions:

If you have small number of connections all connections will be printed in transaction list, if you have large number of connections Innodb will only print number of them, cutting the list so SHOW INNODB STATUS output will not grow too large.

Transaction id is current transaction identifier – it is incremented for each transaction. Purge done for trx’s n:o is number of transaction to which purge is done. Innodb can only purge old versions if they there are no running transactions potentially needing them. Old stale uncommitted transactions may block purge process eating up resources. By looking at transaction counter difference between current and last purged transaction you will be able to spot it. In some rare cases purge also could have hard time to keep up with update rate, in this case difference between these values will also grow and innodb_max_purge_lag will become your friend. “undo n:o” will show the undo log record number which purge is currently processing, if it is active otherwise it will be zero.

History list length 6 is number of unpurged transactions in undo space. It is increased as transactions which have done updates are commited and decreased as purge runs.

Total number of lock structs in row lock hash table is number of row lock structures allocated by all transactions. Note not same as number of locked rows – there are normally many rows for each lock structure.

For each of connections for MySQL there will be ether not started state if there is no active Innodb transaction for this connection, or ACTIVE if transaction is active. Note transaction can be active even if connection is in “Sleep” stage – if it is multiple statement transaction. Innodb also will print OS thread_id and process id which may be helpful if you would like to use gdb to connect to running mysqld for troubleshooting purposes and similar things. Also transaction status is reported which is basically what transaction is doing it can be “fetching rows”, “updating” and couple of other values. “Thread declared inside InnoDB 400″ means thread is running inside Innodb kernel and still has 400 tickets to use. Innodb tries to limit thread concurrency allowing only innodb_thread_concurrency threads to run inside Innodb kernel at the same time. If thread is not runniing inside innodb kernel status could be “waiting in InnoDB queue” or “sleeping before joining InnoDB queue”. Latest one is quite interesting – to avoid too many threads competing to enter innodb queue at the same time Innodb makes thread to sleep for some time before trying to wait (if no free slot was available). This may cause number of threads active inside kernel being less than number of threads allowed by innodb_thread_concurrency. For certain workloads it may help to decrease the time thread waits before it enters the queue. This is done by adjusting innodb_thread_sleep_delay variable. Value is specified in microseconds.

mysql tables in use 1, locked 0 is number of tables used by transaction in question (meaning it was accessed) and number of tables locked by transactions. Innodb does not lock tables for normal operation so number of tables locked normally stays 0, unless it is ALTER TABLE or similar statement, or if LOCK TABLES was used.

In addition to Innodb specific information, there is generic statement information which is visible in SHOW PROCESSLIST showed in SHOW INNODB STATUS, such as statement which is being executed, query id, query status etc.

Next section you will see is section showing details of file IO:

This section shows state of file IO helper threads – insert buffer thread, log thread, read thread and write thread. These are responsible appropriately for insert buffer merges, asynchronous log flushes, read-ahead and flushing of dirty buffers. Normal reads originated from query executions are executed by threads running queries. On Unix/Linux you will always see 4 helper threads, on Windows it however can be adjusted by innodb_file_io_threads variable. For each helper thread you can see thread state – if thread is ready – waiting for i/o request or if it is executing certain operation.

Number of pending operation is shown for each of helper threads – these are amount of operations queued for execution or being executed at the same time. Also number of pending fsync operations is displayed. For writes Innodb has to ensure data makes it to the disk – just passing it to OS cache is not enough. This is typically done by calling fsync() for modified files. Constant high values for any of these variables is indication of IO bound workload. Note however – IO requests submited by threads executing requests are not accounted here so you may have these at zeroes while workload being IO bound still.

Next, number of file IO operations is shown as well as computed averages. This is parameters which is great for graphing and monitoring.
“16384 avg bytes/read” shows average size of read requests. For random IO these should be 16K – page size, for full table scan or index scan read-ahead may be performed which can increase average read size significantly. So you can think about this value as read-ahead efficiency.

This section shows insert buffer and adaptive hash status. First line shows status of insert buffer – segment size and free list as well as if
there are any records is insert buffer. Next it shows how many inserts were done in insert buffer, how many recs were merged and how many merges did it took. Ratio of number of merges to number of inserts is pretty much insert buffer efficiency.

Adaptive hash index is hash index Innodb builds for some pages to speed up row lookup replacing btree search with hash search. This section shows hash table size, number of used cells and number of buffers used by adaptive hash index. You can also see number of hash index lookups and number of non-hash index lookups which is indication of hash index efficiency.

There is currently not much you can do to adjust adaptive hash index or insert buffer behavior so it is pretty much for informational purposes only.

Log section provides information about log subsystem of Innodb. You can see current log sequence number – which is amount of bytes Innodb has written in log files since system tablespace creation. You can also see up to which point logs have been flushed – so how much data is unflushed in log buffer as well as when last checkpoint was performed. Innodb uses fuzzy checkpointing so this line hold log sequence, all changes up to which has been flushed from buffer pool. Changes having higher log sequences may still only be recored in logs and not flushed from buffer pool so such log sequences can’t be overwritten in log files. By monitoring log sequence number and value up to which logs have been flushed you can check if your innodb_log_buffer_size is optimal – if you see more than 30% of log buffer size being unflushed you may want to increase it.

You also can see number of pending normal log writes and number of checkpoint log writes. Number of log/io operations allows to separate tablespace related IO from log related IO so you can see how much IO your log file requires. Note depending on your innodb_flush_log_at_trx_commit value your log writes may be more or less expensive. If innodb_flush_logs_at_trx_commit=2 log writes are done to OS cache, and being sequential writes these logs writes are pretty fast.

This section shows Buffer pool activity and memory usage. You can see total memory allocated by Innodb (sometimes it is higher than you anticipated), amount of memory allocated in additional memory pool (so you can check if it is sized right), total number of pages in buffer pool, number of pages free, pages allocated by database pages and dirty pages. From these values you can learn if your buffer pool is sized well – if you have constantly a lot of pages free, it probably means your active database size is smaller than allocated buffer pool size so you can tune it down. Even if free pages is zero as in this case database pages will not be equal to total size of buffer pool, because buffer pool also stores lock information, adaptive hash indexes and some other system structures.

Pending reads and writes are pending requests on buffer pool level. Innodb may merge multiple requests to one on file level so these are different. We can also see different types of IO submited by Innodb – pages to be flushed via LRU pages – dirty pages which were not accessed long time, flush list – old pages which need to be flushed by checkpointing process and single page – independent page writes.

We can also see number of pages being read and written. Created pages is empty pages created in buffer pool for new data – when previous page content was not read to the buffer pool.

Finally you can see buffer pool hit ratio which measures buffer pool efficiency. 1000/1000 corresponds to 100% hit rate. It is hard to tell what buffer pool hit rate is good enough – it is very workload dependent. Sometimes 950/1000 will be enough, sometimes you can see IO bound workload with hit rate of 995/1000.

Finally last section – row operations which shows activity on the row basics and some system information.

It shows innodb thread queue status – how many threads are waiting and being active. How many read views are open inside Innodb – this is when transaction was started but no statement is currently active, state of Innodb main thread which controls scheduling of number of system operations – flushing dirty pages, checkpointing, purging, flusing logs, doing insert buffer merge. Values for “state” field are rather self explanatory.

You can also see number of rows operation since system startup as well as average values. This is also very good values to monitor and graph – row operations is very good measure of Innodb load. Not all row operations are created equal of course and accessing of 10 byte rows is much cheaper than accessing 10MB blog, but it is still much more helpful than number of queries, which is even more different.

One more thing to note – SHOW INNODB STATUS is not consistent – it does not correspond to some particular point in time. Different lines in SHOW INNODB STATUS populated in different point in times, so sometimes you may see a bit conflicting information. This is by design as requiring global lock to provide consitent information would cause significant overhead.

Note: our book has an updated, more complete, more in-depth explanation of SHOW INNODB OUTPUT.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Nice I’ve been waiting to get some issues cleared up with SHOW INNODB status,

    I created a link to mysqldba.blogspot.com that has a summary of this information as well as some stuff I added. 99% is just a summary of the information above.

  2. Xaprb says:

    Thanks, this is very helpful. I’ve been trying to read the source code to learn some of this, because it’s mostly undocumented in the MySQL manual. I’ve been building a tool to display all this information easily (innotop) and this is helping me learn more about what it all means.

  3. Alexey says:

    Isn’t it violation of the syntax? I think it should be “SHOW ENGINE INNODB STATUS”.

  4. peter says:

    Dathan,

    Thanks a lot for your summary. I see you’re mentioning you can change number of IO threads on Unix – do you have a patch for it so I save time researching it. It would be interesting to see if it works with different number of threads and if it impacts perfomance for Innodb a lot.

    I was thinking to create some kind of “power tuning” patches for Innodb which would allow to change a lot of internal values which are now constants. At least this could help to find constants which work better now – Innodb ones were probably tested on 5 year old desktop computers and might not be optimal now.

  5. peter says:

    Xaprb,

    Thank you. I was going to take a look at your tool (it looks handly from your blog post). I however did not have a time yet to look at it. Should be quite helpful.

  6. peter says:

    Alexey,

    Yes “SHOW ENGINE INNODB STATUS” is official statement name now, MySQL will even print warning if you use “SHOW INNODB STATUS”.
    For us MySQL old timers, it is hard to switch, especially if both of variants work. I still use type=innodb instead of engine=innodb in create table statement most of the time.

  7. mike says:

    “Number of log/io operations allows to separate tablespace related IO from log related IO so you can see how much IO your log file requires.”

    Hi,

    Can you be more specific about how to seperate tablespace related IO from log related IO? Thanks

  8. Safari says:

    “…innodb_sync_spin_loops can be used to ballance between wasting CPU time running spin locks and doing unneeded context switches…”

    So if we have a very large amount of spin waits and spin rounds, will we increase innodb_sync_spin_loops or decrease it in order to reduce spin waits/rounds?

  9. Safari says:

    “… innodb_sync_spin_loops can be used to ballance between wasting CPU time running spin locks and doing unneeded context switches….”

    So if I see very large amount of spin waits and spin rounds, will I increase or decrease the innodb_sync_spin_loops in order to reduce CPU resources wasted?

    Thanks,

  10. peter says:

    Safari,

    You can reduce it to reduce CPU wasted by doing spin loops. But it can cause more cpu to be wasted doing context switches. Finding balance can be tough here.

  11. Many people are finding ARA helps a lot with their MySQL database performance.

    I have built an Adaptive Readahead kernel package for Debian users, strongly based on and feature compatible with the kernel which ships with Debian Etch (other than it has ARA turned on!). Looking for feedback from anyone who has tried this in production.

  12. mike says:

    Hi,

    I never got an answer to my question back in 7, so I’ll try again. I can see that in the LOG section, the log i/o’s/second represents
    log i/o, this seems obvious. My question is what is represented in the FILE I/O section? Does writes/s and reads/s and fsyncs/s
    include the log i/o or exclude it? I’m trying to get a picture for the total i/o done. Do I add the numbers from the two sections together
    or not?

    Thanks

  13. peter says:

    Mike,

    File IO includes all IO including log IO. The numbers of IO may be different is in buffer pool IO is counted in pages while in FileIO it is actual IO operations which can touch many pages.

  14. Mark says:

    Peter,
    Touching on an earlier question… I’m showing the following:

    ———-
    SEMAPHORES
    ———-
    OS WAIT ARRAY INFO: reservation count 50573942, signal count 48103669
    Mutex spin waits 0, rounds 1286836814, OS waits 29544296
    RW-shared spins 17771305, OS waits 8349871; RW-excl spins 10508828, OS waits 5276204

    Currently I have innodb_sync_spin_loops set to the default (20). Would you suggest lowering that?

    Thanks

  15. mike says:

    Hi Peter,

    Thanks for the clarification. Would I be able to subtract log i/o’s/second from
    file i/o writes/s to get a number for data writes/s?

    Thanks,

    Mike

  16. peter says:

    Should be. Mike.

  17. xin says:

    peter, it’s nothing about innodb status but i have to point out one point from the mysqlperformanceblog webside.

    I can not find an easy link (nor a better “media=print” css) to print articles here. that’s annoying.

    Thanks.

  18. peter says:

    Xin,

    This is WordPress blog I will need to look for a pluging which provides nice printing layout. Do you know one ?

  19. Kiril says:

    Hello, I’ve read the article and I found it very useful, but I want to ask something on this topic. When I invoke the command I got the output printed onto STDOUT, which is normal, but, If I want this info stored into log file, e.g. command > logfile, the output gets filled with : “\n=====================================\ +———– ” etc separators.
    Is there are way that I will preserve the original format and store it in logfile ? Thanks in advance for any suggestions.

  20. Kiril says:

    Actually, I managed to overcome this, it seems that the output needs to be “piped” with ” | ” and passed to sed for example, then the proper interactions should be made like : sed “s/\\\\n/\\n/g” – in order to remove the new lines, then you can redirect the output with > to a logfile.

  21. peter says:

    I think you can use –raw or some similar option mysql client option to avoid new lines replaced with \n

  22. david says:

    purge stop working and feel stall,pls give us some clues. as the following snippet:
    Trx id counter 3 680747938
    Purge done for trx’s n:o

  23. Massimo says:

    in a situation like this:

    ———-
    SEMAPHORES
    ———-
    OS WAIT ARRAY INFO: reservation count 866195, signal count 850669
    Mutex spin waits 7873759, rounds 34666553, OS waits 328525
    RW-shared spins 178137, OS waits 79292; RW-excl spins 259415, OS waits 230381

    I’ve both spin waits and OS waits very high. What do you suggest to reduce the number of slow OS waits?

    Thank you

  24. peter says:

    How long uptime is it for ? Just 8 million of spin waits is not a lot. I would worry when it is over 100.000/sec

  25. Bart Huls says:

    Hi,

    Can someone tell me how to get the free datapages available.
    I have 50 Gb of InnoDb, and I can’t see for how many it’s filled up.

  26. peter says:

    Check Comments for Innodb tables it shows amount of space free.

  27. Bart Huls says:

    Hi,
    When I enter : show table status like “test” \G;

    I get


    Comment:
    1 row in set (0.05 sec)

    ERROR:
    No query specified

    The Comment field is empty

  28. Pranab says:

    Thanks for the info.
    Regards,
    Pranab
    http://www.goguwahati.com

  29. Suresh Kuna says:

    Useful info peter. Is the semaphores section specific to Innodb engine queries alone or any other which belongs to operating system and queries belongs to other engines also can effect this section.

    Thanks
    Suresh Kuna

  30. Jan says:

    In InnoDB is their a way to automatically rollback stale transactions, without the need to reboot the database?
    I’m thinking of a kind of time-out property, but can’t seem to find a solution anywhere.

  31. Jingqi Xu says:

    Hi peter,

    should “History list length” <= "Trx id counter" – "Purge done for trx's n:o" ?

  32. Jason says:

    Great read Peter.

    We have a large database and buffer pool, with heavy read/write and we are seeing many semaphores like this:

    –Thread 1251916096 has waited at btr/btr0cur.c line 487 for 28.000 seconds the semaphore:
    S-lock on RW-latch at 0x2ac843e8e770 ‘&new_index->lock’
    a writer (thread id 1265760576) has reserved it in mode exclusive
    number of readers 0, waiters flag 1, lock_word: 0
    Last time read locked in file btr/btr0cur.c line 487
    Last time write locked in file btr/btr0cur.c line 480

    The DB locks up for 10 seconds at a time, and disk activity drops to 0 and the mysql processlist shoots up as queries all appear to be locked. Then everything back to normal. I think it’s buffer pool contention. Peter, do you have any experience with that?

  33. Jason,

    Why do not you post your question in the forum and I will try to reply:
    http://forum.percona.com

  34. Jason says:

    Will do. Thanks.

  35. Vaseeswaran says:

    Thanks for the info!!!
    This is much useful.

  36. JohnDow says:

    Hello Peter,

    I have trouble getting the monitores working – i created the different tables for the innodb (Standard, Lock,Tablespace,Table) and the server doesn’t report the data as described every 15 seconds instead it doesn’t create any row at all. How can that be, what have i overlooked?

    Thank you

Speak Your Mind

*