InnoDB Troubleshooting: Q & A

InnoDB TroubleshootingIn this blog, I will provide answers to the Q & A for the InnoDB Troubleshooting webinar.

First, I want to thank everybody for attending the August 11 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: What’s a good speed for buffer pool speed/size for maximum query performance?

A: I am sorry, I don’t quite understand the question. InnoDB buffer pool is an in-memory buffer. In an ideal case, your whole active dataset (rows that are accessed by application regularly) should be in the buffer pool. There is a good blog post by Peter Zaitsev describing how to find the best size for the buffer pool.

Q: Any maximum range for these InnoDB options?

A: I am again sorry, I only see questions after the webinar and don’t know which slide you were on when you asked about options. But generally speaking, the maximum ranges should be limited by hardware: the size of InnoDB buffer pool limited by the amount of physical memory you have, the size of innodb_io_capacity  limited by the number of IOPS which your disk can handle, and the number of concurrent threads limited by the number of CPU cores.

Q: On a AWS r3.4xlarge, 16 CPU, 119GB of RAM, EBS volumes, what innodb_thread_concurrency, innodb_read_io_threads, innodb_write_io_threads would you recommend? and innodb_read_io_capacity?

A: innodb_thread_concurrency = 16, innodb_read_io_threads = 8, innodb_write_io_threads = 8, innodb_io_capacity — but it depends on the speed of your disks. As far as I know, AWS offers disks with different speeds. You should consult IOPS about what your disks can handle when setting innodb_io_capacity, and “Max IOPS” when setting  innodb_io_capacity_max.

Q: About InnoDB structures and parallelism: Are there InnoDB settings that can prevent or reduce latching (causes semaphore locks and shutdown after 600s) that occur trying to add an index object to memory but only DML queries on the primary key are running?

A: Unfortunately, semaphore locks for the CREATE INDEX command are not avoidable. You only can affect other factors that speed up index creation. For example, how fast you write records to the disk or how many concurrent queries you run. Kill queries that are waiting for a lock too long. There is an old feature request asking to handle long semaphore waits gracefully. Consider clicking “Affects Me” button to bring it to the developers’ attention.

Q: How can we check these threads?

A: I assume you are asking about InnoDB threads? You can find information about running threads in SHOW ENGINE INNODB STATUS :

And in the Performance Schema THREADS table:

Q: Give brief on InnoDB thread is not same as connection thread.

A: You create a MySQL connection thread each time the client connects to the server. Generally, the lifetime of this thread is the same as the connection (I won’t discuss the thread cache and thread pool plugin here to avoid unnecessary complexity). This way, if you have 100 connections you have 100 connection threads. But not all of these threads do something. Some are actively querying MySQL, but others are sleeping. You can find the number of threads actively doing something if you examine the status variable Threads_running. InnoDB doesn’t create as many threads as connections to perform its job effectively. It creates fewer threads (ideally, it is same as the number of CPU cores). So, for example just 16 InnoDB threads can handle100 and more connection threads effectively.

Q: How can we delete bulk data in Percona XtraDB Cluster?  without affecting production? nearly 6 million records worth 40 GB size table

A: You can use the utility pt-archiver. It deletes rows in chunks. While your database will still have to handle all these writes, the option --max-flow-ctl  pauses a purge job if the cluster spent too much time pausing for flow control.

Q: Why do we sometimes get “–tc-heuristic-recover” message in error logs? Especially when we recover after a crash? What does this indicate? And should we commit or rollback?

A: This means you used two transactional engines that support XA in the same transaction, and mysqld crashed in the middle of the transaction. Now mysqld cannot determine which strategy to use when recovering transactions: either COMMIT or ROLLBACK. Strangely, this option is documented as “not used”. It certainly is, however. Test case for bug #70860 proves it. I reported a documentation bug #82780.

Q: Which parameter controls the InnoDB thread count?

A: The main parameter is innodb_thread_concurrency. For fine tuning, use  innodb_read_io_threads, innodb_write_io_threads, innodb_purge_threads, innodb_page_cleaners. Q:

Q: At what frequency will the InnoDB status be dumped in a file by using innodb-status-file?

A: Approximately every 15 seconds, but it can vary slightly depending on the server load.

Q: I faced an issue that once disk got detached from running server due to some issue on AWS ec2. MySQL went to default mode. After MySQL stopped and started, we observed slave skipped some around 15 mins data. We got it by foreign key relationship issue. Can you please explain why it was skipped data in slave?

A: Amazon Aurora supports two kinds of replication: physical as implemented by Amazon (this is the default for replicas in the same region), and the regular asynchronous replication for cross-region replication. If you use the former, I cannot help you because this is a closed-source Amazon feature. You need to report a bug to Amazon. If you used the latter, this looks buggy too. According to my experience, it should not happen. With regular replication you need to check which transactions were applied (best if you use GTIDs, or at least the  log-slave-updates option) and which were not. If you find a gap, report a bug at

Q: Can you explain more about adaptive hash index?

A: InnoDB stores its indexes on disks as a B-Tree. While B-Tree indexes are effective in general, some queries can take advantage of using much simpler hash indexes. While your server is in use, InnoDB analyzes the queries it is currently processing and builds an in-memory hash index inside the buffer pool (using the prefix of the B-Tree key). While adaptive hash index generally works well, “with some workloads, the speedup from hash index lookups greatly outweighs the extra work to monitor index lookups and maintain the hash index structure” Another issue with adaptive hash index is that until version 5.7.8, it was protected by a single latch — which could be a contention point under heavy workloads. Since 5.7.8, adaptive hash index can be partitioned. The number of parts is controlled by option  innodb_adaptive_hash_index_parts.


Share this post

Comments (2)

  • V.K.

    Some answers still require proofreading:

    “… and the number of concurrent threads the by the number of CPU cores.”

    “If you used the latter, this looks buggy too.”

    September 5, 2016 at 7:55 am
  • Sveta Smirnova

    Thank you, Valerii! We should hire you part-time for proof-reading =)

    September 5, 2016 at 8:11 am

Comments are closed.

Use Percona's Technical Forum to ask any follow-up questions on this blog topic.