What index will be used when you count all rows in a table? Well, the MySQL documentation provides a straightforward answer to this, quoting:

InnoDB processes SELECT COUNT(*) statements by traversing the smallest available secondary index unless an index or optimizer hint directs the optimizer to use a different index. If a secondary index is not present, InnoDB processes SELECT COUNT(*) statements by scanning the clustered index.

And how do we measure the cost of such a counting query? Is it like any other query type?

With a simple sysbench table example, first let’s check the query plan:

As expected, the query optimizer chose the secondary index and intends to do a scan on it. Running the query will increment handlers accordingly and report the same in the slow log:

The first surprise is that the slow log does not report any rows_examined after the above execution:

What if there is no secondary key available? The second example table has only the primary key defined:

Here comes the second surprise: handlers do not change!

Will the same query cost differ in yet another table, this time without any index defined?

Well, apart from observing the execution time (which is comparable to a table with the primary key), it is hard to tell as neither the status handlers don’t increment nor the Rows_examined information from the slow log. Moreover, global InnoDB counters, like Innodb_rows_read, as well as Performance Schema table statistics also do NOT change when the count is made with clustered index (explicit or implicit)!

The select count(*) runs are clearly ignored in the above when there is no secondary key involved! Then how can we check what index was actually read during the count query execution and what amount of data had to be fetched? Well, I can think of only one way – on a freshly restarted, idle server, we can check the Buffer Pool contents (here the BP is big enough to keep whole tables and innodb_buffer_pool_load_at_startup = OFF). 

WARNING: the below example query may be extremely expensive on a busy production server! Therefore, it is meant only for investigation purposes on a test instance, not for regular monitoring!

Based on the above result, we can see how many index pages were loaded to execute the count, where the small secondary index wins as it does not contain the data. In this case, it did not translate straightforwardly to the execution times, but I will not dig deeper here, as it depends on more factors, including the innodb_parallel_read_threads, etc.

Another unexpected challenge after upgrading?

Has this lack of handlers, rows examined, and table statistics always impacted MySQL? Actually, no. The above results were tested with MySQL versions 8.4.3 and 8.0.40, while here is how MySQL 5.7.44 behaves with the same test tables (sbtest1 has primary and secondary key, sbtest2 has primary key only, and sbtest3 has no defined keys):

A careful eye will catch another surprise above – the explicit primary key uses Handler_read_next while the internal (hidden) cluster key uses Handler_read_rnd_next instead.

The slow log looks very similar in all cases:

The Innodb_rows_read status variable does increment in MySQL 5.7 as expected when reading either secondary or clustered index.

Summary

This unexpected broken InnoDB instrumentation was reported by my colleague Iwo already in 2022: https://bugs.mysql.com/bug.php?id=107352

Sometimes new major version improvements may bring some unexpected behavior or even break important instrumentation. When analyzing query statistics in MySQL 8.0+, one may get easily confused when seeing no rows or key entries read during table counts. But it may turn out that it’s not about empty tables but rather non-working instrumentation.

Subscribe
Notify of
guest

4 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Jean-François Gagné

Interesting finding. This reminds me of another problem with Rows_examined:
https://jfg-mysql.blogspot.com/2022/03/rows-examined-not-trustworthy-because-index-condition-pushdown.html

As JF writes in his blog post: “When reporting rows examined, MySQL is actually reporting rows returned by the storage engine.”

In MySQL 8.0, InnoDB has implemented multi-threaded counting. This will be used when counting all rows using the primary index. So there will be only 1 handler call, and the MySQL server only sees one row; the result of the count.

Mark Callaghan

Thanks to Percona for reporting the bug. Hopefully it eventually gets fixed, or at least gets more feedback reported on the bug.

https://bugs.mysql.com/bug.php?id=107352