Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

Contact Us

Beware the Innodb Table Monitor

November 30, 2011

Author

Jay Janssen

Insight for DBAs

MySQL

Share this Post:

As I stated in my last post, I decided to use the Innodb Table monitor to diagnose an Index count mismatch error a customers found in their mysqld.err log to verify if the problem still existed.

The Innodb Table Monitor has existed for basically forever in Innodb (the MySQL manual discusses it back in the 4.1 manual), and is from a time when what was going on inside of Innodb was a lot murkier than it is now. To use it, you create a table (in any database you choose), like this:

CREATE TABLE innodb_table_monitor (a INT) ENGINE=INNODB;

1	CREATE TABLE innodb_table_monitor (a INT) ENGINE=INNODB;

This, primitively, tells Innodb that it should start the Table monitor. The monitor itself dumps the contents of the Data dictionary to the mysql error log, which looks something like this:

===========================================
090420 12:09:32 INNODB TABLE MONITOR OUTPUT
===========================================
--------------------------------------
TABLE: name SYS_FOREIGN, id 0 11, columns 7, indexes 3, appr.rows 1
  COLUMNS: ID: DATA_VARCHAR DATA_ENGLISH len 0;
           FOR_NAME: DATA_VARCHAR DATA_ENGLISH len 0;
           REF_NAME: DATA_VARCHAR DATA_ENGLISH len 0;
           N_COLS: DATA_INT len 4;
           DB_ROW_ID: DATA_SYS prtype 256 len 6;
           DB_TRX_ID: DATA_SYS prtype 257 len 6;
  INDEX: name ID_IND, id 0 11, fields 1/6, uniq 1, type 3
   root page 46, appr.key vals 1, leaf pages 1, size pages 1
   FIELDS:  ID DB_TRX_ID DB_ROLL_PTR FOR_NAME REF_NAME N_COLS
  INDEX: name FOR_IND, id 0 12, fields 1/2, uniq 2, type 0
   root page 47, appr.key vals 1, leaf pages 1, size pages 1
   FIELDS:  FOR_NAME ID
  INDEX: name REF_IND, id 0 13, fields 1/2, uniq 2, type 0
   root page 48, appr.key vals 1, leaf pages 1, size pages 1
   FIELDS:  REF_NAME ID
--------------------------------------

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

===========================================

090420 12:09:32 INNODB TABLE MONITOR OUTPUT

===========================================

--------------------------------------

TABLE: name SYS_FOREIGN, id 0 11, columns 7, indexes 3, appr.rows 1

COLUMNS: ID: DATA_VARCHAR DATA_ENGLISH len 0;

FOR_NAME: DATA_VARCHAR DATA_ENGLISH len 0;

REF_NAME: DATA_VARCHAR DATA_ENGLISH len 0;

N_COLS: DATA_INT len 4;

DB_ROW_ID: DATA_SYS prtype 256 len 6;

DB_TRX_ID: DATA_SYS prtype 257 len 6;

INDEX: name ID_IND, id 0 11, fields 1/6, uniq 1, type 3

root page 46, appr.key vals 1, leaf pages 1, size pages 1

FIELDS: ID DB_TRX_ID DB_ROLL_PTR FOR_NAME REF_NAME N_COLS

INDEX: name FOR_IND, id 0 12, fields 1/2, uniq 2, type 0

root page 47, appr.key vals 1, leaf pages 1, size pages 1

FIELDS: FOR_NAME ID

INDEX: name REF_IND, id 0 13, fields 1/2, uniq 2, type 0

root page 48, appr.key vals 1, leaf pages 1, size pages 1

FIELDS: REF_NAME ID

--------------------------------------

Basically you can see every table, every column, and every index in that table. Once you get the output, you just DROP the table and it stops. It’s been around for forever, so it should be stable, right?

So, to my customer. It may be important to note that this customer had a few hundred tables. I scheduled a time with the customer during off hours to connect to their master and collect the output so we could confirm if Innodb really did have all the indexes or not. I collected the data just fine, and I found that indeed the indexes that Innodb knew about were correct, and that the error messages the customer saw was apparently only transitory.

However, as I was writing an email to the customer to explain it, I got an email explaining that he saw a big spike in io-wait on the server during the time I ran the Innodb Table monitor (it was just for a minute) and he was very glad we waited until after hours. The server itself was fine, and there didn’t seem to be any adverse effects from running the monitor, but something clearly was a lot busier than expected during execution.

He had data collected using Percona Tookit‘s pt-stalk during the time and asked me to take a look and see what happened. I looked through the data and found not only was there high io-wait on the system, but also there were over a hundred user queries in SHOW PROCESSLIST that were stuck in the ‘Opening Tables’ state, uh-oh.

I dug through the source code and found a few interesting things about the table monitor:

- It holds the dict_sys->mutex for the entire time it runs

- Before outputting the information for each table, it calls dict_update_statistics and appears to force table statistics to be regenerated for each table

- It’s possible I’m mistaken, but it sure looks like dict_update_statistics is called twice for each table, at least in the latest Percona Server 5.1 in: storage/innodb_plugin/dict/dict0load.c: line 226, and again in: storage/innodb_plugin/dict/dict0dict.c: line 4883

So, the io-wait spike is explained by table statistics being regenerated, and the SHOW PROCESSLIST connections stuck in ‘Opening Tables’ were waiting for the dict_sys->mutex, which is required for opening tables in Innodb.

It would be better for the Table Monitor if:

- Table stats weren’t regenerated unless it was necessary

- dict_sys->mutex weren’t such a big hammer

These points are probably moot, since I don’t think most people don’t use this feature very often.

Just because it’s old, doesn’t mean you should trust it.

0 0 votes

Article Rating

Subscribe

2 Comments

Oldest

Newest Most Voted

Raghavendra

14 years ago

Hey Jay,

I was searching for dict_sys mutex and this post was one of the results. I stumbled upon dict_sys mutex at another place — both row_import_tablespace_for_mysql and fsp_get_available_space_in_free_extents, first one being called for alter table and second one for show table like operations I guess — can get into contention.

Few bugs exist on bugs.mysql for this — http://bugs.mysql.com/bug.php?id=54538 — though it touched upon other areas where that mutex had affected.

0

Reply

Raghavendra

14 years ago

I looked at mysql tip source code, and it is called dict_stats_update now and looks like it has new code for lock propagation (to avoid repeated lock/unlock i guess).

From cscope, here is what I could get on its call sites:

0 storage/innobase/dict/dict0dict.cc dict_table_open_on_name 938 dict_stats_update(table,
1 storage/innobase/dict/dict0dict.cc dict_table_print_low 4944 dict_stats_update(table, DICT_STATS_FETCH, TRUE);
2 storage/innobase/dict/dict0load.cc dict_process_sys_tables_rec_and_mtr_commit 367 dict_stats_update(*table, DICT_STATS_FETCH, TRUE);
3 storage/innobase/fts/fts0fts.cc fts_get_total_document_count 2384 dict_stats_update(table, DICT_STATS_RECALC_TRANSIENT, FALSE);
4 storage/innobase/handler/ha_innodb.cc info_low 9684 ret = dict_stats_update(ib_table, stats_upd_option,FALSE);
5 storage/innobase/row/row0mysql.cc row_update_statistics_if_needed 980 dict_stats_update(table, DICT_STATS_FETCH, FALSE);
6 storage/innobase/row/row0mysql.cc row_truncate_table_for_mysql 3318 dict_stats_update(table, DICT_STATS_RECALC_PERSISTENT_SILENT,FALSE)

FALSE == where caller is not holding the lock and dict_stats_update will in turn hold the lock.

=========
| /* check whether caller_has_dict_sys_mutex is set correctly;
| note that mutex_own() is not implemented in non-debug code so
| we cannot avoid having this extra param to the current function */
| ut_ad(caller_has_dict_sys_mutex
| ? mutex_own(&dict_sys->mutex)
| : !mutex_own(&dict_sys->mutex));

================

0

Reply