Why audit logging with triggers in MySQL is bad for replication

September 29, 2008

Author

Ewen Fortune

Insight for DBAs

Recently I was tasked with investigating slippage between master and slave in a standard replication setup.

The client was using Maatkit’s mk-table-checksum to check that the slave's data was indeed a faithful copy of the master's.

mk-table-checksum --algorithm=BIT_XOR h=hostname.local,u=root,p=xxx --replicate=checksum.checksum --emptyrepltbl --chunksize=500000 --databases mydb --sleep 1

He could then examine the checksum.checksum table to confirm that all was well. In this case, however, several tables had CRC values that differed between master and slave.

db: mydb
tbl: Foo_History
chunk: 0
boundaries: 1=1
this_crc: 30627c76fe658fd9b77eaddf1ea8c03a
this_cnt: 2593
master_crc: bdbadd7dae2636a8cf515bb886fb1295
master_cnt: 2593
ts: 2008-09-24 04:50:05
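
As a rough illustration of how such a row is read, the sketch below flags chunks whose CRC or row count differs from the master's. The dictionary keys mirror the column names shown above. The second row and the helper name `differing_chunks` are made up for the example and are not part of Maatkit.

```python
# Rows shaped like the checksum.checksum output above.
# The first row is the one from this post; the second is hypothetical.
rows = [
    {"db": "mydb", "tbl": "Foo_History", "chunk": 0,
     "this_crc": "30627c76fe658fd9b77eaddf1ea8c03a", "this_cnt": 2593,
     "master_crc": "bdbadd7dae2636a8cf515bb886fb1295", "master_cnt": 2593},
    {"db": "mydb", "tbl": "Foo", "chunk": 0,
     "this_crc": "aaaa", "this_cnt": 10,
     "master_crc": "aaaa", "master_cnt": 10},
]


def differing_chunks(rows):
    """Return (db, tbl, chunk) for chunks whose CRC or row count differs from the master."""
    return [
        (r["db"], r["tbl"], r["chunk"])
        for r in rows
        if r["this_crc"] != r["master_crc"] or r["this_cnt"] != r["master_cnt"]
    ]


print(differing_chunks(rows))
```

Note that for Foo_History the row counts match (2593 on both sides) while the CRCs differ, so the slave holds the same number of rows but with different contents.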

So, now I needed to find out what was updating the table. Here is where tools like mysqlsla and Maatkit’s mk-log-parser come into their own as they both allow you to quickly parse the binary log files, extracting the relevant statements.

mysqlbinlog bin_log.000001 | mysqlsla -lt binary -

Check out http://hackmysql.com/mysqlsla_filters for how to filter by statement.

Looking through the binary logs I could see this table is actually an audit table for changes to the Foo table. The trail is kept using two triggers on that table.

select trigger_name,
       event_object_table,
       Event_Manipulation
from information_schema.triggers
where trigger_schema = 'mydb'
  and action_statement like '%Foo_History%'\G
*************************** 1. row ***************************
      trigger_name: Foo_Update
event_object_table: Foo
Event_Manipulation: UPDATE
*************************** 2. row ***************************
      trigger_name: Foo_Delete
event_object_table: Foo
Event_Manipulation: DELETE

So what's the problem with that? Well, there is a situation in which two overlapping transactions updating the Foo table are reordered once they are serialized on the slave.

Here is an example:

I recreated the tables and triggers, populating the Foo table with a handful of rows and then ran the following.

Here is the update trigger:

Create Trigger Foo_Update After UPDATE on Foo
For Each Row INSERT into Foo_History (Foo_History_ID, Name, Value, Field_Id)
Values (Old.Foo_History_ID, Old.Name, Old.Value, Old.Field_Id);

Transaction 1 starts and updates.

Start Transaction;
Update Foo set Value=6 Where Field_ID = 3;

Transaction 2 starts, updates and commits.

Start Transaction;
Update Foo set Value=6 Where Field_Id = 51;
Commit;

Transaction 1 commits last.

Commit;

When these statements are run on the slave they are serialized in commit order, so transaction 2's update is applied before transaction 1's, and the inserts made by the trigger happen in a different order than on the master. The Foo_History table is now out of sync.

Master:

*************************** 1. row ***************************
Foo_History_Id: 1
          Name: maxlength
         Value: 7
      Field_Id: 3
*************************** 2. row ***************************
Foo_History_Id: 2
          Name: maxlength
         Value: 7
      Field_Id: 51

Slave:

*************************** 1. row ***************************
Foo_History_Id: 1
          Name: maxlength
         Value: 7
      Field_Id: 51
*************************** 2. row ***************************
Foo_History_Id: 2
          Name: maxlength
         Value: 7
      Field_Id: 3

As you can see from the above, the updates were performed in a different order, with the inserts being assigned a different Foo_History_Id. This is because the statements are written to the binary log in commit order.
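
The effect can be reproduced in miniature without a replication setup. The sketch below is only an approximation: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the table definitions are assumptions (the post does not show them), and the trigger lets the history table assign its own auto-increment id. "Master" applies the two updates in the order they were executed, and "slave" applies them in commit order.

```python
import sqlite3

# Assumed schema: the real client tables are not shown in the post.
SCHEMA = """
CREATE TABLE Foo (Field_Id INTEGER PRIMARY KEY, Name TEXT, Value INTEGER);
CREATE TABLE Foo_History (
    Foo_History_Id INTEGER PRIMARY KEY AUTOINCREMENT,
    Name TEXT, Value INTEGER, Field_Id INTEGER
);
CREATE TRIGGER Foo_Update AFTER UPDATE ON Foo
FOR EACH ROW BEGIN
    INSERT INTO Foo_History (Name, Value, Field_Id)
    VALUES (OLD.Name, OLD.Value, OLD.Field_Id);
END;
INSERT INTO Foo (Field_Id, Name, Value)
VALUES (3, 'maxlength', 7), (51, 'maxlength', 7);
"""


def history_after(update_order):
    """Apply the updates one at a time in the given order and return Foo_History."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    for field_id in update_order:
        conn.execute("UPDATE Foo SET Value = 6 WHERE Field_Id = ?", (field_id,))
    rows = conn.execute(
        "SELECT Foo_History_Id, Name, Value, Field_Id "
        "FROM Foo_History ORDER BY Foo_History_Id"
    ).fetchall()
    conn.close()
    return rows


master = history_after([3, 51])  # order in which the updates were executed
slave = history_after([51, 3])   # commit order, as replayed from the binary log

print(master)
print(slave)
print(master == slave)  # the ids are attached to different rows
# Ignoring the surrogate id, both sides hold the same audit data.
print(sorted(r[1:] for r in master) == sorted(r[1:] for r in slave))
```

The last comparison shows that the audit data itself is identical on both sides. Only the assignment of Foo_History_Id differs, which is enough to change the checksum.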

10 Comments

Gregory Haase

17 years ago

I think your subject line might be a bit misleading. I mean, replication is technically behaving exactly as it should. If MVCC rules are correct and we do not have dirty reads, then from a client perspective, the order of events on the slave is more correct.

Depending on how the replica will be used, this might be fine. From a historical perspective, the order can’t truly conflict or you’d have gotten a deadlock in the above process. So historical analysis shouldn’t be broken. If your replica exists for emergency backup/restore, then having a few lines out of order shouldn’t cause any problems.

One obvious way around this is to find a way not to have an auto-incremented key on the history table. If you can make a unique, combined key as primary key, it should be the same regardless of the insert order. If that can’t be done, then consider skipping a primary key altogether – maybe the history table isn’t accessed enough for this to be a performance issue. If you can’t do either of those, then I would suggest that you disable the crc check on this table and just have mk-table-checksum do a row count.

Author

Ewen Fortune

17 years ago

Hi Gregory,

I admit I had problems inventing a title for this post, the truth is that there is slightly more to the story. I also saw some statements in the binary logs that were deleting rows on the value of the autoincrement column, therefore deleting different row data. You are correct in stating that replication is behaving as expected, however from both the point of view of the developers and the system architect things are a little different. They are both expecting the slave to contain the same data as the master.

Baron Schwartz

17 years ago

Good points all. Another major thing here is that the client doesn’t want to have any checks showing “orange lights” that are to be ignored, because that leads to bad things — people get numb to any sort of warnings that the slave has different data.

I think this table probably doesn’t need a primary key. I had pondered how to avoid the problem but there’s no candidate key at all in the data (Ewen didn’t show the real client data), so if there’s to be a PK it has to be a surrogate.

Gregory Haase

17 years ago

I guess this is one of those situations where not being able to see the full story can lead to all kinds of incorrect assumptions. For instance, when I saw there were deletions on the auto_increment value I immediately jumped to the conclusion that the application is deleting audit data, which would be a major major no no.

Dmitry Lenev

17 years ago

Curious what is engine of “Foo_History” table? If it is the same engine as for “Foo” table then I would say that you might have encountered a bug worth reporting…

Author

Ewen Fortune

17 years ago

Dmitry,

Both Tables in this case were InnoDB.

Jesper Krogh

17 years ago

I’m sorry, but can you clarify where OLD.Foo_History_ID comes from? Looking at your trigger definition, it looks like it comes from the table being updated, but from the result it looks like an auto_increment column on the history table.

If the former is the case, then they should be the same on the master and the slave.

If the latter is the case, then even without replication there might be a problem if the updated table also has an auto_increment column (which, as far as I can tell, is not an issue here), as I believe MySQL doesn’t support having two auto_increments in the same “statement” (including triggers).

Of course, one should remember that the exact same issue exists, if the master is reloaded using the binlog.

Author

Ewen Fortune

17 years ago

Hi Jesper,

It's the latter, but the restriction is one autoincrement per table, not per statement.

Dmitry Lenev

17 years ago

Hello Ewen!

I think that actually Jesper is right. In case of statement based replication there is also a restriction on number of auto-increment columns used by statement as whole.
Please see http://bugs.mysql.com/bug.php?id=19630. Using more than one auto-increment column in statement should be fine for row based or mixed mode replication though.

Best regards!

Jeremy

17 years ago

Very similar to what we are doing today, although order doesn’t matter since our application orders by timestamp. Giuseppe Maxia posted something recently on Revision Engine (http://www.ddengine.org/versioneng) and we started looking at this to gain logging without writing scripts to build triggers on every table.