EmergencyEMERGENCY? Get 24/7 Help Now!

Checking table definition consistency with mysqldiff

 | April 15, 2015 |  Posted In: MySQL

PREVIOUS POST
NEXT POST

Data inconsistencies in replication environments are a pretty common. There are lots of posts that explain how to fix those using pt-table-checksum and pt-table-sync. Usually we only care about the data but from time to time we receive this question in support:

How can I check the table definition consistency between servers?

Replication also allow us to have different table definition between master and slaves. For example, there are some cases that you need some indexes on slaves for querying purposes but are not really needed on the master. There are some other cases where those differences are just a mistake that needs to be fixed.

mysqldiff, included in Oracle’s MySQL Utilities, can help us to find those differences and get the information we need to fix those them. In this post I’m going to show you how to use it with an example.

Find table definition inconsistencies

mysqldiff allows us to find those inconsistencies checking the differences between the tables on the same server (different databases) or on different servers (also possible on different databases). In this example I’m going to search for differences in table definitions between two different servers, server1 and server2.

The command line is pretty simple. This is used to compare the tables on “test” database:

If the database name is different:

If the table name is different:

Now I want to check the table definition consistency between two servers. The database’s name is “employees”:

There are at least two differences. One in departments table and another one in employees table. The output is similar to diff. By default the tool stops after finding the first difference. That’s why we use –force, to tell the tool to continue checking all the tables.

It shows us that on departments the dept_name is varchar(40) on server1 and varchar(256) on server2. For “employees” table, it has a KEY (last_name, first_name) on the server2 that is not present on server1. Why is it taking server2 as a reference? Because of this line:

So, the changes shown on the diff are for server1. If you want server2 to be the one to be changed and server1 used as reference, then –changes-for=server2 would be needed.

In some cases the diff output is not really useful. We actually need a SQL query to do the changes on the server. We just need to add –difftype=sql to the command line:

As we can see, the tool is not perfect. There are two problems here:

1- On “departments table” it drops a UNIQUE key that is present in both servers only to add it again. Waste of time and resources.

2- On “employees” table it drops and recreate the PRIMARY KEY, again something that is not needed a all.

I have created a bug report but this also teaches us a good lesson. Don’t just copy and paste commands without first double checking it.

What mysqldiff runs under the hood?

Mostly queries on INFORMATION_SCHEMA. These are the ones used to check inconsistencies on departments:

As a summary, it checks partitions, row format, collation, constraints and so on.

Conclusion

There are different tools for different purposes. We can check the data consistency with pt-table-checkum/pt-table-sync but also the table definitions with mysqldiff.

PREVIOUS POST
NEXT POST
Miguel Angel Nieto

Miguel joined Percona in October 2011. He has worked as a System Administrator for a Free Software consultant and in the supporting area of the biggest hosting company in Spain. His current focus is improving MySQL and helping the community of Free Software to grow.

5 Comments

  • Any other way is taking the mysqldump –no-data and comparing the files with:

    diff –show-function-line=’^CREATE TABLE’ -I ‘^–‘ -U 1 server1.sql server2.sql

  • Regarding your method, Danblack, I found that doesn’t work when a table has the same indexes but in a different order, mysqldump will list them in a different order and so diff will consider them different, when really they should be considered equivalent.

  • Careful about writing off Index order an insignificant, it does matter for insert on duplicate key update statements when determining the first unique index.

    Thanks for pointing out mysqldiff however. I hadn’t used it before.

  • Hmm, I’m not sure I can see how it affects insert on duplicate key update statements. Do you mind elaborating?

  • @Flimm

    Here is a (admittedly contrived) example to show that the order of the unqiue constraints can change the end result of a set of DML.

    drop table if exists t1;
    create table t1 (
    a int,
    b int,
    c int,
    unique key a (a),
    unique key b (b));

    drop table if exists t2;
    create table t2 (
    a int,
    b int,
    c int,
    unique key b (b),
    unique key a (a));

    insert into t1 values (1,10,0) on duplicate key update c = a+b;
    insert into t1 values (2,11,0) on duplicate key update c = a+b;
    insert into t1 values (1,11,0) on duplicate key update c = a+b;

    insert into t2 values (1,10,0) on duplicate key update c = a+b;
    insert into t2 values (2,11,0) on duplicate key update c = a+b;
    insert into t2 values (1,11,0) on duplicate key update c = a+b;

    select * from t1;
    +——+——+——+
    | a | b | c |
    +——+——+——+
    | 1 | 10 | 11 |
    | 2 | 11 | 0 |
    +——+——+——+

    select * from t2;
    +——+——+——+
    | a | b | c |
    +——+——+——+
    | 1 | 10 | 0 |
    | 2 | 11 | 13 |
    +——+——+——+

Leave a Reply