Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

Checking table definition consistency with mysqldiff

April 15, 2015

Author

Miguel Angel Nieto

MySQL

Share this Post:

Data inconsistencies in replication environments are a pretty common. There are lots of posts that explain how to fix those using pt-table-checksum and pt-table-sync. Usually we only care about the data but from time to time we receive this question in support:

How can I check the table definition consistency between servers?

Replication also allow us to have different table definition between master and slaves. For example, there are some cases that you need some indexes on slaves for querying purposes but are not really needed on the master. There are some other cases where those differences are just a mistake that needs to be fixed.

mysqldiff, included in Oracle’s MySQL Utilities, can help us to find those differences and get the information we need to fix those them. In this post I’m going to show you how to use it with an example.

Find table definition inconsistencies

mysqldiff allows us to find those inconsistencies checking the differences between the tables on the same server (different databases) or on different servers (also possible on different databases). In this example I’m going to search for differences in table definitions between two different servers, server1 and server2.

The command line is pretty simple. This is used to compare the tables on “test” database:

mysqldiff --server1=user@host1 --server2=user@host2 test:test

1	mysqldiff --server1=user@host1 --server2=user@host2 test:test

If the database name is different:

mysqldiff --server1=user@host1 --server2=user@host2 testdb:anotherdb

1	mysqldiff --server1=user@host1 --server2=user@host2 testdb:anotherdb

If the table name is different:

mysqldiff --server1=user@host1 --server2=user@host2 testdb.table1:anotherdb.anothertable

1	mysqldiff --server1=user@host1 --server2=user@host2 testdb.table1:anotherdb.anothertable

Now I want to check the table definition consistency between two servers. The database’s name is “employees”:

# mysqldiff --force --server1=root:msandbox@127.0.0.1:21489 --server2=root:msandbox@127.0.0.1:21490 employees:employees
# WARNING: Using a password on the command line interface can be insecure.
# server1 on 127.0.0.1: ... connected.
# server2 on 127.0.0.1: ... connected.
# Comparing `employees` to `employees`                             [PASS]
# Comparing `employees`.`departments` to `employees`.`departments`   [FAIL]
# Object definitions differ. (--changes-for=server1)
#

--- `employees`.`departments`
+++ `employees`.`departments`
@@ -1,6 +1,6 @@
 CREATE TABLE `departments` (
   `dept_no` char(4) NOT NULL,
-  `dept_name` varchar(40) NOT NULL,
+  `dept_name` varchar(256) DEFAULT NULL,
   PRIMARY KEY (`dept_no`),
   UNIQUE KEY `dept_name` (`dept_name`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1
# Comparing `employees`.`dept_emp` to `employees`.`dept_emp`       [PASS]
# Comparing `employees`.`dept_manager` to `employees`.`dept_manager`   [PASS]
# Comparing `employees`.`employees` to `employees`.`employees`     [FAIL]
# Object definitions differ. (--changes-for=server1)
#

--- `employees`.`employees`
+++ `employees`.`employees`
@@ -5,5 +5,6 @@
   `last_name` varchar(16) NOT NULL,
   `gender` enum('M','F') NOT NULL,
   `hire_date` date NOT NULL,
-  PRIMARY KEY (`emp_no`)
+  PRIMARY KEY (`emp_no`),
+  KEY `last_name` (`last_name`,`first_name`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1
# Comparing `employees`.`salaries` to `employees`.`salaries`       [PASS]
# Comparing `employees`.`titles` to `employees`.`titles`           [PASS]
Compare failed. One or more differences found.

# mysqldiff --force --server1=root:[email protected]:21489 --server2=root:[email protected]:21490 employees:employees

# WARNING: Using a password on the command line interface can be insecure.

# server1 on 127.0.0.1: ... connected.

# server2 on 127.0.0.1: ... connected.

# Comparing `employees` to `employees` [PASS]

# Comparing `employees`.`departments` to `employees`.`departments` [FAIL]

# Object definitions differ. (--changes-for=server1)

--- `employees`.`departments`

+++ `employees`.`departments`

@@ -1,6 +1,6 @@

CREATE TABLE `departments` (

`dept_no` char(4) NOT NULL,

- `dept_name` varchar(40) NOT NULL,

+ `dept_name` varchar(256) DEFAULT NULL,

PRIMARY KEY (`dept_no`),

UNIQUE KEY `dept_name` (`dept_name`)

) ENGINE=InnoDB DEFAULT CHARSET=latin1

# Comparing `employees`.`dept_emp` to `employees`.`dept_emp` [PASS]

# Comparing `employees`.`dept_manager` to `employees`.`dept_manager` [PASS]

# Comparing `employees`.`employees` to `employees`.`employees` [FAIL]

# Object definitions differ. (--changes-for=server1)

--- `employees`.`employees`

+++ `employees`.`employees`

@@ -5,5 +5,6 @@

`last_name` varchar(16) NOT NULL,

`gender` enum('M','F') NOT NULL,

`hire_date` date NOT NULL,

- PRIMARY KEY (`emp_no`)

+ PRIMARY KEY (`emp_no`),

+ KEY `last_name` (`last_name`,`first_name`)

) ENGINE=InnoDB DEFAULT CHARSET=latin1

# Comparing `employees`.`salaries` to `employees`.`salaries` [PASS]

# Comparing `employees`.`titles` to `employees`.`titles` [PASS]

Compare failed. One or more differences found.

There are at least two differences. One in departments table and another one in employees table. The output is similar to diff. By default the tool stops after finding the first difference. That’s why we use –force, to tell the tool to continue checking all the tables.

It shows us that on departments the dept_name is varchar(40) on server1 and varchar(256) on server2. For “employees” table, it has a KEY (last_name, first_name) on the server2 that is not present on server1. Why is it taking server2 as a reference? Because of this line:

# Object definitions differ. (--changes-for=server1)

1	# Object definitions differ. (--changes-for=server1)

So, the changes shown on the diff are for server1. If you want server2 to be the one to be changed and server1 used as reference, then –changes-for=server2 would be needed.

In some cases the diff output is not really useful. We actually need a SQL query to do the changes on the server. We just need to add –difftype=sql to the command line:

# mysqldiff --force --difftype=sql --server1=root:msandbox@127.0.0.1:21489 --server2=root:msandbox@127.0.0.1:21490 employees:employees
[...]
# Comparing `employees`.`departments` to `employees`.`departments`   [FAIL]
# Transformation for --changes-for=server1:

ALTER TABLE `employees`.`departments`
  DROP INDEX dept_name,
  ADD UNIQUE INDEX dept_name (dept_name),
  CHANGE COLUMN dept_name dept_name varchar(256) NULL;

[...]
# Comparing `employees`.`employees` to `employees`.`employees`     [FAIL]
# Transformation for --changes-for=server1:
#

ALTER TABLE `employees`.`employees`
  DROP PRIMARY KEY,
  ADD PRIMARY KEY(`emp_no`),
  ADD INDEX last_name (last_name,first_name);

# mysqldiff --force --difftype=sql --server1=root:[email protected]:21489 --server2=root:[email protected]:21490 employees:employees

[...]

# Comparing `employees`.`departments` to `employees`.`departments` [FAIL]

# Transformation for --changes-for=server1:

ALTER TABLE `employees`.`departments`

DROP INDEX dept_name,

ADD UNIQUE INDEX dept_name (dept_name),

CHANGE COLUMN dept_name dept_name varchar(256) NULL;

[...]

# Comparing `employees`.`employees` to `employees`.`employees` [FAIL]

# Transformation for --changes-for=server1:

ALTER TABLE `employees`.`employees`

DROP PRIMARY KEY,

ADD PRIMARY KEY(`emp_no`),

ADD INDEX last_name (last_name,first_name);

As we can see, the tool is not perfect. There are two problems here:

1- On “departments table” it drops a UNIQUE key that is present in both servers only to add it again. Waste of time and resources.

2- On “employees” table it drops and recreate the PRIMARY KEY, again something that is not needed a all.

I have created a bug report but this also teaches us a good lesson. Don’t just copy and paste commands without first double checking it.

What mysqldiff runs under the hood?

Mostly queries on INFORMATION_SCHEMA. These are the ones used to check inconsistencies on departments:

SHOW CREATE TABLE `departments`;

SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH, CHECKSUM, TABLE_COLLATION, TABLE_COMMENT, ROW_FORMAT, CREATE_OPTIONS
  FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE,
         COLUMN_DEFAULT, EXTRA, COLUMN_COMMENT, COLUMN_KEY
  FROM INFORMATION_SCHEMA.COLUMNS
  WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT PARTITION_NAME, SUBPARTITION_NAME, PARTITION_ORDINAL_POSITION,
         SUBPARTITION_ORDINAL_POSITION, PARTITION_METHOD, SUBPARTITION_METHOD,
         PARTITION_EXPRESSION, SUBPARTITION_EXPRESSION, PARTITION_DESCRIPTION
  FROM INFORMATION_SCHEMA.PARTITIONS
  WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT CONSTRAINT_NAME, COLUMN_NAME, REFERENCED_TABLE_SCHEMA,
         REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME
  FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE
  WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments' AND
        REFERENCED_TABLE_SCHEMA IS NOT NULL;

SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH, CHECKSUM, TABLE_COLLATION, TABLE_COMMENT, ROW_FORMAT, CREATE_OPTIONS
  FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE,
         COLUMN_DEFAULT, EXTRA, COLUMN_COMMENT, COLUMN_KEY
  FROM INFORMATION_SCHEMA.COLUMNS
  WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT PARTITION_NAME, SUBPARTITION_NAME, PARTITION_ORDINAL_POSITION,
         SUBPARTITION_ORDINAL_POSITION, PARTITION_METHOD, SUBPARTITION_METHOD,
         PARTITION_EXPRESSION, SUBPARTITION_EXPRESSION, PARTITION_DESCRIPTION
  FROM INFORMATION_SCHEMA.PARTITIONS
  WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SHOW CREATE TABLE `departments`;

SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH, CHECKSUM, TABLE_COLLATION, TABLE_COMMENT, ROW_FORMAT, CREATE_OPTIONS

FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE,

COLUMN_DEFAULT, EXTRA, COLUMN_COMMENT, COLUMN_KEY

FROM INFORMATION_SCHEMA.COLUMNS

WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT PARTITION_NAME, SUBPARTITION_NAME, PARTITION_ORDINAL_POSITION,

SUBPARTITION_ORDINAL_POSITION, PARTITION_METHOD, SUBPARTITION_METHOD,

PARTITION_EXPRESSION, SUBPARTITION_EXPRESSION, PARTITION_DESCRIPTION

FROM INFORMATION_SCHEMA.PARTITIONS

WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT CONSTRAINT_NAME, COLUMN_NAME, REFERENCED_TABLE_SCHEMA,

REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME

FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE

WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments' AND

REFERENCED_TABLE_SCHEMA IS NOT NULL;

SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH, CHECKSUM, TABLE_COLLATION, TABLE_COMMENT, ROW_FORMAT, CREATE_OPTIONS

FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE,

COLUMN_DEFAULT, EXTRA, COLUMN_COMMENT, COLUMN_KEY

FROM INFORMATION_SCHEMA.COLUMNS

WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT PARTITION_NAME, SUBPARTITION_NAME, PARTITION_ORDINAL_POSITION,

SUBPARTITION_ORDINAL_POSITION, PARTITION_METHOD, SUBPARTITION_METHOD,

PARTITION_EXPRESSION, SUBPARTITION_EXPRESSION, PARTITION_DESCRIPTION

FROM INFORMATION_SCHEMA.PARTITIONS

WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

As a summary, it checks partitions, row format, collation, constraints and so on.

Conclusion

There are different tools for different purposes. We can check the data consistency with pt-table-checkum/pt-table-sync but also the table definitions with mysqldiff.

0 0 votes

Article Rating

5 Comments

Oldest

Newest Most Voted

danblack

11 years ago

Any other way is taking the mysqldump –no-data and comparing the files with:

diff –show-function-line=’^CREATE TABLE’ -I ‘^–‘ -U 1 server1.sql server2.sql

Flimm

11 years ago

Regarding your method, Danblack, I found that doesn’t work when a table has the same indexes but in a different order, mysqldump will list them in a different order and so diff will consider them different, when really they should be considered equivalent.

Daniel Black

11 years ago

Careful about writing off Index order an insignificant, it does matter for insert on duplicate key update statements when determining the first unique index.

Thanks for pointing out mysqldiff however. I hadn’t used it before.

Flimm

11 years ago

Hmm, I’m not sure I can see how it affects insert on duplicate key update statements. Do you mind elaborating?

MLBR

9 years ago

‘@Flimm

Here is a (admittedly contrived) example to show that the order of the unqiue constraints can change the end result of a set of DML.

drop table if exists t1;
create table t1 (
a int,
b int,
c int,
unique key a (a),
unique key b (b));

drop table if exists t2;
create table t2 (
a int,
b int,
c int,
unique key b (b),
unique key a (a));

insert into t1 values (1,10,0) on duplicate key update c = a+b;
insert into t1 values (2,11,0) on duplicate key update c = a+b;
insert into t1 values (1,11,0) on duplicate key update c = a+b;

insert into t2 values (1,10,0) on duplicate key update c = a+b;
insert into t2 values (2,11,0) on duplicate key update c = a+b;
insert into t2 values (1,11,0) on duplicate key update c = a+b;

select * from t1;
+——+——+——+
| a | b | c |
+——+——+——+
| 1 | 10 | 11 |
| 2 | 11 | 0 |
+——+——+——+

select * from t2;
+——+——+——+
| a | b | c |
+——+——+——+
| 1 | 10 | 0 |
| 2 | 11 | 13 |
+——+——+——+