]]>
]]>

You are here

Latest MySQL Performance Blog posts

Subscribe to Latest MySQL Performance Blog posts feed
Percona's MySQL & InnoDB performance and scalability blog
Updated: 1 hour 6 min ago

Checking table definition consistency with mysqldiff

April 15, 2015 - 1:45pm

Data inconsistencies in replication environments are a pretty common. There are lots of posts that explain how to fix those using pt-table-checksum and pt-table-sync. Usually we only care about the data but from time to time we receive this question in support:

How can I check the table definition consistency between servers?

Replication also allow us to have different table definition between master and slaves. For example, there are some cases that you need some indexes on slaves for querying purposes but are not really needed on the master. There are some other cases where those differences are just a mistake that needs to be fixed.

mysqldiff, included in Oracle’s MySQL Utilities, can help us to find those differences and get the information we need to fix those them. In this post I’m going to show you how to use it with an example.

Find table definition inconsistencies

mysqldiff allows us to find those inconsistencies checking the differences between the tables on the same server (different databases) or on different servers (also possible on different databases). In this example I’m going to search for differences in table definitions between two different servers, server1 and server2.

The command line is pretty simple. This is used to compare the tables on “test” database:

mysqldiff --server1=user@host1 --server2=user@host2 test:test

If the database name is different:

mysqldiff --server1=user@host1 --server2=user@host2 testdb:anotherdb

If the table name is different:

mysqldiff --server1=user@host1 --server2=user@host2 testdb.table1:anotherdb.anothertable

Now I want to check the table definition consistency between two servers. The database’s name is “employees”:

# mysqldiff --force --server1=root:msandbox@127.0.0.1:21489 --server2=root:msandbox@127.0.0.1:21490 employees:employees # WARNING: Using a password on the command line interface can be insecure. # server1 on 127.0.0.1: ... connected. # server2 on 127.0.0.1: ... connected. # Comparing `employees` to `employees` [PASS] # Comparing `employees`.`departments` to `employees`.`departments` [FAIL] # Object definitions differ. (--changes-for=server1) # --- `employees`.`departments` +++ `employees`.`departments` @@ -1,6 +1,6 @@ CREATE TABLE `departments` ( `dept_no` char(4) NOT NULL, - `dept_name` varchar(40) NOT NULL, + `dept_name` varchar(256) DEFAULT NULL, PRIMARY KEY (`dept_no`), UNIQUE KEY `dept_name` (`dept_name`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 # Comparing `employees`.`dept_emp` to `employees`.`dept_emp` [PASS] # Comparing `employees`.`dept_manager` to `employees`.`dept_manager` [PASS] # Comparing `employees`.`employees` to `employees`.`employees` [FAIL] # Object definitions differ. (--changes-for=server1) # --- `employees`.`employees` +++ `employees`.`employees` @@ -5,5 +5,6 @@ `last_name` varchar(16) NOT NULL, `gender` enum('M','F') NOT NULL, `hire_date` date NOT NULL, - PRIMARY KEY (`emp_no`) + PRIMARY KEY (`emp_no`), + KEY `last_name` (`last_name`,`first_name`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 # Comparing `employees`.`salaries` to `employees`.`salaries` [PASS] # Comparing `employees`.`titles` to `employees`.`titles` [PASS] Compare failed. One or more differences found.

There are at least two differences. One in departments table and another one in employees table. The output is similar to diff. By default the tool stops after finding the first difference. That’s why we use –force, to tell the tool to continue checking all the tables.

It shows us that on departments the dept_name is varchar(40) on server1 and varchar(256) on server2. For “employees” table, it has a KEY (last_name, first_name) on the server2 that is not present on server1. Why is it taking server2 as a reference? Because of this line:

# Object definitions differ. (--changes-for=server1)

So, the changes shown on the diff are for server1. If you want server2 to be the one to be changed and server1 used as reference, then –changes-for=server2 would be needed.

In some cases the diff output is not really useful. We actually need a SQL query to do the changes on the server. We just need to add –difftype=sql to the command line:

# mysqldiff --force --difftype=sql --server1=root:msandbox@127.0.0.1:21489 --server2=root:msandbox@127.0.0.1:21490 employees:employees [...] # Comparing `employees`.`departments` to `employees`.`departments` [FAIL] # Transformation for --changes-for=server1: ALTER TABLE `employees`.`departments` DROP INDEX dept_name, ADD UNIQUE INDEX dept_name (dept_name), CHANGE COLUMN dept_name dept_name varchar(256) NULL; [...] # Comparing `employees`.`employees` to `employees`.`employees` [FAIL] # Transformation for --changes-for=server1: # ALTER TABLE `employees`.`employees` DROP PRIMARY KEY, ADD PRIMARY KEY(`emp_no`), ADD INDEX last_name (last_name,first_name);

As we can see, the tool is not perfect. There are two problems here:

1- On “departments table” it drops a UNIQUE key that is present in both servers only to add it again. Waste of time and resources.

2- On “employees” table it drops and recreate the PRIMARY KEY, again something that is not needed a all.

I have created a bug report but this also teaches us a good lesson. Don’t just copy and paste commands without first double checking it.

What mysqldiff runs under the hood?

Mostly queries on INFORMATION_SCHEMA. These are the ones used to check inconsistencies on departments:

SHOW CREATE TABLE `departments`; SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH, CHECKSUM, TABLE_COLLATION, TABLE_COMMENT, ROW_FORMAT, CREATE_OPTIONS FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments'; SELECT ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE, COLUMN_DEFAULT, EXTRA, COLUMN_COMMENT, COLUMN_KEY FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments'; SELECT PARTITION_NAME, SUBPARTITION_NAME, PARTITION_ORDINAL_POSITION, SUBPARTITION_ORDINAL_POSITION, PARTITION_METHOD, SUBPARTITION_METHOD, PARTITION_EXPRESSION, SUBPARTITION_EXPRESSION, PARTITION_DESCRIPTION FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments'; SELECT CONSTRAINT_NAME, COLUMN_NAME, REFERENCED_TABLE_SCHEMA, REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments' AND REFERENCED_TABLE_SCHEMA IS NOT NULL; SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH, CHECKSUM, TABLE_COLLATION, TABLE_COMMENT, ROW_FORMAT, CREATE_OPTIONS FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments'; SELECT ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE, COLUMN_DEFAULT, EXTRA, COLUMN_COMMENT, COLUMN_KEY FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments'; SELECT PARTITION_NAME, SUBPARTITION_NAME, PARTITION_ORDINAL_POSITION, SUBPARTITION_ORDINAL_POSITION, PARTITION_METHOD, SUBPARTITION_METHOD, PARTITION_EXPRESSION, SUBPARTITION_EXPRESSION, PARTITION_DESCRIPTION FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

As a summary, it checks partitions, row format, collation, constraints and so on.

Conclusion

There are different tools for different purposes. We can check the data consistency with pt-table-checkum/pt-table-sync but also the table definitions with mysqldiff.

The post Checking table definition consistency with mysqldiff appeared first on MySQL Performance Blog.

Team Tokutek is proud to join Team Percona!

April 14, 2015 - 11:18am

If you haven’t already heard, on the Tuesday morning of the 2015 Percona Live MySQL Conference and Expo it was announced that Tokutek is now part of the Percona family.  This means TokuDB® for MySQL, and TokuMX™ for MongoDB are Percona products now; and that the Tokutek  team is now part of the Percona team.

Percona’s well-deserved reputation for unparalleled customer service and support in the MySQL market makes them the perfect home for Tokutek’s ground-breaking products.  And with the Tokutek acquisition, Percona can expand and extend their activities and offerings into the MongoDB market.

This is a win/win for NoSQL and MySQL fans alike.

More About Tokutek

Tokutek is the company that productized a new and revolutionary form of database indexing designed specifically for modern, Big Data applications.  Based on data science research on new methods for high-performance data processing for data sets that no longer fit in memory, Fractal Tree® indexing is the secret sauce inside TokuDB and TokuMX.

Unlike the 40-year-old B-tree indexing found in other MySQL and MongoDB solutions, Fractal Tree indexing enables: up to 50x better performance; as much as 90% data compression; and 95% better write-optimization.  That translates into significant customer satisfaction gains as well as major cost savings.

In addition, drawing upon their experience in the MySQL world, Tokutek developers introduced full ACID and MVCC transaction compliance, better concurrency, and an improved failover protocol to the MongoDB marketplace with TokuMX. And that means better reliability for mission-critical big data applications built with MongoDB.

Next Steps

The Tokutek team is very excited to be joining the Percona team as we move into the next phase of growth on the MySQL and NoSQL market.

For now, if you want to learn more about TokuDB and TokuMX please visit www.tokutek.com.  (In the coming weeks, the Tokutek site will be folded into the Percona site.)

If you want to strike up a conversation about enterprise subscriptions for either product drop us a line at tokutek@percona.com.

Regards,
Craig Clark
Vice President, Percona Sales

The post Team Tokutek is proud to join Team Percona! appeared first on MySQL Performance Blog.

Team Tokutek is proud to join Team Percona!

April 14, 2015 - 11:18am

If you haven’t already heard, on the Tuesday morning of the 2015 Percona Live MySQL Conference and Expo it was announced that Tokutek is now part of the Percona family.  This means TokuDB® for MySQL, and TokuMX™ for MongoDB are Percona products now; and that the Tokutek  team is now part of the Percona team.

Percona’s well-deserved reputation for unparalleled customer service and support in the MySQL market makes them the perfect home for Tokutek’s ground-breaking products.  And with the Tokutek acquisition, Percona can expand and extend their activities and offerings into the MongoDB market.

This is a win/win for NoSQL and MySQL fans alike.

More About Tokutek

Tokutek is the company that productized a new and revolutionary form of database indexing designed specifically for modern, Big Data applications.  Based on data science research on new methods for high-performance data processing for data sets that no longer fit in memory, Fractal Tree® indexing is the secret sauce inside TokuDB and TokuMX.

Unlike the 40-year-old B-tree indexing found in other MySQL and MongoDB solutions, Fractal Tree indexing enables: up to 50x better performance; as much as 90% data compression; and 95% better write-optimization.  That translates into significant customer satisfaction gains as well as major cost savings.

In addition, drawing upon their experience in the MySQL world, Tokutek developers introduced full ACID and MVCC transaction compliance, better concurrency, and an improved failover protocol to the MongoDB marketplace with TokuMX. And that means better reliability for mission-critical big data applications built with MongoDB.

Next Steps

The Tokutek team is very excited to be joining the Percona team as we move into the next phase of growth on the MySQL and NoSQL market.

For now, if you want to learn more about TokuDB and TokuMX please visit www.tokutek.com.  (In the coming weeks, the Tokutek site will be folded into the Percona site.)

If you want to strike up a conversation about enterprise subscriptions for either product drop us a line at tokutek@percona.com.

Regards,
Craig Clark
Vice President, Percona Sales

The post Team Tokutek is proud to join Team Percona! appeared first on MySQL Performance Blog.

Tokutek now part of the Percona family

April 14, 2015 - 11:13am

It is my pleasure to announce that Percona has acquired Tokutek and will take over development and support for TokuDB® and TokuMX™ as well as the revolutionary Fractal Tree® indexing technology that enables those products to deliver improved performance, reliability and compression for modern Big Data applications.

At Percona we have been working with the Tokutek team since 2009, helping to improve performance and scalability. The TokuDB storage engine has been available for Percona Server for about a year, so joining forces is quite a natural step for us.

Fractal Tree indexing technology—developed by years of data science research at MIT, Stony Brook University and Rutgers University—is the new generation data structure which, for many workloads, leapfrogs traditional B-tree technology which was invented in 1972 (over 40 years ago!).  It is also often superior to LSM indexing, especially for mixed workloads.

But as we all know in software engineering, an idea alone is not enough.  There are hundreds of databases which have data structures based on essentially the same B-Tree idea, but their performance and scalability differs dramatically. The Tokutek engineering team has spent more than 50 man years designing, implementing and polishing this technology, which resulted  (in my opinion) in the only production-ready Open Source transactional alternative to the InnoDB storage engine in the MySQL space – TokuDB; and the only viable alternative distribution of MongoDB  – TokuMX.

Designed for Modern World –  TokuDB and TokuMX were designed keeping in mind modern database workloads, modern hardware and modern operating system properties which allowed for much more clean and scalable architecture, leading to great performance and scalability.

Compression at Speed  – As part of it, compression was an early part of design, so a very high level of compression can be achieved with low performance overhead. In fact, chances are with fast compression you will get better performance with compression enabled.

Great Read/Write Balance  – You find databases (or storage engines) are often classified into read optimized and write optimized, and even though you most likely heard about much better insert speed with Fractal Tree indexing, both for MySQL and MongoDB  you may not know that this is achieved with Read performance being in the same ballpark or better for many workloads. The difference is just not so drastic.

Multiple Clustered Keys  –  This is a great feature, which together with compression and low cost index maintenance, allows  TokuDB and TokuMX to reach much better performance for performance critical queries by clustering the data needed by such query together.

Messages    – When we’re speaking about conventional data structure such as B-trees or Hash tables, it is essentially a way data is stored and operations are being performed in it.  Fractal Tree indexing operates with a different paradigm which is focused around “Messages” being delivered towards the data to perform operations in questions.  This allows it to do a lot of clever stuff, such as implement more complex operations with the same message,  merge multiple messages together to optimize performance and use messages for internal purposes such as low overhead online optimization, table structure changes etc.

Low Overhead Maintenance  –  One of obvious uses of such Messages is  Low Overhead Maintenance.  The InnoDB storage engine allows you to add column “online,” which internally requires a full table rebuild, requiring a lot of time and resources for copy of the table.  TokuDB however, can use “broadcast message” to add the column which will become available almost immediately and will gradually physically propagate when data is modified. It is quite a difference!

Smart No-Read Updates –  Messages allow you to do smart complex updates without reading the data, dramatically improving performance.  For example this is used to implement “Read Free Replication”

Optimized In Memory Data Structures –  You may have heard a lot about in-memory databases, which are faster because they are using data structure optimized for properties on memory rather just caching the pages from disk, as, for example,  MyISAM and InnoDB do.   TokuDB and  TokuMX offer you the best of both worlds  by using memory optimized data structures for resident data and disk optimized data structures when data is pushed to disk.

Optimized IO  –  Whether you’re using legacy spinning media or Solid State Storage you will appreciate TokuDB having optimized IO – doing less and more sequential IO which helps spinning media performance, as well as dramatically reducing wear on flash, so you can improve longevity for your media or use lower cost storage.

Between the Tokutek engineering team and Percona we have a lot of ideas on how to take this technology even further, so it is the technology of choice for large portions of modern database workloads in the MySQL and MongoDB space. We are committed to working together to advance the limits of Open Source databases (relational or not)!

Interested to check out whether TokuDB or TokuMX is right for your application? Please contact us at tokutek@percona.com.

The post Tokutek now part of the Percona family appeared first on MySQL Performance Blog.

Tokutek now part of the Percona family

April 14, 2015 - 11:13am

It is my pleasure to announce that Percona has acquired Tokutek and will take over development and support for TokuDB® and TokuMX™ as well as the revolutionary Fractal Tree® indexing technology that enables those products to deliver improved performance, reliability and compression for modern Big Data applications.

At Percona we have been working with the Tokutek team since 2009, helping to improve performance and scalability. The TokuDB storage engine has been available for Percona Server for about a year, so joining forces is quite a natural step for us.

Fractal Tree indexing technology—developed by years of data science research at MIT, Stony Brook University and Rutgers University—is the new generation data structure which, for many workloads, leapfrogs traditional B-tree technology which was invented in 1972 (over 40 years ago!).  It is also often superior to LSM indexing, especially for mixed workloads.

But as we all know in software engineering, an idea alone is not enough.  There are hundreds of databases which have data structures based on essentially the same B-Tree idea, but their performance and scalability differs dramatically. The Tokutek engineering team has spent more than 50 man years designing, implementing and polishing this technology, which resulted  (in my opinion) in the only production-ready Open Source transactional alternative to the InnoDB storage engine in the MySQL space – TokuDB; and the only viable alternative distribution of MongoDB  – TokuMX.

Designed for Modern World –  TokuDB and TokuMX were designed keeping in mind modern database workloads, modern hardware and modern operating system properties which allowed for much more clean and scalable architecture, leading to great performance and scalability.

Compression at Speed  – As part of it, compression was an early part of design, so a very high level of compression can be achieved with low performance overhead. In fact, chances are with fast compression you will get better performance with compression enabled.

Great Read/Write Balance  – You find databases (or storage engines) are often classified into read optimized and write optimized, and even though you most likely heard about much better insert speed with Fractal Tree indexing, both for MySQL and MongoDB  you may not know that this is achieved with Read performance being in the same ballpark or better for many workloads. The difference is just not so drastic.

Multiple Clustered Keys  –  This is a great feature, which together with compression and low cost index maintenance, allows  TokuDB and TokuMX to reach much better performance for performance critical queries by clustering the data needed by such query together.

Messages    – When we’re speaking about conventional data structure such as B-trees or Hash tables, it is essentially a way data is stored and operations are being performed in it.  Fractal Tree indexing operates with a different paradigm which is focused around “Messages” being delivered towards the data to perform operations in questions.  This allows it to do a lot of clever stuff, such as implement more complex operations with the same message,  merge multiple messages together to optimize performance and use messages for internal purposes such as low overhead online optimization, table structure changes etc.

Low Overhead Maintenance  –  One of obvious uses of such Messages is  Low Overhead Maintenance.  The InnoDB storage engine allows you to add column “online,” which internally requires a full table rebuild, requiring a lot of time and resources for copy of the table.  TokuDB however, can use “broadcast message” to add the column which will become available almost immediately and will gradually physically propagate when data is modified. It is quite a difference!

Smart No-Read Updates –  Messages allow you to do smart complex updates without reading the data, dramatically improving performance.  For example this is used to implement “Read Free Replication”

Optimized In Memory Data Structures –  You may have heard a lot about in-memory databases, which are faster because they are using data structure optimized for properties on memory rather just caching the pages from disk, as, for example,  MyISAM and InnoDB do.   TokuDB and  TokuMX offer you the best of both worlds  by using memory optimized data structures for resident data and disk optimized data structures when data is pushed to disk.

Optimized IO  –  Whether you’re using legacy spinning media or Solid State Storage you will appreciate TokuDB having optimized IO – doing less and more sequential IO which helps spinning media performance, as well as dramatically reducing wear on flash, so you can improve longevity for your media or use lower cost storage.

Between the Tokutek engineering team and Percona we have a lot of ideas on how to take this technology even further, so it is the technology of choice for large portions of modern database workloads in the MySQL and MongoDB space. We are committed to working together to advance the limits of Open Source databases (relational or not)!

Interested to check out whether TokuDB or TokuMX is right for your application? Please contact us at tokutek@percona.com.

The post Tokutek now part of the Percona family appeared first on MySQL Performance Blog.

Percona Toolkit 2.2.14 is now available

April 14, 2015 - 9:33am

Percona is pleased to announce the availability of Percona Toolkit 2.2.14.  Released April 14, 2015. Percona Toolkit is a collection of advanced command-line tools to perform a variety of MySQL server and system tasks that are too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release is the current GA (Generally Available) stable release in the 2.2 series. It includes multiple bug fixes for pt-table-checksum with better support for Percona XtraDB Cluster, various other fixes, as well as continued preparation for MySQL 5.7 compatibility. Full details are below. Downloads are available here and from the Percona Software Repositories.

New Features:

  • pt-slave-find can now resolve the IP address and show the slave’s hostname. This can be done with the new --resolve-address option.
  • pt-table-sync can now ignore the tables whose names match a specific Perl regex with the new --ignore-tables-regex option.

Bugs Fixed:

  • Fixed bug 925781: Inserting non-BMP characters into a column with utf8 charset would cause the Incorrect string value error when running the pt-table-checksum.
  • Fixed bug 1368244: pt-online-schema-change --alter-foreign-keys-method=drop-swap` was not atomic and thus it could be interrupted. Fixed by disabling common interrupt signals during the critical drop-rename phase.
  • Fixed bug 1381280: pt-table-checksum was failing on BINARY field in Primary Key. Fixed by implementing new --binary-index flag to optionally create checksum table using BLOB data type.
  • Fixed bug 1421405: Running pt-upgrade against a log with many identical (or similar) queries was producing repeated sections with the same fingerprint.
  • Fixed bug 1402730: pt-duplicate-key-checker was not checking for duplicate keys when --verbose option was set.
  • Fixed bug 1406390: A race condition was causing pt-heartbeat to crash with sleep argument error.
  • Fixed bug 1417558: pt-stalk when used along with --collect-strace didn’t write the strace output to the expected destination file.
  • Fixed bug 1421025: Missing dependency for perl-TermReadKey RPM package was causing toolkit commands to fail when they were run with --ask-pass option.
  • Fixed bug 1421781: pt-upgrade would fail when log contained SELECT...INTO queries. Fixed by ignoring/skipping those queries.
  • Fixed bug 1425478: pt-stalk was removing non-empty files that were starting with an empty line.
  • Fixed bug 1419098: Fixed bad formatting in the pt-table-checksum documentation.

Details of the release can be found in the release notes and the 2.2.14 milestone at Launchpad. Bugs can be reported on the Percona Toolkit launchpad bug tracker.

The post Percona Toolkit 2.2.14 is now available appeared first on MySQL Performance Blog.

Staying ahead of MySQL operational problems at Percona Live

April 13, 2015 - 7:35am

I’ve started my long journey from Florianópolis, Brazil, to Santa Clara, California and I type this words while waiting for a connection flight. Next Wednesday, Daniel Guzmán Burgos and I will be presenting in the Percona Live MySQL Conference and Expo (PLMCE).

I’m so excited with the new MySQL 101 program that has been added to this year’s event! Along the years I’ve been working as a Support Engineer at Percona I’ve heard two very distinct types of comments amongst others from some people, customers and community in general, about PLMCE:

1) That they went and it was awesome but they found it hard to follow as most of the contents in the program were high level for them;

2) that they wanted to attend the event but just didn’t felt experienced enough with MySQL just yet, so they would wait a year or two before committing to it.

Sometimes I’ve also been asked: “Don’t you guys organize a similar conference, but for beginners ?” Not exactly, I’d tell them, pointing them to the on-demand training that Percona offers, but this isn’t always a good fit for everybody.

Well, I’m glad that this request was reconsidered this year and that we now have an intensive MySQL 101 2-day program: “You send us developers and admins, and we’ll send you back MySQL DBAs.

Daniel and I will be talking about the use of a few key tools of the Percona Toolkit from the viewpoint of day-to-day operations of a MySQL DBA – we’ll showcase how some of them can be integrated to Nagios to improve monitoring. The goal: “stay ahead of MySQL operational problems,“which is also the base title of our talk.

Percona Toolkit is a rich collection of more than 30 command-line tools for MySQL, Percona Server and MariaDB that can help database administrators perform and automate a variety of database and system tasks. Some of the tools focus on diagnostic, some on performance improvement and some others are actually used to fix things. I just love them; they make my day-to-day work with MySQL that much easier!

I hope to see you in Santa Clara this week!

The post Staying ahead of MySQL operational problems at Percona Live appeared first on MySQL Performance Blog.

Measuring the impact of tcpdump on Very Busy Hosts

April 10, 2015 - 7:19am

A few years back Deva wrote about how to use tcpdump on very busy hosts. That post sparked my interest about exploring how to measure the impact of tcpdump on very busy hosts. In this post, I wanted to highlight how much of an impact there really is and what options you have to make the query collection much more effective.

Some things you need to know:

  • The test is a sysbench read-only workload, 8 tables, 8 threads, 1000000 rows each with 16G of buffer pool. Dataset fully in memory.
  • sysbench is ran on the same host, on 1Gbps connection, sysbench can saturate the network and therefore affect my network test with netcat so I decided to run locally.
  • There are 13 tests, 5 minutes each with 1 minute interval, varying on how the dump file is captured.
    • First one as baseline is the MySQL slow query log.A:mysql -e 'set global long_query_time=0, slow_query_log=1; select sleep(300); set global long_query_time=1, slow_query_log=0;'
    • Second group is tcpdump with -w option, which means tcpdump itself is writing to the capture file.B:$DUMPCMD -i any -G 300 -W 1 -Z root -w tcpdump.pcap port 3306C:$DUMPCMD -i any -G 300 -W 1 -Z root -w tcpdump.pcap 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2'D:$DUMPCMD -i any -G 300 -W 1 -Z root -w tcpdump.pcap 'port 3306 and ( tcp[1] & 7 == 2 or tcp[3] & 7 == 2 )'
    • Third group, is using “packet-buffered” (-U option) to see if there will be improvement on response time.E:$DUMPCMD -i any -G 300 -W 1 -Z root -U -w tcpdump.pcap port 3306F:$DUMPCMD -i any -G 300 -W 1 -Z root -U -w tcpdump.pcap 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2'G:$DUMPCMD -i any -G 300 -W 1 -Z root -U -w tcpdump.pcap 'port 3306 and ( tcp[1] & 7 == 2 or tcp[3] & 7 == 2 )'
    • Next streams the backup to a remote location via netcat.H:$DUMPCMD -i any -G 300 -W 1 -Z root -w - port 3306 | nc remote_ip 33061I:$DUMPCMD -i any -G 300 -W 1 -Z root -U -w - port 3306 | nc remote_ip 33062J:$DUMPCMD -i any -G 300 -W 1 -Z root -U -w - 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2' | nc remote_ip 33063
    • The last group, the one most of us are probably accustomed with is piping the dumped packets to file.K:timeout -s KILL 300 $DUMPCMD -i any port 3306 > tcpdump.pcapL:timeout -s KILL 300 $DUMPCMD -i any 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2' > tcpdump.pcapM:timeout -s KILL 300 $DUMPCMD -i any 'port 3306 and ( tcp[1] & 7 == 2 or tcp[3] & 7 == 2 )' > tcpdump.pcap
    • $DUMPCMD  is defined astcpdump -s 65535 -x -nn -q -tttt
  • On each group there is an AND and OR variation in port filtering. I wanted to see whether how much of additional impact port expressions have. And as you will see below, they do not have significant impact on performance, but on number of queries captured.

I’ve graphed the sysbench data during the test and labeled each test for easy comparison on the graphs.

Of course, I’ve also compared the size of resulting tcpdump capture and total queries identified when run through pt-query-digest.Findings
  • We see that, piping the pcap (K, L, M) data as decoded packets has significant overhead in terms of number of captured queries, response time and reads requests completed.
  • Using the slow log has about 30% overhead in response time, nearly 20% drop in throughput but have highest number of queries captured.
  • Writing captured packets directly to binary file using the -w option has the lowest overhead in response time, around 10%. Throughput drops depending on how much filtering is involved though while also there are noticeable stalls when the operating system flushes the page cache. This side effect causes sysbench to drop to 0 reads or even reach response times of several seconds!
  • Streaming packets to a capable remote server in terms of network bandwidth, IO performance combined with -w option to capture binary data produces 20-25% overhead in response time, 10-15% drop in throughput, no stalls and number of queries captured as close to slow query log.
Summary

Use tcpdump -w option in all cases and decode later. If you are looking for an overall view of ALL your queries, streaming tcpdump data to remote is also ideal. If you have low bandwidth though i.e. 100Mbps, this might not be enough as 5mins of binary tcpdump data produced 31G of file. That is 105MBps requirement! In which case, consider writing to a separate partition with enough IO.

If you are using Percona Server or MariaDB and is only looking to capture a portion of your workload i.e. table scans, temp table on disk or rate limit the collection, the extended slow query logging capability with this versions are also an excellent way to capture the data you need.

The post Measuring the impact of tcpdump on Very Busy Hosts appeared first on MySQL Performance Blog.

InnoDB locks and deadlocks with or without index for different isolation level

April 9, 2015 - 11:27am

Recently, I was working on one of the issue related to locks and deadlocks with InnoDB tables and I found very interesting details about how InnoDB locks and deadlocks works with or without index for different Isolation levels.

Here, I would like to describe a small test case about how SELECT ..FOR UPDATE (with and without limit) behave with INSERT/UPDATE and with READ-COMMITED and REPEATABLE-READ Isolation levels. I’m creating a small table data_col with few records. Initially, this test case was written by Bill Karwin to explain details to customer, but here I have used a bit modified test case.

CREATE TABLE data_col (dataname VARCHAR(10), period INT, expires DATE, host VARCHAR(10));

INSERT INTO data_col VALUES (‘med1′, 1,’2014-01-01 00:00:00′, ‘server1′);
INSERT INTO data_col VALUES (‘med2′, 1,’2014-02-15 00:00:00′, ‘server2′);
INSERT INTO data_col VALUES (‘med3′, 1,’2014-03-20 00:00:00′, ‘server3′);
INSERT INTO data_col VALUES (‘med4′, 1,’2014-04-10 00:00:00′, ‘server4′);
INSERT INTO data_col VALUES (‘med5′, 1,’2014-05-01 00:00:00′, ‘server5′);

Case 1: No index on expires, tx_isolation=READ-COMMITTED.

Session 1: SET tx_isolation=’READ-COMMITTED'; START TRANSACTION;
Session 2: SET tx_isolation=’READ-COMMITTED'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < ‘2014-03-01′ ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES (‘med6′, 1,’2014-06-03 00:00:00′, ‘server6′); /* success */
Session 1: SELECT * FROM data_col WHERE expires < ‘2014-07-01′ ORDER BY expires LIMIT 1 FOR UPDATE; /* hangs */
Session 2: SELECT * FROM data_col WHERE expires < ‘2014-07-01′ ORDER BY expires LIMIT 1 FOR UPDATE; /* deadlocks */

------------------------ LATEST DETECTED DEADLOCK ------------------------ 2015-04-04 12:35:45 7f3f0a084700 *** (1) TRANSACTION: TRANSACTION 28960, ACTIVE 24 sec fetching rows mysql tables in use 1, locked 1 LOCK WAIT 3 lock struct(s), heap size 360, 7 row lock(s) MySQL thread id 39, OS thread handle 0x7f3f0a0b5700, query id 158 localhost root Creating sort index SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 14 page no 3 n bits 80 index `GEN_CLUST_INDEX` of table `nil`.`data_col` trx id 28960 lock_mode X locks rec but not gap waiting *** (2) TRANSACTION: TRANSACTION 28961, ACTIVE 17 sec starting index read mysql tables in use 1, locked 1 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1 MySQL thread id 40, OS thread handle 0x7f3f0a084700, query id 159 localhost root Creating sort index SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 14 page no 3 n bits 80 index `GEN_CLUST_INDEX` of table `nil`.`data_col` trx id 28961 lock_mode X locks rec but not gap *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 14 page no 3 n bits 80 index `GEN_CLUST_INDEX` of table `nil`.`data_col` trx id 28961 lock_mode X locks rec but not gap waiting *** WE ROLL BACK TRANSACTION (1)

With READ-COMMITTED, even If Session 1 locks records with condition “expires < ‘2014-03-01′ “, Session 2 can Insert the record as Session 1 is not using gap lock (lock_mode X locks rec but not gap waiting) and we can insert/update the records outside of set of Session 1 examined. But when Session 1 tried to acquire locks on higher range (“expires < ‘2014-07-01′ “), it will be hanged and if we do the same thing from Session 2, it will turn to deadlock.

Here, When there is no primary key, InnoDB table will create it’s own cluster index, which is GEN_CLUST_INDEX.

Case 2: No index on expires, tx_isolation=REPEATABLE-READ.

Session 1: SET tx_isolation=’REPEATABLE-READ'; START TRANSACTION;
Session 2: SET tx_isolation=’REPEATABLE-READ'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < ‘2014-03-01′ ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES (‘med6′, 1,’2014-06-03 00:00:00′, ‘server6′); /* hangs */

------------ TRANSACTIONS ------------ Trx id counter 29502 Purge done for trx's n:o < 29500 undo n:o < 0 state: running but idle History list length 86 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 0, not started MySQL thread id 41, OS thread handle 0x7f2e901f5700, query id 175 localhost root init show engine innodb status ---TRANSACTION 29501, ACTIVE 4 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s) MySQL thread id 40, OS thread handle 0x7f2e90226700, query id 174 localhost root update INSERT INTO data_col VALUES ('med6', 1,'2014-06-03 00:00:00', 'server6') ------- TRX HAS BEEN WAITING 4 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 16 page no 3 n bits 80 index `GEN_CLUST_INDEX` of table `nil`.`data_col` trx id 29501 lock_mode X insert intention waiting ------------------ ---TRANSACTION 29500, ACTIVE 10 sec 2 lock struct(s), heap size 360, 13 row lock(s) MySQL thread id 39, OS thread handle 0x7f2e90257700, query id 173 localhost root cleaning up

With REPEATABLE-READ, we can see that when Session 1 locks records with condition “expires < ‘2014-03-01′ “, Session 2 can’t Insert the record and waiting for the lock release (lock_mode X insert intention waiting) from Session 1 because it’s using gap locks. Here,  insert intention is one type of gap lock.  This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap.

This is why for some scenario/ business logic, REPEATABLE-READ is better isolation level to prevent deadlocks by using more row locks. (including gap locks)

Case 3: Added Primary Key on dataname and Index on expires, tx_isolation=READ-COMMITTED.

Session 1: SET tx_isolation=’READ-COMMITTED'; START TRANSACTION;
Session 2: SET tx_isolation=’READ-COMMITTED'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < ‘2014-03-01′ ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES (‘med13′, 1,’2014-06-13 00:00:00′, ‘server13′); /* success */
Session 1: SELECT * FROM data_col WHERE expires < ‘2014-07-01′ ORDER BY expires LIMIT 1 FOR UPDATE; /* success */
Session 2: SELECT * FROM data_col WHERE expires < ‘2014-07-01′ ORDER BY expires LIMIT 1 FOR UPDATE; /* hang*/

------------ TRANSACTIONS ------------ Trx id counter 29452 Purge done for trx's n:o < 29450 undo n:o < 0 state: running but idle History list length 68 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 0, not started MySQL thread id 41, OS thread handle 0x7f2e901f5700, query id 140 localhost root init show engine innodb status ---TRANSACTION 29451, ACTIVE 23 sec starting index read mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s), undo log entries 1 MySQL thread id 40, OS thread handle 0x7f2e90226700, query id 139 localhost root Sending data SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE ------- TRX HAS BEEN WAITING 8 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 15 page no 4 n bits 80 index `expires` of table `nil`.`data_col` trx id 29451 lock_mode X locks rec but not gap waiting ------------------ ---TRANSACTION 29450, ACTIVE 29 sec 3 lock struct(s), heap size 360, 2 row lock(s) MySQL thread id 39, OS thread handle 0x7f2e90257700, query id 138 localhost root cleaning up

If there is index on “expires” column which we use to filter the records, it will show different behavior with READ-COMMITED. In Case 1,  we were not able to acquire locks on higher range (“expires < ‘2014-07-01′ “)  while here, we can do that with index on expires. So when Session 1 has already locked the rows, another Session 2 can’t acquire the same lock and will be waiting to release locks from Session 1.

But it we remove LIMIT 1 from SELECT…FOR UPDATE then it will behave the same like Case 1.

———————–Just removed limit 1—————————————

Session 1: SET tx_isolation=’READ-COMMITTED'; START TRANSACTION;
Session 2: SET tx_isolation=’READ-COMMITTED'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < ‘2014-03-01′ ORDER BY expires FOR UPDATE;
Session 2: INSERT INTO data_col VALUES (‘med14′, 1,’2014-06-04 00:00:00′, ‘server14′); /* success */
Session 1: SELECT * FROM data_col WHERE expires < ‘2014-07-01′ ORDER BY expires FOR UPDATE; /* hang */
Session 2: SELECT * FROM data_col WHERE expires < ‘2014-07-01′ ORDER BY expires FOR UPDATE; /* deadlock*/

------------------------ LATEST DETECTED DEADLOCK ------------------------ 2015-04-04 14:22:49 7f3f0a084700 *** (1) TRANSACTION: TRANSACTION 29051, ACTIVE 52 sec fetching rows mysql tables in use 1, locked 1 LOCK WAIT 4 lock struct(s), heap size 1184, 9 row lock(s) MySQL thread id 39, OS thread handle 0x7f3f0a0b5700, query id 251 localhost root Creating sort index SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires FOR UPDATE *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 15 page no 3 n bits 80 index `PRIMARY` of table `nil`.`data_col` trx id 29051 lock_mode X locks rec but not gap waiting *** (2) TRANSACTION: TRANSACTION 29052, ACTIVE 13 sec starting index read mysql tables in use 1, locked 1 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1 MySQL thread id 40, OS thread handle 0x7f3f0a084700, query id 252 localhost root Creating sort index SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires FOR UPDATE *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 15 page no 3 n bits 80 index `PRIMARY` of table `nil`.`data_col` trx id 29052 lock_mode X locks rec but not gap *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 15 page no 3 n bits 80 index `PRIMARY` of table `nil`.`data_col` trx id 29052 lock_mode X locks rec but not gap waiting *** WE ROLL BACK TRANSACTION (2) ------------

Case 4: Added Primary Key on dataname and Index on expires, tx_isolation=REPEATABLE-READ.

Session 1: SET tx_isolation=’REPEATABLE-READ'; START TRANSACTION;
Session 2: SET tx_isolation=’REPEATABLE-READ'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < ‘2014-07-01′ ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES (‘med15′, 1,’2014-06-10 00:00:00′, ‘server15′); /* success */

Here, Unlike Case 2, we’ll be able to INSERT record from Session 2 as Session 1 is not using gap lock.

——————–Successfully happened but if I’ll remove the limit 1, —————————-

Session 1: SET tx_isolation=’REPEATABLE-READ'; START TRANSACTION;
Session 2: SET tx_isolation=’REPEATABLE-READ'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < ‘2014-07-01′ ORDER BY expires FOR UPDATE;
Session 2: INSERT INTO data_col VALUES (‘med11′, 1,’2014-06-11 00:00:00′, ‘server11′); /* hang */

------------ TRANSACTIONS ------------ Trx id counter 29454 Purge done for trx's n:o < 29450 undo n:o < 0 state: running but idle History list length 68 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 0, not started MySQL thread id 41, OS thread handle 0x7f2e901f5700, query id 149 localhost root init show engine innodb status ---TRANSACTION 29453, ACTIVE 6 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s) MySQL thread id 40, OS thread handle 0x7f2e90226700, query id 148 localhost root update INSERT INTO data_col VALUES ('med11', 1,'2014-06-11 00:00:00', 'server11') ------- TRX HAS BEEN WAITING 6 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 15 page no 3 n bits 80 index `PRIMARY` of table `nil`.`data_col` trx id 29453 lock_mode X locks gap before rec insert intention waiting ------------------ ---TRANSACTION 29452, ACTIVE 18 sec 2 lock struct(s), heap size 360, 12 row lock(s) MySQL thread id 39, OS thread handle 0x7f2e90257700, query id 147 localhost root cleaning up --------

But here, Session 1 will use gap lock, so Session 2 can’t able to insert record until lock release.  To know more about deadlock, I would suggest to read below blog post, “How to deal with MySQL deadlocks” by my colleague Peiran Song.

There are plenty of blog posts to describe InnoDB locks but few things are still missing in manual. Here is the list of some of those bugs.

Conclusion: We can see from above test case that to prevent deadlocks, sometimes we have to use indexes,  sometimes higher isolation level helps, even if it counter-intuitive and sometimes application OR table schema changes can help.

If you are coming to the Percona Live MySQL Conference and Expo 2015, then visit me for this talk: “Understanding InnDB locks and deadlocks” (16 April 3:00PM – 3:50PM @ Ballroom A).

The post InnoDB locks and deadlocks with or without index for different isolation level appeared first on MySQL Performance Blog.

pquery binaries with statically included client libs now available!

April 9, 2015 - 11:00am

After we released pquery to the community, and as we started logging bug reports with pquery testcases, it quickly became clear that pquery binaries with statically compiled-in client libraries would be of great convenience, both for ourselves and for the community.

(If you haven’t heard about pquery yet, read the pquery introduction blog post, come and join the pquery introduction lightning talk at Percona Live (15 April just after 7 p.m.), or keep an eye out for some of the upcoming episodes in the MySQL QA Series.)

While we were in the process of creating these binaries (which turned out to be not as straighforward as we thought it would be), we also disovered a rather significant memory de-allocation bug in pquery, which would at times have caused pquery to crash (segfault). I want to especially thank Sergei for helping get the client libraries compiled into pquery, as well as fixing the segfault memory de-allocation bug and some other problems we found, and Ramesh who helped with binary testing and debugging.

I also want to thank Sveta (previously at Oracle, now at Percona), Umesh (Oracle), and Satya Bodapati (Oracle) for not giving up too easily when we logged some bug reports with dynamically linked (i.e. client libs not included) and at times failing pquery binaries!

Now, the pquery binaries with statically included client binaries are finally ready! We have pquery-ps (with static Percona Server 5.6 client libs), pquery-ms (with static MySQL 5.6 client libs), and pquery-md (with static MariaDB 5.5 client libs). The pquery binaries can also be used to test  any other community solution or product. For example, we use it to test our beloved Percona XtraDB Cluster (PXC) using specially developed pquery+Docker+PXC scripts.

Things have also continued to evolve quickly in the pquery framework, as well as in the accompanying reducer.sh ($ bzr branch lp:randgen – available as randgen/util/reducer/reducer.sh), so check out the many updates now! To get it, $ bzr branch lp:percona-qa and start by having a look at pquery-run.sh

You may also like to checkout our latest pquery-reach.sh and pquery-reach++.sh – which are a wrapper around most of the major pquery framework tools. Not as straightforward to use and setup as pquery-run.sh (as it requires setup within the sub-scripts it uses…), but reviewing pquery-reach.sh will give you a good idea on how to setup pquery-run.sh and get into things.

Stay tuned for the upcoming MySQL QA episodes (link above), and you’ll soon be hunting bugs like Mr. Nuclear! (To meet (the nice) Mr. Nuclear, come and see our lightning talk at Percona Live!)

Enjoy!

The post pquery binaries with statically included client libs now available! appeared first on MySQL Performance Blog.

Pages

]]>