
Webinar Thursday May 26: Troubleshooting MySQL hardware resource usage

May 24, 2016 - 8:46am

Join Sveta on Thursday, May 26, 2016, at 10 am PDT (UTC-7) for her webinar Troubleshooting MySQL hardware resource usage.

MySQL does not just run on its own. It stores data on disk, and stores data and temporary results in memory. It uses CPU resources to perform operations, and a network to communicate with its clients.

In this webinar, we’ll discuss common resource usage issues, how they affect MySQL Server performance, and methods to find out how resources are being used. We will employ both OS-level tools, and new features in Performance Schema that provide detailed information on what exactly is happening inside MySQL Server.

Register for the webinar here.

Sveta Smirnova, Principal Technical Services Engineer

Sveta joined Percona in 2015.

Her main professional interests are problem solving, working with tricky issues and bugs, finding patterns that can quickly solve typical issues, and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona, Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group at MySQL AB (later Sun Microsystems and Oracle).

She is the author of the book MySQL Troubleshooting and of the JSON UDF functions for MySQL.

Take Percona’s one-click high availability poll

May 23, 2016 - 1:47pm

Wondering what high availability (HA) solutions are most popular? Take our high availability poll below!

HA is always a hot topic. The reality is that if your data is not available, your customers cannot do business with you. In fact, estimates show the average cost of downtime is about $5K per minute. With an average outage taking 40 minutes to correct, you could be looking at a potential cost of $200K if your MySQL instance goes down. Whether your database is on premises, or in a public or private cloud, it is critical that your database deployment does not have a potentially devastating single point of failure.

Please take a few seconds and answer the following poll. It will help the community get an idea of how companies are approaching HA in their critical database environments.

If you’re using other solutions or have specific issues, feel free to comment below. We’ll post a follow-up blog with the results!


Percona disabling TLSv1.0 May 31st 2016

May 23, 2016 - 11:09am

As of May 31st, 2016, we will be disabling TLSv1.0 support on www.percona.com, repo.percona.com, etc.

This is ahead of the PCI changes that mandate the June 30th, 2016 deprecation of the TLSv1.0 protocol. (PDF)

What does this mean for you, the user?

Based on analysis of our IDS logs, this change will affect around 6.32% of requests. As of May 31st, such requests will fail with an error when trying to negotiate a TLS connection.

Users are advised to update their clients accordingly. SSLabs provides a good test for browsers, though it does not cover command-line tools. Going forward, we will only support TLSv1.1 and TLSv1.2.
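
For command-line tools, one quick way to probe what a client and server will negotiate is openssl s_client (the host below is just an example):

openssl s_client -connect www.percona.com:443 -tls1_1 < /dev/null
# A successful handshake means TLSv1.1 works end to end; swap -tls1_1
# for -tls1 or -tls1_2 to probe the other protocol versions.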

These changes come a little over a year after our previous SSL overhaul, and are part of our ongoing effort to ensure the security of our users.

Thank you for your time. Please leave any questions in the comments section, or email us at security(at)percona.com.


Percona XtraBackup 2.4.3 is now available

May 23, 2016 - 6:56am


Percona is glad to announce the GA release of Percona XtraBackup 2.4.3 on May 23rd, 2016. Downloads are available from our download site and from apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

New Features:

  • Percona XtraBackup has implemented the new --reencrypt-for-server-id option. Using this option allows users to start a server instance with a different server_id from the one the encrypted backup was taken from, such as a replication slave or a Galera node. When this option is used, xtrabackup will, as a prepare step, generate a new master key with an ID based on the new server_id, store it in the keyring file, and re-encrypt the tablespace keys inside the tablespace headers.
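
A sketch of what the prepare step might look like with this option (the paths, server ID, and keyring location below are illustrative, not taken from the release notes):

xtrabackup --prepare --target-dir=/data/backups/full \
  --reencrypt-for-server-id=2 \
  --keyring-file-data=/var/lib/mysql-keyring/keyring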

Bugs Fixed:

  • Running DDL statements on Percona Server 5.7 during the backup process could in some cases lead to failure while preparing the backup. Bug fixed #1555626.
  • MySQL 5.7 can sometimes skip redo logging when creating an index. If such an ALTER TABLE is issued during the backup, the backup becomes inconsistent. xtrabackup now aborts with an error message if such an ALTER TABLE was executed during the backup. Bug fixed #1582345.
  • .ibd files for remote tablespaces were not copied back to the original location pointed to by the .isl files. Bug fixed #1555423.
  • When called with insufficient parameters, such as an empty --defaults-file option, Percona XtraBackup could crash. Bug fixed #1566228.
  • The documentation states that the default value for --ftwrl-wait-query-type is all; however, it was actually update. The default value has been changed to match the documentation. Bug fixed #1566315.
  • When the --keyring-file-data option was specified but no keyring file was found, xtrabackup would create an empty one instead of reporting an error. Bug fixed #1578607.
  • If ALTER INSTANCE ROTATE INNODB MASTER KEY was run while xtrabackup --backup was bootstrapping, xtrabackup could catch a moment when the new key was not yet written into the keyring file and overwrite the keyring with an old copy, losing the new key. Bug fixed #1582601.
  • The output of the --slave-info option was missing an apostrophe. Bug fixed #1573371.

Release notes with all the bugfixes for Percona XtraBackup 2.4.3 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

Introduction to Troubleshooting Performance – Troubleshooting Slow Queries webinar: Q & A

May 20, 2016 - 1:50pm

In this blog, I will provide answers to the Q & A for the Troubleshooting Slow Queries webinar.

First, I want to thank you for attending the April 28 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: I’ve heard that it is a bad idea to use SELECT *; what do you recommend?

A: When I used SELECT * in my slides, I wanted to underline the idea that sometimes you need to select all columns from the table. There is nothing wrong with it if you need them all. SELECT * is bad when you need only a few columns from the table: in that case, you retrieve more data than needed, which affects performance. Another issue SELECT * can cause: if you hard-code the statement into your application and then change the table definition, the application could start retrieving columns in the wrong order and output the wrong data (e.g., email instead of billing address). Or even worse, it will try to access a non-existent index in the result set array. The best practice is to explicitly enumerate all columns that your application needs.
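
To illustrate with a hypothetical schema, compare:

-- Retrieves every column; depends on column order and transfers extra data:
SELECT * FROM customers WHERE id = 42;

-- Explicit columns: less data transferred, and the application keeps working
-- if columns are later added or reordered:
SELECT email, billing_address FROM customers WHERE id = 42;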

Q: I heard that using index_field length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as indexing, what happens?

A: I assume you are asking about the ability to create indexes with a length smaller than the column length? They work as follows:

Assume you have a TEXT field which contains these user questions:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using index_field length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as indexing, what happens?
  3. ….

Since this is a TEXT field, you cannot create an index on it without specifying its length, so you need to make the index as minimal as possible while still uniquely identifying questions. If you create an index with length 10, it will contain:

  1. I’ve heard
  2. I heard th

You will index only those parts of the questions that are not very distinct from each other, and that do not contain useful information about what each question is. You can instead create an index of length 255:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using index_field length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as index

In this case, the index includes the whole first question and almost all of the second question. This makes the index too large and requires more disk space (which causes more IO). Also, indexing that much of the second question is probably unnecessary.

If we make an index of length 75, we will have:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using index_field length will affect the indexing principle du

This is more than enough for the first question and gives a good idea of what is in the second question. It will also potentially have enough unique entries to make its cardinality look more like the cardinality of the real data distribution.

To conclude: choosing the correct index length is something that requires practice and analysis of your actual data. Try to make prefixes as short as possible, but long enough that the number of unique entries in the index is similar to the number of unique entries in the table.
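
One practical way to do this analysis (the table and column names here are hypothetical) is to compare the selectivity of candidate prefix lengths against the full column before creating the index:

SELECT COUNT(DISTINCT LEFT(question, 10)) / COUNT(*) AS sel10,
       COUNT(DISTINCT LEFT(question, 75)) / COUNT(*) AS sel75,
       COUNT(DISTINCT question) / COUNT(*) AS sel_full
FROM questions;

-- Pick the shortest prefix whose selectivity approaches sel_full, then:
ALTER TABLE questions ADD INDEX idx_question (question(75));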

Q: Which view can we query to see stats?

A: Do you mean index statistics? SHOW INDEX FROM table_name will do it.

Q: We have an InnoDB table with 47 fields (mostly text); some are ft-indexed. I tried to do an alter table, and it ran for 24 hours. What is the best way to run an alter table to add one extra field? The table has 1.9 M rows and 47 columns with many indexes.

A: Adding a column requires a table copy. Therefore, the speed of this operation depends on the table size and the speed of your disk. If you are using version 5.6 or later, adding a column does not block parallel queries (and is therefore not a big deal). If you are using an older version, you can always use the pt-online-schema-change utility from Percona Toolkit. However, it will run even more slowly than the regular ALTER TABLE. Unfortunately, you cannot speed up the execution of ALTER TABLE much. The only thing you can do is use a faster disk (with options tuned to exploit its speed).

However, if you do not want this increased IO to affect the production server, you can alter the table on a separate instance, copy the tablespace back to production, and then apply all changes to the original table from the binary logs. The steps will be something like:

  1. Ensure you use the innodb_file_per_table option and the big table has an individual tablespace
  2. Ensure that the binary log is enabled
  3. Start a new server (you can also use an existing standby slave).
  4. Disable writes to the table
  5. Record the binary log position
  6. Copy the tablespace to the new server as described here.
  7. Enable writes on the production server
  8. Run ALTER TABLE on the new server you created in step 3 (it will still take 24 hours)
  9. Stop writes to the table on the production server
  10. Copy back the tablespace you altered in step 8
  11. Apply all writes to this table from the binary logs after the position recorded in step 5.
  12. Enable writes to the table

This scenario will take even more time overall, but will have minimal impact on the production server.
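
For reference, the tablespace copy in steps 6 and 10 relies on MySQL's transportable tablespaces (available in 5.6 and later); a minimal sketch with a hypothetical table name:

-- On the source server: quiesce the table and flush its .cfg metadata
FLUSH TABLES big_table FOR EXPORT;
-- (copy big_table.ibd and big_table.cfg out of the datadir, then release)
UNLOCK TABLES;

-- On the destination server:
ALTER TABLE big_table DISCARD TABLESPACE;
-- (place the copied .ibd and .cfg files into the destination datadir, then)
ALTER TABLE big_table IMPORT TABLESPACE;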

Q: If there is a compound index like index1(emp_id,date), will the following query be able to use index? “select * from table1 where emp_id = 10”

A: Yes, it should: emp_id is the leftmost column of index1, and MySQL can use the leftmost prefix of a composite index.
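
You can confirm it with EXPLAIN (schema hypothetical):

EXPLAIN SELECT * FROM table1 WHERE emp_id = 10\G
-- key: index1; only the leftmost key part (emp_id) is used,
-- which you can verify from the reported key_len.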

Q: Are filesort and temporary in extended info for explain not good?

A: Regarding filesort: it depends. For example, you will always see "Using filesort" for tables that perform ORDER BY and cannot use an index for the ORDER BY. This is not always bad. For example, in this query:

mysql> explain select emp_no, first_name from employees where emp_no < 20000 order by first_name\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: NULL
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 18722
     filtered: 100.00
        Extra: Using where; Using filesort
1 row in set, 1 warning (0.01 sec)

the primary key is used to resolve the rows, and the filesort is necessary and unavoidable. You can read about the different filesort algorithms here.

Regarding Using temporary: this means that a temporary table will be created during query execution. This can be bad, especially if the temporary table is large and cannot fit into memory: in that case, it is written to disk and slows down operations. But, again, sometimes creating temporary tables is unavoidable: for example, if you have both GROUP BY and ORDER BY clauses that list columns differently, as stated in the user manual.
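
For example, with the employees sample database used above, grouping by one expression and ordering by another typically produces both notes in the Extra column:

EXPLAIN SELECT dept_no, COUNT(*) FROM dept_emp GROUP BY dept_no ORDER BY COUNT(*)\G
-- Extra typically shows: Using temporary; Using filesort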

Q: Is key_len length more of a good thing for query execution?

A: The key_len field is not NULL for all queries that use an index, and simply shows the length of the key part used. It is neither good nor bad; it is just information. You can use it, for example, to identify which part of a combined index is used to resolve the query.

Q: Does an alter query go for an optimizer check?

A: No. You can check this either by enabling the optimizer trace, running the ALTER, and observing that the trace is empty, or by enabling the debug option and searching the resulting trace for optimize.
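
A quick sketch of the first method (the ALTER statement is a placeholder):

SET optimizer_trace='enabled=on';
ALTER TABLE t1 ADD COLUMN c1 INT;
SELECT query, trace FROM information_schema.OPTIMIZER_TRACE;
-- For the ALTER, the trace comes back empty: no optimizer pass happened.
SET optimizer_trace='enabled=off';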

Q: A query involves four columns that are all individually covered by an index. The optimizer didn’t merge indexes because of cost, and didn’t even choose the composite index I created.

A: This depends on the table definition and query you used. I cannot provide a more detailed answer based only on this information.

Q cont.: Finally, only certain composite indexes were suitable, the column order in the complex index mattered a lot. Why couldn’t the optimizer merge the four individual single column indexes, and why did the order of the columns in the composite index matter?

A: Regarding why the optimizer could not merge the four indexes, I would need to see how the table is defined and what data is in the indexed columns. Regarding why the order of the columns in the composite index matters, it depends on the query. While the optimizer can use an index on, say, (col1, col2) for both the condition col1=X AND col2=Y and the condition col2=Y AND col1=X, the order becomes important when you use OR: for the condition col1=X OR col2=Y, the part col1=X is always evaluated, and the part col2=Y is evaluated only when col1=X is false. The same logic applies to queries like SELECT col1 WHERE col2=Y ORDER BY col3. See the user manual for details.

Q: When I try to obtain the optimizer trace on the console, the result is cut off, even if I redirect the output to a file. How can I overcome that?

A: Which version of MySQL Server do you use? The TRACE column is defined as longtext NOT NULL, and should not cause such issues. If this happens with a recent version, report a bug at http://bugs.mysql.com/.

Q: Are there any free graphical visualizers for either EXPLAIN or the optimizer TRACE output?

A: There is a graphical visualizer for EXPLAIN in MySQL Workbench, but it works with online data only: you cannot run it on EXPLAIN output saved to a file. I don’t know of any visualizer for the optimizer TRACE output. However, since it is JSON, you can simply save it to a file and open it in a web browser, which gives a better view than a plain text editor.

Q: When do you use force index instead of use index hints?

A: According to the user manual, the “USE INDEX (index_list) hint tells MySQL to use only one of the named indexes to find rows in the table” and the “FORCE INDEX hint acts like USE INDEX (index_list), with the addition that a table scan is assumed to be very expensive . . . a table scan is used only if there is no way to use one of the named indexes to find rows in the table.” This means that with USE INDEX, you are giving the optimizer a hint to prefer a particular index over others, without forcing index usage if the optimizer prefers a table scan, while FORCE INDEX requires using the index. I myself use only the FORCE and IGNORE index hints.
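
Syntax-wise, the three hints look like this (table and index names are hypothetical):

SELECT * FROM t1 USE INDEX (idx_col1) WHERE col1 = 10;    -- a preference, not a requirement
SELECT * FROM t1 FORCE INDEX (idx_col1) WHERE col1 = 10;  -- table scan only as a last resort
SELECT * FROM t1 IGNORE INDEX (idx_col1) WHERE col1 = 10; -- never consider this index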

Q: Very informative session. I missed the beginning part. Are you going to distribute the recorded session later?

A: Yes. As usual, the slides and recording are available here.

Percona XtraDB Cluster 5.6.29-25.15 is now available

May 20, 2016 - 7:06am


Percona is glad to announce the new release of Percona XtraDB Cluster 5.6 on May 20, 2016. Binaries are available from the downloads area or our software repositories.

Percona XtraDB Cluster 5.6.29-25.15 is now the current release, based on the following:

All Percona software is open source and free, and all the details of the release can be found in the 5.6.29-25.15 milestone at Launchpad.

For more information about relevant Codership releases, see this announcement.

Bugs Fixed:

  • Node eviction in the middle of SST now causes the node to shut down properly.
  • After an error during node startup, the state is now marked unsafe only if SST is required.
  • Fixed data inconsistency during a multi-insert auto-increment workload on an async master with binlog-format=STATEMENT when a node begins as an async slave with wsrep_auto_increment_control=ON.
  • Fixed a crash that occurred when a prepared statement was aborted (due to a conflict with the applier) and then replayed.
  • Removed a special case condition in wsrep_recover() that would not happen under normal conditions.
  • Percona XtraDB Cluster no longer fails during SST, if a node reserves a very large amount of memory for InnoDB buffer pool.
  • If the value of wsrep_cluster_address is not valid, trying to create a slave thread will now generate a warning instead of an error, and the thread will not be created.
  • Fixed error with loading data infile (LDI) into a multi-partitioned table.
  • The wsrep_node_name variable now defaults to host name.
  • Starting mysqld with an unknown option now fails with a clear error message, instead of crashing randomly.
  • Optimized the operation of SST and IST when a node fails during startup.
  • The wsrep_desync variable can now be enabled only after a node is synced with the cluster. That is, it cannot be set during node bootup.
  • Fixed crash when setting a high flow control limit (fc_limit) and the recv queue fills up.
  • Only the default 16 KB page size (innodb_page_size=16384) is accepted until the relevant upstream bug is fixed by Codership (see https://github.com/codership/galera/issues/398). All other sizes will report Invalid page size and shut down (the server will not start up).
  • If a node is executing RSU/FTWRL, explicit desync of the node will not happen until the implicit desync action is complete.
  • Fixed multiple bugs in the test suite to improve quality assurance.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Fixing MySQL scalability problems with ProxySQL or thread pool

May 19, 2016 - 1:58pm

In this blog post, we’ll discuss fixing MySQL scalability problems using either ProxySQL or thread pool.

In the previous post I showed that even MySQL 5.7 in read-write workloads is not able to maintain throughput. Oracle’s recommendation to play black magic with innodb_thread_concurrency and innodb_spin_wait_delay doesn’t always help. We need a different solution to deal with this scaling problem.

All the conditions are the same as in my previous run, but I will use:

  • ProxySQL limited to 200 connections to MySQL. ProxySQL can multiplex incoming connections; with this setting, even with 1000 connections to the proxy, it will maintain only 200 connections to MySQL.
  • Percona Server with the thread pool enabled and a thread pool size of 64 (a configuration sketch for both follows this list)
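
Roughly, the two setups correspond to settings like these (the values come from this post; the hostgroup ID and file layout are illustrative):

-- ProxySQL admin interface (port 6032): cap backend connections at 200
UPDATE mysql_servers SET max_connections=200 WHERE hostgroup_id=0;
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;

# Percona Server my.cnf: enable the thread pool with 64 thread groups
[mysqld]
thread_handling = pool-of-threads
thread_pool_size = 64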

You can see final results here:

There are good and bad sides for both solutions. With ProxySQL, there is a visible overhead on lower numbers of threads, but it keeps very stable throughput after 200 threads.

With Percona Server thread pool, there is little-to-no overhead if the number of threads is less than thread pool size, but after 200 threads it falls behind ProxySQL.

I would say the correct solution depends on your setup:

  • If you already use or plan to use ProxySQL, you may use it to prevent MySQL from saturation
  • If you use Percona Server, you might consider trying to adjust the thread pool

A summary with all the results is available here: https://github.com/Percona-Lab-results/201605-OLTP-RW-proxy-threadpool/blob/master/summary-OLTP-RW-proxy.md.


Webinar Tuesday, May 24: Understanding how your MongoDB schema affects scaling, and when to consider sharding for help

May 19, 2016 - 1:02pm

Please join David Murphy on Tuesday, May 24 at 10 am PDT (UTC-7) as he presents “Understanding how your MongoDB schema affects scaling, and when to consider sharding for help.”

David will discuss the pros and cons of a few MongoDB schema design patterns on a stand-alone machine, and then will look at how sharding affects them. He’ll examine the assumptions you may have made that could cause havoc on your CPU, memory and network during a scatter-gather. This webinar will help answer the questions:

  • Would you still use the same schema if you knew you were going to shard?
  • Are your fetches using the same shard, or employing parallelism to boost performance?
  • Are you following the golden rules of schema design?

Register for this webinar here.

David Murphy, MongoDB Practice Manager

David joined Percona in October 2015 as Practice Manager for MongoDB. Before that, David joined the ObjectRocket by Rackspace team as the Lead DBA in September 2013. With the growth involved in any recently acquired startup, David’s role covered a wide range of evangelism, research, run book development, knowledge base design, consulting, technical account management, mentoring and much more. Before the world of MongoDB, David was a MySQL and NoSQL architect at Electronic Arts, working with some of the largest titles in the world (FIFA, SimCity, and Battlefield) and responsible for tuning, design, and technology choices. David maintains an active interest in database speaking and exploring new technologies.

Percona Server for MongoDB 3.0.11-1.6 is now available

May 19, 2016 - 9:10am

Percona is pleased to announce the release of Percona Server for MongoDB 3.0.11-1.6 on May 19, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.0.11-1.6 is an enhanced, open source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.0 protocol and drivers. Based on MongoDB 3.0.11, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as features like external authentication and audit logging. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: The MongoRocks storage engine is still under development. There is currently no officially released version of MongoRocks that can be recommended for production.

This release includes all changes from MongoDB 3.0.11. Additionally, the following fixes were made:

  • Fixed memory over-allocation
  • PSMDB-56: Additional fixes related to this previously fixed bug.

The release notes are available in the official documentation.


Percona Server 5.5.49-37.9 is now available

May 19, 2016 - 8:39am


Percona is glad to announce the release of Percona Server 5.5.49-37.9 on May 19, 2016. Based on MySQL 5.5.49, including all the bug fixes in it, Percona Server 5.5.49-37.9 is now the current stable release in the 5.5 series.

Percona Server is open-source and free. Details of the release can be found in the 5.5.49-37.9 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

Bugs Fixed:

  • Percona Server is now built with the system zlib library instead of the older bundled one. Bug fixed #1108016.
  • CREATE TABLE ... LIKE ... could create a system table with an unsupported enforced engine. Bug fixed #1540338.
  • The server will now show a more descriptive error message when Percona Server fails with errno == 22 "Invalid argument", if innodb_flush_method was set to ALL_O_DIRECT. Bug fixed #1578604.
  • The apt-cache show command for percona-server-client was showing innotop as included in the package. Bug fixed #1201074.
  • mysql-systemd would fail with PAM authentication and proxies due to regression introduced when fixing bug #1534825 in Percona Server 5.5.48-37.8. Bug fixed #1558312.

Other bugs fixed: #1578625 (upstream #81295), #1553166, and #1578303 (upstream #81324).

The release notes for Percona Server 5.5.49-37.9 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

Where is the MySQL 5.7 root password?

May 18, 2016 - 10:26am

In this blog, we’ll discuss how to find the MySQL 5.7 root password.

While new MySQL software security features are always welcome, they can impact use and performance. Now by default, MySQL 5.7 creates a password for the root user (among other changes) so the installation itself can be considered secure. It’s a necessary change, but it has confused some customers and users. I see a lot of people on social networks (like Twitter) asking about this change.

Where is my root password?

The answer depends on the way you have installed MySQL 5.7 or Percona Server 5.7. I am going to show where to find the password depending on the installation method and the distribution used. For all these examples, I assume this is a new installation and you are using the default my.cnf.

CentOS/Red Hat – RPM Packages

The password is not shown on screen during the installation. It is in the error log. The autogenerated my.cnf contains this line:

log-error=/var/log/mysqld.log

So, there is our password:

# cat /var/log/mysqld.log | grep "temporary password"
2016-05-16T07:09:49.796912Z 1 [Note] A temporary password is generated for root@localhost: 8)13ftQG5OYl

Debian/Ubuntu

During package installation, you get a prompt asking for the root password. If you don’t set it up, MySQL’s root user is created without a password. We can read the following line in the package installation output:

2016-05-16T07:27:21.532619Z 1 [Warning] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.

but it is configured with the auth_socket plugin. You will only be able to connect using the UNIX socket, therefore any attempt to connect using your local IP or the network fails. Later on, you can change the password to allow connections from the network (as explained in this blog post).

All distributions – Binary tarball

mysql_install_db has been deprecated since MySQL 5.7.6. You need to use mysqld to initialize all the system databases (like mysql, which contains the users and passwords). You have two ways of doing it:

--initialize: this is the default and recommended option. It will create the mysql system database, including a random root password that is written to the error log.

# tail -n1 /var/log/mysql/error.log
2016-05-16T07:47:58.199154Z 1 [Note] A temporary password is generated for root@localhost: wzgds/:Kf2,g

If you don’t have the log-error directive configured, or any my.cnf at all, then it will be in the datadir, named host_name.err.

--initialize-insecure: the datadir will be initialized without setting a random password for the root user.

# tail -n1 /var/log/mysql/error.log
2016-05-16T07:51:28.506142Z 1 [Warning] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.

Conclusion

Unfortunately, more security can also add more confusion. Depending on the installation method and distribution, the MySQL 5.7 root password process varies a lot, so keep an eye on the error log after every installation and also watch the installation process output shown on screen. In case you are really lost (or you have removed the error log for some reason), you can still start mysqld with --skip-grant-tables to access the database and change the password.
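
A minimal recovery sketch (assuming you can stop the server and start it by hand; adapt to your init system):

# Insecure: no privilege checks, local connections only
mysqld_safe --skip-grant-tables --skip-networking &
mysql -u root

-- Inside the client: reload the grant tables, then set a new password
FLUSH PRIVILEGES;
ALTER USER 'root'@'localhost' IDENTIFIED BY 'New-Password-123';

# Finally, restart mysqld normally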

Webinar Thursday May 19, 2016: MongoDB administration for MySQL DBA

May 18, 2016 - 9:21am

Please join Alexander Rubin, Percona Principal Consultant, for his webinar MongoDB administration for MySQL DBA on Thursday, May 19 at 10 am PDT (UTC-7).

If you are a MySQL DBA and want to learn MongoDB quickly – this webinar is for you. MySQL and MongoDB share similar concepts so it will not be hard to get up to speed with MongoDB.

In this talk I will explain the following MongoDB administration concepts:
  • Day to day operations for MongoDB
  • Storage engines and differences with MySQL storage engines
  • Databases, collections and documents
  • Replication in MongoDB and the difference with MySQL replication
  • Sharding in MongoDB
  • Backups in MongoDB

In the webinar, each slide will show a MySQL concept or operation (on the left) and the corresponding MongoDB one (on the right).

Register here.

Alexander Rubin, Principal Consultant

Alexander joined Percona in 2013. Alexander has worked with MySQL since 2000 as a DBA and Application Developer. Before joining Percona, he was a MySQL principal consultant for over seven years (started with MySQL AB in 2006, then Sun Microsystems and then Oracle). He helped many customers design large, scalable and highly available MySQL systems and optimize MySQL performance. Alexander also helped customers design Big Data stores with Apache Hadoop and related technologies.

MySQL 5.7 read-write benchmarks

May 17, 2016 - 10:22am

In this post, we’ll look at the results from some MySQL 5.7 read-write benchmarks.

In my past blogs I’ve posted benchmarks on MySQL 5.5 / 5.6 / 5.7 in OLTP read-only workloads. For example:

Now, it is time to test some read-write transactional workloads. I will again use sysbench, and my scripts and configs are available here: https://github.com/Percona-Lab-results/201605-OLTP-RW.

A short description of the setup:

  • The client (sysbench) and server are on different servers, connected via 10Gb network
  • CPU: servers with 56 logical CPU threads (Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz)
  • sysbench ten tables x 10mln rows, Pareto distribution
  • OS: Ubuntu 15.10 (Wily Werewolf)
  • Kernel 4.2.0-30-generic
  • The storage device is Samsung SM863 SATA SSD, single device, with ext4 filesystem
  • MySQL versions: 5.7.12, 5.6.30 and 5.5.48

InnoDB holds all data in memory, and InnoDB log files are big enough, so there are only IO writes (which happen in the background) and there is no pressure from InnoDB on the IO subsystem.

The results looked like the following:

The vertical line shows the variability of the throughput (standard deviation).

To show the difference at lower numbers of threads, here is a chart with relative performance normalized to MySQL 5.7 (MySQL 5.7 = 1 in the following chart):

So we can finally see significant improvements in MySQL 5.7: it scales much better in read-write workloads than previous versions.

At lower numbers of threads, however, MySQL 5.7 throughput is still behind MySQL 5.5 and MySQL 5.6; this is where the slower single-thread performance and longer execution paths in MySQL 5.7 show themselves. Low-thread read-write performance matters especially for replication: I wonder how a MySQL 5.7 slave performs compared to a 5.6 slave – I am going to run this benchmark soon.

Another point to keep in mind is that we still see a “bell shape,” even for MySQL 5.7: after 430 threads, the throughput drops off a cliff. Despite Oracle’s claims that there is no need for a thread pool anymore, this is not the case – I was not able to prevent the throughput drop using magic tuning with innodb_thread_concurrency and innodb_spin_wait_delay. No matter what, MySQL 5.7 is not able to maintain throughput at a high number of threads (1000+) for this workload.

What can be done in this case? I have two solutions: Percona Server with thread pool functionality, or ProxySQL with connection multiplexing. I will show these results in the next post.


MySQL “Got an error reading communication packet” errors

May 16, 2016 - 9:32am

In this blog post, we’ll discuss the possible reasons for MySQL “Got an error reading communication packet” errors, and how to address them.

In Percona’s managed services, we often receive customer questions about communication failure errors – where customers are faced with intermittent “Got an error reading communication packets” messages. I thought this topic deserved a blog post so we can discuss possible reasons for this error and how to remedy it. I hope this helps readers investigate and resolve this problem.

First of all, whenever a communication error occurs it increments the status counter for either Aborted_clients or Aborted_connects, which describe the number of connections that were aborted because the client died without closing the connection properly and the number of failed attempts to connect to MySQL server (respectively). The possible reasons for both errors are numerous (see the Aborted_clients increments or Aborted_connects increments sections in the MySQL manual).
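
You can watch both counters, and raise the logging verbosity discussed below, like this:

mysql> SHOW GLOBAL STATUS LIKE 'Aborted%';
mysql> SET GLOBAL log_warnings = 2;  -- makes MySQL log aborted connections to the error log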

In the case of log_warnings>1, MySQL also writes this information to the error log (shown below):

[Warning] Aborted connection 305628 to db: 'db' user: 'dbuser' host: 'hostname' (Got an error reading communication packets)
[Warning] Aborted connection 305627 to db: 'db' user: 'dbuser' host: 'hostname' (Got an error reading communication packets)

In this case, MySQL increments the status counter for Aborted_clients, which could mean:

  • The client connected successfully but terminated improperly (and may relate to not closing the connection properly)
  • The client slept for longer than the defined wait_timeout or interactive_timeout seconds (which ends up causing the connection to sleep for wait_timeout seconds and then the connection gets forcibly closed by the MySQL server)
  • The client terminated abnormally or exceeded the max_allowed_packet for queries

The above is not an all-inclusive list.

How do we identify what caused this problem, and how do we fix it?

To be honest, aborted connection errors are not easy to diagnose, but in my experience they are related to network/firewall issues most of the time. We usually investigate those issues with the help of Percona Toolkit scripts, i.e. pt-summary / pt-mysql-summary / pt-stalk. The outputs from those scripts can be very helpful.

Some of the reasons can be:

  • A high rate of connections sleeping inside MySQL for hundreds of seconds is one of the symptoms that applications aren’t closing connections after doing work, instead relying on wait_timeout to close them. I strongly recommend changing the application logic to properly close connections at the end of an operation.
  • Check to make sure the value of max_allowed_packet is high enough, and that your clients are not receiving a “packet too large” message. This situation aborts the connection without properly closing it.
  • Another possibility is TIME_WAIT. I’ve noticed many TIME_WAIT notifications in the netstat output, so I would recommend confirming that the connections are managed well enough to close on the application side.
  • Make sure the transactions are committed (begin and commit) properly, so that once the application is “done” with the connection it is left in a clean state.
  • You should ensure that client applications do not abort connections. For example, if PHP has option max_execution_time set to 5 seconds, increasing connect_timeout would not help because PHP will kill the script. Other programming languages and environments can have similar safety options.
  • Another cause for delay in connections is DNS problems. Check if you have skip-name-resolve enabled, and if hosts are authenticated against their IP address instead their hostname.
  • One way to find out where your application is misbehaving is to add some logging to your code that saves the application actions along with the MySQL connection ID. With that, you can correlate it with the connection number from the error lines. Enable the Audit log plugin, which logs connections and query activity, and check the Percona Audit Log Plugin as soon as you hit a connection abort error. You can check the audit log to identify which query is the culprit. If you can’t use the Audit plugin for some reason, you can consider using the MySQL general log – however, this can be risky on a loaded server. You should enable the general log for at least a few minutes. While it puts a heavy burden on the server, the errors tend to happen fairly often, so you should be able to collect the needed data before the log grows too large. I recommend watching the general log with tail -f, then disabling it when you see the next warning in the log. Once you find the query from the aborted connection, identify which piece of your application issues that query and correlate the queries with portions of your application.
  • Try increasing the net_read_timeout and net_write_timeout values for MySQL and see if that reduces the number of errors. net_read_timeout is rarely the problem unless you have an extremely poor network. Try tweaking those values, however, because in most cases a query is generated and sent as a single packet to the server, and applications can’t switch to doing something else while leaving the server with a partially received query. There is a very detailed blog post on this topic from our CEO, Peter Zaitsev.

Aborted connections happen because a connection was not closed properly. The server can’t cause aborted connections unless there is a networking problem between the server and the client (like the server is half duplex and the client is full duplex) – but that is the network causing the problem, not the server. In any case, such problems should show up as errors on the networking interface. To be extra sure, check the ifconfig -a output on the MySQL server for errors.

Another way to troubleshoot this problem is via tcpdump. You can refer to this blog post on how to track down the source of aborted connections. Look for potential network issues, timeouts and resource issues with MySQL.

I found this blog post useful in explaining how to use tcpdump on busy hosts. It provides help for tracking down the TCP exchange sequence that led to the aborted connection, which can help you figure out why the connection broke.

For network issues, use ping to calculate the round trip time (RTT) between the machine where mysqld is located and the machine the application makes requests from. Send a large file (1GB or more) between the client and server machines, watch the process using tcpdump, and check if an error occurred during the transfer. Repeat this test a few times. I also found this post from my colleague Marco Tusa useful: Effective way to check network connection.

One other idea is to capture the netstat -s output along with a timestamp every N seconds (e.g., every 10 seconds), so you can relate the netstat -s output from BEFORE and AFTER an aborted connection error in the MySQL error log. Using the aborted connection error timestamp, you can correlate it with the matching netstat sample and watch which error counters increased under the TcpExt section of netstat -s.
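
A simple sampling loop for this (the interval and file name are arbitrary):

while true; do
    date >> netstat-samples.log
    netstat -s >> netstat-samples.log
    sleep 10
done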

Along with that, you should also check the network infrastructure sitting between the client and the server for proxies, load balancers, and firewalls that could be causing a problem.

Conclusion:
I’ve tried to cover communication failure errors and how to identify and fix possible aborted connections. Keep in mind that faulty Ethernet adapters, hubs, switches, cables, and so forth can cause this issue as well; you may need to replace the hardware itself to properly diagnose these issues.

Benchmark MongoDB with sysbench

May 13, 2016 - 9:17am

In this blog post, we’ll discuss how to benchmark MongoDB with sysbench.

In an earlier post, I mentioned our use of sysbench-mongodb (via this fork) to run benchmarks of MongoDB servers. I now want to share our work extending sysbench to make it work with MongoDB.

If you’re not familiar with sysbench, it’s a great project developed by Alexey Kopytov that lets you run different types of benchmarks (referred to as “tests” by the tool), including database benchmarks. The database tests are implemented in Lua scripts, which means you can customize them as needed (or even write new ones from scratch) – something useful for simulating specific workloads.

All of the database tests in sysbench assume an SQL-based database, so instead of trying to shoehorn MongoDB tests into this framework I modified the connect/disconnect functions to handle MongoDB, and then implemented new functions specific for this database.

You can find the work (which is still in progress but usable, and in fact currently used by us in benchmarks) on the dev-mongodb-support-1.0 branch of our sysbench fork.

To use it, you just need to specify the --mongo-url argument (others too, as needed, but this is the one that must be present for sysbench to detect that a MongoDB test is requested), and then provide the path to the Lua script you want to run. The following is an example:

sysbench --mongo-write-concern=1 --mongo-url="mongodb://localhost" \
  --mongo-database-name=sbtest \
  --test=sysbench/sysbench/tests/mongodb/oltp.lua \
  --oltp_table_size=60000000 --oltp_tables_count=16 \
  --num-threads=512 --rand-type=pareto --report-interval=10 \
  --max-requests=0 --max-time=600 \
  --oltp-point-selects=10 --oltp-simple-ranges=1 --oltp-sum-ranges=1 \
  --oltp-order-ranges=1 --oltp-distinct-ranges=1 --oltp-index-updates=1 \
  --oltp-non-index-updates=1 --oltp-inserts=1 run

To build this branch, you’ll first need to build and install (or otherwise obtain) the mongo-c-driver project, as that is what we use to connect to MongoDB. Once that’s done, building is just a matter of running the following commands from the repo’s root:

./autogen.sh
./configure
make
sudo make install # optionally

The changes should not affect the other database tests in sysbench, though I have only verified that the MySQL ones continue to work.

Right now, the workload from sysbench-mongodb is implemented in Lua scripts (oltp.lua), and work is in progress to allow freeform operations to be created with new Lua scripts (by providing functions that take JSON as the argument). As an alternative, you may want to check out this much-less-tested (and currently unstable) branch based on luamongo. It already supports the creation of arbitrary workloads in Lua. In this case, you also need to build luamongo, which is included.

With either branch, you can add new tests by implementing new Lua scripts (though the dev-mongodb-support-1.0 branch still needs a few functions implemented on the C side to support arbitrary operations from the Lua side).

We think there are still some types of operations needed to improve sysbench’s usefulness for MongoDB, such as queries involving arrays, union, the $in operator, geospatial operators, and in place updates.

We hope you find this useful, and we welcome suggestions and bug reports to improve it.

Happy benchmarking!

ProxySQL versus MaxScale for OLTP RO workloads

May 12, 2016 - 10:52am

In this blog post, we’ll discuss ProxySQL versus MaxScale for OLTP RO workloads.

Continuing my series of READ-ONLY benchmarks (you can find the other posts here: https://www.percona.com/blog/2016/04/07/mysql-5-7-sysbench-oltp-read-results-really-faster/ and https://www.percona.com/blog/2016/03/28/mysql-5-7-primary-key-lookup-results-is-it-really-faster), in this post I want to see how much overhead a proxy adds.

In my opinion, there are only two solid proxy software options for MySQL at the moment: ProxySQL and MaxScale. In the past, there was also MySQL Proxy, but it is pretty much dead for now. Its replacement, MySQL Router, is still in the very early stages and seriously lacks the features that would let it compete with ProxySQL and MaxScale. This will most likely change in the future – when MySQL Router adds more features, I will re-evaluate it then!

To test the proxies, I will start with a very simple setup to gauge basic performance characteristics. I will use a sysbench client and proxy running on the same box. Sysbench connects to the proxy via local socket (for minimal network and TCP overhead), and the proxy is connected to a remote MySQL via a 10Gb network. This way, the proxy and sysbench share the same server resources.

Other parameters:

  • CPU: servers with 56 logical CPU threads (Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz)
  • sysbench ten tables x 10mln rows, Pareto distribution
  • OS: Ubuntu 15.10 (Wily Werewolf)
  • MySQL 5.7
  • MaxScale version 1.4.1
  • ProxySQL version 1.2.0b

You can find more details about benchmarks, scripts and configs here: https://github.com/Percona-Lab/benchmark-results/tree/201603-mysql55-56-57-RO/remote-OLTP-proxy-may.

An important parameter to consider is how much of the CPU resources you allocate to the proxy. Both ProxySQL and MaxScale allow you to configure how many threads they can use to process user requests and route queries. I’ve found that 16 threads for ProxySQL and 8 threads for MaxScale are optimal (I will also show 16 threads for MaxScale in this post); a configuration sketch follows below. Both proxies also allow you to set up simple load-balancing configurations, or to work in read-write splitting mode. In this case, I will use simple load balancing, since there are no read-write splitting requirements in a read-only workload.
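
For reference, these worker thread counts map to settings like the following (using each product's standard admin interface and config file):

-- ProxySQL admin interface; mysql-threads requires a proxysql restart
UPDATE global_variables SET variable_value='16' WHERE variable_name='mysql-threads';
SAVE MYSQL VARIABLES TO DISK;

# MaxScale, in maxscale.cnf
[maxscale]
threads=8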

ProxySQL

First result: How does ProxySQL perform compared to vanilla MySQL 5.7?

As we can see, there is a noticeable drop in performance with ProxySQL. This is expected, as ProxySQL does extra work to process queries. What is good though is that ProxySQL scales with increasing user connections.

One of the tricks that ProxySQL has is a “fast-forward” mode, which minimizes overhead from processing (but as a drawback, you can’t use many of the other features). Out of curiosity, let’s see how the “fast-forward” mode performs:

MaxScale

Now let’s see what happens with MaxScale. Before showing the next chart, let me note that it contains “error bars,” which are presented as vertical bars. Basically, an “error bar” shows a standard deviation: the longer the bar, the more variation was observed during the experiment. We want to see less variance, as it implies more stable performance.

Here are results for MaxScale versus ProxySQL:

We can see that with lower numbers of threads both proxies are nearly similar, but MaxScale has a harder time scaling over 100 threads. On average, MaxScale’s throughput is worse, and there is a lot of variation. In general, we can see that MaxScale demands more CPU resources and uses more of the CPU per request (compared to ProxySQL). This holds true if we run MaxScale with 16 threads (instead of 8):

MaxScale with 16 threads does not handle the workload well, and there is a lot of variation along with some visible scalability issues.

To summarize, here is a chart with relative performance (vanilla MySQL 5.7 is shown as 1):

While this chart does show that MaxScale has less overhead from 1-6 threads, it doesn’t scale as user load increases.

Quick start MySQL testing using Docker (on a Mac!)

May 11, 2016 - 9:38am

In this post, we’ll discuss how you can quick start MySQL testing using Docker, specifically in a Mac environment.

Like a lot of people, I’m hearing a lot about Docker and it’s got me curious. The Docker ecosystem seems to be moving quickly, however, and simple “getting started” or “how-to” type articles that are easy to find for well-established technologies seem to be out-of-date or non-existent for Docker. I’ve been playing with Docker on Mac for a bit, but it is definitely a second-class citizen in the Docker world. However, I saw Giuseppe’s blog on the new Docker beta for Mac and decided to try it for myself. These steps work for the beta version on a Mac (and probably Windows), but they should work with Linux as well (using the GA release, currently Docker 1.11.1).

The new Docker beta for Mac requires that you register for the beta program, and receive a download code from Docker. I got mine in about a day, but I would assume it won’t be long before the full version is released.

Once installed, I needed to setup some Docker containers for common MySQL versions so that I can easily have some sandboxes. The method I used is below:

jayj@~ [510]$ docker network create test 90005b3ffa9fef1f817ee4965e794a567404c9a8d5bf07320514e7d848d59ff9 jayj@~ [511]$ docker run --name=mysql57 --net=test -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d mysql/mysql-server:5.7 6c80fa89610dbd5418ba474ad7d5451cd061f80a8a72ff2e718341827a08144b jayj@~ [512]$ docker run -it --rm --net=test -e MYSQL_HOST=mysql57 mysql/shell init Creating a Classic Session to root@mysql57:3306 Enter password: No default schema selected. enableXProtocol: Installing plugin mysqlx... enableXProtocol: done

A quick summary of what I did above:

  1. I created a network called “test” for my containers to share, essentially this is a dedicated private network between containers.  I like this because multiple containers can listen on the same port and I don’t have to fight with ports on my host OS.
  2. I started a MySQL 5.7 image from Oracle’s official MySQL Docker container bound to that test network.
  3. I used the MySQL/shell image (also from Oracle) to initialize the mysqlx plugin on my 5.7 server. Notice I didn’t enter a password because I created the server without one (insecure, but it’s a sandbox).

The shell init uses a temporary container that is removed (--rm) after the run, so you don’t pollute your docker ps -a output.

So, now I want to be able to use the standard MySQL command line and/or the new MySQL shell to access this container. To make this really clean, I added some bash aliases:

alias mysqlsh='docker run -it --rm --net=test mysql/shell'
alias mysql='docker run -it --rm -e MYSQL_ALLOW_EMPTY_PASSWORD=yes --net=test --entrypoint="mysql" mysql/mysql-server:5.7'

With these in effect, I can call them directly and pass normal command line options to connect to my mysql57 image just as if I was using a native MySQL CLI binary.

Using the MySQL CLI from the 5.7 image:

jayj@~ [524]$ mysql -h mysql57
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.12 MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
4 rows in set (0.01 sec)

Using the MySQL shell:

jayj@~ [527]$ mysqlsh -h mysql57 -u root --session-type=node
Creating a Node Session to root@mysql57:33060
Enter password:
No default schema selected.
Welcome to MySQL Shell 1.0.3 Development Preview

Copyright (c) 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help', '\h' or '\?' for help.

Currently in JavaScript mode. Use \sql to switch to SQL mode and execute queries.
mysql-js> \sql
Switching to SQL mode... Commands end with ;
mysql-sql> show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
4 rows in set (0.00 sec)
mysql-sql>

Now if I want to check something on MySQL 5.5, I can just do this:

jayj@~ [530]$ docker run --name=mysql55 --net=test -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d mysql/mysql-server:5.5
Unable to find image 'mysql/mysql-server:5.5' locally
5.5: Pulling from mysql/mysql-server
a3ed95caeb02: Already exists
ffe36b360c6d: Already exists
646f220a8b5d: Pull complete
ed65e4fea7ed: Pull complete
d34b408b18dd: Pull complete
Digest: sha256:12f0b7025d1dc0e7b40fc6c2172106cdf73b8832f2f910ad36d65228d9e4c433
Status: Downloaded newer image for mysql/mysql-server:5.5
6691dd9d42c73f53baf2968bcca92b7f4d26f54bb01d967be475193305affd4f
jayj@~ [531]$ mysql -h mysql55
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.49 MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
+--------------------+
3 rows in set (0.00 sec)

or, Percona Server:

jayj@~ [534]$ docker run --name=ps57 --net=test -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d percona/percona-server:5.7
Unable to find image 'percona/percona-server:5.7' locally
5.7: Pulling from percona/percona-server
a3ed95caeb02: Pull complete
a07226856d92: Pull complete
eee62d87a612: Pull complete
4c6755120a98: Pull complete
10eab0da5972: Pull complete
d5159a6502a4: Pull complete
e595a1a01d00: Pull complete
Digest: sha256:d57f0ce736f5403b1714ff8d1d6b91d5a7ee7271f30222c2bc2c5cad4b4e6950
Status: Downloaded newer image for percona/percona-server:5.7
9db503852747bc1603ab59455124663e8cedf708ac6d992cff9b43e2fbebd167
jayj@~ [537]$ mysql -h ps57
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.10-3 Percona Server (GPL), Release 3, Revision 63dafaf

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

So all this is nice – once the images are cached locally, spinning new containers up and down is painless and fast. All this sandbox work is cleanly separated from my workstation OS. There are probably other things I’d want to be able to do with this setup that I haven’t figured out yet (e.g., loading data files, running code to connect to these containers, etc.) – but I’ll figure those out in the future.

Query Rewrite plugin can harm performance

May 10, 2016 - 10:53am

In this blog post, we’ll discuss how the Query Rewrite plugin can harm performance.

MySQL 5.7 comes with the Query Rewrite plugin, which allows you to modify queries coming to the server. (You can view the details here: https://dev.mysql.com/doc/refman/5.7/en/rewriter-query-rewrite-plugin.html.)

It is based on the audit plugin API, and unfortunately it suffers from serious scalability issues (which seems to be the case for all API-based audit plugins).

I want to share the results for sysbench OLTP RO with and without the query rewrite plugin — but with one very simple rewrite rule, which doesn’t affect any queries. This is the rule from the documentation:

mysql> INSERT INTO query_rewrite.rewrite_rules (pattern, replacement)
    -> VALUES('SELECT ?', 'SELECT ? + 1');
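
For completeness, rules only take effect once flushed into the plugin's in-memory cache (the plugin itself is loaded with the install_rewriter.sql script that ships with MySQL 5.7):

mysql> CALL query_rewrite.flush_rewrite_rules();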

Here are the results for both cases:

As you can see, the server with the Query Rewrite plugin can’t scale after 100 threads.

When we look at the PMP profile, it shows the following:

170 __lll_lock_wait,__GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,plugin_unlock_list,mysql_audit_release,handle_connection,pfs_spawn_thread,start_thread,clone
164 __lll_lock_wait,__GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,plugin_foreach_with_mask,mysql_audit_acquire_plugins,mysql_audit_notify,invoke_pre_parse_rewrite_plugins,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
 77 __lll_lock_wait,__GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,plugin_lock,acquire_plugins,plugin_foreach_with_mask,mysql_audit_acquire_plugins,mysql_audit_notify,invoke_pre_parse_rewrite_plugins,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
 12 __lll_unlock_wake,__pthread_mutex_unlock_usercnt,__GI___pthread_mutex_unlock,native_mutex_unlock,my_mutex_unlock,inline_mysql_mutex_unlock,plugin_unlock_list,mysql_audit_release,handle_connection,pfs_spawn_thread,start_thread,clone
 10 __lll_unlock_wake,__pthread_mutex_unlock_usercnt,__GI___pthread_mutex_unlock,native_mutex_unlock,my_mutex_unlock,inline_mysql_mutex_unlock,plugin_lock,acquire_plugins,plugin_foreach_with_mask,mysql_audit_acquire_plugins,mysql_audit_notify,invoke_pre_parse_rewrite_plugins,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
 10 __lll_unlock_wake,__pthread_mutex_unlock_usercnt,__GI___pthread_mutex_unlock,native_mutex_unlock,my_mutex_unlock,inline_mysql_mutex_unlock,plugin_foreach_with_mask,mysql_audit_acquire_plugins,mysql_audit_notify,invoke_pre_parse_rewrite_plugins,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
  7 __lll_lock_wait,__GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,Table_cache::lock,open_table,open_and_process_table,open_tables,open_tables_for_query,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
  6 __GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,plugin_unlock_list,mysql_audit_release,handle_connection,pfs_spawn_thread,start_thread,clone
  6 __GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,plugin_foreach_with_mask,mysql_audit_acquire_plugins,mysql_audit_notify,invoke_pre_parse_rewrite_plugins,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone

So clearly it’s related to a mutex acquired in the audit plugin API code. I filed a bug (https://bugs.mysql.com/bug.php?id=81298), but it’s discouraging to see that while the InnoDB code is constantly being improved for better scaling, other parts of the server can still suffer from global mutexes.
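
As an aside, PMP (“poor man’s profiler”) output like the above is nothing more than aggregated gdb stack samples. A minimal sketch of how such a profile can be collected by hand (Percona Toolkit’s pt-pmp automates and refines this):

# Sample all mysqld thread stacks once, then count identical stacks
gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p $(pidof mysqld) | \
awk '
    BEGIN { s = "" }
    /^Thread/ { print s; s = "" }
    /^#/ { if (s != "") { s = s "," $4 } else { s = $4 } }
    END { print s }' | \
sort | uniq -c | sort -rn -k 1,1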

Percona Server 5.7 parallel doublewrite

May 9, 2016 - 1:35pm

In this blog post, we’ll discuss the ins and outs of Percona Server 5.7 parallel doublewrite.

After implementing parallel LRU flushing as described in the previous post, we went back to benchmarking. At first, we tested with the doublewrite buffer turned off. We wanted to isolate the effect of the parallel LRU flusher, and the results validated the design. Then we turned the doublewrite buffer back on and saw very little, if any, gain from the parallel LRU flusher. What happened? Let’s take a look at the data:

We see that the doublewrite buffer mutex is gone as expected and that the top waiters are the rseg mutexes and the index lock (shouldn’t this be fixed in 5.7?). Then we checked PMP:

2678 nanosleep(libpthread.so.0),...,buf_LRU_get_free_block(buf0lru.cc:1435),...
 867 pthread_cond_wait,...,log_write_up_to(log0log.cc:1293),...
 396 pthread_cond_wait,...,mtr_t::s_lock(sync0rw.ic:433),btr_cur_search_to_nth_level(btr0cur.cc:1022),...
 337 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2325),...
 240 poll(libc.so.6),...,Protocol_classic::read_packet(protocol_classic.cc:810),...

Again we see that PFS is not telling the whole story, this time due to a missing annotation in XtraDB. Whereas the PFS results might lead us to leave the flushing analysis and focus on the rseg/undo/purge or check the index lock, PMP clearly shows that a lack of free pages is the biggest source of waits. Turning on the doublewrite buffer makes LRU flushing inadequate again. This data, however, doesn’t tell us why that is.

To see how enabling the doublewrite buffer makes LRU flushing perform worse, we collect PFS and PMP data only for the server flusher (cleaner coordinator, cleaner worker, and LRU flusher) threads and I/O completion threads:

If we zoom in from the whole server to the flushers only, the doublewrite mutex is back. Since we removed its contention for the single page flushes, it must be the batch doublewrite buffer usage by the flusher threads that causes it to reappear. The doublewrite buffer has a single area for 120 pages that is shared and filled by flusher threads. Adding a page to the batch is protected by the doublewrite mutex, which serialises the adds and results in the following picture:

By now we should be wary of reviewing PFS data without checking its results against PMP. Here it is:

139 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2448),LinuxAIOHandler::poll(os0file.cc:2594),...
 56 pthread_cond_wait,...,os_event_wait_low(os0event.cc:534),buf_dblwr_add_to_batch(buf0dblwr.cc:1111),...,buf_flush_LRU_list_batch(buf0flu.cc:1555),...,buf_lru_manager(buf0flu.cc:2334),...
 25 pthread_cond_wait,...,os_event_wait_low(os0event.cc:534),buf_flush_page_cleaner_worker(buf0flu.cc:3482),...
 21 pthread_cond_wait,...,PolicyMutex<TTASEventMutex<GenericPolicy>(ut0mutex.ic:89),buf_page_io_complete(buf0buf.cc:5966),fil_aio_wait(fil0fil.cc:5754),io_handler_thread(srv0start.cc:330),...
  8 pthread_cond_timedwait,...,buf_flush_page_cleaner_coordinator(buf0flu.cc:2726),...

As with the single-page flush doublewrite contention and the wait to get a free page in the previous posts, here we have an unannotated-for-Performance Schema doublewrite OS event wait (same bug 80979):

if (buf_dblwr->batch_running) {
        /* This not nearly as bad as it looks. There is only
        page_cleaner thread which does background flushing in batches
        therefore it is unlikely to be a contention point. The only
        exception is when a user thread is forced to do a flush batch
        because of a sync checkpoint. */
        int64_t sig_count = os_event_reset(buf_dblwr->b_event);
        mutex_exit(&buf_dblwr->mutex);

        os_event_wait_low(buf_dblwr->b_event, sig_count);
        goto try_again;
}

This is as bad as it looks (the comment is outdated). A running doublewrite flush blocks any doublewrite page add attempts from all the other flusher threads for the duration of the flush (up to 120 data pages written twice to storage):

The issue also occurs with the MySQL 5.7 multi-threaded flusher, but becomes more acute with the PS 5.7 multi-threaded LRU flusher. There is no inherent reason why all the parallel flusher threads must share a single doublewrite buffer. Each thread can have its own private buffer, which allows us to add to the buffers and flush them independently, so a lot of synchronisation simply disappears. Adding pages to parallel buffers is fully asynchronous:

And so is flushing them:

This behavior is what we shipped in the 5.7.11-4 release, and the performance results were shown in a previous post. To see how the private doublewrite buffer affects flusher threads, let’s look at isolated data for those threads again.

Performance Schema:

It shows the redo log mutex as the current top contention source from the PFS point of view, which is not caused directly by flushing.

PMP data looks better too:

112 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2455),...,io_handler_thread(srv0start.cc:330),...
 54 pthread_cond_wait,...,buf_dblwr_flush_buffered_writes(buf0dblwr.cc:1287),...,buf_flush_LRU_list(buf0flu.cc:2341),buf_lru_manager(buf0flu.cc:2341),...
 35 pthread_cond_wait,...,PolicyMutex<TTASEventMutex<GenericPolicy>(ut0mutex.ic:89),buf_page_io_complete(buf0buf.cc:5986),...,io_handler_thread(srv0start.cc:330),...
 27 pthread_cond_wait,...,buf_flush_page_cleaner_worker(buf0flu.cc:3489),...
 10 pthread_cond_wait,...,enter(ib0mutex.h:845),buf_LRU_block_free_non_file_page(ib0mutex.h:845),buf_LRU_block_free_hashed_page(buf0lru.cc:2567),...,buf_page_io_complete(buf0buf.cc:6070),...,io_handler_thread(srv0start.cc:330),...

buf_dblwr_flush_buffered_writes now waits for its own thread’s I/O to complete and doesn’t block other threads from proceeding. The other top mutex waits belong to the LRU list mutex, which again is not caused directly by flushing.

This concludes the description of the current flushing implementation in Percona Server. To sum up, this post series took you through the road to the current XtraDB 5.7 flushing implementation:

  • Under high-concurrency I/O-bound workloads, the server has a high demand for free buffer pages. This demand can be satisfied by either LRU batch flushing or single page flushing.
  • Single page flushes cause a lot of doublewrite buffer contention, and are bad even with the doublewrite buffer disabled.
  • As in XtraDB 5.6, we removed single page flushing altogether.
  • The existing cleaner LRU flushing could not satisfy the free page demand.
  • The multi-threaded LRU flushing design addresses this issue – as long as the doublewrite buffer is disabled.
  • If the doublewrite buffer is enabled, MT LRU flushing contends on it, negating its improvements.
  • Parallel doublewrite buffers address this bottleneck (a quick way to confirm the feature is sketched below).
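
If you want to confirm the feature on a running Percona Server 5.7 instance, a quick check might look like the following. This is a sketch based on our reading of the PS 5.7 documentation: verify the innodb_parallel_doublewrite_path variable and its xb_doublewrite default against your version.

# Where the parallel doublewrite file lives (default: xb_doublewrite in the data directory)
mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_parallel_doublewrite_path'"

# The file itself should be visible next to the data files
ls -lh /var/lib/mysql/xb_doublewrite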

CPU governor performance

May 6, 2016 - 3:53pm

In this blog, we’ll examine how CPU governor performance affects MySQL.

It’s been a while since we last looked into CPU governors, and with new Intel CPUs and new Linux distros available, I wanted to check how CPU governors affect MySQL performance.

Before jumping to the results, let’s review which drivers manage CPU frequency. Traditionally, the default driver was “acpi-cpufreq”, but for recent Intel CPUs and newer Linux kernels it has changed to “intel_pstate”.

To check which driver is being used, run the cpupower frequency-info command.

cpupower frequency-info
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 1.20 GHz - 2.00 GHz
  available frequency steps: 2.00 GHz, 2.00 GHz, 1.90 GHz, 1.80 GHz, 1.70 GHz, 1.60 GHz, 1.50 GHz, 1.40 GHz, 1.30 GHz, 1.20 GHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance
  current policy: frequency should be within 1.20 GHz and 2.00 GHz.
                  The governor "ondemand" may decide which speed to use within this range.
  current CPU frequency is 1.20 GHz (asserted by call to hardware).
  cpufreq stats: 2.00 GHz:29.48%, 2.00 GHz:0.00%, 1.90 GHz:0.00%, 1.80 GHz:0.00%, 1.70 GHz:0.00%, 1.60 GHz:0.00%, 1.50 GHz:0.00%, 1.40 GHz:0.00%, 1.30 GHz:0.37%, 1.20 GHz:70.15%  (7)
  boost state support:
    Supported: yes
    Active: yes

In this case, we can see that the driver is “acpi-cpufreq”, and the governor is “ondemand”.

On my server (running Ubuntu 16.04 with “Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz” CPUs), I get the following output with default settings:

analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 0.97 ms.
  hardware limits: 1.20 GHz - 3.00 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 1.20 GHz and 3.00 GHz.
                  The governor "performance" may decide which speed to use within this range.
  current CPU frequency is 1.50 GHz (asserted by call to hardware).
  boost state support:
    Supported: yes
    Active: yes

So, it’s interesting to see that “intel_pstate” with the “performance” governor is chosen by default, and the CPU frequency range is 1.20GHz to 3.00GHz (even though the CPU specification is 2.00GHz). If we check the CPU specification page, it says that 2.00GHz is the “base frequency” and “3.00GHz” is the “Max Turbo” frequency.

In contrast to “intel_pstate”, “acpi-cpufreq” says “frequency should be within 1.20 GHz and 2.00 GHz.”

Also, “intel_pstate” only supports “performance” and “powersave” governors, while “acpi-cpufreq” has a wider range. For this blog, I only tested “ondemand” and “performance”.
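
Changing the governor within a driver, by contrast, can be done at runtime. A minimal sketch, assuming the cpupower utility (from the linux-tools packages on Ubuntu) is installed:

# Set the "performance" governor on all CPUs
sudo cpupower frequency-set -g performance

# With acpi-cpufreq, "ondemand" is also available
sudo cpupower frequency-set -g ondemand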

Switching between CPU drivers is not easy, as it requires a server reboot — you need to pass a parameter to the kernel command line. In Ubuntu, you can do this in /etc/default/grub by changing GRUB_CMDLINE_LINUX_DEFAULT to GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=disable", which disables intel_pstate so that acpi-cpufreq is loaded instead (see the sketch below).
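
A minimal sketch of that change on Ubuntu (assuming GRUB2 and sudo access):

# In /etc/default/grub, set:
# GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=disable"

# Regenerate the GRUB configuration and reboot for it to take effect
sudo update-grub
sudo reboot

# After the reboot, confirm the driver changed
cpupower frequency-info | grep driver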

Is there a real difference in performance between different CPU drivers and CPU governors? To check, I ran a sysbench OLTP read-only workload over a 10Gb network, with the data fitting into memory (so it is a CPU-bound workload).
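
For reference, a sysbench 0.5 read-only run of this kind looks roughly like the following. This is a sketch: the host name, credentials, table size, thread count, and oltp.lua path are placeholders, not the exact parameters used for this benchmark.

# Prepare the test table (the oltp.lua path depends on how sysbench was installed)
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua \
    --mysql-host=db-server --mysql-user=sbtest --mysql-password=sbtest \
    --oltp-table-size=10000000 prepare

# Run the read-only workload at a given concurrency
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua \
    --mysql-host=db-server --mysql-user=sbtest --mysql-password=sbtest \
    --oltp-read-only=on --num-threads=64 --max-time=300 --max-requests=0 run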

The results are as follows. This is a chart for absolute throughput:

And to better understand relative performance, here is a chart showing how the other governors perform compared to “intel_pstate” with the “performance” governor. In this case, I show performance relative to “PSTATE performance”, which equals “1”. In the chart, the orange bar is “PSTATE powersave”, showing the relative difference between “PSTATE powersave” and “PSTATE performance” (=1):

Here are the takeaways I see:

  • The combination of CPU driver and CPU governor still affects performance
  • ACPI “ondemand” might not be the best choice for achieving the best throughput
  • Intel_pstate “powersave” is slower at lower thread counts (I guess the Linux scheduler assigns work to “sleeping” CPU cores)
  • Both the ACPI and Intel_pstate “performance” governors show the best (and practically identical) performance
  • My Ubuntu 16.04 starts with the “intel_pstate” driver and “performance” governor by default, but you may still want to check what the settings are in your case (and switch to “performance” if they aren’t)

