Latest MySQL Performance Blog posts

Percona's MySQL & InnoDB performance and scalability blog

Changing an async slave of a PXC cluster to a new Master using 5.6 and GTID

February 14, 2014 - 3:00am

Before Percona XtraBackup 2.1.7 and Percona XtraDB Cluster 5.6.15-25.3, rsync was the only SST method that supported GTID in a way that made it possible to move an asynchronous slave from one Galera node to another (related bug).

Indeed, previous versions of Percona XtraBackup didn’t copy any binary logs, and because of that, moving the async slave to another master simply broke replication (when writes were still happening).

Now, with the latest versions of Percona XtraBackup and Percona XtraDB Cluster, wsrep_sst_xtrabackup-v2 handles the last binlog sent by Percona XtraBackup and allows the move to another master in the easiest way possible, using CHANGE MASTER TO MASTER_HOST = "new node". Nothing else is needed.
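In practice, the repointing can look like this on the async slave; a hedged sketch (the host name is illustrative, and it assumes MASTER_AUTO_POSITION was already enabled when the slave was first set up with GTID):

```sql
-- On the asynchronous slave, repoint replication at another PXC node.
STOP SLAVE;
CHANGE MASTER TO MASTER_HOST = 'new_node';  -- GTID auto-positioning locates the correct events
START SLAVE;
```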

It’s also obvious that using 5.6 and GTID is easier than with 5.5, where it was trickier to point the slave to the right position (see Jay’s blog post).

Don’t forget to provide a server-id to your PXC nodes. This is an example of configuration settings needed in my.cnf to enable GTID on Galera/PXC nodes:

server-id=2
log-bin=percona2-bin
log_slave_updates
enforce_gtid_consistency=1
gtid_mode=on

The post Changing an async slave of a PXC cluster to a new Master using 5.6 and GTID appeared first on MySQL Performance Blog.

Keynote Speakers for Percona Live MySQL Conference 2014 and $5 Expo Only Passes

February 13, 2014 - 5:00am

The Percona Live MySQL Conference 2014 in Santa Clara, California is just around the corner: April 1-4. There is a tremendous amount of activity leading up to the conference, some of which I will highlight here.

Percona Live MySQL Conference 2014 Keynote Speakers

Oracle’s Tomas Ulin will speak again this year.

I am extremely pleased with the keynote speakers we can now announce. They include speakers from major MySQL users Dropbox and Big Fish Games, as well as speakers from influential technology leaders Oracle, Fusion-io, Continuent, and Percona. We are fortunate to once again have Tomas Ulin join us to talk about the future of MySQL, including insight into the coming improvements in MySQL 5.7. The list of confirmed speakers and their talks is:

  • Tomas Ulin, Vice President of MySQL Development at Oracle, will present “Raising the MySQL Bar,” a look at how Oracle delivers constant innovation and enhancements to the MySQL community.
  • Nisha Talagala, Lead Architect at Fusion-io, will present “The Evolution of MySQL in the All-Flash Datacenter,” focusing on the state of the art in non volatile memory technology and its application to MySQL.
  • Renjish Abraham, Tech Lead DBA for Dropbox MySQL SRE, will present “Adventures in MySQL at Dropbox,” a look at how MySQL helps Dropbox users store and share more than a billion files a day.
  • Sean Chighizola, Senior Director of Database Administration at Big Fish Games, will present “MySQL, Private Cloud Infrastructure and OpenStack,” discussing the value of private clouds and the cost, management and interoperability issues between private cloud and employer-managed data centers.
  • Robert Hodges, CEO of Continuent, will present “Getting Serious about MySQL and Hadoop at Continuent,” exploring the data management trends spurring integration, how the MySQL community is stepping up, and where the integration may go in the future.
  • Peter Zaitsev, CEO of Percona, will present “9 Things You Need to Know About High Availability, Manageability, and The Cloud,” in which he will discuss the latest needs and solutions in these three key areas and what they mean for various types of MySQL users, from startups to large organizations.
$5 Expo-Only Passes for Percona Live MySQL Conference 2014

To help make the keynotes and the MySQL community events accessible to the greatest number of community members, we are once again offering $5 Expo-Only passes for the Percona Live MySQL Conference 2014. An Expo-Only pass provides access to the keynote addresses, Lightning Talks, Birds of a Feather sessions, the exhibit floor, and the Community Networking Reception on Thursday night. The first 100 people who register for an Expo-Only pass (new registrations only) using the discount code “KEY” will be able to register for just $5.

Call for DotOrg Pavilion Exhibitors Now Open

We are now accepting applications for the DotOrg Pavilion at the Percona Live MySQL Conference 2014. The Pavilion is a free expo hall space for non-commercial projects relevant to the MySQL community. Previous participants have included projects like common_schema, MariaDB, Mroonga, openark-kit, OpenStack, phpMyAdmin, Shard-Query and Tarantool. Applications must be submitted by February 28, 2014.

2014 MySQL Community Awards

We are once again supporting the MySQL Community Awards by providing a platform for the winners to be announced at the conference. Nominations are now open for the 2014 MySQL Community Awards, which are organized this year by co-secretaries Shlomi Noach and Mark Callaghan. Winners will be announced on Thursday, April 3, during the Community Reception at the Percona Live MySQL Conference and Expo. The deadline to submit nominations is February 23.

More Information

Visit the Percona Live MySQL Conference and Expo 2014 website for more information about the conference and to register.

The Percona Live MySQL Conference 2014 promises to be the best yet. I look forward to seeing you there!

The post Keynote Speakers for Percona Live MySQL Conference 2014 and $5 Expo Only Passes appeared first on MySQL Performance Blog.

8 common (but deadly) MySQL operations mistakes and how to avoid them

February 12, 2014 - 3:00am

January 22 I gave a presentation on “How to Avoid Common (but Deadly) MySQL Operations Mistakes” for Percona MySQL Webinars. If you missed it, you can still register to view the recording and my slides.

Thanks to everyone who attended, and especially to folks who asked the great questions. I answered as many as we had time for during the session, but here are all the questions with my complete answers:

Q: What if I use rsync to sync the mysql dir to another server as a backup?

You can do this only if you shut down the mysqld daemon first. Otherwise, you have a high risk that your copy will be incomplete and impossible to restore.

If you need to create a physical backup on a running instance of MySQL, use Percona XtraBackup. This is safe because this tool carefully synchronizes copying the tablespace with the transaction log, so it assures against getting a partial copy.

Q: Is MIXED binlog-format useful as well as ROW?

The MIXED format defaults to STATEMENT, and switches to ROW for individual events that it detects are non-deterministic and therefore unsafe for statement-based replication. In theory, this should give you the best of both worlds. But there could still be cases where MySQL replicates an event in statement format because it fails to detect a non-deterministic case.

Q: Percona Server version 5.1 — is this version compatible with pt-mext tool?

Yes, pt-mext works fine with Percona Server 5.1.

Q: We have multiple slaves and slaves to slaves and one slave sends changes to the master. Checksum is breaking the replication.

I am not sure I follow your description of your topology, or how the slave would send changes to the master.

I suggest you contact Percona Oncall and they can help you solve any issue with running checksums.

Q: To verify a restore, can the CHECK TABLE EXTENDED command be used? This supposedly makes sure the table is consistent. I was wondering if this is still useful to verify a restore in the latest versions of MySQL.

CHECK TABLE analyzes a table for errors, but it can’t tell if the data is different from the data in the same table on a different instance (e.g. a slave). This can check for physical corruption after a restore, but it can’t verify that the data is correct.

Q: Query optimize really helps? And for large databases like 100GB +, how that will be affected?

By choosing query optimization techniques carefully, some queries can be made to run thousands of times faster.

The larger the table, the more important it is to make sure your queries run efficiently.

Q: Is pt-online-schema-change available in 5.1? All my servers are Percona 5.1.

Yes, pt-online-schema-change works fine with Percona Server 5.1.

Q: What is the best way to perform backup?

Choosing the right tool, schedule, and method for performing backups is a complex problem. I suggest you read a white paper Percona has published on “MySQL Backup and Recovery Best Practices.”

Q: Is there a list of measurable indicators of performance somewhere?

A good tool to help you monitor many performance indicators is Percona Monitoring Plugins. This works with popular open-source monitoring frameworks like Cacti and Zabbix.

Q: How does most of this apply to Amazon’s RDS? Not having direct root access seems like a problem.

You’re right, Amazon RDS is a convenient way to deploy a preconfigured MySQL appliance, but you don’t get to log into the server, and many MySQL tuning variables are not accessible. We can still use some of the tools we are accustomed to using with MySQL, because the tools can access a remote MySQL server. But other tools require local access to the data directory.

Amazon RDS has a management console that allows you to do backups and restores, but if you want to choose specific tools, you may need to migrate to another environment such as Amazon EC2.

Q: A sales person told me that Percona XtraDB Cluster was not fully baked yet about half a year ago, is it ready to go for production now? (we have cluster instances in Amazon for low latency)

PXC is fully baked, has a beautiful golden crust, and smells delicious.

But seriously, we’ve helped many customers deploy PXC over the past year, and it’s working in many production environments.

Q: What buffer size and max_heap_size would you recommend for small 512 Mb RAM server (runs php5-fpm+nginx)?

I suggest you try the Percona Configuration Wizard for MySQL to get you started. It will suggest configuration values appropriate for your server hardware. This won’t be optimized specifically for your site’s workload, but it will be a good start.

Q: Is there any harm in running pt-table-sync without running pt-table-checksum?

No harm. You can optionally use pt-table-sync to calculate its own checks to find out which rows need to be synchronized. You can even synchronize MySQL instances that aren’t replication master and slave.

Q: Is Percona XtraDB Cluster a viable option when MySQL servers are located in different data centers and connected via shared Internet connections?

Yes, this is a great use case for PXC. Refer to an early proof of concept test we ran to prove that multi-datacenter clusters work, and our white paper on High Level Multi-Datacenter MySQL High Availability.

Q: Can Percona XtraBackup be used to take a backup of a single table?

Yes, you can use partial backup options to make innobackupex back up only specific databases or specific tables.

Q: What methods do you recommend to replicate the binlogs outside of replication? We are working with DRBD any other recommendations?

MySQL 5.6 adds an option to the mysqlbinlog tool to back up binary log files continuously, so you can effectively keep your binlogs backed up on a separate server for safety.
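The 5.6 feature referred to is mysqlbinlog’s remote raw mode; a sketch of the invocation (the host and starting binlog name are illustrative, and this needs a running master to connect to):

```shell
# Stream the master's binary logs to local files, continuously,
# as a live binlog backup on a separate server.
mysqlbinlog --read-from-remote-server --host=master.example.com \
    --raw --stop-never mysql-bin.000001
```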

Q: How will pt-table-checksum tolerate binlog-format=MIXED with GTID replication?

pt-table-checksum must use statement-based binlog events for the checksums to work, so it overrides any default binlog format you have defined on your system.

Q: What are your thoughts on SymmetricDS for db replication over standard MySQL replication?

I have not evaluated SymmetricDS, so I can’t offer a specific opinion about it.

Most alternative solutions fit a specific type of project, and no single solution works for everyone.
So if this one works for your needs, it’s worth taking a look at it.

You should compare it with Tungsten Replicator, which is designed for a similar use case, as a highly-available solution for multi-master and multi-site replication.

Q: A question about indexes: in a table with persons, should I add an index on the column gender?

The best indexes depend on what queries you need to run, and the selectivity of data. If you never run a query that uses the gender column for searching or sorting, there would be no benefit to adding an index on it.

Furthermore, MySQL may still not use an index even if you do search on that column, if the value you search for occurs in a large fraction (typically 20%+) of the rows of the table.
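To make the selectivity argument concrete, here is a hypothetical sketch (the helper function and the 20% cutoff are illustrative rules of thumb, not actual optimizer internals):

```python
# Rough selectivity check: what fraction of rows does a predicate match?
# As a rule of thumb, once a value matches a large fraction (~20%+) of the
# table, the optimizer tends to prefer a full table scan over the index.

def likely_to_use_index(matching_rows, total_rows, cutoff=0.20):
    """Heuristic only -- the real optimizer uses index dives and statistics."""
    return (matching_rows / total_rows) < cutoff

# A gender column splits ~1M rows roughly in half: far too unselective.
print(likely_to_use_index(495_000, 1_000_000))  # gender = 'F'
# A primary-key-style lookup matches one row: highly selective.
print(likely_to_use_index(1, 1_000_000))
```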

Q: I have tried Percona XtraBackup but I’m not sure about the best way to backup full server and restore only a single database from that full backup. I’m using mysqldump to backup and restore now.

Percona XtraBackup does support a method to restore individual tablespaces, but the steps to do it are laborious, and must be done one table at a time. Restoring all tables from a given database this way is possible, but involves more work that so far is mostly manual.

To be honest, using mysqldump is probably still the easier solution for this.

Q: Does Percona Xtradb Cluster have any replication drift? How can one minimize it?

PXC uses a different method of replication, not the built-in replication of standard MySQL. For purposes of replication drift and lag, you can think of it as similar to ROW based replication with semi-synchronous slaves. PXC should therefore have minimal chance of replication drift all by itself.

Q: How reliable are Percona XtraBackup incremental backups, in combination with binary logs for point in time recovery?

The incremental backups in Percona XtraBackup work very reliably; the most common problem is making a mistake and applying the incremental backups in an incorrect order.

Likewise, binary logs are reliable, but you must apply all the binary logs after the last incremental backup has been applied, and you must have a contiguous set of binary logs. For maximum safety, use sync_binlog=1 to assure the last events in the binlog are written to disk.

. . .

Thanks again for attending my webinar! Here are some more tips.

The post 8 common (but deadly) MySQL operations mistakes and how to avoid them appeared first on MySQL Performance Blog.

PERFORMANCE_SCHEMA vs Slow Query Log

February 11, 2014 - 9:03am

A couple of weeks ago, shortly after Vadim wrote about Percona Cloud Tools and using Slow Query Log to capture the data, Mark Leith asked why don’t we just use Performance Schema instead? This is an interesting question and I think it deserves its own blog post to talk about.

First, I would say the main reason for using the Slow Query Log is compatibility. The basic slow query log with microsecond query time precision has been available since MySQL 5.1, while the events_statements_summary_by_digest table was only added in MySQL 5.6, which has been out for about a year now but is still far from complete market domination. This is especially visible at the low end of the market – users who just run web applications on whatever MySQL version their hosting provider installed for them. If you look at WordPress users, for example, you will see MySQL 5.6 at just 1.3% as of today. As time passes and MySQL 5.6 takes a larger share of the market, we surely should add support for Performance Schema based query sampling to Percona Cloud Tools.

The second reason is the amount of data available. There is a fair amount of data that the Performance Schema digest table provides, including some that is not available in Percona Server logs:

mysql> select * from events_statements_summary_by_digest where digest_text like "%sbtest%" \G
*************************** 1. row ***************************
                SCHEMA_NAME: sbtest
                     DIGEST: 2062ac01bc1798df1eebd3e111a22b59
                DIGEST_TEXT: SELECT c FROM sbtest WHERE id = ?
                 COUNT_STAR: 882262
             SUM_TIMER_WAIT: 933683089690000
             MIN_TIMER_WAIT: 106418000
             AVG_TIMER_WAIT: 1058283000
             MAX_TIMER_WAIT: 1031299058000
              SUM_LOCK_TIME: 60853469000000
                 SUM_ERRORS: 0
               SUM_WARNINGS: 0
          SUM_ROWS_AFFECTED: 0
              SUM_ROWS_SENT: 889205
          SUM_ROWS_EXAMINED: 890279
SUM_CREATED_TMP_DISK_TABLES: 0
     SUM_CREATED_TMP_TABLES: 0
       SUM_SELECT_FULL_JOIN: 0
 SUM_SELECT_FULL_RANGE_JOIN: 0
           SUM_SELECT_RANGE: 0
     SUM_SELECT_RANGE_CHECK: 0
            SUM_SELECT_SCAN: 0
      SUM_SORT_MERGE_PASSES: 0
             SUM_SORT_RANGE: 0
              SUM_SORT_ROWS: 0
              SUM_SORT_SCAN: 0
          SUM_NO_INDEX_USED: 0
     SUM_NO_GOOD_INDEX_USED: 0
                 FIRST_SEEN: 2014-02-10 17:39:39
                  LAST_SEEN: 2014-02-10 17:40:39
1 row in set (0.00 sec)

BTW – note that Rows Sent here is not equal to Rows Examined, while in reality they should be exactly the same for this benchmark. This is the approximate accounting of Performance Schema in action.
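The timer columns in this output are in picoseconds, which is easy to misread; this small sketch converts the figures from the digest row above into milliseconds:

```python
# Performance Schema timer columns are reported in picoseconds,
# so 10^9 picoseconds equal one millisecond.
PICOS_PER_MS = 10 ** 9

sum_timer_wait = 933_683_089_690_000  # SUM_TIMER_WAIT from the row above
count_star = 882_262                  # COUNT_STAR from the row above

avg_ms = sum_timer_wait / count_star / PICOS_PER_MS
print(f"average query latency: {avg_ms:.3f} ms")  # ~1.058 ms, consistent with AVG_TIMER_WAIT
```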

Now compare it to the sample for the same query in the slow query log in Percona Server

# Time: 140210 17:33:34.837517
# User@Host: root[root] @ localhost []  Id: 95
# Schema: sbtest  Last_errno: 0  Killed: 0
# Query_time: 0.001000  Lock_time: 0.000054  Rows_sent: 1  Rows_examined: 1  Rows_affected: 0
# Bytes_sent: 74  Tmp_tables: 0  Tmp_disk_tables: 0  Tmp_table_sizes: 0
# InnoDB_trx_id: 90E34CF
# QC_Hit: No  Full_scan: No  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0
# InnoDB_IO_r_ops: 0  InnoDB_IO_r_bytes: 0  InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 1
# Log_slow_rate_type: query  Log_slow_rate_limit: 100
SET timestamp=1392071614;
SELECT c from sbtest where id=387872;

What I believe is most valuable here is the information about InnoDB IO, which instantly allows us to isolate the query patterns that are disk IO bound, as well as the information about bytes sent, which allows us to see which queries are responsible for generating high volumes of network traffic.

I wish Performance Schema would be enhanced to return data in something like JSON, where for each digest the top waits are accounted for, as in reality they can differ by query. Some queries might be waiting on IO, others on locks, and yet others could be bound by some specific mutexes. Having exact information about what limits the performance of each specific query type would be a gem.

The third reason for using the Slow Query Log is placeholders. Note that the query above reads “SELECT c FROM sbtest WHERE id = ?”, which is not very convenient – I can’t even run EXPLAIN on such a query to see what could be the reason for its slowness. The log contains exact queries, and we are able to show exact queries in reports (pt-query-digest and Percona Cloud Tools), or you can opt to see only query digests if you do not want to see the values for privacy/security reasons. Picking the constants from a query with a bad plan usually works very well to check out the worst-case scenario.

This might look like a very simple problem – why can’t you just make up an ID and reconstruct the query? – but for more complicated queries with multiple conditions it is virtually impossible to reconstruct a realistic query.

Now, in theory, you can look up the actual query from events_statements_history_long and join the data together; however, this does not really work at high query rates, as it is very likely that rare queries will not have a sample available in the history table.

The fourth reason is support for prepared statements. Enable prepared statements and you will not see the actual query in the digest. This may or may not be an issue for your application, but it further limits the usability of this feature. I can’t count on simply looking at events_statements_summary_by_digest to always find which queries are responsible for the majority of the load.

The fifth reason is performance – or, actually, not much of a reason. I really think Performance Schema overhead is reasonable for most workloads. In a simple-query benchmark which I ran:

sysbench --test=oltp --oltp-test-mode=simple --num-threads=16 --max-requests=0 --max-time=60 --mysql-user=root --oltp-table-size=1000000 run

On my old server I got some 20.2K QPS with Performance Schema disabled and 19.4K QPS with Performance Schema enabled, which is an overhead of less than 5%.

For most workloads paying 5% to have insight about what is happening with the system is a very fair trade.

The slow query log overhead actually can be much larger. The moderate level of detail (“microtime”) resulted in 15.1K QPS, and log_slow_verbosity=”full” takes this value down to 11.6K – over 40% overhead. Note that I designed this test as a worst-case scenario, and for more complicated queries the overhead is likely to be less (while the overhead with Performance Schema can stay the same or even increase depending on what the queries are doing).
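The overhead figures quoted in this post are simple relative drops against the 20.2K QPS baseline; a quick sketch of the arithmetic (the helper function is illustrative):

```python
def overhead_pct(baseline_qps, measured_qps):
    """Relative throughput drop versus the logging-disabled baseline."""
    return 100.0 * (baseline_qps - measured_qps) / baseline_qps

baseline = 20.2  # K QPS with all query logging disabled
print(round(overhead_pct(baseline, 19.4), 1))  # Performance Schema enabled: ~4.0%
print(round(overhead_pct(baseline, 15.1), 1))  # slow log, "microtime" detail: ~25.2%
print(round(overhead_pct(baseline, 11.6), 1))  # slow log, full verbosity: ~42.6%
```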

Some people set long_query_time to some non-zero value to reduce the amount of queries logged. This is a bad idea because the workload logged will be very different from your real one – chances are the majority of your load comes from simple quick queries, which will be very underrepresented with a non-zero long_query_time, with only outliers logged.

A much better idea is to enable sampling, which is available in the latest version of Percona Server – this way only one out of every so many queries will be logged:

mysql> set global log_slow_rate_limit=100;
Query OK, 0 rows affected (0.00 sec)
mysql> set global log_slow_rate_type="query";
Query OK, 0 rows affected (0.00 sec)

This will get one out of every 100 queries randomly logged, which should give you a good idea of your workload without such skew. It works well unless you have some rare and very complicated queries which impact your workload disproportionately and which you can’t ignore for performance analysis. To deal with this situation we added the slow_query_log_always_write_time option to Percona Server, which allows you to always log such queries even if they would not be selected because of sampling.
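Conceptually, query-level sampling is a biased coin flip per query; a minimal sketch of the idea (an illustration only, not Percona Server’s actual implementation):

```python
import random

def should_log(rate_limit, rng):
    """Log roughly one out of every `rate_limit` queries, chosen at random."""
    return rng.randrange(rate_limit) == 0

rng = random.Random(42)  # seeded so the sketch is reproducible
logged = sum(should_log(100, rng) for _ in range(100_000))
print(logged)  # roughly 1,000 of the 100,000 queries end up in the log
```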

Enabling sampling of 1/100 queries for this workload with the full level of detail, I get 19.8K QPS, giving us an overhead of less than 2% – even less than Performance Schema – and selecting 1/1000 queries to be logged brings the overhead down to about 1%. So with the Slow Query Log I can make a choice between accuracy and overhead.

I wish Performance Schema would offer something similar – instead of figuring out which probes and statistics tables I need (not all of them are enabled by default), I could just choose to get sampled data and play with accuracy vs. overhead instead of missing the data altogether.

Summary: There is a great amount of data in Performance Schema in MySQL 5.6 and Percona Server 5.6, though there are a number of reasons why you might not want to discard the old and proven tools based on the slow query log just yet.

P.S. If you’re interested in seeing Query Performance Analysis in action, please join Vadim’s webinar on Wednesday, Feb. 12. If you miss it, check out the recording at that same link.

The post PERFORMANCE_SCHEMA vs Slow Query Log appeared first on MySQL Performance Blog.

WITHer Recursive Queries?

February 11, 2014 - 3:00am

Over the past few years, we’ve seen MySQL technology advance in leaps and bounds, especially when it comes to scalability. But by focusing on the internals of the storage engine for so long, MySQL has fallen behind regarding support for advanced SQL features.

SQLite, another popular open-source SQL database, just released version 3.8.3, including support for recursive SQL queries using the WITH RECURSIVE syntax, in compliance with SQL:1999.

Why is this significant? It means that MySQL is now the only widely-used SQL implementation that does not support recursive queries. Fifteen years after it was defined in the SQL standard, almost every other SQL database of note has supported this feature:

Only Informix among common RDBMS brands lacks support for WITH RECURSIVE, though Informix still supports recursive queries with the non-standard CONNECT BY syntax.

MySQL has been requested to support common table expressions using the WITH syntax for a long time:

The CTE-style queries would allow us to share more advanced SQL queries with those that are being used by other brands of database, and do it with standard SQL instead of proprietary functions or tricks.

The most common example of a query solved with a recursive CTE is to query a tree of unknown depth. But there are quite a few other useful applications of this form of query, all the way up to fancy stunts like a query that generates a fractal design like the Mandelbrot Set. Recursion is powerful.
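Since SQLite 3.8.3 ships the feature, you can experiment with the canonical unknown-depth tree query without any server at all, for example via Python’s bundled sqlite3 module (requires an SQLite library of 3.8.3 or newer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT);
    INSERT INTO nodes VALUES
        (1, NULL, 'root'), (2, 1, 'a'), (3, 1, 'b'), (4, 2, 'c');
""")

# Walk a tree of unknown depth: the anchor member selects the root,
# and the recursive member joins children onto the rows found so far.
rows = conn.execute("""
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM nodes WHERE parent IS NULL
        UNION ALL
        SELECT n.id, n.name, t.depth + 1
        FROM nodes n JOIN tree t ON n.parent = t.id
    )
    SELECT name, depth FROM tree ORDER BY depth, name
""").fetchall()
print(rows)  # [('root', 0), ('a', 1), ('b', 1), ('c', 2)]
```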

Is it time for the MySQL community to raise the priority of the CTE feature requests for MySQL? Visit the links I gave above at bugs.mysql.com, and add your voice by clicking the Affects Me button.

The post WITHer Recursive Queries? appeared first on MySQL Performance Blog.

Generating test data for MySQL tables

February 10, 2014 - 12:00am

One of the common tasks requested by our support customers is to optimize slow queries. We normally ask for the table structure(s), the problematic query and sample data to be able to reproduce the problem and resolve it by modifying the query, table structure, or global/session variables. Sometimes, we are given access to the server to test the queries on their live or test environment. But, more often than not, customers will not be able to provide us access to their servers or sample data due to security and data privacy reasons. Hence, we need to generate the test data ourselves.

A convenient way of generating test data is visiting http://generatedata.com, which provides a web form where you can enter the columns and their corresponding data types and turn them into test data. The website is capable of generating data in various formats such as Excel, HTML and JSON, but for MySQL you can choose either the CSV or SQL format.

The website limits you to generate up to a maximum of 100 rows. Fortunately, the web application is available for download so you can install it in your test server to generate data up to 100,000 rows instead.

Below are instructions on installing generatedata on CentOS 6:

1. Install Apache, PHP, MySQL and wget

# yum -y install mysql-server mysql httpd php php-mysql wget

2. Ensure Apache and MySQL runs on startup

# chkconfig httpd on
# chkconfig mysqld on

3. Start Apache and MySQL

# service httpd start
# service mysqld start

4. Create a MySQL user and database for the application

# mysql
mysql> GRANT ALL PRIVILEGES ON generatedata.* TO generatedata@localhost IDENTIFIED BY 'my-weak-password-please-do-not-use-this-password-in-a-production-environment';
mysql> FLUSH PRIVILEGES;
mysql> CREATE DATABASE generatedata;
mysql> quit;

5. Download generatedata source and place it in the web directory

# wget https://github.com/benkeen/generatedata/archive/3.0.8.tar.gz -O 3.0.8.tar.gz
# tar xzvf 3.0.8.tar.gz
# mv generatedata-3.0.8 /var/www/html/generatedata

6. Make the application readable and writable by Apache

# chown -R apache:apache /var/www/html/generatedata

7. Open your browser and access the web application on your test server. It will prompt you to supply MySQL credentials; enter them and click the “Continue” button to proceed.

8. The next screen will prompt you to create a settings.php file which will be used to store the custom settings for the application which include access credentials to MySQL. Click the “Create File” button to proceed.

9. You can setup an anonymous account, single account with login or multiple accounts. For testing purposes, select the default option and click the “Continue” button.

10. The next step is to install the plugins used for generating data and exporting them in a particular format. Click the “Install Plugins” button to continue.

11. Once the plugins are generated, click the “Continue” button.

12. Finally, generatedata is installed. Click “Go to script” button to continue.

Using generatedata

Below are samples of what generatedata can produce:

  • Names: “Karleigh K. Valencia”, “Claire W. Woodard”, “Yvonne Wyatt”
  • Date/Time: “2013-11-29 09:19:38”, “2013-11-29”, “11:05:53”
  • Phone/Fax: “(191) 919-9508”, “(847) 807-6360”, “(366) 902-0912”
  • Street Address: “P.O. Box 345, 8566 Mi St.”, “587-8731 Ultrices. Avenue”, “4612 Eu St.”
  • Latitude/Longitude: “37.5863, -20.25932”, “-48.29183, -69.31125”, “-78.67594”
  • Text: “Fusce aliquet magna a neque.”, “Nullam ut nisi a odio”, “mollis nec, cursus a, enim.”
  • Currency: “$3364.88”, “$7849.22”, “$1217.18”
  • Alphanumeric: “RU384”, “GL941”, “HI144”
  • AutoIncrement: 1, 2, 3, 4, 5, 6, 7
  • Number Range: 1, 3, 8, 2, 14, 7
  • Custom List: “Dr.”, “Mr.”, “Mrs.”, “Ms.”

For our example, let’s use the employees table structure from the Employees sample database.

1. Create the employees table under the test database

# mysql test
mysql> CREATE TABLE employees (
         emp_no INT NOT NULL,
         birth_date DATE NOT NULL,
         first_name VARCHAR(14) NOT NULL,
         last_name VARCHAR(16) NOT NULL,
         gender ENUM ('M','F') NOT NULL,
         hire_date DATE NOT NULL,
         PRIMARY KEY (emp_no)
       );

2. Enter the columns and corresponding datatypes in the web form. In this example, using the appropriate data type for each column is straightforward:

  • emp_no: AutoIncrement, Start at 10000, Increment 1
  • birth_date: Date, From 01/01/1960, 12/31/1990, Format Y-m-d
  • first_name: Names, Alex(any gender)
  • last_name: Names, Smith(surname)
  • gender: Custom List, Exactly 1, Values M|F
  • hire_date: Date, From 01/01/2000, To 12/11/2014, Format code Y-m-d

3. Export the data in MySQL Insert format. Under the EXPORTS TYPES section, select SQL tab and enter the following information:

  • Database table: employees
  • Database Type: MySQL
  • Remove the tick on: Include CREATE TABLE query and Include DROP TABLE query

4. Click the “Generate” button to generate the desired data. You can now copy the generated output and paste this in the MySQL console. On the other hand, you can also choose to download the generated rows as a file.

CSV output
You can also opt to generate CSV output and use LOAD DATA INFILE to import the CSV data to your table. Under the EXPORT TYPES section, select the CSV tab. Inside the Generate frame, select “Prompt to download” and click the “Generate” button.

Upload the CSV file under the /tmp directory of your MySQL server and import it via the MySQL console:

# mysql test
mysql> LOAD DATA INFILE '/tmp/dataDec-11-2013.csv' INTO TABLE employees
       FIELDS TERMINATED BY '|' IGNORE 1 LINES;

Generating more than 100,000 rows at a time
Suppose you want to generate more than 100,000 rows: you can edit /var/www/html/generatedata/settings.php and append $maxGeneratedRows = <maxGeneratedRows>; to the configuration file. The example below increases the maximum number of generated rows to 150,000.

<?php
$dbHostname = 'localhost';
$dbName = 'generatedata';
$dbUsername = 'generatedata';
$dbPassword = 'my-weak-password-please-do-not-use-this-password-in-a-production-environment';
$dbTablePrefix = 'gd_';
$encryptionSalt = 'ZPd';
$maxGeneratedRows = 150000;

Final Notes
There is just one major caveat to generating test data: nothing compares to the actual data for reproducing query performance problems, because the real data and index cardinality will be entirely different from the generated test data. So, even if you’re able to fix the problem in your test environment, the solution may or may not apply to your production environment.

The post Generating test data for MySQL tables appeared first on MySQL Performance Blog.

Followup questions to ‘What’s new in Percona XtraDB Cluster 5.6′ webinar

February 7, 2014 - 6:00am

Thanks to all who attended my webinar yesterday.  The slides and recording are available on the webinar’s page.  I was a bit overwhelmed with the amount of questions that came in and I’ll try to answer them the best I can here.

Q: Does Percona XtraDB Cluster support writing to multiple master?

Yes, it does.  However, locks are not held on all nodes while a transaction is in progress, so there is a possibility for conflicts between simultaneous writes on different nodes.  Certification is the process that Galera uses to protect your data integrity, and the problem is pushed back to the application to handle in the form of deadlock errors.   As long as you understand this and the conditions where you may see more deadlocks (and your application can properly handle them), multi-node writing is fine.  See more detail on this in a blog post I did a while ago.

Q: Is there any limitation to scale writes?

Yes, there is no sharding in Percona XtraDB Cluster, all data goes to all nodes.  You are limited by replication throughput, or more specifically, transaction apply throughput, just like with standard MySQL replication.  However, PXC offers true parallel apply, so many transactions can be applied in parallel, which should generally give better throughput than conventional master/slave.

Q: Are the WAN segments feature only used to reduce network bandwidth? Would DB write performance be the same before Galera 3.x since each commit still has to be ack by all the servers across the WAN?

If you listened closely to something Alexey said during the webinar, it’s actually possible that WAN segments will improve commit times because the master node won’t have to spend extra time sending the transaction to each remote node.  I would expect overall write performance to vary somewhat compared to not using segments, but there are probably factors that may influence it either way, including the number of remote nodes, bandwidth, latency, etc.

With or without segments, all nodes must acknowledge the transaction, this is a vital part of Galera’s implementation and cannot be relaxed.

Q: What is the max number of cluster servers across a WAN recommended before you start seeing a diminishing return in performance b/c of sync replication?

Good question, I don’t know.  This would be fun to test and would make a good blog post.  I’ll put it on my list.  I’d bet that segments may increase such a limit, but it probably depends on a lot of factors.

Practically I haven’t seen a cluster with more than a half-dozen nodes, but that doesn’t mean that’s the limit.  I’d expect a big cluster would be around 10-12 nodes, but in reality that’s just a gut feeling more than any hard evidence.

Q: Should I be worried about the auto_increment bug you mentioned? I wasn’t planning to upgrade our cluster to Percona XtraDB Cluster 5.6 soon.

Not unless you regularly add AUTO_INCREMENT columns to existing tables using ALTER TABLE.  Note that the bug was also fixed in 5.5.34.

Q: Does Percona XtraDB Cluster support SphinxSE Storage Engine?

Nope.  Currently the Codership-mysql patches are for InnoDB only.  Any other transactional storage engine can theoretically be supported, provided you can implement prioritized transactions in it.

However, SphinxSE is not transactional, so its support would be similar to MyISAM at best (which is very rudimentary and not likely to change).  It would be easy to add such support, and it’s possible that the MariaDB Galera Cluster team is already considering it, since MariaDB ships with that storage engine.

Q: To convert from mysql to Percona XtraDB Cluster 5.6, do you now recommend first upgrading to mysql 5.6 and then converting to PXC?

It depends, but I’d consider migrating to PXC to be at least equivalent to the risk of upgrading a major MySQL version and it should be tested thoroughly.  To limit your risk, breaking the upgrade into smaller pieces may be prudent, but it is not strictly necessary.  The further away you are from 5.6 (like say you are on 5.1 or 5.0), the more likely I’d recommend separate steps.

Q: Do “WAN segments” effect the quorum in any way?

No, group communication continues as before, it’s just that the routing of the actual writeset content is different.

Q: Since each galera node is identical, they all have the same storage footprint. What are best practices for expanding storage on galera nodes when we are low on free space?

On a cluster in steady state, it should be easy to do rolling changes.  That is, take one node out of the cluster, add some extra storage, and put it back in, repeat.  Ideally such rolling changes are done quickly enough and are non-destructive to the datadir so you can IST on restart.

Q: Is there any change to wsrep_retry_autocommit behavior in Percona XtraDB Cluster 5.6, or any plans to apply this setting to explicit transactions in order to avoid cluster “deadlocks”?

wsrep_retry_autocommit has not changed to my knowledge.  The reasoning behind not applying this methodology to explicit transactions is that explicit transactions may have application logic applied between the statements, so it cannot be assumed safe to simply retry the same transaction if the data changed underneath.  For example:

BEGIN;
SELECT * FROM USERS WHERE ID=100 FOR UPDATE;
# Application applies some business logic based on the row contents here???
UPDATE USERS SET AUTHORIZED=1 WHERE ID=100;
COMMIT;

If the UPDATE USERS was an autocommit, then wsrep_retry_autocommit would simply re-broadcast the same writeset (unmodified) with the new RBR row if there was a conflict.  It does not re-run the statement.  Would it be safe to do this if the explicit transaction got a cluster deadlock?  We don’t know.  If the user record was modified (which is what a cluster deadlock would indicate), should they still be authorized?

Q: I heard about xtrabackup-v2. What is the difference from the previous method?

This was actually released in 5.5.  The big changes are that the new SST method allows for encrypted SST transfers, compression, rate limiting, progress metering, and other goodness like that.  It is not backwards compatible with the old xtrabackup method, and both the donor and joiner must have the new method available (i.e., be running a PXC version that has both) for it to work.

Q: Can a cluster which use rsync to be switched to xtrabackup in a rolling like mode?

Yes, no problem.  The SST method is dictated by the JOINER, meaning whatever the joiner node’s wsrep_sst_method is, that’s what is used (the donor obeys this even if its wsrep_sst_method is different).  You can just go change the wsrep_sst_method in all the config files and it will be used next time an SST happens (since SST would only happen on a restart anyway).  Just be careful that you test xtrabackup first, since it requires proper mysql credentials to be set in wsrep_sst_auth.
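For example, switching every node to the xtrabackup SST method could be a my.cnf change like the following, rolled out one config file at a time (the sstuser credentials here are placeholders; create a user with the appropriate backup privileges yourself):

```ini
[mysqld]
# the joiner's setting wins; the donor obeys it even if its own differs
wsrep_sst_method = xtrabackup-v2
# MySQL credentials xtrabackup uses on the donor (placeholder values)
wsrep_sst_auth   = sstuser:s3cretPass
```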

Q: Would you recommend installing Percona XtraDB Cluster 5.6 in production now, or should we wait a while?

Now that 5.6 is GA, we’ve just passed a milestone where I’d start to recommend 5.6 as a starter for new Percona XtraDB Cluster deployments going forward.  Certainly 5.5 is more battle-tested and understood and there still may be good reasons to use it, but from now forward, I’m expecting that the need for 5.5 will only diminish, not increase.

For existing PXC 5.5 deployments, I’d probably recommend waiting a bit unless there is some great need for the new release, but I don’t have any specific bugs or issues I’m thinking of, just the general newness of 5.6.

Q: Are there any specific warnings or cautions to be aware of with PXC 5.6 when the database uses triggers and/or procedures, beyond the cautions in MySQL itself?

Nothing specific, these should work fine with Percona XtraDB Cluster.  In RBR these only run on the master node (the node taking the write) and the changes done by them are incorporated into the transaction’s writeset.

Q: So all the certifications come directly from the applying nodes back to the node that sent the data?  Or does it relay back through the relay node?

Actually, certification results are not shared on the cluster.  Certification is deterministic (or should be) and all nodes are expected to either pass or fail a transaction without any mixed results, hence there is no communication about pass/failure, and only the master node (the node that originated the transaction) actually does anything about certification failure (i.e., increments a local certification failure counter and returns a deadlock error to the client).   Past bugs in PXC have resulted in non-deterministic certification in some edge cases, and this can then obviously lead to node inconsistencies.

What is sent back is an acknowledgement of receipt of the transaction (which is much smaller) at the replication stage (pre-certification) and my understanding is that all nodes will reply back to the originating node more or less directly.  I tend to think of this communication as “out-and-back”, but in reality it’s more nuanced than this; for example an acknowledgement may be piggy-backed with a new transaction from that node.

The act of replication delivers the transaction payload to all nodes, and all nodes acknowledge the transaction to each other, AND within this process a consistent GTID for the transaction is established efficiently.  HOW precisely this happens is, as Alexey would say, an implementation detail.   Segments simply modify the method of transaction delivery, but I believe most of the other details are more or less the same.

Q: we are on Percona XtraDB Cluster 5.5 and have bursts of a large number of simultaneous updates to the same table which often triggers lock wait timeouts. Could binlog_row_image=minimal help reduce the frequency of these lock waits?

Lock wait timeouts in the cluster will be happening to transactions on a single node while they wait for other transactions on that same node to commit.   Any way that you can reduce the commit time should, in theory, reduce the lock waits.

Since part of that commit wait is the synchronous replication, it stands to reason that this is perhaps the bulk of the wait time.  I haven’t measured actual commit rates comparing full vs minimal row images, so I can’t tell you if this would decrease replication time or not.  I can imagine a scenario where a very large transaction would benefit from minimal row images (i.e., by being much smaller and thereby taking less time to transmit over the network), but I’d expect that when the transactions are already small and single-row to begin with, it would make less of an impact.

Are you guys using innodb_flush_log_at_trx_commit=1?  Relaxing that (and relying on the cluster replication for durability) may improve your commit times a lot.
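A hedged sketch of what relaxing that setting might look like in my.cnf (treat the value as a starting point to test, not a universal recommendation):

```ini
[mysqld]
# flush and sync the InnoDB redo log roughly once per second instead of at
# every commit; durability is then provided by the other nodes in the cluster
innodb_flush_log_at_trx_commit = 2
```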

Q: Any Percona XtraDB Cluster 5.6 or Galera 3.x settings to increase write performance across the cluster as well as clusters in different geographic locations?

I’m not aware of anything that is necessarily Galera 3 specific that is tunable, though there are things baked into Galera 3 that may help, such as the certification improvements.   As I mentioned above, minimal row images may help by simply reducing the volume of data that needs transmitting.

The post Followup questions to ‘What’s new in Percona XtraDB Cluster 5.6′ webinar appeared first on MySQL Performance Blog.

Calvin Sun on MySQL at Twitter, Percona Live 2014 and Facebook

February 6, 2014 - 12:00am

Twitter’s Calvin Sun

Twitter’s Calvin Sun (@Calvinsun2012) is looking forward to the fast-approaching Percona Live MySQL Conference and Expo this April 1-4 in Santa Clara, Calif. He’ll be speaking, yes, but he’s also looking forward to learning from his peers – particularly those at Facebook. Both companies, he explained, are in the rather unique positions of unprecedented rapid growth and ever-expanding performance demands.

Calvin is a senior engineering manager at Twitter, where he manages MySQL development. Prior to that, he managed the InnoDB team at Oracle. Calvin also worked for MySQL Inc. from 2006 to 2008, managing MySQL storage engines development. He has more than 15 years of database development experience.

He said MySQL is the data storage technology behind most Twitter data: the social graph, timelines, users, direct messages, as well as tweets themselves. With more than 230 million monthly active users and 500 million tweets per day, Twitter pushes MySQL and InnoDB a lot further than most companies. I had an opportunity to chat with Calvin via Skype the other day. Here’s that interview.

Tom: Aside from your talk, “MySQL at Twitter,” which we’ll get to soon, what are you looking forward to the most at this year’s Percona Live MySQL Conference and Expo?

Calvin: This year I’m particularly eager to attend the Facebook sessions because Twitter faces some of the same challenges as they do. Learning from each other within the MySQL community is very valuable and I look forward to this event every April.  For me, and my team, we always get the most excited about the opportunities to learn from the experiences of other vendors and other MySQL users.

We are also very pleased to see Oracle join this conference for the second year in a row. Their attendance is very beneficial to the MySQL community because we want to better understand what’s going on in MySQL development at Oracle. And hopefully we’ll also have some visibility and perhaps even some influence into their roadmap in terms of upcoming releases.

Multi-datacenter support and fast failover are also areas we are interested in. And of course since I’m from the InnoDB team within MySQL, I’d like to see what’s new there, too.

And finally, I’m looking forward to participating in a panel Q&A with our DBA team. The session is called “MySQL at Twitter” and in it we’ll field questions on how MySQL is used within Twitter, along with the challenges we have now and have faced in the past. It will be led by my colleague Will Gunty, a senior MySQL DBA.

Tom: Let’s take a look at some of those Facebook sessions you said you’re interested in – they have several spread out over each day of the conference:

Wow, that is a fantastic collection of sessions just from Facebook alone!

Calvin: Facebook is probably the largest MySQL installation in the world and they are in a lot of ways facing some of the same challenges we are facing. Facebook has a bigger MySQL team than we do and are also a little bit ahead of us, especially on MySQL 5.6 and InnoDB compression, so we can learn new things from those sessions – in addition to having direct conversations with members of the Facebook team outside of those sessions, as we always do. They have created a lot of innovative solutions to address their problems.

Tom: You’ll be leading a session titled “Scaling Twitter with MySQL” along with a couple members of your team.  With Twitter’s explosive growth since its founding in 2006, and as that activity continues to grow with Twitter being used in new ways, how has your team kept pace? And how are you preparing for the future?

Calvin: It is a huge scale. During my session I’ll talk about how MySQL evolved within Twitter over the years. Many small companies just starting out, including Twitter in the early days, use master-slave replication. But as Twitter started to grow, that sort of setup could no longer meet our needs, so over the years we starting developing our own sharding and replication framework. We still use MySQL as our basic data storage, though, and working creatively like this enabled us to scale fast.

For us, usage is in real-time and is event driven.  For example, the World Cup or the Olympics or the Super Bowl – those types of very popular events send peaks up several times higher than average. For that reason we have to always be prepared for whatever sort of usage levels we might encounter.

Tom: Who should make it a point to attend your session?

Calvin: I would say people working at large companies as well as startups, especially Internet companies that want to use MySQL and need to learn how to prepare for growth. Also anyone who needs to ensure that their company’s architecture and infrastructure are in fact prepared for rapid growth if needed. They’ll learn how Twitter prepared for and addressed specific issues.

Tom: What will they walk away with?

Calvin: They’ll walk away knowing 1) how to build a scalable infrastructure using MySQL and 2) how to plan for growth.

Tom: What are some of the bigger projects or challenges you and your team face in 2014?

Calvin: One of the bigger things we are in the process of doing is MySQL 5.6 migration. But for Twitter overall, I think that as it grows, ensuring reliability and scalability has always been a priority for us. We are also looking to do a better job at failover – above and beyond failover between our data centers. So we’ll be looking to improve that kind of operational efficiency this year.

Another area is optimization. We’re looking into the structure of our MySQL installation within Twitter to see which areas we can improve and make more efficient. One area is data compression, which we currently don’t use in production. We want to see what we can achieve in reducing data sizes and saving disk space.

Tom: Before we started the interview you mentioned that you’ve been attending these MySQL conferences since 2006. Why do you feel it is important to be there personally?

Calvin: Because I strongly support the MySQL community in general and try to do my best to participate and serve the community. For example last year I gave presentations on MySQL/InnoDB in Beijing and Seoul, plus OSCON and MySQL Connect. Those in addition to Percona Live and related events give me a chance to share my own experiences and lessons learned with the community. As I’ve already mentioned it also gives me the chance to learn.

Tom: Thanks very much for speaking with me today, Calvin. I look forward to seeing you in April! And as a small token of my appreciation, here’s a discount code you can share with your friends and colleagues (everyone reading this can take advantage of it, too). Just have them enter SeeMeSpeak on the Percona Live 2014 registration page and they’ll automatically save 10 percent.

Calvin: Thanks, Tom. We really appreciate the hard work Percona and the Conference Committee does in terms of organizing these MySQL conferences with all of the excellent sessions. See you in April!

The post Calvin Sun on MySQL at Twitter, Percona Live 2014 and Facebook appeared first on MySQL Performance Blog.

Analyze MySQL Query Performance with Percona Cloud Tools: Feb. 12 webinar

February 5, 2014 - 6:00am

Next week (Wednesday, February 12 at 10 a.m. Pacific) I will host a webinar titled “Analyze MySQL Query Performance with Percona Cloud Tools.” Percona Cloud Tools, currently in beta, reveals new insights about MySQL performance enabling you to improve your database queries and applications. (You can request access to the free beta here).

For webinar attendees, Percona will raffle five (5) 60-minute MySQL query consulting sessions with me to analyze your Percona Cloud Tools query data and provide feedback and performance suggestions. Read below for how to win.

In the webinar I will cover:

  • Setting up Percona Cloud Tools Query Analytics
  • Deciphering advanced database query metrics
  • Identifying the best opportunities to achieve significant query performance improvements

Set up Percona Cloud Tools

Getting started with Percona Cloud Tools takes only a few steps (after creating your account). Users of Percona Toolkit can even skip step 1.

  1. Install Percona Toolkit (2.2.6 or newer) on each MySQL database server
  2. Run “pt-agent --install” on each database server and paste your API key when asked
  3. Log in to https://cloud.percona.com and enable Query Analytics

Win a query consulting session

As I mentioned earlier, Percona will raffle five (5) 60-minute MySQL query consulting sessions with me to analyze your Percona Cloud Tools / Query Analytics data and provide feedback and performance suggestions. To qualify for the raffle, you must complete the following by 12 a.m. UTC on Thursday, February 13:

Good luck and see you on February 12th!

In the meantime, here are some related posts I’ve written about Percona Cloud Tools:


About Percona Cloud Tools
Percona Cloud Tools is a hosted service providing access to query performance insights for all MySQL users. After a brief setup, unlock new information about your database and how to improve your applications.

The post Analyze MySQL Query Performance with Percona Cloud Tools: Feb. 12 webinar appeared first on MySQL Performance Blog.

16000 active connections – Percona Server continues to work when others die

February 4, 2014 - 4:59pm

We just published results with improvements in Thread Pool in Percona Server:
Percona Server: Thread Pool Improvements for Transactional Workloads
Percona Server: Improve Scalability with Thread Pool

What I am happy to see is that Percona Server is able to handle a tremendous number of user connections. As our charts show, it can sustain 16,000 active connections without a decline in throughput.

In fact, in our experiments we used 25,000 connections and even more, so I think we now face OS limitations in handling connections rather than Percona Server’s limitations. As the posts above show, both MySQL and MariaDB drop to the floor somewhere along the way.

You may ask whether 16,000 connections are really necessary. I would say that we do see this in regular workloads, but I propose thinking of the Thread Pool as an electrical fuse, one that provides overcurrent protection. See my post SimCity outages, traffic control and Thread Pool for MySQL.

Even if you do not have many connections on regular days, one mention on Yahoo’s homepage or on Hacker News will change that in a moment. The Thread Pool will act as protection against overloading the database.

There is also another application for the Thread Pool. I often see that the application layer creates its own connection pool (think Java), and with ~20-30 app servers it can get to ~3,000 connections, which is already problematic… but not for Percona Server!

The post 16000 active connections – Percona Server continues to work when others die appeared first on MySQL Performance Blog.

ClusterControl for Percona XtraDB Cluster Improves Management and Monitoring

February 4, 2014 - 6:03am

ClusterControl for Percona XtraDB Cluster is now available in three different versions thanks to our partnership with Severalnines. ClusterControl will make it simpler to manage and monitor Percona XtraDB Cluster, MySQL Cluster, MySQL Replication, or MySQL Galera.

I am very excited about our GA release of Percona XtraDB Cluster 5.6 last week. As Vadim described in his blog post announcing the release, we have brought together the benefits of Percona Server 5.6, Percona XtraBackup and Galera 3 to create a drop-in compatible, open source, state-of-the-art high availability MySQL clustering solution. We could not have done it without the close cooperation and help of our partners at Codership. To learn more, join us on February 5th for a webinar entitled “What’s New in Percona XtraDB Cluster 5.6” presented by Jay Janssen.

While these new features in Percona XtraDB Cluster are highly valuable, we regularly receive feedback from users that they would like more tools to make it easier to manage and monitor their MySQL clusters. In response to that feedback, I’m pleased to announce our partnership with Severalnines to make ClusterControl more available to Percona XtraDB Cluster users and, in fact, all of our Support customers who use MySQL clustering solutions.

Percona XtraDB Cluster users now have three choices:

On the last option, Percona ClusterControl is a privately branded version of ClusterControl supplied to us by Severalnines. With a Percona Support contract covering Percona XtraDB Cluster, MySQL Cluster, or MySQL Galera, we provide support for your deployment and use of Percona ClusterControl. Percona has worked directly with Severalnines to ensure full compatibility of ClusterControl Community Edition with Percona XtraDB Cluster. The Percona Support team will provide assistance with the configuration and administration of Percona ClusterControl. We will help you get set up, and then we will help you make the most of your new management and monitoring tools for the high availability solution of your choice.

We will also provide support for Severalnines ClusterControl Enterprise Edition when it is purchased from Severalnines or Percona and used with a cluster deployment covered by a Percona Support subscription that includes coverage of Percona XtraDB Cluster, MySQL Galera, or MySQL Cluster. ClusterControl Enterprise Edition is the de-facto management solution for Galera-based clusters and it provides the highest levels of monitoring and management capabilities for Percona XtraDB Cluster users.

Percona ClusterControl will provide our Support customers broad flexibility to deploy, monitor, and manage Percona XtraDB Clusters, supporting on-premise and Amazon Web Services deployments, as well as deployments spread across multiple cloud vendors. Percona ClusterControl can be used to automate the deployment of Percona XtraDB Cluster in cross-region, multi-datacenter setups. Percona ClusterControl offers a full range of management functionality for clusters, including:

  • Node and cluster recovery
  • Configuration management
  • Performance health checks
  • Online backups
  • Upgrades
  • Scale out
  • Cloning of clusters
  • Deploying HAProxy load balancers

Percona ClusterControl can be used with multiple clustering solutions including MySQL Cluster, MySQL Replication, MySQL Galera, MariaDB Galera Cluster, and MongoDB. Percona Support subscribers that elect coverage for these clustering solutions receive Percona ClusterControl with their subscriptions.

Just as we made Percona XtraDB Cluster a much better solution by partnering with Codership, we now expect it will be a much easier to manage and monitor solution thanks to our partnership with Severalnines. Whether you need Percona Support or not, I hope you will take advantage of these powerful management and monitoring solutions which are now available for Percona XtraDB Cluster. To learn more about using ClusterControl to manage Percona XtraDB Cluster, please join us on February 19th for a webinar entitled “Performance Monitoring and Troubleshooting of Percona XtraDB Cluster” presented by Peter Boros.

The post ClusterControl for Percona XtraDB Cluster Improves Management and Monitoring appeared first on MySQL Performance Blog.

Quick installation guide for Percona Cloud Tools for MySQL

February 3, 2014 - 10:54am

Here in Percona Support, we’re receiving several requests per day for help with Percona Cloud Tools installation steps.

So I decided to prepare a step-by-step example of the installation process with some comments based on experience.  Percona Cloud Tools is a hosted service providing access to query performance insights for all MySQL users. After a brief setup, you’ll unlock new information about your database and how to improve your applications. You can sign up here to request access to the free beta, currently under way.

Some notes
  • It’s recommended to do the installation as root.
  • If you’re installing pt-agent as root, .pt-agent.conf should be placed in root’s $HOME.
  • You can become root with the “sudo -s” command, but in that case your home directory remains unchanged rather than becoming the root user’s.
  • So I would strongly recommend logging in as root, or using “sudo -i” to become root, and checking that your HOME and PWD match: env | egrep -i 'home=|pwd='
Sign Up

Go to the URL and sign up (or log-in): https://cloud.percona.com/


Copy your API key:
On this URL: https://cloud.percona.com/api-key
Or by menu: Agents -> API Key

Percona-Toolkit Download

Download Percona-Toolkit:
For example:

  • DEB: for Debian/Ubuntu like systems
  • RPM: for RedHat, CentOS
Percona-Toolkit Installation

yum install http://www.percona.com/redir/downloads/percona-toolkit/LATEST/RPM/percona-toolkit-2.2.6-1.noarch.rpm

wget http://www.percona.com/redir/downloads/percona-toolkit/LATEST/deb/percona-toolkit_2.2.6_all.deb
dpkg -i percona-toolkit_2.2.6_all.deb

pt-agent installation

Run this command:
pt-agent --install --user={mysql username} --password={password} --api-key={API Key copied from web site}

Note: substitute your MySQL username, password, and the API key copied from the web site

pt-agent installation output

You should see this:

=============================================================================
pt-agent --install --user={user} --password={password} --api-key={api-key}
Step 1 of 11: Verify the user is root: OK
Step 2 of 11: Check Perl module dependencies: OK
Step 3 of 11: Check for crontab: OK
Step 4 of 11: Verify the API key: OK
Step 5 of 11: Connect to MySQL: OK
Step 6 of 11: Check if MySQL is a slave: NO
Step 7 of 11: Create a MySQL user for the agent: OK
Step 8 of 11: Initialize /etc/percona/agent/my.cnf: OK
Step 9 of 11: Initialize /root/.pt-agent.conf: OK
Step 10 of 11: Create the agent: OK
Step 11 of 11: Run the agent:
2014-01-29T20:12:17 INFO Starting agent
pt-agent has daemonized and is running as PID 13506:
  --lib /var/lib/pt-agent
  --log /var/log/pt-agent.log
  --pid /var/run/pt-agent.pid
These values can change if a different configuration is received.
OK
INSTALLATION COMPLETE
The agent has been installed and started, but it is not running any services yet.
Go to https://cloud.percona.com/agents#node1 to enable services for the agent.
=============================================================================

Agent is installed

pt-agent configuration

Go to https://cloud.percona.com/agents and select your newly created agent

Then enable Service:

  • Agents -> Services -> Query Analytics -> On (push the On/Off button and configure the values)
  • Save


Now wait a few minutes, then check the Status Log: Agents -> select agent -> Status Log
You should see: “Agent OK”

Then wait (~3 min) until pt-agent adds its jobs to the crontab.
You should see: “Services OK”

Then run some slow queries and wait (~3-5 min).
If everything is OK, you should see “Exit: 200”

Now check Query Analytics.

There is a “Help Me” button at the bottom of the page so you can ask for Support if you have any questions and our Support team will gladly help you.


The post Quick installation guide for Percona Cloud Tools for MySQL appeared first on MySQL Performance Blog.

Percona Toolkit collection: pt-visual-explain

February 3, 2014 - 12:00am

This is the first in a series of posts highlighting a few of the seldom-used but still handy Percona Toolkit tools.

Have you ever had a problem understanding EXPLAIN output? And are you the type of person who would rather use the command line than a GUI application? Then I would recommend Percona’s pt-visual-explain tool. It is one of many Percona Toolkit tools, and it is useful for anyone who wants a different, easier-to-understand view of EXPLAIN output than the usual table and vertical formats.

As described in the documentation – http://www.percona.com/doc/percona-toolkit/2.2/pt-visual-explain.html#description

“pt-visual-explain reverse-engineers MySQL’s EXPLAIN output into a query execution plan, which it then formats as a left-deep tree – the same way the plan is represented inside MySQL.”

Our sample query:

SELECT City.Name, Country.Name FROM City JOIN Country on (City.CountryCode=Country.Code)

In Table view the EXPLAIN output would look like:

$ mysql -Dworld -e "explain SELECT City.Name, Country.Name FROM City JOIN Country on (City.CountryCode=Country.Code)"
+----+-------------+---------+------+---------------+-------------+---------+--------------------+------+-------+
| id | select_type | table   | type | possible_keys | key         | key_len | ref                | rows | Extra |
+----+-------------+---------+------+---------------+-------------+---------+--------------------+------+-------+
|  1 | SIMPLE      | Country | ALL  | PRIMARY       | NULL        | NULL    | NULL               |  183 |       |
|  1 | SIMPLE      | City    | ref  | CountryCode   | CountryCode | 3       | world.Country.Code |    1 |       |
+----+-------------+---------+------+---------------+-------------+---------+--------------------+------+-------+

While on vertical view, the EXPLAIN output would be:

$ mysql -Dworld -e "explain SELECT City.Name, Country.Name FROM City JOIN Country on (City.CountryCode=Country.Code)\G"
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: Country
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 183
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: City
         type: ref
possible_keys: CountryCode
          key: CountryCode
      key_len: 3
          ref: world.Country.Code
         rows: 1
        Extra:

Using the tool…

Below I will show you three different ways to use this handy tool to get the EXPLAIN output in pt-visual-explain’s tree format.

Piping the EXPLAIN output into pt-visual-explain, as shown below:

$ mysql -Dworld -e "explain SELECT City.Name, Country.Name FROM City JOIN Country on (City.CountryCode=Country.Code)" | pt-visual-explain

Saving the EXPLAIN output to a file and then passing the file to pt-visual-explain, which reverse-engineers the output into a query execution plan and formats it as a tree, as shown below:

$ mysql -Dworld -e "explain SELECT City.Name, Country.Name FROM City JOIN Country on (City.CountryCode=Country.Code)" > explain.out
$ cat explain.out
id  select_type  table    type  possible_keys  key          key_len  ref                 rows  Extra
1   SIMPLE       Country  ALL   PRIMARY        NULL         NULL     NULL                183
1   SIMPLE       City     ref   CountryCode    CountryCode  3        world.Country.Code  1
$ pt-visual-explain explain.out

Creating a file containing the query and then passing that file to pt-visual-explain, which connects to the MySQL instance itself, as in the example below:

$ echo "SELECT City.Name, Country.Name FROM City JOIN Country on (City.CountryCode=Country.Code)" > select.out
$ pt-visual-explain -Dworld --connect select.out

The tree structure below would be the result if you run any of the three methods shown above.

JOIN
+- Bookmark lookup
|  +- Table
|  |  table          City
|  |  possible_keys  CountryCode
|  +- Index lookup
|     key            City->CountryCode
|     possible_keys  CountryCode
|     key_len        3
|     ref            world.Country.Code
|     rows           1
+- Table scan
   rows           183
   +- Table
      table          Country
      possible_keys  PRIMARY

So how do we interpret the EXPLAIN plan? Simple. The query will do a table scan on the Country table for 183 rows. For each row, it will do an index lookup into the City.CountryCode index with the value from Country.Code, then do a bookmark lookup into the City table.
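As an illustration of that execution order, here is a toy nested-loop sketch in Python. The table contents are invented and this is not how MySQL executes plans internally – it just shows the shape of the left-deep tree above:

```python
# Toy sketch of the left-deep plan above: scan Country, and for each
# row do an index lookup into City by CountryCode (invented data).
country = [("NLD", "Netherlands"), ("FRA", "France")]  # table-scan side
city_by_code = {                                       # stands in for the CountryCode index
    "NLD": [("Amsterdam",), ("Rotterdam",)],
    "FRA": [("Paris",)],
}

result = []
for code, country_name in country:                    # outer: table scan (183 rows in the real data)
    for (city_name,) in city_by_code.get(code, []):   # inner: index lookup + bookmark lookup
        result.append((city_name, country_name))

print(result)
```

The outer loop corresponds to the "Table scan" branch of the tree, and the inner loop to the "Index lookup" plus "Bookmark lookup" branch.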

Please read related posts about limits and errors, and Extended EXPLAIN here and here for more in-depth understanding of MySQL EXPLAIN plans.

And here’s another example, using an EXPLAIN DELETE statement (which works on MySQL version 5.6.3 and above):

$ mysql -Dworld -e "EXPLAIN DELETE from Cities where CountryCode IN (select Code from Country where LifeExpectancy < 70.0)" | pt-visual-explain
DEPENDENT SUBQUERY
+- Filter with WHERE
|  +- Bookmark lookup
|     +- Table
|     |  table          Country
|     |  possible_keys  PRIMARY
|     +- Unique subquery
|        key            Country->PRIMARY
|        possible_keys  PRIMARY
|        key_len        3
|        ref            func
|        rows           1
+- Filter with WHERE
   +- Table scan
      rows           581
      +- Table
         table          Cities

Alternatively, you can use the --format=dump option when parsing input into pt-visual-explain; it produces Data::Dumper output, which I think is harder to read but may be usable in some other cases.


The next time you want to check your execution plan, you can use pt-visual-explain to complement the output of a regular EXPLAIN statement. You can also use MySQL Workbench, a GUI application that produces a beautified representation of the visual explain with colors, numbers on the lines and other information.

For further reading please consult the sections under Modules and Algorithm on the pt-visual-explain documentation.

Other helpful information: pt-visual-explain is derived from Maatkit, specifically mk-visual-explain. You can download Percona Toolkit from our download page – http://www.percona.com/downloads/ – or get pt-visual-explain alone with: wget percona.com/get/pt-visual-explain.

The post Percona Toolkit collection: pt-visual-explain appeared first on MySQL Performance Blog.

Percona XtraDB Cluster 5.6 GA release now available

January 30, 2014 - 5:01pm

Percona is pleased to announce the first General Availability release of the leading open source High Availability solution for MySQL, Percona XtraDB Cluster 5.6, on January 30, 2014. Binaries are available from the downloads area or from our software repositories.

Percona XtraDB Cluster 5.6
Percona XtraDB Cluster 5.6 is an active/active cluster solution for High Availability (HA) MySQL that delivers performance and MySQL-based cluster management features available nowhere else. This free, open source solution combines Percona Server (a drop-in MySQL replacement), Galera, wsrep API, and Percona XtraBackup, in a single package that’s easier to deploy and manage than the various approaches to MySQL replication. Applications that require high availability will benefit from this combination of the latest versions of these software solutions to deliver cost-effective, high performance based on the latest features and fixes available in Percona Server 5.6, Galera 3, wsrep API 25, and Percona XtraBackup 2.1.

Percona XtraDB Cluster expert Jay Janssen will present a webinar titled “What’s New in Percona XtraDB Cluster 5.6” on February 5 at 1PM EST/10AM PST to talk about some of the new features and upgrade strategies and to answer your questions. Alexey Yurchenko, solutions architect from Codership, will join Jay to contribute additional insights on Galera and the Percona XtraDB Cluster solution.

Percona Server 5.6
As our previous performance analysis demonstrates, MySQL 5.6 was a big step forward from MySQL 5.5. However, Percona Server 5.6 yields significantly better results than MySQL 5.6 in IO-bound cases. The ThreadPool feature in Percona Server 5.6 delivers performance advantages over MySQL 5.6 with as few as 150 concurrent threads and up to 4x better performance as concurrent threads grow into the 1,000s. ThreadPool efficiency improvements in Percona Server 5.6 have even been shown to deliver similar gains over MariaDB 10.0.7 in transactional workload scenarios.

Other Percona Server 5.6 features that benefit Percona XtraDB Cluster 5.6 include:

  • Expanded diagnostics capability which reveals new insights into user activity with Performance Schema and User Stats features as well as query activity with the Slow Query Log
  • Support for greater control of the end-user experience with the integration of the Statement Timeout feature from Twitter’s MySQL contributions

Galera 3
Galera from Codership provides synchronous replication for greater data consistency. The latest version, Galera 3, includes new features and fixes to minimize replication traffic between WAN segments, improved memory performance, greater data integrity, and improved data throughput.

Percona XtraBackup 2.1
Adding new nodes and recovering failed nodes is an important part of managing a high availability MySQL cluster. Percona XtraDB Cluster 5.6 now defaults to use Percona XtraBackup 2.1 to transfer node state for these processes. Though Galera supports other choices, Percona XtraBackup 2.1 is the best option because it:

  • Has the fastest State Snapshot Transfers (SST) with parallel, compressed, and compacted options
  • Is highly secure with industry standard AES multi-threaded data encryption available

Drop-in Compatibility
Percona XtraDB Cluster has Percona Server at its core so it is drop-in compatible with MySQL. As a result, upgrades to Percona XtraDB Cluster 5.6 are straightforward from a variety of platforms including:

  • Percona XtraDB Cluster 5.5
  • Percona Server 5.6
  • Percona Server 5.5
  • MySQL 5.6
  • MySQL 5.5
  • MariaDB 5.5

Follow our upgrade steps for moving to Percona XtraDB Cluster 5.6 with no downtime.

Additional Help
You can download the operations manual now at the Percona XtraDB Cluster page on percona.com. Community-based help is available for Percona XtraDB Cluster in the Percona Community Forums.

Commercial help with Percona XtraDB Cluster 5.5 and 5.6 is available from the Percona Consulting team. The Percona Support team offers optional coverage for Percona XtraDB Cluster. The Percona Remote DBA team can provide outsourced management of Percona XtraDB Cluster deployments.

All of Percona‘s software is open-source and free. Details of the release can be found in the 5.6.15-25.3 milestone at Launchpad and in the release notes.

New Features

  • New meta packages are now available in order to make the Percona XtraDB Cluster installation easier.
  • Debian/Ubuntu debug packages are now available for Galera and garbd.
  • xtrabackup-v2 SST now supports GTID replication.
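For reference, a node configuration combining GTID settings with the new xtrabackup-v2 SST default might look like the following my.cnf sketch. The server-id and log-bin values are placeholders that must be adapted per node; check the documentation for your version before copying:

```ini
# Sketch of per-node my.cnf settings: GTID enabled plus the
# xtrabackup-v2 SST method (server-id/log-bin are placeholders)
[mysqld]
server-id                = 2
log-bin                  = percona2-bin
log_slave_updates
gtid_mode                = ON
enforce_gtid_consistency = 1
wsrep_sst_method         = xtrabackup-v2
```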

Bugs Fixed

  • Node would get stuck and required restart if DDL was performed after FLUSH TABLES WITH READ LOCK. Bug fixed #1265656.
  • Galera provider pause has been fixed to avoid potential deadlock with replicating threads.
  • The default value for the binlog_format variable is now ROW, so that Percona XtraDB Cluster is not started with wrong defaults that could lead to non-deterministic outcomes like a crash. Bug fixed #1243228.
  • During the installation of the percona-xtradb-cluster-garbd-3.x package, Debian tries to start it, but since the configuration is not set yet, it would fail to start and leave the installation in the iF (half-configured) state. Bug fixed #1262171.
  • Runtime checks have been added for dynamic variables which are Galera incompatible. Bug fixed #1262188.
  • During the upgrade process, parallel applying could hit an unresolvable conflict when events were replicated from Percona XtraDB Cluster 5.5 to Percona XtraDB Cluster 5.6. Bug fixed #1267494.
  • xtrabackup-v2 is now used as default SST method. Bug fixed #1268837.
  • FLUSH TABLES WITH READ LOCK behavior on the same connection was changed to conform to MySQL behavior. Bug fixed #1269085.
  • Read-only detection has been added in clustercheck, which can be helpful during major upgrades (this is used by xinetd for HAProxy etc.) Bug fixed #1269469.
  • The binary log directory is now cleaned up as part of the XtraBackup SST. Bug fixed #1273368.
  • First connection would hang after changing the wsrep_cluster_address variable. Bug fixed #1022250.
  • When the gmcast.listen_addr variable was set manually, it did not allow the node’s own address in the gcomm address list. Bug fixed #1099478.
  • GCache file allocation could fail if file size was a multiple of page size. Bug fixed #1259952.
  • Group remerge after partitioning event has been fixed. Bug fixed #1232747.
  • Fixed multiple build bugs: #1262716, #1269063, #1269351, #1272723, #1272732, and #1253055.

Other bugs fixed: #1273101, #1272961, #1271264, and #1253055.

We did our best to eliminate bugs and problems while testing this release, but this is software, so bugs are expected. If you encounter them, please report them to our bug tracking system.

The post Percona XtraDB Cluster 5.6 GA release now available appeared first on MySQL Performance Blog.

Percona Live MySQL Conference and Expo 2014: Things I’m looking forward to

January 30, 2014 - 3:00am

The Percona Live MySQL Conference and Expo 2014 is just two months away. I’m very excited about this year’s event which continues our tradition of open technical discussions and inviting all parties to the table.

We have a great amount of technical talks from Oracle – I’m especially excited about future-focused talks shedding some light about what to expect in MySQL 5.7 and beyond. This content is best covered by developers actually designing the system and writing the code. We also have great coverage of MySQL alternatives from Percona and MariaDB. You can view the entire program here.

It is great to see well-rounded coverage of many newer MySQL technologies – Percona XtraDB Cluster and Galera replication as a new generation of high-availability solutions, TokuDB as an exciting write-optimized storage engine with a high level of compression, Tungsten replication technology, and InfiniDB, a column store technology.

We’re also going beyond MySQL technologies this year and looking at technologies that you might be running alongside MySQL – we’ll have tutorials around MongoDB, Hadoop and OpenStack specially tailored to MySQL users.

As someone who started my career in Technical Operations I know that the real world is always different from the theory and that real systems’ behavior is rarely completely covered by documentation, so I love to see so many talks done by practitioners sharing their real-life lessons including: Facebook, Google, Twitter, Yahoo, LinkedIn, Salesforce, Tumblr, Dropbox, Github, Booking.com, Box, Groupon and others.

Are you joining us for this great event? It is a great time to register now as Early Bird special prices end this Sunday, February 2nd. Our sponsor list is also filling fast with industry leaders stepping up as top level supporters of the conference. A limited number of spaces are still available.

The post Percona Live MySQL Conference and Expo 2014: Things I’m looking forward to appeared first on MySQL Performance Blog.

Percona Server: Thread Pool Improvements for Transactional Workloads

January 29, 2014 - 2:00pm

In a previous thread pool post, I mentioned that in Percona Server we used an open source implementation of MariaDB’s thread pool, and enhanced/improved it further. Below I would like to describe some of these improvements for transactional workloads.

When we were evaluating MariaDB’s thread pool implementation, we observed that it improves scalability for AUTOCOMMIT statements. However, it does not scale well with multi-statement transactions. The UPDATE_NO_KEY test, run both as an AUTOCOMMIT statement and inside a transaction, gave the following results:

After analysis, we identified the major cause of that inefficiency: high latency between individual statements inside transactions. This looked pretty similar to the case where transactions are executed in parallel without a thread pool. Latency is high there as well, though the cause of the high latency differs between the two cases.

  • In the “one-thread-per-connection” case, with 1000 connections, higher latency is caused by increased contention on accessing MySQL server shared resources like structures/locks/etc.
  • In the case of “pool-of-threads”, 1000 client connections will be organized into thread_pool_size groups (or, to be more specific, into thread_pool_size queues), and latency here comes not from contention, since we have a much smaller number of parallel threads – it comes from the execution order of individual statements in transactions. Suppose you have 100 identical transactions (each with 4 statements) in a thread group queue. As transactions are processed and executed sequentially, the statements of transaction T1 will be placed at positions 1…101…201…301…401 in the thread pool queue. So, with a uniform workload, the distance between consecutive statements of a transaction will be 100. This way transaction T1 may hold server resources during the execution of all statements between positions 1 and 401 in the thread pool queue, and that has a negative impact on performance.

In an ideal world, the number of concurrent transactions does not matter, as long as we keep the number of concurrent statements sufficiently low. Reality is different though. An open transaction which is not currently executing a statement may still block other connections by holding metadata or row-level locks. On top of that, any MVCC implementation should examine states of all open transactions and thus may perform less efficiently with large numbers of transactions (we blogged about InnoDB-specific problems here and here).

In order to help transactions execute as fast as possible, we introduced high- and low-priority queues for the thread pool. Now, with the default thread pool settings, we check every incoming statement: if it belongs to an already started transaction we put it into the high-priority queue, otherwise it goes into the low-priority queue.

This reordering allowed us to notably reduce latency and resulted in very good scalability up to a very high number of threads. You can find details about this optimization in our documentation.
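To see why prioritizing open transactions shrinks the distance between their statements, here is a toy queue simulation in Python. It is a deliberate simplification of the real thread pool (invented model, single queue pair, 100 transactions of 4 statements each to match the example above):

```python
from collections import deque

TXNS, STMTS = 100, 4  # 100 identical transactions of 4 statements each

def run(prioritize):
    """Return the execution positions of transaction 0's statements."""
    high, low = deque(), deque()
    for t in range(TXNS):              # every transaction's opening statement is queued
        low.append((t, 0))
    positions, pos = {}, 0
    while high or low:
        txn, stmt = high.popleft() if high else low.popleft()
        positions.setdefault(txn, []).append(pos)
        pos += 1
        if stmt + 1 < STMTS:           # the transaction's next statement arrives
            (high if prioritize else low).append((txn, stmt + 1))
    return positions[0]

print("plain FIFO :", run(False))      # statements ~100 positions apart
print("prioritized:", run(True))       # statements run back to back
```

With a single FIFO queue the first transaction's statements execute at positions 0, 100, 200 and 300 (the 1…101…201…301 pattern from the example); with the high-priority queue they execute consecutively, so the transaction releases its locks much sooner.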

Now let’s check how these introduced changes will affect the workload we mentioned earlier in this article.

The next graphs show the results for the UPDATE_NO_KEY test, run as an AUTOCOMMIT statement and inside a transaction, for MariaDB with thread_pool, for Percona Server with thread_pool_high_prio_mode=statements (which behaves very similarly to MariaDB’s thread_pool), and for Percona Server with thread_pool_high_prio_mode=transactions – the optimization that reorders the transactions’ statements in the thread pool queue.

This works even more efficiently on larger transactions like OLTP_RW from sysbench. See the graphs below for the same set of servers:

IO bound: sysbench dataset 32 tables/12M rows each (~100GB), InnoDB buffer pool=25GB

As seen, we get nearly flat throughput with thread_pool_high_prio_mode=transactions even with very high numbers of user connections.

The post Percona Server: Thread Pool Improvements for Transactional Workloads appeared first on MySQL Performance Blog.

MySQL performance optimization: Don’t guess! Measure with Percona Cloud Tools

January 29, 2014 - 7:49am

In our practice we often see that MySQL performance optimization is done in a sort of “black magic” way. A common opinion is: “If there is a performance problem – then it must be a database issue, because what else could it be? And if this is a database issue, then it must involve IO problems because the reason for a slow database is always a slow IO…”  Following this logic might actually give a result, but achieving a fully successful resolution would require magic.

At Percona we use a different approach: performance optimization should be based not on guesses but on exact measurements. As applied to databases, as I described previously, we use queries as the units of work. And proper measurement of these units is the first important task in performance optimization.

Let me list the metrics of what our Percona Cloud Tools provides:

  • Query count – How many times the query was executed
  • Query_time – Total time MySQL spent on query execution
  • Lock_time – Time spent waiting on table-level locks
  • Rows_sent – How many rows the query returned to the application
  • Rows_examined – How many rows MySQL actually had to read

In my previous post I noted that the Rows_sent/Rows_examined ratio is of particular interest for OLTP workloads.
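As a quick illustration of that ratio (the numbers here are invented): a query that examines far more rows than it sends is doing wasted work, which often signals a missing index:

```python
# Rows_sent / Rows_examined efficiency ratio (invented example values)
rows_sent, rows_examined = 10, 10_000

ratio = rows_sent / rows_examined
# 1.0 is ideal; 0.001 means 1000 rows were read for every row returned
print(f"efficiency: {ratio:.4f}")
```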

The following metrics are available only for Percona Server – not because we crippled our tools, but because MySQL simply does not provide them. It is worth remembering that one of the main goals in making Percona Server was to provide diagnostics and transparency.

Percona Server metrics:

  • Rows_affected
  • Merge_passes
  • InnoDB_IO_r_bytes
  • InnoDB_IO_r_ops
  • InnoDB_IO_r_wait
  • InnoDB_pages_distinct
  • InnoDB_queue_wait
  • InnoDB_rec_lock_wait
  • Query_length
  • Bytes_sent
  • Tmp_tables
  • Tmp_disk_tables
  • Tmp_table_sizes

The meaning of each of these metrics is documented on our docs page.

Now, often it is not enough to provide a single value for a given metric, so the following statistics are available:
Total, Average, Minimum, Median, 95th percentile, Maximum and Stddev.
Add a trending graph for each of these and you get a full matrix for your query:
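For illustration, here is how those statistics could be computed for one metric. This is a generic Python sketch with invented sample values, not the actual Percona Cloud Tools implementation (the 95th percentile here uses the nearest-rank method, which may differ from what the product uses):

```python
import math
import statistics

query_times = [0.002, 0.003, 0.003, 0.004, 0.010, 0.250]  # seconds, invented sample

n = len(query_times)
ordered = sorted(query_times)
stats = {
    "total":   sum(query_times),
    "average": statistics.mean(query_times),
    "minimum": min(query_times),
    "median":  statistics.median(query_times),
    "p95":     ordered[math.ceil(0.95 * n) - 1],  # nearest-rank 95th percentile
    "maximum": max(query_times),
    "stddev":  statistics.stdev(query_times),
}
for name, value in stats.items():
    print(f"{name:>8}: {value:.4f}")
```

Note how the single outlier (0.250s) dominates the maximum and stddev while barely moving the median – which is exactly why a single averaged value is not enough.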

Having all of these comprehensive metrics about your queries, you can now make an intelligent decision on how to approach performance optimization of your database.

You can do it all by yourself today with the free Beta of Percona Cloud Tools.

I understand that all of this information might be too much to figure out at first, but as with any tool it takes skill and some experience to analyze and apply metrics properly. To help you bootstrap with Percona Cloud Tools, I will be running a webinar, “Analyze MySQL Query Performance with Percona Cloud Tools,” on Feb. 12, 2014; if you register and install the Percona Cloud Tools agent you may win a free consulting hour from me, during which I will examine your queries and provide advice for optimization. See the full conditions there.

The post MySQL performance optimization: Don’t guess! Measure with Percona Cloud Tools appeared first on MySQL Performance Blog.

Looking to upgrade to MySQL 5.6? Check out my webinar on Jan 29!

January 27, 2014 - 6:58am

We are nearing the one-year anniversary since MySQL 5.6 went GA – which is typically a good time even for the most conservative users to start thinking about upgrading. At this point there is a fair amount of practical use and experience; many bugs have also been fixed (1991 to be exact according to Morgan Tocker).

We also know that MySQL 5.6 has been used in some very demanding environments on a very large scale, such as at Facebook. We also know from the Facebook team, which kindly shared its upgrade experiences, that it takes a lot of work to upgrade to MySQL 5.6. That’s where Percona can help!

In my webinar this Wednesday, “Upgrading to MySQL 5.6: Best Practices,” I will share some of the issues and best practices we have discovered here at Percona in helping our customers upgrade to MySQL 5.6 – and also Percona Server 5.6, which is an enhanced, drop-in MySQL replacement. My webinar starts at 10 a.m. Pacific time. You can register here, and that URL will be available afterwards for the recording.

While we’re on the upgrade topic: MySQL 5.1 has reached EOL when it comes to MySQL support from Oracle. According to mysql.com, “Users are encouraged to upgrade to MySQL 5.6.,” and, “Per Oracle’s Lifetime Support policy, as of December 31, 2013, MySQL 5.1 is covered under Oracle Sustaining Support.” That means there will be no new releases, no new fixes (no error correction for new issues), no new updates.

Only existing updates, fixes and alerts are available – which means no more releases for MySQL 5.1, even if you run into crash-inducing bugs or need security fixes. The former is especially worrying, as MySQL 5.1 just got some significant vulnerabilities fixed (more details), and chances are it will not take long for the next significant security issue to be discovered.

One solution to this problem is to upgrade to MySQL 5.6 or MySQL 5.5, which will be supported for a while. That’s why Wednesday’s webinar is important to attend if you haven’t yet upgraded. You should also consider calling Percona to help in upgrading to MySQL 5.6 – especially if you do not have experience doing MySQL upgrades.

Another solution is to use Percona’s MySQL Support, which continues to cover MySQL 5.1 and Percona Server 5.1 and will provide fixes for crash-inducing bugs and security issues. I hope to see you on Wednesday!

The post Looking to upgrade to MySQL 5.6? Check out my webinar on Jan 29! appeared first on MySQL Performance Blog.

FOSDEM 2014 MySQL Devroom, Community Dinner in Brussels this Saturday!

January 26, 2014 - 10:00pm

This weekend, on 1-2 February, FOSDEM 2014 will take place in Brussels, Belgium.
There will be MySQL talks on Saturday in the MySQL Devroom and a MySQL Community dinner afterward.

30 people have already signed up for the community dinner – we’re almost at the number of attendees we had last year.

Last orders for the MySQL Community dinner will be accepted on Wednesday 29 January, so if you plan to attend, make sure to buy your tickets now. After that day, we won’t be able to add more people to the list.

I also want to thank the sponsors. Without them, this community dinner would not have been possible:

The post FOSDEM 2014 MySQL Devroom, Community Dinner in Brussels this Saturday! appeared first on MySQL Performance Blog.

MySQL server memory usage troubleshooting tips

January 24, 2014 - 6:00am

There are many blog posts already written on topics related to MySQL server memory usage, but nevertheless some people still get confused when troubleshooting memory usage issues with MySQL. As a Percona support engineer, I regularly see issues related to heavy server load, or to the OOM killer being invoked and killing the MySQL server due to high memory usage, or questions like: “I don’t know why MySQL is taking so much memory. How do I find where exactly the memory is allocated? Please help!”

There are many ways to check the memory consumption of MySQL, so in this post I will try to explain them by combining all the details I know of.

  • Check memory related Global/Session variables.

If you are using MyISAM, you need to check key_buffer_size, while for InnoDB you can check innodb_buffer_pool_size, innodb_additional_memory_pool_size, innodb_log_buffer_size and innodb_sort_buffer_size (used only for sorting data while creating indexes in InnoDB; introduced in 5.6). max_connections, query_cache_size and table_cache are also important variables to check.

We know that whenever a thread is connected to MySQL, it will need its own buffers when doing complex operations like full-text searches, sorting, creating temporary tables, etc. So we also need to check the sizes of read_buffer_size, sort_buffer_size, read_rnd_buffer_size and tmp_table_size.

There is a very good quote from High Performance MySQL, 3rd Edition: “You can think of MySQL’s memory consumption as falling into two categories: the memory you can control, and the memory you can’t. You can’t control how much memory MySQL uses merely to run the server, parse queries, and manage its internals, but you have a lot of control over how much memory it uses for specific purposes.” So we have to understand the purpose of any variable we configure, whether it is set at the global or the session level. I would like to explain that in more detail here.

For global variables like key_buffer_size, query_cache_size, etc., MySQL always allocates and initializes the specified amount of memory all at once when the server starts. That is not the case for variables that have a global default but can be set per session: for read_buffer_size, sort_buffer_size and join_buffer_size, MySQL doesn’t allocate any memory until a query needs it. But when a query does need a buffer, MySQL immediately allocates the entire configured chunk – so even a small sort gets the full buffer size allocated, which is simply a waste of memory. Some buffers can even be allocated multiple times: for example, queries that join several tables can allocate one join buffer per joined table, and complicated queries including subqueries can use multiple sort buffers at the same time, which can lead to high memory consumption. In other scenarios a query does not use a sort buffer at all, whatever its size, because it selects by primary key. So it depends on the nature of your environment, but I would say it’s always better to start with a safe value – larger than the default if needed, but not so large that it can consume all of the server’s memory.
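A back-of-the-envelope worst-case estimate along those lines can be sketched as follows. All settings are invented example values, and the formula deliberately ignores the per-joined-table join buffer multiplier and MySQL's internal overhead, so treat it as an upper-bound illustration only:

```python
MB = 1024 * 1024

# Global buffers, allocated once at startup (example values)
global_buffers = {
    "innodb_buffer_pool_size": 4096 * MB,
    "key_buffer_size":          128 * MB,
    "query_cache_size":          64 * MB,
}

# Per-session buffers, allocated on demand but potentially per connection
per_thread_buffers = {
    "sort_buffer_size":     2 * MB,
    "read_buffer_size":     1 * MB,
    "read_rnd_buffer_size": 1 * MB,
    "join_buffer_size":     1 * MB,  # can be allocated once per joined table!
    "tmp_table_size":      16 * MB,
}
max_connections = 200

worst_case = (sum(global_buffers.values())
              + max_connections * sum(per_thread_buffers.values()))
print(f"worst case: {worst_case / MB:.0f} MB")
```

The point of the exercise: a seemingly modest 21MB of per-thread buffers turns into 4.2GB at 200 connections, which is why per-session buffer sizes deserve as much attention as the buffer pool.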

One more thing: not all per-thread memory allocation is controlled by variables. Some per-thread memory is allocated by MySQL itself for running complex processes/queries like stored procedures, and it can take an unlimited amount of memory while running. And sometimes the optimizer can also take a lot of memory when working with highly complex queries, which we generally can’t control with any configuration parameter.

Even innodb_buffer_pool_size is not a hard limit: usually InnoDB uses about 10% more memory than specified. Many people do not recommend using both the MyISAM and InnoDB storage engines at the same time on a production server, because both have their own buffers, which together can eat all of the server’s memory.

For detailed information related to this topic, I would suggest reading this post from Peter Zaitsev titled “MySQL Server Memory Usage.”


  • Check the “BUFFER POOL AND MEMORY” section of the SHOW ENGINE INNODB STATUS output.

----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 137363456; in additional pool allocated 0
Dictionary memory allocated 76056
Buffer pool size   8191
Free buffers       7804
Database pages     387
Old database pages 0
Modified db pages  0

The output above is from stock MySQL, but if you check the same thing with Percona Server you’ll get some more information:

----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 137756672; in additional pool allocated 0
Total memory allocated by read views 88
Internal hash tables (constant factor + variable factor)
    Adaptive hash index 2217584 (2213368 + 4216)
    Page hash           139112 (buffer pool 0 only)
    Dictionary cache    597885 (554768 + 43117)
    File system         83536 (82672 + 864)
    Lock system         333248 (332872 + 376)
    Recovery system     0 (0 + 0)
Dictionary memory allocated 43117
Buffer pool size        8191
Buffer pool size, bytes 134201344
Free buffers            7760
Database pages          431
Old database pages      0
Modified db pages       0

This will give you information regarding how much memory is allocated by InnoDB. You can see here “Total Memory Allocated”, “Internal Hash Tables”, “Dictionary Memory Allocated”, “Buffer Pool Size” etc.

  • Profiling MySQL Memory usage with Valgrind Massif

Recently I used this tool and, to my surprise, got very good statistics about memory usage. The only problem is that you have to shut down mysqld, start it under valgrind massif and, after collecting statistics, shut it down again and restart it normally.

$ /etc/init.d/mysql stop
$ valgrind --tool=massif --massif-out-file=/tmp/massif.out /usr/sbin/mysqld
$ /etc/init.d/mysql restart

After you have the massif.out file, you read it with the ms_print command. You will see a pretty nice graph followed by statistics. For example:

[root@percona1 ~]# ms_print /tmp/massif.out
--------------------------------------------------------------------------------
Command:            /usr/sbin/mysqld
Massif arguments:   --massif-out-file=/tmp/massif.out
ms_print arguments: /tmp/massif.out
--------------------------------------------------------------------------------
[ms_print first draws an ASCII graph of heap usage over time; here it peaked
at about 50.22 MB, over roughly 575.9 million instructions]

Number of snapshots: 96
 Detailed snapshots: [1, 7, 11 (peak), 16, 35, 48, 58, 68, 78, 88]

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0              0                0                0             0            0
  1      6,090,089          195,648          194,590         1,058            0
99.46% (194,590B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->48.79% (95,458B) 0x7A1D20: my_malloc (my_malloc.c:38)
| ->25.08% (49,060B) 0x6594F1: read_texts(char const*, char const*, char const***, unsigned int) (derror.cc:160)
| | ->25.08% (49,060B) 0x6597C2: init_errmessage() (derror.cc:69)
| | ->25.08% (49,060B) 0x506232: init_common_variables() (mysqld.cc:3414)
| | ->25.08% (49,060B) 0x508CBB: mysqld_main(int, char**) (mysqld.cc:4461)
| | ->25.08% (49,060B) 0x5B2CD1B: (below main) (in /lib64/libc-2.12.so)
| |
| ->09.36% (18,317B) 0x789571: my_read_charset_file (charset.c:364)
| | ->09.36% (18,317B) 0x789DEC: init_available_charsets (charset.c:458)
| | ->09.36% (18,317B) 0x4E35D31: pthread_once (in /lib64/libpthread-2.12.so)
| | ->09.36% (18,317B) 0x789C80: get_charset_by_csname (charset.c:644)
| | ->09.36% (18,317B) 0x5062E9: init_common_variables() (mysqld.cc:3439)
| | ->09.36% (18,317B) 0x508CBB: mysqld_main(int, char**) (mysqld.cc:4461)
| | ->09.36% (18,317B) 0x5B2CD1B: (below main) (in /lib64/libc-2.12.so)
| |
| ->08.37% (16,384B) 0x79DEEF: my_set_max_open_files (my_file.c:105)
| | ->08.37% (16,384B) 0x506169: init_common_variables() (mysqld.cc:3373)
| | ->08.37% (16,384B) 0x508CBB: mysqld_main(int, char**) (mysqld.cc:4461)
| | ->08.37% (16,384B) 0x5B2CD1B: (below main) (in /lib64/libc-2.12.so)
| |
| ->04.36% (8,536B) 0x788DB4: init_dynamic_array2 (array.c:70)
| | ->02.45% (4,800B) 0x5CD51A: add_status_vars(st_mysql_show_var*) (sql_show.cc:2062)
| | | ->02.45% (4,800B) 0x505E68: init_common_variables() (mysqld.cc:3245)
| | | ->02.45% (4,800B) 0x508CBB: mysqld_main(int, char**) (mysqld.cc:4461)
| | | ->02.45% (4,800B) 0x5B2CD1B: (below main) (in /lib64/libc-2.12.so)
| | |

From the output you can see where memory is allocated, by which function, etc. You can use this tool to find memory leaks. You can get more information here on how to install and use it. Here’s another related post by Roel Van de Paar titled: “Profiling MySQL Memory Usage With Valgrind Massif.”

If possible, Valgrind Massif should not be used on a busy production server as it can degrade performance. It is generally used to find memory leaks by creating a mirror environment on a test/staging server and running it there. It needs a debug binary to run, which decreases performance a lot, so it is suitable for investigating specific cases but not for regular use.

  • Plot memory usage by monitoring ps output.

This is also useful when you want to check how much virtual (VSZ) and real (RSS) memory is used by mysqld. You can simply run a bash script to monitor it, like:

while true
do
  date >> ps.log
  ps aux | grep mysqld >> ps.log
  sleep 60
done
Or you can check on demand from the shell prompt with the “ps aux | grep mysqld” command.
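The ps.log samples collected above can also be post-processed programmatically. Below is a minimal Python sketch that pulls VSZ and RSS out of a `ps aux` line and converts them to megabytes; the column positions follow the standard `ps aux` layout, and the sample line is made up for illustration:

```python
# Parse a `ps aux` line for mysqld and report VSZ/RSS in MB.
# In `ps aux` output, VSZ and RSS are columns 5 and 6, in kilobytes.

def mysqld_memory(ps_line):
    fields = ps_line.split()
    vsz_kb, rss_kb = int(fields[4]), int(fields[5])
    return vsz_kb / 1024.0, rss_kb / 1024.0  # (virtual MB, resident MB)

# Hypothetical sample line from `ps aux | grep mysqld`:
sample = "mysql 1234 2.0 40.7 12165120 5033164 ? Sl 10:02 5:01 /usr/sbin/mysqld"
vsz_mb, rss_mb = mysqld_memory(sample)
print("VSZ: %.1f MB, RSS: %.1f MB" % (vsz_mb, rss_mb))
```

Fed the ps.log produced by the loop above, the same function would give you a time series of mysqld memory growth.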

  • Memory tables in MySQL 5.7

With MySQL 5.7, some very interesting memory statistics tables have been introduced in performance_schema to check memory usage. There is no detailed documentation available yet, but you can check some details here: http://dev.mysql.com/doc/refman/5.7/en/memory-summary-tables.html

In P_S, there are five memory summary tables.

mysql> show tables like '%memory%';
+-----------------------------------------+
| Tables_in_performance_schema (%memory%) |
+-----------------------------------------+
| memory_summary_by_account_by_event_name |
| memory_summary_by_host_by_event_name    |
| memory_summary_by_thread_by_event_name  |
| memory_summary_by_user_by_event_name    |
| memory_summary_global_by_event_name     |
+-----------------------------------------+
5 rows in set (0.00 sec)

So for every event you can get summarized memory consumption for a particular account, host, thread and user. Digging further, I found that there are around 209 different events to check. Here I check just one event, related to the join buffer size.

mysql> select * from memory_summary_by_account_by_event_name where SUM_NUMBER_OF_BYTES_ALLOC <> 0 and user = 'msandbox' and event_name = 'memory/sql/JOIN_CACHE' \G
*************************** 1. row ***************************
                        USER: msandbox
                        HOST: localhost
                  EVENT_NAME: memory/sql/JOIN_CACHE
                 COUNT_ALLOC: 2
                  COUNT_FREE: 2
   SUM_NUMBER_OF_BYTES_ALLOC: 524288
    SUM_NUMBER_OF_BYTES_FREE: 524288
              LOW_COUNT_USED: 0
          CURRENT_COUNT_USED: 0
             HIGH_COUNT_USED: 1
    LOW_NUMBER_OF_BYTES_USED: 0
CURRENT_NUMBER_OF_BYTES_USED: 0
   HIGH_NUMBER_OF_BYTES_USED: 262144
1 row in set (0.00 sec)

mysql> show global variables like 'join%';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| join_buffer_size | 262144 |
+------------------+--------+
1 row in set (0.00 sec)

Here, COUNT_ALLOC and COUNT_FREE aggregate the number of calls to malloc-like and free-like functions. SUM_NUMBER_OF_BYTES_ALLOC and SUM_NUMBER_OF_BYTES_FREE indicate the aggregate size of allocated and freed memory blocks. CURRENT_COUNT_USED is the aggregate number of currently allocated blocks that have not been freed yet, and CURRENT_NUMBER_OF_BYTES_USED is the aggregate size of those blocks. The LOW_ and HIGH_ columns are low and high water marks for the corresponding columns.
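The relationship between these columns can be illustrated with a tiny simulation (pure Python, not MySQL code) that replays a stream of alloc/free events and maintains the same counters:

```python
# Simulate the P_S memory summary counters for a stream of events.
# Positive sizes represent allocations, negative sizes represent frees.

def summarize(events):
    stats = {"COUNT_ALLOC": 0, "COUNT_FREE": 0,
             "SUM_NUMBER_OF_BYTES_ALLOC": 0, "SUM_NUMBER_OF_BYTES_FREE": 0,
             "CURRENT_NUMBER_OF_BYTES_USED": 0, "HIGH_NUMBER_OF_BYTES_USED": 0}
    for size in events:
        if size > 0:
            stats["COUNT_ALLOC"] += 1
            stats["SUM_NUMBER_OF_BYTES_ALLOC"] += size
            stats["CURRENT_NUMBER_OF_BYTES_USED"] += size
        else:
            stats["COUNT_FREE"] += 1
            stats["SUM_NUMBER_OF_BYTES_FREE"] += -size
            stats["CURRENT_NUMBER_OF_BYTES_USED"] += size
        # High water mark tracks the peak of the current usage.
        stats["HIGH_NUMBER_OF_BYTES_USED"] = max(
            stats["HIGH_NUMBER_OF_BYTES_USED"],
            stats["CURRENT_NUMBER_OF_BYTES_USED"])
    return stats

# Two join-buffer allocations of 262144 bytes, each freed afterwards,
# reproduce the pattern in the query output above: SUM columns of 524288,
# a HIGH water mark of 262144, and a CURRENT usage of 0.
s = summarize([262144, -262144, 262144, -262144])
```

This makes it easy to see why CURRENT_NUMBER_OF_BYTES_USED is always SUM_NUMBER_OF_BYTES_ALLOC minus SUM_NUMBER_OF_BYTES_FREE.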

If you are familiar with the Percona Toolkit scripts pt-summary and pt-mysql-summary, they give a very good summary of overall server memory consumption as well as of memory related to MySQL.

For example, in the output of pt-summary:

# Memory #####################################################
      Total | 11.8G
       Free | 143.7M
       Used | physical = 11.6G, swap allocated = 4.0G, swap used = 0.0, virtual = 11.6G
    Buffers | 224.9M
     Caches | 6.2G
      Dirty | 164 kB
    UsedRSS | 4.8G

And in the output of pt-mysql-summary:

# Query cache ################################################
 query_cache_type | OFF
             Size | 0.0
            Usage | 0%
 HitToInsertRatio | 0%
# InnoDB #####################################################
          Version | 5.5.30-rel30.2
 Buffer Pool Size | 4.0G
 Buffer Pool Fill | 35%
Buffer Pool Dirty | 1%
# MyISAM #####################################################
        Key Cache | 32.0M
         Pct Used | 20%
        Unflushed | 0%
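Report sections like these follow a simple `key | value` layout, so a small parser can turn them into data suitable for trending or alerting. Below is a sketch; the field names are just whatever the report happens to emit, and the sample text is abbreviated:

```python
def parse_report_section(text):
    """Parse 'key | value' lines from a pt-summary style report section."""
    result = {}
    for line in text.splitlines():
        if line.startswith("#") or "|" not in line:
            continue  # skip section headers and non key/value lines
        key, _, value = line.partition("|")
        result[key.strip()] = value.strip()
    return result

# Abbreviated sample of the "# Memory" section shown above:
section = """# Memory #####################################################
       Total | 11.8G
        Free | 143.7M
     UsedRSS | 4.8G"""
info = parse_report_section(section)
```

Running it over the sample yields a dict mapping "Total", "Free" and "UsedRSS" to their string values, ready for further conversion.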


It is really important for us to know where MySQL allocates memory and how it affects the overall load and performance of the MySQL server. I have described a few approaches here, but I still think we should have some sort of script that combines all of these results and gives us a truthful overall picture of memory usage in MySQL.
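As a starting point for such a script, one widely used rough estimate of MySQL's theoretical peak is global buffers plus per-connection buffers multiplied by max_connections. The sketch below uses made-up placeholder values for the variables, and the formula deliberately ignores internal structures and per-query allocations, so treat it strictly as an upper-bound heuristic, not an exact figure:

```python
# Rough upper-bound estimate of MySQL memory usage (heuristic only).
# All values are in bytes; the numbers below are made-up placeholders
# that would normally be read from SHOW GLOBAL VARIABLES.

GLOBAL_BUFFERS = {
    "innodb_buffer_pool_size": 4 * 1024**3,
    "key_buffer_size": 32 * 1024**2,
    "query_cache_size": 0,
}
PER_THREAD_BUFFERS = {
    "sort_buffer_size": 2 * 1024**2,
    "join_buffer_size": 256 * 1024,
    "read_buffer_size": 128 * 1024,
    "read_rnd_buffer_size": 256 * 1024,
    "thread_stack": 256 * 1024,
}

def max_memory_estimate(max_connections):
    """Global buffers + per-thread buffers * max_connections."""
    return (sum(GLOBAL_BUFFERS.values())
            + sum(PER_THREAD_BUFFERS.values()) * max_connections)

estimate_gb = max_memory_estimate(151) / 1024.0**3
```

Comparing such an estimate against the RSS figures gathered from ps is one way to spot configurations that could run the server out of memory under full connection load.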

The post MySQL server memory usage troubleshooting tips appeared first on MySQL Performance Blog.

