]]>
]]>

You are here

Latest MySQL Performance Blog posts

Subscribe to Latest MySQL Performance Blog posts feed
Percona's MySQL & InnoDB performance and scalability blog
Updated: 1 hour 17 min ago

3 handy tools to remove problematic MySQL processes

February 27, 2015 - 12:00am

DBAs often encounter situations where they need to kill queries to ensure there are no long-running queries on a MySQL server that would impact performance. Long-running queries can be the result of many factors. Fortunately, Percona Server contains some handy tools to remove problematic MySQL processes. I will highlight all of the tools via some examples in this post.

pt-kill:
There have been some good posts on this blog about the pt-kill tool, like this one by Arunjith Aravindan titled “How a set of queries can be killed in MySQL using Percona Toolkit’s pt-kill.” Let’s dive into pt-kill a bit further with a few more examples. What does pt-kill do? It kills MySQL connections. Say you wanted to run pt-kill from a cronjob and then get an email on every killed process/query. Here is typical example for that.

$ pt-kill --interval 1 --run-time 1 --busy-time 5 --log /path/to/kill_long_running_thread.log --match-info "^(select|SELECT|Select)" --kill --print --user=xxxxxx --password=xxxxxxxxxx

Assume this is running from a cronjob, When pt-kill executes, it will kill queries longer than 5 seconds. By default, pt-kill runs forever –run-time option tells how long pt-kill to run before exiting If –interval and –busy-time parameters are used together then the explicit –interval value is used. Otherwise the default interval is 30 seconds. Note: this will only kill all read queries as per the –match-info parameter.

The above command will log all killed queries in the file referenced with the –log option. If you need to be notified via email for every killed query, the command below will do it. Off-course, you need to have the system configured to send e-mail.

#!/bin/bash tail -n 0 -F /path/to/kill_long_running_thread.log | while read LOG do echo "$LOG" | mail -s "pt-kill alert" sample@test.com done

You can execute this shell script in the background within screen or with the nohup utility continuously to monitor the log file. It will send an email whenever any new killed query arrives to the referenced log file in the pt-kill command. Unfortunately, there is no option to notify-by-email in pt-kill at the moment, so this is sort of a workaround.

In order to log all killed queries into a database table you will need to use the –log-dsn option as per the example below.

$ pt-kill --interval 1 --busy-time 1 --create-log-table --log-dsn=h=localhost,D=percona,t=kill_log --daemonize --match-info "^(select|SELECT|Select)" --kill

All killed queries will be logged into percona.kill_log table. The –daemonize option will run this command in the background forever and will kill all SELECT queries running longer than 1 second (–busy-time 1). The –interval option instructs pt-kill to scan processes every 1 second (–interval 1).

mysql> select * from kill_log; +---------+-----------+---------------------+-------------------------+------------+-------+------+-----------+------+---------+------+------------+-----------------+---------+ | kill_id | server_id | timestamp | reason | kill_error | Id | User | Host | db | Command | Time | State | Info | Time_ms | +---------+-----------+---------------------+-------------------------+------------+-------+------+-----------+------+---------+------+------------+-----------------+---------+ | 17 | 1 | 2015-01-10 08:38:33 | Query matches Info spec | | 35146 | root | localhost | NULL | Query | 0 | User sleep | SELECT SLEEP(5) | NULL | | 20 | 1 | 2015-01-10 08:38:34 | Query matches Info spec | | 35223 | root | localhost | NULL | Query | 0 | User sleep | SELECT SLEEP(5) | NULL | +---------+-----------+---------------------+-------------------------+------------+-------+------+-----------+------+---------+------+------------+-----------------+---------+

With the help of logging killed queries into a database tables. You can easily get all the trends/and /statistics on killed queries via SQL.

By default the tool kills the oldest of the queries that would have been running for more than a given –busy-time.  If you need to kill all the threads that have been busy for more than a specified –busy-time, then this will do it:

$ pt-kill --victims=all --busy-time=60

Statement Timeout in Percona Server:
The max-statement-time feature is ported from the Twitter patches. This feature can be used to limit the query execution time by specifying the timeout value in the max_statement_time variable. When the specified number of milliseconds is reached the server aborts the statement and returns the error below to the client.

ERROR 1877 (70101): Query execution was interrupted, max_statement_time exceeded

Let me demonstrate this through another example:

mysql [localhost] {msandbox} (world) > SET max_statement_time=1; Query OK, 0 rows affected (0.00 sec) mysql [localhost] {msandbox} (world) > show variables like 'max_statement_time'; +--------------------+-------+ | Variable_name | Value | +--------------------+-------+ | max_statement_time | 1 | +--------------------+-------+ 1 row in set (0.00 sec) mysql [localhost] {msandbox} (world) > SELECT * FROM City WHERE District = 'abcd'; ERROR 1885 (70101): Query execution was interrupted, max_statement_time exceeded mysql [localhost] {msandbox} (world) > UPDATE City SET District='abcd' WHERE ID = 2001; ERROR 1885 (70101): Query execution was interrupted, max_statement_time exceeded mysql [localhost] {msandbox} (world) > ALTER TABLE City ADD INDEX district_idx (district); ERROR 1885 (70101): Query execution was interrupted, max_statement_time exceeded

As you can see from this example statement, the timeout feature works for all statements including SELECT/DML/DDL queries.

mysql [localhost] {msandbox} (world) > show status like 'Max_statement%'; +-------------------------------+-------+ | Variable_name | Value | +-------------------------------+-------+ | Max_statement_time_exceeded | 3 | | Max_statement_time_set | 19 | | Max_statement_time_set_failed | 0 | +-------------------------------+-------+ 3 rows in set (0.00 sec)

The above mentioned status variables are stats for a statement timeout feature. Max_statement_time_exceeded will inform you that the total number of statements exceeded the defined timeout. Max_statement_time_set defines the number of statements for which execution time limit was set. You can find more details in this documentation. The statement timeout feature was introduced in Percona Server 5.6. You can check if your specific version of Percona Server supports this feature or not via the have_statement_timeout variable.

mysql [localhost] {msandbox} (world) > show global variables like 'have_statement_timeout'; +------------------------+-------+ | Variable_name | Value | +------------------------+-------+ | have_statement_timeout | YES | +------------------------+-------+ 1 row in set (0.00 sec)

Bugs you should be aware of:

https://bugs.launchpad.net/percona-server/+bug/1388533 -> This affects how the feature interacts with stored procedures. If you use stored procedures, max_statement_time might not behave as you expect.
https://bugs.launchpad.net/percona-server/+bug/1307432 -> This is documentation bug. Percona Server timeouts might not be safe for some statements like DDL and should not be used with such queries, The documentation does not reflect this. You should be very careful if you set a global statement timeout, It affects data changing queries as well. For best results set the max_statement_time variable in a session before running queries that you want to be killed if they execute too long, instead of using a global variable.
https://bugs.launchpad.net/percona-server/+bug/1376934 -> This affects the statement timeout feature on the query level. You must set max_statement_time in a session or globally instead however, this bug is fixed in latest version i.e. Percona Server 5.6.22-72.0

InnoDB Kill Idle Transactions:
This feature was introduced in Percona Server 5.5. It limits the age of idle XtraDB transactions and will kill idle transactions longer than a specified threshold for innodb_kill_idle_transaction. This feature is useful when autocommit is disabled on the server side and you are relying on the application to commit transactions and want to avoid long running transactions that are uncommitted. Application logic errors sometimes leaves transactions uncommitted. Let me demonstrate it quickly through one example:

mysql [localhost] {msandbox} (world) > show variables like 'autocommit'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | autocommit | OFF | +---------------+-------+ mysql [localhost] {msandbox} (world) > show global variables like 'innodb_kill_idle_transaction'; +------------------------------+-------+ | Variable_name | Value | +------------------------------+-------+ | innodb_kill_idle_transaction | 10 | +------------------------------+-------+ mysql [localhost] {msandbox} (world) > START TRANSACTION; SELECT NOW(); INSERT INTO City_backup (Name,CountryCode,District,Population) VALUES ('Karachi','PK','Sindh','1000000'); Query OK, 0 rows affected (0.00 sec) +---------------------+ | NOW() | +---------------------+ | 2015-01-31 07:11:39 | +---------------------+ 1 row in set (0.00 sec) Query OK, 1 row affected (0.01 sec) mysql [localhost] {msandbox} (world) > SHOW ENGINE INNODB STATUSG *************************** 1. row *************************** ------------ TRANSACTIONS ------------ ---TRANSACTION 173076, ACTIVE 10 sec 1 lock struct(s), heap size 360, 0 row lock(s), undo log entries 1 MySQL thread id 15, OS thread handle 0x7f862e9bf700, query id 176 localhost msandbox init SHOW ENGINE InnoDB STATUS TABLE LOCK table `world`.`City_backup` trx id 173076 lock mode IX ---------------------------- END OF INNODB MONITOR OUTPUT ============================ mysql [localhost] {msandbox} (world) > SELECT NOW(); SELECT * FROM City_backup; ERROR 2006 (HY000): MySQL server has gone away No connection. Trying to reconnect... Connection id: 16 Current database: world +---------------------+ | NOW() | +---------------------+ | 2015-01-31 07:12:06 | +---------------------+ 1 row in set (0.01 sec) Empty set (0.00 sec)

 Conclusion:
In this post, I shared some tools that can help you get rid of long-running transactions to help ensure that you don’t run into performance issues. This is one of the many good reasons to use Percona Server, which has some extra features on top of vanilla MySQL server.

 

The post 3 handy tools to remove problematic MySQL processes appeared first on MySQL Performance Blog.

Worrying about the ‘InnoDB: detected cycle in LRU for buffer pool (…)’ message?

February 26, 2015 - 12:00am

If you use Percona Server 5.5 and you have configured it to use multiple buffer pool instances than sooner or later you’ll see the following lines on the server’s error log and chances are you’ll be worried about them:

InnoDB: detected cycle in LRU for buffer pool 5, skipping to next buffer pool. InnoDB: detected cycle in LRU for buffer pool 3, skipping to next buffer pool. InnoDB: detected cycle in LRU for buffer pool 7, skipping to next buffer pool.

Worry not as this is mostly harmless. It’s becoming a February tradition for me (Fernando) to face a question about this subject (ok, it’s maybe a coincidence) and this time I’ve teamed up with my dear colleague and software engineer George Lorch to provide you the most complete blog post ever published on this topic(with a belated thank you! to Ernie Souhrada, with whom I’ve also discussed this same matter one year ago).

InnoDB internals: what is “LRU” ?

There’s a short and to-the-point section of the MySQL manual that explains in a clear way what is the InnoDB buffer pool, how it operates and why it plays such an important role in MySQL performance. If you’re interested in understanding InnoDB internals then that page is a good start. In this section we’ll refrain ourselves to explain what the “LRU” that shows in our subject message is so we’ll only slightly dig into InnoDB internals, enough to make for some context. Here’s a quick introduction to the buffer pool, quoting from the above manual page:

InnoDB maintains a storage area called the buffer pool for caching data and indexes in memory. (…) Ideally, you set the size of the buffer pool to as large a value as practical, leaving enough memory for other processes on the server to run without excessive paging. The larger the buffer pool, the more InnoDB acts like an in-memory database, reading data from disk once and then accessing the data from memory during subsequent reads.

In practice, however, we can rarely fit our whole dataset inside the InnoDB buffer pool so there must be a process to manage this limited pool of memory pages:

InnoDB manages the pool as a list, using a variation of the least recently used (LRU) algorithm. When room is needed to add a new block to the pool, InnoDB evicts the least recently used block and adds the new block to the middle of the list.

There you go, InnoDB employs a variation of the Least Recently Used algorithm called midpoint insertion strategy to manage the pages within the buffer pool. We should mention it does makes exceptions, such as during a full table scan, when it knows the loaded pages might end up being read only a single time.

Dumping and reloading the buffer pool

Before we can get to the main point of this article lets first examine why would you want to dump the buffer pool to disk, which is at the core of the matter here: that’s when those warning messages we’re discussing may appear.

When you start a MySQL server the buffer pool is empty by default. Performance is at it’s worse at this point because no data can be found in memory so in practice each request for data results in an I/O operation to retrieve the data in the disk and bring it to memory. With time the buffer pool gets filled and performance improves – more and more data can now be found in memory. With yet more time we reach a peek performance state: the buffer pool not only is full but it is filled with the most popular data. The time between the start of the server and reaching this optimum state in the buffer pool is called server warm up. How long it takes depends mostly on two things: the size of the buffer pool and the level of activity of the server – the less busy it is the less requests it will get and thus more time is needed until the popular data is fully loaded.

Now, there could be a shortcut: what if before we save the buffer pool on a disk file before we stop MySQL? We could later use it to reload the buffer pool to an optimum state when we restart the server, thus decreasing the warm up period dramatically.

Percona was a pioneer in this field related to other MySQL distributions and implemented this functionality in Percona Server 5.5. Later on, MySQL 5.6 was released with a similar functionality which also allowed preloading the buffer pool for a faster warm up. Percona Server 5.6 incorporates this upstream feature, effectively replacing its own implementation.

“Detected cycle in LRU”

In the section above we introduced a functionality that allows to dump the contents of the buffer pool to disk so we can later reload it at server restart. What we didn’t mention was that this is yet most useful outside of maintenance time and planned shutdows – that is, when the server crashes. When a crash happens it’s that more important to bring it back to a warm up state soon, so it can resume providing data fast enough. And giving we cannot predict a crash the only way we can arrange to have the latest buffer pool on disk is by flushing it often.

While the buffer pool is divided into pages for efficiency of high-volume read operations it is implemented as a linked list of pages, for efficiency of cache management. During the process of dumping the buffer pool to disk a mutex is acquired on the LRU list. However, this mutex is not hold for the duration of the process – it is periodically released to prevent stalling of the system. The problem is: in between the release of the mutex and the moment it is acquired again the list may get reshuffled. Since the dump keeps a pointer to its position across the mutex boundry, the dump can get put into some artificial cycling.

Lets consider a linked list:

A > B > C > D > E

where each letter corresponds to a memory page. Now lets say the initial dump was partially taken and covered the first three pages, “A > B > C”, placing a pointer on “C” before releasing the mutex. Once the mutex is reacquired the list has been reshuffled:  “A > C > B > D > E”. The resulting junction of the partial list we have already copied and the reshuffled list now includes a loop, which would incur in a cycle: “(A > B > C) > B > D > E”. When the dumping process detects a cycle on the LRU list it stops copying from the actual buffer pool, throws in a warning message, and moves on to the next buffer pool instance – otherwise it would keep dumping in an infinite loop.

How harmless are those messages ?

It is fairly harmless except for the fact you will only have a partial LRU list dump for that buffer pool instance – that is, until the next dump occurs. If the server crashes or is shutdown before the next dump takes place the existing one won’t be totally up to date for the server warm up to complete – it will still be used and will still provide a partially filled, somewhat “warm” buffer pool, just not as optimal as it could have been if the last dump had been taken fully.

The post Worrying about the ‘InnoDB: detected cycle in LRU for buffer pool (…)’ message? appeared first on MySQL Performance Blog.

Using MySQL Event Scheduler and how to prevent contention

February 25, 2015 - 3:00am

MySQL introduced the Event Scheduler in version 5.1.6. The Event Scheduler is a MySQL-level “cron job”, which will run events inside MySQL. Up until now, this was not a very popular feature, however, it has gotten more popular since the adoption of Amazon RDS – as well as similar MySQL database as a service offerings where there is no OS level.

What is important to understand about the Event Scheduler is that it does not have any protection against multiple execution (neither does linux cron). Let’s imagine you have created an event that executes every 10 seconds, but the logic inside the event (i.e. queries or stored procedure call) can take longer than 10 seconds (may be in case of the high load), so it can pile-up. In the worst case, when an event contains a set of “insert” + “update”/”delete” statement inside a transaction, it can cause a deadlock.

Adding “get_lock” conditions inside of the event will help to prevent such situation:

If a repeating event does not terminate within its scheduling interval, the result may be multiple instances of the event executing simultaneously. If this is undesirable, you should institute a mechanism to prevent simultaneous instances. For example, you could use the GET_LOCK() function, or row or table locking. Read more at event_scheduler documentation.

Function GET_LOCK() can be used for communications between threads:

The following example can illustrate using get_lock:

DELIMITER // CREATE EVENT testlock_event ON SCHEDULE EVERY 2 SECOND DO BEGIN DECLARE CONTINUE HANDLER FOR SQLEXCEPTION BEGIN DO RELEASE_LOCK('testlock_event'); END; IF GET_LOCK('testlock_event', 0) THEN -- add some business logic here, for example: -- insert into test.testlock_event values(NULL, NOW()); END IF; DO RELEASE_LOCK('testlock_event'); END; // DELIMITER ;

DECLARE CONTINUE HANDLER FOR SQLEXCEPTION is needed here to release lock even if the event failed or was killed.

The above GET_LOCK / RELEASE_LOCK combination will help to prevent contention inside the MySQL Event Scheduler.

The post Using MySQL Event Scheduler and how to prevent contention appeared first on MySQL Performance Blog.

Is MySQL’s innodb_file_per_table slowing you down?

February 24, 2015 - 3:00am

MySQL’s innodb_file_per_table is a wonderful thing – most of the time. Having every table use its own .ibd file allows you to easily reclaim space when dropping or truncating tables. But in some use cases, it may cause significant performance issues.

Many of you in the audience are responsible for running automated tests on your codebase before deploying to production. If you are, then one of your goals is having tests run as quickly as possible so you can run them as frequently as possible. Often times you can change specific settings in your test environment that don’t affect the outcome of the test, but do improve throughput. This post discusses how innodb_file_per_table is one of those settings.

I recently spoke with a customer whose use case involved creating hundreds of tables on up to 16 schemas concurrently as part of a Jenkins testing environment. This was not in production, so performance was far more important than durability. They’d run their tests, and then drop the schemas. This process took close to 20 minutes. They asked “How can we make this faster?”

Due to the number of tables involved innodb_file_per_table seemed a likely culprit.

It’s been noted here on the MySQL Performance Blog that innodb_file_per_table can cause table creation and drops to slow down. But what exactly is the performance hit? We wanted to find out.

The innodb_file_per_table Test:

On a test server running CentOS release 6.5, xfs filesystem, and 5.6.22-71.0-log Percona Server, I ran the following homemade benchmark bash script:

[root@host ~]# time $(for db in {1..16}; do mysql -e "create database bench$db"; $(for tb in {1..500}; do $(mysql bench$db -e "create table tab${tb} (i int) engine=innodb"); done) & done)

If you open the mysql client in another screen or terminal, you should see something like this:

... +-------+------+-----------+---------+---------+------+----------------+------------------------------------------+-----------+---------------+          | Id    | User | Host      | db      | Command | Time | State          | Info                                     | Rows_sent | Rows_examined |          +-------+------+-----------+---------+---------+------+----------------+------------------------------------------+-----------+---------------+          | 80013 | root | localhost | NULL    | Query   |    0 | init           | show processlist                         |         0 |             0 |          | 89462 | root | localhost | bench5  | Query   |    0 | creating table | create table tab95 (i int) engine=innodb |         0 |             0 |          | 89466 | root | localhost | bench8  | Query   |    0 | creating table | create table tab81 (i int) engine=innodb |         0 |             0 |          | 89467 | root | localhost | bench1  | Query   |    0 | creating table | create table tab91 (i int) engine=innodb |         0 |             0 |          | 89468 | root | localhost | bench13 | Query   |    0 | creating table | create table tab90 (i int) engine=innodb |         0 |             0 |          | 89469 | root | localhost | bench15 | Query   |    0 | creating table | create table tab90 (i int) engine=innodb |         0 |             0 |          | 89472 | root | localhost | bench9  | Query   |    0 | creating table | create table tab86 (i int) engine=innodb |         0 |             0 |          | 89473 | root | localhost | bench10 | Query   |    0 | creating table | create table tab94 (i int) engine=innodb |         0 |             0 |          | 89474 | root | localhost | bench11 | Query   |    0 | creating table | create table tab80 (i int) engine=innodb |         0 |             0 |          | 89475 | root | localhost | bench3  | Query   |    0 | creating table | create table tab80 (i int) engine=innodb |         0 |             0 |          | 89476 | root | localhost | bench2  | Query   |    0 | creating table | create table tab82 (i int) engine=innodb |         0 |             0 |          | 89478 | root | localhost | bench4  | Query   |    0 | creating table | create table tab91 (i int) engine=innodb |         0 |             0 |          | 89479 | root | localhost | bench16 | Query   |    0 | creating table | create table tab88 (i int) engine=innodb |         0 |             0 |          | 89481 | root | localhost | bench12 | Query   |    0 | creating table | create table tab90 (i int) engine=innodb |         0 |             0 |          | 89483 | root | localhost | bench6  | Query   |    0 | creating table | create table tab96 (i int) engine=innodb |         0 |             0 |          | 89484 | root | localhost | bench14 | Query   |    0 | creating table | create table tab95 (i int) engine=innodb |         0 |             0 |          +-------+------+-----------+---------+---------+------+----------------+------------------------------------------+-----------+---------------+          ...        

After creating the tables, I dropped all schemas concurrently:

[root@host ~]# time $(for db in {1..16}; do mysql -e "drop database bench${db}" & done)

So what was the difference with innodb_file_per_table ON vs OFF?

  • With innodb_file_per_table=ON
    • Schema and table creation = 1m54.852s
    • Schema drops = 1m21.682s
  • With innodb_file_per_table=OFF
    • Schema and table creation = 0m59.968s
    • Schema drops = 0m54.870s

So creation time decreased by 48%, drop time decreased by 33%.

I think its worth noting that this benchmark creates and drops empty tables. Dropping InnoDB tables created with innodb_file_per_table=ON can take much longer if they have large amounts of data.

Please also be aware that there are always trade-offs when modifying your InnoDB settings. That is outside the scope of this post, so please research and test before making changes. The MySQL documentation discusses that here.  In 5.6.6 and up, innodb_file_per_table is ON by default. MySQL 5.6 will also create temp tables as InnoDB, as noted here.

So there you have it. If your primary goal is to improve create and drop table time, turning OFF innodb_file_per_table will save significant amounts of time.

The post Is MySQL’s innodb_file_per_table slowing you down? appeared first on MySQL Performance Blog.

MySQL community t-shirt contest for Percona Live 2015

February 23, 2015 - 3:00am

Have designs on Percona Live this April in Silicon Valley? Send them to us! The winning entry will appear on a cool limited edition t-shirt that we’ll award to a few dozen lucky recipients at our booth’s new Percona T-Shirt Bar. The winner will also get a t-shirt, of course, along with a pair “Beats by Dre” headphones!

That’s right: We’re calling on the creative types within the MySQL community to come up with designs for a unique t-shirt.

Let your imaginations run free! Just make sure to include “Percona Live 2015” in there somewhere. You might also want to include your signature, hanko, seal or mark… treat the cotton as the canvas of your masterpiece… let the world know who created it!

Send your t-shirt designs to me as a high-resolution PDF or in .EPS format. The deadline for entries is March 6. The winner will be chosen under the sole discretion of Percona’s marketing team, taking into consideration quality of design, values alignment, trademark clearance and general awesomeness. (Submitted designs assume unlimited royalty free rights usage by Percona. We also reserve the right to declare no winners if there are no suitable designs submitted. You do not need to register or attend the conference to submit a design.)

Click here to submit your design for the MySQL community t-shirt contest!

By the way, the image on this post is not a template. You have free rein so go get ‘em! And just to be clear: this won’t be the t-shirt that everyone receives at the registration booth. However, it just might be one of the most coveted t-shirts at the conference!

I’ll share the winning design the week of March 9. Good luck and I hope to see you all this April and the Percona Live MySQL Conference and Expo! The conference runs April 13-16 at the Hyatt Regency Santa Clara & the Santa Clara Convention Center.

 

The post MySQL community t-shirt contest for Percona Live 2015 appeared first on MySQL Performance Blog.

How to setup a PXC cluster with GTIDs (and have async slaves replicating from it!)

February 20, 2015 - 12:00am

This past week was marked by a series of personal findings related to the use of Global Transaction IDs (GTIDs) on Galera-based clusters such as Percona XtraDB Cluster (PXC). The main one being the fact that transactions touching MyISAM tables (and FLUSH PRIVILEGES!) issued on a giving node of the cluster are recorded on a GTID set bearing the node’s server_uuid as “source id” and added to the binary log (if the node has binlog enabled), thus being replicated to any async replicas connected to it. However, they won’t be replicated across the cluster (that is, all of this is by design, if wsrep_replicate_myisam is disabled, which it is by default).

My colleague Stéphane covered this story in one of his recent blog posts titled, “Percona XtraDB Cluster 5.6: a tale of 2 GTIDs,” explaining how those local (in reference to the node) transactions are saved in a different GTID set to the cluster’s main one and the impact this may cause when re-pointing an async slave to replicate from a different node.

GTIDs is a feature introduced in MySQL 5.6 that made replication management much easier and considering there’s a series of advantages in having an async replica attached to a PXC cluster, why hasn’t this popped out earlier to either of us? I guess there aren’t so many people using GTIDs with Galera-based clusters around yet so here’s a post to show you how to do it.

Initializing a PXC cluster configured with GTIDs

My testing environment for a 3-node cluster is composed of node1 (192.168.70.2), node2 (.3) and node3 (.4). All of them have the same PXC binaries installed:

$ rpm -qa |grep -i percona-xtradb-cluster Percona-XtraDB-Cluster-56-5.6.21-25.8.938.el6.x86_64 Percona-XtraDB-Cluster-galera-3-3.8-1.3390.rhel6.x86_64 Percona-XtraDB-Cluster-server-56-5.6.21-25.8.938.el6.x86_64 Percona-XtraDB-Cluster-client-56-5.6.21-25.8.938.el6.x86_64 Percona-XtraDB-Cluster-shared-56-5.6.21-25.8.938.el6.x86_64

and are configured with an almost identical /etc/my.cnf (apart from wsrep_node_address and the prompt line):

[mysql] prompt='mysql {node1} > ' [mysqld] datadir = /var/lib/mysql log_warnings=2 server_id=1 log_bin=percona-bin log_slave_updates binlog_format = ROW enforce_gtid_consistency=1 gtid_mode=on wsrep_cluster_name = my-three-node-cluster wsrep_cluster_address = gcomm://192.168.70.2,192.168.70.3,192.168.70.4 wsrep_node_address = 192.168.70.2 wsrep_provider = /usr/lib64/libgalera_smm.so wsrep_sst_method = xtrabackup-v2 wsrep_sst_auth = sst:secret innodb_locks_unsafe_for_binlog = 1 innodb_autoinc_lock_mode = 2

server_id, log_bin, log_slave_updates and binlog_format are not needed for the cluster to operate but they are required to setup regular replication later on so I’ve added those to let the nodes ready to operate as masters.

We start with an empty, inexisting datadir on all nodes so I’ll use the mysql_install_db script to create a base datadir with all that is needed for MySQL to work on node1, which will be the reference node of the cluster:

[node1]$ mysql_install_db --user=mysql

We’re now ready to bootstrap the cluster from this reference node:

[node1]$ service mysql bootstrap-pxc

With that, we have an operational reference node:

mysql [node1] > select @@server_uuid; +--------------------------------------+ | @@server_uuid | +--------------------------------------+ | 834bca7f-b45e-11e4-a7b5-0800272e951c | +--------------------------------------+ 1 row in set (0.00 sec)
Note from above my.cnf settings that I’ve chosen xtrabackup-v2 as the State Snapshot Transfer (SST) method, which requires authentication (wsrep_sst_auth). For this reason, if we now try to start MySQL on node2 it will fail with its error log showing:
2015-02-14 16:58:26 24149 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --addre ss '192.168.70.3' --auth 'sst:secret' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '24149' --binlog 'percona-bin' : 1 (Operation not permitted) 2015-02-14 16:58:26 24149 [ERROR] WSREP: Failed to read uuid:seqno from joiner script. 2015-02-14 16:58:26 24149 [ERROR] WSREP: SST failed: 1 (Operation not permitted) 2015-02-14 16:58:26 24149 [ERROR] Aborting
The problem here is that the SST method requires MySQL authentication: even though the credentials are shared on the wsrep_sst_auth variable of all node’s my.cnf configuration file this only tells xtrabackup-v2 to require it, it doesn’t actually configure MySQL with it. That’s a step left for us:
mysql [node1] > GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sst'@'localhost' IDENTIFIED BY 'secret'; Query OK, 0 rows affected (0.02 sec)
And this constitutes our very first commited transaction, which goes into the cluster’s GTID set:
mysql [node1] > select @@global.gtid_executed; +----------------------------------------+ | @@global.gtid_executed | +----------------------------------------+ | 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1 | +----------------------------------------+ 1 row in set (0.00 sec)
This GRANT only needs to be issued once, in the reference node. Now you start MySQL on node2 and node3: they will use XtraBackup to make a backup of node1’s dataset, which will contain this GRANT, and restore it in their own datadir:
[node2]$ service mysql start
and:
[node3]$ service mysql start

OK, that’s done. But how do I attach an async replica to the cluster?

Just to make it clear, you cannot attach an async replica to the cluster: you need to choose a node that is member of the cluster and make it the master for the async replica. Considering all nodes should have the same data replicated it should be easy to change the async replica’s configuration and make it replicate from any other node from the cluster – and it is, though you may need to do some adjustments (more on this later).

The initial procedure is exactly the same one used with regular replication: you start by taking a backup of the master and restoring it on the replica. We’ll use XtraBackup again to perform the backup and we’ll start by having node2 as the master of our async replica (192.168.70.7). We could stream the backup from node2 directly to the async replica and later “prepare” it (by applying the logs, which needs to be done using the same version of Percona XtraBackup that you’ve used to take the backup), but to make things simple we’ll first take the backup on node2:

[node2]$ innobackupex /tmp (...) innobackupex: Using server version 5.6.21-70.1-56-log innobackupex: Created backup directory /tmp/2015-02-14_17-53-22 (...) 150214 17:53:26 innobackupex: completed OK!

then “prepare” it:

[node2]$ innobackupex --apply-log /tmp/2015-02-14_17-53-22 (...) 150214 17:56:10 innobackupex: Starting the apply-log operation (...) 150214 17:56:14 innobackupex: completed OK!

and from our async replica we’ll copy the backup from node2 using rsync over ssh (you could use scp or any other mathod to copy the files from one server to the other):

[replica]$ rsync -av -e 'ssh -l root' 192.168.70.3:/tmp/2015-02-14_17-53-22/ /var/lib/mysql

We now need to change the ownership of those files to the ‘mysql‘ user:

[replica]$ chown mysql:mysql -R /var/lib/mysql

and take note of the “replication coordinates” – but related to GTID. We have those in the xtrabackup_info file:

[replica]$ cat /var/lib/mysql/xtrabackup_info (...) binlog_pos = GTID of the last change '7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1' (...)

In our case we took the backup very early in this database’s lifecycle (seeing by the GTID # in there) but in practice that’s seldom the case. We can now start MySQL:

[replica]$ service MySQL start

Before we can point our async replica to replicate from node2 we need to create a replication user with the right set of privileges there:

mysql [node2] > GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO repl@'192.168.70.%' IDENTIFIED BY 'p4ssword'; Query OK, 0 rows affected (0.06 sec)

Now we can use CHANGE MASTER TO on our async replica to point it to node2 using the above credentials:

mysql [replica] > CHANGE MASTER TO MASTER_HOST='192.168.70.3', MASTER_USER='repl', MASTER_PASSWORD='p4ssword', MASTER_AUTO_POSITION = 1; Query OK, 0 rows affected, 2 warnings (0.05 sec)

If we START SLAVE now we’ll run into error 1236: we need first to set the replication coordinates from when the backup was taken. With GTID replication, this is done in a different manner: instead of providing replicate coordinates in the CHANGE MASTER TO command (where we’ve used MASTER_AUTO_POSITION=1 instead) we do it by redefining the gtid_purged global variable with the GTID sets we got from the xtrabackup_info file (in this example there’s only one set):

mysql [replica] > SET @@global.gtid_purged='7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1'; Query OK, 0 rows affected (0.02 sec)

We can now START SLAVE and check its status:

mysql [replica] > START SLAVE; SHOW SLAVE STATUSG Query OK, 0 rows affected (0.00 sec) *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.70.3 Master_User: repl (...) Slave_IO_Running: Yes Slave_SQL_Running: Yes (...) Master_Server_Id: 2 Master_UUID: 8a157d9c-b465-11e4-aafa-0800272e951c (...) Retrieved_Gtid_Set: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:2 Executed_Gtid_Set: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-2 Auto_Position: 1 1 row in set (0.00 sec)

Note that Executed_Gtid above is showing a second transaction in the cluster’s GTID set: this is related to the GRANT statement we issued on node2 to setup the replication account and it means it is now configured on all nodes members of the cluster (and this async replica as well). With that in place we can easily point our async replica to replicate from a different node, such as node3:

mysql [replica] > STOP SLAVE; CHANGE MASTER TO MASTER_HOST='192.168.70.4'; START SLAVE; Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.01 sec)

Let’s verify this is indeed the case:

mysql [replica] > SHOW SLAVE STATUSG *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.70.4 Master_User: repl (...) Slave_IO_Running: Yes Slave_SQL_Running: Yes (...) Master_Server_Id: 3 Master_UUID: aa9acb85-b465-11e4-ab09-0800272e951c (...) Executed_Gtid_Set: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-2 Auto_Position: 1

Nice! What about the caveats you were talking about in the other blog post?

The problem arises when you start to have other GTID sets on the gtid_executed variable of the nodes. This happens when a DML transaction involving a MyISAM table is issued on the node while having wsrep_replicate_myisam disabled: instead of having that transaction going to the cluster’s main GTID set it will go to a new one, bearing the node’s server_uuid as source id. In fact, that’s the behavior you find on regular MySQL replication configured with GTIDs. Here’s an example:

mysql [node1] > CREATE TABLE test.fernando1 (id int) ENGINE=MYISAM; Query OK, 0 rows affected (0.03 sec)

Creating a MyISAM table per si is not an issue, because this is a DDL, so the transaction will go to the cluster’s GTID set and replicated to the other nodes:

mysql [node1] > select @@global.gtid_executedG *************************** 1. row *************************** @@global.gtid_executed: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3 1 row in set (0.00 sec)

But doing an INSERT in such a table is a DML so the transaction will go to a different GTID set:

mysql [node1] > select @@global.gtid_executedG *************************** 1. row *************************** @@global.gtid_executed: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3, 834bca7f-b45e-11e4-a7b5-0800272e951c:1 1 row in set (0.00 sec)

Now if we move our async replica to node1 it might just works:

mysql [replica] > STOP SLAVE; CHANGE MASTER TO MASTER_HOST='192.168.70.2'; START SLAVE; Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.02 sec) Query OK, 0 rows affected (0.01 sec) mysql [replica] > show slave statusG *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.70.2 (...) Retrieved_Gtid_Set: 834bca7f-b45e-11e4-a7b5-0800272e951c:1 Executed_Gtid_Set: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3, 834bca7f-b45e-11e4-a7b5-0800272e951c:1 Auto_Position: 1 1 row in set (0.00 sec)

and that is because transaction ‘834bca7f-b45e-11e4-a7b5-0800272e951c:1‘ could still be found on the binary logs of node1. Let’s do something slightly different now, but on node2:

mysql [node2] > INSERT INTO test.fernando1 values (2); Query OK, 1 row affected (0.01 sec) mysql [node2] > select @@global.gtid_executedG *************************** 1. row *************************** @@global.gtid_executed: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3, 8a157d9c-b465-11e4-aafa-0800272e951c:1 1 row in set (0.00 sec) mysql [node2] > SHOW MASTER STATUSG *************************** 1. row *************************** File: percona-bin.000008 Position: 923 Binlog_Do_DB: Binlog_Ignore_DB: Executed_Gtid_Set: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3, 8a157d9c-b465-11e4-aafa-0800272e951c:1 1 row in set (0.00 sec) mysql [node2] > FLUSH LOGS; Query OK, 0 rows affected (0.01 sec) mysql [node2] > SHOW MASTER STATUSG *************************** 1. row *************************** File: percona-bin.000009 Position: 231 Binlog_Do_DB: Binlog_Ignore_DB: Executed_Gtid_Set: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3, 8a157d9c-b465-11e4-aafa-0800272e951c:1 1 row in set (0.00 sec) mysql [node2] > PURGE BINARY LOGS TO 'percona-bin.000009'; Query OK, 0 rows affected (0.01 sec)

Transaction ‘8a157d9c-b465-11e4-aafa-0800272e951c:1‘ that contains the INSERT statement we issued from node2 was recorded into binary log file percona-bin.000008 but the following PURGE BINARY LOGS TO command deleted that file. Now, if we point our async replica back to node2 we’ll run into a problem:

mysql [replica] > STOP SLAVE; CHANGE MASTER TO MASTER_HOST='192.168.70.3'; START SLAVE; Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.01 sec) mysql [replica] > show slave statusG *************************** 1. row *************************** Slave_IO_State: Master_Host: 192.168.70.3 (...) Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.' (...) Executed_Gtid_Set: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3, 834bca7f-b45e-11e4-a7b5-0800272e951c:1 Auto_Position: 1 1 row in set (0.00 sec)

The master can no longer provide transaction ‘834bca7f-b45e-11e4-a7b5-0800272e951c:1‘ to the replica as requested thus breaking replication.

Ouch! Is there a fix for this?

Well, kind of. You can trick MySQL into believing it has processed this transaction by injecting an empty transaction bearing the same GTID of the missing one:

mysql [replica] > SELECT GTID_SUBTRACT('7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3,8a157d9c-b465-11e4-aafa-0800272e951c:1', '7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3,834bca7f-b45e-11e4-a7b5-0800272e951c:1')G *************************** 1. row *************************** GTID_SUBTRACT('7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3,8a157d9c-b465-11e4-aafa-0800272e951c:1', '7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3,834bca7f-b45e-11e4-a7b5-0800272e951c:1'): 8a157d9c-b465-11e4-aafa-0800272e951c:1 1 row in set (0.00 sec) mysql [replica] > SET GTID_NEXT='8a157d9c-b465-11e4-aafa-0800272e951c:1'; Query OK, 0 rows affected (0.00 sec) mysql [replica] > BEGIN;COMMIT; Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) mysql [replica] > SET gtid_next = 'AUTOMATIC'; Query OK, 0 rows affected (0.00 sec) mysql [replica] > show slave statusG *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.70.3 (...) Slave_IO_Running: Yes Slave_SQL_Running: Yes (...) Executed_Gtid_Set: 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1-3, 834bca7f-b45e-11e4-a7b5-0800272e951c:1, 8a157d9c-b465-11e4-aafa-0800272e951c:1 Auto_Position: 1 1 row in set (0.00 sec)

The problem is that you’ll end up with an inconsistent dataset – injecting an empty transaction is the same as using SQL_SLAVE_SKIP_COUNTER on regular MySQL replication.

Take-home lesson

It is not complicated to configure a PXC cluster to use GTIDs, in fact you may prefer to do so once you get used to it. And it makes that all much easier to have an asynchronous slave replicating from one of the nodes, even moving it around to replicate from a different node. It should all go well while the nodes record their transactions on the cluster’s main GTID set, which is shared by all nodes: this implies all of them will have the same transactions in their binlogs, as designed. The problem appears when we start seeing a deviation on this pattern, with a few nodes recording local transactions on their “own” GTID set. This won’t bother the cluster operation per si, as these other GTID sets are simply ignored and won’t be replicated to other nodes, but they may complicate things for attached async replicas if you need to point them to a different node.

In such environments, strive to maintain a unified GTID set around the cluster’s main one and if you find out that one of the nodes has started adding transactions to a different set, investigate it. You shouldn’t be using MyISAM tables on an XtraDB Cluster as those aren’t officially supported but if you must do them you should probably consider using wsrep_replicate_myisam. Operations on mysql system tables should be done through the use of DDLs instead of DMLs (like GRANT instead of INSERT) and, above all, keep distance from this bug.

The post How to setup a PXC cluster with GTIDs (and have async slaves replicating from it!) appeared first on MySQL Performance Blog.

More on MySQL 5.6 multi-threaded replication and GTIDs (and Feb. 25 webinar)

February 19, 2015 - 12:00am

In a previous post, titled “Multi-threaded replication with MySQL 5.6: Use GTIDs,” I explained that using GTID replication is almost a requirement when using MySQL 5.6 MTS. Let’s see now how to perform the day-to-day operations when MTS and GTIDs are both enabled. (I’ll also be presenting a related webinar next week titled “Multi-threaded Replication in MySQL 5.6 and 5.7″).

Seeing the execution gaps

If you have a look at SHOW SLAVE STATUS while the slave is running, you may not be expecting such an output:

[...] Executed_Gtid_Set: 1381aa44-9a60-11e4-b6d8-94dbc999324d:1-2520:2522:2524:2526-2528:2531-2533:2536-2538:2540-2541:2544:2546-2547:2550-2551:2555:2565-2566:2569:2575-2577:2579-2581:2584-2586:2588:2590-2591:2595-2597:2599:2602:2604-2605:2607-2610:2613:2615-2620:2622-2624:2626-2627:2629:2631:2634:2636-2639:2641-2642:2644:2646-2647:2649:2651-2653:2657-2658:2661-2662:2666-2672:2676-2678:2680:2683-2684:2686-2693:2695:2701:2704:2706-2707:2709:2711:2713-2714:2717:2720-2722:2729-2730:2735:2744:2746:2749:2751-2752:2762:2764-2765:2768-2769:2771:2774:2776:2780-2782:2784:2786-2787:2789:2791:2793:2800:2803:2805-2807:2809:2811-2814:2816-2817:2819-2820:2822-2826:2828-2834:2837-2840:2842:2844-2845:2847:2850-2851:2853:2855:2857-2859:2861-2863:2865-2868:2870-2871:2873-2874:2878:2880-2884:2886-2888:2891:2893:2895-2896:2899:2903:2906-2907:2910:2912:2915-2918:2921-2923:2925-2926:2930:2932:2934:2936:2939-2940:2943-2944:2946:2948-2957:2966:2969-2970:2974:2976:2979-2980:2982:2985:2987:2989:2994:2996-2997:2999:3001:3003:3006:3008:3011-3013 [...]

Ouch! What does that insane list of GTIDs mean?

It is actually easy to understand as long as you know that the GTID of all executed transaction is tracked in Executed_Gtid_Set and that execution gaps are allowed with MTS.

Then 1-2520:2522:2524 simply means that transactions #1 to #2520 have been executed, as well as transactions #2522 and #2524, but not #2521 and #2523.

You can also see that a gap at a specific position will not last for long. If you run SHOW SLAVE STATUS an instant later, you will see:

[...] Executed_Gtid_Set: 1381aa44-9a60-11e4-b6d8-94dbc999324d:1-4095:4098:4100:4103-4105[...]

This time the first execution gap is for transaction #4096.

If you stop the writes on the master, all gaps will be filled once replication has caught up and you will simply see:

[...] Executed_Gtid_Set: 1381aa44-9a60-11e4-b6d8-94dbc999324d:1-20599 [...]

Dealing with replication errors

Say replication has stopped with an error and you want to skip the offending event. The only option with GTID replication is to inject an empty transaction, which in turn means you must know the GTID of the transaction you want to skip.

Let’s look at SHOW SLAVE STATUS:

Executed_Gtid_Set: 1381aa44-9a60-11e4-b6d8-94dbc999324d:1-1052769:1052805-1052806:1052808:1052810-1052811:1052814:1052816:1052819:1052823:1052825:1052827-1052828:1052830-1052831:1052835:1052838:1052840:1052842:1052844-1052846:1052848-1052849:1052851-1052852:1052855-1052857:1052859:1052862-1052863:1052867-1052868:1052870:1052873-1052875:1052878-1052879:1052882-1052883:1052885:1052887:1052890-1052892:1052896:1052901:1052905:1052908-1052909:1052911:1052915:1052917-1052918:1052922-1052923:1052927-1052929:1052931-1052933:1052937-1052938:1052940:1052943:1052946:1052948-1052949:1052953:1052955-1052956:1052958:1052962-1052964:1052967-1052969:1052972:1052975-1052977:1052979:1052981-1052983:1052985:1052987:1052989:1052991:1052993-1052995:1052999:1053001:1053003:1053005-1053016:1053018:1053020:1053024-1053026:1053029:1053032-1053034:1053037-1053038:1053040:1053043:1053045-1053046

So which transaction should we skip: probably 1381aa44-9a60-11e4-b6d8-94dbc999324d:1052770, right? This is the first transaction that could not be executed.

This is confirmed by looking at the Last_SQL_Error field:

Last_SQL_Error: Worker 0 failed executing transaction '1381aa44-9a60-11e4-b6d8-94dbc999324d:1052770' [...]

Once you know the GTID to skip, it is easy to restart replication (and fix the inconsistency later):

mysql> SET gtid_next='1381aa44-9a60-11e4-b6d8-94dbc999324d:1052770'; mysql> BEGIN;COMMIT; mysql> SET gtid_next='AUTOMATIC'; mysql> START SLAVE;

Taking backups

When using GTID replication, taking a backup from a multi-threaded slave is not difficult at all.

With Percona XtraBackup, simply add the --slave-info option as usual and you will get the list of executed GTIDs in the xtrabackup_slave_info file:

$ less xtrabackup_slave_info SET GLOBAL gtid_purged='1381aa44-9a60-11e4-b6d8-94dbc999324d:1-1095246:1095248-1095249:1095253-1095254:1095257-1095266:1095270-1095271:1095273:1095275:1095284-1095285:1095296:1095305:1095311-1095312:1095315-1095316:1095318:1095321:1095323-1095324:1095326-1095332:1095335:1095337-1095338:1095348-1095351:1095353-1095354:1095356:1095358-1095359:1095363-1095364:1095366:1095368-1095369:1095372:1095375-1095376:1095378:1095380-1095385:1095388:1095390-1095392:1095394-1095411:1095419:1095424-1095425:1095428:1095430-1095433:1095437-1095439:1095442:1095444-1095445:1095447:1095449:1095461-1095464:1095468:1095473:1095482-1095484:1095488-1095490:1095494:1095496-1095497:1095499-1095500:1095502:1095505:1095507:1095509:1095511-1095516:1095521:1095524-1095525:1095527:1095530-1095531:1095534-1095552:1095554:1095557:1095559-1095560:1095563:1095566:1095569:1095572-1095573:1095582:1095594-1095595:1095597:1095601-1095605:1095607-1095610:1095612-1095614:1095618:1095621-1095623:1095625-1095628:1095630:1095632:1095634:1095636:1095639-1095642:1095645:1095649:1095659:1095661:1095664-1095665:1095667-1095669:1095671:1095674'; CHANGE MASTER TO MASTER_AUTO_POSITION=1

Then starting replication on a new instance bootstrapped from this backup is easy:

mysql> SET GLOBAL gtid_purged='...'; mysql> RESET SLAVE ALL; mysql> CHANGE MASTER TO [...], MASTER_AUTO_POSITION=1; mysql> START SLAVE;

With mysqldump, simply discard the --dump-slave option as the list of executed GTIDs will be automatically added at the top of the dump:

-- MySQL dump 10.13 Distrib 5.6.22, for linux-glibc2.5 (x86_64) [...] -- -- GTID state at the beginning of the backup -- SET @@GLOBAL.GTID_PURGED='1381aa44-9a60-11e4-b6d8-94dbc999324d:1-1095246:1095248-1095249:1095253-1095254:1095257-1095266:1095270-1095271:1095273:1095275:1095284-1095285:1095296:1095305:1095311-1095312:1095315-1095316:1095318:1095321:1095323-1095324:1095326-1095332:1095335:1095337-1095338:1095348-1095351:1095353-1095354:1095356:1095358-1095359:1095363-1095364:1095366:1095368-1095369:1095372:1095375-1095376:1095378:1095380-1095385:1095388:1095390-1095392:1095394-1095411:1095419:1095424-1095425:1095428:1095430-1095433:1095437-1095439:1095442:1095444-1095445:1095447:1095449:1095461-1095464:1095468:1095473:1095482-1095484:1095488-1095490:1095494:1095496-1095497:1095499-1095500:1095502:1095505:1095507:1095509:1095511-1095516:1095521:1095524-1095525:1095527:1095530-1095531:1095534-1095552:1095554:1095557:1095559-1095560:1095563:1095566:1095569:1095572-1095573:1095582:1095594-1095595:1095597:1095601-1095605:1095607-1095610:1095612-1095614:1095618:1095621-1095623:1095625-1095628:1095630:1095632:1095634:1095636:1095639-1095642:1095645:1095649:1095659:1095661:1095664-1095665:1095667-1095669:1095671:1095674';

And then replication can be started like stated before.

Conclusion

Seeing the execution gaps in the output of SHOW SLAVE STATUS can be disturbing at first sight, and of course you may have to change a few habits, but overall there is no specific issue when using GTIDs and MTS.

Next week I will be presenting a free webinar on multi-threaded replication (Wednesday February 25th at 10 a.m. PST). If you are interested in learning more on the topic, feel free to register. It will also be recorded – you’ll be able to use that same link to watch the presentation and download my slides.

The post More on MySQL 5.6 multi-threaded replication and GTIDs (and Feb. 25 webinar) appeared first on MySQL Performance Blog.

Percona submits 7 talks for Vancouver OpenStack Summit (voting ends Feb. 23)

February 18, 2015 - 6:43am

Percona has submitted seven talks for the next OpenStack Summit in Vancouver this May. And as with all OpenStack Summit events, the community decides the content. Voting ends February 23, and if you aren’t already an OpenStack Foundation member (required to vote), you can join now for free here.

Percona’s Vancouver OpenStack Summit proposals

Percona’s proposals include collaborations with top contributors across a variety of OpenStack services including Trove and Swift. You can vote for our talks by clicking the titles below that interest you.

MySQL and OpenStack Deep Dive
Speakers: Peter Boros, Jay Pipes (Mirantis)

Deep Dive into MySQL Replication with OpenStack Trove, and Kilo
Speakers: George Lorch, Amrith Kumar (Tesora)

MySQL on Ceph Storage: Exploring Design, Challenges and Benefits
Speakers: Yves Trudeau, Kyle Bader (Red Hat)

Core Services MySQL Database Backup & Recovery to Swift
Speakers: Kenny Gryp, Chris Nelson (SwiftStack)

Smart MySQL Log Management with Swift
Speakers: Matt Griffin, Chris Nelson (SwiftStack)

Discovering Better Database Architectures For Core Services In OpenStack
Speakers: Kenny Gryp, Matt Griffin

Upgrading your OpenStack Core Services On The Database Side: Nova, Neutron, Cinder…
Speakers: Kenny Gryp, Matt Griffin

Other interesting proposals

Here are a few proposals from other organizations that look particularly interesting. Please consider them as well.

Exploration into OpenStack Trove, customer use cases, and the future of Trove for the community
Speakers: Amrith Kumar (Tesora), Brad Topol (IBM), Mariam John (IBM)

The Entrepreneur’s Challenge: The Realities of Starting an OpenStack Company
Speakers: Simon Anderson (Dreamhost, Inktank), Ken Rugg (Tesora), Josh McKenty (Pivotal, Piston Cloud), Jesse Proudman (BlueBox), Joe Arnold (SwiftStack)

Making a Case for Your OpenStack Deployment: How Vendor’s Can Help
Speaker: Ryan Floyd (Storm Ventures)

Real World Experiences with Upgrading OpenStack at Time Warner Cable
Speakers: Clayton ONeill (Time Warner Cable), Matt Fischer (Time Warner Cable)

Percona & OpenStack

According to the most recent OpenStack User Survey in November 2014, Percona’s database software is a popular choice for OpenStack operators needing high availability.

OpenStack User Survey results from November 2014 shows Percona XtraDB Cluster as the top Galera-based choice for production clouds.

Percona XtraDB Cluster, the top Galera-based MySQL cluster solution for production OpenStack deployments, incorporates the latest version of MySQL 5.6, Percona Server 5.6, Percona XtraBackup, and Galera. This combination delivers top performance, high availability, and critical security coverage with the latest features and fixes. Additionally, Percona Server is a popular guest database option with unique features designed for cloud operators offering DBaaS.

In addition to sharing our open source software with the OpenStack community, Percona is sharing our expertise in services like Trove, projects like the HA Guide update, extensive benchmark testing activities, and upcoming events like OpenStack Live 2015. The inaugural OpenStack Live Conference, April 13-14 in Santa Clara, California, will be a user-focused event. The program will cover database-related on topics like Trove as well as other OpenStack services and feature multiple 3-hour hands-on tutorials.

Percona is a proud supporter of OpenStack and we hope to see you in both Santa Clara in April and Vancouver in May. And in the meantime, don’t forget to vote!

The post Percona submits 7 talks for Vancouver OpenStack Summit (voting ends Feb. 23) appeared first on MySQL Performance Blog.

Percona XtraBackup 2.2.9 is now available

February 17, 2015 - 10:22am

Percona is glad to announce the release of Percona XtraBackup 2.2.9 on February 17, 2015. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

Bugs Fixed:

  • Percona XtraBackup was vulnerable to MITM attack which could allow exfiltration of MySQL configuration information via --version-check option. This vulnerability was logged as CVE 2015-1027. Bug fixed #1408375.
  • xtrabackup_galera_info file isn’t overwritten during the Galera auto-recovery. Bug fixed #1418584.
  • Percona XtraBackup man pages are now included with binary packages. Bug fixed #1156209.
  • Percona XtraBackup now sets the maximum supported session value for lock_wait_timeout variable to prevent unnecessary timeouts when the global value is changed from the default. Bug fixed #1410339.
  • New option --backup-locks, enabled by default, has been implemented to control if backup locks will be used even if they are supported by the server. To disable backup locks innobackupex should be run with innobackupex --no-backup-locks option. Bug fixed #1418820.

Release notes with all the bugfixes for Percona XtraBackup 2.2.9 are available in our online documentation. Bugs can be reported on the launchpad bug tracker. Percona XtraBackup is an open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases.

The post Percona XtraBackup 2.2.9 is now available appeared first on MySQL Performance Blog.

‘Indexing’ JSON documents for efficient MySQL queries over JSON data

February 17, 2015 - 8:45am
MySQL meets NoSQL with JSON UDF

I recently got back from FOSDEM, in Brussels, Belgium. While I was there I got to see a great talk by Sveta Smirnova, about her MySQL 5.7 Labs release JSON UDF functions. It is important to note that while the UDF come in a 5.7 release it is absolutely possible to compile and use the UDF with earlier versions of MySQL because the UDF interface has not changed for a long time. However, the UDF should still be considered alpha/preview level of quality and should not be used in production yet! For this example I am using Percona Server 5.6 with the UDF.

That being said, the proof-of-concept that I’m about to present here uses only one JSON function (JSON_EXTRACT) and it has worked well enough in my testing to present my idea here. The JSON functions will probably be GA sometime soon anyway, and this is a useful test of the JSON_EXTRACT function.

The UDF let you parse, search and manipulate JSON data inside of MySQL, bringing MySQL closer to the capabilities of a document store.

Since I am using Percona Server 5.6, I needed to compile and install the UDF. Here are the steps I took to compile the plugin:

  1. $ cd mysql-json-udfs-0.3.3-labs-json-udfs-src
  2. $ cmake -DMYSQL_DIR=/usr/local/mysql .
  3. $ sudo make install
  4. $ sudo cp *.so /usr/local/mysql/lib/plugin
JSON UDF are great, but what’s the problem

The JSON functions work very well for manipulating individual JSON objects, but like all other functions, using JSON_EXTRACT in the WHERE clause will result in a full table scan. This means the functions are virtually useless for searching through large volumes of JSON data.  If you want to use MySQL as a document store, this is going to limit the usefulness in the extreme as the ability to extract key/value pairs from JSON documents is powerful, but without indexing it can’t scale well.

What can be done to index JSON in MySQL for efficient access?

The JSON UDF provides a JSON_EXTRACT function which can pull data out of a JSON document. There are two ways we can use this function to “index” the JSON data.

  1. Add extra columns to the table (or use a separate table, or tables) containing the JSON and populate the columns using JSON_EXTRACT in a trigger. The downside is that this slows down inserts and modifications of the documents significantly.
  2. Use Flexviews materialized views to maintain an index table separately and asynchronously. The upside is that insertion/modification speed is not affected, but there is slight delay before index is populated. This is similar to eventual consistency in a document store.

Writing triggers is an exercise I’ll leave up to the user. The rest of this post will discuss using Flexviews materialized views to create a JSON index.

What is Flexviews?

Flexviews can create ‘incrementally refreshable’ materialized views. This means that the views are able to be refreshed efficiently using changes captured by FlexCDC, the change data capture tool that ships with Flexviews. Since the view can be refreshed fast, it is possible to refresh it frequently and have a low latency index, but not one perfectly in sync with the base table at all times.

The materialized view is a real table that is indexed to provide fast access. Flexviews includes a SQL_API, or a set of stored procedures for defining and maintaining materialized views.

See this set of slides for an overview of Flexviews: http://www.slideshare.net/MySQLGeek/flexviews-materialized-views-for-my-sql

Demo/POC using materialized view as an index

The first step to creating an incrementally refreshable materialized view with Flexviews, is to create a materialized view change log on all of the tables used in the view. The CREATE_MVLOG($schema, $table) function creates the log and FlexCDC will immediately being to collect changes into it.

mysql> call flexviews.create_mvlog('ssb_json','json'); Query OK, 1 row affected (0.01 sec)

Next, the materialized view name, and refresh type must be registered with the CREATE($schema, $mvname, $refreshtype) function:

mysql> call flexviews.create('ssb_json','json_idx','INCREMENTAL'); Query OK, 0 rows affected (0.00 sec) -- save the just generated identifier for the view. You can use GET_ID($schema,$mvname) later. mysql> set @mvid := last_insert_id(); Query OK, 0 rows affected (0.00 sec)

Now one or more tables have to be added to the view using the ADD_TABLE($mvid, $schema, $table, $alias,$joinclause) function. This example will use only one table, but Flexviews supports joins too.

mysql> call flexviews.add_table(@mvid, 'ssb_json','json','json',null); Query OK, 1 row affected (0.00 sec)

Expressions must be added to the view next. Since aggregation is not used in this example, the expressions should be ‘COLUMN’ type expressions. The function ADD_EXPR($mvid, $expression_type, $expression, $alias) is used to add expressions. Note that JSON_EXTRACT returns a TEXT column, so I’ve CAST the function to integer so that it can be indexed. Flexviews does not currently have a way to define prefix indexes.

mysql> call flexviews.add_expr(@mvid, 'COLUMN', "cast(json_extract(doc,'D_DateKey') as date)", 'D_DateKey'); Query OK, 1 row affected (0.00 sec) mysql> call flexviews.add_expr(@mvid, 'COLUMN',"cast(json_extract(doc,'C_CustomerKey') as unsigned)", 'C_CustomerKey'); Query OK, 1 row affected (0.00 sec) mysql> call flexviews.add_expr(@mvid, 'COLUMN',"cast(json_extract(doc,'S_SuppKey') as unsigned)", 'S_SuppKey'); Query OK, 1 row affected (0.01 sec) mysql> call flexviews.add_expr(@mvid, 'COLUMN',"cast(json_extract(doc,'P_PartKey') as unsigned)", 'P_PartKey'); Query OK, 1 row affected (0.00 sec) mysql> call flexviews.add_expr(@mvid, 'COLUMN',"cast(json_extract(doc,'LO_OrderKey') as unsigned)", 'LO_OrderKey'); Query OK, 1 row affected (0.00 sec) mysql> call flexviews.add_expr(@mvid, 'COLUMN',"cast(json_extract(doc,'LO_LineNumber') as unsigned)", 'LO_LineNumber'); Query OK, 1 row affected (0.00 sec)

I’ve also projected out the ‘id’ column from the table, which is the primary key. This ties the index entries to the original row, so that the original document can be retrieved.

mysql> call flexviews.add_expr(@mvid, 'COLUMN', 'id', 'id'); Query OK, 1 row affected (0.00 sec)

Since we want to use the materialized view as an index, we need to index the columns we’ve added to it.

mysql> call flexviews.add_expr(@mvid, 'KEY',"LO_LineNumber", 'LO_LineNumber_Idx'); Query OK, 1 row affected (0.00 sec) mysql> call flexviews.add_expr(@mvid, 'KEY',"LO_OrderKey", 'LO_OrderKey_Idx'); Query OK, 1 row affected (0.00 sec) mysql> call flexviews.add_expr(@mvid, 'KEY',"P_PartKey", 'P_PartKey_Idx'); Query OK, 1 row affected (0.00 sec) mysql> call flexviews.add_expr(@mvid, 'KEY',"S_SuppKey", 'S_SuppKey_Idx'); Query OK, 1 row affected (0.00 sec) mysql> call flexviews.add_expr(@mvid, 'KEY',"D_DateKey", 'D_DateKey_Idx'); Query OK, 1 row affected (0.00 sec) mysql> call flexviews.add_expr(@mvid, 'KEY',"C_CustomerKey", 'C_CustomerKey_Idx'); Query OK, 1 row affected (0.00 sec)

Finally, the view has to be created. There are 6 million rows in my table, the JSON functions are UDF so they are not as fast as built in functions, and I indexed a lot of things (six different indexes are being populated at once) so it takes some time to build the index:

mysql> call flexviews.enable(@mvid); Query OK, 2 rows affected (35 min 53.17 sec)

After the materialized view is built, you can see it in the schema. Note there is also a delta table, which I will explain a bit later.

mysql> show tables; +--------------------+ | Tables_in_ssb_json | +--------------------+ | json | | json_idx | | json_idx_delta | +--------------------+ 3 rows in set (0.00 sec)

Here is the table definition of json_idx, our materialized view. You can see it is indexed:

CREATE TABLE `json_idx` ( `mview$pk` bigint(20) NOT NULL AUTO_INCREMENT, `D_DateKey` date DEFAULT NULL, `C_CustomerKey` bigint(21) unsigned DEFAULT NULL, `S_SuppKey` bigint(21) unsigned DEFAULT NULL, `P_PartKey` bigint(21) unsigned DEFAULT NULL, `LO_OrderKey` bigint(21) unsigned DEFAULT NULL, `LO_LineNumber` bigint(21) unsigned DEFAULT NULL, `id` bigint(20) NOT NULL DEFAULT '0', `mview$hash` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`mview$pk`), KEY `LO_LineNumber_Idx` (`LO_LineNumber`), KEY `LO_OrderKey_Idx` (`LO_OrderKey`), KEY `P_PartKey_Idx` (`P_PartKey`), KEY `S_SuppKey_Idx` (`S_SuppKey`), KEY `D_DateKey_Idx` (`D_DateKey`), KEY `C_CustomerKey_Idx` (`C_CustomerKey`), KEY `mview$hash_key` (`mview$hash`) ) ENGINE=InnoDB AUTO_INCREMENT=6029221 DEFAULT CHARSET=latin1;

Here are some sample contents. You can see the integer values extracted out of the JSON:

mysql> select * from json_idx limit 10; +----------+------------+---------------+-----------+-----------+-------------+---------------+----+------------+ | mview$pk | D_DateKey | C_CustomerKey | S_SuppKey | P_PartKey | LO_OrderKey | LO_LineNumber | id | mview$hash | +----------+------------+---------------+-----------+-----------+-------------+---------------+----+------------+ | 1 | 1996-08-08 | 6748 | 1 | 178778 | 35620 | 2 | 1 | 3151656687 | | 2 | 1994-05-20 | 5272 | 1 | 52362 | 102790 | 4 | 2 | 2181615425 | | 3 | 1995-05-04 | 22870 | 1 | 136407 | 146757 | 3 | 3 | 544130577 | | 4 | 1996-06-16 | 12217 | 1 | 129103 | 151200 | 1 | 4 | 2969697839 | | 5 | 1992-07-20 | 21757 | 1 | 35243 | 151745 | 1 | 5 | 1438921571 | | 6 | 1997-08-16 | 18760 | 1 | 150766 | 159232 | 6 | 6 | 3941775529 | | 7 | 1994-03-04 | 757 | 1 | 15750 | 188902 | 3 | 7 | 2142628088 | | 8 | 1993-11-04 | 17830 | 1 | 192023 | 253828 | 5 | 8 | 3480147565 | | 9 | 1993-07-12 | 16933 | 1 | 59997 | 269062 | 5 | 9 | 3572286145 | | 10 | 1998-06-16 | 26660 | 1 | 30023 | 298272 | 3 | 10 | 1971966244 | +----------+------------+---------------+-----------+-----------+-------------+---------------+----+------------+ 10 rows in set (0.00 sec)

Now, there needs to be an easy way to use this index in a select statement. Since a JOIN is needed between the materialized view and the base table, a regular VIEW makes sense to access the data. We’ll call this the index view:

mysql> create view json_idx_v as select * from json natural join json_idx; Query OK, 0 rows affected (0.00 sec)

And just for completeness, here is the contents of a row from our new index view:

mysql> select * from json_idx_v limit 1G *************************** 1. row *************************** id: 1 doc: {"LO_OrderKey":"35620","LO_LineNumber":"2","LO_CustKey":"6748","LO_PartKey":"178778","LO_SuppKey":"1","LO_OrderDateKey":"19960808","LO_OrderPriority":"3-MEDIUM","LO_ShipPriority":"0","LO_Quantity":"38","LO_ExtendedPrice":"7055726","LO_OrdTotalPrice":"14691804","LO_Discount":"8","LO_Revenue":"6491267","LO_SupplyCost":"111406","LO_Tax":"1","LO_CommitDateKey":"19960909","LO_ShipMode":"REG AIR","C_CustomerKey":"6748","C_Name":"Customer#000006748","C_Address":"RSPYBRlR7RX6 f7J8","C_City":"PERU 5","C_Nation":"PERU","C_Region":"AMERICA","C_Phone":"27-580-967-4556","C_MktSegment":"AUTOMOBILE","S_SuppKey":"1","S_Name":"Supplier#000000001","S_Address":"sdrGnXCDRcfriBvY0KL,i","S_City":"PERU 0","S_Nation":"PERU","S_Region":"AMERICA","S_Phone":"27-989-741-2988","D_DateKey":"19960808","D_Date":"Augest 8, 1996","D_DayOfWeek":"Friday","D_Month":"Augest","D_Year":"1996","D_YearMonthNum":"199608","D_YearMonth":"Aug1996","D_DayNumInWeek":"6","D_DayNumInMonth":"8","D_DayNumInYear":"221","D_MonthNumInYear":"8","D_WeekNumInYear":"32","D_SellingSeason":"Summer","D_LastDayInWeekFl":"0","D_LastDayInMonthFl":"1","D_HolidayFl":"0","D_WeekDayFl":"1","P_PartKey":"178778","P_Name":"turquoise powder","P_MFGR":"MFGR#1","P_Category":"MFGR#11","P_Brand":"MFGR#1125","P_Colour":"beige","P_Type":"STANDARD POLISHED NICKEL","P_Size":"25","P_Container":"JUMBO BAG"} mview$pk: 1 D_DateKey: 1996-08-08 C_CustomerKey: 6748 S_SuppKey: 1 P_PartKey: 178778 LO_OrderKey: 35620 LO_LineNumber: 2 mview$hash: 3151656687 1 row in set (0.00 sec)

Using the UDF to find a document

The UDF does a full table scan, parsing all six million documents (TWICE!) as it goes along. Unsurprisingly, this is slow:

mysql> select * from json where json_extract(doc,'LO_OrderKey') = 35620 and json_extract(doc,'LO_LineNumber') = 2G *************************** 1. row *************************** id: 1 doc: {"LO_OrderKey":"35620","LO_LineNumber":"2","LO_CustKey":"6748","LO_PartKey":"178778","LO_SuppKey":"1","LO_OrderDateKey":"19960808","LO_OrderPriority":"3-MEDIUM","LO_ShipPriority":"0","LO_Quantity":"38","LO_ExtendedPrice":"7055726","LO_OrdTotalPrice":"14691804","LO_Discount":"8","LO_Revenue":"6491267","LO_SupplyCost":"111406","LO_Tax":"1","LO_CommitDateKey":"19960909","LO_ShipMode":"REG AIR","C_CustomerKey":"6748","C_Name":"Customer#000006748","C_Address":"RSPYBRlR7RX6 f7J8","C_City":"PERU 5","C_Nation":"PERU","C_Region":"AMERICA","C_Phone":"27-580-967-4556","C_MktSegment":"AUTOMOBILE","S_SuppKey":"1","S_Name":"Supplier#000000001","S_Address":"sdrGnXCDRcfriBvY0KL,i","S_City":"PERU 0","S_Nation":"PERU","S_Region":"AMERICA","S_Phone":"27-989-741-2988","D_DateKey":"19960808","D_Date":"Augest 8, 1996","D_DayOfWeek":"Friday","D_Month":"Augest","D_Year":"1996","D_YearMonthNum":"199608","D_YearMonth":"Aug1996","D_DayNumInWeek":"6","D_DayNumInMonth":"8","D_DayNumInYear":"221","D_MonthNumInYear":"8","D_WeekNumInYear":"32","D_SellingSeason":"Summer","D_LastDayInWeekFl":"0","D_LastDayInMonthFl":"1","D_HolidayFl":"0","D_WeekDayFl":"1","P_PartKey":"178778","P_Name":"turquoise powder","P_MFGR":"MFGR#1","P_Category":"MFGR#11","P_Brand":"MFGR#1125","P_Colour":"beige","P_Type":"STANDARD POLISHED NICKEL","P_Size":"25","P_Container":"JUMBO BAG"} 1 row in set (54.49 sec) mysql> explain select * from json where json_extract(doc,'LO_OrderKey') = 35620 and json_extract(doc,'LO_LineNumber') = 2G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: json type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5233236 Extra: Using where 1 row in set (0.00 sec)

Using the index view to find a document

mysql> select id, doc from json_idx_v where LO_OrderKey = 35620 and LO_LineNumber = 2G *************************** 1. row *************************** id: 1 doc: {"LO_OrderKey":"35620","LO_LineNumber":"2","LO_CustKey":"6748","LO_PartKey":"178778","LO_SuppKey":"1","LO_OrderDateKey":"19960808","LO_OrderPriority":"3-MEDIUM","LO_ShipPriority":"0","LO_Quantity":"38","LO_ExtendedPrice":"7055726","LO_OrdTotalPrice":"14691804","LO_Discount":"8","LO_Revenue":"6491267","LO_SupplyCost":"111406","LO_Tax":"1","LO_CommitDateKey":"19960909","LO_ShipMode":"REG AIR","C_CustomerKey":"6748","C_Name":"Customer#000006748","C_Address":"RSPYBRlR7RX6 f7J8","C_City":"PERU 5","C_Nation":"PERU","C_Region":"AMERICA","C_Phone":"27-580-967-4556","C_MktSegment":"AUTOMOBILE","S_SuppKey":"1","S_Name":"Supplier#000000001","S_Address":"sdrGnXCDRcfriBvY0KL,i","S_City":"PERU 0","S_Nation":"PERU","S_Region":"AMERICA","S_Phone":"27-989-741-2988","D_DateKey":"19960808","D_Date":"Augest 8, 1996","D_DayOfWeek":"Friday","D_Month":"Augest","D_Year":"1996","D_YearMonthNum":"199608","D_YearMonth":"Aug1996","D_DayNumInWeek":"6","D_DayNumInMonth":"8","D_DayNumInYear":"221","D_MonthNumInYear":"8","D_WeekNumInYear":"32","D_SellingSeason":"Summer","D_LastDayInWeekFl":"0","D_LastDayInMonthFl":"1","D_HolidayFl":"0","D_WeekDayFl":"1","P_PartKey":"178778","P_Name":"turquoise powder","P_MFGR":"MFGR#1","P_Category":"MFGR#11","P_Brand":"MFGR#1125","P_Colour":"beige","P_Type":"STANDARD POLISHED NICKEL","P_Size":"25","P_Container":"JUMBO BAG"} 1 row in set (0.00 sec) mysql> explain select id, doc from json_idx_v where LO_OrderKey = 35620 and LO_LineNumber = 2G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: json_idx type: ref possible_keys: LO_LineNumber_Idx,LO_OrderKey_Idx key: LO_OrderKey_Idx key_len: 9 ref: const rows: 4 Extra: Using index condition; Using where *************************** 2. row *************************** id: 1 select_type: SIMPLE table: json type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 8 ref: ssb_json.json_idx.id rows: 1 Extra: NULL 2 rows in set (0.00 sec)

Keeping the index in sync

Flexviews materialized views need to be refreshed when the underlying table changes. Flexviews includes a REFRESH($mvid, $mode, $transaction_id) function.

I am going to remove one document from the table:

mysql> delete from json where id = 10000; Query OK, 1 row affected (0.01 sec)

Note there is now one row in the materialized view change log. dml_type is -1 because it is a delete:

mysql> select * from flexviews.mvlog_f1673fac9814a93508a1c917566ecd4dG *************************** 1. row *************************** dml_type: -1 uow_id: 113 fv$server_id: 33 fv$gsn: 1083 id: 10000 doc: {"LO_OrderKey":"3359521","LO_LineNumber":"2","LO_CustKey":"10306","LO_PartKey":"77997","LO_SuppKey":"4","LO_OrderDateKey":"19951010","LO_OrderPriority":"2-HIGH","LO_ShipPriority":"0","LO_Quantity":"43","LO_ExtendedPrice":"8492457","LO_OrdTotalPrice":"27032802","LO_Discount":"2","LO_Revenue":"8322607","LO_SupplyCost":"118499","LO_Tax":"4","LO_CommitDateKey":"19951228","LO_ShipMode":"FOB","C_CustomerKey":"10306","C_Name":"Customer#000010306","C_Address":"4UR9tz8","C_City":"ROMANIA 5","C_Nation":"ROMANIA","C_Region":"EUROPE","C_Phone":"29-361-986-3513","C_MktSegment":"BUILDING","S_SuppKey":"4","S_Name":"Supplier#000000004","S_Address":"qGTQJXogS83a7MB","S_City":"MOROCCO 4","S_Nation":"MOROCCO","S_Region":"AFRICA","S_Phone":"25-128-190-5944","D_DateKey":"19951010","D_Date":"Octorber 10, 1995","D_DayOfWeek":"Wednesday","D_Month":"Octorber","D_Year":"1995","D_YearMonthNum":"199510","D_YearMonth":"Oct1995","D_DayNumInWeek":"4","D_DayNumInMonth":"10","D_DayNumInYear":"283","D_MonthNumInYear":"10","D_WeekNumInYear":"41","D_SellingSeason":"Fall","D_LastDayInWeekFl":"0","D_LastDayInMonthFl":"1","D_HolidayFl":"0","D_WeekDayFl":"1","P_PartKey":"77997","P_Name":"burnished olive","P_MFGR":"MFGR#2","P_Category":"MFGR#24","P_Brand":"MFGR#2426","P_Colour":"orchid","P_Type":"MEDIUM PLATED TIN","P_Size":"16","P_Container":"WRAP PKG"} 1 row in set (0.01 sec)

Now we can verify the materialized view is out of date:

mysql> select * from json_idx where id = 10000; +----------+------------+---------------+-----------+-----------+-------------+---------------+-------+------------+ | mview$pk | D_DateKey | C_CustomerKey | S_SuppKey | P_PartKey | LO_OrderKey | LO_LineNumber | id | mview$hash | +----------+------------+---------------+-----------+-----------+-------------+---------------+-------+------------+ | 10000 | 1995-10-10 | 10306 | 4 | 77997 | 3359521 | 2 | 10000 | 2937185172 | +----------+------------+---------------+-----------+-----------+-------------+---------------+-------+------------+ 1 row in set (2.60 sec)

To bring the index up to date we must refresh it. Usually you will use the ‘BOTH’ mode to ‘COMPUTE’ and ‘APPLY’ the changes at the same time, but I am going to use COMPUTE mode to show you what ends up in the delta table:

mysql> select * from json_idx_deltaG *************************** 1. row *************************** dml_type: -1 uow_id: 113 fv$gsn: 1083 D_DateKey: 1995-10-10 C_CustomerKey: 10306 S_SuppKey: 4 P_PartKey: 77997 LO_OrderKey: 3359521 LO_LineNumber: 2 id: 10000 mview$hash: 2937185172 1 row in set (0.00 sec)

Delta tables are similar to materialized view change log tables, except they contain insertions and deletions to the view contents. In this case, you can see dml_type is -1 and id = 10000, so the row from the view corresponding to the row we deleted, will be deleted when the change is applied.

Finally the change can be applied:

mysql> call flexviews.refresh(flexviews.get_id('ssb_json','json_idx'), 'APPLY',NULL); Query OK, 2 rows affected (0.47 sec) mysql> select * from json_idx where id = 10000; -- note, we didn't index id in the MV Empty set (2.61 sec)

Finally, it makes sense to try to keep the index in sync as quickly as possible using a MySQL event:

DELIMITER ;; CREATE EVENT IF NOT EXISTS flexviews.refresh_json_idx ON SCHEDULE EVERY 1 SECOND DO BEGIN DECLARE v_got_lock tinyint default 0; SELECT GET_LOCK('JSON_IDX_LOCK', 0) INTO v_got_lock; IF v_got_lock = 1 THEN CALL flexviews.refresh(flexviews.get_id('ssb_json','json_idx'),'BOTH',NULL); SELECT RELEASE_LOCK('JSON_IDX_LOCK') INTO @discard; END IF; END;; DELIMITER ;

So there you have it. A way to index and quickly search through JSON documents and keep the index in sync automatically.

The post ‘Indexing’ JSON documents for efficient MySQL queries over JSON data appeared first on MySQL Performance Blog.

Percona XtraDB Cluster 5.6: a tale of 2 GTIDs

February 13, 2015 - 6:27am

Say you have a cluster with 3 nodes using Percona XtraDB Cluster (PXC) 5.6 and one asynchronous replica connected to node1. If asynchronous replication is using GTIDs, moving the replica so that it is connected to node2 is trivial, right? Actually replication can easily break for reasons that may not be obvious at first sight.

Summary

Let’s assume we have the following setup with 3 PXC nodes and one asynchronous replica:


Regarding MySQL GTIDs, a Galera cluster behaves like a distributed master: transactions coming from any node will use the same auto-generated uuid. This auto-generated uuid is related to the Galera uuid, it’s neither ABC, nor DEF, nor GHI.

Transactions executed for instance on node1 but not replicated to all nodes with Galera replication will generate a GTID using the uuid of the node (ABC). This can happen for writes on MyISAM tables if wsrep_replicate_myisam is not enabled.

Such local transactions bring the same potential issues as errant transactions do for a regular master-slave setup using GTID-based replication: if node3 has a local transaction, when you connect replica1 to it, replication may break instantly.

So do not assume that moving replica1 from node2 to node3 is a safe operation if you don’t check errant transactions on node3 first.

And if you find errant transactions that you don’t want to get replicated to replica1, there is only one good fix: insert a corresponding empty transaction on replica1.

Galera GTID vs MySQL GTID

Both kinds of GTIDs are using the same format: <source_id:trx_number>.

For Galera, <source_id> is generated when the cluster is bootstrapped. This <source_id> is shared by all nodes.

For MySQL, <source_id> is the server uuid. So it is easy to identify from which server a transaction originates.

Knowing the Galera GTID of a transaction will give you no clue about the corresponding MySQL GTID of the same transaction, and vice versa. You should simply consider them as separate identifiers.

MySQL GTID generation when writing to the cluster

What can be surprising is that writing to node1 will generate a MySQL GTID where <source_id> is not the server uuid:

node1> select @@server_uuid; +--------------------------------------+ | @@server_uuid | +--------------------------------------+ | 03c236a0-f860-11e3-9b80-9cebe8067a3f | +--------------------------------------+ node1> select @@global.gtid_executed; +------------------------------------------+ | @@global.gtid_executed | +------------------------------------------+ | b1f3f33a-0789-ee1c-43f3-f8373e12f1ea:1 | +------------------------------------------+

Even more surprising is that if you write to node2, you will see a single GTID set as if both transactions had been executed on the same server:

node2> select @@global.gtid_executed; +------------------------------------------+ | @@global.gtid_executed | +------------------------------------------+ | b1f3f33a-0789-ee1c-43f3-f8373e12f1ea:2 | +------------------------------------------+

Actually this is reasonable: the cluster acts as a distributed master regarding MySQL replication, so it makes sense that all nodes share the same <source_id>.

And by the way, if you are puzzled about how this ‘anonymous’ <source_id> is generated, look at this:

mysql> show global status like 'wsrep_local_state_uuid'; +------------------------+--------------------------------------+ | Variable_name | Value | +------------------------+--------------------------------------+ | wsrep_local_state_uuid | 4e0c0cc5-f876-11e3-bc0c-07c8c1ed0e15 | +------------------------+--------------------------------------+ node1> select @@global.gtid_executed; +------------------------------------------+ | @@global.gtid_executed | +------------------------------------------+ | b1f3f33a-0789-ee1c-43f3-f8373e12f1ea:1 | +------------------------------------------+

If you ‘sum’ both <source_id>, you will get ffffffff-ffff-ffff-ffff-ffffffffffff.

How can local transactions show up?

Now the question is: given that any transaction executed on any node of the cluster is replicated to all nodes, how can a local transaction (a transaction only found on one node) appear?

The most common reason is probably a write on a MyISAM table if wsrep_replicate_myisam is not enabled, simply because writes on MyISAM tables are not replicated by Galera by default:

# Node1 mysql> insert into myisam_table (id) values (1); mysql> select @@global.gtid_executed; +----------------------------------------------------------------------------------+ | @@global.gtid_executed | +----------------------------------------------------------------------------------+ | 03c236a0-f860-11e3-9b80-9cebe8067a3f:1, b1f3f33a-0789-ee1c-43f3-f8373e12f1ea:1-8 | +----------------------------------------------------------------------------------+ # Node2 mysql> select @@global.gtid_executed; +------------------------------------------+ | @@global.gtid_executed | +------------------------------------------+ | b1f3f33a-0789-ee1c-43f3-f8373e12f1ea:1-8 | +------------------------------------------+

As you can see the GTID of the local transaction on node1 uses the uuid of node1, which makes it easy to spot.

Are there other statements that can create a local transaction? Actually we found that this is also true for FLUSH PRIVILEGES, which is tracked in the binlogs but not replicated by Galera. See the bug report.

As you probably know, these local transactions can hurt when you connect an async replica on a node with a local transaction: the replication protocol used by GTID-based replication will make sure that the local transaction will be fetched from the binlogs and executed on the async slave. But of course, if the transaction is no longer in the binlogs, that triggers a replication error!

Conclusion

I will repeat here what I always say about MySQL GTIDs: it is a great feature but replication works a bit differently from regular position-based replication. So make sure you understand the main differences otherwise it is quite easy to be confused.

The post Percona XtraDB Cluster 5.6: a tale of 2 GTIDs appeared first on MySQL Performance Blog.

Online GTID rollout now available in Percona Server 5.6

February 10, 2015 - 12:00am

Global Transaction IDs (GTIDs) are one of my favorite features of MySQL 5.6. The main limitation is that you must stop all the servers at the same time to allow GTID-replication. Not everyone can afford to take a downtime so this requirement has been a showstopper for many people. Starting with Percona Server 5.6.22-72.0 enabling GTID replication can be done without almost no downtime. Let’s see how to do it.

Implementation of the Facebook patch

Finding a solution to migrate to GTIDs with no downtime is not a new idea, and several companies have already developed their own patch. The 2 best known implementations are the one from Facebook and the one from Booking.com.

Both options have pros and cons, and we finally chose to port the Facebook patch and add a new setting (gtid_deployment_step).

Performing the migration

Let’s assume we have a master-slaves setup with 4 servers A, B, C and D. A is the master:


The 1st step is to take each slave out of rotation, one at a time, and set gtid_mode = ON and gtid_deployment_step = ON (and also log_bin, log_slave_updates and enforce_gtid_consistency).


gtid_deployment_step = ON means that a server will not generate GTIDs when it executes writes, but it will record a GTID in its binary log if it gets an event from the replication stream tagged with a GTID.

The 2nd step is to promote one of the slaves to become the new master (for instance C) and to disable gtid_deployment_step. It is a regular slave promotion so you should do it the same way you deal with planned slave promotions (for instance using MHA or your own scripts). Our patch doesn’t help you do this promotion.

At this point replication will break on the old master as it has gtid_mode = OFF and gtid_deployment_step = OFF.


Don’t forget that you need to use CHANGE MASTER TO MASTER_AUTO_POSITION = 1 to enable GTID-based replication.

The 3rd step is to restart the old master to set gtid_mode = ON. Replication will resume automatically, but don’t forget to set MASTER_AUTO_POSITION = 1.


The final step is to disable gtid_deployment_step on all slaves. This can be done dynamically:

mysql> SET GLOBAL gtid_deployment_step = OFF;

and you should remove the setting from the my.cnf file so that it is not set again when the server is restarted.

Optionally, you can promote the old master back to its original role.

That’s it, GTID replication is now available without having restarted all servers at the same time!

Limitations

At some point during the migration, a slave promotion is needed. And at this point, you are still using position-based replication. The patch will not help you with this promotion so use your regular failover scripts. If you have no scripts to deal with that kind of situation, make sure you know how to proceed.

Also be aware that this patch provides a way to migrate to GTIDs with no downtime, but not a way to migrate away from GTIDs with no downtime. So test carefully and make sure you understand all the new stuff that comes with GTIDs, like the new replication protocol, or how to skip transactions.

Other topologies

If you are using master-master replication or multiple tier replication, you can follow the same steps. With multiple tier replication, simply start by setting gtid_mode = ON and gtid_deployment_step = ON for the leaves first.

Conclusion

If you’re interested by the benefits of GTID replication but if taking a downtime has always scared you, you should definitely download the latest Percona Server 5.6 and give it a try!

The post Online GTID rollout now available in Percona Server 5.6 appeared first on MySQL Performance Blog.

Percona Server 5.6.22-72.0 is now available

February 6, 2015 - 9:04am

Percona is glad to announce the release of Percona Server 5.6.22-72.0 on February 6, 2015. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.6.22, including all the bug fixes in it, Percona Server 5.6.22-72.0 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – and this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release can be found in the 5.6.22-72.0 milestone on Launchpad.

New Features:

  • Percona Server is now able to log the query’s response times into separate READ and WRITE INFORMATION_SCHEMA tables. Two new INFORMATION_SCHEMA tables QUERY_RESPONSE_TIME_READ and QUERY_RESPONSE_TIME_WRITE have been implemented for READ and WRITE queries correspondingly.
  • Percona Server now supports Online GTID deployment. This enables GTID to be deployed on existing replication setups without making the master read-only and stopping all the slaves. This feature was ported from the Facebook branch.
  • New ps_tokudb_admin script has been implemented to make the TokuDB storage engine installation easier.

Bugs Fixed:

  • SET STATEMENT ... FOR would crash the server if it could not execute the due to: 1) if the was Read-Write in a Read-Only transaction (bug #1387951), 2) if the needed to re-open an already open temporary table and would fail to do so (bug #1412423), 3) if the needed to commit implicitly the ongoing transaction and the implicit commit would fail (bug #1418049).
  • TokuDB storage engine would fail to load after the upgrade on CentOS 5 and 6. Bug fixed #1413956.
  • Fixed a potential low-probability crash in XtraDB linear read-ahead code. Bug fixed #1417953.
  • Setting the max_statement_time per query had no effect. Bug fixed #1376934.

Other bugs fixed: #1407941, and #1415843 (upstream #75642)

Release notes for Percona Server 5.6.22-72.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker

The post Percona Server 5.6.22-72.0 is now available appeared first on MySQL Performance Blog.

The future of MySQL quality assurance: Introducing pquery

February 4, 2015 - 12:00am

Being a QA Engineer, how would you feel if you had access to a framework which can generate 80+ crashes – a mix of hitting developer introduced assertions (situations that should not happen), and serious unforeseen binary crashes – for the world’s most popular open source database software – each and ever hour? What if you could do this running on a medium spec machine – even a laptop?

The seniors amongst you may object “But… generating a crash or assertion is one thing – creating a repeatable testcase for the same is quite another.”

Introducing pquery, mtr_to_sql, reducer.sh (the pquery-enabled version), and more:

80+ coredumps per hour. Fully automatic testcase creation. Near-100% testcase reproducibility. C++ core. 15 Seconds run time per trial. Up to 20-25k lines of SQL executed per trial. CLI testcases. Compatible with sporadic issues. High-end automation of many aspects.

It all started when we realized how slow RQG really is. The Random Query Generator by Philip Stoev is a fantastic tool, and it has been greatly expanded over the years, but though it is a Swiss army knife in what you can do with it, when it comes to speed it is not the fastest. The Perl backend – combined with much random-ness – has made the system slow. At least when compared with the ultra-fast execution of C++ code.

I discussed things with Alexey Bychko, and before long pquery was born. The C++ core code of pquery is Alexey’s creation. It easily executes 20k random lines of SQL in 15-20 seconds, with full logging (including errors) turned on. Though the tool looks fairly simple when reviewing the code, it is incredibly powerful.

Now, one thing which people being introduced to QA for MySQL (or any other large software with many features/functions/options etc.) have to grasp is “random spread testing”. If your random spread (and the amount of samples) is large enough (read: ‘sufficient’), it is relatively easy to get a good “overall quality” estimate of your software by doing a few runs (taking usually around 1-3 days – or longer if various options are being tested).

So,we now had speed (pquery) and near-perfect testcase simplification/reduction & reproducibility (the new pquery adaption of reducer.sh) – but we needed one thing more: SQL which would cover… well… every area of mysqld. A fast framework without matching grammars is not worth much…

Converting the grammars from RQG seemed like a mammoth task – and we would really just be re-writing RQG in C. And creating a new SQL generator was an almost hopeless venture (I gave it a short try) – given the huge variety and complexity when constructing SQL statements.

I took the cheeky road. And it paid off. mtr_to_sql.sh was born.

The MTR testcases included (and delivered) with the MySQL server download contain individual tests for nearly every possible SQL syntax possible, including ones that – ought to – error out (but not crash). Not only that, there are specific MTR testcases for each feature, not to mention the many MTR testcases that were added to cover bug regression testing. Where there is smoke…

[roel@localhost 5.6]$ pwd /bzr/5.6 [roel@localhost 5.6]$ find . | grep ".test$" | wc -l 3867 [roel@localhost 5.6]$ cat ~/percona-qa/mtr_to_sql.sh | grep BZR_PATH | head -n1 BZR_PATH="/bzr/5.6/mysql-test" [roel@localhost 5.6]$ time ~/percona-qa/mtr_to_sql.sh Done! Generated /tmp/mtr_to_sql.yy for RQG, and /tmp/mtr_to_sql.sql for pquery (SQL is indentical in both, but first is formatted for use with RQG) real 0m20.150s user 1m2.951s sys 0m1.214s [roel@localhost 5.6]$ cat /tmp/mtr_to_sql.sql | wc -l 107541

107.5K of high-quality SQL. Covering every possible functionality and buggy area out there. Free.

Let the testing begin!

I was quite dumbfounded when (after further tuning and scripting) we started seeing 50+, then 80+ cores per hour. Sig11’s (crashes), Sig6’s (asserts), server hangs, character set issues, error log errors and so. Many crashes and issues in optimized code. Fun.

Our best weeks yet?

Pquery update: 220+ bugs logged, of which 140+ in MySQL, 60+ in Percona, 10+ in TokuTek (with limited TokuTek testing) #mysql #percona #qa

— Roel Van de Paar (@RoelVandePaar) November 21, 2014

Last week @rameshvs02 & @RoelVandePaar logged 70+ bugs. Today reducer.sh was updated w/ pquery functionality. Testcases on their way! #mysql

— Roel Van de Paar (@RoelVandePaar) October 27, 2014

All of the above can be done on commodity hardware, running a single server, running single-threaded SQL (single client) and with no special mysqld options activated.

Compare this to RQG. Even with combinations.pl running hundreds if not thousands of mysqld — option combinations, and with nearly-everything-is-random-sql, it still comes nowhere near even one tenth of that number/speed. And this is even when using high-end hardware, 8 simultaneous mysqld servers, up to 20-25 client threads and at times special grammar-mix tools like MaxiGen etc.

In preparation for the Twitter week mentioned above we started running 4-5 simultaneous pquery run’s (5x mysqld, still all single threaded; a single client per mysqld) in different shell screen sessions, controlled by cron jobs.

A whole set of automation scripts were quickly added to handle the huge influx in bugs (you can get all for free (GPLv2) at $bzr branch lp:percona-qa – see pquery*.sh files), and now you can quickly review a list of issues pquery discovered. For writing this article, I started a run and in it’s first hour it found exactly 85 crashes. Here is a report from around ~2h;

[roel@localhost 830147]$ ~/percona-qa/pquery-results.sh ================ Sorted unique issue strings (Approx 439 trials executed, 167 remaining reducer scripts) btr0btr.ic line 143 (Seen 31 times: reducers 123,124,135,150,159,164,16,173,175,178,179,18,196,199,224,22,238,245,286,310,319,324,366,388,38,401,67,73,78,88,9) btr0btr.ic line 169 (Seen 1 times: reducers 158) btr0cur.cc line 769 (Seen 1 times: reducers 304) buf0buf.cc line 2738 (Seen 2 times: reducers 113,257) fsp0fsp.cc line 1899 (Seen 5 times: reducers 145,174,409,69,85) . is_set (Seen 32 times: reducers 112,165,170,192,203,218,231,249,24,253,259,273,278,280,289,329,331,333,336,338,363,371,373,379,384,398,404,44,47,6,72,82) .length % 4 (Seen 4 times: reducers 169,220,307,80) m_lock .= __null .... thd->mdl_context.is_lock_owner.m_namespace, ...., ...., MDL_SHARED(Seen 3 times: reducers 297,403,86) row0quiesce.cc line 683 (Seen 1 times: reducers 97) row0umod.cc line 338 (Seen 1 times: reducers 357) .slen % 2 (Seen 21 times: reducers 106,122,131,144,221,250,251,252,275,282,296,316,318,32,359,375,39,405,407,43,46) .slen % 4 (Seen 5 times: reducers 103,382,76,7,81) sort_field->length >= length (Seen 1 times: reducers 138) timer == __null (Seen 36 times: reducers 133,139,149,160,161,181,183,184,185,20,212,227,229,234,244,260,266,274,292,294,295,298,301,308,326,327,330,343,346,364,367,400,48,50,59,99) .tlen % 2 (Seen 8 times: reducers 117,119,200,205,213,217,285,35) .tlen % 4 (Seen 3 times: reducers 25,355,365) trx0roll.cc line 264 (Seen 1 times: reducers 40) Z21mysql_execute_commandP3THD (Seen 4 times: reducers 182,237,291,393) ZN13Bounded_queueIhhE4pushEPh (Seen 3 times: reducers 101,118,214) ZN8Protocol13end_statementEv (Seen 4 times: reducers 211,410,42,61) ================

For these (standard by now) pquery runs, we use pquery-run.sh. It starts a server, executes and monitors the pquery binary, and then checks on the outcome:

[roel@localhost percona-qa]$ ./pquery-run.sh [07:23:21] [0] Workdir: /sda/459689 | Rundir: /dev/shm/459689 | Basedir: /sda/Percona-Server-5.6.21-rel69.0-687.Linux.x86_64-debug [07:23:21] [0] mysqld Start Timeout: 60 | Client Threads: 1 | Queries/Thread: 25000 | Trials: 100000 | Save coredump trials only: TRUE [07:23:21] [0] Pquery timeout: 15 | SQL file used: /home/roel/percona-qa/pquery/main-new.sql [07:23:21] [0] MYSAFE: --maximum-bulk_insert_buffer_size=1M --maximum-join_buffer_size=1M --maximum-max_heap_table_size=1M --maximum-max_join_size=1M --maximum-myisam_max_sort_file_size=1M --maximum-myisam_mmap_size=1M --maximum-myisam_sort_buffer_size=1M --maximum-optimizer_trace_max_mem_size=1M --maximum-preload_buffer_size=1M --maximum-query_alloc_block_size=1M --maximum-query_prealloc_size=1M --maximum-range_alloc_block_size=1M --maximum-read_buffer_size=1M --maximum-read_rnd_buffer_size=1M --maximum-sort_buffer_size=1M --maximum-tmp_table_size=1M --maximum-transaction_alloc_block_size=1M --maximum-transaction_prealloc_size=1M --log-output=none --sql_mode=ONLY_FULL_GROUP_BY [07:23:21] [0] Archiving a copy of this script (/home/roel/percona-qa/pquery-run.sh) in the workdir (/sda/459689) for later reference... [07:23:21] [0] Archiving a copy of the infile used for this run (/home/roel/percona-qa/pquery/main-new.sql) in the workdir (/sda/459689) for later reference... [07:23:21] [0] Generating datadir template (using mysql_install_db)... [07:23:29] [0] Making a copy of mysqld used to /sda/459689/mysqld (handy for core file analysis and manual bundle creation)... [07:23:29] [0] Storing a copy of ldd files for mysqld in same directory also... [07:23:29] [0] Starting pquery testing iterations... [07:23:29] [0] ====== TRIAL #1 ====== [07:23:29] [0] Ensuring there are no relevant servers running... [07:23:29] [0] Clearing rundir... [07:23:29] [0] Generating new workdir /dev/shm/459689/1... [07:23:29] [0] Copying datadir from template... [07:23:29] [0] Starting mysqld. Error log is stored at /dev/shm/459689/1/log/master.err [07:23:29] [0] Waiting for mysqld (pid: 10879) to fully start... [07:23:31] [0] Server started ok. Client: /sda/Percona-Server-5.6.21-rel69.0-687.Linux.x86_64-debug/bin/mysql -uroot -S/dev/shm/459689/1/socket.sock [07:23:31] [0] Starting pquery (log stored in /dev/shm/459689/1/pquery.log)... [07:23:31] [0] pquery running... ./pquery-run.sh: line 150: 10879 Aborted (core dumped) $CMD > ${RUNDIR}/${TRIAL}/log/master.err 2>&1 <---- success! [07:23:32] [0] Cleaning up... [07:23:36] [0] pquery summary: 578/769 queries failed (24.84% were successful) [07:23:36] [0] mysqld core detected at /dev/shm/459689/1/data/core.10879.1000.1000.6.1414700611.mysqld [07:23:36] [1] Copying rundir from /dev/shm/459689/1 to /sda/459689/1 [07:23:36] [1] ====== TRIAL #2 ====== [...]

(The MYSAFE settings are some generic settings specifically suited for QA tested – kindly provided by Shane Bester)

Within the scripts many QA-goodies are already built-in: automated gdb query extraction from the core and the error log (each added 3x to the end of the sql trace to ensure maximum reproducibility), unique issue classification using bug-relevant strings, /dev/shm execution for optimal speed, etc. – it is all based/build on many years of mysqld QA experience.

If you can’t wait to spin off some I-crashed-mysqld (it’s easy you know…) bugs of your own, download the tools at lp:percona-qa ($bzr branch lp:percona-qa) and checkout the many pquery* scripts. Some shown in use above.

And, when you’re ready for slightly more serious feature testing – add whatever statements (+matching mysqld options/parameters) your feature uses to the sql file (or even replace it), plus the relevant mysqld options to pquery-run.sh (MYEXTRA string) (You can use sql-interleave.sh/sql-interleave-switch.sh to interleave new sql into the main sql file(s), available in the pquery/ directory of the percona-qa branch mentioned above). Soon you will see fireworks.

Enjoy the show!

The post The future of MySQL quality assurance: Introducing pquery appeared first on MySQL Performance Blog.

Percona Live 2015 Lightning Talks, BoF submission deadline Feb. 13! And introducing “MySQL 101″ program

February 3, 2015 - 5:00am

It’s hard to believe that the Percona Live MySQL Conference and Expo is just over two months away (April 13-16 in Santa Clara, California). So if you’ve been thinking about submitting a proposal for the popular “Lightning Talks” and/or “Birds of a Feather” sessions, it’s time to get moving because the deadline to do so if February 13.

Lightning Talks provide an opportunity for attendees to propose, explain, exhort, or rant on any MySQL-related topic for five minutes. Topics might include a new idea, successful project, cautionary story, quick tip, or demonstration. All submissions will be reviewed, and the top 10 will be selected to present during the one-hour Lightning Talks session on Wednesday (April 15) during the Community Networking Reception. Lighthearted, fun or otherwise entertaining submissions are highly welcome. Submit your proposal here.

“MySQL 101″ is coming to Percona Live 2015.

Birds of a Feather (BoF) sessions enable attendees with interests in the same project or topic to enjoy some quality face time. BoFs can be organized for individual projects or broader topics (e.g., best practices, open data, standards). Any attendee or conference speaker can propose and moderate an engaging BoF. Percona will post the selected topics and moderators online and provide a meeting space and time. The BoF sessions will be held Tuesday night, (April 14) from 6- 7 p.m. Submit your BoF proposal here.

This year we’re also adding a new program for MySQL “newbies.” It’s called “MySQL 101,” and the motto of this special two-day program is: “You send us developers and admins, and we will send you back MySQL DBAs.” The two days of practical training will include everything they need to know to handle day-to-day MySQL DBA tasks.

“MySQL 101,” which is not included in regular Percona Live registration, will cost $400. However, the first 101 tickets are just $101 if you use the promo code “101” during checkout.

New: 25-Minute Sessions
On the first day of the conference, Percona is now offering 25-minute talks that pack tons of great information into a shorter format to allow for a wider range of topics. The 25-minute sessions include:

I also wanted to give another shout-out to Percona Live 2015’s awesome sponsor, which include: VMware, Yahoo, Deep Information Sciences, Pythian, Codership, Machine Zone, Box, Yelp, MariaDB, SpringbokSQL, Tesora, BlackMesh, SolidFire, Severalnines, Tokutek, VividCortex, FoundationDB, ScaleArc, Walmart eCommerce and more.(Sponsorship opportunities are still available.)

The great thing about Percona Live conferences is that there is something for everyone within the MySQL ecosystem – veterans and newcomers alike. And for the first time this year, that community expands to encompass OpenStack. Percona Live attendees can also attend OpenStack Live events. Those events run April 13-14, also at the Hyatt Regency Santa Clara and Santa Clara Convention Center.

OpenStack Live 2015’s awesome sponsors include: PMC Sierra and Nimble Storage!

With so much to offer this year, this is why there are several more options in terms of tickets. Click the image below for a detailed view of what access each ticket type provides.

Register here for the Percona Live MySQL Conference and Expo.

Register here for the OpenStack Live Conference and Expo.

For full conference schedule details please visit the Percona Live MySQL Conference and Expo website and the OpenStack Live Conference Website!

I hope to see you in Santa Clara in a couple months!

 

The post Percona Live 2015 Lightning Talks, BoF submission deadline Feb. 13! And introducing “MySQL 101″ program appeared first on MySQL Performance Blog.

Faster fingerprints and Go packages for MySQL

February 2, 2015 - 7:31am

I’m happy to announce Go packages for MySQL. Particularly exciting is a new query fingerprint function which is very fast and efficient, but I’ll talk about that later. First, go-mysql is currently three simple Go packages for parsing and aggregating MySQL slow logs. If you’ve been following Percona development, you’ve no doubt heard of Percona Cloud Tools (PCT), a somewhat new performance management web service for MySQL.

One tool in PCT is “Query Analytics” which continuously analyzes query metrics from the slow log. The slow log provides the most metrics and therefore the most performance insight into MySQL. percona-agent, the open-source agent for PCT, uses go-mysql to parse and analyze the slow log, so the code has both heavy formal testing and heavy real-world testing. If you’re working with Go, MySQL, and MySQL slow logs, we invite you to try go-mysql.

Last October we implemented a completely new query fingerprint function. (See “Fingerprints” in the pt-query-digest doc for a background on query fingerprints.) Since mydumpslow, the very first slow log parser circa 2000, fingerprints have been accomplished with regular expressions. This approach is normally fine, but percona-agent needs to be faster and more efficient than normal to reduce the cost of observation. Regex patterns are like little state machines. One regex can be very fast, but several are required to produce a good fingerprint. Therefore, the regex approach requires processing the same query several times to produce a fingerprint. Even worse: a regex can backtrack which means a single logical pass through the query can result in several physical passes. In short: regular expressions are a quick and easy solution, but they are very inefficient.

Several years ago, a former colleague suggested a different approach: a single pass, purpose-built, character-level state machine. The resulting code is rather complicated, but the resulting performance is a tremendous improvement: 3-5x faster in informal benchmarks on my machine, and it handles more edge cases. In simplest terms: the new fingerprint function does more with less, which makes percona-agent and Query Analytics better.

Check out  github.com/percona/go-mysql, browse the percona-agent source code if you wish, and try Percona Cloud Tools for free.

The post Faster fingerprints and Go packages for MySQL appeared first on MySQL Performance Blog.

OpenStack Live 2015: FAQs on the who, what, where, when, why & how

January 30, 2015 - 5:00am

This April 13-14 Percona is introducing an annual conference called OpenStack Live. I’ve seen a few questions about the new event so decided to help clarify what this show is about and who should attend.

Unlike OpenStack Summits, held twice a year and dedicated to primarily to developers, OpenStack Live is an opportunity for OpenStack evaluators and users of all levels to learn from experts on topics such as how to deploy, optimize, and manage OpenStack and the role of MySQL as a crucial technology in this free and open-source cloud computing software platform. A full day of hands-on tutorials will also focus on making OpenStack users more productive and confident in this emerging technology.

Still confused about OpenStack Live 2015? Fear not! Here are the answers to commonly asked questions.

Q: Who should attend?
A: You should attend…

  • if you are currently using OpenStack and want to improve your skills and knowledge
  • if you are evaluating or considering using it.
  • if you are a solutions provider – this is your opportunity to show the world your contributions and services

Q: Percona Live has a conference committee. Does OpenStack Live have one, too?
A: Yes and it’s a completely different committee comprised of:

  • Mark Atwood, Director of Open Source Engagement at HP (Conference Chairman)
  • Rich Bowen, OpenStack Community Liaison at Red Hat
  • Jason Rouault, Senior Director OpenStack Cloud at Time Warner Cable
  • Peter Boros, Principal Architect at Percona

Q: Are the tutorials really “hands-on”?
A: Yes and most are at least 3-hours long. So you’ll need your laptop and power cord. Here’s a look at all of the OpenStack tutorials.

Q: How meaty are the sessions?
A: Very meaty indeed! Here’s a sample:

Q: I am going to attend the Percona Live MySQL Conference and Expo. Will my pass also include OpenStack Live 2015?
A: Yes, your Percona Live pass will be honored at the OpenStack Live conference. OpenStack Live attendees will also have access to the Percona Live/OpenStack Live Exhibit hall, keynotes, receptions and FUN activities April 13 and 16, allowing them to dive deeper into MySQL topics such as high availability, security, performance optimization, and much more. However, the OpenStack Live pass does not allow access to Percona Live breakout sessions or tutorials.

Q: Where can I register?
A: You can register here and take advantage of Early Bird discounts but those end Feb. 1 at 11:30 p.m. PST, so hurry!

The post OpenStack Live 2015: FAQs on the who, what, where, when, why & how appeared first on MySQL Performance Blog.

GHOST vulnerability (CVE-2015-0235) Percona response

January 29, 2015 - 6:40am

Cloud security company Qualys announced Tuesday the issues prevalent in glibc since version 2.2 introduced in 2000-11-10 (the complete Qualys announcement may be viewed here). The vulnerability, CVE-2015-0235, has been dubbed “GHOST.”

As the announcement from Qualys indicates, it is believed that MySQL and by extension Percona Server are not affected by this issue.

Percona is in the process of conducting our own review into the issue related to the Percona Server source code – more information will be released as soon as it is available.

In the interim the current advisory is to update your glibc packages for your distributions if they are in fact vulnerable. The C code from the Qualys announcement may aid in your diagnostics, section 4 of this document or via this gist. I also wrote a very quick python script to help identify processes which may be running libc that you can access here.

Compiling the above and executing it will yield an output indicating if your glibc version is believed to be vulnerable or not vulnerable.

Distribution Resource Resource Links
    1. RedHat BZ: https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2015-0235
    2. RedHat EL5 Errata: https://rhn.redhat.com/errata/RHSA-2015-0090.html
    3. RedHat EL6 / 7 Errata: https://rhn.redhat.com/errata/RHSA-2015-0092.html
    4. Ubuntu USN: http://www.ubuntu.com/usn/usn-2485-1/ (affects 10.04 12.04)
    5. Debian security tracker: https://security-tracker.debian.org/tracker/CVE-2015-0235

Distributions which use musl-libc (http://www.musl-libc.org/) are not affected by this issue.

Acknowledgements

Qualys

Robert Barabas – Percona
Raghavendra Prabhu – Percona
Laura Byrnes – Percona

The post GHOST vulnerability (CVE-2015-0235) Percona response appeared first on MySQL Performance Blog.

Multi-threaded replication with MySQL 5.6: Use GTIDs!

January 29, 2015 - 12:00am

MySQL 5.6 allows you to execute replicated events in parallel as long as data is split across several databases. This feature is named “Multi-Threaded Slave” (MTS) and it is easy to enable by setting slave_parallel_workers to a > 1 value. However if you decide to use MTS without GTIDs, you may run into annoying issues. Let’s look at two of them.

Skipping replication errors

When replication stops with an error, a frequent approach is to “ignore now and fix later.” This means you will run SET GLOBAL sql_slave_skip_counter=1 to be able to restart replication as quickly as possible and later use pt-table-checksum/pt-table-sync to resync data on the slave.

Then the day when I hit:

mysql> show slave status; [...] Last_SQL_Error: Worker 0 failed executing transaction '' at master log mysql-bin.000017, end_log_pos 1216451; Error 'Duplicate entry '1001' for key 'PRIMARY'' on query. Default database: 'db1'. Query: 'INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 5320, '49123511666-22272014664-85739796464-62261637750-57593947547-00947134109-73607171516-11063345053-55659776318-82888369235', '11400300639-05875856680-20973514928-29618434959-69429576205')' Exec_Master_Log_Pos: 1005432

I tried to use the trick:

mysql> set global sql_slave_skip_counter=1; mysql> start slave;

But:

mysql> show slave status; [...] Last_SQL_Error: Worker 0 failed executing transaction '' at master log mysql-bin.000017, end_log_pos 1216451; Error 'Duplicate entry '1001' for key 'PRIMARY'' on query. Default database: 'db1'. Query: 'INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 5320, '49123511666-22272014664-85739796464-62261637750-57593947547-00947134109-73607171516-11063345053-55659776318-82888369235', '11400300639-05875856680-20973514928-29618434959-69429576205')' Exec_Master_Log_Pos: 1005882

Note that the position reported with Exec_Master_Log_Pos has moved forward, but I still have my duplicate key error. What’s wrong?

The issue is that the positions reported by SHOW SLAVE STATUS are misleading when using MTS. Quoting the documentation about Exec_Master_Log_Pos:

When using a multi-threaded slave (by setting slave_parallel_workers to a nonzero value in MySQL 5.6.3 and later), the value in this column actually represents a “low-water” mark, before which no uncommitted transactions remain. Because the current implementation allows execution of transactions on different databases in a different order on the slave than on the master, this is not necessarily the position of the most recently executed transaction.

So the solution to my problem is first to make sure that there is no execution gap, and only then to skip the offending event. There is a specific statement for the first part:

mysql> start slave until sql_after_mts_gaps;

And now I can finally skip the error and restart replication:

mysql> set global sql_slave_skip_counter=1; mysql> start slave; mysql> show slave statusG Slave_IO_Running: Yes Slave_SQL_Running: Yes

The last thing to do is of course to resync the slave.

Backups

If you cannot trust the output of SHOW SLAVE STATUS to get the current binlog position, it means that taking a backup from a slave with parallel replication is tricky.

For instance, if you run mysqldump --dump-slave=2 to get the binlog position of the master, mysqldump will first run STOP SLAVE and then SHOW SLAVE STATUS. Is stopping the slave sufficient to avoid execution gaps? Actually, no.

The only option then seems to be: run STOP SLAVE followed by START SLAVE UNTIL SQL_AFTER_MTS_GAPS, followed by mysqldump while replication is stopped. Not very handy!

GTIDs to the rescue!

The solution for both issues is to use GTIDs.

They help when you want to skip an event because when using GTIDs, you must explicitly specify the transaction you will be skipping. It doesn’t matter whether there are execution holes.

They also help for backups because mysqldump takes the position from gtid_executed which is updated at each transaction commit (XtraBackup does that too).

Conclusion

If your application uses several databases and if you’re fighting with replication lag, MTS can be a great feature for you. But although GTIDs are not technically necessary, you’ll be exposed to tricky situations if you don’t use them.

Is everything rosy when using both GTIDs and MTS? Not exactly… But that will be the topic for a separate post!

By the way, if you are in the Brussels area this weekend, come see me and other great speakers at the MySQL and friends devroom at FOSDEM!

The post Multi-threaded replication with MySQL 5.6: Use GTIDs! appeared first on MySQL Performance Blog.

Percona University: Back to school Feb. 12 in Raleigh, N.C.

January 27, 2015 - 11:01am

Percona CEO Peter Zaitsev leads a track at the inaugural Percona University event in Raleigh, N.C. on Jan. 29, 2013.

About two years ago we held our first-ever Percona University event in Raleigh, N.C. It was a great success with high attendance and very positive feedback which led us to organize a number of similar educational events in different locations around the world.

And next month we’ll be back where it all started. On February 12, Percona University comes to Raleigh – and this time the full-day educational event will be much more cool. What have we changed? Take a look at the agenda.

First - this is no longer just a MySQL-focused event. While 10 years ago MySQL was the default, dominating choice for modern companies looking to store and process data effectively – this is no longer the case. And as such the event’s theme is “Smart Data.” In addition to MySQL, Percona and MariaDB technologies (which you would expect to be covered), we have talks about Hadoop, MongoDB, Cassandra, Redis, Kafka, SQLLite.

However the “core” data-store technologies is not the only thing successful data architects should know – one should also be well-versed in the modern approaches to the infrastructure and general data management. This is why we also have talks about Ansible and OpenStack, DBaaS and PaaS as well as a number of more talks about big-picture topics around architecture and technology management.

Second – this is our first multi-track Percona University event – we had so many great speakers interested in speaking that we could not fit them all into one track, so we have two tracks now with 25 sessions which makes that quite an educational experience!

Third – while we’re committed to having those events be very affordable, we decided to charge $10 per attendee. The reason for this is to encourage people to register who actually plan on attending – when hosting free events we found out that way too many registered and never showed up, which was causing the venues to rapidly fill past capacity and forcing us to turn away those who could actually be there. It was also causing us to order more food than needed, causing waste. We trust $10 will not prevent you from attending, but if it does cause hardship, just drop me a note and I’ll give you a free pass.

A few other things you need to know:

This is very much a technically focused event. I have encouraged all speakers to make it about technology rather than sales pitches or marketing presentations.

This is low-key educational event. Do not expect it to be very fancy. If you’re looking for the great conference experience consider attending the Percona Live MySQL Conference and Expo this April.

Although it’s a full-day event, you can come for just part of the day. We recognize many of you will not be able to take a full day from work and may be able to attend only in the morning or the afternoon. This is totally fine. The morning registration hours is when most people will register, however, there will be someone on the desk to get you your pass throughout the day.

Thinking of Attending? Take a look at the day’s sessions and then register as space is limited. The event will be held at North Carolina State University’s McKimmon Conference & Training Center. I hope to see you there!

The post Percona University: Back to school Feb. 12 in Raleigh, N.C. appeared first on MySQL Performance Blog.

Pages

]]>