Feed aggregator

Full graceful restart without bootstrap, pc.recovery=true hangs: no gvwstate.dat!?

Latest Forum Posts - January 23, 2015 - 3:23am
Hi,

I tried to do a full cluster graceful restart without bootstrap, relying on the pc.recovery=true feature + gvwstate.dat.

"This feature can be used for automatic recovery from full cluster crashes, such as in the case of a data center power outage and graceful full cluster restarts without the need for explicitly bootstrapping a new Primary Component."


However, gvwstate.dat was deleted on graceful shutdown, so all the daemons hung on reboot.

systemctl stop mysql
ls -l /var/lib/mysql/gv*
No such file or directory
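
As a sanity check before shutdown, pc.recovery can be confirmed from the running provider options. A minimal sketch (the client invocation and credentials are whatever your setup uses):

mysql -e "SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options'\G" | tr ';' '\n' | grep -i 'pc.recovery'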


How is full graceful restart supposed to work?

Version: Percona-XtraDB-Cluster-56-5.6.21-25.8.940.el7.x86_64

Oracle MySQL Security Patches January 2015

Latest Forum Posts - January 22, 2015 - 6:57am
Hi,

I'm new to the forum, but wanted to ask: in light of the new Oracle MySQL security patches for January 2015, which cover MySQL releases 5.5.40 and earlier and 5.6.21 and earlier, how will Percona incorporate these patches? Or will the current Percona servers need to be patched?

The reason I'm asking is that we have a host of Percona servers (on a range of versions) and want to know how these patches will be handled. Will a new release be brought out, or do we need to apply these patches to our servers ourselves?

Any help will be greatly appreciated.

Many Thanks,
Sunny

Nodes terminated when adding a new one

Latest Forum Posts - January 22, 2015 - 12:59am
Hello!

I have a cluster of 2 nodes (master-master). One of them is also the master for a slave.
When I tried to add a new master node to the cluster, the donor node and the joining node both went down. On the donor node, in innobackup.backup.log, I see the following:
150121 14:49:11 innobackupex: Finished backing up non-InnoDB tables and files

150121 14:49:11 innobackupex: Executing LOCK BINLOG FOR BACKUP...
DBD::mysql::db do failed: Deadlock found when trying to get lock; try restarting transaction at /usr//bin/innobackupex line 3036.
innobackupex: got a fatal error with the following stacktrace: at /usr//bin/innobackupex line 3039
main::mysql_query('HASH(0x10a9720)', 'LOCK BINLOG FOR BACKUP') called at /usr//bin/innobackupex line 3501
main::mysql_lock_binlog('HASH(0x10a9720)') called at /usr//bin/innobackupex line 2000
main::backup() called at /usr//bin/innobackupex line 1592
innobackupex: Error:
Error executing 'LOCK BINLOG FOR BACKUP': DBD::mysql::db do failed: Deadlock found when trying to get lock; try restarting transaction at /usr//bin/innobackupex line 3036.
150121 14:49:11 innobackupex: Waiting for ibbackup (pid=44712) to finish


Versions:
mysqld Ver 5.6.21-70.1-56 for Linux on x86_64 (Percona XtraDB Cluster (GPL), Release rel70.1, Revision 938, WSREP version 25.8, wsrep_25.8.r4150)

xtrabackup version 2.2.8 based on MySQL server 5.6.22 Linux (x86_64) (revision id: )



I tried to set up the new node again and got the same error.

CPU Usage

Latest Forum Posts - January 21, 2015 - 3:52pm
I'm looking at the graphs for CPU usage and I'm getting values with 'u' and 'm'. Can you explain to me what these mean?

Also, I'm comparing the "Daily (5 min average)" graph against the other graphs. The values for the "Daily (5 min average)" on the 'y' axis exceed 500. However, the values for all other graphs only reach 400, reflecting an increase of 100 with each added virtual CPU (it's a four-core processor).

Below are examples of the Daily and Weekly graphs. Thanks.

innobackupex Incremental backup fails on LOCK BINLOG FOR BACKUP...

Latest Forum Posts - January 21, 2015 - 11:36am
I was executing an incremental backup of my database and it failed with the following trace. Any clues as to what is wrong?

Backup Version is: InnoDB Backup Utility v1.5.1-xtrabackup; Copyright 2003, 2009 Innobase Oy
and Percona LLC and/or its affiliates 2009-2013. All Rights Reserved.

This software is published under
the GNU GENERAL PUBLIC LICENSE Version 2, June 1991.

Get the latest version of Percona XtraBackup, documentation, and help resources:
http://www.percona.com/xb/p

Server Version is: 5.6.21-70.0 Percona Server (GPL), Release 70.0, Revision 688

Command is: innobackupex --compress --incremental /usb/ariel/innobackupex/mysql --incremental-basedir=/usb/ariel/innobackupex/mysql/2015-01-14_05-51-30

Log Trace is:
innobackupex: Backing up files '/srv/mysql//annotation_service/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (10 files)
innobackupex: Backing up file '/srv/mysql//entrez/db.opt'
innobackupex: Backing up file '/srv/mysql//entrez/gene.frm'
>> log scanned up to (8048683632701)
innobackupex: Backing up file '/srv/mysql//entrez/synonym.frm'
150121 10:38:01 innobackupex: Finished backing up non-InnoDB tables and files

150121 10:38:01 innobackupex: Executing LOCK BINLOG FOR BACKUP...
DBD::mysql::db do failed: MySQL server has gone away at /usr/bin/innobackupex line 3036.
innobackupex: got a fatal error with the following stacktrace: at /usr/bin/innobackupex line 3039
main::mysql_query('HASH(0x1df0d20)', 'LOCK BINLOG FOR BACKUP') called at /usr/bin/innobackupex line 3501
main::mysql_lock_binlog('HASH(0x1df0d20)') called at /usr/bin/innobackupex line 2000
main::backup() called at /usr/bin/innobackupex line 1592
innobackupex: Error:
Error executing 'LOCK BINLOG FOR BACKUP': DBD::mysql::db do failed: MySQL server has gone away at /usr/bin/innobackupex line 3036.
150121 10:38:01 innobackupex: Waiting for ibbackup (pid=4074) to finish


Encrypted and incremental backups.

Latest Forum Posts - January 21, 2015 - 1:50am
Hello,


I'm currently writing backup and restoration scripts based on innobackupex. I need the backups to be encrypted and incremental.

Therefore, in order to enable incremental backups on the basis of an encrypted basedir, I'm using the '--extra-lsndir' option to save an alternative cleartext 'xtrabackup_checkpoints' file. I think it is a good solution (it comes from this blog post).

My question is: can I safely set the '--extra-lsndir' value to the same value as the backup directory, and then delete xtrabackup_checkpoints.xbcrypt?
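
For context, the setup being described looks roughly like the following sketch; the option names are real innobackupex options, but the paths and key file are hypothetical placeholders:

# encrypted incremental backup, keeping a cleartext copy of xtrabackup_checkpoints
innobackupex --encrypt=AES256 --encrypt-key-file=/etc/backup.key \
  --incremental --incremental-basedir=/backups/BASE_DIR \
  --extra-lsndir=/backups/BASE_DIR \
  /backups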

Identifying useful info from MySQL row-based binary logs

Latest MySQL Performance Blog posts - January 20, 2015 - 7:34am

As a MySQL DBA/consultant, it is part of my job to decode the MySQL binary logs – and there are a number of reasons for doing that. In this post, I’ll explain how you can get the important information about your write workload using MySQL row-based binary logs and a simple awk script.

First, it is important to understand that row-based binary logs contain the actual changes done by a query. For example, if I run a delete query against a table, the binary log will contain the rows that were deleted. MySQL provides the mysqlbinlog utility to decode the events stored in MySQL binary logs. You can read more about mysqlbinlog in detail in the reference manual here.

The following example illustrates how mysqlbinlog displays row events that specify data modifications. These correspond to events with the WRITE_ROWS_EVENT, UPDATE_ROWS_EVENT, and DELETE_ROWS_EVENT type codes.

We will use the following mysqlbinlog options:

  • --base64-output=decode-rows
  • --verbose, -v
  • --start-datetime="datetime"
  • --stop-datetime="datetime"

We have a server running with row-based binary logging.

mysql> show variables like '%binlog_format%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.00 sec)

We created a test table and executed the following sequence of statements under a transaction.

use test;
CREATE TABLE t (
  id INT NOT NULL,
  name VARCHAR(20) NOT NULL,
  date DATE NULL
) ENGINE = InnoDB;
START TRANSACTION;
INSERT INTO t VALUES(1, 'apple', NULL);
UPDATE t SET name = 'pear', date = '2009-01-01' WHERE id = 1;
DELETE FROM t WHERE id = 1;
COMMIT;

Now let’s see how it is represented in binary logs.

# mysqlbinlog --base64-output=decode-rows -vv --start-datetime="2015-01-12 21:40:00" --stop-datetime="2015-01-12 21:45:00" mysqld-bin.000023
/*!*/;
# at 295
#150112 21:40:14 server id 1 end_log_pos 367 CRC32 0x19ab4f0f Query thread_id=108 exec_time=0 error_code=0
SET TIMESTAMP=1421079014/*!*/;
BEGIN
/*!*/;
# at 367
#150112 21:40:14 server id 1 end_log_pos 415 CRC32 0x6b1f2240 Table_map: `test`.`t` mapped to number 251
# at 415
#150112 21:40:14 server id 1 end_log_pos 461 CRC32 0x7725d174 Write_rows: table id 251 flags: STMT_END_F
### INSERT INTO `test`.`t`
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='apple' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3=NULL /* VARSTRING(20) meta=0 nullable=1 is_null=1 */
# at 461
#150112 21:40:14 server id 1 end_log_pos 509 CRC32 0x7e44d741 Table_map: `test`.`t` mapped to number 251
# at 509
#150112 21:40:14 server id 1 end_log_pos 569 CRC32 0x0cd1363a Update_rows: table id 251 flags: STMT_END_F
### UPDATE `test`.`t`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='apple' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3=NULL /* VARSTRING(20) meta=0 nullable=1 is_null=1 */
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='pear' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3='2009:01:01' /* DATE meta=0 nullable=1 is_null=0 */
# at 569
#150112 21:40:14 server id 1 end_log_pos 617 CRC32 0xf134ad89 Table_map: `test`.`t` mapped to number 251
# at 617
#150112 21:40:14 server id 1 end_log_pos 665 CRC32 0x87047106 Delete_rows: table id 251 flags: STMT_END_F
### DELETE FROM `test`.`t`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='pear' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3='2009:01:01' /* DATE meta=0 nullable=1 is_null=0 */
# at 665
#150112 21:40:15 server id 1 end_log_pos 696 CRC32 0x85ffc9ff Xid = 465
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

  • The row with "Table_map: `test`.`t`" gives the table name used by the query.
  • Rows with "Write_rows/Update_rows/Delete_rows" give the event type.
  • Lines that start with "###" show the actual row that was changed.
  • Columns are represented as @1, @2, and so on.

Now have a look at our simple awk script, which takes the mysqlbinlog output and prints a nice summary of the INSERT/UPDATE/DELETE events in row-based binary logs. Just replace "mysqld-bin.000023" with your binary log file. The string "#15" in the third line matches year-2015 timestamps; if you are decoding a binary log file from 2014, just change it to "#14". It is also recommended to use the "--start-datetime" and "--stop-datetime" options to decode binary logs of a specific time range instead of decoding a whole large binary log file.

Script:

mysqlbinlog --base64-output=decode-rows -vv --start-datetime="2015-01-12 21:40:00" --stop-datetime="2015-01-12 21:45:00" mysqld-bin.000023 | awk 'BEGIN {s_type=""; s_count=0;count=0;insert_count=0;update_count=0;delete_count=0;flag=0;} {if(match($0, /#15.*Table_map:.*mapped to number/)) {printf "Timestamp : " $1 " " $2 " Table : " $(NF-4); flag=1} else if (match($0, /(### INSERT INTO .*..*)/)) {count=count+1;insert_count=insert_count+1;s_type="INSERT"; s_count=s_count+1;} else if (match($0, /(### UPDATE .*..*)/)) {count=count+1;update_count=update_count+1;s_type="UPDATE"; s_count=s_count+1;} else if (match($0, /(### DELETE FROM .*..*)/)) {count=count+1;delete_count=delete_count+1;s_type="DELETE"; s_count=s_count+1;} else if (match($0, /^(# at) /) && flag==1 && s_count>0) {print " Query Type : "s_type " " s_count " row(s) affected" ;s_type=""; s_count=0; } else if (match($0, /^(COMMIT)/)) {print "[Transaction total : " count " Insert(s) : " insert_count " Update(s) : " update_count " Delete(s) : " delete_count "]\n+----------------------+----------------------+----------------------+----------------------+"; count=0;insert_count=0;update_count=0; delete_count=0;s_type=""; s_count=0; flag=0} } '

Output:

Timestamp : #150112 21:40:14 Table : `test`.`t` Query Type : INSERT 1 row(s) affected
Timestamp : #150112 21:40:14 Table : `test`.`t` Query Type : UPDATE 1 row(s) affected
Timestamp : #150112 21:40:14 Table : `test`.`t` Query Type : DELETE 1 row(s) affected
[Transaction total : 3 Insert(s) : 1 Update(s) : 1 Delete(s) : 1]
+----------------------+----------------------+----------------------+----------------------+

The awk script prints the following columns for every transaction in the binary log:

Timestamp : the timestamp of the event as logged in the binary log.
Table : database.tablename
Query Type : the type of query executed on the table and the number of rows affected by it.

[Transaction total : 3 Insert(s) : 1 Update(s) : 1 Delete(s) : 1]
The line above prints the summary of a transaction: the total number of rows affected by the transaction and the number of rows affected by each type of query within it.

Let’s execute some more queries in the sakila database and then summarize them using our script. We assume that we executed these queries between "2015-01-16 13:30:00" and "2015-01-16 14:00:00".

use sakila;
mysql> update city set city="Acua++" where city_id=4;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql> delete from country where country_id=4;
Query OK, 1 row affected (0.04 sec)
mysql> insert into country (country_id,country,last_update) values ('4','Angola',now());
Query OK, 1 row affected (0.03 sec)
mysql> create table payment_tmp select * from payment;
Query OK, 16049 rows affected (0.80 sec)
Records: 16049  Duplicates: 0  Warnings: 0
mysql> delete from payment_tmp where payment_id between 10000 and 15000;
Query OK, 5001 rows affected (0.08 sec)
mysql> update payment_tmp set amount=9.99 where payment_id between 1 and 7000;
Query OK, 6890 rows affected (0.08 sec)
Rows matched: 7000  Changed: 6890  Warnings: 0
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from country where country_id=5;
Query OK, 1 row affected (0.00 sec)
mysql> delete from city where country_id=5;
Query OK, 1 row affected (0.00 sec)
mysql> delete from address where city_id=300;
Query OK, 2 rows affected (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.01 sec)

We will save our script in a file named “summarize_binlogs.sh” and summarize the queries we just executed.

# vim summarize_binlogs.sh

#!/bin/bash
BINLOG_FILE="mysqld-bin.000035"
START_TIME="2015-01-16 13:30:00"
STOP_TIME="2015-01-16 14:00:00"
mysqlbinlog --base64-output=decode-rows -vv --start-datetime="${START_TIME}" --stop-datetime="${STOP_TIME}" ${BINLOG_FILE} | awk 'BEGIN {s_type=""; s_count=0;count=0;insert_count=0;update_count=0;delete_count=0;flag=0;} {if(match($0, /#15.*Table_map:.*mapped to number/)) {printf "Timestamp : " $1 " " $2 " Table : " $(NF-4); flag=1} else if (match($0, /(### INSERT INTO .*..*)/)) {count=count+1;insert_count=insert_count+1;s_type="INSERT"; s_count=s_count+1;} else if (match($0, /(### UPDATE .*..*)/)) {count=count+1;update_count=update_count+1;s_type="UPDATE"; s_count=s_count+1;} else if (match($0, /(### DELETE FROM .*..*)/)) {count=count+1;delete_count=delete_count+1;s_type="DELETE"; s_count=s_count+1;} else if (match($0, /^(# at) /) && flag==1 && s_count>0) {print " Query Type : "s_type " " s_count " row(s) affected" ;s_type=""; s_count=0; } else if (match($0, /^(COMMIT)/)) {print "[Transaction total : " count " Insert(s) : " insert_count " Update(s) : " update_count " Delete(s) : " delete_count "]\n+----------------------+----------------------+----------------------+----------------------+"; count=0;insert_count=0;update_count=0; delete_count=0;s_type=""; s_count=0; flag=0} } '
:wq

# chmod u+x summarize_binlogs.sh

Now we run our script to get a summary of the information logged in the binary log.

# ./summarize_binlogs.sh
Timestamp : #150116 13:41:09 Table : `sakila`.`city` Query Type : UPDATE 1 row(s) affected
[Transaction total : 1 Insert(s) : 0 Update(s) : 1 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:41:59 Table : `sakila`.`country` Query Type : DELETE 1 row(s) affected
[Transaction total : 1 Insert(s) : 0 Update(s) : 0 Delete(s) : 1]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:07 Table : `sakila`.`country` Query Type : INSERT 1 row(s) affected
[Transaction total : 1 Insert(s) : 1 Update(s) : 0 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:13 Table : `sakila`.`payment_tmp` Query Type : INSERT 16049 row(s) affected
[Transaction total : 16049 Insert(s) : 16049 Update(s) : 0 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected
[Transaction total : 5001 Insert(s) : 0 Update(s) : 0 Delete(s) : 5001]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:28 Table : `sakila`.`payment_tmp` Query Type : UPDATE 6890 row(s) affected
[Transaction total : 6890 Insert(s) : 0 Update(s) : 6890 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:42 Table : `sakila`.`country` Query Type : DELETE 1 row(s) affected
Timestamp : #150116 13:42:48 Table : `sakila`.`city` Query Type : DELETE 1 row(s) affected
Timestamp : #150116 13:42:53 Table : `sakila`.`address` Query Type : DELETE 2 row(s) affected
[Transaction total : 4 Insert(s) : 0 Update(s) : 0 Delete(s) : 4]
+----------------------+----------------------+----------------------+----------------------+

See how nicely our script has summarized the information logged in the binary log. For the last transaction it shows three rows, one per statement executed within the transaction, followed by the final summary of the transaction: it affected 4 rows in total, all of them deletes.

This is how we can summarize MySQL row-based binary logs in a more readable format using a simple awk script. It will save you time whenever you need to decode the binary logs for troubleshooting. Now I’ll show you how this summarized information can help answer some basic but important questions.

Q1: Which tables received the highest number of insert/update/delete statements?

./summarize_binlogs.sh | grep Table | cut -d':' -f5 | cut -d' ' -f2 | sort | uniq -c | sort -nr
      3 `sakila`.`payment_tmp`
      3 `sakila`.`country`
      2 `sakila`.`city`
      1 `sakila`.`address`

Q2: Which table received the highest number of DELETE queries?

./summarize_binlogs.sh | grep -E 'DELETE' | cut -d':' -f5 | cut -d' ' -f2 | sort | uniq -c | sort -nr
      2 `sakila`.`country`
      1 `sakila`.`payment_tmp`
      1 `sakila`.`city`
      1 `sakila`.`address`

Q3: How many insert/update/delete queries were executed against the sakila.country table?

./summarize_binlogs.sh | grep -i '`sakila`.`country`' | awk '{print $7 " " $11}' | sort -k1,2 | uniq -c
      2 `sakila`.`country` DELETE
      1 `sakila`.`country` INSERT

Q4: Give me the top 3 statements which affected the maximum number of rows.

./summarize_binlogs.sh | grep Table | sort -nr -k 12 | head -n 3
Timestamp : #150116 13:42:13 Table : `sakila`.`payment_tmp` Query Type : INSERT 16049 row(s) affected
Timestamp : #150116 13:42:28 Table : `sakila`.`payment_tmp` Query Type : UPDATE 6890 row(s) affected
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected

See how easy it is to identify the large transactions.

Q5: Find DELETE queries that affected more than 1000 rows.

./summarize_binlogs.sh | grep -E 'DELETE' | awk '{if($12>1000) print $0}'
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected

If we want to get all queries that affected more than 1000 rows:

./summarize_binlogs.sh | grep -E 'Table' | awk '{if($12>1000) print $0}'
Timestamp : #150116 13:42:13 Table : `sakila`.`payment_tmp` Query Type : INSERT 16049 row(s) affected
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected
Timestamp : #150116 13:42:28 Table : `sakila`.`payment_tmp` Query Type : UPDATE 6890 row(s) affected

Conclusion

The information logged in binary logs is really helpful for understanding our write workload. We used mysqlbinlog + awk to present the binary log contents in a more readable format, and then used the summarized information to answer some important questions.

The post Identifying useful info from MySQL row-based binary logs appeared first on MySQL Performance Blog.

Importing big tables with large indexes with Myloader

Latest MySQL Performance Blog posts - January 20, 2015 - 3:56am

Mydumper is known as mysqldump but faster, much faster. So if you take logical backups, you will choose Mydumper instead of mysqldump. But what about the restore? Well, who wants to restore a logical backup? It takes ages! Even with Myloader. But this could change a bit if we are able to take advantage of Fast Index Creation.

As you probably know, Mydumper and mysqldump export the structure of a table, with all the indexes and constraints, and of course the data. Then Myloader and mysql import the structure of the table and import the data. The most important difference is that you can configure Myloader to import the data using a certain number of threads. So, the import steps are:

  1. Create the complete struct of the table
  2. Import the data

When you execute Myloader, internally it first creates the tables by executing the "-schema.sql" files, then takes all the filenames without "schema.sql" and puts them in a task queue. Every thread takes a filename from the queue (each file is actually a chunk of a table) and executes it; when it finishes, it takes another chunk from the queue, and when the queue is empty it simply ends.
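
For reference, a minimal sketch of such a multi-threaded restore; the directory and credentials are placeholders, while --directory and --threads are standard myloader options:

# restore a mydumper dump using 8 parallel threads
myloader --directory=/backups/dump --threads=8 --user=root --password=secret --overwrite-tables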

This import procedure works fast for small tables, but with big tables with large indexes the inserts get slower and slower, caused by the overhead of inserting the new values into the secondary indexes. Another way to import the data is:

  1. Split the table structure into table creation with primary key, indexes creation and constraint creation
  2. Create tables with primary key
  3. Per table do:
    1. Load the data
    2. Create index
  4. Create constraints

This import procedure is implemented in a branch of Myloader that can be downloaded from here, or directly by executing bzr against the repository:

bzr branch lp:~david-ducos/mydumper/mydumper

The tool reads the schema files and splits them into three separate statements which create the tables with the primary key, the indexes, and the constraints. The primary key is kept in the table creation in order to avoid recreating the table when a primary key is added, while the "KEY" and "CONSTRAINT" lines are removed. These lines are added to the index and constraint statements, respectively.
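
Conceptually, the three-way split of a schema file looks something like this (a hypothetical table, sketched only to illustrate the idea; the real statements are generated from the dumped schema files):

-- 1) table creation, keeping the primary key
CREATE TABLE t (id INT NOT NULL, a INT, b INT, PRIMARY KEY (id)) ENGINE=InnoDB;
-- 2) index creation, run after the data is loaded (benefits from Fast Index Creation)
ALTER TABLE t ADD KEY idx_a (a), ADD KEY idx_a_b (a, b);
-- 3) constraint creation, run last
ALTER TABLE t ADD CONSTRAINT fk_b FOREIGN KEY (b) REFERENCES parent (id);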

It processes tables according to their size, starting with the largest, because creating the indexes of a big table could take hours and is single-threaded. While we cannot create the other indexes of that table at the same time, we are potentially able to create other tables with the remaining threads.

It has a new thread (monitor_process) that decides which chunk of data will be put in the task queue and a communication queue which is used by the task processes to tell the monitor_process which chunk has been completed.

I ran multiple imports on an AWS m1.xlarge machine with one table, comparing Myloader and this branch, and I found that with large indexes the times were:

As you can see, when you have fewer than 150M rows, importing the data and then creating the indexes takes longer than importing the table with the indexes all at once. But everything changes after 150M rows: importing 200M rows takes 64 minutes more with Myloader, but just 24 minutes more with the new branch.

On a table of 200M rows with an integer primary key and 9 integer columns, you will see how the time increases as the indexes get larger:

Where:

2-2-0: two 1-column and two 2-column indexes
2-2-1: two 1-column, two 2-column and one 3-column index
2-3-1: two 1-column, three 2-column and one 3-column index
2-3-2: two 1-column, three 2-column and two 3-column indexes


Conclusion

This branch can only import all the tables with this same strategy, but with this new logic in Myloader, a future version could be able to import each table with the best strategy, reducing the time of the restore considerably.

The post Importing big tables with large indexes with Myloader appeared first on MySQL Performance Blog.

query_cache behaviour in multi-master mode for 5.6.21-70.1-56

Latest Forum Posts - January 19, 2015 - 7:09am
The query_cache is now supported in 5.6, but it can only be enabled at startup.
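
(For reference, enabling it at startup means something like the following in my.cnf; the size is just an example value:)

[mysqld]
query_cache_type = 1
query_cache_size = 64M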

But I cannot find any reference to the node and cluster behaviour in multi-master mode,
i.e. is the query_cache node- or cluster-specific, and is there any cache invalidation across the nodes?

For 5.5 I can find information about the above, but not for 5.6.

Any help would be appreciated, as my app currently depends heavily on the query_cache.

marcel

Looking deeper into InnoDB’s problem with many row versions

Latest MySQL Performance Blog posts - January 19, 2015 - 7:06am

A few days ago I wrote about MySQL performance implications of InnoDB isolation modes and touched briefly upon the bizarre performance regression I found with InnoDB handling a large number of versions for a single row. Today I wanted to look a bit deeper into the problem, which I also filed as a bug.

First I validated the conditions under which the problem happens. It seems to happen only in REPEATABLE-READ isolation mode and only when there are some hot rows which get many row versions during a benchmark run. For example, the problem does NOT happen if I run sysbench with "uniform" distribution.

In terms of concurrent selects it also seems to require some very special conditions: you need a connection that lets some history accumulate by holding a read snapshot open, and then does so again once a lot of history has built up. The exact queries used to do that seem not to be that important.

Contrary to what I expected, this problem also does not require multiple connections. I can repeat it with only one read and one write connection, which means it can happen in a lot more workloads.

Here is the simplified case to repeat it:

sysbench --num-threads=1 --report-interval=10 --max-time=0 --max-requests=0 --rand-type=uniform --oltp-table-size=1 --mysql-user=root --mysql-password= --mysql-db=sbinnodb --test=/usr/share/doc/sysbench/tests/db/update_index.lua run

It is repeatable on a table with just 1 row, where this row is very hot!

Select query sequence:

begin;
select c from sbtest1 limit 1;
select sleep(20);
commit;
begin;
select c from sbtest1 limit 1;
select sleep(300);
commit;

Interestingly though, in this case it does not look like it is enough to just open a consistent snapshot: the issue only happens when I run the query on the same table the update workload is running against (I assume it has to touch the same data).

Let’s look at some graphs:

The transaction rate indeed suffers dramatically, in this case going down more than 100x, from close to 15K to around 60 at the lowest point.

It is interesting that CPU consumption is reduced but not as much as the throughput.

This is something I find the most enlightening about what seems to happen. The number of buffer pool read requests increases dramatically as the number of transactions goes down. If we do the math, we have 940K buffer pool reads per 14.5K updates in the normal case, giving us 65 buffer reads per update, which goes up to 37,000 per update when the regression is happening. That is a dramatic change, and it points to something going badly wrong.

The history size is of interest. We can see it grow quickly at first, while the first transaction and its sleep(20) are active, then very slowly while the second transaction is open, as the update rate is low. Finally it goes to zero very quickly once the transaction is committed. What is worth noting is that the history length here peaks at just 280K, which is really quite a trivial size.

The InnoDB contention graph is interesting from two viewpoints. First, it shows that the contention picture changes when the transaction is held open, and again right afterward when the history is being worked off.

If you look at this graph focusing only on the os_wait metrics, we can see that mutex_os_waits goes down during the problem while x-lock os_waits increase significantly. This means there are some very long exclusive-lock os-waits happening, which is indicative of long waits. We can't actually get lock times from status variables; we would need data from performance_schema for that, which unfortunately does not cover the key mutexes, and which I unfortunately can't graph easily yet.

Let's look at some more data.

From the great processlist view in the sys schema we can get:

*************************** 4. row ***************************
                thd_id: 536
               conn_id: 516
                  user: root@localhost
                    db: sbinnodb
               command: Query
                 state: updating
                  time: 0
     current_statement: UPDATE sbtest1 SET k=k+1 WHERE id=1
          lock_latency: 36.00 us
         rows_examined: 1
             rows_sent: 0
         rows_affected: 0
            tmp_tables: 0
       tmp_disk_tables: 0
             full_scan: NO
        last_statement: NULL
last_statement_latency: NULL
             last_wait: wait/io/table/sql/handler
     last_wait_latency: Still Waiting
                source: handler.cc:7692

This is unfortunately not very helpful, as it does not show the actual lock being waited on; it just says that the wait is somewhere inside a storage engine call, which is obvious.

We can get some good summary information in the other table:

mysql> select * from waits_global_by_latency;
+------------------------------------------------------+-----------+---------------+-------------+-------------+
| events | total | total_latency | avg_latency | max_latency |
+------------------------------------------------------+-----------+---------------+-------------+-------------+
| wait/synch/cond/sql/Item_func_sleep::cond | 384 | 00:31:27.18 | 4.91 s | 5.00 s |
| wait/synch/rwlock/innodb/index_tree_rw_lock | 239460870 | 00:12:13.82 | 3.06 us | 495.27 ms |
| wait/synch/mutex/sql/THD::LOCK_thd_data | 601160083 | 36.48 s | 60.41 ns | 643.18 us |
| wait/io/table/sql/handler | 1838 | 29.12 s | 15.84 ms | 211.21 ms |
| wait/synch/mutex/mysys/THR_LOCK::mutex | 240456674 | 16.57 s | 68.86 ns | 655.16 us |
| wait/synch/cond/sql/MDL_context::COND_wait_status | 15 | 14.11 s | 940.37 ms | 999.36 ms |
| wait/synch/mutex/innodb/trx_mutex | 360685304 | 10.52 s | 29.11 ns | 4.01 ms |
| wait/synch/mutex/innodb/trx_undo_mutex | 120228229 | 3.87 s | 31.93 ns | 426.16 us |
| wait/io/file/innodb/innodb_data_file | 399 | 1.46 s | 3.66 ms | 11.21 ms |
| wait/synch/mutex/innodb/buf_pool_flush_state_mutex | 32317 | 587.10 ms | 18.17 us | 11.27 ms |
| wait/synch/rwlock/innodb/fil_space_latch | 4154178 | 386.96 ms | 92.96 ns | 74.57 us |
| wait/io/file/innodb/innodb_log_file | 987 | 271.76 ms | 275.34 us | 14.23 ms |
| wait/synch/rwlock/innodb/hash_table_locks | 8509138 | 255.76 ms | 30.05 ns | 53.41 us |
| wait/synch/mutex/innodb/srv_threads_mutex | 11938747 | 249.49 ms | 20.66 ns | 15.06 us |
| wait/synch/rwlock/innodb/trx_purge_latch | 8488041 | 181.01 ms | 21.28 ns | 23.51 us |
| wait/synch/rwlock/innodb/dict_operation_lock | 232 | 100.48 ms | 433.09 us | 61.88 ms |
| wait/synch/rwlock/innodb/checkpoint_lock | 10 | 32.34 ms | 3.23 ms | 8.38 ms |
| wait/io/socket/sql/client_connection | 2171 | 30.87 ms | 14.22 us | 3.95 ms |
| wait/synch/mutex/innodb/log_sys_mutex | 260485 | 6.57 ms | 25.04 ns | 52.00 us |
| wait/synch/rwlock/innodb/btr_search_latch | 4982 | 1.71 ms | 343.05 ns | 17.83 us |
| wait/io/file/myisam/kfile | 647 | 1.70 ms | 2.63 us | 39.28 us |
| wait/synch/rwlock/innodb/dict_table_stats | 46323 | 1.63 ms | 35.06 ns | 6.55 us |
| wait/synch/mutex/innodb/os_mutex | 11410 | 904.63 us | 79.19 ns | 26.67 us |
| wait/lock/table/sql/handler | 1838 | 794.21 us | 431.94 ns | 9.70 us |
| wait/synch/rwlock/sql/MDL_lock::rwlock | 7056 | 672.34 us | 95.15 ns | 53.29 us |
| wait/io/file/myisam/dfile | 139 | 518.50 us | 3.73 us | 31.61 us |
| wait/synch/mutex/sql/LOCK_global_system_variables | 11692 | 462.58 us | 39.44 ns | 363.39 ns |
| wait/synch/mutex/innodb/rseg_mutex | 7238 | 348.25 us | 47.89 ns | 10.38 us |
| wait/synch/mutex/innodb/lock_mutex | 5747 | 222.13 us | 38.50 ns | 9.84 us |
| wait/synch/mutex/innodb/trx_sys_mutex | 4601 | 187.43 us | 40.69 ns | 326.15 ns |
| wait/synch/mutex/sql/LOCK_table_cache | 2173 | 185.14 us | 85.14 ns | 10.02 us |
| wait/synch/mutex/innodb/fil_system_mutex | 3481 | 144.60 us | 41.32 ns | 375.91 ns |
| wait/synch/rwlock/sql/LOCK_grant | 1217 | 121.86 us | 99.85 ns | 12.07 us |
| wait/synch/mutex/innodb/flush_list_mutex | 3191 | 108.82 us | 33.80 ns | 8.81 us |
| wait/synch/mutex/innodb/purge_sys_bh_mutex | 3600 | 83.52 us | 23.16 ns | 123.95 ns |
| wait/synch/mutex/sql/MDL_map::mutex | 1456 | 75.02 us | 51.33 ns | 8.78 us |
| wait/synch/mutex/myisam/MYISAM_SHARE::intern_lock | 1357 | 55.55 us | 40.69 ns | 320.51 ns |
| wait/synch/mutex/innodb/log_flush_order_mutex | 204 | 40.84 us | 200.01 ns | 9.42 us |
| wait/io/file/sql/FRM | 16 | 31.16 us | 1.95 us | 16.40 us |
| wait/synch/mutex/mysys/BITMAP::mutex | 378 | 30.52 us | 80.44 ns | 9.16 us |
| wait/synch/mutex/innodb/dict_sys_mutex | 463 | 24.15 us | 51.96 ns | 146.17 ns |
| wait/synch/mutex/innodb/buf_pool_LRU_list_mutex | 293 | 23.37 us | 79.50 ns | 313.94 ns |
| wait/synch/mutex/innodb/buf_dblwr_mutex | 398 | 22.60 us | 56.65 ns | 380.61 ns |
| wait/synch/rwlock/myisam/MYISAM_SHARE::key_root_lock | 388 | 14.86 us | 38.19 ns | 254.16 ns |
| wait/synch/mutex/sql/LOCK_plugin | 250 | 14.30 us | 56.97 ns | 137.41 ns |
| wait/io/socket/sql/server_unix_socket | 2 | 9.35 us | 4.67 us | 4.72 us |
| wait/synch/mutex/sql/THD::LOCK_temporary_tables | 216 | 8.97 us | 41.32 ns | 465.74 ns |
| wait/synch/mutex/sql/hash_filo::lock | 151 | 8.35 us | 55.09 ns | 150.87 ns |
| wait/synch/mutex/mysys/THR_LOCK_open | 196 | 7.84 us | 39.75 ns | 110.80 ns |
| wait/synch/mutex/sql/TABLE_SHARE::LOCK_ha_data | 159 | 6.23 us | 39.13 ns | 108.92 ns |
| wait/synch/mutex/innodb/lock_wait_mutex | 29 | 5.38 us | 185.30 ns | 240.38 ns |
| wait/synch/mutex/innodb/ibuf_mutex | 29 | 5.26 us | 181.23 ns | 249.46 ns |
| wait/synch/mutex/sql/LOCK_status | 62 | 4.12 us | 66.36 ns | 127.39 ns |
| wait/synch/mutex/sql/LOCK_open | 109 | 4.05 us | 36.93 ns | 105.79 ns |
| wait/io/file/sql/dbopt | 3 | 3.34 us | 1.11 us | 1.99 us |
| wait/synch/mutex/innodb/buf_pool_free_list_mutex | 30 | 1.58 us | 52.58 ns | 106.73 ns |
| wait/synch/mutex/innodb/srv_innodb_monitor_mutex | 31 | 1.38 us | 44.45 ns | 89.52 ns |
| wait/synch/mutex/sql/LOCK_thread_count | 33 | 1.10 us | 33.18 ns | 50.71 ns |
| wait/synch/mutex/sql/LOCK_prepared_stmt_count | 32 | 1.10 us | 34.12 ns | 67.92 ns |
| wait/synch/mutex/innodb/srv_sys_mutex | 42 | 1.07 us | 25.35 ns | 49.14 ns |
| wait/synch/mutex/mysys/KEY_CACHE::cache_lock | 24 | 1.02 us | 42.26 ns | 103.60 ns |
| wait/synch/mutex/sql/MDL_wait::LOCK_wait_status | 18 | 544.62 ns | 30.05 ns | 42.57 ns |
| wait/synch/rwlock/sql/CRYPTO_dynlock_value::lock | 2 | 509.25 ns | 254.47 ns | 287.02 ns |
| wait/synch/rwlock/sql/LOCK_dboptions | 6 | 262.92 ns | 43.82 ns | 84.82 ns |
| wait/synch/rwlock/sql/MDL_context::LOCK_waiting_for | 4 | 262.92 ns | 65.73 ns | 103.60 ns |
| wait/synch/mutex/sql/LOCK_connection_count | 4 | 238.51 ns | 59.47 ns | 123.01 ns |
| wait/synch/mutex/sql/LOCK_sql_rand | 2 | 172.15 ns | 86.08 ns | 94.84 ns |
| wait/synch/rwlock/sql/LOCK_system_variables_hash | 2 | 141.16 ns | 70.43 ns | 78.25 ns |
| wait/synch/rwlock/sql/gtid_commit_rollback | 2 | 95.47 ns | 47.58 ns | 51.65 ns |
| wait/synch/mutex/innodb/autoinc_mutex | 2 | 90.14 ns | 45.07 ns | 45.07 ns |
| wait/synch/mutex/sql/LOCK_thd_remove | 2 | 85.14 ns | 42.57 ns | 52.58 ns |
| wait/synch/mutex/innodb/ibuf_bitmap_mutex | 1 | 76.37 ns | 76.37 ns | 76.37 ns |
| wait/synch/mutex/sql/LOCK_xid_cache | 2 | 62.91 ns | 31.30 ns | 34.43 ns |
+------------------------------------------------------+-----------+---------------+-------------+-------------+
73 rows in set (0.00 sec)

This shows a couple of hotspots: the well-known index_tree_rw_lock hotspot, which should be gone in MySQL 5.7, as well as the trx_mutex and trx_undo_mutex. It is especially worth noting the almost 500ms maximum wait on this index tree rw_lock (this is data for a couple of minutes only, so it is unlikely to be a very rare outlier). This hints to me that in the case of many versions our index_tree_rw_lock might be held for a prolonged period of time.

What is interesting is that the reliable “SHOW ENGINE INNODB STATUS” remains quite handy:

----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 58847289
--Thread 140386357991168 has waited at btr0cur.cc line 304 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fb0bea2ca40 '&block->lock'
a writer (thread id 140386454755072) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0flu.cc line 1082
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/row/row0row.cc line 815
--Thread 140386374776576 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr/btr0cur.cc line 577
--Thread 140386324420352 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr/btr0cur.cc line 577
--Thread 140386332813056 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr/btr0cur.cc line 577
--Thread 140386445960960 has waited at btr0cur.cc line 586 for 0.0000 seconds the semaphore:
S-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr/btr0cur.cc line 577
--Thread 140386366383872 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr/btr0cur.cc line 577
--Thread 140386341205760 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr/btr0cur.cc line 577
--Thread 140386349598464 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr/btr0cur.cc line 577

Another sample….

----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 61753307
--Thread 140386332813056 has waited at row0row.cc line 815 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fb0bea2cd40 '&block->lock'
a writer (thread id 140386454755072) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0flu.cc line 1082
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/row/row0row.cc line 815
--Thread 140386366383872 has waited at row0row.cc line 815 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fb0bea2cd40 '&block->lock'
a writer (thread id 140386454755072) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0flu.cc line 1082
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/row/row0row.cc line 815
--Thread 140386341205760 has waited at row0row.cc line 815 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fb0bea2cd40 '&block->lock'
a writer (thread id 140386454755072) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0flu.cc line 1082
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/row/row0row.cc line 815
--Thread 140386374776576 has waited at row0row.cc line 815 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fb0bea2cd40 '&block->lock'
a writer (thread id 140386454755072) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0flu.cc line 1082
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/row/row0row.cc line 815
--Thread 140386445960960 has waited at btr0cur.cc line 257 for 0.0000 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7fb0bea2f2c0 '&block->lock'
a writer (thread id 140386445960960) has reserved it in mode wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.cc line 257
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr/btr0cur.cc line 257
OS WAIT ARRAY INFO: signal count 45260762
Mutex spin waits 1524575466, rounds 5092199292, OS waits 43431355
RW-shared spins 14500456, rounds 258217222, OS waits 3721069
RW-excl spins 37764438, rounds 730059390, OS waits 12592324
Spin rounds per wait: 3.34 mutex, 17.81 RW-shared, 19.33 RW-excl

The first one shows mostly index lock contention which we have already seen. The second, though, shows a lot of contention on the block lock, which is something that is not covered by performance schema instrumentation in MySQL 5.6.

Now, to tell the truth, this 2-thread test case is not your typical case of contention: unlike the case with 64 threads, we do not see that drastic CPU drop, and CPU usage stays above the 25% that a single fully saturated thread should show on this system. This is, however, where DBA-level profiling gets tough with MySQL: as I mentioned, the only indication of excessive activity going on is the high number of buffer pool requests; there is no instrumentation that I'm aware of which tells us anything about scanning many old row versions in undo space. Perhaps some more metrics around this subsystem need to be added to INNODB_METRICS?
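
For what it is worth, here is a sketch of where such counters would live; the INNODB_METRICS table exists in 5.6, though whether any of its purge/history counters capture this scanning is exactly the open question:

mysql> SELECT NAME, COUNT FROM information_schema.INNODB_METRICS WHERE NAME LIKE '%purge%' OR NAME LIKE '%history%';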

A good tool for seeing CPU consumption, which both developers and DBAs can use, is oprofile, whose overhead is low enough to use in production. Note that to get any meaningful information about the CPU usage inside the program you need to have debug symbols installed. For Percona Server 5.6 on Ubuntu you can do apt-get install percona-server-5.6-dbg

To profile just the MySQL Process you can do:

root@ts140i:~# operf --pid 8970
operf: Press Ctl-c or 'kill -SIGINT 9113' to stop profiling
operf: Profiler started
^C
Profiling done.

To get top symbols (functions) called in mysqld I can do:

root@ts140i:~# opreport -l | more
Using /root/oprofile_data/samples/ for samples directory.
warning: /no-vmlinux could not be found.
warning: [vdso] (tgid:8970 range:0x7fffba5fe000-0x7fffba5fffff) could not be found.
CPU: Intel Haswell microarchitecture, speed 3.201e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name    symbol name
212162   12.4324  mysqld        buf_page_get_gen(unsigned long, unsigned long, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned long, mtr_t*)
204230   11.9676  mysqld        dict_table_copy_types(dtuple_t*, dict_table_t const*)
179783   10.5351  no-vmlinux    /no-vmlinux
144335    8.4579  mysqld        trx_undo_update_rec_get_update(unsigned char*, dict_index_t*, unsigned long, unsigned long, unsigned long, unsigned long, trx_t*, mem_block_info_t*, upd_t**)
95984     5.6245  mysqld        trx_undo_prev_version_build(unsigned char const*, mtr_t*, unsigned char const*, dict_index_t*, unsigned long*, mem_block_info_t*, unsigned char**)
78785     4.6167  mysqld        row_build(unsigned long, dict_index_t const*, unsigned char const*, unsigned long const*, dict_table_t const*, dtuple_t const*, unsigned long const*, row_ext_t**, mem_block_info_t*)
76656     4.4919  libc-2.19.so  _int_free
63040     3.6941  mysqld        rec_get_offsets_func(unsigned char const*, dict_index_t const*, unsigned long*, unsigned long, mem_block_info_t**)
52794     3.0937  mysqld        mtr_commit(mtr_t*)

This points to some hot functions; quite as expected, we see some functions working with undo space here. In some cases it is also helpful to look at the call graph to understand where the functions are being called from, if that is not clear.
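
A sketch of collecting call-graph data with the same tools (the pid lookup is an assumption about your setup):

# record call-graph samples, then report caller/callee relations
operf --callgraph --pid $(pidof mysqld)
opreport --callgraph | less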

Now, in many cases it is hard to figure out what is “wrong” from an oprofile report without having a baseline. In this case we can look at the oprofile run for the sysbench workload alone:

root@ts140i:~# opreport -l | more
Using /root/oprofile_data/samples/ for samples directory.
warning: /no-vmlinux could not be found.
warning: [vdso] (tgid:8970 range:0x7fffba5fe000-0x7fffba5fffff) could not be found.
CPU: Intel Haswell microarchitecture, speed 3.201e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name          symbol name
1331386  28.4544  no-vmlinux          /no-vmlinux
603873   12.9060  mysqld              ut_delay(unsigned long)
158428    3.3859  mysqld              buf_page_get_gen(unsigned long, unsigned long, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned long, mtr_t*)
88172     1.8844  mysqld              my_timer_cycles
75948     1.6232  mysqld              srv_worker_thread
74350     1.5890  mysqld              MYSQLparse(THD*)
68075     1.4549  mysqld              mutex_spin_wait(void*, bool, char const*, unsigned long)
59321     1.2678  mysqld              que_run_threads(que_thr_t*)
59152     1.2642  mysqld              start_mutex_wait_v1
57495     1.2288  mysqld              rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned long, bool, bool)
53133     1.1356  mysqld              mtr_commit(mtr_t*)
44404     0.9490  mysqld              end_mutex_wait_v1
40226     0.8597  libpthread-2.19.so  pthread_mutex_lock
38889     0.8311  mysqld              log_block_calc_checksum_innodb(unsigned char const*)

As you can see, this report is very different in terms of top functions.

Final thoughts: I had 2 purposes with this blog post. First, I hope it will contain some helpful information for developers so this bug can be fixed at least in MySQL 5.7. It is quite a nasty one as it can come out of nowhere if the application unexpectedly develops hot rows. Second, I hope it will show some of the methods you can use to capture and analyze data about MySQL performance.

The post Looking deeper into InnoDB’s problem with many row versions appeared first on MySQL Performance Blog.

pmp-check-aws-rds.py Issues

Latest Forum Posts - January 18, 2015 - 5:14pm
Hey All

I am trying to configure percona-nagios-plugins, as I would like to monitor some RDS instances that I have, but I am running into issues.

I have followed the documentation, and I am getting a little frustrated as I keep getting errors or nothing gets returned.

For example, if I run ./pmp-check-aws-rds.py -l it returns "List of all DB instances []", and if I run ./pmp-check-aws-rds.py -i "Instance-name" -p it returns: No DB instance "Instance-name" found on your AWS account.

I have the correct AWS keys in the /etc/boto.cfg file, and python-boto is installed. Can someone help, please?
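
For comparison, a working /etc/boto.cfg normally looks like this sketch (the keys below are placeholders):

[Credentials]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX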

Lowering MySQL Memory Usage without Restart

Latest Forum Posts - January 16, 2015 - 7:30pm
Hello,
We are using Percona Server 5.5.35-rel33.0-611.precise with InnoDB on our production server. Recently MySQL memory usage went high and didn't drop from there. We are afraid it could result in an OOM kill sooner or later. We would like to know whether MySQL memory can be freed without doing a restart.

We tried RESET QUERY CACHE, but it didn't help.

We came across this post http://serverfault.com/questions/238...ear-the-buffer
where it says we can use

innodb_max_dirty_pages_pct=0;

to clear the buffer. We would like to know whether it's safe to use in a production environment (no data loss)?

Does it actually free the memory?

Does it require a MySQL restart?
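
(For reference, this variable is dynamic, so trying it does not need a restart; note that it only forces dirty pages to be flushed, it does not shrink the buffer pool itself. A sketch:)

SET GLOBAL innodb_max_dirty_pages_pct = 0;
-- restore the 5.5 default afterwards
SET GLOBAL innodb_max_dirty_pages_pct = 75;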

Please advise

thanks,
Santhosh

rsync went wrong, but server started clean, now what?

Latest Forum Posts - January 16, 2015 - 5:39am
Hello,

I was moving a mysql datadir from one VM to another during an OS upgrade and I believe my rsync command did not copy the ib_logfile0 and ib_logfile1 files (hereon referred to as "the ib_logfiles", but I still mean both). I use Xtrabackup for scheduled backups, but for this I wanted to get the entire datadir w/ binlogs in one pass so I'd stopped the server and rsync'ed.

The datadir in its home on the new VM is running fine and not putting anything in the error.log file, but the problem is that now when Xtrabackup runs, I get this during the apply-log step:

150115 23:00:10 innobackupex: Starting the apply-log operation

IMPORTANT: Please check that the apply-log run completes successfully.
At the end of a successful apply-log run innobackupex prints "completed OK!".

150115 23:00:10 innobackupex: Starting ibbackup with command: xtrabackup --defaults-file="/opt/maginot/backup/dbnode/dbdir/backup-my.cnf" --defaults-group="mysqld" --prepare --target-dir=/opt/maginot/backup/dbnode/dbdir

xtrabackup version 2.2.8 based on MySQL server 5.6.22 Linux (x86_64) (revision id: )
xtrabackup: cd to /opt/maginot/backup/dbnode/dbdir
xtrabackup: This target seems to be not prepared yet.
xtrabackup: xtrabackup_logfile detected: size=2097152, start_lsn=(607055496)
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup:   innodb_data_home_dir = ./
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 1
xtrabackup:   innodb_log_file_size = 2097152
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup:   innodb_data_home_dir = ./
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 1
xtrabackup:   innodb_log_file_size = 2097152
xtrabackup: Starting InnoDB instance for recovery.
xtrabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
InnoDB: Using atomics to ref count buffer pool pages
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Memory barrier is not used
InnoDB: Compressed tables use zlib 1.2.8
InnoDB: Using CPU crc32 instructions
InnoDB: Initializing buffer pool, size = 100.0M
InnoDB: Completed initialization of buffer pool
InnoDB: Highest supported file format is Barracuda.
InnoDB: 128 rollback segment(s) are active.
InnoDB: Waiting for purge to start
2015-01-15 23:00:10 7f035221e700 InnoDB: Error: page 1 log sequence number 607758329
InnoDB: is in the future! Current system log sequence number 607056925.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: for more information.
2015-01-15 23:00:10 7f035221e700 InnoDB: Error: page 97 log sequence number 609499062
InnoDB: is in the future! Current system log sequence number 607056925.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: for more information.
2015-01-15 23:00:10 7f035221e700 InnoDB: Error: page 39 log sequence number 607758329
InnoDB: is in the future! Current system log sequence number 607056925.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: for more information.
InnoDB: 5.6.22 started; log sequence number 607055496
2015-01-15 23:00:10 7f035fcb3780 InnoDB: Error: page 0 log sequence number 607859942
InnoDB: is in the future! Current system log sequence number 607056925.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: for more information.
- about 15 more like these -
2015-01-15 23:00:10 7f035fcb3780 InnoDB: Error: page 0 log sequence number 607475883
InnoDB: is in the future! Current system log sequence number 607056925.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: for more information.
[notice (again)]
If you use binary log and don't use any hack of group commit, the binary log position seems to be:
InnoDB: Last MySQL binlog file position 0 911392, file name mysql-bin.001157
xtrabackup: starting shutdown with innodb_fast_shutdown = 1
InnoDB: FTS optimize thread exiting.
InnoDB: Starting shutdown...
InnoDB: Shutdown completed; log sequence number 607059834
- starts second pass -

Trying to use the resulting backup on another test host results in similar InnoDB errors to those shown above, with the sessions' processes crashing out. The clients error out with "the MySQL server went away".

I have also tried taking a filesystem-level snapshot of the datadir and moving it to the test host. When I start the server after doing that, I get the same "sequence number is in the future" errors in error.log; presumably it is trying to do crash recovery and failing?

Below I've copy-pasted the commands I used to copy the datadir from the old VM to the new VM by way of an intermediary host I'll call opsVM:

root@oldVM:~# service mysql stop
-switch host-
root@opsVM:~# mkdir hmmfoo
root@opsVM:~# rsync -avz --rsync-path='sudo rsync' pordeploy@oldVM:/var/lib/mysql hmmfoo/ -e 'ssh -i /path/to/key'
root@opsVM:~# ls hmmfoo/
-switch host-
root@newVM:~# service mysql stop
-switch host-
root@opsVM:~# rsync -avz --rsync-path='sudo rsync' hmmfoo/mysql/ pordeploy@newVM:/var/lib/mysql/ -e 'ssh -i /path/to/key'
-switch host-
root@newVM:~# service mysql start

So my theory is that, because the destination server was stopped after the source server, its ib_logfiles had a newer modtime and the same size as the source's. Because I did not include the -c option to rsync...
http://linux.die.net/man/1/rsync
-c, --checksum    skip based on checksum, not mod-time & size

...rsync skipped the ib_logfiles. There is only one DB on this server, for an internal production app. The destination host, newVM, had been built using the same automated process as the oldVM was; so even though the newVM copy was empty of data before I rsync'ed, it had an identical my.cnf and an essentially identical DB schema and on-disk layout in /var/lib/mysql. That said, I find it highly unlikely that any other file in the datadir would have been the exact same size even if it wasn't entirely empty...
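If the theory holds, forcing checksum comparison should have carried the log files over. A hedged sketch of the corrected transfer (same hosts and key path as above) – the only change is adding -c so rsync compares file contents instead of size and mtime:

# force checksum comparison so files that happen to match on size+mtime
# (like these ib_logfiles) still get transferred
root@opsVM:~# rsync -avzc --rsync-path='sudo rsync' hmmfoo/mysql/ pordeploy@newVM:/var/lib/mysql/ -e 'ssh -i /path/to/key'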

So...
  1. I feel it is relatively safe to assume no other files were affected in this way.
  2. I'm not sure how to prove my theory (going to try and break another test VM the same way this morning).
  3. If the ib_logfiles were in fact skipped, I don't know if I can simply remove them with the server stopped and then restart it so it can create blank ones.
I hesitate on the 3rd point because this is now in production and, for now, working.
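On the third point, the usual approach – offered only as a hedged sketch, not something I've verified on this setup – is to move the log files aside after a clean shutdown rather than delete them, so they can be put back if the restart goes wrong:

root@newVM:~# service mysql stop
root@newVM:~# mkdir /root/ib_logfile_backup
root@newVM:~# mv /var/lib/mysql/ib_logfile* /root/ib_logfile_backup/
# on startup, InnoDB should recreate blank log files sized per my.cnf
root@newVM:~# service mysql start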

Thank you for reading, and thanks in advance if anyone has any insights here,
Mark

FOSDEM 2015: MySQL and Friends Community Dinner!

Latest MySQL Performance Blog posts - January 16, 2015 - 12:00am

As you may already be aware, FOSDEM and its wonderful MySQL and Friends track are once again descending upon the campus of the ULB in Brussels, Belgium. Logically, a traditional MySQL and Friends Community Dinner must follow! Signups for that are now available, so be sure to get your tickets before they run out.

As 2015 heralds the 20th anniversary of MySQL, it stands to reason we would love to mark the occasion with our biggest and best edition yet!

With the help of some wonderful sponsors, your hosts are looking forward to welcoming you for a night to remember. As such, brace yourselves for a feast prepared for the community, by the community. That’s right, we will be firing up the pizza ovens and serving you all the best pizzas Belgium has to offer.

The listed ticket price includes all-you-can-eat pizza (or until we run out… go ahead, we dare you), as well as a great selection of Belgian specialty beers and other beverages. We’ll make sure that vegetarians have plenty of options, but anyone intending to join us with special dietary requirements, please let us know ahead of time!

We’re looking forward to meeting you all again at FOSDEM 2015 and the Community Dinner. See you then!

From the Party-Squad – Lizz, Dim0 and Kenny

Sponsors

Once again, we want to thank our generous sponsors, whose help makes this affordable at such a great price.

On that subject, we would like to point out that we will not be able to get an exact fix on the full cost until we have a full list of attendees, so with that in mind, the price we’ve set is based on the assumption of a certain number of people attending. If the number of people ends up lower than our target, any remaining funds will be donated to the FOSDEM organization. FOSDEM, by the way, is a free, developer-oriented annual event attracting more than 5,000 geeks from around the world.

Community Sponsors:


Other Sponsors:

ICAB Brussels – Business and Technology Incubator

Wondering how to get there from FOSDEM?

The venue itself is located very close to the VUB. You can find the route to get there right here.

The total distance from the ULB campus is about 2.3km, so you could walk there, but it might be more comfortable to take the tram.

– Tram 25 departs from the “ULB” stop regularly, and will take you up to “Hansen-Soulie”, where you need to get out, and walk the remaining 200m. The tram ride takes about 11 minutes.

The post FOSDEM 2015: MySQL and Friends Community Dinner! appeared first on MySQL Performance Blog.

Auto restart after power down

Latest Forum Posts - January 15, 2015 - 8:15am
I know this question has been asked many times, but I have not found the answer.
According to http://galeracluster.com/documentati...ngcluster.html
a Galera cluster cannot survive a power-down and requires a manual restart (IDENTIFYING THE MOST ADVANCED NODE).

Is this true for Percona XtraDB Cluster? Is there any solution? What happens when all 3 nodes start automatically after a power-down?
It is not a problem for me to lose information written in the last seconds, or even minutes. More important is a fully automated setup. Thanks!
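A hedged pointer rather than a definitive answer: Galera 3 (the library shipped with PXC 5.6) can persist the primary-component state and restore it automatically through the pc.recovery provider option, which targets exactly this full-outage case. A minimal my.cnf sketch – the gcache size is just a placeholder value:

[mysqld]
# pc.recovery saves the primary component state to gvwstate.dat and
# re-forms the cluster once all members come back after an outage
wsrep_provider_options="pc.recovery=TRUE;gcache.size=512M"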

Hyper-threading – how does it double CPU throughput?

Latest MySQL Performance Blog posts - January 15, 2015 - 12:00am

The other day a customer asked me to do capacity planning for their web server farm. I was looking at the CPU graph for one of the web servers that had Hyper-threading switched ON and thought to myself: “This must be quite a misleading graph – it shows 30% CPU usage. It can’t really be that this server can handle 3 times more work?”

Or can it?

I decided to do what we usually do in such case – I decided to test it and find out the truth. Turns out – there’s more to it than meets the eye.

How Intel Hyper-Threading works

Before we get to my benchmark results, let’s talk a little bit about hyper-threading. According to Intel, Intel® Hyper-Threading Technology (Intel® HT Technology) uses processor resources more efficiently, enabling multiple threads to run on each core. As a performance feature, Intel HT Technology also increases processor throughput, improving overall performance on threaded software.

Sounds almost like magic, but in reality (and correct me if I’m wrong), what HT essentially does is this: by presenting one CPU core as two CPUs (threads, rather), it allows you to offload task scheduling from the kernel to the CPU.

So for example if you just had one physical CPU core and two tasks with the same priority running in parallel, the kernel would have to constantly switch the context so that both tasks get a fair amount of CPU time. If, however, you have the CPU presented to you as two CPUs, the kernel can give each task a CPU and take a vacation.

On the hardware level, it will still be one CPU doing the same amount of work, but there may be some optimization to how that work is executed.

My hypothesis

Here’s the problem that was driving me nuts: if HT does NOT actually give you twice the power, and yet the system reports statistics for each CPU thread separately, then at 50% CPU utilization (as per mpstat on Linux) the CPU should be maxed out.
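For reference, the per-thread view I mean is what mpstat reports (a sketch; on a 12-core HT box this prints 24 rows, one per logical CPU):

# one row per logical CPU, refreshed every second
mpstat -P ALL 1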

So if I tried to model the scalability of that web server – a 12-core system with HT enabled (represented as 24 CPUs on a system), assuming perfect linear scalability, here’s how it should look:

Throughput (requests per second)
  |
9 |          ,+++++++++++++++
  |         +
  |        +
6 |       +
  |      +
  |     +
3 |    +
  |   +
  |  +
0 '-----+----+----+----+----
  1     6    12   18   24

In the example above, a single CPU thread could process a request in 1.2s, which is why you see it max out at 9-10 requests/sec (12/1.2).

From the user perspective, this limitation would hit VERY unexpectedly, as one would expect 50% utilization to be… well, exactly that – 50% utilization.

In fact, the CPU utilization graph would look even more frustrating. For example if I were increasing the number of parallel requests linearly from 1 to 24, here’s how that relationship should look:

CPU utilization:
100% |            ++++++++++++++
     |            .
     |            .
     |            .
     |            .
 50% |            .
     |          +
     |       +
     |    +
     | +
  0% '----+----+----+----+----
     0    6    12   18   24

Hence CPU utilization would skyrocket right at 12 cores from 50% to 100%, because in fact the system CPU would be 100% utilized at this point.

What happens in reality

Naturally, I decided to run a benchmark and see if my assumptions were correct. The benchmark was pretty basic – I wrote a CPU-intensive php script that took 1.2s to execute on the CPU I was testing, and bashed it over http (apache) with ab at increasing concurrency. Here’s the result:

Raw data can be found here.
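The script and exact ab invocation weren’t included with the post; here is a minimal sketch of the kind of harness described, where cpuburn.php and the host name are hypothetical stand-ins:

# ramp up ab concurrency; each request burns ~1.2s of CPU in cpuburn.php (hypothetical)
for c in 1 2 4 8 12 16 24 32 48; do
    ab -n $((c * 20)) -c $c http://webserver/cpuburn.php
done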

If this does not blow your mind, please go over the facts again and then look back at the graph.

Still not sure why I find this interesting? Let me explain. If you look carefully, initially – at concurrency of 1 through 8 – it scales perfectly. So if you only had data for threads 1-8 (and you knew processes don’t incur coherency delays due to shared data structures), you’d probably predict that it would scale linearly until it reaches ~10 requests/sec at 12 cores, at which point adding more parallel requests would not have any benefit as the CPU would be saturated.

What happens in reality, though, is that past 8 parallel threads (hence, past 33% virtual CPU utilization), execution time starts to increase and maximum performance is only achieved at 24-32 concurrent requests. It looks like at the 33% mark there’s some kind of “throttling” happening.

In other words, to avoid a sharp performance hit past 50% CPU utilization, at 33% virtual thread utilization (i.e. 66% actual CPU utilization), the system gives the illusion of a performance limit – execution slows down so that the system only reaches the saturation point at 24 threads (visually, at 100% CPU utilization).

Naturally then the question is – does it still make sense to run hyper-threading on a multi-core system? I see at least two drawbacks:

1. You don’t see the real picture of how utilized your system really is – if the CPU graph shows 30% utilization, your system may well be 60% utilized already.
2. Past 60% physical utilization, execution speed of your requests will be throttled intentionally in order to provide higher system throughput.

So if you are optimizing for higher throughput – that may be fine. But if you are optimizing for response time, then you may consider running with HT turned off.

Did I miss something?

The post Hyper-threading – how does it double CPU throughput? appeared first on MySQL Performance Blog.

pt-query-digest stops digesting log when it encounters DELETE statement

Latest Forum Posts - January 14, 2015 - 4:39pm
I'm trying to digest some slow query logs, but pt-query-digest appears to stop working when it encounters a DELETE statement. The tool still takes a long time, so it would appear that it digests the entire file but fails to report the DELETE statements. It will either give me only the DELETE statement and nothing else, or only the other statements without the DELETE statement.

Here is the DELETE statement from the slow query log file:

# User@Host: abc[abc] @ [##.#.#.#]
# Query_time: 0.013774 Lock_time: 0.000049 Rows_sent: 0 Rows_examined: 16094
use my_db;
SET timestamp=1420780225;
DELETE FROM sessions WHERE date_modified < '2015-01-07 22:10:25';

When I execute pt-query-digest I do it like this: /opt/percona-toolkit-2.2.12/bin/pt-query-digest /opt/slow_query.log > /opt/slow_query_report.txt

Anyone know why the tool is failing to put the DELETE statements into the report?
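One way to narrow this down (a sketch using pt-query-digest's documented --filter option; adjust paths to your install) is to digest only the DELETE events and see whether the tool parses them at all:

# keep only events whose statement text starts with DELETE
/opt/percona-toolkit-2.2.12/bin/pt-query-digest \
    --filter '$event->{arg} =~ m/^delete/i' \
    /opt/slow_query.log > /opt/delete_only_report.txt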

Percona XtraBackup 2.2.8 is now available

Latest MySQL Performance Blog posts - January 14, 2015 - 6:36am

Percona is glad to announce the release of Percona XtraBackup 2.2.8 on January 14, 2015. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

New Features:

  • Percona XtraBackup has been rebased on MySQL 5.6.22.

Bugs Fixed:

  • Incremental backups would fail if the number of undo tablespaces (innodb_undo_tablespaces) was more than 1. This was caused by innobackupex removing the undo tablespaces during the prepare phase. Bug fixed #1363234.
  • Fixed multiple memory leaks detected by AddressSanitizer. Bug fixed #1395143.
  • innobackupex could fail when preparing backup that was taken from Percona Server 5.5 with log files (innodb_log_file_size) bigger than 4G. The root cause was that the last checkpoint LSN offset in log group is stored at different offsets in ibdata1 for Percona Server 5.5 and MySQL 5.6 when the total size of log files is greater than 4G. Bug fixed #1403237.
  • Percona XtraBackup out-of-source builds failed. Bug fixed #1402450.

Release notes with all the bugfixes for Percona XtraBackup 2.2.8 are available in our online documentation. Bugs can be reported on the launchpad bug tracker. Percona XtraBackup is an open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases.

The post Percona XtraBackup 2.2.8 is now available appeared first on MySQL Performance Blog.

ALTER TABLE locks up cluster

Latest Forum Posts - January 14, 2015 - 4:05am
I have a 5-node PXC 5.5.39 on CentOS 6, in a single-write/multiple-read configuration with wsrep_sync_wait=1 for synchronous replication.

Each time I run an ALTER TABLE the entire cluster "locks up": it refuses new connections (even though there are plenty more available on all nodes) and drops existing connections until the operation is finished. It doesn't matter whether it's a long or short operation, or which database or table I run it on; it always does this, even when I add a table index that takes only 0.5 sec.

Before enabling wsrep_sync_wait, the cluster behaved normally: even in the case of long ALTERs it accepted new connections, and the queries running on the altered table just waited in a locked state for the ALTER to finish and then resumed.

Some of the relevant wsrep and innodb settings:
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
default-storage-engine=InnoDB
datadir=/var/lib/mysql
tmpdir=/var/tmp/
socket=/var/lib/mysql/mysql.sock
user=mysql
skip-name-resolve
event_scheduler=OFF

# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
old_passwords=0

# connections
max_connections=700
connect_timeout=10
wait_timeout=300
interactive_timeout=300
lock_wait_timeout=60

# table caches
table_open_cache=2048
table_definition_cache=2048
tmp_table_size=128M
max_heap_table_size=128M
thread_cache_size=256
max_allowed_packet=32M
net_buffer_length=8K

# operating buffers
key_buffer_size=128M
sort_buffer_size=4M
myisam_sort_buffer_size=4M
join_buffer_size=8M
read_buffer_size=4M
read_rnd_buffer_size=8M
query_cache_type=0

server-id=2
wsrep_cluster_name=xxx
wsrep_node_address=10.0.0.1
wsrep_cluster_address=gcomm://10.0.0.1,10.0.0.3,10.0.0.5,10.0.0.6,10.0.0.7
wsrep_provider_options='gcache.size=512M'
wsrep_node_name=server2
wsrep_provider='/usr/lib64/libgalera_smm.so'
wsrep_slave_threads=1
wsrep_auto_increment_control=1
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=xxx:xxx
wsrep_retry_autocommit=2
wsrep_forced_binlog_format=ROW
wsrep_replicate_myisam=0
wsrep_log_conflicts=1
wsrep_sync_wait=1

log-bin=/var/lib/mysql/binary-log
max_binlog_size=256M
sync_binlog=0
expire_logs_days=5
sync_master_info=0
log_bin_trust_function_creators=1
binlog_format=ROW
log_slave_updates=1

innodb_data_home_dir=/var/lib/mysql/
innodb_data_file_path=ibdata1:10M:autoextend
innodb_autoextend_increment=10
innodb_log_group_home_dir=/var/lib/mysql/
innodb_thread_concurrency=0
innodb_checksums
innodb_flush_log_at_trx_commit=2
innodb_fast_shutdown=0
innodb_mirrored_log_groups=1
innodb_log_files_in_group=3
innodb_log_file_size=128M
innodb_log_buffer_size=8M
innodb_buffer_pool_instances=8
innodb_buffer_pool_size=8G
innodb_additional_mem_pool_size=64M
innodb_lock_wait_timeout=60
innodb_flush_method=O_DIRECT
innodb_file_per_table
innodb_stats_on_metadata=0
innodb_file_format=barracuda
innodb_stats_update_need_lock=0
innodb_purge_threads=1
default_week_format = 1

Do you have any ideas what could be causing this?

Thanks.
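A hedged note for anyone reading along: DDL in Galera runs under Total Order Isolation (TOI) by default, which pauses replication cluster-wide for the duration of the statement, and wsrep_sync_wait=1 makes every read wait on that pause. One thing worth testing is Rolling Schema Upgrade mode, applied node by node – a sketch only, since RSU requires the schema change to be backward-compatible, and the table/index names below are illustrative:

-- on one node at a time, ideally removed from the load balancer first
SET GLOBAL wsrep_OSU_method='RSU';
ALTER TABLE mytable ADD INDEX idx_col (col);
SET GLOBAL wsrep_OSU_method='TOI';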

MySQL performance implications of InnoDB isolation modes

Latest MySQL Performance Blog posts - January 14, 2015 - 3:00am

Over the past few months I’ve written a couple of posts about the dangerous debt of InnoDB transactional history and about the fact that MVCC can be the cause of severe MySQL performance issues. In this post I will cover a related topic – InnoDB Transaction Isolation Modes, their relationship with MVCC (multi-version concurrency control) and how they impact MySQL performance.

The MySQL Manual provides a decent description of transaction isolation modes supported by MySQL – I will not repeat it here but rather focus on performance implications.

SERIALIZABLE – This is the strongest isolation mode, to the point that it essentially defeats multi-versioning: all SELECTs become locking, causing significant overhead both in terms of lock management (setting locks is expensive) and the concurrency you can get. This mode is only used in very special circumstances among MySQL applications.

REPEATABLE READ – This is the default isolation level and generally it is quite nice and convenient for the application. It sees all the data as of the time of the first read (assuming standard non-locking reads). This however comes at a high cost – InnoDB needs to maintain transaction history back to the point of the transaction start, which can be very expensive. The worst-case scenario is applications with a high level of updates and hot rows – you really do not want InnoDB to deal with rows which have hundreds of thousands of versions.

In terms of performance impact, both reads and writes can be affected. For SELECT queries, traversing multiple row versions is very expensive, but so it is for updates, especially as version management seems to cause severe contention issues in MySQL 5.6.

Here is an example: I ran sysbench with a completely in-memory data set, then started a transaction and ran a full-table-scan query a couple of times while keeping the transaction open:

sysbench --num-threads=64 --report-interval=10 --max-time=0 --max-requests=0 --rand-type=pareto --oltp-table-size=80000000 --mysql-user=root --mysql-password= --mysql-db=sbinnodb --test=/usr/share/doc/sysbench/tests/db/update_index.lua run

As you can see, the write throughput drops drastically and stays low the whole time the transaction is open, not only while the query is running. This is perhaps the worst-case scenario I could find, which happens when you have a select outside of a transaction followed by a long transaction in REPEATABLE READ isolation mode. Though you can see regression in other cases, too.

Here is the set of queries I used if someone wants to repeat the test:

select avg(length(c)) from sbtest1;
begin;
select avg(length(c)) from sbtest1;
select sleep(300);
commit;

Not only is Repeatable Read the default isolation level, it is also used for logical backups with InnoDB – think mydumper or mysqldump --single-transaction.
These results show that such backup methods not only can’t be used with large data sets due to long recovery times, but also can’t be used in some high-write environments due to their performance impact.

READ COMMITTED mode is similar to REPEATABLE READ, the essential difference being that versions are kept not to the start of the first read in the transaction, but to the start of the current statement. As such, using this mode allows InnoDB to maintain far fewer versions, especially if you do not have very long-running statements. If you do have some long-running selects – such as reporting queries – the performance impact can still be severe.

In general I think good practice is to use READ COMMITTED isolation mode as the default and change to REPEATABLE READ for those applications or transactions which require it.
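For instance (a minimal sketch – the level can also be set in my.cnf with transaction-isolation):

-- server-wide default for new connections
SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- opt back in per session where a consistent snapshot matters
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;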

READ UNCOMMITTED – I think this is the least understood isolation mode (not a surprise, as it only has 2 lines of documentation about it, which only describe it from the logical point of view). If you’re using this isolation mode you will see all the changes done in the database as they happen, even those by transactions which have not yet been committed. One nice use case for this isolation mode is that you can “watch” large-scale UPDATE statements as they happen, with dirty reads showing which rows have already been changed and which have not.

So this statement shows changes that have not been committed yet and might not ever be committed if the transaction making them runs into an error – so this mode should be used with extreme care. There are a number of cases, though, when we do not need 100% accurate data, and in those cases this mode becomes very handy.
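For example (a sketch with illustrative table and column values), while one session runs a bulk UPDATE, a second session can poll its progress:

-- session 2: dirty-read the rows session 1 has already touched
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT COUNT(*) FROM sbtest1 WHERE pad = 'new-value';  -- includes uncommitted changes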

So how does READ UNCOMMITTED behave from a performance point of view? In theory, InnoDB could purge row versions even if they were created after the start of the statement in READ UNCOMMITTED mode. In practice, due to a bug or some intricate implementation detail, it does not do that – row versions will still be kept to the start of the statement. So if you run a very long-running SELECT in READ UNCOMMITTED mode you will get a large number of row versions created, just as if you were using READ COMMITTED. No win here.

There is an important win on the SELECT side – READ UNCOMMITTED isolation mode means InnoDB never needs to go and examine older row versions – the latest row version is always the correct one, which can bring a dramatic performance improvement, especially if the undo space has spilled over to disk, where finding old row versions can cause a lot of IO.

Perhaps the best illustration I discovered was the query select avg(k) from sbtest1; run in parallel with the same update-heavy workload stated above. In READ COMMITTED isolation mode it never completes – I assume because new index entries are inserted faster than they are scanned – while in READ UNCOMMITTED isolation mode it completes in a minute or so.

Final Thoughts: Using InnoDB isolation modes correctly can help your application get the best possible performance. Your mileage may vary, and in some cases you will see no difference; in others it will be absolutely dramatic. There also seems to be a lot of work to be done in relation to InnoDB performance with long version histories. I hope it will be tackled in a future MySQL version.

The post MySQL performance implications of InnoDB isolation modes appeared first on MySQL Performance Blog.
