
Latest MySQL Performance Blog posts


Percona's MySQL & InnoDB performance and scalability blog

How to avoid hash collisions when using MySQL’s CRC32 function

October 13, 2014 - 7:43am

Percona Toolkit’s pt-table-checksum performs an online replication consistency check by executing checksum queries on the master, which produce different results on replicas that are inconsistent with the master. Its companion tool, pt-table-sync, synchronizes data efficiently between MySQL tables.

By default the tools use the CRC32 checksum function. Other good choices include MD5 and SHA1. If you have installed the FNV_64 user-defined function, pt-table-sync will detect it and prefer to use it, because it is much faster than the built-ins. You can also use MURMUR_HASH if you’ve installed that user-defined function. Both of these are distributed with Maatkit. For details please see the tools’ documentation.

Below are test cases similar to what you might have encountered. A table checksum lets us confirm that two tables are identical, which is useful for verifying that a slave server is in sync with its master. The following test cases with pt-table-checksum and pt-table-sync will help you use the tools more accurately.

For example, in a master-slave setup we have a table with a primary key on column “a” and a unique key on column “b”. Here the master and slave tables are not in sync: the tables share two identical rows and differ in two rows. The pt-table-checksum tool should be able to identify the difference between master and slave, and pt-table-sync should in this case sync the tables with two REPLACE queries.

Master          Slave
+-----+-----+   +-----+-----+
|  a  |  b  |   |  a  |  b  |
+-----+-----+   +-----+-----+
|  2  |  1  |   |  2  |  1  |
|  1  |  2  |   |  1  |  2  |
|  4  |  3  |   |  3  |  3  |
|  3  |  4  |   |  4  |  4  |
+-----+-----+   +-----+-----+

Case 1: Non-cryptographic hash function (CRC32) and the hash collision.

The source and target tables differ in two rows, so intuitively the tools should identify the difference. But the scenarios below show how the tools can be used wrongly, how to avoid that, and how to make things more consistent and reliable when using the tools in production.

By default the tools use CRC32 checksums, which are prone to hash collisions. In the case below, the non-cryptographic CRC32 function is unable to identify the two differing rows: it generates the same aggregate value even though the tables contain distinct values.

CREATE TABLE `t1` (
  `a` int(11) NOT NULL,
  `b` int(11) NOT NULL,
  PRIMARY KEY (`a`),
  UNIQUE KEY `b` (`b`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Master          Slave
+-----+-----+   +-----+-----+
|  a  |  b  |   |  a  |  b  |
+-----+-----+   +-----+-----+
|  2  |  1  |   |  2  |  1  |
|  1  |  2  |   |  1  |  2  |
|  4  |  3  |   |  3  |  3  |
|  3  |  4  |   |  4  |  4  |
+-----+-----+   +-----+-----+

Master:
[root@localhost mysql]# pt-table-checksum --replicate=percona.checksum --create-replicate-table --databases=db1 --tables=t1 localhost --user=root --password=*** --no-check-binlog-format
TS             ERRORS DIFFS ROWS CHUNKS SKIPPED TIME  TABLE
09-17T00:59:45      0     0    4      1       0 1.081 db1.t1

Slave:
[root@localhost bin]# ./pt-table-sync --print --execute --replicate=percona.checksum --tables db1.t1 --user=root --password=*** --verbose --sync-to-master 192.**.**.**
# Syncing via replication h=192.**.**.**,p=...,u=root
# DELETE REPLACE INSERT UPDATE ALGORITHM START END EXIT DATABASE.TABLE

Narrowed down to BIT_XOR:

Master:
mysql> SELECT BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) FROM `db1`.`t1`;
+------------------------------------------------------------+
| BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) |
+------------------------------------------------------------+
|                                                    6581445 |
+------------------------------------------------------------+
1 row in set (0.00 sec)

Slave:
mysql> SELECT BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) FROM `db1`.`t1`;
+------------------------------------------------------------+
| BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) |
+------------------------------------------------------------+
|                                                    6581445 |
+------------------------------------------------------------+
1 row in set (0.16 sec)
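It is worth a one-line demonstration of why the checksums collide. CRC32 is a linear function over GF(2): for equal-length inputs whose bytes XOR to zero, the CRC32 values also XOR to zero, and the two differing row pairs above ('4#3'/'3#4' on the master, '3#3'/'4#4' on the slave) are exactly such a case. A minimal sketch you can run on any MySQL server:

mysql> SELECT CRC32('4#3') ^ CRC32('3#4') AS master_pair_xor,
    ->        CRC32('3#3') ^ CRC32('4#4') AS slave_pair_xor;

Both columns return the same number, so once the rows are folded together with BIT_XOR the difference vanishes; the identical rows (2,1) and (1,2) contribute equally on both sides anyway.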

Case 2: Since the tools were not able to identify the difference, let us add a new row to the slave and check whether the tools can now identify the distinct values. I am adding a new row (5,5) to the slave.

mysql> insert into db1.t1 values (5,5);
Query OK, 1 row affected (0.05 sec)

Master          Slave
+-----+-----+   +-----+-----+
|  a  |  b  |   |  a  |  b  |
+-----+-----+   +-----+-----+
|  2  |  1  |   |  2  |  1  |
|  1  |  2  |   |  1  |  2  |
|  4  |  3  |   |  3  |  3  |
|  3  |  4  |   |  4  |  4  |
+-----+-----+   |  5  |  5  |
                +-----+-----+

[root@localhost mysql]# pt-table-checksum --replicate=percona.checksum --create-replicate-table --databases=db1 --tables=t1 localhost --user=root --password=*** --no-check-binlog-format
TS             ERRORS DIFFS ROWS CHUNKS SKIPPED TIME  TABLE
09-17T01:01:13      0     1    4      1       0 1.054 db1.t1

[root@localhost bin]# ./pt-table-sync --print --execute --replicate=percona.checksum --tables db1.t1 --user=root --password=*** --verbose --sync-to-master 192.**.**.**
# Syncing via replication h=192.**.**.**,p=...,u=root
# DELETE REPLACE INSERT UPDATE ALGORITHM START END EXIT DATABASE.TABLE
DELETE FROM `db1`.`t1` WHERE `a`='5' LIMIT 1
  /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.**.**.**,p=...,u=root
    dst_db:db1 dst_tbl:t1 dst_dsn:h=192.**.**.**,p=...,u=root lock:1 transaction:1
    changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5205
    user:root host:localhost.localdomain*/;
REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('3', '4')
  /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.**.**.**,p=...,u=root
    dst_db:db1 dst_tbl:t1 dst_dsn:h=192.**.**.**,p=...,u=root lock:1 transaction:1
    changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5205
    user:root host:localhost.localdomain*/;
REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('4', '3')
  /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.**.**.**,p=...,u=root
    dst_db:db1 dst_tbl:t1 dst_dsn:h=192.**.**.**,p=...,u=root lock:1 transaction:1
    changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5205
    user:root host:localhost.localdomain*/;
#      1       2      0      0 Chunk     01:01:43 01:01:43 2    db1.t1

The tools are now able to identify both the newly added row on the slave and the two other rows that differ.

Case 3: The advantage of cryptographic hash functions (e.g., MD5)

Now let us restore the tables to their state in Case 1 and ask the tools to use a cryptographic hash function (MD5) instead of the usual non-cryptographic one. The default CRC32 function, with its simple mathematical structure, provides no security and is prone to hash collisions, while MD5 provides a better level of integrity. So let us try with --function=md5 and see the result.

Master          Slave
+-----+-----+   +-----+-----+
|  a  |  b  |   |  a  |  b  |
+-----+-----+   +-----+-----+
|  2  |  1  |   |  2  |  1  |
|  1  |  2  |   |  1  |  2  |
|  4  |  3  |   |  3  |  3  |
|  3  |  4  |   |  4  |  4  |
+-----+-----+   +-----+-----+

Narrowed down to BIT_XOR:

Master:
mysql> SELECT 'test', 't2', '1', NULL, NULL, NULL, COUNT(*) AS cnt,
    ->        COALESCE(LOWER(CONCAT(
    ->          LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc, 1, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'),
    ->          LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc := md5(CONCAT_WS('#', `a`, `b`)), 17, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'))), 0) AS crc
    -> FROM `db1`.`t1`;
+------+----+---+------+------+------+-----+----------------------------------+
| test | t2 | 1 | NULL | NULL | NULL | cnt | crc                              |
+------+----+---+------+------+------+-----+----------------------------------+
| test | t2 | 1 | NULL | NULL | NULL |   4 | 000000000000000063f65b71e539df48 |
+------+----+---+------+------+------+-----+----------------------------------+
1 row in set (0.00 sec)

Slave:
mysql> SELECT 'test', 't2', '1', NULL, NULL, NULL, COUNT(*) AS cnt,
    ->        COALESCE(LOWER(CONCAT(
    ->          LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc, 1, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'),
    ->          LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc := md5(CONCAT_WS('#', `a`, `b`)), 17, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'))), 0) AS crc
    -> FROM `db1`.`t1`;
+------+----+---+------+------+------+-----+----------------------------------+
| test | t2 | 1 | NULL | NULL | NULL | cnt | crc                              |
+------+----+---+------+------+------+-----+----------------------------------+
| test | t2 | 1 | NULL | NULL | NULL |   4 | 0000000000000000df024e1a4a32c31f |
+------+----+---+------+------+------+-----+----------------------------------+
1 row in set (0.00 sec)

[root@localhost mysql]# pt-table-checksum --replicate=percona.checksum --create-replicate-table --function=md5 --databases=db1 --tables=t1 localhost --user=root --password=*** --no-check-binlog-format
TS             ERRORS DIFFS ROWS CHUNKS SKIPPED TIME  TABLE
09-23T23:57:52      0     1   12      1       0 0.292 db1.t1

[root@localhost bin]# ./pt-table-sync --print --execute --replicate=percona.checksum --tables db1.t1 --user=root --password=amma --verbose --function=md5 --sync-to-master 192.***.***.***
# Syncing via replication h=192.168.56.102,p=...,u=root
# DELETE REPLACE INSERT UPDATE ALGORITHM START END EXIT DATABASE.TABLE
REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('3', '4')
  /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.168.56.101,p=...,u=root
    dst_db:db1 dst_tbl:t1 dst_dsn:h=192.***.***.***,p=...,u=root lock:1 transaction:1
    changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5608
    user:root host:localhost.localdomain*/;
REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('4', '3')
  /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.168.56.101,p=...,u=root
    dst_db:db1 dst_tbl:t1 dst_dsn:h=192.***.**.***,p=...,u=root lock:1 transaction:1
    changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5608
    user:root host:localhost.localdomain*/;
#      0       2      0      0 Chunk     04:46:04 04:46:04 2    db1.t1

Master          Slave
+-----+-----+   +-----+-----+
|  a  |  b  |   |  a  |  b  |
+-----+-----+   +-----+-----+
|  2  |  1  |   |  2  |  1  |
|  1  |  2  |   |  1  |  2  |
|  4  |  3  |   |  4  |  3  |
|  3  |  4  |   |  3  |  4  |
+-----+-----+   +-----+-----+

MD5 did the trick and solved the problem. As the BIT_XOR results above show, the function identifies the distinct values in the tables and produces different checksum values. MD5 (Message-Digest algorithm 5) is a well-known cryptographic hash function with a 128-bit hash value. It is widely used in security-related applications and frequently used for integrity checks, but MD5() and SHA1() are very CPU-intensive, so expect slower checksumming (factor this in when setting chunk-time).



MySQL compression: Compressed and Uncompressed data size

October 10, 2014 - 7:34am

MySQL has the information_schema.tables table, which contains information such as “data_length” and “avg_row_length.” The documentation on this table, however, is quite poor: it assumes those fields are self-explanatory, and they are not when it comes to tables that employ compression. This is where inconsistency is born. Let’s take a look at the same table, containing some highly compressible data, using different storage engines that support MySQL compression:

TokuDB:

mysql> select * from information_schema.tables where table_schema='test' \G
*************************** 1. row ***************************
  TABLE_CATALOG: def
   TABLE_SCHEMA: test
     TABLE_NAME: comp
     TABLE_TYPE: BASE TABLE
         ENGINE: TokuDB
        VERSION: 10
     ROW_FORMAT: tokudb_zlib
     TABLE_ROWS: 40960
 AVG_ROW_LENGTH: 10003
    DATA_LENGTH: 409722880
MAX_DATA_LENGTH: 9223372036854775807
   INDEX_LENGTH: 0
      DATA_FREE: 421888
 AUTO_INCREMENT: NULL
    CREATE_TIME: 2014-10-10 07:59:05
    UPDATE_TIME: 2014-10-10 08:01:20
     CHECK_TIME: NULL
TABLE_COLLATION: latin1_swedish_ci
       CHECKSUM: NULL
 CREATE_OPTIONS:
  TABLE_COMMENT:
1 row in set (0.00 sec)

Archive:

mysql> select * from information_schema.tables where table_schema='test' \G
*************************** 1. row ***************************
  TABLE_CATALOG: def
   TABLE_SCHEMA: test
     TABLE_NAME: comp
     TABLE_TYPE: BASE TABLE
         ENGINE: ARCHIVE
        VERSION: 10
     ROW_FORMAT: Compressed
     TABLE_ROWS: 40960
 AVG_ROW_LENGTH: 12
    DATA_LENGTH: 501651
MAX_DATA_LENGTH: 0
   INDEX_LENGTH: 0
      DATA_FREE: 0
 AUTO_INCREMENT: NULL
    CREATE_TIME: NULL
    UPDATE_TIME: 2014-10-10 08:08:24
     CHECK_TIME: NULL
TABLE_COLLATION: latin1_swedish_ci
       CHECKSUM: NULL
 CREATE_OPTIONS:
  TABLE_COMMENT:
1 row in set (0.01 sec)

InnoDB:

mysql> select * from information_schema.tables where table_schema='test' \G
*************************** 1. row ***************************
  TABLE_CATALOG: def
   TABLE_SCHEMA: test
     TABLE_NAME: comp
     TABLE_TYPE: BASE TABLE
         ENGINE: InnoDB
        VERSION: 10
     ROW_FORMAT: Compressed
     TABLE_ROWS: 40660
 AVG_ROW_LENGTH: 4168
    DATA_LENGTH: 169480192
MAX_DATA_LENGTH: 0
   INDEX_LENGTH: 0
      DATA_FREE: 1572864
 AUTO_INCREMENT: NULL
    CREATE_TIME: 2014-10-10 08:33:22
    UPDATE_TIME: NULL
     CHECK_TIME: NULL
TABLE_COLLATION: latin1_swedish_ci
       CHECKSUM: NULL
 CREATE_OPTIONS: row_format=COMPRESSED KEY_BLOCK_SIZE=4
  TABLE_COMMENT:
1 row in set (0.00 sec)

From this we can see that ARCHIVE and InnoDB report the COMPRESSED values for DATA_LENGTH and AVG_ROW_LENGTH, while TokuDB (as of TokuDB 7.5.1, shipped with Percona Server 5.6.21-69.0) reports the uncompressed ones.

The problem here is not only the lack of consistency, but that we really need to know BOTH numbers. We often need the uncompressed value to understand how much data there really is in the table, and the compressed value to know how much space it takes on disk. Comparing the two also helps us understand the compression ratio (or, on the contrary, the expansion due to storage overhead and indexes).

Looking at the available INFORMATION_SCHEMA tables, I can’t find any way to determine how much uncompressed data is stored in an InnoDB (or ARCHIVE) table. The simple trick I can use is running a query along these lines: SELECT SUM(LENGTH(col1)+LENGTH(col2)…) FROM t. This slightly overestimates the length by converting numbers and dates to strings, but it is good enough for most purposes.
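Writing out every column by hand gets tedious for wide tables, so the statement can be generated from INFORMATION_SCHEMA. Here is a hedged sketch (the 'test'.'comp' names simply follow the example above; IFNULL keeps NULL-able columns from nulling the whole sum, and for very wide tables you may need to raise group_concat_max_len first):

mysql> SELECT CONCAT('SELECT SUM(',
    ->        GROUP_CONCAT('IFNULL(LENGTH(`', COLUMN_NAME, '`),0)'
    ->                     ORDER BY ORDINAL_POSITION SEPARATOR '+'),
    ->        ') AS uncompressed_bytes FROM `', TABLE_SCHEMA, '`.`', TABLE_NAME, '`;') AS q
    -> FROM information_schema.COLUMNS
    -> WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 'comp'
    -> GROUP BY TABLE_SCHEMA, TABLE_NAME;

Running the generated statement then gives the approximate uncompressed byte count for the table.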

TokuDB, though, while providing uncompressed information in the INFORMATION_SCHEMA TABLES table, lets you get the real on-disk storage information from its own information schema tables:

mysql> select * from information_schema.TokuDB_fractal_tree_info where table_schema='test' and table_name='comp' limit 10 \G
*************************** 1. row ***************************
        dictionary_name: ./test/comp-main
     internal_file_name: ./_test_sql_147e_292e_main_2c20c08_1_1b_B_0.tokudb
bt_num_blocks_allocated: 125
   bt_num_blocks_in_use: 125
      bt_size_allocated: 1880088
         bt_size_in_use: 1502232
           table_schema: test
             table_name: comp
  table_dictionary_name: main
*************************** 2. row ***************************
        dictionary_name: ./test/comp-status
     internal_file_name: ./_test_sql_147e_292e_status_2c20bdd_1_1b.tokudb
bt_num_blocks_allocated: 4
   bt_num_blocks_in_use: 4
      bt_size_allocated: 8704
         bt_size_in_use: 512
           table_schema: test
             table_name: comp
  table_dictionary_name: status
2 rows in set (0.01 sec)

This shows us that the table really consists of 2 files, each “bt_size_allocated” bytes in length. These numbers are close to what you will see on disk, but not exact: I see these files taking 1886208 and 16384 bytes, respectively. I wish an exact length were available to query, so we would not have to wonder how big the difference is and whether it can get large enough in some cases to matter.

If you’re just looking for information about how much space has been allocated and how much is currently used for given TokuDB table you can use a query like this:

mysql> select sum(bt_size_allocated) total, sum(bt_size_in_use) used,
    ->        sum(bt_size_allocated)-sum(bt_size_in_use) free
    -> from information_schema.TokuDB_fractal_tree_info;
+---------+---------+--------+
| total   | used    | free   |
+---------+---------+--------+
| 1888792 | 1502744 | 386048 |
+---------+---------+--------+
1 row in set (0.01 sec)

To sum it up: there is room to improve how MySQL reports compressed and uncompressed data length, both in terms of consistency across storage engines and of the information available. It is great to see that TokuDB found a way to report both compressed and uncompressed data size information, yet I would really love to see the actual size on disk that a given table is taking. And it would be great if there were some consistent way to query it from inside MySQL, without having to go to the filesystem level and deal with the different ways different storage engines place data on the filesystem. This becomes especially important with the coming of full tablespace support in MySQL 5.7, which will make it hard to find all the files matching a table on the filesystem.


MySQL Replication: ‘Got fatal error 1236′ causes and cures

October 8, 2014 - 12:00am

MySQL replication is a core process for maintaining multiple copies of data, and replication is a very important aspect of database administration. In order to synchronize data between master and slaves you need to make sure that data transfers smoothly, and to do so you need to act promptly on replication errors to keep data synchronization going. Here on the Percona Support team, we often help customers with broken-replication issues. In this post I’ll highlight one of the most critical replication errors, error code 1236, along with its causes and cures. MySQL replication error “Got fatal error 1236” can be triggered by multiple causes and I will try to cover all of them.

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master; the first event 'binlog.000201' at 5480571

This is a typical error on the slave server(s). It reflects a problem with the max_allowed_packet size: max_allowed_packet limits the size of a single SQL statement or replication event sent from the master to the slave as a binary log event. This error usually occurs when the master and the slave have different max_allowed_packet sizes (i.e., the master’s max_allowed_packet is greater than the slave’s). When the master tries to send a packet bigger than what is defined on the slave, the slave fails to accept it, hence the error. To alleviate this issue, please make sure both master and slave use the same value for max_allowed_packet. You can read more about max_allowed_packet here.
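A quick way to verify and align the setting on both sides is sketched below (hedged: the 64MB value is only an example, match whatever your master actually uses, and persist the value in my.cnf as well):

mysql-master> SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet';
mysql-slave> SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet';
mysql-slave> SET GLOBAL max_allowed_packet = 67108864; -- 64M, matching the master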

This error usually occurs when a huge number of rows are updated on the master and the resulting event does not fit into the slave’s max_allowed_packet size because it is lower than the master’s. This typically happens with “LOAD DATA INFILE” or “INSERT .. SELECT” queries. In my experience, it can also be caused by application logic that generates a huge INSERT with junk data. Take into account that a new variable introduced in MySQL 5.6.6, slave_max_allowed_packet, controls the maximum packet size for the replication threads. It overrides the max_allowed_packet variable on the slave and its default value is 1GB. In the post “max_allowed_packet and binary log corruption in MySQL,” my colleague Miguel Angel Nieto explains this error in detail.

Got fatal error 1236 from master when reading data from binary log: ‘Could not find first log file name in binary log index file’

This error occurs when a binary log that the slave requires for replication no longer exists on the master database server. In one common scenario, your slave server is stopped for some reason for a few hours or days, and when you resume replication it fails with the above error.

When you investigate, you will find that the master no longer has the binary logs the slave needs to pull in order to synchronize data. Possible reasons for this include: the master expired the binary logs via the expire_logs_days system variable; someone manually deleted binary logs from the master via the PURGE BINARY LOGS command or an ‘rm -f’ command; or maybe you have a cronjob that archives older binary logs to reclaim disk space; etc. So, make sure the required binary logs always exist on the master server, and update your procedures to keep the binary logs the slave requires by monitoring the “Relay_Master_Log_File” field of SHOW SLAVE STATUS output. Moreover, if you have set expire_logs_days in my.cnf, old binlogs expire automatically and are removed: when MySQL opens a new binlog file, it checks the older binlogs and purges any that are older than the value of expire_logs_days (in days). Percona Server added a feature to expire logs based on the total number of files used instead of the age of the binlog files, so in that configuration, a spike of traffic could cause binlogs to disappear sooner than you expect. For more information check Restricting the number of binlog files.
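A quick check for whether you are headed toward this error is to compare the binary log the slave still needs against what the master retains. A hedged sketch (host names and credentials are placeholders):

$ mysql -h slave1 -u root -p -e "SHOW SLAVE STATUS\G" | grep Relay_Master_Log_File
$ mysql -h master1 -u root -p -e "SHOW BINARY LOGS"

If the file named by Relay_Master_Log_File no longer appears in the master’s SHOW BINARY LOGS output, replication cannot resume from that position.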

In order to resolve this problem, the only clean solution I can think of is to re-create the slave server from a master server backup or from another slave in the replication topology.

– Got fatal error 1236 from master when reading data from binary log: 'binlog truncated in the middle of event; consider out of disk space on master; the first event 'mysql-bin.000525' at 175770780, the last event read from '/data/mysql/repl/mysql-bin.000525' at 175770780, the last byte read from '/data/mysql/repl/mysql-bin.000525' at 175771648.'

Usually, this is caused by sync_binlog != 1 on the master, which means binary log events may not be synchronized to disk. There might be a committed SQL statement or row change (depending on your replication format) on the master that did not make it to the slave because the event was truncated. The solution would be to move the slave thread to the next available binary log and initialize the slave thread at the first available position in that binary log, as below:

mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000526', MASTER_LOG_POS=4;

– [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.010711' at 55212580, the last event read from '/var/lib/mysql/log/mysql-bin.000711' at 4, the last byte read from '/var/lib/mysql/log/mysql-bin.010711' at 4.', Error_code: 1236

This suggests the master server crashed or rebooted, so binary log events were not synchronized to disk. This usually happens when sync_binlog != 1 on the master. You can investigate by inspecting the binary log contents as below:

$ mysqlbinlog --base64-output=decode-rows --verbose --verbose --start-position=55212580 mysql-bin.010711

You will find that this is the last position in the binary log, i.e., the end of the binary log file. This issue can usually be fixed by moving the slave to the next binary log. In this case it would be:

mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000712', MASTER_LOG_POS=4;

This will resume replication.

To avoid corrupted binlogs on the master, enabling sync_binlog=1 on the master helps in most cases. sync_binlog=1 synchronizes the binary log to disk after every commit: it makes MySQL perform an fsync on the binary log in addition to the fsync done by InnoDB. As a reminder, this has some cost impact, as it synchronizes the write to the binary log on disk after every commit. On the other hand, the sync_binlog=1 overhead can be very minimal or negligible if the disk subsystem is SSD, along with a battery-backed cache (BBU). You can read more about this here in the manual.

sync_binlog is a dynamic option that you can enable on the fly. Here’s how:

mysql-master> SET GLOBAL sync_binlog=1;

To make the change persistent across reboot, you can add this parameter in my.cnf.
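A minimal sketch of the persistent setting in my.cnf:

[mysqld]
sync_binlog = 1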

As a side note: along with fixing replication, it is always a good idea to make sure your replica is in sync with the master and to validate data between master and slaves. Fortunately, Percona Toolkit has tools for this purpose: pt-table-checksum and pt-table-sync. Before checking for replication consistency, be sure to check the replication environment and then, later, sync any differences.


Percona Server 5.6.21-69.0 is now available

October 7, 2014 - 9:04am

Percona is glad to announce the release of Percona Server 5.6.21-69.0 on October 7, 2014. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.6.21, including all the bug fixes in it, Percona Server 5.6.21-69.0 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free. Complete details of this release can be found in the 5.6.21-69.0 milestone on Launchpad.

New Features:

Bugs Fixed:

  • Backup Locks did not guarantee consistent SHOW SLAVE STATUS information with binary log disabled. Bug fixed #1358836.
  • Audit Log Plugin would rotate the audit log in middle of an audit message. Bug fixed #1363370.
  • When the binary log is enabled on a replication slave, SHOW SLAVE STATUS performed under an active BINLOG lock could lead to a deadlock. Bug fixed #1372806.
  • Fixed a memory leak in Metrics for scalability measurement. Bug fixed #1334570.
  • Fixed a memory leak if secure-file-priv option was used with no argument. Bug fixed #1334719.
  • LOCK TABLES FOR BACKUP is now incompatible with LOCK TABLES, FLUSH TABLES WITH READ LOCK, and FLUSH TABLES FOR EXPORT in the same connection. Bug fixed #1360064.

Other bugs fixed: #1361568.

Release notes for Percona Server 5.6.21-69.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker.


MySQL ring replication: Why it is a bad option

October 7, 2014 - 8:43am

I’ve recently worked with customers using replication rings of 4+ servers, with several servers accepting writes. The idea behind this design is always the same: by having multiple servers, you get high availability, and by having multiple writer nodes, you get write scalability. Alas, this is simply not true. Here is why.

High Availability

Having several servers is a necessary condition for high availability, but it’s far from sufficient. Suppose we have a ring of four servers (A replicates to B, B to C, C to D, and D back to A). What happens if, for instance, C suddenly disappears?

  • The replication ring is broken, so updates from A and B will never reach D. D will quickly become so out-of-date that it’s not usable. But wait! A will no longer receive the updates from B, so A will quickly become unusable as well. Same thing for B. So unless you are very quick to configure a smaller ring with the remaining nodes, the whole chain will soon stop working.
  • If an event from C is still being executed on one of the other servers, it will go into an infinite loop, simply because C is the only server able to prevent an event originally coming from C from cycling through the ring.

Conclusion: each time a server goes down, the whole system goes down. In other words, availability is poorer than with a single server.

Write Scalability

You might think that if you are able to run 1000 writes/s on a single server, writing on 4 servers in parallel will allow you to run 4000 writes/s over the whole cluster. However, reality is quite different.

Don’t forget that ALL writes will be executed on ALL servers. So we have 2 separate scenarios:

  • Scenario #1: 1000 writes/s is the point where you’re hitting a bottleneck (for instance disk saturation). Then you’ll never be able to handle the extra load coming from replication. What is going to happen is simply that the servers will become slow because of overload and they’ll never be able to go beyond the 1000 writes/s mark.
  • Scenario #2: a single server could handle 5000 writes/s. Then writing on all servers will indeed allow you to claim that your cluster can absorb 4000 writes/s. But you would achieve the same result by running 4000 writes/s on a single server. This has nothing to do with write scalability.

Conclusion: As all writes are run on all servers, writing on multiple nodes doesn’t magically create extra write capacity. You’re still bound by the capacity of a single server.

Other concerns

Another concern when allowing multiple writers is write conflicts. MySQL doesn’t have any mechanism to detect or solve write conflicts.

So lots of “funny” things can happen when writes are conflicting:

  • Duplicate key errors that will cause replication to halt. And no, setting auto_increment_increment and auto_increment_offset cannot resolve all possible situations in which duplicate key errors can happen (a configuration sketch follows this list).
  • An even funnier situation is when conflicting writes do not generate a replication error, but instead create hidden data inconsistencies. Say you have value=100 in a field, A runs value=value+2 and B runs value=value*2. Depending on the order in which each server applies the two statements, you can end up with one server having value=202 and another server having value=204. Which one is the right value? Impossible to know…
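For completeness, here is the auto-increment staggering mentioned in the first bullet: a hedged sketch for a two-writer pair (the values are illustrative). Note that it only avoids collisions on auto-generated keys and does nothing against the other conflict types:

# On server A (my.cnf):
[mysqld]
auto_increment_increment = 2
auto_increment_offset    = 1

# On server B (my.cnf):
[mysqld]
auto_increment_increment = 2
auto_increment_offset    = 2

With these settings A generates odd ids and B even ones, so two simultaneous INSERTs can no longer claim the same auto-increment value.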

If you’re interested in learning more on the risk of writing on multiple nodes while using regular MySQL replication, you can check out this webinar.

Conclusion

A ring is one of the worst MySQL replication topologies, as it dramatically increases the complexity of all operations on the ring while providing no benefit.

If you need an HA solution, it is not an easy choice as there are many of them and all have tradeoffs, but a ring is definitely not the right option. This post can help you find the right candidate(s).

If you need write scalability, the options are limited, but again, MySQL ring replication is not a good fit. The main question to answer is how many writes do you want to be able to run? For instance, if you want 10x write scalability but your current workload is 100 writes/s, that’s easy: just make sure you have a decent schema, decent indexes and decent hardware. If you want 10x write scalability but you’re already running 5000 writes/s, it’s probably time to explore sharding.


Percona Server 5.5.40-36.1 is now available

October 7, 2014 - 8:20am


Percona is glad to announce the release of Percona Server 5.5.40-36.1 on October 7, 2014. Based on MySQL 5.5.40, including all the bug fixes in it, Percona Server 5.5.40-36.1 is now the current stable release in the 5.5 series.

Percona Server is open-source and free. Details of the release can be found in the 5.5.40-36.1 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

Bugs Fixed:

Release notes for Percona Server 5.5.40-36.1 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

(Please also note that Percona Server 5.6 series is the latest General Availability series and current GA release is 5.6.21-69.0.)


Percona XtraBackup 2.2.5 now available (free MySQL hot backup software)

October 6, 2014 - 5:50am

Percona is glad to announce the release of Percona XtraBackup 2.2.5 on October 2, 2014. Downloads are available from our download site here and Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

New Features:

  • Percona XtraBackup has been rebased on MySQL 5.6.21.

Bugs Fixed:

  • The fix for bug #1079700 introduced a problem for users with huge numbers of InnoDB tablespaces, and the workaround of raising the open files limits didn’t work in all cases due to a limitation in the Linux kernel. A new innobackupex --close-files option has been implemented to close the file handles once they are no longer accessed. NOTE: Using this option may result in a broken backup if DDL is performed on InnoDB tables during the backup procedure. Bug fixed #1222062.
  • Fix for bug #1206309 introduced a regression in Percona XtraBackup 2.2.0 which caused Percona XtraBackup to fail to copy redo logs in random cases. Bug fixed #1365835.
  • innobackupex --galera-info didn’t copy the last binlog file when it was taking a backup from server where backup locks are supported. Bug fixed #1368577.
  • xtrabackup binary would accept arguments that were not options, which could lead to unexpected results. Bug fixed #1367377.
  • If innobackupex is run against MySQL 5.1 with built-in InnoDB, it will now suggest using Percona XtraBackup 2.0 or upgrading to InnoDB plugin, rather than just failing with the generic unsupported server version message. Bug fixed #1335101.
  • Using the (deprecated) log parameter in mysqld section would cause backups to fail. Bug fixed #1347698.
  • Percona XtraBackup now uses MySQL code to get the stack trace in case Percona XtraBackup crashes with a segmentation fault or an assertion failure. Bug fixed #766305.
  • Attempt to use any of the following options without the --incremental option now fails with an error message rather than creates a full backup: --incremental-lsn, --incremental-basedir, --incremental-history-name, --incremental-history-uuid. Bug fixed #1213778.

Other bugs fixed: #1367613, #1368574, #1370462, #1371441, #1373429, #1373984, and #1265070.

Release notes with all the bugfixes for Percona XtraBackup 2.2.5 are available in our online documentation. Bugs can be reported on the launchpad bug tracker. Percona XtraBackup is an open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases.


HAProxy: Give me some logs on CentOS 6.5!

October 3, 2014 - 6:59am

HAProxy is frequently used as a load-balancer in front of a Galera cluster. While diagnosing an issue with HAProxy configuration, I realized that logging doesn’t work out of the box on CentOS 6.5. Here is a simple recipe to fix the issue.

If you look at the top of /etc/haproxy/haproxy.cfg, you will see something like:

global
    log 127.0.0.1 local2
    [...]

This means that HAProxy will send its messages to rsyslog on 127.0.0.1. But by default, rsyslog doesn’t listen on any address, hence the issue.

Let’s edit /etc/rsyslog.conf and uncomment these lines:

$ModLoad imudp
$UDPServerRun 514

This will make rsyslog listen on UDP port 514 for all IP addresses. Optionally you can limit to 127.0.0.1 by adding:

$UDPServerAddress 127.0.0.1

Now create a /etc/rsyslog.d/haproxy.conf file containing:

local2.* /var/log/haproxy.log

You can of course be more specific and create separate log files according to the level of messages:

local2.=info      /var/log/haproxy-info.log
local2.notice     /var/log/haproxy-allbutinfo.log

Then restart rsyslog and see that log files are created:

# service rsyslog restart
Shutting down system logger:                               [  OK  ]
Starting system logger:                                    [  OK  ]
# ls -l /var/log | grep haproxy
-rw-------. 1 root root 131  3 oct. 10:43 haproxy-allbutinfo.log
-rw-------. 1 root root 106  3 oct. 10:42 haproxy-info.log

Now you can start your debugging session!


Percona Toolkit 2.2.11 for MySQL is now available

September 30, 2014 - 6:20am

Percona is pleased to announce the availability of Percona Toolkit 2.2.11.  Released on Sept. 25, Percona Toolkit is a collection of advanced command-line tools to perform a variety of MySQL server and system tasks that are too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release contains bug fixes for pt-query-digest, pt-mysql-summary, pt-stalk, as well as other tools and is the current GA (Generally Available) stable release in the 2.2 series. Downloads are available here and from the Percona Software Repositories.

Bugs Fixed:

  • Fixed bug #1262456: pt-query-digest didn’t report host details when host was using skip-name-resolve option. Fixed by using the IP of the host instead of its name, when the hostname is missing.
  • Fixed bug #1264580: pt-mysql-summary was incorrectly parsing key/value pairs in the wsrep_provider_options option, which resulted in incomplete my.cnf information.
  • Fixed bug #1318985: pt-stalk is now using SQL_NO_CACHE when executing queries for locks and transactions. Previously this could lead to situations where most of the queries that were waiting on query cache mutex were the pt-stalk queries (INNODB_TRX).
  • Fixed bug #1348679: When using the “-- -p” option to enter the password, pt-stalk would ask the user to re-enter the password every time the tool connected to the server to retrieve information. A new option, --ask-pass, has been introduced so that the password need be specified only once.
  • Fixed bug #1368379: A parsing error caused pt-summary ( specifically the report_system_info module) to choke on the “Memory Device” parameter named “Configured Clock Speed” when using dmidecode to report memory slot information.

Details of the release can be found in the release notes and the 2.2.11 milestone at Launchpad. Bugs can be reported on the Percona Toolkit launchpad bug tracker.


MySQL & OpenStack: How to overcome issues as your dataset grows

September 29, 2014 - 6:00am

MySQL is the database of choice for most OpenStack components (Ceilometer is a notable exception). If you start with a small deployment, it will probably run like a charm. But as soon as the dataset grows, you will suddenly face several challenges. We will write a series of blog posts explaining the issues you may hit and how to overcome them.

Where is MySQL used in OpenStack?

Have a look at the logical diagram of OpenStack below.


The diagram is a bit outdated: Neutron appears as Quantum and newer components like Heat are not pictured. But it shows that a database has to be used to store metadata or runtime information. And although many databases are supported, MySQL is the most common choice. Of course MySQL can also be used in instances running inside an OpenStack cloud.

What are the most common issues?

As with many applications, when you start small, the database is running well and maintenance operations are fast and easy to perform. But with a dataset that grows, you will find that the following operations are becoming increasingly difficult:

  1. Having good backups: mysqldump is the standard backup tool for small deployments. While backing up an instance holding 100GB of data is still quite fast, restoring is single-threaded and will take hours. You will probably need to use other tools such as Percona XtraBackup, but what are the tradeoffs?
  2. Changing the schema: whenever you have to add an index, change a datatype or add a column, a table rebuild can be triggered that prevents writes to the table while it runs. While the rebuild is fast when the table has only a few hundred MBs of data, ALTER TABLE statements can easily take hours or days on very large tables. Using pt-online-schema-change from Percona Toolkit is a good workaround (see the sketch after this list), but it doesn’t mean that you can blindly run it without any precaution.
  3. Making the database highly available: whenever the database is down, the whole platform is down or runs in a degraded state. So you need to plan for a high availability solution. One option is to use Galera, but that can introduce subtle issues.
  4. Monitoring the health of your database instances: MySQL exposes hundreds of metrics; how do you know which ones to look at to quickly identify potential issues?
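As a companion to item 2, here is what a non-blocking schema change looks like with pt-online-schema-change. This is a hedged sketch: the database, table and ALTER clause are placeholders, not taken from a real OpenStack deployment:

$ pt-online-schema-change --alter "ADD COLUMN deleted_at DATETIME" \
    D=nova,t=instances,h=localhost,u=root --ask-pass --execute

The tool creates a shadow copy of the table, applies the change there, backfills rows in chunks and swaps the tables at the end, so writes keep flowing during the rebuild.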

Items 1 and 2 are not likely to be issues for the MySQL instance backing your OpenStack cloud, as it will be very small, but they can be big hurdles for guest databases, which can grow very large depending on the application.

Items 3 and 4 are highly desirable no matter the size of the database.

Stay tuned for more related posts on MySQL & OpenStack – and feel free to give us your feedback! And remember that if MySQL is showing bad performance in your OpenStack deployment, Percona is here to help. Just give us a call anytime, 24/7. I also invite you and your team to attend the inaugural OpenStack Live 2015 conference, which runs April 13-14, 2015 in Santa Clara, Calif. It runs alongside the Percona Live MySQL Conference and Expo (April 13-16) at the Hyatt Regency Santa Clara and the Santa Clara Convention Center.


‘Bash Bug’ giving you Shellshock? CVE-2014-6271 update

September 26, 2014 - 9:34am

The media train is in full steam today over the CVE-2014-6271 programming flaw, better known as the “Bash Bug” or “Shellshock”; the original problem was disclosed on Wednesday via this post. This issue exploits bash environment variables in order to execute arbitrary commands. A simple check for it, per the Red Hat security blog, is the following:

env x='() { :;}; echo vulnerable' bash -c "echo this is a test"

If you see an error:

bash: warning: x: ignoring function definition attempt
bash: error importing function definition for `x'

then your version of bash is not vulnerable. If you see the text “vulnerable”, however, it is.

The question becomes “how much of a problem is this really?” It’s a valid question given that, even with the potential to exploit this issue via AcceptEnv for SSH connections, the attack appears to be a “local user” exploit.

I’d like to point out that it has been noted that there’s the potential for this to be exploitable in CGI web applications. It’s also worth being aware of the new metasploit module which exploits an issue in dhclient where code injection can occur via a crafted hostname response to DHCP requests; in my personal opinion this is a far wider issue in dhclient itself.

Red Hat also notes that the current fix for “Shellshock” is incomplete, as per CVE-2014-7169.

Is MySQL affected?

It does not appear to be directly affected at this time, unless you have a UDF allowing shell command execution.

The MySQL documentation on environment variables (as per here) covers variables that modify the local behavior of the client only, not the server (absent local modification on the server side).

Additional resources:

Is my application affected?

There’s no singular answer here given the diversity of applications. For instance, if you’re using PHP and putenv, then you potentially have quite a large attack surface in your application for this specific vulnerability. The best recourse here is to ensure you follow best practices: update to the latest packages, test for the vulnerability, ensure your application runs as a non-privileged user, ensure your application has access only to the MySQL permissions it needs, and ensure you’re running a mandatory access control system, e.g. SELinux / AppArmor, as an additional layer of defense.

Suricata and Snort signatures for Shellshock, as per this Volexity blog post:

Suricata

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"Volex - Possible CVE-2014-6271 bash Vulnerability Requested (header)"; flow:established,to_server; content:"() {"; http_header; threshold:type limit, track by_src, count 1, seconds 120; sid:2014092401;)

Snort


alert tcp $EXTERNAL_NET any -> $HOME_NET $HTTP_PORTS (msg:"Volex - Possible CVE-2014-6271 bash Vulnerability Requested (header)"; flow:established,to_server; content:"() {"; http_header; threshold:type limit, track by_src, count 1, seconds 120; sid:2014092401;)


Logical MySQL backup tool mydumper 0.6.2 now available

September 26, 2014 - 2:51am

We are pleased to announce the third release in the 0.6 series of mydumper, a tool for performing logical MySQL backups. In this release, we focused on simplifying compilation of the code and added new features for making logical backups. These new features include enhancements to AWS RDS support and extended TokuDB support.

Due to recent changes (or not so recent) in the MySQL client libraries, it became impossible to compile mydumper without the complete MySQL source code. To simplify this, we had to disable the binlog functionality by default, as it was the part affected by the library changes. Now you should be able to compile against any MySQL version without issues. If you still want the binlog feature, it is still there and you can enable it with:

cmake . -DWITH_BINLOG=ON

The one caveat is that you will need to compile against a mysql version greater than 5.5.34. It’s also possible to compile with binlog enabled against the latest 5.6 versions. In this case, you will need the source code to make some changes. You can find related information about this issue at these locations:

Download mydumper-0.6.2 source code here.

Bugs Fixed:
  • #1347392 Last row of table not dumped if it brings statement over statement_size
  • #1157113 Compilation of latest branch fails on CentOS 6.3 64bit
  • #1326368 Can’t make against Percona-Server-devel-55 headers
  • #1282862 Unknown type name ‘HASH’
  • #1336860 k is used twice
  • #913307 Can’t compile – missing libs crypto and ssl
  • #1364393 Rows chunks doesn’t increase non innodb jobs
New MySQL Backup Features:
  • --lock-all-tables

Use LOCK TABLE for all instead of FLUSH TABLES WITH READ LOCK. With this option you will be able to backup RDS instances and also get the binlog coordinates (5.6).

  • TokuDB support

Now TokuDB tables are dumped within the consistent snapshot instead of being locked like MyISAM.

  • Support to dump tables from different schemas

mydumper has two arguments, --database (-B) and --tables-list (-T), so until now you were able to run:

-B db1

or

-B db1 -T t1,t2,tn

to dump a whole database or a list of tables from one database, respectively. In 0.6.2 you can list tables in different databases in this way:

-T db1.t1,db1.t2,db2.t1,db2.t2

NOTE: You should not use -B here because mydumper will take the -T list as table names.
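Putting it together, a hedged sketch of a multi-schema dump (host, credentials and output directory are placeholders):

mydumper --host=localhost --user=root --password=*** \
  --tables-list=db1.t1,db1.t2,db2.t1,db2.t2 --outputdir=/backup/mydumper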


Looking forward to Oracle OpenWorld 2014; visit us at Booth 2413!

September 25, 2014 - 1:45pm

I’m excited to once again be heading to San Francisco next week for Oracle OpenWorld, and also very pleased to have a booth there this year along with some great speakers from Percona.

The scope of Oracle OpenWorld 2014, which runs Sept. 28-Oct. 2, is enormous and there are several keynotes and sessions I’m looking forward to attending. Especially the talks focused on the areas of:

  • MySQL architecture and application development
  • Cloud and Big Data
  • Database administration and DevOps
  • High-Availability and Replication
  • Performance and Scalability

I also have the honor of speaking on the topic of “Practical MySQL Optimization.” Other sessions led by Percona staff include:

All of us will, at one time or another, be at the Percona booth (Booth 2413), so please stop by and say “hello.” And while you’re there, register to win a new Kindle – the Fire HDX 8.9 Tablet.

We’ll also be raffling off passes to Percona Live London (Nov. 3-4), the annual Percona Live MySQL Conference and Expo 2015 (PLMCE, April 13-16) and the new OpenStack Live 2015 conference, which runs alongside PLMCE April 13-14 – also at the Hyatt Regency Santa Clara & the Santa Clara Convention Center.

Let me know if you’ll be attending and do be sure to stop by our booth and say “hello!” I hope to see you there!



More than 1000 columns – get transactional with TokuDB

September 25, 2014 - 8:03am

Recently I encountered a situation in which a customer was forced to stay with the MyISAM engine due to a legacy application using tables with over 1000 columns. Unfortunately, InnoDB has a limit below this point. I did not expect to hear this argument for MyISAM; it is usually about the full-text search or spatial-index functionality that was missing in InnoDB and was introduced in MySQL 5.6 and 5.7, respectively, to let people forget about MyISAM. In this case though, InnoDB still could not be used, so I gave TokuDB a try.

I created a simple bash script to generate a SQL file with a CREATE TABLE statement containing the number of columns I desired, and then tried to load it using different storage engines. A bit surprisingly, InnoDB failed with a column count above 1017, a little more than the documented maximum of 1000:
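The generator itself is trivial. A hedged sketch of such a script (the original is not shown in the post; the column count, column names and target engine here are illustrative):

#!/bin/bash
# Emit a CREATE TABLE statement with N tinyint columns into multicol2.sql.
N=${1:-1018}
{
  echo "CREATE TABLE multicol_innodb ("
  for i in $(seq 1 "$N"); do
    [ "$i" -lt "$N" ] && sep="," || sep=""
    echo "  c$i TINYINT NOT NULL DEFAULT 0$sep"
  done
  echo ") ENGINE=InnoDB;"
} > multicol2.sql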

mysql> source /home/vagrant/multicol2.sql
ERROR 1117 (HY000): Too many columns

MyISAM let me create a maximum of 2410 columns, and I could achieve the same result with a TokuDB table! I tried both the tinyint and char(10) datatypes and the same cap applied; I am not quite sure why it is exactly 2410, though.

mysql> SELECT tables.TABLE_NAME, count(*) columns, engine, row_format
    -> FROM information_schema.columns
    -> JOIN information_schema.tables USING (TABLE_NAME)
    -> WHERE TABLE_NAME LIKE "multicol%"
    -> GROUP BY TABLE_NAME;
+-----------------+---------+--------+-------------+
| TABLE_NAME      | columns | engine | row_format  |
+-----------------+---------+--------+-------------+
| multicol_innodb |    1017 | InnoDB | Compact     |
| multicol_myisam |    2410 | MyISAM | Fixed       |
| multicol_toku   |    2410 | TokuDB | tokudb_zlib |
+-----------------+---------+--------+-------------+
3 rows in set (0.31 sec)

So if you have that rare kind of table schema with that many columns, and you wish to use a transactional storage engine, you may go with TokuDB, also available with recent Percona Server 5.6 versions.

You can find more details about column number limits in MySQL in this post, “Understanding the maximum number of columns in a MySQL table.”


Percona Server, OpenStack and the Tesora DBaaS Platform

September 24, 2014 - 8:13am

Percona Server and Percona XtraDB Cluster provide high-performance alternatives for MySQL users. We have also seen rapidly growing interest in these solutions in the OpenStack community where higher performance and high availability are crucial. Many OpenStack users are adopting these solutions but we’ve also seen demand from companies creating OpenStack distros. For example, Percona XtraDB Cluster is now certified for the RHEL OSP (OpenStack Platform) and is included in the Ubuntu 14.04 LTS release. Tesora recently joined this rapidly growing list when they announced the Tesora DBaaS Platform Enterprise Edition which includes Percona Server.

The Tesora platform is an enterprise-ready database as a service (DBaaS) platform based on the OpenStack Trove project. Percona Server was certified by Tesora in August as part of their Tesora DBaaS Platform certification program.

What does this mean for you if you are one of the users who have downloaded Percona Server more than 1,000,000 times?

Many enterprises are evaluating how to deliver robust, high-performance MySQL-as-a-service. OpenStack Trove is an open-source platform that addresses this challenge. OpenStack operators can build and offer multiple databases as a service (such as MySQL, MongoDB, or Redis). These users can create, operate, maintain and delete different kinds of SQL or NoSQL databases on demand. Users don’t need to worry about the administrative details of the database technology, nor about complexities such as availability, resiliency, backups, recovery and security.

The Tesora DBaaS Platform Enterprise Edition builds upon OpenStack Trove and makes the “provisioning, deployment, configuration, tuning, monitoring and administration” simpler for operators. The platform includes support for multiple database backends including Percona Server which is certified on the Tesora DBaaS platform. Administrators and DBAs looking to build and offer a database as a service using the Tesora DBaaS Platform Enterprise Edition can rest assured that Percona Server is ready to meet their needs.

OpenStack is a rapidly evolving open-source platform that depends heavily on MySQL for optimal performance. Percona is participating in the Trove project and is sharing our knowledge on OpenStack through webinars (such as “OpenStack: A MySQL DBA Perspective“), blog posts (such as “OpenStack users shed light on Percona XtraDB Cluster deadlock issues“), professional services (such as Percona Consulting Services) and the recently announced OpenStack Live Conference which will be held April 13-14 2015 at the Santa Clara Convention Center in Silicon Valley (the Call for Speakers is open until November 9!).

We look forward to working with Tesora as they build out their platform as well as the rest of the OpenStack community as this exciting technology continues to mature. I hope you can join us in Santa Clara for the OpenStack Live Conference this April – submit your speaker proposal now or purchase your ticket at Super Saver prices!


How to scale big data applications using MySQL sharding frameworks

September 23, 2014 - 12:00am

This Wednesday I’ll be discussing two common types of big data: machine-generated data and user-generated content. These types of big data are amenable to sharding, a commonly used technique for spreading data over more than one database server.

I’ll be discussing this in-depth during a live webinar at 10 a.m. Pacific time on Sept. 24. I’ll also talk about two major sharding frameworks: MySQL Fabric and Shard-Query for OLTP or OLAP workloads, respectively. Following the webinar there will be a brief Q/A session.

Find the webinar link here: “How to Scale Big Data Applications Using MySQL Sharding Frameworks” for more information or register directly here.

Find Shard-Query (part of Swanhart-Tools) here, in Github
Find MySQL Fabric (part of MySQL Utilities) here, at the MySQL documentation page


Should you migrate to Percona XtraDB Cluster?

September 22, 2014 - 9:02am

Interest in Percona XtraDB Cluster / Galera has been high ever since we introduced the product in 2012.  I typically have a conversation about Galera and Percona XtraDB Cluster (PXC) at least once a week with a consulting customer who wants to know if it will be a good fit for their application.  Last week I gave a webinar entitled “Migrating to Percona XtraDB Cluster.”

I covered everything in the webinar that I feel is important for someone considering Galera to know, and I’d suggest anyone who wants a brief overview of PXC/Galera spend an hour watching the recording. Many questions were asked during the webinar, and I answered all of those regarding Percona XtraDB Cluster. Access to the webinar is free, along with download of the accompanying slides.

* * *

What is Percona XtraDB Cluster? PXC is a replacement for conventional MySQL master/slave architectures to eliminate replication lag and achieve a highly-available masterless cluster of MySQL servers. Like all Percona software, PXC is open source and free.


MySQL upgrade best practices

September 19, 2014 - 10:06am

MySQL upgrades are necessary tasks and we field a variety of questions here at Percona Support regarding MySQL upgrade best practices. This post highlights recommended ways to upgrade MySQL in different scenarios.

Why are MySQL upgrades needed? The reasons are many: access to new features, performance benefits, bug fixes, and so on. However, MySQL upgrades can be risky if not tested extensively beforehand with your application, because the process might break the application, prevent it from functioning properly, or cause performance issues following the upgrade. Moreover, I suggest keeping an eye on new releases of MySQL and Percona Server; check what has changed in the most recent version. Perhaps the latest release has a fix for an issue that you have been experiencing.

Upgrading one major version via SQL Dump:

Upgrading by one major version covers, for example, going from Percona Server 5.1 to 5.5 or from Percona Server 5.5 to 5.6; the same applies to Oracle MySQL.

First of all, upgrading by one major version is neither straightforward nor risk-free. Start by reading the “Upgrading from Previous Series” documentation here and here. Pay special attention to every section marked “Incompatible Change” and check whether you may be affected by those changes. There may be configuration changes as well: some variables have been renamed, a few older ones removed and new ones introduced, so make sure you adjust your my.cnf accordingly. For Percona Server-specific changes, please refer here and here for Percona Server 5.5 and Percona Server 5.6, respectively.

There are several possible approaches you may take, and which one is most feasible – and safest – depends on your current replication topology and total data size. Let me walk through an example: upgrading from Percona Server 5.5 to Percona Server 5.6.

In general, there are two types of MySQL upgrades:

  • In-place, where you keep the existing datadir and simply run mysql_upgrade after the binaries have been upgraded to the new major version,
  • SQL dump taken on the old version and then restored on the new version (using the mysqldump utility or alternatives such as mydumper).

In general the second type is safer but, as you might expect, a much slower MySQL upgrade process.

Theoretically, the safest scenario is the dump-and-restore path. Here’s the basic procedure (you should stop application writes before starting):

1) Capture users and permissions information. This will backup all your existing user privileges.

$ wget percona.com/get/pt-show-grants
$ perl pt-show-grants --user=root --ask-pass --flush > /root/grants.sql

2) Produce a logical dump of the 5.5 instance, excluding the mysql, information_schema and performance_schema databases:

$ mysql -BNe "SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME NOT IN ('mysql', 'performance_schema', 'information_schema')" | tr '\n' ' ' > /root/dbs-to-dump.sql
$ mysqldump --routines --events --single-transaction --databases $(cat /root/dbs-to-dump.sql) > /root/full-data-dump.sql

3) Stop 5.5 instance.

$ service mysql stop
# or, equivalently:
$ /etc/init.d/mysql stop

4) Move old datadir (assuming /var/lib/mysql; edit accordingly to your setup):

$ mv /var/lib/mysql/ /var/lib/mysql-55

5) Install 5.6 (simply as you would when not upgrading). If you don’t use a package manager (yum/apt-get), it is likely that you will need to run mysql_install_db and mysql_upgrade manually.

6) Load the users back to new upgraded version of MySQL.

$ mysql -uroot < /root/grants.sql

7) Load the data back to new version of MySQL.

$ mysql -e "SET GLOBAL max_allowed_packet=1024*1024*1024"
$ mysql -uroot -p --max-allowed-packet=1G < /root/full-data-dump.sql

At this point all tables have been re-created and loaded in MySQL 5.6, so every binary representation is native to 5.6. You’ve completed the cleanest and most stable upgrade path, and your application can resume service. It’s worth mentioning that this upgrade path is the same whether you are upgrading vanilla MySQL or Percona Server. You may also use it to move from Oracle MySQL to Percona Server, for example upgrading Oracle MySQL 5.5 to Percona Server 5.6; the procedure is identical, since Percona Server is a drop-in replacement for Oracle MySQL.
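Before resuming application writes, it’s worth a quick sanity check. A minimal sketch (yourdb.yourtable is a placeholder; substitute a table whose row count you know from the old instance):

$ mysql -uroot -p -e "SELECT VERSION();"                        # confirm the server now reports 5.6
$ mysql -uroot -p -e "SHOW DATABASES;"                          # confirm all schemas were restored
$ mysql -uroot -p -e "SELECT COUNT(*) FROM yourdb.yourtable;"   # spot-check against the 5.5 instance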

“SQL dump” is also known as a logical backup. It is safer in the sense that, when restoring, all tables will be created using the format of the new MySQL binaries, which bypasses most compatibility issues. Still, for large data sets (hundreds of gigabytes or terabytes), this can be a very time-consuming approach. On the other hand, by dumping and reloading such large data sets, you may recover a lot of free disk space, as the InnoDB tablespaces will be re-created from scratch and thus optimized and defragmented. If the data was frequently updated or deleted, the benefits may be significant.

Minor version MySQL upgrade within the same major version via In-Place Upgrade:

This applies to upgrading within the same series, e.g. MySQL 5.5.35 to MySQL 5.5.38, or Percona Server 5.6.14 to the latest Percona Server 5.6.20.

This is known as an in-place upgrade, where you simply install a newer binary package and then run the mysql_upgrade script, which checks and updates system tables if necessary. Even with an in-place upgrade, we highly recommend checking the release notes for new features, bug fixes, etc. For Percona Server 5.5 and Percona Server 5.6, release notes can be found here and here respectively.
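As a rough sketch of that flow on an RPM-based system (package names depend on your distribution and configured repositories, so treat them as assumptions):

$ service mysql stop
$ yum upgrade Percona-Server-server-56 Percona-Server-client-56
$ service mysql start
$ mysql_upgrade -uroot -p    # checks and updates system tables if necessary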

For Percona Server we have additional documents describing the details of upgrading Percona Server, with a focus on Percona-specific features; they can be found here and here. These also cover the complete in-place upgrade procedure with the yum/apt package managers.

Also, to be on the safe side, you can perform even a minor upgrade with a logical dump, using the procedure described earlier via the mysqldump or mydumper program (the latter performs the dump and restore in parallel); this remains the safest approach for an upgrade.
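For reference, a minimal mydumper/myloader sketch, assuming the tools are installed and with paths adjusted to your environment:

$ mydumper --user=root --password=*** --outputdir=/root/dump --threads=4   # parallel logical dump on the old version
$ myloader --user=root --password=*** --directory=/root/dump --threads=4   # parallel restore on the new version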

MySQL Upgrade directly to the latest version by skipping one major version in between:

This includes upgrading from MySQL 5.0 to MySQL 5.5 by skipping version 5.1, or upgrading from MySQL 5.1 to MySQL 5.6 by skipping version 5.5. It even includes upgrading to MySQL 5.6 directly from MySQL 5.0, although very few users should still be running version 5.0. The same applies to Percona Server.

For this discussion, let’s assume upgrading from Oracle MySQL or Percona Server 5.1 directly to version 5.6, skipping major version 5.5 in between.

Before anything else: this is a serious upgrade, a huge step over a major MySQL version, and therefore risky. Upgrading with just a binary update is not supported and is not safe when skipping major versions, so you should never do this for 5.0->5.5 or 5.1->5.6, and certainly not for 5.0->5.6. One problem is that not all changes between MySQL versions are backwards compatible. Some differences affect not only how the data is handled but also how the server behaves, including the SQL language as well as MySQL server and storage engine internals. Another issue is that a number of default variable settings changed between MySQL 5.0 and 5.6, which may result in completely different, unexpected behavior. For example, since MySQL 5.5 the default storage engine is InnoDB, and since MySQL 5.6 InnoDB uses a separate tablespace for each table by default; GTID replication was also introduced in 5.6. There are many more details which I won’t list here; all of these changes are described in the “Upgrading from Previous Series” documentation mentioned above.

It’s worth repeating that upgrading by skipping a major version is strongly discouraged. Upgrading from MySQL 5.1 to 5.6 shouldn’t be done in one shot; instead, upgrade from version 5.1 to 5.5 and then from version 5.5 to 5.6, running mysql_upgrade at each step. That will cope with the changes in formats, as explained in the manual.
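In outline, the two-hop path looks like this – a hedged sketch, with the binary-replacement steps elided because they depend on your packaging:

# Hop 1: 5.1 -> 5.5
$ service mysql stop
# ... replace the 5.1 binaries with 5.5 via your package manager ...
$ service mysql start
$ mysql_upgrade -uroot -p
# Hop 2: 5.5 -> 5.6, only after verifying the 5.5 instance behaves correctly
$ service mysql stop
# ... replace the 5.5 binaries with 5.6 ...
$ service mysql start
$ mysql_upgrade -uroot -p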

MySQL Upgrade Precautions:

MySQL upgrade precautions are an essential part of the upgrade itself. Before you upgrade, make sure you have thoroughly tested all parts of your application with the desired version of MySQL. This is especially important for an upgrade between major versions, or when skipping a major version (e.g. upgrading from MySQL 5.1 to MySQL 5.6).

Make sure you read the release notes carefully and are aware of all the changes. You can find the Oracle MySQL 5.5 and 5.6 release notes here:

http://dev.mysql.com/doc/relnotes/mysql/5.5/en/
http://dev.mysql.com/doc/relnotes/mysql/5.6/en/

Percona Server-specific release notes for the same versions can be found below:

http://www.percona.com/doc/percona-server/5.5/release-notes/release-notes_index.html
http://www.percona.com/doc/percona-server/5.6/release-notes/release-notes_index.html

If you are planning to upgrade to Oracle MySQL 5.6 or Percona Server 5.6, I would recommend first checking for existing critical bugs. Bugs you should be aware of:

http://bugs.mysql.com/bug.php?id=66546
http://bugs.mysql.com/bug.php?id=68953
http://bugs.mysql.com/bug.php?id=69444
http://bugs.mysql.com/bug.php?id=70922
http://bugs.mysql.com/bug.php?id=72794
http://bugs.mysql.com/bug.php?id=73820

Upgrade Hierarchy:

This is yet another important aspect of any MySQL upgrade. You should plan your upgrade along an upgrade hierarchy, and the recommendation is always the same: upgrade your dev/QA servers first, then your staging servers, before moving on to production. In fact, you can set aside spare instances running the desired MySQL version and test your application against them extensively.

Once you are happy with the MySQL upgrade on your test and staging servers, you can begin the upgrade on your production servers. In replication environments, we highly recommend upgrading the MySQL slaves first (one by one) and upgrading the MySQL master last. In fact, you can upgrade one of the slaves first and run it for a few days to be on the safe side, all the while closely monitoring its performance. If you don’t have a replication setup, it may be worth creating a replica to test the newer version of MySQL on first. Once you are happy with the results, you can upgrade the remaining slaves and finally the master.

How Percona software helps you in a MySQL upgrade:

In any MySQL upgrade, Percona Toolkit comes to the rescue. Percona Toolkit contains a number of tools that help a great deal.

pt-upgrade is one such tool. It allows you to test whether the new MySQL instance handles specific queries at least as fast as the old version. There may be substantial differences, as the MySQL query optimizer changed a lot between versions 5.1 and 5.6, and data statistics may be refreshed, so query plans may change. You can read more about the optimizer changes in the manual.
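For example – a sketch assuming you have captured a slow query log and have the old and new versions running on two reachable test hosts (host55 and host56 are placeholders):

$ pt-upgrade slow.log h=host55,u=root,p=*** h=host56,u=root,p=***

pt-upgrade executes the logged queries on both hosts and reports differences in results, warnings and execution times.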

pt-query-digest is another great tool that can help with the upgrade. You can replay your slow query log against the existing and the new MySQL versions for before-and-after query performance validation.
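A minimal sketch of the analysis side (log paths are placeholders):

$ pt-query-digest /var/lib/mysql/slow55.log > /root/digest-before.txt   # digest the slow log from the old version
$ pt-query-digest /var/lib/mysql/slow56.log > /root/digest-after.txt    # digest the slow log after the upgrade

Comparing the two reports highlights queries whose response-time profile regressed after the upgrade.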

You can also benefit from Percona Cloud Tools for MySQL, a hosted service providing access to query performance insights for all MySQL users. You can sign up for free now because this service is in public beta. Percona Cloud Tools, among other things, lets you visually check your queries’ performance after a MySQL upgrade.

It’s highly recommended to back up your data before a MySQL upgrade. Percona XtraBackup is free and open source (like all Percona software). It’s a hot backup tool that backs up your data online, without sacrificing the database’s read/write availability, and with minimal impact.
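A minimal sketch using the innobackupex wrapper (the timestamped directory name is an example; innobackupex generates it at backup time):

$ innobackupex --user=root --password=*** /root/backups/            # take a hot backup
$ innobackupex --apply-log /root/backups/2014-09-19_10-06-00/       # prepare the backup so it is ready to restore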

Last but not least, you will find this post useful, too: “Upgrading MySQL.” It’s a few years old but still very relevant. Also take a look at this informative webinar, “Upgrading to MySQL 5.6: Best Practices.” Both are from Percona CEO Peter Zaitsev.

Conclusion:
A MySQL upgrade might look like a simple task, but it’s not. I’ve tried to cover the most common MySQL upgrade scenarios you will encounter. Again, I recommend testing all parts of your application before pushing the upgrade “live,” otherwise it may break your application, or part of it, or degrade performance instead of improving it. Finally, I recommend having a downgrade plan in place before the MySQL upgrade, just in case something goes wrong; planning a proper downgrade procedure will minimize your application’s downtime if it does. I’m looking forward to your comments and questions below.

The post MySQL upgrade best practices appeared first on MySQL Performance Blog.

Managing big data? Say ‘hello’ to HP Vertica

September 18, 2014 - 7:29am

Over the past few months, I’ve seen an increase in the following use case while working on performance and schema review engagements:

I need to store exponentially increasing amounts of data and analyze all of it in real-time.

This is also known simply as: “We have big data.” Typically, this data is used for user-interaction analysis, ad tracking, or other common clickstream applications. However, it can also be seen in threat assessment (DDoS mitigation, etc.), financial forecasting, and other applications. While MySQL (and other OLTP systems) can handle this to a degree, it is by no means their forte. Some of the pain points include:

  • Cost of rapidly increasing, expensive disk storage (OLTP disks need to be fast == $$)
  • Performance decrease as the data size increases
  • Wasted hardware resources (excess I/O, etc.)
  • Impact against other time-sensitive transactions (i.e. OLTP workload)

While there are many approaches to this problem – and oftentimes the solution is actually a hybrid of many individually tailored components – one that I have seen more frequently in recent work is HP Vertica.

From a 30,000-foot view, Vertica is built around the following principles:

  • Columnar data store
  • Highly compressed data
  • Clustered solution for both availability and scalability

Over the next few weeks, I’ll discuss several aspects of Vertica including:

  • Underlying architecture and concepts
  • Basic installation and use
  • Different data loading techniques
  • Various maintenance/operational procedures
  • Some comparisons vs. traditional OLTP (MySQL) performance
  • Some potential use-cases
  • Integration with other tools (such as Hadoop)

While Vertica is by no means the silver bullet that will solve all of your needs, it may prove to be a very valuable tool in your overall approach to managing big data.

The post Managing big data? Say ‘hello’ to HP Vertica appeared first on MySQL Performance Blog.

Syncing MySQL slave table with pt-online-schema-change

September 17, 2014 - 7:06am

I recently encountered a situation in which, after running Percona Toolkit’s pt-table-checksum on a customer system, 95% of a table on the MySQL master differed from the copy on the MySQL slave. Although this table was not a critical part of the infrastructure, from time to time writes to it from the master would break replication. Additionally, the table has about 6 million rows, so running pt-table-sync would take some time. Rebuilding the slave from a backup of the master would not be an easy option either, since the slave acts as an archive and holds a lot more data than the master.

So how did we solve it? With pt-online-schema-change and a NOOP ALTER.

pt-online-schema-change --alter 'ENGINE=INNODB' D=dbname,t=tblname

How is it possible? pt-online-schema-change works by creating a shadow copy of the original table and copying the rows from the original into the shadow copy. An additional set of TRIGGERs ensures that any changes made to rows after they have been copied also propagate to the shadow version.

There is one little caveat, though: binlog_format on the master has to be ROW, so that the actual row images from the master are copied to the slave. If your binlog_format is set to STATEMENT, you’d still end up with the same inconsistency: since the change statements are logged as STATEMENT, the NOOP ALTER would operate on the slave’s own copy of the table instead of copying rows from the master. You can configure binlog_format for the alter process via the --set-vars option, i.e. --set-vars 'binlog_format=ROW', to make this easier!
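Putting the pieces together, the full invocation would look something like this (database and table names are placeholders, as in the snippet above):

$ pt-online-schema-change --alter 'ENGINE=INNODB' --set-vars 'binlog_format=ROW' --execute D=dbname,t=tblname

Running it once with --dry-run instead of --execute first is a good way to confirm that the shadow table and triggers can be created.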

The post Syncing MySQL slave table with pt-online-schema-change appeared first on MySQL Performance Blog.
