
MySQL QA Episode 12: My server is crashing… Now what? For customers or users experiencing a crash

Latest MySQL Performance Blog posts - August 11, 2015 - 3:00am

My server is crashing… Now what?

This special episode in the MySQL QA Series is for customers or users experiencing a crash.

  1. A crash?
    1. Cheat sheet: https://goo.gl/rrmB9i
    2. Server install & crash. Note this is a demonstration: do not run this on a production server!
      sudo yum install -y http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm
      sudo yum install -y Percona-Server-client-56 Percona-Server-server-56
      sudo service mysql start
  2. Gimme Stacks!
    1. Debug info packages (can be installed on a production system, but make sure to match your 5.5, 5.6 or 5.7 version correctly; see the stack-resolution sketch after this list)
      sudo yum install -y Percona-Server-56-debuginfo
  3. Testcase?
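
For reference, once the matching debuginfo package is installed, a crash can be symbolized from a core dump with gdb (if one was written). A minimal sketch of my own, not from the episode; the binary and core paths are placeholders and depend on your core pattern and datadir:

      # Resolve the backtraces from a core dump left by the crash (paths are assumptions)
      gdb -q /usr/sbin/mysqld /var/lib/mysql/core.12345 \
        -ex 'set pagination off' -ex 'bt' -ex 'thread apply all bt' -ex 'quit'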

Full-screen viewing @ 720p resolution recommended.


Percona/Galera Bug

Latest Forum Posts - August 10, 2015 - 12:55pm
Greetings,

I was referred to you by SeveralNines due to a bug we are seeing over and over again:
09:36:59 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster
key_buffer_size=25165824
read_buffer_size=131072
max_used_connections=13
max_threads=152
thread_count=19
connection_count=9
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3781883 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x113974c0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fa794237d38 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8fa965]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x665644]
/lib64/libpthread.so.0[0x305b60f710]
/lib64/libc.so.6(gsignal+0x35)[0x305b232625]
/lib64/libc.so.6(abort+0x175)[0x305b233e05]
/usr/lib64/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x188)[0x7fab9fcd7d68]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM13post_rollbackEPNS_9TrxHandleE+0x2e)[0x7fab9fcf29de]
/usr/lib64/libgalera_smm.so(galera_post_rollback+0x68)[0x7fab9fd0c1e8]
/usr/sbin/mysqld[0x7a906e]
/usr/sbin/mysqld(_Z15ha_rollback_lowP3THDb+0x97)[0x5a5057]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG8rollbackEP3THDb+0x10f)[0x8b26ff]
/usr/sbin/mysqld(_Z17ha_rollback_transP3THDb+0x74)[0x5a4e24]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDbb+0x312)[0x5a5702]
/usr/sbin/mysqld(_Z17trans_commit_stmtP3THD+0x35)[0x78e495]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x954)[0x6ea7b4]
/usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x60)[0x82ed50]
/usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x228)[0x82f088]
/usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0x8f)[0x830a6f]
/usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x2b0)[0x831b20]
/usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x782)[0x66c442]
/usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x707)[0x66d997]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x19cf)[0x6eb82f]
/usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x60)[0x82ed50]
/usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x228)[0x82f088]
/usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0x8f)[0x830a6f]
/usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x2b0)[0x831b20]
/usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x782)[0x66c442]
/usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x707)[0x66d997]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x19cf)[0x6eb82f]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x658)[0x6f0958]
/usr/sbin/mysqld[0x6f0acd]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x19d5)[0x6f2de5]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x22b)[0x6f42cb]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x17f)[0x6bc52f]
/usr/sbin/mysqld(handle_one_connection+0x47)[0x6bc717]
/usr/sbin/mysqld(pfs_spawn_thread+0x12a)[0xaf611a]
/lib64/libpthread.so.0[0x305b6079d1]
/lib64/libc.so.6(clone+0x6d)[0x305b2e88fd]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fa487440b90): is an invalid pointer
Connection ID (thread ID): 25603
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.

No more server metrics since Aug 8 @ 8 pm EDT

Latest Forum Posts - August 10, 2015 - 12:01pm
My Cloud Tools UI no longer updates with server statistics (Metrics tab). The query statistics *are* updating correctly. I've restarted the agent (percona-agent-1.0.13) but that did not help. The agent looks to be connected from everything I see in the portal, but metrics are not shown after Saturday evening at 8 pm.

pt-online-schema-change doesn't support updating PRIMARY KEY or unique index

Latest Forum Posts - August 10, 2015 - 12:56am
In the process of altering a big table with pt-online-schema-change, I updated the PRIMARY KEY of one record. When pt-online-schema-change reported that it had successfully altered the table, I found that the record I had just updated still existed. This is because the UPDATE trigger that pt-online-schema-change creates REPLACEs the new record into the new table (see the sketch at the end of this post). The test procedure is as follows:

1. Create a table.

CREATE TABLE `ord` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`customer_id` int(11) DEFAULT NULL,
`good_id` int(11) DEFAULT NULL,
`create_time` datetime DEFAULT NULL,
`datetime_lastchanged` datetime DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;


2. Insert 17000000+ rows into the table.

The first 20 records are as follows:

mysql> select * from ord limit 20;
+----+-------------+---------+---------------------+----------------------+
| id | customer_id | good_id | create_time | datetime_lastchanged |
+----+-------------+---------+---------------------+----------------------+
| 1 | 6 | 1 | 2015-06-15 14:46:03 | 2015-06-26 15:52:47 |
| 10 | 7 | 2 | 2015-06-15 14:46:14 | 2015-06-26 15:52:52 |
| 11 | 3 | 5 | 2015-06-15 14:14:05 | 2015-06-26 15:53:47 |
| 12 | 1 | 2 | 2015-06-26 16:40:54 | 2015-06-26 16:40:54 |
| 13 | 6 | 1 | 2015-06-15 14:46:03 | 2015-06-26 15:52:47 |
| 14 | 7 | 2 | 2015-06-15 14:46:14 | 2015-06-26 15:52:52 |
| 15 | 3 | 5 | 2015-06-15 14:14:05 | 2015-06-26 15:53:47 |
| 16 | 1 | 2 | 2015-06-26 16:40:54 | 2015-06-26 16:40:54 |
| 20 | 6 | 1 | 2015-06-15 14:46:03 | 2015-06-26 15:52:47 |
| 21 | 7 | 2 | 2015-06-15 14:46:14 | 2015-06-26 15:52:52 |
| 22 | 3 | 5 | 2015-06-15 14:14:05 | 2015-06-26 15:53:47 |
| 23 | 1 | 2 | 2015-06-26 16:40:54 | 2015-06-26 16:40:54 |
| 24 | 6 | 1 | 2015-06-15 14:46:03 | 2015-06-26 15:52:47 |
| 25 | 7 | 2 | 2015-06-15 14:46:14 | 2015-06-26 15:52:52 |
| 26 | 3 | 5 | 2015-06-15 14:14:05 | 2015-06-26 15:53:47 |
| 27 | 1 | 2 | 2015-06-26 16:40:54 | 2015-06-26 16:40:54 |
| 35 | 6 | 1 | 2015-06-15 14:46:03 | 2015-06-26 15:52:47 |
| 36 | 7 | 2 | 2015-06-15 14:46:14 | 2015-06-26 15:52:52 |
| 37 | 3 | 5 | 2015-06-15 14:14:05 | 2015-06-26 15:53:47 |
| 38 | 1 | 2 | 2015-06-26 16:40:54 | 2015-06-26 16:40:54 |
+----+-------------+---------+---------------------+----------------------+
20 rows in set (0.00 sec)


3. Altering the table with pt-online-schema-change.

[admin@CentOS6 ~]$ pt-online-schema-change u=root,p=pass,D=db1,t=ord --alter="add name varchar(20)" --statistics --execute
No slaves found. See --recursion-method if host CentOS6.4study has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `db1`.`ord`...
Creating new table...
Created new table db1._ord_new OK.
Altering new table...
Altered `db1`.`_ord_new` OK.
2015-08-10T10:35:59 Creating triggers...
2015-08-10T10:35:59 Created triggers OK.
2015-08-10T10:35:59 Copying approximately 16312164 rows...
Copying `db1`.`ord`: 8% 05:45 remain
##########################################################
#######pt-online-schema-change is still running!!!########
##########################################################

4. When pt-online-schema-change is still running, update the PRIMARY KEY of one record.

mysql> update ord set id=5 where id=17235885;
Query OK, 1 row affected (0.80 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> select * from ord where id=17235885;
Empty set (0.07 sec)

mysql> select * from ord where id=5;
+----+-------------+---------+---------------------+----------------------+
| id | customer_id | good_id | create_time | datetime_lastchanged |
+----+-------------+---------+---------------------+----------------------+
| 5 | 1 | 2 | 2015-06-26 16:40:54 | 2015-08-10 10:36:38 |
+----+-------------+---------+---------------------+----------------------+


5. When pt-online-schema-change reports that it has successfully altered the table, I find that the record id=17235885 still exists, and the record id=5 exists too.

mysql> select * from ord where id=17235885;
+----------+-------------+---------+---------------------+----------------------+------+
| id | customer_id | good_id | create_time | datetime_lastchanged | name |
+----------+-------------+---------+---------------------+----------------------+------+
| 17235885 | 1 | 2 | 2015-06-26 16:40:54 | 2015-08-10 10:36:38 | NULL |
+----------+-------------+---------+---------------------+----------------------+------+
1 row in set (0.03 sec)

mysql> select * from ord where id=5;
+----+-------------+---------+---------------------+----------------------+------+
| id | customer_id | good_id | create_time | datetime_lastchanged | name |
+----+-------------+---------+---------------------+----------------------+------+
| 5 | 1 | 2 | 2015-06-26 16:40:54 | 2015-08-10 10:36:38 | NULL |
+----+-------------+---------+---------------------+----------------------+------+
1 row in set (0.03 sec)


Because of this problem, I dare not use pt-online-schema-change, since I cannot guarantee the application never updates a PRIMARY KEY. Can anybody solve this problem or give me any help? Thank you very much, and I hope the Percona team can solve this problem completely.
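
For context on why the stale row survives: the UPDATE trigger that pt-online-schema-change creates essentially REPLACEs the NEW row image into the shadow table. A simplified sketch of its shape (my own illustration, not the exact trigger text the tool generates):

-- Hypothetical, simplified form of the UPDATE trigger on `ord`
CREATE TRIGGER pt_osc_ord_upd AFTER UPDATE ON `db1`.`ord` FOR EACH ROW
REPLACE INTO `db1`.`_ord_new` (`id`, `customer_id`, `good_id`, `create_time`, `datetime_lastchanged`)
VALUES (NEW.`id`, NEW.`customer_id`, NEW.`good_id`, NEW.`create_time`, NEW.`datetime_lastchanged`);

Because the trigger only sees the NEW values, "update ord set id=5 where id=17235885" replaces a row with id=5 into `_ord_new`, while the copy of id=17235885 that was already chunk-copied into `_ord_new` is never deleted; after the table swap both rows are present.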

Percona TokuDB recovery running over 24 hours

Latest Forum Posts - August 7, 2015 - 12:31pm
Background
Yesterday, we had a situation that presented (at least to my eyes) like a deadlock. A TRUNCATE TABLE (running against an uncontended table) had been issued; its status was "query completed" but it never cleared from the process list. Further, user connections to unrelated tables within that schema were unable to connect, hanging on "waiting for metadata lock" (or similar, didn't get the screenshot).

A first attempt to kill the truncate query, and then the hung queries, changed their status to "killed" but they never disappeared from the process list.

At this point, even a superuser was unable to log in to any schema (a previously opened monitor showed the connection as "waiting for metadata lock").

An orderly shutdown was attempted, but after a lengthy period it looks like the OS kill -9'd mysqld. As a bunch of memory unexpectedly seemed pinned, a reboot seemed in order and was done.

Now, for the past 24-ish hours, Percona Server seems to be either replaying (very slowly) an unbelievable number of transactions, or it is in a loop of some kind. To give you a sense of the log messages:

: 2015-08-07 09:01:42 25894 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2015-08-07 09:01:42 25894 [Note] InnoDB: Memory barrier is not used 2015-08-07 09:01:42 25894 [Note] InnoDB: Compressed tables use zlib 1.2.3 2015-08-07 09:01:42 25894 [Note] InnoDB: Using Linux native AIO 2015-08-07 09:01:42 25894 [Note] InnoDB: Using CPU crc32 instructions 2015-08-07 09:01:42 25894 [Note] InnoDB: Initializing buffer pool, size = 20.0G 2015-08-07 09:01:43 25894 [Note] InnoDB: Completed initialization of buffer pool 2015-08-07 09:01:44 25894 [Note] InnoDB: Highest supported file format is Barracuda. 2015-08-07 09:01:44 25894 [Note] InnoDB: The log sequence numbers 551884808204 and 551884808204 in ibdata files do not match the log sequence number 551884808684 in the ib_logfiles! 2015-08-07 09:01:44 25894 [Note] InnoDB: Database was not shutdown normally! 2015-08-07 09:01:44 25894 [Note] InnoDB: Starting crash recovery. 2015-08-07 09:01:44 25894 [Note] InnoDB: Reading tablespace information from the .ibd files... 2015-08-07 09:01:44 25894 [Note] InnoDB: Restoring possible half-written data pages 2015-08-07 09:01:44 25894 [Note] InnoDB: from the doublewrite buffer... 2015-08-07 09:01:44 25894 [Note] InnoDB: starting tracking changed pages from LSN 551884808684 InnoDB: Transaction 122307881 was in the XA prepared state. InnoDB: 1 transaction(s) which must be rolled back or cleaned up InnoDB: in total 0 row operations to undo InnoDB: Trx id counter is 122333952 InnoDB: Last MySQL binlog file position 0 134745064, file name mysqld-bin.000281 2015-08-07 09:01:44 25894 [Note] InnoDB: 128 rollback segment(s) are active. InnoDB: Starting in background the rollback of uncommitted transactions 2015-08-07 09:01:44 7f7b96bff700 InnoDB: Rollback of non-prepared transactions completed 2015-08-07 09:01:44 25894 [Note] InnoDB: Waiting for purge to start 2015-08-07 09:01:44 25894 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.25-73.1 started; log sequence number 551884808684 Fri Aug 7 09:01:44 2015 TokuFT recovery starting in env /var/lib/mysql/ Fri Aug 7 09:01:45 2015 TokuFT recovery scanning backward from 957963270 Fri Aug 7 09:01:45 2015 TokuFT recovery bw_end_checkpoint at 957917890 timestamp 1438892691965548 xid 957841798 (bw_newer) Fri Aug 7 09:01:45 2015 TokuFT recovery bw_begin_checkpoint at 957841798 timestamp 1438892662604852 (bw_between) Fri Aug 7 09:01:45 2015 TokuFT recovery turning around at begin checkpoint 957841798 time 29360696 Fri Aug 7 09:01:45 2015 TokuFT recovery starts scanning forward to 957963270 from 957841798 left 121472 (fw_between) Fri Aug 7 09:02:09 2015 TokuFT lsn 957930340 commit xid 405185:0 1909760/293672232 1% Fri Aug 7 09:02:24 2015 TokuFT lsn 957930340 commit xid 405185:0 3742720/293672232 1% Fri Aug 7 09:02:39 2015 TokuFT lsn 957930340 commit xid 405185:0 5556224/293672232 2% Fri Aug 7 09:02:54 2015 TokuFT lsn 957930340 commit xid 405185:0 7327744/293672232 2% Fri Aug 7 09:03:09 2015 TokuFT lsn 957930340 commit xid 405185:0 8884224/293672232 3% -- snip -- Fri Aug 7 09:21:09 2015 TokuFT lsn 957930340 commit xid 405185:0 150720512/293672232 51% Fri Aug 7 09:21:24 2015 TokuFT lsn 957930340 commit xid 405185:0 152825856/293672232 52% Fri Aug 7 09:21:39 2015 TokuFT lsn 957930340 commit xid 405185:0 154894336/293672232 53% 150807 09:21:42 mysqld_safe Transparent huge pages are already set to: never. 
150807 09:21:42 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql 2015-08-07 09:21:42 0 [Note] /usr/sbin/mysqld (mysqld 5.6.25-73.1-log) starting as process 29206 ... 2015-08-07 09:21:42 29206 [Note] InnoDB: Using atomics to ref count buffer pool pages 2015-08-07 09:21:42 29206 [Note] InnoDB: The InnoDB memory heap is disabled 2015-08-07 09:21:42 29206 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2015-08-07 09:21:42 29206 [Note] InnoDB: Memory barrier is not used 2015-08-07 09:21:42 29206 [Note] InnoDB: Compressed tables use zlib 1.2.3 2015-08-07 09:21:42 29206 [Note] InnoDB: Using Linux native AIO 2015-08-07 09:21:42 29206 [Note] InnoDB: Using CPU crc32 instructions 2015-08-07 09:21:42 29206 [Note] InnoDB: Initializing buffer pool, size = 20.0G 2015-08-07 09:21:43 29206 [Note] InnoDB: Completed initialization of buffer pool 2015-08-07 09:21:44 29206 [Note] InnoDB: Highest supported file format is Barracuda. 2015-08-07 09:21:44 29206 [Note] InnoDB: The log sequence numbers 551884808204 and 551884808204 in ibdata files do not match the log sequence number 551884808694 in the ib_logfiles! 2015-08-07 09:21:44 29206 [Note] InnoDB: Database was not shutdown normally! 2015-08-07 09:21:44 29206 [Note] InnoDB: Starting crash recovery. 2015-08-07 09:21:44 29206 [Note] InnoDB: Reading tablespace information from the .ibd files... 2015-08-07 09:21:44 29206 [Note] InnoDB: Restoring possible half-written data pages 2015-08-07 09:21:44 29206 [Note] InnoDB: from the doublewrite buffer... 2015-08-07 09:21:44 29206 [Note] InnoDB: starting tracking changed pages from LSN 551884808694 InnoDB: Transaction 122307881 was in the XA prepared state. InnoDB: 1 transaction(s) which must be rolled back or cleaned up InnoDB: in total 0 row operations to undo InnoDB: Trx id counter is 122334464 InnoDB: Last MySQL binlog file position 0 134745064, file name mysqld-bin.000281 2015-08-07 09:21:45 29206 [Note] InnoDB: 128 rollback segment(s) are active. 
InnoDB: Starting in background the rollback of uncommitted transactions 2015-08-07 09:21:45 7feb457ff700 InnoDB: Rollback of non-prepared transactions completed 2015-08-07 09:21:45 29206 [Note] InnoDB: Waiting for purge to start 2015-08-07 09:21:45 29206 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.25-73.1 started; log sequence number 551884808694 Fri Aug 7 09:21:45 2015 TokuFT recovery starting in env /var/lib/mysql/ Fri Aug 7 09:21:45 2015 TokuFT recovery scanning backward from 957963270 Fri Aug 7 09:21:45 2015 TokuFT recovery bw_end_checkpoint at 957917890 timestamp 1438892691965548 xid 957841798 (bw_newer) Fri Aug 7 09:21:46 2015 TokuFT recovery bw_begin_checkpoint at 957841798 timestamp 1438892662604852 (bw_between) Fri Aug 7 09:21:46 2015 TokuFT recovery turning around at begin checkpoint 957841798 time 29360696 Fri Aug 7 09:21:46 2015 TokuFT recovery starts scanning forward to 957963270 from 957841798 left 121472 (fw_between) Fri Aug 7 09:22:10 2015 TokuFT lsn 957930340 commit xid 405185:0 2105344/293672232 1% Fri Aug 7 09:22:40 2015 TokuFT lsn 957930340 commit xid 405185:0 5748736/293672232 2% Fri Aug 7 09:23:10 2015 TokuFT lsn 957930340 commit xid 405185:0 9150464/293672232 3% --- snip -- Fri Aug 7 09:36:40 2015 TokuFT lsn 957930340 commit xid 405185:0 115834880/293672232 39% Fri Aug 7 09:36:55 2015 TokuFT lsn 957930340 commit xid 405185:0 118014976/293672232 40% Fri Aug 7 09:37:10 2015 TokuFT lsn 957930340 commit xid 405185:0 120132608/293672232 41% Fri Aug 7 09:37:25 2015 TokuFT lsn 957930340 commit xid 405185:0 122251264/293672232 42% Fri Aug 7 09:37:40 2015 TokuFT lsn 957930340 commit xid 405185:0 124372992/293672232 42% -- snip -- Fri Aug 7 09:38:40 2015 TokuFT lsn 957930340 commit xid 405185:0 133026816/293672232 45% Fri Aug 7 09:38:55 2015 TokuFT lsn 957930340 commit xid 405185:0 135154688/293672232 46% Fri Aug 7 09:39:25 2015 TokuFT lsn 957930340 commit xid 405185:0 139439104/293672232 47% -- snip -- Fri Aug 7 09:41:10 2015 TokuFT lsn 957930340 commit xid 405185:0 154284032/293672232 53% Fri Aug 7 09:41:25 2015 TokuFT lsn 957930340 commit xid 405185:0 156420096/293672232 53% Fri Aug 7 09:41:40 2015 TokuFT lsn 957930340 commit xid 405185:0 158516224/293672232 54% 150807 09:41:42 mysqld_safe Transparent huge pages are already set to: never. 150807 09:41:42 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
So while the log sequence and the TokuFT commit xid numbers seem to stay static, the scanned checkpoints and % complete (always ~49-55%) change from run to run.

It's unclear if this is somehow unrolling some gigantic transaction (the user had done some huge table updates, DDL changes, etc. recently and I suspect had autocommit=0) or if this is a loop that's never going to end.

Any thoughts on how I can determine if this is a valid recovery or if more drastic measures are in order?

Thanks.


The MySQL query cache: Worst enemy or best friend?

Latest MySQL Performance Blog posts - August 7, 2015 - 3:00am

During the last couple of months I have been involved in an unusually high number of performance audits for e-commerce applications running with Magento. And although the systems were quite different, they also had one thing in common: the MySQL query cache was very useful. That was counter-intuitive for me as I’ve always expected the query cache to be such a bottleneck that response time is better when the query cache is turned off no matter what. That led me to run a few experiments to better understand when the query cache can be helpful.

Some context

The query cache is well known for its contentions: a global mutex has to be acquired for any read or write operation, which means that any access is serialized. This was not an issue 15 years ago, but with today’s multi-core servers, such serialization is the best way to kill performance.

However from a performance point of view, any query cache hit is served in a few tens of microseconds while the fastest access with InnoDB (a primary key lookup) still requires several hundred microseconds. Yes, the query cache is at least an order of magnitude faster than any query that goes to InnoDB.

A simple test

To better understand how good or bad the query cache can be, I set up a very simple benchmark:

  • 1M records were inserted in 16 tables.
  • A moderate write load (65 updates/s) was run with a modified version of the update_index.lua sysbench script (see the end of the post for the code).
  • The select.lua sysbench script was run, with several values for the --num-threads option.

Note that the test is designed to be unfavorable to the query cache as the whole dataset fits in the buffer pool and the SELECT statements are very simple. Also note that I configured the query cache to be large enough so that no entry was evicted from the cache due to low memory.
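
For reference, the query cache settings for such a test would look roughly like this in my.cnf (a sketch; the size is an illustrative value, the post only says the cache was large enough that nothing was evicted):

[mysqld]
# ON for the first set of runs; use query_cache_type = 0 and query_cache_size = 0 for the OFF runs
query_cache_type = 1
query_cache_size = 256M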

Results – MySQL query cache ON

First here are the results when the query cache is enabled:

This configuration scales well up to 4 concurrent threads, but then the throughput degrades very quickly. With 10 concurrent threads, SHOW PROCESSLIST is enough to show you that all threads spend all their time waiting for the query cache mutex. Okay, this is not a surprise.

Results – MySQL query cache OFF

When the query cache is disabled, this is another story:

Throughput scales well up to somewhere between 10 and 20 threads (for the record the server I was using had 16 cores). But more importantly, even at the higher concurrencies, the overall throughput continued to increase: at 20 concurrent threads, MySQL was able to serve nearly 3x more queries without the query cache.

Conclusion

With Magento, you can expect to have a light write workload, very low concurrency and also quite complex SELECT statements. Given the results of our simple benchmarks, it is not that surprising after all that the MySQL query cache is a good fit in this case.

It is also worth noting that many applications run a database workload where writes are light and concurrency is low: the query cache should then not be discarded immediately. And maybe it is time for Oracle to make plans to improve the query cache as suggested by Peter a few years ago?

Annex: sysbench commands

# Modified update_index.lua
function event(thread_id)
   local table_name
   table_name = "sbtest" .. sb_rand_uniform(1, oltp_tables_count)
   rs = db_query("UPDATE " .. table_name .. " SET k=k+1 WHERE id=" .. sb_rand(1, oltp_table_size))
   db_query("SELECT SLEEP(0.015)")
end

# Populate the tables
sysbench --mysql-socket=/data/mysql/mysql.sock --mysql-user=root --mysql-db=db1 --oltp-table-size=1000000 --oltp-tables-count=16 --num-threads=16 --test=/usr/share/doc/sysbench/tests/db/insert.lua prepare
# Write workload
sysbench --mysql-socket=/data/mysql/mysql.sock --mysql-user=root --mysql-db=db1 --oltp-tables-count=16 --num-threads=1 --test=/usr/share/doc/sysbench/tests/db/update_index.lua --max-requests=1000000 run
# Read workload
sysbench --mysql-socket=/data/mysql/mysql.sock --mysql-user=root --mysql-db=db1 --oltp-tables-count=16 --num-threads=1 --test=/usr/share/doc/sysbench/tests/db/select.lua --max-requests=10000000 run


Percona XtraDB, Question to the experts!!! ;)

Latest Forum Posts - August 7, 2015 - 12:52am
Hi everyone!!

I have a Percona XtraDB Cluster with 2 nodes, each one in Primary mode. (I know that 3 nodes are preferable, but the project needs only 2 nodes.)
The rare situation that I have tested is this:

Step 1 - I start the node A with "/etc/init.d/mysql bootstrap-pxc".
Step 2 - I start the node B with "/etc/init.d/mysql start" (Both nodes are synced).
Step 3 - I reboot the machine with node A.
Step 4 - I modify the database (inserts/updates...) via node B (the only node up).
Step 5 - I stop node B with "/etc/init.d/mysql stop".
Step 6 - I start node A with "/etc/init.d/mysql bootstrap-pxc".
Step 7 - I start node B with "/etc/init.d/mysql start"... and the error log is:
MySQL (Percona XtraDB Cluster) is not running, but lock fil[FALLÓ]/lock/subsys/mysql) exists
Starting MySQL (Percona XtraDB Cluster)....................[FALLÓ]...The server quit without updating PID file (/DATAMYSQL/data1/mysql.pid).
MySQL (Percona XtraDB Cluster) server startup failed! [FALLÓ]


I know... this situation is very, very strange, but can the 'Percona Experts' tell me whether it is possible to connect the 2 nodes again in this way, and what would happen?

Best Regards.

How do I get the last LSN number if I am streaming the backup?

Latest Forum Posts - August 6, 2015 - 10:46pm
Hi, I want to use the streaming feature for taking backups. Streaming works, but I need to get the LSN number so that I can take incrementals later. The only 2 ways I see are to open up the backup tar and read the backup file, or to get the last LSN from stdout. Question is, is there a better way?
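
One option worth checking (my suggestion, not from the post, and assuming a reasonably recent Percona XtraBackup): the --extra-lsndir option writes a copy of xtrabackup_checkpoints outside the stream, so the LSN for the next incremental can be read without unpacking the tar. A sketch with placeholder paths:

# Stream the full backup, but keep the checkpoint metadata in a local directory
innobackupex --stream=tar --extra-lsndir=/backups/lsn /tmp | gzip > /backups/full.tar.gz
# The LSN needed for the next incremental is then available locally
grep to_lsn /backups/lsn/xtrabackup_checkpoints
# Later, take the incremental from that LSN
innobackupex --incremental --incremental-lsn=$(awk '/to_lsn/ {print $3}' /backups/lsn/xtrabackup_checkpoints) /backups/inc1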

Bug in ss_get_mysql_stat / thread graph?

Latest Forum Posts - August 6, 2015 - 5:34am
To me it looks like a bug in the Threads graph.
I monitor multiple MySQL servers with the Percona MySQL Cacti template. All plots are fine except for the Threads graph, which is identical for all servers.
Looking at the log file, it seems like the --host option is set wrongly when fetching the thread info.

/usr/bin/php -q /.../www/cacti/scripts/ss_get_mysql_stats.php --host cacti-log-db.mydomian.com --items it,iu,iv,iw,iy --user myuser --pass mypass --port --server-id , output: it:3 iu:21 iv:218 iw:1 iy:10

To me it looks like it uses the Cacti backend DB server rather than the DB server it should fetch statistics from.


Has anybody experienced similar problems?
Where can I report this as a bug?
Any suggestion for further debugging?

Regards Gjermundf

PXC – Incremental State transfers in detail

Latest MySQL Performance Blog posts - August 5, 2015 - 3:00am
IST Basics

State transfers in Galera remain a mystery to most people.  Incremental State transfers (as opposed to full State Snapshot transfers) are used under the following conditions:

  • The Joiner node reports a valid Galera GTID to the cluster
  • The Donor node selected contains all the transactions the Joiner needs to catch up to the rest of the cluster in its Gcache
  • The Donor node can establish a TCP connection to the Joiner on port 4568 (by default)
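
A quick way to sanity-check the last two conditions (my own sketch, not from the post; the Joiner host name is a placeholder):

# On the Donor: how much Gcache is configured (gcache.size inside wsrep_provider_options)
mysql -e "SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options'" | tr ';' '\n' | grep gcache.size
# From the Donor: can it reach the Joiner's IST port?
nc -z joiner-host 4568 && echo "IST port reachable"
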
IST states

Galera has many internal node states related to Joiner nodes.  They currently are:

  1. Joining
  2. Joining: preparing for State Transfer
  3. Joining: requested State Transfer
  4. Joining: receiving State Transfer
  5. Joining: State Transfer request failed
  6. Joining: State Transfer failed
  7. Joined

I don’t claim any special knowledge of most of these states apart from what their titles indicate.  Many of these states occur very briefly and it is unlikely you’ll ever actually see them on a node’s wsrep_local_state_comment.

During IST, however, I have observed that the following states have the potential to take a long while:

Joining: receiving State Transfer

During this state transactions are being streamed to the Joiner’s wsrep_local_recv_queue.  You can connect to the node at this time and poll its state.  If you do, you’ll easily see the inbound queue increasing (usually quickly) but no writesets being ‘received’ (read: applied).  It’s not clear to me if there is a reason why transaction apply couldn’t be started during this stream, but it does not do so currently.

The further behind the Joiner is, the longer this can take.  Here’s some output from the latest release of myq-tools showing wsrep stats:

[root@node2 ~]# myq_status wsrep
mycluster / node2 (idx: 1) / Galera 3.11(ra0189ab)
         Cluster  Node      Outbound     Inbound       FlowC    Conflct Gcache Appl
    time P cnf  # stat laten msgs data que msgs data que pause snt lcf bfa  ist idx %ef
14:04:40 P 4 2 J:Rc 0.4ms 0 0b 0 1 197b 4k 0ns 0 0 0 367k 0 94%
14:04:41 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 5k 0ns 0 0 0 368k 0 93%
14:04:42 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 6k 0ns 0 0 0 371k 0 92%
14:04:43 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 7k 0ns 0 0 0 373k 0 92%
14:04:44 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 8k 0ns 0 0 0 376k 0 92%
14:04:45 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 10k 0ns 0 0 0 379k 0 92%
14:04:46 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 11k 0ns 0 0 0 382k 0 92%
14:04:47 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 12k 0ns 0 0 0 386k 0 91%
14:04:48 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 13k 0ns 0 0 0 390k 0 91%
14:04:49 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 14k 0ns 0 0 0 394k 0 91%
14:04:50 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 15k 0ns 0 0 0 397k 0 91%
14:04:51 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 16k 0ns 0 0 0 401k 0 91%
14:04:52 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 18k 0ns 0 0 0 404k 0 91%
14:04:53 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 19k 0ns 0 0 0 407k 0 91%
14:04:54 P 4 2 J:Rc 0.4ms 0 0b 0 0 0b 20k 0ns 0 0 0 411k 0 91%

The node is in ‘J:Rc’ (Joining: Receiving) state and we can see the Inbound queue growing (wsrep_local_recv_queue). Otherwise this node is not sending or receiving transactions.
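
If you don’t have myq-tools handy, the same counters can be polled directly on the Joiner (a minimal sketch; run it repeatedly while the node is joining):

SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'; -- e.g. 'Joining: receiving State Transfer'
SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue';    -- inbound queue, grows during this phase
SHOW GLOBAL STATUS LIKE 'wsrep_received';            -- received/applied writesets, stays flat until 'Joining'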

Joining

Once all the requested transactions are copied over, the Joiner flips to the ‘Joining’ state, during which it starts applying the transactions as quickly as the wsrep_slave_threads can go.  For example:

         Cluster  Node      Outbound     Inbound       FlowC    Conflct Gcache Appl
    time P cnf  # stat laten msgs data que msgs data que pause snt lcf bfa  ist idx %ef
14:04:55 P 4 2 Jing 0.6ms 0 0b 0 2243 3.7M 19k 0ns 0 0 0 2236 288 91%
14:04:56 P 4 2 Jing 0.5ms 0 0b 0 4317 7.0M 16k 0ns 0 0 0 6520 199 92%
14:04:57 P 4 2 Jing 0.5ms 0 0b 0 4641 7.5M 12k 0ns 0 0 0 11126 393 92%
14:04:58 P 4 2 Jing 0.4ms 0 0b 0 4485 7.2M 9k 0ns 0 0 0 15575 200 93%
14:04:59 P 4 2 Jing 0.5ms 0 0b 0 4564 7.4M 5k 0ns 0 0 0 20102 112 93%

Notice the Inbound msgs (wsrep_received) starts increasing rapidly and the queue decreases accordingly.

Joined

14:05:00 P 4 2 Jned 0.5ms 0 0b 0 4631 7.5M 2k 0ns 0 0 0 24692 96 94%

Towards the end the node briefly switches to the ‘Joined’ state, though that is a fast state in this case. ‘Joining’ and ‘Joined’ are similar states; the difference (I believe) is that:

  • ‘Joining’ is applying transactions acquired via the IST
  • ‘Joined’ is applying transactions that have queued up via standard Galera replication since the IST (i.e., everything has been happening on the cluster since the IST)
Flow control during Joining/Joined states

The Codership documentation says something interesting about ‘Joined’ (from experimentation, I believe the ‘Joining’ state behaves the same here):

Nodes in this state can apply write-sets. Flow Control here ensures that the node can eventually catch up with the cluster. It specifically ensures that its write-set cache never grows. Because of this, the cluster wide replication rate remains limited by the rate at which a node in this state can apply write-sets. Since applying write-sets is usually several times faster than processing a transaction, nodes in this state hardly ever effect cluster performance.

What this essentially means is that a Joiner’s wsrep_local_recv_queue is allowed to shrink but NEVER GROW during an IST catchup.  Growth will trigger flow control, but why would it grow?  Writes on other cluster nodes must still be replicated to our Joiner and added to the queue.

If the Joiner’s apply rate is less than the rate of writes coming from Cluster replication, flow control will be applied to slow down Cluster replication (read: your application writes).  As far as I can tell, there is no way to tune this or turn it off.  The Codership manual continues here:

The one occasion when nodes in the JOINED state do effect cluster performance is at the very beginning, when the buffer pool on the node in question is empty.

Essentially a Joiner node with a cold cache can really hurt performance on your cluster.  This can possibly be improved by:

  • Better IO and other resources available to the Joiner for a quicker cache warmup.  A huge example of this would be flash over conventional storage.
  • Buffer pool preloading (see the sketch after this list)
  • More Galera apply threads
  • etc.
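
On the buffer pool preloading point, a minimal my.cnf sketch (my own, not from the post) using the dump/load feature available in MySQL and Percona Server 5.6, so a restarted Joiner comes back with a warmer cache:

[mysqld]
innodb_buffer_pool_dump_at_shutdown = 1
innodb_buffer_pool_load_at_startup  = 1
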
Synced

From what I can tell, the ‘Joined’ state ends when the wsrep_local_recv_queue drops lower than the node’s configured flow control limit.  At that point it changes to ‘Synced’ and the node behaves more normally (with respect to flow control).

         Cluster  Node      Outbound     Inbound       FlowC    Conflct Gcache Appl
    time P cnf  # stat laten msgs data que msgs data que pause snt lcf bfa  ist idx %ef
14:05:01 P 4 2 Sync 0.5ms 0 0b 0 3092 5.0M 0 0ns 0 0 0 27748 150 94%
14:05:02 P 4 2 Sync 0.5ms 0 0b 0 1067 1.7M 0 0ns 0 0 0 28804 450 93%
14:05:03 P 4 2 Sync 0.5ms 0 0b 0 1164 1.9M 0 0ns 0 0 0 29954 67 92%
14:05:04 P 4 2 Sync 0.5ms 0 0b 0 1166 1.9M 0 0ns 0 0 0 31107 280 92%
14:05:05 P 4 2 Sync 0.5ms 0 0b 0 1160 1.9M 0 0ns 0 0 0 32258 606 91%
14:05:06 P 4 2 Sync 0.5ms 0 0b 0 1154 1.9M 0 0ns 0 0 0 33401 389 90%
14:05:07 P 4 2 Sync 0.5ms 0 0b 0 1147 1.8M 1 0ns 0 0 0 34534 297 90%
14:05:08 P 4 2 Sync 0.5ms 0 0b 0 1147 1.8M 0 0ns 0 0 0 35667 122 89%
14:05:09 P 4 2 Sync 0.5ms 0 0b 0 1121 1.8M 0 0ns 0 0 0 36778 617 88%

Conclusion

You may not notice these states during IST if you aren’t watching the Joiner closely, but if your IST is taking a long while, it should be easy using the above information to understand what is happening.


MySQL QA Episode 11: Valgrind Testing: Pro’s, Con’s, Why and How

Latest MySQL Performance Blog posts - August 4, 2015 - 3:00am

Today’s episode is all about Valgrind – from the pro’s to the con’s, from the why to the how! This episode will be of interest to anyone who is or wants to work with Valgrind on a regular or semi-regular basis.

  1. Pro’s/Why
  2. Con’s
  3. How
    1. Using the latest version
      sudo [yum/apt-get] install valgrind
      #OR#
      sudo [yum/apt-get] remove valgrind
      sudo [yum/apt-get] install bzip2 glibc*
      wget http://valgrind.org/downloads/valgrind-3.10.1.tar.bz2
      tar -xf valgrind-3.10.1.tar.bz2; cd valgrind-3.10.1
      ./configure; make; sudo make install
      valgrind --version # This should now read 3.10.1
    2. VGDB (cd ./mysql-test)
      ./lib/v1/mysql-test-run.pl --start-and-exit --valgrind --valgrind-option="--leak-check=yes"
      --valgrind-option="--vgdb=yes" --valgrind-option="--vgdb-error=0" 1st
      #OR# valgrind --leak-check=yes --vgdb=yes --vgdb-error=0 ./mysqld --options…
      #ALSO CONSIDER# --num-callers=40 --show-reachable=yes --track-origins=yes
      gdb /Percona-server/bin/mysqld
      (gdb) target remote | vgdb
      (gdb) bt
    3. pquery Valgrind
    4. Valgrind stacks overview & analysis
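
As a quick way to see whether a Valgrind run picked anything up (my own sketch; the log path is a placeholder and depends on where the Valgrind output was redirected, e.g. the error log of the mysql-test-run invocation above):

grep -E "ERROR SUMMARY|definitely lost|Invalid (read|write)" /path/to/mysqld/error.log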

Full-screen viewing @ 720p resolution recommended.

 


Zombie Node

Latest Forum Posts - August 3, 2015 - 5:53pm
I have a 6-node cluster that was running well.
Unfortunately, the other day one of the servers had a lockup: still pingable but not reachable.
Until I killed the node, the entire cluster was locked and unable to serve any queries.
The version I am using is Percona-XtraDB-Cluster 56-5.6.22-25.8.978.
Once the node was disabled, the services returned to normal.

Has anybody experienced this before, and is there a possible solution to stop it happening again?

Regards

Darren

Percona XtraBackup 2.2.12 is now available

Latest MySQL Performance Blog posts - August 3, 2015 - 5:31am

Percona is glad to announce the release of Percona XtraBackup 2.2.12 on August 3, 2015. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

Bugs Fixed:

  • Percona XtraBackup would segfault during the prepare phase of certain FTS pages. Bug fixed #1460138.
  • Fixed compilation error due to missing dependency caused by the upstream bug #77226. Bug fixed #1461129.
  • A regression introduced by fixing bug #1403237 in Percona XtraBackup 2.2.8 could cause xtrabackup to read the redo log from an incorrect offset, which would cause an assertion. Bug fixed #1464608.
  • Fixed uninitialized current_thd thread-local variable. This also completely fixes bug #1415191. Bug fixed #1467574.
  • After the release of Percona XtraBackup 2.2.11, innobackupex issues a FLUSH TABLES before running the FLUSH TABLES WITH READ LOCK. While this will help backups in some situations, it also implies that the FLUSH TABLES will be written to the binary log. On MariaDB 10.0 with GTID enabled, when the backup was taken on the slave, this altered the GTID of that slave and Percona XtraBackup didn’t see the correct GTID anymore. Bug fixed #1466446 (Julien Pivotto).
  • RPM compilation of Percona XtraBackup was still requiring bzr. Bug fixed #1466888 (Julien Pivotto).
  • Compiling Percona XtraBackup RPMs with XB_VERSION_EXTRA option would create an incorrect RPM version. Bug fixed #1467424 (Julien Pivotto).
  • Percona XtraBackup would complete successfully even when the redo log wasn’t copied completely. This means that backups were considered successful even when they were corrupt. Bug fixed #1470847.
  • In rare cases when there are two or more tablespaces with the same ID in the data directory, xtrabackup picks up the first one by lexical order, which could lead to losing the correct table. Bug fixed #1475487.
  • Percona XtraBackup was missing revision_id in binaries. Bug fixed #1394174.

Release notes with all the bugfixes for Percona XtraBackup 2.2.12 are available in our online documentation. Bugs can be reported on the launchpad bug tracker. Percona XtraBackup is an open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases.


InnoDB checkpoint strikes back

Latest MySQL Performance Blog posts - August 3, 2015 - 3:00am

In my recent benchmarks for MongoDB, we can see that the two engines WiredTiger and TokuMX suffer from periodic drops in throughput, which are clearly related to the checkpoint interval, and therefore I attribute them to checkpoint activity.

The funny thing is that I thought we had solved checkpointing issues in InnoDB once and for all. There are a bunch of posts on this issue in InnoDB, dated some 4 years ago.  We did a lot of research back then working on a fix for Percona Server.

But, like Baron said, “History Repeats“… and it seems it also repeats in technical issues.

So, let’s take a look at what checkpointing is, and why it is a problem for storage engines.

As you may know, a transactional engine writes transactional logs (names you may see: “redo logs”, “write-ahead logs”, WAL, etc.) to be able to perform crash recovery in the case of a database crash or server power outage. To keep recovery time bounded (we all expect the database to start quickly), the engine has to limit how many changes are in the logs. For this, a transactional engine performs a “checkpoint,” which basically synchronizes changes in memory with the corresponding changes in the logs, so old log records can be deleted. Often this results in writing changed database pages from memory to permanent storage.

InnoDB takes the approach of limiting the total size of the log files (which equals innodb-log-file-size * innodb-log-files-in-group), which I name “size-limited checkpointing”.
Both TokuDB and WiredTiger limit changes by time periods (by default 60 sec), which means the log files may grow without limit within a given time interval (I name this “time-limited checkpointing”).
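
To see what that size limit is on a given InnoDB instance (a quick check of my own, not from the post):

SHOW GLOBAL VARIABLES LIKE 'innodb_log_file_size';
SHOW GLOBAL VARIABLES LIKE 'innodb_log_files_in_group';
-- total redo space bounding the checkpoint = innodb_log_file_size * innodb_log_files_in_group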

Another difference is that InnoDB takes a “fuzzy checkpointing” approach (which was not really “fuzzy” until we fixed it): instead of waiting until the size limit is reached, it performs checkpointing all the time, and the closer it gets to the log size limit, the more intensive the checkpointing becomes. This allows it to achieve more or less smooth throughput, without significant drops.

Unlike InnoDB, both TokuDB (of that I am sure) and WiredTiger (I speculate here, I did not look into WiredTiger internals) wait until the last moment and perform the checkpoint strictly at the prescribed interval. If it happens that the database contains many changes in memory, this results in performance stalls. The effect is similar to “hitting a wall at full speed”: user queries get locked until the engine writes out all the changes in memory it has to write.

Interestingly enough, RocksDB, because it has a different architecture (I may write on it in future, but for now I will point to RocksDB Wiki), does not have this problem with checkpoints (it however has its own background activity, like level compactions and tombstone maintenance, but it is a different topic).

I do not know how WiredTiger is going to approach this issue with checkpoints, but we are looking to improve TokuDB to make this less painful for user queries, and eventually to move to a “fuzzy checkpointing” model.


Ubuntu 14.04: invoke-rc.d: initscript mysql, action "start" failed

Latest Forum Posts - August 2, 2015 - 6:03am
When upgrading Percona Server 5.6.25-73.1-1.trusty, I got these error messages during the apt process:

: * Starting MySQL (Percona Server) database server mysqld [fail] invoke-rc.d: initscript mysql, action "start" failed. dpkg: error processing package percona-server-server-5.6 (--configure): subprocess installed post-installation script returned error exit status 1 Errors were encountered while processing: percona-server-server-5.6 E: Sub-process /usr/bin/dpkg returned an error code (1) on the /var/log/syslog:

: Aug 2 19:45:01 bcserver CRON[23017]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1) Aug 2 19:46:41 bcserver mysqld_safe[23120]: 2015-08-02 19:46:41 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead. Aug 2 19:46:41 bcserver mysqld_safe[23120]: 2015-08-02 19:46:41 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details). Aug 2 19:46:41 bcserver mysqld_safe[23120]: 2015-08-02 19:46:41 0 [Note] /usr/sbin/mysqld (mysqld 5.6.25-73.1) starting as process 23124 ... Aug 2 19:47:18 bcserver mysqld_safe[23120]: Aug 2 19:47:18 bcserver mysqld_safe[23120]: Aug 2 19:47:18 bcserver mysqld_safe[23120]: 2015-08-02 19:47:18 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead. Aug 2 19:47:18 bcserver mysqld_safe[23120]: 2015-08-02 19:47:18 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details). Aug 2 19:47:18 bcserver mysqld_safe[23120]: 2015-08-02 19:47:18 0 [Note] /usr/sbin/mysqld (mysqld 5.6.25-73.1) starting as process 23150 ... Aug 2 19:47:53 bcserver mysqld_safe[23120]: WARNING: Default config file /etc/mysql/my.cnf exists on the system Aug 2 19:47:53 bcserver mysqld_safe[23120]: This file will be read by default by the MySQL server Aug 2 19:47:53 bcserver mysqld_safe[23120]: If you do not want to use this, either remove it, or use the Aug 2 19:47:53 bcserver mysqld_safe[23120]: --defaults-file argument to mysqld_safe when starting the server Aug 2 19:47:53 bcserver mysqld_safe[23120]: Aug 2 19:47:53 bcserver mysqld_safe[23191]: 2015-08-02 19:47:53 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead. Aug 2 19:47:53 bcserver mysqld_safe[23191]: 2015-08-02 19:47:53 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details). Aug 2 19:47:53 bcserver mysqld_safe[23191]: 2015-08-02 19:47:53 0 [Note] /usr/sbin/mysqld (mysqld 5.6.25-73.1) starting as process 23190 ... Aug 2 19:48:40 bcserver /etc/init.d/mysql[23783]: 0 processes alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping' resulted in Aug 2 19:48:40 bcserver /etc/init.d/mysql[23783]: #007/usr/bin/mysqladmin: connect to server at 'localhost' failed Aug 2 19:48:40 bcserver /etc/init.d/mysql[23783]: error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)' Aug 2 19:48:40 bcserver /etc/init.d/mysql[23783]: Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists! Aug 2 19:48:40 bcserver /etc/init.d/mysql[23783]: Yes, I have add another configuration because I want to put the data on other partition. The additional config on /etc/mysql/conf.d/bc.cnf

: [mysqld] datadir = /data/mysql max_connections = 4000 key_buffer_size = 200M low_priority_updates = 1 innodb_buffer_pool_size=300G skip-name-resolve Eventhough I could login to new mysql:

: ~$ mysql -u root -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.6.25-73.1 Percona Server (GPL), Release 73.1, Revision 07b797f Copyright (c) 2009-2015 Percona LLC and/or its affiliates Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> But on dpkg, percona server status is failed

: ~$ dpkg -l|grep percona ii libperconaserverclient18.1 5.6.25-73.1-1.trusty amd64 Percona Server database client library ii percona-server-client-5.6 5.6.25-73.1-1.trusty amd64 Percona Server database client binaries ii percona-server-common-5.6 5.6.25-73.1-1.trusty amd64 Percona Server database common files (e.g. /etc/mysql/my.cnf) iF percona-server-server-5.6 5.6.25-73.1-1.trusty amd64 Percona Server database server binaries that make everytime I install/upgrade other package, percona always restart and failed

: ~$ sudo invoke-rc.d mysql stop * Stopping MySQL (Percona Server) mysqld [ OK ] ~$ mysql -u root -p Enter password: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) : ~$ sudo invoke-rc.d mysql start * Starting MySQL (Percona Server) database server mysqld [fail] invoke-rc.d: initscript mysql, action "start" failed. ~$ mysql -u root -p Enter password: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) ~$ mysql -u root -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.6.25-73.1 Percona Server (GPL), Release 73.1, Revision 07b797f Copyright (c) 2009-2015 Percona LLC and/or its affiliates Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> And /var/log/mysql/error.log:

: 2015-08-02 20:13:50 23666 [Note] Shutting down plugin 'MRG_MYISAM' 2015-08-02 20:13:50 23666 [Note] Shutting down plugin 'CSV' 2015-08-02 20:13:50 23666 [Note] Shutting down plugin 'MEMORY' 2015-08-02 20:13:50 23666 [Note] Shutting down plugin 'sha256_password' 2015-08-02 20:13:50 23666 [Note] Shutting down plugin 'mysql_old_password' 2015-08-02 20:13:50 23666 [Note] Shutting down plugin 'mysql_native_password' 2015-08-02 20:13:50 23666 [Note] Shutting down plugin 'binlog' 2015-08-02 20:13:50 23666 [Note] /usr/sbin/mysqld: Shutdown complete 150802 20:13:50 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended 150802 20:14:15 mysqld_safe Starting mysqld daemon with databases from /data/mysql 2015-08-02 20:14:16 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead. 2015-08-02 20:14:16 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details). 2015-08-02 20:14:16 0 [Note] /usr/sbin/mysqld (mysqld 5.6.25-73.1) starting as process 24389 ... 2015-08-02 20:14:16 24389 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead. 2015-08-02 20:14:16 24389 [Note] Plugin 'FEDERATED' is disabled. 2015-08-02 20:14:16 24389 [Note] InnoDB: Using atomics to ref count buffer pool pages 2015-08-02 20:14:16 24389 [Note] InnoDB: The InnoDB memory heap is disabled 2015-08-02 20:14:16 24389 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2015-08-02 20:14:16 24389 [Note] InnoDB: Memory barrier is not used 2015-08-02 20:14:16 24389 [Note] InnoDB: Compressed tables use zlib 1.2.8 2015-08-02 20:14:16 24389 [Note] InnoDB: Using Linux native AIO 2015-08-02 20:14:16 24389 [Note] InnoDB: Using CPU crc32 instructions 2015-08-02 20:14:16 24389 [Note] InnoDB: Initializing buffer pool, size = 300.0G 2015-08-02 20:14:29 24389 [Note] InnoDB: Completed initialization of buffer pool 2015-08-02 20:14:31 24389 [Note] InnoDB: Highest supported file format is Barracuda. 2015-08-02 20:14:34 24389 [Note] InnoDB: 128 rollback segment(s) are active. 2015-08-02 20:14:34 24389 [Note] InnoDB: Waiting for purge to start 2015-08-02 20:14:35 24389 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.25-73.1 started; log sequence number 4535928816 2015-08-02 20:14:35 24389 [Note] RSA private key file not found: /data/mysql//private_key.pem. Some authentication plugins will not work. 2015-08-02 20:14:35 24389 [Note] RSA public key file not found: /data/mysql//public_key.pem. Some authentication plugins will not work. 2015-08-02 20:14:35 24389 [Note] Server hostname (bind-address): '*'; port: 3306 2015-08-02 20:14:35 24389 [Note] IPv6 is available. 2015-08-02 20:14:35 24389 [Note] - '::' resolves to '::'; 2015-08-02 20:14:35 24389 [Note] Server socket created on IP: '::'. 2015-08-02 20:14:35 24389 [Warning] 'user' entry 'root@bcserver' ignored in --skip-name-resolve mode. 2015-08-02 20:14:35 24389 [Warning] 'proxies_priv' entry '@ root@bcserver' ignored in --skip-name-resolve mode. 2015-08-02 20:14:35 24389 [Note] Event Scheduler: Loaded 0 events 2015-08-02 20:14:35 24389 [Note] /usr/sbin/mysqld: ready for connections. Version: '5.6.25-73.1' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Percona Server (GPL), Release 73.1, Revision 07b797f Thanks in advanced

mysql skip-name-resolve not stopping dns lookups

Latest Forum Posts - July 31, 2015 - 9:30am
Recently our Rackspace DNS servers were experiencing major lag which had a stalling effect on our mysql database servers. This was confusing since I had disabled DNS lookups months ago by adding **skip_name_resolve = ON** in my.cnf and restarting the mysql server.

After doing some research I found this article, which confirmed I had already taken the necessary steps to prevent this from happening:
http://codeinthehole.com/writing/sol...d-name-server/

However when doing a strace on the mysql process and attempting to connect to it the following output confirmed it was still accessing the Rackspace DNS server, which was exactly where the process was stalling:

> connect(209, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("173.203.4.9")}, 16) = 0

The temporary solution that fixed the stalling of mysql was to add Google's DNS server to the top of the list in **/etc/resolv.conf** like this:

> nameserver 8.8.8.8

I've searched and have not found any instances of the skip_name_resolve configuration not working. I could use some help in truly disabling DNS lookups for mysql.

Here are my specs for one of the database servers:

- OS version: Ubuntu 14.04.1 LTS
- MySQL version: 5.6.23-72.1-log Percona Server
- my.cnf setting: skip_name_resolve = ON
- show variables output: skip_name_resolve | ON
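
For anyone trying to reproduce the check described above, a minimal sketch (my own; strace flags and output format vary by system):

# Confirm the setting is active on the running server
mysql -e "SHOW GLOBAL VARIABLES LIKE 'skip_name_resolve'"
# Watch mysqld for outbound DNS traffic (port 53) while a client connects
sudo strace -f -e trace=network -p "$(pidof mysqld)" 2>&1 | grep 'htons(53)'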

MySQL QA Episode 10: Reproducing and Simplifying: How to get it Right

Latest MySQL Performance Blog posts - July 31, 2015 - 3:00am

Welcome to the 10th episode in the MySQL QA series! Today we’ll talk about reproducing and simplifying: How to get it Right.

Note that unless you are a QA engineer stuck on a remote, difficult-to-reproduce or difficult-to-reduce bug, this episode will largely be uninteresting for you.

However, what you may like to see – especially if you watched episodes 7 (and possibly 8 and 9) – is how reducer automatically generates handy start/stop/client (cl) etc. scripts, all packed into a handy bug tarball, in combination with the reduced SQL testcase.

This somewhat separate part is covered directly after the introduction (ends at 11:17), as well as with an example towards the end of the video (starts at time index 30:35).

The “in between part” (11:17 to 30:35) is all about reproducing and simplifying, which – unless you are working on a remote case – can likely be skipped by most; remember that 85-95% of bugs reproduce & reduce very easily – and for this – episode 7, episode 8 (especially the FORCE_SKIPV/FORCE_SPORADIC parts), and the script-related parts of this episode (start to 11:17 and 30:35 to end) would suffice.

As per the above, the topics covered in this video are:
1. percona-qa/reproducing_and_simplification.txt
2. Automatically generated scripts (produced by Reducer)

========= Example bug excerpt for copy/paste – as per the video
Though the testcase above should suffice for reproducing the bug, the attached tarball gives the testcase as an exact match of our system, including some handy utilities
$ vi {epoch}_mybase # Update base path in this file (the only change required!)
$ ./{epoch}_init # Initializes the data dir
$ ./{epoch}_start # Starts mysqld (MYEXTRA --option)
$ ./{epoch}_stop # Stops mysqld
$ ./{epoch}_cl # To check mysqld is up
$ ./{epoch}_run # Run the testcase (produces output) using mysql CLI
$ ./{epoch}_run_pquery # Run the testcase (produces output) using pquery
$ vi /dev/shm/{epoch}/error.log.out # Verify the error log
$ ./{epoch}_gdb # Brings you to a gdb prompt
$ ./{epoch}_parse_core # Create {epoch}_STD.gdb and {epoch}_FULL.gdb; standard and full var gdb stack traces

Full-screen viewing @ 720p resolution recommended


Foreign key error

Latest Forum Posts - July 30, 2015 - 10:13pm
Hello,

I ran SHOW ENGINE INNODB STATUS and got the error below in the LATEST FOREIGN KEY ERROR section:

Please find the table structure provided below. Could anyone please help me understand this?

ERROR:

MySQL thread id 205914, OS thread handle 0x7f070168c700, query id 1440847 custom copy to tmp table
alter table CTABLE add index FKEF5F18AE1D1C86DC (DRAFT_ID_PK_COL), add constraint FKEF5F18AE1D1C86DC foreign key (DRAFT_ID_PK_COL) references INSURANCE_PROPOSER_DRAFT_TAB (DRAFT_ID_PK_COL)
Foreign key constraint fails for table `DB1`.`#sql-5c5c_3245a`:
,
CONSTRAINT `FKEF5F18AE1D1C86DC` FOREIGN KEY (`DRAFT_ID_PK_COL`) REFERENCES `PTABLE` (`DRAFT_ID_PK_COL`)
Trying to add in child table, in index `PROPOSER_REJ_REM_DRAFT_ID` tuple:
DATA TUPLE: 3 fields;
0: len 8; hex 8000000000000000; asc ;;
1: len 8; hex 8000000000000107; asc ;;
2: len 8; hex 8000000000000002; asc ;;

But in parent table `DB1`.`INSURANCE_PROPOSER_DRAFT_TAB`, in index `PRIMARY`,
the closest match we can find is record:
PHYSICAL RECORD: n_fields 52; compact format; info bits 0
0: len 8; hex 800000000000009b; asc ;;
1: len 6; hex 0000000a24dd; asc $ ;;
2: len 7; hex ff0000028c0110; asc ;;
3: len 5; hex 41646d696e; asc Admin;;
4: len 5; hex 999208bbe9; asc ;;
5: len 5; hex 9992080000; asc ;;
6: len 10; hex 466f7274204d616b6572; asc FM;;
7: len 1; hex 4e; asc N;;
8: len 8; hex 8000000000000000; asc ;;
9: len 1; hex 59; asc Y;;
10: len 0; hex ; asc ;;

Table: CTABLE
Create Table: CREATE TABLE `CTABLE` (
`PROPOSER_ID_PK` bigint(20) NOT NULL,
`PROPOSER_REJECTION_REMARKS_PK` bigint(20) NOT NULL,
`DRAFT_ID_PK_COL` bigint(20) NOT NULL,
PRIMARY KEY (`PROPOSER_ID_PK`,`PROPOSER_REJECTION_REMARKS_PK`) ,
UNIQUE KEY `PROPOSER_REJECTION_REMARKS_PK` (`PROPOSER_REJECTION_REMARKS_PK`),
KEY `FKEF5F18AEC4EB848F` (`PROPOSER_ID_PK`),
KEY `FKEF5F18AE38BD3439` (`PROPOSER_REJECTION_REMARKS_PK`),
KEY `PROPOSER_REJ_REM_DRAFT_ID` (`DRAFT_ID_PK_COL`),
CONSTRAINT `FKEF5F18AE38BD3439` FOREIGN KEY (`PROPOSER_REJECTION_REMARKS_PK`) REFERENCES `PROPREJECTMARKS` (`PROPOSER_REJECTION_REMARKS_PK`),
CONSTRAINT `FKEF5F18AEC4EB848F` FOREIGN KEY (`PROPOSER_ID_PK`) REFERENCES `INSPRODETAILS` (`PROPOSER_ID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1


Table: PTABLE
Create Table: CREATE TABLE `PTABLE` (
`DRAFT_ID_PK_COL` bigint(20) NOT NULL AUTO_INCREMENT,
`CREATED_BY` varchar(255) DEFAULT NULL,
`CREATED_DATE` datetime DEFAULT NULL,
`DATE_OF_RECEIPT` datetime DEFAULT NULL,
`DRAFT_CREATOR` varchar(255) DEFAULT NULL,
`DRAFT_IN_USE` varchar(255) DEFAULT NULL,
`DRAFT_IN_USE_USER` bigint(20) NOT NULL,
`DRAFT_INACTIVE_FLAG` varchar(2) DEFAULT NULL,
`EIA_ACCOUNT_NO` varchar(255) DEFAULT NULL,
`EIA_UPLOAD_FLAG` varchar(2) DEFAULT NULL,
`INSURANCE_CMPNY_CD` varchar(6) DEFAULT NULL,
`MODIFIED_BY` varchar(255) DEFAULT NULL,
`MODIFIED_DATE` datetime DEFAULT NULL,
`NO_OF_PROPOSER` int(11) DEFAULT NULL,
`PANSITENAME` varchar(255) DEFAULT NULL,
`PAN_VERIFICATION_STATUS` varchar(2) DEFAULT NULL,
`POS_MAKER_USERID` bigint(20) DEFAULT NULL,
`POS_ORGANIZATIONID` bigint(20) DEFAULT NULL,
`PROPOSER_ACCOUNT_CATEGORY` varchar(255) DEFAULT NULL,
`PROPOSER_ACCOUNT_STATUS` varchar(255) DEFAULT NULL,
`PROPOSER_CONTINUOUS_KYC` varchar(255) DEFAULT NULL,
`PROPOSER_DATE_OF_BIRTH` datetime DEFAULT NULL,
`PROPOSER_FATHER_HUSBAND_NAME` varchar(255) DEFAULT NULL,
`PROPOSER_FATHER_HUSBAND_NAME_INDICATOR` varchar(255) DEFAULT NULL,
`PROPOSER_FIRST_NAME` varchar(255) DEFAULT NULL,
`PROPOSER_GENDER` varchar(255) DEFAULT NULL,
`ITAX_PROPOSER_NAME` varchar(100) DEFAULT NULL,
`PROPOSER_LAST_NAME` varchar(255) DEFAULT NULL,
`PROPOSER_MARITAL_STATUS` varchar(255) DEFAULT NULL,
`PROPOSER_MIDDLE_NAME` varchar(255) DEFAULT NULL,
`PROPOSER_OCCUPATION` varchar(255) DEFAULT NULL,
`PAN` varchar(255) DEFAULT NULL,
`PROPOSER_RECORD_REFERENCE` varchar(255) DEFAULT NULL,
`PROPOSER_RESIDENTIAL_STATUS` varchar(255) DEFAULT NULL,
`PROPOSER_SCAN_FORM` longblob,
`PROPOSER_SCAN_FORM_NAME` varchar(255) DEFAULT NULL,
`PROPOSER_SCAN_PHOTOGRAPH` longblob,
`PROPOSER_SCAN_PHOTOGRAPH_NAME` varchar(255) DEFAULT NULL,
`PROPOSER_SCAN_SIGNATURE` longblob,
`PROPOSER_SCAN_SIGNATURE_NAME` varchar(255) DEFAULT NULL,
`PROPOSER_TITLE` varchar(255) DEFAULT NULL,
`PROPOSER_TYPE_OF_ACCOUNT` varchar(255) DEFAULT NULL,
`PROPOSER_TYPE_OF_ACCOUNT_OTHERS` varchar(255) DEFAULT NULL,
`UID` varchar(255) DEFAULT NULL,
`RECORD_REFERENCE_ID` varchar(255) DEFAULT NULL,
`RELATIONSHIP` varchar(255) DEFAULT NULL,
`REPRESENTATIVE_ADDRESS_FLAG` varchar(255) DEFAULT NULL,
`SAME_ADDRESS_FLAG` varchar(255) DEFAULT NULL,
`PROPOSER_SINGLE_DOCUMENT` longblob,
`PROPOSER_SINGLE_DOCUMENT_NAME` varchar(255) DEFAULT NULL,
PRIMARY KEY (`DRAFT_ID_PK_COL`),
KEY `PROPOSER_DRAFT_MODIFIED_DATE` (`MODIFIED_DATE`),
KEY `PROPOSER_DRAFT_IN_USE_N_USER` (`DRAFT_IN_USE`,`DRAFT_IN_USE_USER`)
) ENGINE=InnoDB AUTO_INCREMENT=8850 DEFAULT CHARSET=latin1
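
For what it's worth, this LATEST FOREIGN KEY ERROR output means the ALTER failed because at least one row in the child table has a DRAFT_ID_PK_COL value with no matching row in the parent table (in the tuple shown, the indexed child value appears to be 0, hex 8000000000000000, while the closest parent PRIMARY KEY found is hex ...9b, i.e. 155). A quick way to list such orphan rows before retrying the ALTER, assuming PTABLE is the anonymized name of the parent table referenced as INSURANCE_PROPOSER_DRAFT_TAB in the error:

SELECT c.`PROPOSER_ID_PK`, c.`PROPOSER_REJECTION_REMARKS_PK`, c.`DRAFT_ID_PK_COL`
FROM `CTABLE` c
LEFT JOIN `PTABLE` p ON p.`DRAFT_ID_PK_COL` = c.`DRAFT_ID_PK_COL`
WHERE p.`DRAFT_ID_PK_COL` IS NULL;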

Not Connecting?

Latest Forum Posts - July 30, 2015 - 12:50am
Hi,

I tried playback this morning; from the displayed messages it is not clear whether the connection was successful or not. It seems it is not connecting to the database, yet on the other hand I can log in to MySQL from the same machine with the address and credentials I'm providing to playback.

Any idea how to debug this, or how to check whether there is a connection issue? Below is the playback output.


percona-playback --mysql-host=<AWS ENDPOINT> --mysql-user=<some user> --mysql-password=<some password> --mysql-schema=<db name> --query-log-file=test.log
Database Plugin: libmysqlclient
Running...
Connection 0 Rows Sent: 1 != expected 0 for query: select count(*) from test;
Error Report finished

Detailed Report
----------------
SELECTs : 0 queries (0 faster, 0 slower)
INSERTs : 0 queries (0 faster, 0 slower)
UPDATEs : 0 queries (0 faster, 0 slower)
DELETEs : 0 queries (0 faster, 0 slower)
REPLACEs : 0 queries (0 faster, 0 slower)
DROPs : 0 queries (0 faster, 0 slower)


Report
------
Executed 1 queries
Spent 00:00:00.225578 executing queries versus an expected 00:00:00 time.
0 queries were quicker than expected, 0 were slower
A total of 0 queries had errors.
Expected 0 rows, got 1 (a difference of 1)
Number of queries where number of rows differed: 1.

Average of 1.00 queries per connection (1 connections).