Buy Percona ServicesBuy Now!


Lastest Forum Posts - February 20, 2017 - 6:45pm

i cant find pmm-server:1.1.1 in dockerhub .

MongoDB 3.4 Bundle Release: Percona Server for MongoDB 3.4, Percona Monitoring and Management 1.1, Percona Toolkit 3.0 with MongoDB

Latest MySQL Performance Blog posts - February 20, 2017 - 1:51pm

This blog post is the first in a series on Percona’s MongoDB 3.4 bundle release. This release includes Percona Server for MongoDB, Percona Monitoring and Management, and Percona Toolkit. In this post, we’ll look at the features included in the release.

We have a lot of great MongoDB content coming your way in the next few weeks. However, I wanted first to give you a quick list of the major things to be on the look out for.

This new bundled release ensures a robust, secure database that you can adapt to changing business requirements. It helps demonstrate how organizations can use MongoDB (and Percona Server for MongoDB), PMM and Percona Toolkit together to benefit from the cost savings and agility provided by free and proven open source software.

Percona Server for MongoDB 3.4 delivers all the latest MongoDB 3.4 Community Edition features, additional Enterprise features and a greater choice of storage engines.

Some of these new features include:

  • Shard member types. All nodes now need to know what they do – this helps with reporting and architecture planning more than the underlying code, but it’s an important first step.
  • Sharding balancer moved to config server primary
  • Configuration servers must not be a replica set
  • Faster balancing (shard count/2) – concurrent balancing actions can now happen at the same time!
  • Sharding and replication tags renamed to “zones” – again, an important first step
  • Default write behavior moved to majority – this could majorly impact many workloads, but moving to a default safe write mode is important
  • New decimal data type
  • Graph aggregation functions – we will talk about these more in a later blog, but for now note that graph and faceted searches are added.
  • Collations added to most access patterns for collections and databases
  • . . .and much more

Percona Server for MongoDB includes all the features of MongoDB Community Edition 3.4, providing an open source, fully-compatible, drop-in replacement with many improvements, such as:

  • Integrated, pluggable authentication with LDAP that provides a centralized enterprise authentication service
  • Open-source auditing for visibility into user and process actions in the database, with the ability to redact sensitive information (such as user names and IP addresses) from log files
  • Hot backups for the WiredTiger engine to protect against data loss in the case of a crash or disaster, without impacting performance
  • Two storage engine options not supported by MongoDB Community Edition 3.4 (doubling the total engine count choices):
    • MongoRocks, the RocksDB-powered storage engine, designed for demanding, high-volume data workloads such as in IoT applications, on-premises or in the cloud.
    • Percona Memory Engine is ideal for in-memory computing and other applications demanding very low latency workloads.

Percona Monitoring and Management 1.1

  • Support for MongoDB and Percona Server for MongoDB
  • Graphical dashboard information for WiredTiger, MongoRocks and Percona Memory Engine
  • Cluster and replica set wide views
  • Many more graphable metrics available for both for the OS and the database layer than currently provided by other tools in the ecosystem

Percona Toolkit 3.0

  • Two new tools for MongoDB are now in Percona’s Toolkit:
    • pt-mongodb-summary (the equivalent of pt-mysql-summary) provides a quick, at-a-glance overview of a MongoDB and Percona Server for MongoDB instance
      • This is useful for any DBA who wants a general idea of what’s happening in the system, what the state of their cluster/replica set is, and more.
    • pt-mongodb-query-digest (the equivalent of pt-query-digest for MySQL) offers a query review for troubleshooting
      • Query digest is one of the most used Toolkit features ever. In MongoDB, this is no different. Typically you might only look at your best and worst query times and document scans. However, this will show 90th percentiles, and top 10 queries take seconds versus minutes.

For all of these topics, you will see more blogs in the next few weeks that cover them in detail. Some people have asked what Percona’s MongoDB commitment looks like. Hopefully, this series of blogs help show how improving open source databases is central to the Percona vision. We are here to make the world better for developers, DBAs and other MongoDB users.

Percona Toolkit 3.0.1 is now available

Latest MySQL Performance Blog posts - February 20, 2017 - 1:50pm

Percona announces the availability of Percona Toolkit 3.0.1 on February 20, 2017. This is the first general availability (GA) release in the 3.0 series with a focus on padding MongoDB tools:

Downloads are available from the Percona Software Repositories.

NOTE: If you are upgrading using Percona’s yum repositories, make sure that the you enable the basearch repo, because Percona Toolkit 3.0 is not available in the noarch repo.

Percona Toolkit is a collection of advanced command-line tools that perform a variety of MySQL and MongoDB server and system tasks too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release includes changes from the previous 3.0.0 RC and the following additional changes:

  • Added requirement to run pt-mongodb-summary as a user with the clusterAdmin or root built-in roles.

You can find release details in the release notes. Bugs can be reported on Toolkit’s launchpad bug tracker.

Percona Monitoring and Management 1.1.1 is now available

Latest MySQL Performance Blog posts - February 20, 2017 - 1:49pm

Percona announces the release of Percona Monitoring and Management 1.1.1 on February 20, 2017. This is the first general availability (GA) release in the PMM 1.1 series with a focus on providing alternative deployment options for PMM Server:

NOTE: The AMIs and VirtualBox images above are still experimental. For production, it is recommended to run Docker images.

The instructions for installing Percona Monitoring and Management 1.1.1 are available in the documentation. Detailed release notes are available here.

There are no changes compared to previous 1.1.0 Beta release, except small fixes for MongoDB metrics dashboards.

A live demo of PMM is available at

We welcome your feedback and questions on our PMM forum.

About Percona Monitoring and Management
Percona Monitoring and Management is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

Percona Server for MongoDB 3.4.2-1.2 is now available

Latest MySQL Performance Blog posts - February 20, 2017 - 1:49pm

Percona announces the release of Percona Server for MongoDB 3.4.2-1.2 on February 20, 2017. It is the first general availability (GA) release in the 3.4 series. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly-scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with Percona Memory Engine and MongoRocks storage engine, as well as several enterprise-grade features:

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release candidate is based on MongoDB 3.4.2, includes changes from PSMDB 3.4.0 Beta and 3.4.1 RC, and the following additional changes:

  • Fixed the audit log message format to comply with upstream MongoDB:
    • Changed params document to param
    • Added roles document
    • Fixed date and time format
    • Changed host field to ip in the local and remote documents

Percona Server for MongoDB 3.4.2-1.2 release notes are available in the official documentation.

PMM from Docker to .deb and AMI

Lastest Forum Posts - February 19, 2017 - 4:00pm
How discussed on Highload++, after months I will come back to this topic, how it is on the progress primes to porting PMM to a compiled deb-package or at least a AMI for easily use for a starter, who do not like to fight against dockers (All our team hate docker, sorry).

Big greetings from Sankt-Petersburg =)

На конференции Хайлоада мы с вами обсуждали портацию ПММ сервера из докера в идеальной версии в готовый компилированный деб-пакет (да, я видел, что там много зависимостей, но их же можно тоже подтягивать при инсталяции автоматом), ну а если совсем не вариант, то хотя бы готовый AMI слепок для амазона (в идеальном мире на Дебиане). Насколько я помню Пётр сказал, что таск такой либо уже был озвучен, либо будет озвучен после конферанции и надо будет подождать неесколько месяцев для обкатки. Можно надеятся на то, что это случится?

Server summary:mysql summary - Exit Status 1

Lastest Forum Posts - February 17, 2017 - 11:56pm
I get "Exit Status 1" error in mysql summary

and query error log like this:
[root@test log]# cat pmm-mysql-queries-0.log
# Version: percona-qan-agent 1.0.7
# Basedir: /usr/local/percona/qan-agent
# PID: 11372
# API:
# UUID: e7b416c0e9c0466162728fb8840ab066
2017/02/18 14:46:17.492888 main.go:165: Starting agent...
2017/02/18 14:46:17.493570 main.go:346: Agent is ready
2017/02/18 14:46:17.498969 main.go:207: API is ready
2017/02/18 14:54:50.460563 ERROR agent exit status 1
2017/02/18 14:54:50.460641 WARNING agent Cmd fail: Reply[Cmd:GetMySQLSummary Error:'exit status 1' Id:17731696-8329-44ce-6aee-90381f02f738]
2017/02/18 15:39:13.971141 ERROR agent exit status 1
2017/02/18 15:39:13.971279 WARNING agent Cmd fail: Reply[Cmd:GetMySQLSummary Error:'exit status 1' Id:d1c1ff6f-a90b-4e82-7bb3-6fa556491672]

and read this

I install percona-toolkit and create conf file like this:
[root@i-zryeyuku mysql]# cat /etc/percona-toolkit/pt-mysql-summary.conf

I can run pt-mysql-summary like this:
[root@test log]# pt-mysql-summary
mysql: [Warning] Using a password on the command line interface can be insecure.
# Percona Toolkit MySQL Summary Report #######################
System time | 2017-02-18 07:46:27 UTC (local TZ: CST +0800)

but still get this error,What am I missing?

Galera misreports cluster size

Lastest Forum Posts - February 17, 2017 - 5:06pm
I have a 3 node galera cluster, with 2 members on same subnet (call them db1 and db2), different ESX hosts. 3rd node (db3) was on a remote ESX server that had site-to-site tunnel. Recently the 2 original servers were taken down to be moved to the same location as the 1 new one. After the equipment was moved and the servers (Red Hat 7) were started, both of the nodes failed to start. Worse, while db3 remained up, it was still reporting the cluster size as 3 (even though db1 & 2 had failed to start).

I eventually shut down db3, and started db1 with new-cluster option. What I do not understand, is how or why db3 would still report 3. Anyone ever experienced something like this? Any ideas as to why or how this could have happened?

PMM Gravana does not show all data

Lastest Forum Posts - February 17, 2017 - 2:33am
Centos 7, percona server 5.7.17.
We have to servers connected to PMM. Both configured to send same info to PMM Grafana. But one of them do that successfully, other lost some info, buh not all.

For example:

2017-02-17 12-02-36 2.png
2017-02-17 12-08-08 2.png
2017-02-17 12-02-52 2.png
2017-02-17 12-05-41 2.png
2017-02-17 12-05-51 2.png

mysqld.cnf from good server

bind-address =
character_set_server = utf8
collation_server = utf8_general_ci
max_allowed_packet = 256M
innodb_buffer_pool_size = 10G
innodb_page_size = 32K
innodb_data_file_path = ibdata1:10M:autoextend
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2 # 0 # 1
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
innodb_log_files_in_group = 2
innodb_doublewrite = 0
innodb_io_capacity = 600
secure-file-priv = /tmp
# Disabling symbolic-links is recommended to prevent assorted security risks
# bin-log replication options
max_binlog_files = 10
# statistics for PMM server
userstat = ON
query_response_time_stats = ON

mysqld.cnf from bad server:

prompt="\u@\h \d> "


innodb_temp_data_file_path=ibtmp1:12M:autoextend:m ax:1024M
secure_file_priv =


XtraDB not cleaning up galera.cache pages

Lastest Forum Posts - February 16, 2017 - 7:04pm
We are seeing a strange issue where xtradb on one of our 3 cluster nodes just stopped using galera.cache file and started creating at a rate of about one file every 5 minutes.
Looking wsrep% variables, wsrep_gcache_pool_size is now 113021942801 even though it is supposedly capped at 64GB in my.cnf.
All the other nodes seem to honor the cap. The node in question is synced and all the transactions are replicated to it.
Cached down to variable is extremely low. Lowest in the entire cluster.

| wsrep_local_cached_downto | 83655 |

One of the other nodes:
| wsrep_local_cached_downto | 62524343 |

Anyone seen anything like that?

All the nodes are:
Server version: 5.6.34-79.1-56-log Percona XtraDB Cluster (GPL), Release rel79.1, Revision 7c38350, WSREP version 26.19, wsrep_26.19

MySQL Bug 72804 Workaround: “BINLOG statement can no longer be used to apply query events”

Latest MySQL Performance Blog posts - February 16, 2017 - 3:39pm

In this blog post, we’ll look at a workaround for MySQL bug 72804.

Recently I worked on a ticket where a customer performed a point-in-time recovery PITR using a large set of binary logs. Normally we handle this by applying the last backup, then re-applying all binary logs created since the last backup. In the middle of the procedure, their new server crashed. We identified the binary log position and tried to restart the PITR from there. However, using the option --start-position, the restore failed with the error “The BINLOG statement of type Table_map was not preceded by a format description BINLOG statement.” This is a known bug and is reported as MySQL Bug #72804: “BINLOG statement can no longer be used to apply Query events.”

I created a small test to demonstrate a workaround that we implemented (and worked).

First, I ran a large import process that created several binary logs. I used a small value in max_binlog_size and tested using the database “employees” (a standard database used for testing).Then I dropped the database.

mysql> set sql_log_bin=0; Query OK, 0 rows affected (0.33 sec) mysql> drop database employees; Query OK, 8 rows affected (1.25 sec)

To demonstrate the recovery process, I joined all the binary log files into one SQL file and started an import.

sveta@Thinkie:~/build/ps-5.7/mysql-test$ ../bin/mysqlbinlog var/mysqld.1/data/master.000001 var/mysqld.1/data/master.000002 var/mysqld.1/data/master.000003 var/mysqld.1/data/master.000004 var/mysqld.1/data/master.000005 > binlogs.sql sveta@Thinkie:~/build/ps-5.7/mysql-test$ binlogs.sql sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs.sql ERROR 1064 (42000) at line 9020: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'inserting error

I intentionally generated a syntax error in the resulting file with the help of the script (which just inserts a bogus SQL statement in a random row). The error message clearly showed where the import stopped: line 9020. I then created a file that cropped out the part that had already been imported (lines 1- 9020), and tried to import this new file.

sveta@Thinkie:~/build/ps-5.7/mysql-test$ tail -n +9021 binlogs.sql >binlogs_rest.sql sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs_rest.sql ERROR 1609 (HY000) at line 134: The BINLOG statement of type `Table_map` was not preceded by a format description BINLOG statement.

Again, the import failed with exactly the same error as the customer. The reason for this error is that the BINLOG statement – which applies changes from the binary log – expects that the format description event gets run in the same session as the binary log import, but before it. The format description existed initially at the start of the import that failed at line 9020. The later import (from line 9021 on) doesn’t contain this format statement.

Fortunately, this format is the same for the same version! We can simply take it from the beginning the SQL log file (or the original binary file) and put into the file created after the crash without lines 1-9020.

With MySQL versions 5.6 and 5.7, this event is located in the first 11 rows:

sveta@Thinkie:~/build/ps-5.7/mysql-test$ head -n 11 binlogs.sql | cat -n 1 /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; 2 /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; 3 DELIMITER /*!*/; 4 # at 4 5 #170128 17:58:11 server id 1 end_log_pos 123 CRC32 0xccda074a Start: binlog v 4, server v 5.7.16-9-debug-log created 170128 17:58:11 at startup 6 ROLLBACK/*!*/; 7 BINLOG ' 8 g7GMWA8BAAAAdwAAAHsAAAAAAAQANS43LjE2LTktZGVidWctbG9nAAAAAAAAAAAAAAAAAAAAAAAA 9 AAAAAAAAAAAAAAAAAACDsYxYEzgNAAgAEgAEBAQEEgAAXwAEGggAAAAICAgCAAAACgoKKioAEjQA 10 AUoH2sw= 11 '/*!*/;

The first six rows are meta information, and rows 6-11 are the format event itself. The only thing we need to export into our resulting file is these 11 lines:

sveta@Thinkie:~/build/ps-5.7/mysql-test$ head -n 11 binlogs.sql > binlogs_rest_with_format.sql sveta@Thinkie:~/build/ps-5.7/mysql-test$ cat binlogs_rest.sql >> binlogs_rest_with_format.sql sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs_rest_with_format.sql sveta@Thinkie:~/build/ps-5.7/mysql-test$

After this, the import succeeded!

Percona Blog Poll Results: What Programming Languages Are You Using for Backend Development?

Latest MySQL Performance Blog posts - February 16, 2017 - 1:53pm

In this blog we’ll look at the results from Percona’s blog poll on what programming languages you’re using for backend development.

Late last year we started a poll on what backend programming languages are being used by the open source community. The three components of the backend – server, application, and database – are what makes a website or application work. Below are the results of Percona’s poll on backend programming languages in use by the community:

Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.

One of the best-known and earliest web service stacks is the LAMP stack, which spelled out refers to Linux, Apache, MySQL and PHP/Perl/Python. We can see that this early model is still popular when it comes to the backend.

PHP still remains a very common choice for a backend programming language, with Python moving up the list as well. Perl seems to be fading in popularity, despite being used a lot in the MySQL world.

Java is also showing signs of strength, demonstrating the strides MySQL is making in enterprise applications. We can also see JavaScript is increasingly getting used not only as a front-end programming language, but also as back-end language with the Node.JS framework.

Finally, Go is a language to look out for. Go is an open source programming language created by Google. It first appeared in 2009, and is already more popular than Perl or Ruby according to this poll.

Thanks to the community for participating in our poll. You can take our latest poll on what database engine are you using to store time series data here. 

MariaDB at Percona Live Open Source Database Conference 2017

Latest MySQL Performance Blog posts - February 16, 2017 - 10:08am

In this blog, we’ll look at how we plan to represent MariaDB at Percona Live.

The MariaDB Corporation is organizing a conference called M17 on the East Coast in April. Some Perconians (Peter Zaitsev, Vadim Tchachenko, Sveta Smirnova, Alex Rubin, Colin Charles) decided to submit some interesting talks for that conference. Percona also offered to sponsor the conference.

As of this post, the talks haven’t been accepted, and we were politely told that we couldn’t sponsor.

Some of the proposed talks were:

  • MariaDB Backup with Percona XtraBackup (Vadim Tchachenko)
  • Managing MariaDB Server operations with Percona Toolkit (Colin Charles)
  • MariaDB Server Monitoring with Percona Monitoring and Management (Peter Zaitsev)
  • Securing your MariaDB Server/MySQL data (Colin Charles, Ronald Bradford)
  • Data Analytics with MySQL, Apache Spark and Apache Drill (Alexander Rubin)
  • Performance Schema for MySQL and MariaDB Troubleshooting (Sveta Smirnova)

At Percona, we think MariaDB Server is an important part of the MySQL ecosystem. This is why the Percona Live Open Source Database Conference 2017 in Santa Clara has a MariaDB mini-track, consisting of talks from various Percona and MariaDB experts:

If any of these topics look enticing, come to the conference. We have MariaDB at Percona Live.

To make your decision easier, we’ve created a special promo code that gets you $75 off a full conference pass! Just use MariaDB@PL17 at checkout.

In the meantime, we will continue to write and discuss MariaDB, and any other open source database technologies. The power of the open source community is the free exchange of ideas, healthy competition and open dialog within the community.

Here are some more past presentations that are also relevant:

Group Replication: Shipped Too Early

Latest MySQL Performance Blog posts - February 15, 2017 - 4:02pm

This blog post is my overview of Group Replication technology.

With Oracle clearly entering the “open source high availability solutions” arena with the release of their brand new Group Replication solution, I believe it is time to review the quality of the first GA (production ready) release.

TL;DR: Having examined the technology, it is my conclusion that Oracle seems to have released the GA version of Group Replication too early. While the product is definitely “working prototype” quality, the release seems rushed and unfinished. I found a significant number of issues, and I would personally not recommend it for production use.

It is obvious that Oracle is trying hard to ship technology to compete with Percona XtraDB Cluster, which is probably why they rushed to claim Group Replication GA quality.

If you’re all set to follow along and test Group Replication yourself, simplify the initial setup by using this Docker image. We can review some of the issues you might face together.

For the record, I tested the version based on MySQL 5.7.17 release.

No automatic provisioning

First off, the first thing you’ll find is there is NO way to automatically setup of a new node.

If you need to setup new node or recover an existing node from a fatal failure, you’ll need to manually provision the slave.

Of course, you can clone a slave using Percona XtraBackup or LVM by employing some self-developed scripts. But given the high availability nature of the product, one would expect Group Replication to automatically re-provision any failed node.

Bug: stale reads on nodes

Please see this bug:

One line summary: while any secondary nodes are “catching up” to whatever happened on a first node (it takes time to apply changes on secondary nodes), reads on a secondary node could return stale data (as shown in the bug report).

This behavior brings us back to the traditional asynchronous replication slave behavior (i.e., Group Replication’s predecessor).

It also contradicts the Group Replication documentation, which states: “There is a built-in group membership service that keeps the view of the group consistent and available for all servers at any given point in time.” (See

I might also mention here that Percona XtraDB Cluster prevents stale reads (see

Bug: nodes become unusable after a big transaction, refusing to execute further transactions

There are two related bugs:

One line summary: after running a big transaction, any secondary nodes become unusable and refuse to perform any further transactions.

Obscure error messages

It is not uncommon to see cryptic error messages while testing Group Replication. For example:

mysql> commit; ERROR 3100 (HY000): Error on observer while running replication hook 'before_commit'.

This is fairly useless and provides little help until I check the mysqld error log. The log provides a little bit more information:

2017-02-09T02:05:36.996776Z 18 [ERROR] Plugin group_replication reported: '[GCS] Gcs_packet's payload is too big. Only the packets smaller than 2113929216 bytes can be compressed.'


The items highlighted above might not seem too bad at first, and you could assume that your workload won’t be affected. However, stale reads and node dysfunctions basically prevent me from running a more comprehensive evaluation.

My recommendation:

If you care about your data, then I recommend not using Group Replication in production. Currently, it looks like it might cause plenty of headaches, and it is easy to get inconsistent results.

For the moment, Group Replication appears an advanced – but broken – traditional MySQL asynchronous replication.

I understand Oracle’s dilemma. Usually people are hesitant to test a product that is not GA. So in order to get feedback from users, Oracle needs to push the product to GA. Oracle must absolutely solve the issues above during future QA cycles.

Docker Images for Percona Server for MySQL Group Replication

Latest MySQL Performance Blog posts - February 15, 2017 - 8:27am

In this blog post, we’ll point to a new Docker image for Percona Server for MySQL Group Replication.

Our most recent release of Percona Server for MySQ (Percona Server for MySQL 5.7.17) comes with Group Replication plugins. Unfortunately, since this technology is very new, it requires some fairly complicated steps to setup and get running. To help with that process, I’ve prepare Docker images that simplify its setup procedures.

You can find the image here:

To start the first node (bootstrap the group):

docker run -d -p 3306 --net=clusternet -e MYSQL_ROOT_PASSWORD=passw0rd -e CLUSTER_NAME=cluster1 perconalab/pgr-57

To add nodes into the group after:

docker run -d -p 3306 --net=clusternet -e MYSQL_ROOT_PASSWORD=passw0rd -e CLUSTER_NAME=cluster1 -e CLUSTER_JOIN=CONTAINER_ID_FROM_THE_FIRST_STEP perconalab/pgr-57

You can also get a full script that starts “N” number of nodes, here:


Percona Live Open Source Database Conference 2017 Crash Courses: MySQL and MongoDB!

Latest MySQL Performance Blog posts - February 14, 2017 - 12:23pm

The Percona Live Open Source Database Conference 2017 will once again host crash courses on MySQL and MongoDB. Read below to get an outstanding discount on either the MySQL or MongoDB crash course (or both).

The database community constantly tells us how hard it is to find someone with MySQL and MongoDB DBA skills who can help with the day-to-day management of their databases. This is especially difficult when companies don’t have a full-time requirement for a DBA. Developers, system administrators and IT staff spend too much time trying to solve basic database problems that keep them from doing their other job duties. Eventually, the little problems or performance inefficiencies that start to pile up lead to big problems.

In answer to this growing need, Percona Live is once again hosting crash courses for developers, systems administrators and other technical resources. A crash course is a one-day training session on either MySQL 101 or MongoDB 101.

Don’t let the name fool you: these courses are led by Percona database experts who will show you the fundamentals of MySQL or MongoDB tools and techniques.

And it’s not just for DBAs: developers are encouraged to attend to hone their database skills.

Below is a list of the topics covered in each course this year:

MySQL 101 Topics MongoDB 101 Topics

Attendees will return ready to quickly and correctly take care of the day-to-day and week-to-week management of your MySQL or MongoDB environment.

The schedule and cost for the 101 courses (without a full-conference pass) are:

  • MySQL 101: Tuesday, April 25 ($400)
  • MongoDB 101: Wednesday, April 26 ($400)
  • Both MySQL and MongoDB 101 sessions ($700)

(Tickets to the 101 sessions do not grant access to the main Percona Live breakout sessions. Full Percona Live conferences passes grant admission to the 101 sessions. 101 Crash Course attendees will have full access to Percona Live keynote speakers the exhibit hall and receptions.)

As a special promo, the first 101 people to purchase the single 101 talks receive a $299.00 discount off the ticket price! Each session only costs $101! Get both sessions for a mere $202 and save $498.00! Register now using the following codes for your discount:

  • 101: $299 off of either the MySQL or MongoDB tickets
  • 202: $498 off of the combined MySQL/MongoDB ticket

Click here to register.

Register for Percona Live 2017 now! Advanced Registration lasts until March 5, 2017. Percona Live is a very popular conference: this year’s Percona Live Europe sold out, and we’re looking to do the same for Percona Live 2017. Don’t miss your chance to get your ticket at its most affordable price. Click here to register.

Percona Live 2017 sponsorship opportunities are available now. Click here to find out how to sponsor.

Table 'gtid_executed' is read only

Lastest Forum Posts - February 14, 2017 - 8:18am

$ mysql --version
mysql Ver 14.14 Distrib 5.7.14-8, for Linux (x86_64) using 6.0

I'm trying to enable slave replication, and get the above error when running this command on the slave:
Code: mysql> SET @@global.gtid_purged='b5180a5d-4c0e-ee19-7685-3479989fa43a:1-4'; ERROR 1036 (HY000): Table 'gtid_executed' is read only I put "innodb-read-only = 1" in the slave /etc/my.cnf, but this is a global variable, so wouldn't think that would be a problem.


Percona Server for MongoDB 3.4 Product Bundle Release is a Valentine to the Open Source Community

Latest MySQL Performance Blog posts - February 14, 2017 - 7:46am

Percona today announced a Percona Server for MongoDB 3.4 solution bundle of updated products. This release enables any organization to create a robust, secure database environment that can be adapted to changing business requirements.

Percona Server for MongoDB 3.4, Percona Monitoring and Management 1.1, and Percona Toolkit 3.0 offer more features and benefits, with enhancements for both MongoDB® and MySQL® database environments. When these Percona products are used together, organizations gain all the cost and agility benefits provided by free, proven open source software that delivers all the latest MongoDB Community Edition 3.4 features, additional Enterprise features, and a greater choice of storage engines. Along with improved insight into the database environment, the solution provides enhanced control options for optimizing a wider range of database workloads with greater reliability and security.

The solution will be generally available the week of Feb. 20.

New Features and Benefits Summary

Percona Server for MongoDB 3.4

  • All the features of MongoDB Community Edition 3.4, which provides an open source, fully compatible, drop-in replacement
  • Integrated, pluggable authentication with LDAP to provide a centralized enterprise authentication service
  • Open-source auditing for visibility into user and process actions in the database, with the ability to redact sensitive information (such as usernames and IP addresses) from log files
  • Hot backups for the WiredTiger engine protect against data loss in the case of a crash or disaster, without impacting performance
  • Two storage engine options not supported by MongoDB Community Edition 3.4:

    • MongoRocks, the RocksDB-powered storage engine, designed for demanding, high-volume data workloads such as in IoT applications, on-premises or in the cloud
    • Percona Memory Engine is ideal for in-memory computing and other applications demanding very low latency workloads

Percona Monitoring and Management 1.1

  • Support for MongoDB and Percona Server for MongoDB
  • Graphical dashboard information for WiredTiger, MongoRocks and Percona Memory Engine

Percona Toolkit 3.0

  • Two new tools for MongoDB:
    • pt-mongodb-summary (the equivalent of pt-mysql-summary) provides a quick, at-a-glance overview of a MongoDB and Percona Server for MongoDB instance.
    • pt-mongodb-query-digest (the equivalent of pt-query-digest for MySQL) offers a query review for troubleshooting.

For more information, see Percona’s press release.

ClickHouse: New Open Source Columnar Database

Latest MySQL Performance Blog posts - February 13, 2017 - 3:24pm

For this blog post, I’ve decided to try ClickHouse: an open source column-oriented database management system developed by Yandex (it currently powers Yandex.Metrica, the world’s second-largest web analytics platform).

In my previous set of posts, I tested Apache Spark for big data analysis and used Wikipedia page statistics as a data source. I’ve used the same data as in the Apache Spark blog post: Wikipedia Page Counts. This allows me to compare ClickHouse’s performance to Spark’s.

I’ve spent some time testing ClickHouse for relatively large volumes of data (1.2Tb uncompressed). Here is a list of ClickHouse advantages and disadvantages that I saw:

ClickHouse advantages

  • Parallel processing for single query (utilizing multiple cores)
  • Distributed processing on multiple servers
  • Very fast scans (see benchmarks below) that can be used for real-time queries
  • Column storage is great for working with “wide” / “denormalized” tables (many columns)
  • Good compression
  • SQL support (with limitations)
  • Good set of functions, including support for approximated calculations
  • Different storage engines (disk storage format)
  • Great for structural log/event data as well as time series data (engine MergeTree requires date field)
  • Index support (primary key only, not all storage engines)
  • Nice command line interface with user-friendly progress bar and formatting

Here is a full list of ClickHouse features

ClickHouse disadvantages

  • No real delete/update support, and no transactions (same as Spark and most of the big data systems)
  • No secondary keys (same as Spark and most of the big data systems)
  • Own protocol (no MySQL protocol support)
  • Limited SQL support, and the joins implementation is different. If you are migrating from MySQL or Spark, you will probably have to re-write all queries with joins.
  • No window functions

Full list of ClickHouse limitations

Group by: in-memory vs. on-disk

Running out of memory is one of the potential problems you may encounter when working with large datasets in ClickHouse:

SELECT min(toMonth(date)), max(toMonth(date)), path, count(*), sum(hits), sum(hits) / count(*) AS hit_ratio FROM wikistat WHERE (project = 'en') GROUP BY path ORDER BY hit_ratio DESC LIMIT 10 ↖ Progress: 1.83 billion rows, 85.31 GB (68.80 million rows/s., 3.21 GB/s.) ██████████▋ 6%Received exception from server: Code: 241. DB::Exception: Received from localhost:9000, DB::Exception: Memory limit (for query) exceeded: would use 9.31 GiB (attempt to allocate chunk of 1048576 bytes), maximum: 9.31 GiB: (while reading column hits):

By default, ClickHouse limits the amount of memory for group by (it uses a hash table for group by). This is easily fixed – if you have free memory, increase this parameter:

SET max_memory_usage = 128000000000; #128G

If you don’t have that much memory available, ClickHouse can “spill” data to disk by setting this:

set max_bytes_before_external_group_by=20000000000; #20G set max_memory_usage=40000000000; #40G

According to the documentation, if you need to use max_bytes_before_external_group_by it is recommended to set max_memory_usage to be ~2x of the size of max_bytes_before_external_group_by.

(The reason for this is that the aggregation is performed in two phases: (1) reading and building an intermediate data, and (2) merging the intermediate data. The spill to disk can only happen during the first phase. If there won’t be spill, ClickHouse might need the same amount of RAM for stage 1 and 2.)

Benchmarks: ClickHouse vs. Spark

Both ClickHouse and Spark can be distributed. However, for the purpose of this test I’ve run a single node for both ClickHouse and Spark. The results are quite impressive.

Benchmark summary

 Size / compression  Spark v. 2.0.2  ClickHouse  Data storage format  Parquet, compressed: snappy   Internal storage, compressed   Size (uncompressed: 1.2TB)   395G  212G


 Test  Spark v. 2.0.2  ClickHouse   Diff  Query 1: count (warm)  7.37 sec (no disk IO)  6.61 sec   ~same  Query 2: simple group (warm)   792.55 sec (no disk IO)   37.45 sec  21x better  Query 3: complex group by   2522.9 sec  398.55 sec  6.3x better


ClickHouse vs. MySQL

I wanted to see how ClickHouse compared to MySQL. Obviously, we can’t compare some workloads. For example:

  • Storing terabytes of data and querying (“crunching” would be a better word here) data without an index. It would take weeks (or even months) to load data and build the indexes. That is a much more suitable workload for ClickHouse or Spark.
  • Real-time updates / OLTP. ClickHouse does not support real-time updates / deletes.

Usually big data systems provide us with real-time queries. Systems based on map/reduce (i.e., Hive on top of HDFS) are just too slow for real-time queries, as it takes a long time to initialize the map/reduce job and send the code to all nodes.

Potentially, you can use ClickHouse for real-time queries. It does not support secondary indexes, however. This means it will probably scan lots of rows, but it can do it very quickly.

To do this test, I’m using the data from the Percona Monitoring and Management system. The table I’m using has 150 columns, so it is good for column storage. The size in MySQL is ~250G:

mysql> show table status like 'query_class_metrics'G *************************** 1. row *************************** Name: query_class_metrics Engine: InnoDB Version: 10 Row_format: Compact Rows: 364184844 Avg_row_length: 599 Data_length: 218191888384 Max_data_length: 0 Index_length: 18590056448 Data_free: 6291456 Auto_increment: 416994305

Scanning the whole table is significantly faster in ClickHouse. Retrieving just ten rows by key is faster in MySQL (especially from memory).

But what if we only need to scan limited amount of rows and do a group by? In this case, ClickHouse may be faster. Here is the example (real query used to create sparklines):


SELECT (1480888800 - UNIX_TIMESTAMP(start_ts)) / 11520 as point, FROM_UNIXTIME(1480888800 - (SELECT point) * 11520) AS ts, COALESCE(SUM(query_count), 0) / 11520 AS query_count_per_sec, COALESCE(SUM(Query_time_sum), 0) / 11520 AS query_time_sum_per_sec, COALESCE(SUM(Lock_time_sum), 0) / 11520 AS lock_time_sum_per_sec, COALESCE(SUM(Rows_sent_sum), 0) / 11520 AS rows_sent_sum_per_sec, COALESCE(SUM(Rows_examined_sum), 0) / 11520 AS rows_examined_sum_per_sec FROM query_class_metrics WHERE query_class_id = 7 AND instance_id = 1259 AND (start_ts >= '2014-11-27 00:00:00' AND start_ts < '2014-12-05 00:00:00') GROUP BY point; ... 61 rows in set (0.10 sec) # Query_time: 0.101203 Lock_time: 0.000407 Rows_sent: 61 Rows_examined: 11639 Rows_affected: 0 explain SELECT ... *************************** 1. row *************************** id: 1 select_type: PRIMARY table: query_class_metrics partitions: NULL type: range possible_keys: agent_class_ts,agent_ts key: agent_class_ts key_len: 12 ref: NULL rows: 21686 filtered: 100.00 Extra: Using index condition; Using temporary; Using filesort *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: NULL partitions: NULL type: NULL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL filtered: NULL Extra: No tables used 2 rows in set, 2 warnings (0.00 sec)

It is relatively fast.

ClickHouse (some functions are different, so we will have to rewrite the query):

SELECT intDiv(1480888800 - toRelativeSecondNum(start_ts), 11520) AS point, toDateTime(1480888800 - (point * 11520)) AS ts, SUM(query_count) / 11520 AS query_count_per_sec, SUM(Query_time_sum) / 11520 AS query_time_sum_per_sec, SUM(Lock_time_sum) / 11520 AS lock_time_sum_per_sec, SUM(Rows_sent_sum) / 11520 AS rows_sent_sum_per_sec, SUM(Rows_examined_sum) / 11520 AS rows_examined_sum_per_sec, SUM(Rows_affected_sum) / 11520 AS rows_affected_sum_per_sec FROM query_class_metrics WHERE (query_class_id = 7) AND (instance_id = 1259) AND ((start_ts >= '2014-11-27 00:00:00') AND (start_ts < '2014-12-05 00:00:00')) GROUP BY point; 61 rows in set. Elapsed: 0.017 sec. Processed 270.34 thousand rows, 14.06 MB (15.73 million rows/s., 817.98 MB/s.)

As we can see, even though ClickHouse scans more rows (270K vs. 11K – over 20x more) it is faster to execute the ClickHouse query (0.10 seconds in MySQL compared to 0.01 second in ClickHouse). The column store format helps a lot here, as MySQL has to read all 150 columns (stored inside InnoDB pages) and ClickHouse only needs to read seven columns.

Wikipedia trending article of the month

Inspired by the article about finding trending topics using Google Books n-grams data, I decided to implement the same algorithm on top of the Wikipedia page visit statistics data. My goal here is to find the “article trending this month,” which has significantly more visits this month compared to the previous month. As I was implementing the algorithm, I came across another ClickHouse limitation: join syntax is limited. In ClickHouse, you can only do join with the “using” keyword. This means that the fields you’re joining need to have the same name. If the field name is different, we have to use a subquery.

Below is an example.

First, create a temporary table to aggregate the visits per month per page:

CREATE TABLE wikistat_by_month ENGINE = Memory AS SELECT path, mon, sum(hits) / total_hits AS ratio FROM ( SELECT path, hits, toMonth(date) AS mon FROM wikistat WHERE (project = 'en') AND (lower(path) NOT LIKE '%special%') AND (lower(path) NOT LIKE '%page%') AND (lower(path) NOT LIKE '%test%') AND (lower(path) NOT LIKE '%wiki%') AND (lower(path) NOT LIKE '%index.html%') ) AS a ANY INNER JOIN ( SELECT toMonth(date) AS mon, sum(hits) AS total_hits FROM wikistat WHERE (project = 'en') AND (lower(path) NOT LIKE '%special%') AND (lower(path) NOT LIKE '%page%') AND (lower(path) NOT LIKE '%test%') AND (lower(path) NOT LIKE '%wiki%') AND (lower(path) NOT LIKE '%index.html%') GROUP BY toMonth(date) ) AS b USING (mon) GROUP BY path, mon, total_hits ORDER BY ratio DESC Ok. 0 rows in set. Elapsed: 543.607 sec. Processed 53.77 billion rows, 2.57 TB (98.91 million rows/s., 4.73 GB/s.)

Second, calculate the actual list:

SELECT path, mon + 1, a_ratio AS ratio, a_ratio / b_ratio AS increase FROM ( SELECT path, mon, ratio AS a_ratio FROM wikistat_by_month WHERE ratio > 0.0001 ) AS a ALL INNER JOIN ( SELECT path, CAST((mon - 1) AS UInt8) AS mon, ratio AS b_ratio FROM wikistat_by_month WHERE ratio > 0.0001 ) AS b USING (path, mon) WHERE (mon > 0) AND (increase > 2) ORDER BY mon ASC, increase DESC LIMIT 100 ┌─path───────────────────────────────────────────────┬─plus(mon, 1)─┬──────────────────ratio─┬───────────increase─┐ │ Heath_Ledger │ 2 │ 0.0008467223172121601 │ 6.853825241458039 │ │ Cloverfield │ 2 │ 0.0009372609760313347 │ 3.758937474560766 │ │ The_Dark_Knight_(film) │ 2 │ 0.0003508532447770276 │ 2.8858100355450484 │ │ Scientology │ 2 │ 0.0003300109101992719 │ 2.52497180013816 │ │ Barack_Obama │ 3 │ 0.0005786473399980557 │ 2.323409928527576 │ │ Canine_reproduction │ 3 │ 0.0004836300843539438 │ 2.0058985801174662 │ │ Iron_Man │ 6 │ 0.00036261003907049 │ 3.5301196568303888 │ │ Iron_Man_(film) │ 6 │ 0.00035634745198422497 │ 3.3815325090507193 │ │ Grand_Theft_Auto_IV │ 6 │ 0.0004036713142943461 │ 3.2112732008504885 │ │ Indiana_Jones_and_the_Kingdom_of_the_Crystal_Skull │ 6 │ 0.0002856570195547951 │ 2.683443198030021 │ │ Tha_Carter_III │ 7 │ 0.00033954377342889735 │ 2.820114216429247 │ │ EBay │ 7 │ 0.0006575000133427979 │ 2.5483158977946787 │ │ Bebo │ 7 │ 0.0003958340022793501 │ 2.3260912792668162 │ │ Facebook │ 7 │ 0.001683658379576915 │ 2.16460972864883 │ │ Yahoo!_Mail │ 7 │ 0.0002190640575012259 │ 2.1075879062784737 │ │ MySpace │ 7 │ 0.001395608643577507 │ 2.103263660621813 │ │ Gmail │ 7 │ 0.0005449834079575953 │ 2.0675919337716757 │ │ Hotmail │ 7 │ 0.0009126863121737026 │ 2.052471735190232 │ │ Google │ 7 │ 0.000601645849087389 │ 2.0155448612416644 │ │ Barack_Obama │ 7 │ 0.00027336526076130943 │ 2.0031305241832302 │ │ Facebook │ 8 │ 0.0007778115183044431 │ 2.543477658022576 │ │ MySpace │ 8 │ 0.000663544314346641 │ 2.534512981232934 │ │ Two-Face │ 8 │ 0.00026975137404447024 │ 2.4171743959768803 │ │ YouTube │ 8 │ 0.001482456447101451 │ 2.3884527929836152 │ │ Hotmail │ 8 │ 0.00044467667764940547 │ 2.2265750216262954 │ │ The_Dark_Knight_(film) │ 8 │ 0.0010482536106662156 │ 2.190078096294301 │ │ Google │ 8 │ 0.0002985028319919154 │ 2.0028812075734637 │ │ Joe_Biden │ 9 │ 0.00045067411455437264 │ 2.692262662620829 │ │ The_Dark_Knight_(film) │ 9 │ 0.00047863754833213585 │ 2.420864550676665 │ │ Sarah_Palin │ 10 │ 0.0012459220318907518 │ 2.607063205782761 │ │ Barack_Obama │ 12 │ 0.0034487235202817087 │ 15.615409029600414 │ │ George_W._Bush │ 12 │ 0.00042708730873936023 │ 3.6303098900144937 │ │ Fallout_3 │ 12 │ 0.0003568429236849597 │ 2.6193094036745155 │ └────────────────────────────────────────────────────┴──────────────┴────────────────────────┴────────────────────┘ 34 rows in set. Elapsed: 1.062 sec. Processed 1.22 billion rows, 49.03 GB (1.14 billion rows/s., 46.16 GB/s.)

Their response time is really good, considering the amount of data it needed to scan (the first query scanned 2.57 TB of data).


The ClickHouse column-oriented database looks promising for data analytics, as well as for storing and processing structural event data and time series data. ClickHouse can be ~10x faster than Spark for some workloads.

Appendix: Benchmark details


  • CPU: 24xIntel(R) Xeon(R) CPU L5639 @ 2.13GHz (physical = 2, cores = 12, virtual = 24, hyperthreading = yes)
  • Disk: 2 consumer grade SSD in software RAID 0 (mdraid)

Query 1

select count(*) from wikistat


:) select count(*) from wikistat; SELECT count(*) FROM wikistat ┌─────count()─┐ │ 26935251789 │ └─────────────┘ 1 rows in set. Elapsed: 6.610 sec. Processed 26.88 billion rows, 53.77 GB (4.07 billion rows/s., 8.13 GB/s.)


spark-sql> select count(*) from wikistat; 26935251789 Time taken: 7.369 seconds, Fetched 1 row(s)

Query 2

select count(*), month(dt) as mon from wikistat where year(dt)=2008 and month(dt) between 1 and 10 group by month(dt) order by month(dt);


:) select count(*), toMonth(date) as mon from wikistat where toYear(date)=2008 and toMonth(date) between 1 and 10 group by mon; SELECT count(*), toMonth(date) AS mon FROM wikistat WHERE (toYear(date) = 2008) AND ((toMonth(date) >= 1) AND (toMonth(date) <= 10)) GROUP BY mon ┌────count()─┬─mon─┐ │ 2100162604 │ 1 │ │ 1969757069 │ 2 │ │ 2081371530 │ 3 │ │ 2156878512 │ 4 │ │ 2476890621 │ 5 │ │ 2526662896 │ 6 │ │ 2489723244 │ 7 │ │ 2480356358 │ 8 │ │ 2522746544 │ 9 │ │ 2614372352 │ 10 │ └────────────┴─────┘ 10 rows in set. Elapsed: 37.450 sec. Processed 23.37 billion rows, 46.74 GB (623.97 million rows/s., 1.25 GB/s.)


spark-sql> select count(*), month(dt) as mon from wikistat where year(dt)=2008 and month(dt) between 1 and 10 group by month(dt) order by month(dt); 2100162604 1 1969757069 2 2081371530 3 2156878512 4 2476890621 5 2526662896 6 2489723244 7 2480356358 8 2522746544 9 2614372352 10 Time taken: 792.552 seconds, Fetched 10 row(s)

Query 3

SELECT path, count(*), sum(hits) AS sum_hits, round(sum(hits) / count(*), 2) AS hit_ratio FROM wikistat WHERE project = 'en' GROUP BY path ORDER BY sum_hits DESC LIMIT 100;


:) SELECT :-] path, :-] count(*), :-] sum(hits) AS sum_hits, :-] round(sum(hits) / count(*), 2) AS hit_ratio :-] FROM wikistat :-] WHERE (project = 'en') :-] GROUP BY path :-] ORDER BY sum_hits DESC :-] LIMIT 100; SELECT path, count(*), sum(hits) AS sum_hits, round(sum(hits) / count(*), 2) AS hit_ratio FROM wikistat WHERE project = 'en' GROUP BY path ORDER BY sum_hits DESC LIMIT 100 ┌─path────────────────────────────────────────────────┬─count()─┬───sum_hits─┬─hit_ratio─┐ │ Special:Search │ 44795 │ 4544605711 │ 101453.41 │ │ Main_Page │ 31930 │ 2115896977 │ 66266.74 │ │ Special:Random │ 30159 │ 533830534 │ 17700.54 │ │ Wiki │ 10237 │ 40488416 │ 3955.11 │ │ Special:Watchlist │ 38206 │ 37200069 │ 973.67 │ │ YouTube │ 9960 │ 34349804 │ 3448.78 │ │ Special:Randompage │ 8085 │ 28959624 │ 3581.9 │ │ Special:AutoLogin │ 34413 │ 24436845 │ 710.11 │ │ Facebook │ 7153 │ 18263353 │ 2553.24 │ │ Wikipedia │ 23732 │ 17848385 │ 752.08 │ │ Barack_Obama │ 13832 │ 16965775 │ 1226.56 │ │ index.html │ 6658 │ 16921583 │ 2541.54 │ … 100 rows in set. Elapsed: 398.550 sec. Processed 26.88 billion rows, 1.24 TB (67.45 million rows/s., 3.10 GB/s.)


spark-sql> SELECT > path, > count(*), > sum(hits) AS sum_hits, > round(sum(hits) / count(*), 2) AS hit_ratio > FROM wikistat > WHERE (project = 'en') > GROUP BY path > ORDER BY sum_hits DESC > LIMIT 100; ... Time taken: 2522.903 seconds, Fetched 100 row(s)


Sudden Failure of Automated Backups

Lastest Forum Posts - February 13, 2017 - 2:49pm

Automated Xtrabackups for our MySQL Percona Cluster suddenly seem to have stopped working without us changing anything specific in the backup procedures/normal operating processes of our cluster. In the error log, there are entries like this:

2017-02-13 02:00:02,007 - INFO - Pre-process for backup: Ensure the backup directory is present and empty...
2017-02-13 02:00:02,023 - INFO - Running innobackupex for the running instance...
2017-02-13 02:00:02,025 - INFO - ... xtrabackup still running, check [/home/percona/backup.log/backup.log]
^Gxbstream: Can't create/write to file '././backup-my.cnf' (Errcode: 17 - File exists)
xbstream: failed to create file.
2017-02-13 02:00:09,033 - ERROR - Caught this exception in Manager: Xtrabackup failed - please see log at [/home/percona/backup.log/backup.log]
2017-02-13 02:00:09,033 - INFO - Clean up any subprocesses before exit...

The backup.log states:

170213 02:00:02 innobackupex: Starting ibbackup with command: xtrabackup --defaults-extra-file="/home/percona/.my.cnf" --defaults-group="mysqld" --backup --suspend-at-end --target-dir=/tmp --innodb_data_file_path="ibdata1:12M:autoextend" --tmpdir=/tmp --extra-lsndir='/tmp' --stream=xbstream

innobackupex: Waiting for ibbackup (pid=5292) to suspend
innobackupex: Suspend file '/tmp/xtrabackup_suspended_2'

xtrabackup version 2.2.12 based on MySQL server 5.6.24 Linux (x86_64) (revision id: 8726828)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql
xtrabackup: open files limit requested 65535, set to 1024
xtrabackup: using the following InnoDB configuration:
xtrabackup: innodb_data_home_dir = ./
xtrabackup: innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup: innodb_log_group_home_dir = ./
xtrabackup: innodb_log_files_in_group = 2
xtrabackup: innodb_log_file_size = 4294967296
xtrabackup: using ALL_O_DIRECT
>> log scanned up to (12969935228387)
xtrabackup: Generating a list of tablespaces
[01] Streaming ./ibdata1
^Gxtrabackup: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
xb_stream_write_data() failed.
^Gxtrabackup: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
[01] xtrabackup: Error: xtrabackup_copy_datafile() failed.
[01] xtrabackup: Error: failed to copy datafile.

innobackupex: got a fatal error with the following stacktrace: at /usr/bin/innobackupex line 2711
main::wait_for_ibbackup_file_create('/tmp/xtrabackup_suspended_2') called at /usr/bin/innobackupex line 2731
main::wait_for_ibbackup_suspend('/tmp/xtrabackup_suspended_2') called at /usr/bin/innobackupex line 1984
main::backup() called at /usr/bin/innobackupex line 1609
innobackupex: Error: The xtrabackup child process has died at /usr/bin/innobackupex line 2711.

The only thing I can think of that would have changed over time is the size of the database backups, as of course the database is always growing. Could the backup buffer sizes need to be increased? More information: our backups stream daily from the secondary node of our 3 node cluster to another server where the backups are kept and then a test restore is done from this backup server to another final server instance to verify the backups restore correctly. Furthermore, our binlogs are constantly also streamed from this secondary node of the 3 node cluster the final restore server instance (the binlogs have not failed and continue to stream properly according to files present on the final restore server).

Topology Summary:
SqlCluster Node1/2/3: Node 2 Streams Backups Daily to -> Backup Keeper Server which restores to -> Final Restore server
Node 2 also Streams constant binlogs to -> Final Restore server

Any suggestions on what might be causing this sudden backup failure? The cluster has been running without issue otherwise so it is perplexing. There is also plenty of free space given the amount of data we are backing up/testing restores on. Thanks in advance and have a great day! ^_^
Visit Percona Store

General Inquiries

For general inquiries, please send us your question and someone will contact you.