
Percona Server for MySQL 5.5.53-38.5 is now available

Latest MySQL Performance Blog posts - November 23, 2016 - 10:59am

Percona announces the release of Percona Server for MySQL 5.5.53-38.5 on November 18, 2016. Based on MySQL 5.5.53, including all the bug fixes in it, Percona Server for MySQL 5.5.53-38.5 is now the current stable release in the 5.5 series.

Percona Server for MySQL is open-source and free. You can find release details in the 5.5.53-38.5 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

New Features:
  • The metrics for scalability measurement feature is now built by default, but it is deprecated. Users who have installed this plugin but are not using its capability are advised to uninstall it due to known crashing bugs.

Find the release notes for Percona Server for MySQL 5.5.53-38.5 in our online documentation. Report bugs on the launchpad bug tracker.

Several instances

Latest Forum Posts - November 23, 2016 - 6:17am
Hello, is it possible to visualize all individual instances of a DB server in PMM? For example: I/O usage per instance, QPS, etc.

MySQL query response time dashboard no results

Latest Forum Posts - November 22, 2016 - 9:31pm
Everything is fine, but I can't get the "MySQL query response time" dashboard to display data. Thanks for helping me.

My variables are here:

Percona Server 5.6.33-79.0
query_response_time_stats = 1
userstat = 1
thread_statistics = 1

query_response_time_flush | OFF |
query_response_time_range_base | 10 |
query_response_time_stats | ON |


Grafana 3.1.1 with percona/grafana-dashboards and this patch applied:
sed -i 's/expr=\(.\)\.replace(\(.\)\.expr,\(.\)\.scopedVars\)\(.*\)var \(.\)=\(.\)\.interval/expr=\1.replace(\2.expr,\3.scopedVars\4var \5=\1.replace(\6.interval, \3.scopedVars)/' /usr/share/grafana/public/app/plugins/datasource/prometheus/datasource.js
sed -i 's/,range_input/.replace(\/"{\/g,"\"").replace(\/}"\/g,"\""),range_input/; s/step_input:""//' /usr/share/grafana/public/app/plugins/datasource/prometheus/query_ctrl.js

Prometheus 1.3.1
mysqld_exporter, version 0.9.0 (branch: master, revision: 8400af20ccdbf6b5e0faa2c925c56c48cd78d70b)

Running using an environment
./mysqld_exporter -config.my-cnf=".my.cnf" -collect.auto_increment.columns=true -collect.binlog_size=true -collect.engine_innodb_status=true -collect.info_schema.clientstats=true -collect.info_schema.innodb_metrics=true -collect.info_schema.innodb_tablespaces=true -collect.info_schema.processlist=true -collect.info_schema.query_response_time=true -collect.info_schema.tablestats=true -collect.info_schema.userstats=true -collect.perf_schema.eventsstatements=true -collect.perf_schema.eventswaits=true -collect.perf_schema.file_events=true -collect.perf_schema.indexiowaits=true -collect.perf_schema.tableiowaits=true -collect.perf_schema.tablelocks=true

No results

Other dashboards work normally.


Webinar Q/A: MySQL High Availability with Percona XtraDB Cluster 5.7

Latest MySQL Performance Blog posts - November 22, 2016 - 12:34pm

In this blog I will provide answers to the questions that some of you raised during the webinar on November 17th.

I would like to thank everyone in the audience who attended the talk on November 17, 2016. You can also check out the recording and slides here.

Q. How is storage distribution done across the node?

A. Each node has independent storage and other resources; nothing is shared between nodes. Only the write-sets are replicated.

Q. If write-set propagation fails in some manner, is there any retry mechanism?

A. Write-sets are written to the group channel, and the originating node waits for an ack from all the nodes of the cluster. If a node fails to respond, it may lose its cluster membership. Each node needs to consume all write-sets, and in the given order.

Q. Normally we point to only one write node. In Percona XtraDB Cluster 5.7, can we point to two write nodes, load-balanced? Or should the solution be ProxySQL?

A. Percona XtraDB Cluster (PXC) being multi-master, you can execute writes on multiple nodes (this is possible even with 5.6). ProxySQL will help you load-balance your traffic, but the ability to write to any node is inherent to PXC.

Q. Which service does a joining node have to call to get cluster membership? Is there some kind of registry service?

A. There is no special registry service. This is transparent to the end-user and is handled as part of gcomm communication layer.

Q. Would it be possible to get more information about setting up ProxySQL? We are currently using HAProxy, but would like a more aware balancer.

A. These articles should help:

Q. Is there a recommended setup for the cluster (a white paper)? I heard about a lot of conflict issues between nodes, so I would like to see if there is a recommended setup.

A. There is no single way to do this, but there are a lot of blogs based on your use case. The simplest is a 3-node cluster on a LAN. Conflicts generally happen if users tend to update the same data through multiple nodes. Disjoint workload distribution will help avoid conflicts. That said, if conflicts are an inherent part of the application or workload, Percona XtraDB Cluster (PXC) is well armed to handle them.

Q. What is the best way to figure out timeouts for geo clusters?

A. Study the latency, and ensure that timeout > latency.
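As an illustration, the relevant knobs live in wsrep_provider_options. The option names below are standard Galera EVS parameters, but the values are only a sketch for a high-latency WAN and must be derived from your own measured round-trip times:

```
# my.cnf fragment (illustrative values, not recommendations)
wsrep_provider_options = "evs.keepalive_period=PT3S;evs.suspect_timeout=PT30S;evs.inactive_timeout=PT1M;evs.install_timeout=PT1M"
```

Each timeout must comfortably exceed the worst-case latency between sites, with the suspect/inactive timeouts leaving headroom for transient WAN hiccups.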

Q. Let's say we are running Percona XtraDB Cluster 5.6 with a two-node cluster. Can I join a new node with the latest version of Percona XtraDB Cluster 5.7?

A. This scenario is possible, as Percona XtraDB Cluster (PXC) supports rolling upgrades, but a new node demanding SST from a 5.6 node will surely not work. Also, this should be a temporary state with a plan for a full upgrade, not something you want to keep running.

Q. Currently I am using Percona XtraDB Cluster 5.6, and I am mostly facing a deadlock situation when an insert query is running on a big table. When Percona tries to sync with another node, no DML query will execute. At that point I need to shut down the other node; then query execution is fine, and afterwards I start the other nodes one by one. I even changed many Galera/Percona wsrep_xx configuration options, but it did not work. Is this kind of issue solved in Percona XtraDB Cluster 5.7?

A. I am not sure I understood the complete setup, but let me try to summarize my understanding: you have DML running on node-1 that is replicated to node-2, and the node-2 workload is trying to touch the same big table that is receiving the replicated write-sets. The local transaction may face an abort, as a replicated transaction always takes priority over a locally running transaction. There shouldn't be a need to shut down any of the nodes. If you still face this problem, you can file a detailed report on Launchpad or the forum, and we can discuss what is going wrong.

Q. I need to build a DR platform. Which replication will be suitable for this? Do I need to upgrade to Percona XtraDB Cluster 5.7 at the DR side, or is a replication manager required?

A. For DR you can either use an extended cluster, so that the DR site gets write-sets instantly, or set up a new cluster and enable cluster-to-cluster replication using MySQL MASTER-MASTER async replication (given that DR is one-way, MASTER-SLAVE should also work). You don't need to upgrade, but it is better to use a consistent, up-to-date version on all nodes; in particular, a mix-and-match MASTER-SLAVE setup may have compatibility issues.

Q. What are the major differences/benefits between Percona XtraDB Cluster 5.7 and MariaDB Cluster with Galera ?

A. Percona XtraDB Cluster (PXC) 5.7 is GA. MariaDB 10.2 is proposed to be GA by Dec 2016. Besides this, PXC is fully Percona Server compatible (it uses the XtraDB engine); there are some small functional/usage differences, and a difference in stability.

Q. How long can a node be out of the cluster and still rejoin by applying write-sets? How is write-set retention managed?

A. The time a node can be offline without needing SST depends on two factors: the rate of replicating transactions (including their size), and the size of the galera-cache (gcache) that holds these write-sets. If you think you need a longer offline window, you should size the gcache accordingly.
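As a back-of-the-envelope sketch of that sizing, where the write rate and downtime figures are hypothetical placeholders (measure your own replicated volume):

```shell
# gcache must hold every write-set generated while the node is offline.
write_rate_mb_per_min=20   # hypothetical: average replicated write-set volume
planned_offline_min=60     # hypothetical: longest tolerated node outage
safety_factor=2            # headroom for bursts

gcache_mb=$((write_rate_mb_per_min * planned_offline_min * safety_factor))
echo "suggested gcache.size = ${gcache_mb}M"   # prints: suggested gcache.size = 2400M
```

The resulting figure would then go into wsrep_provider_options (e.g. gcache.size=2400M).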

Q. Can we have a sample config file for geo-clusters?

A. We will try to come up with one in due-course through an upcoming blog. In the meantime, you can look at existing blogs on the Percona Database Performance blog.

Q. What are the limits for max_rows and max_txn_size in Percona XtraDB Cluster (PXC) 5.7, especially for batch data loads across multi-region cluster nodes?

A. wsrep_max_ws_rows (default 0: no limit; maximum: 1048576) and wsrep_max_ws_size (default: 2G; range: 1024 bytes to 2G).
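For reference, both variables can be set in my.cnf; the values below simply restate the defaults and are illustrative:

```
# my.cnf fragment (defaults shown)
[mysqld]
wsrep_max_ws_rows = 0            # 0 = no limit (maximum allowed: 1048576)
wsrep_max_ws_size = 2147483647   # ~2G, the upper bound of the 1024..2G range
```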

Q: Does Percona XtraDB Cluster (PXC) support MySQL’s GTIDs?

A. Yes, but for its own replication Percona XtraDB Cluster (PXC) uses its own GTIDs. This blog will help clear up the confusion.

Q. How does Percona XtraDB Cluster (PXC) compare to MySQL’s Group Replication?

A. Both are trying to solve the same problem, except that Percona XtraDB Cluster (PXC) is mature and has been in the market for quite some time, while Group Replication is still being built.

Q. Does Percona XtraDB Cluster (PXC) have size limitations? I recently tried to set up a 2TB PXC cluster; however, during load tests there were a few instances where one node got out of sync. The server did a full copy of the data, but could not complete it because the load tests kept filling up the gcache.

A. There is no such known limitation. Generally, if a node's receive queue fills up, it will emit a FLOW_CONTROL signal, which keeps the receive queue small enough not to fill up the gcache. If you still have the log files, you can share them through Launchpad or the forum, and we will try to look at them.

Q. How do you perform a major version upgrade? Per MySQL's documentation, you cannot replicate from a major version to the previous major version, but it is fine to replicate from one major version to the next. So how would you do this in the cluster?

A. As per MySQL, you may face issues if you try to replicate from a lower-version master (5.6) to a higher-version slave (5.7), but it is not blocked; some of the semantics may be different. Percona XtraDB Cluster (PXC) write-sets are not affected, though, as they share binlog events, and this write-set format has not changed in 5.7.

Q. Does Galera set a max number of nodes that can be part of the cluster?

A. No such realistic limitation.

Q. Are there docker images with this configured? Dockerhub or something?

A. This should help.

Q. What is the maximum latency that would be supported on the LAN before you would say that running a Percona XtraDB Cluster is not a good idea?

A. I guess this is configurable via the timeouts, so there is no recommended latency threshold for a LAN. The lower, the better.

Q. When you start a cluster, you bootstrap Node 1, then start Node 2 and Node 3. If you restart Node 1, it rejoins the cluster without the bootstrap state, but that does not matter because it joins a live cluster. If my understanding is correct, bootstrap only matters for the first node started. Is that correct? And what happens if Node 1 restarts with the bootstrap option: will it force the other nodes to sync against it, or will it join the running cluster?

A. When you start node-1 for the first time, it will create a new cluster, and node-2 and node-3 will join the existing cluster. Depending on how node-1 is restarted, it can join the existing cluster or create one more, independent cluster. The recommended way is to use a valid value of wsrep_cluster_address on all nodes and pass one extra param, --wsrep_new_cluster, only to the bootstrap node. If you happen to restart this node, avoid passing this param; the node will then try to join the existing cluster.
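A sketch of the above in practice (the systemd unit name follows common PXC packaging; adapt the commands and addresses to your distribution):

```
# my.cnf on ALL nodes lists every member:
#   wsrep_cluster_address = gcomm://10.0.0.1,10.0.0.2,10.0.0.3

# First-ever start of node-1 only (creates the cluster):
systemctl start mysql@bootstrap.service   # equivalent to passing --wsrep_new_cluster

# node-2, node-3, and any later restart of node-1 (joins the existing cluster):
systemctl start mysql
```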

Q. What is the overhead of running Percona Monitoring and Management (PMM)?

A. Percona Monitoring and Management (PMM) installs an agent on the node to collect a lot of statistics. From the Percona XtraDB Cluster (PXC) perspective, it only queries status counters, so it is pretty lightweight for PXC.

Q. Is it easy (is there a procedure) to move from Codership Galera to Percona XtraDB Cluster (PXC)?

A. I don't think there is a blog about it, but they are fully compatible, so moving should be easy. I will find out if there is a set process for this.

Q. Where is the documentation for Cluster Safe Mode and other new features discussed here?

A. See the pxc_strict_mode documentation. For PFS, you can check this out. ProxySQL and Percona Monitoring and Management (PMM) have blogs too.

Q. Are there integrity issues where a client believes a node is up while that node has lost the cluster?

A. No known issue.

Q. Is there any limit to running a huge number of databases? Say, several million?

A. No known issue.

Q. How is the performance of ProxySQL compared with HAProxy?

A. You can check this out.

Q. We use Nagios for monitoring. Will a plug-in be added for monitoring the cluster, or will it be only Percona Monitoring and Management (PMM)?

A. Check this out.

Q. “Cross data center replication”. We have two data centers that have a ping latency of 2ms (consistent) and I would like to replicate between the two for DR (disaster recovery) purposes.

A. 2 ms latency between the two DCs and a consistent network sounds pretty good. Just tune the timeouts and things will work.

Q. Do you guys have a sample config files for a quick spin off of a 3 node cluster?

A. This should help.

Q. I see that there are added features like PAM authentication and thread pool, which are given for free in Percona. Can you elaborate on them?

A. Percona XtraDB Cluster (PXC) is Percona Server (PS) compatible, so any feature that is present in PS will be part of PXC.

Q. In the example that you showed, you had a 6-node cluster, with 3 nodes in Site A and 3 in Site B. If the WAN link goes down, how does the cluster determine which data set is the master set once the WAN link comes back up after a few hours?

A. In the example I used 2 DCs. The recommendation is to use 3 DCs to avoid split-brain. If you have 6 nodes in 2 DCs and the WAN link goes down, it will create a split-brain, and no node will accept writes unless the user sets weights to form a quorum or re-bootstraps the primary.

I hope I was able to answer most of the questions. If you have follow-up questions, please post them on the forum.

Merging an incremental backup with a full backup to create a new backup

Latest Forum Posts - November 22, 2016 - 3:05am

I wonder if it is possible to create a full database backup and then an incremental backup that is merged with it to create a "new" full database backup. If so, could you please be so kind as to refer me to a tutorial or documentation?

I am looking forward to hearing from you.

Kind regards,


Is Oracle Enterprise Manager for MySQL compatible with Percona DB 5.5 / 5.6 / 5.7?

Latest Forum Posts - November 21, 2016 - 12:18pm
Is OEM (Oracle Enterprise Manager for MySQL Database, for Linux x86-64) compatible with Percona DB 5.5 / 5.6 / 5.7?
Has anyone been able to run OEM against Percona database installations?

percona xtradb cluster add new nodes

Latest Forum Posts - November 21, 2016 - 12:17pm
Dear Percona,

Our company is now trying to use Percona XtraDB Cluster as our database server.

I have seen the tutorial at the URL.

There are some questions for you:

1. Why are three nodes needed for "percona xtradb cluster"? What is the reason?

2. I have just set up a two-node "percona xtradb cluster" and it is already working. My questions:

- If I want to add a new node to make it three nodes, what are the steps?

- If one node fails (for example, the server crashes), how can I add a new node?

3. Does "percona xtradb cluster" support innodbbackup?


Intelix Global Crossing


Percona Server 5.5.53-38.4 is now available

Latest MySQL Performance Blog posts - November 18, 2016 - 1:33pm

Percona announces the release of Percona Server 5.5.53-38.4 on November 18, 2016. Based on MySQL 5.5.53, including all the bug fixes in it, Percona Server 5.5.53-38.4 is now the current stable release in the 5.5 series.

Percona Server is open-source and free. You can find release details in the 5.5.53-38.4 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

Removed Features:

Bugs Fixed:
  • When a stored routine called an “administrative” command such as OPTIMIZE TABLE, ANALYZE TABLE, ALTER TABLE, CREATE/DROP INDEX, etc., the effective value of log_slow_sp_statements was overwritten by the value of log_slow_admin_statements. Bug fixed #719368.
  • Thread pool “thread limit reached” and “failed to create thread” messages are now printed on the first occurrence as well. Bug fixed #1636500.

Other bugs fixed: #1612076, #1633061, #1633430, and #1635184.

Find the release notes for Percona Server 5.5.53-38.4 in our online documentation. Report bugs on the launchpad bug tracker.

WiredTiger B-Tree versus WiredTiger In-Memory: Q & A

Latest MySQL Performance Blog posts - November 18, 2016 - 10:15am

In this blog, I will provide answers to the Q & A for the WiredTiger B-Tree versus WiredTiger In-Memory webinar.

First, I want to thank everybody for attending the October 13 webinar. The recording and slides for the webinar are available here. Below is the list of questions that I wasn’t able to fully answer during the webinar, with responses:

Q: Does the In-Memory storage engine have an oplog? Do we need more RAM if the oplog is set to be bigger?
Q: So we turn off the oplog?
Q: How is data replicated without oplog? Do you confound it with journaling?

A: Percona Memory Engine for MongoDB can be started with or without an oplog, depending on whether it is started as part of a replica set or standalone (you cannot explicitly turn the oplog on or off). If created, the oplog is stored in memory as well; you can still control its size with the --oplogSize option.

The recovery log (journal) is disabled for the Percona Memory Engine.
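For illustration, the two startup modes might look like this (--storageEngine, --inMemorySizeGB and --oplogSize are documented mongod options; the sizes are arbitrary examples):

```
# Standalone: no oplog is created.
mongod --storageEngine inMemory --inMemorySizeGB 4

# Replica-set member: an oplog is created and also held in memory,
# so include its size (here 1024 MB) in your RAM budget.
mongod --storageEngine inMemory --inMemorySizeGB 4 --replSet rs0 --oplogSize 1024
```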

Q: After a crash of the In-Memory storage engine, does it need a complete initial sync, meaning cloning all databases?

A: Yes.

Q: WiredTiger reserves 50% of RAM for de-compression. Is this also true for the In-Memory engine?

A: Where did you find this information? Please point to its location in the docs in the comments section below. I asked Percona developers to confirm or deny this for the Percona Memory Engine, and this was their answer:

WT decompresses data block-wise, and each block is of some reasonable size (usual numbers are a couple of megabytes, say). The decompressor knows the size of the uncompressed data by reading this info from the compressed block (the info is stored during compression). It creates an extra buffer of the uncompressed block size, decompresses the data into this buffer, then uses the decompressed buffer and frees the initial one. So there is no reserve of memory for either compression or decompression, and no docs stating that.

Please note that this comment applies only to block compression, which is only used during disk I/O when WiredTiger reads and writes blocks, thus not available for Percona Memory Engine.

Q: There is no compression of data in this engine?

A: The Percona Memory Engine uses only prefix compression for indexes. Theoretically, it could use other types of compression, dictionary and Huffman, but both are disabled in MongoDB.

Q: With all the data in memory, is there much benefit to having indexes on the data?

A: Yes, because with index access you read less data. While reading from memory is much faster than reading from disk, it is still faster to read just a few rows from memory than to scan millions.

Q: Our db is 70g. Will we need 70g memory to use Percona In-Memory?
Q: How much memory should be allocated for 70g db size?

A: Which storage engine do you use, and how do you calculate the size? If this is WiredTiger and you count the space it allocates, the answer is: yes, you need 70G of RAM to use the Percona Memory Engine.

Q: What is the difference in size of data between WiredTiger on disks versus WiredTiger In-Memory?

A: There is no difference: the size is the same. Please note that WiredTiger (on which the Percona Memory Engine is based) can itself additionally allocate up to 50% of the amount specified in the --inMemorySize option. You can check db.serverStatus().inMemory.cache to find out how much of the specified memory is used for storing your data: "bytes currently in the cache" shows the total number of bytes occupied by the physical representation of all MongoDB’s databases, and "maximum bytes configured" shows what is passed in the --inMemorySize option. The difference between the two is the amount of memory (in bytes) still available.
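As a quick sketch of that subtraction (both byte counts are hypothetical readings, not real server output):

```shell
# Hypothetical readings from db.serverStatus().inMemory.cache:
bytes_in_cache=52428800            # "bytes currently in the cache" (50 MB)
max_bytes_configured=1073741824    # "maximum bytes configured" (--inMemorySize 1G)

available=$((max_bytes_configured - bytes_in_cache))
echo "bytes still available: ${available}"   # prints: bytes still available: 1021313024
```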

Q: What is the way to convert data from disk to In-Memory? Using mongodump and rebuild the indexes?

A: Yes

Q: An enhancement request is to enable regular and In-Memory engines on the same MongoDB instance.

A: This is a MongoDB limitation, but noted and reported for Percona at

MySQL Table Statistics dashboard no results

Latest Forum Posts - November 17, 2016 - 7:10pm
I can't get the "MySQL Table Statistics" dashboard to display anything. Here is what I updated:

set global userstat=1;
set global innodb_stats_on_metadata = ON;
set global innodb_monitor_enable=all;

Anything else I should do to have the reporting live? Thanks
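Not an official answer, but one common follow-up: SET GLOBAL changes are lost on restart, so the same variables usually also belong in my.cnf:

```
# my.cnf fragment mirroring the SET GLOBAL statements above
[mysqld]
userstat = 1
innodb_stats_on_metadata = ON
innodb_monitor_enable = all
```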

Help Us Shape the Future of Percona

Latest MySQL Performance Blog posts - November 17, 2016 - 11:05am

Let us know what you think about Percona, and what we should be thinking about for the future.

Over the last ten years, Percona has grown considerably. We’ve moved from being a strictly MySQL company, to a company that supports MongoDB and other open source databases. Percona Live and Percona Live Europe have become key meeting places for the open source community, and now are important hubs for learning about and discussing open source database solutions.

As we look forward to the next ten years of business, we want to get an idea of what you think of us. As we plan for the future, we’d like to hear about your experience with Percona today and get your input on how we can continue to evolve. 

To achieve that end, we’ve put together a survey of questions about us, our services, our products and the open source community’s perception of us. We would appreciate you taking the time to fill it out so we can know your thoughts. Your feedback helps us shape our company and grow the community.

Take the survey here:

It should take 10-15 minutes to complete and will remain open until Friday, Dec. 2. Thanks again for helping us prepare for the future.

TokuDB Hot Backup Plugin not releasing drive

Latest Forum Posts - November 17, 2016 - 9:17am
I performed a backup using TokuDB Hot Backup to a drive that I mounted specifically for transferring data. Now I can't unmount the drive. When I run lsof on the drive it says that mysqld has 4 files that it has read/write permission on but are deleted. Looking at the source mysql directory, the files exist, but in the backup they don't. Trying to copy the files from mysql does transfer the files to the backup location, but it doesn't fix the issue. I'm trying to not have to restart mysql in order to unmount this drive. I've tried redoing the backup, moving it to another directory and adding an exclude for the files listed, and it doesn't resolve the issue. The output of lsof is below. Does anyone have any ideas on this?

mysqld 34959 mysql 4786u REG 8,49 0 28574175 /var/local/mysqlDump/mysql_data_dir/mysql/columns_priv.MYD (deleted)
mysqld 34959 mysql 5624u REG 8,49 132032 28575014 /var/local/mysqlDump/mysql_data_dir/mysql/user.MYD (deleted)
mysqld 34959 mysql 5635u REG 8,49 27435682 28575025 /var/local/mysqlDump/mysql_data_dir/mysql/tables_priv.MYD (deleted)
mysqld 34959 mysql 6966u REG 8,49 1183504 28576356 /var/local/mysqlDump/mysql_data_dir/mysql/proc.MYD (deleted)

All You Need to Know About GCache (Galera-Cache)

Latest MySQL Performance Blog posts - November 16, 2016 - 5:21pm

This blog discusses some important aspects of GCache.

Why do we need GCache?

Percona XtraDB Cluster is a multi-master topology, where a transaction executed on one node is replicated to the other node(s) of the cluster. The transaction is copied from the group channel to the Galera-Cache, and then applied.

The cache could be discarded immediately once the transaction is applied, but retaining it lets the node act as a DONOR, serving write-sets to a newly booted node.

So in short, GCache acts as a temporary storage for replicated transactions.

How is GCache managed?

Naturally, the first choice for caching these write-sets is a memory-allocated pool, governed by gcache.mem_store. However, this is deprecated and buggy, and shouldn’t be used.

Next on the list is on-disk files. Galera has two types of on-disk files to manage write-sets:

  • RingBuffer File:
    • A circular file (aka RingBuffer file). As the name suggests, this file is re-usable in a circular queue fashion, and is pre-created when the server starts. The size of this file is preconfigured and can’t be changed dynamically, so selecting a proper size for this file is important.
    • The user can set the size of this file using gcache.size. (There are multiple blogs about how to estimate size of the Galera Cache, which is generally linked to downtime. If properly planned, the next booting node will find all the missing write-sets in the cache, thereby avoiding need for SST.)
    • Write-sets are appended to this file and, when needed, the file is re-cycled for use.
  • On-demand page store:
    • If the transaction write-set is large enough not to fit in a RingBuffer File (actually large enough not to fit in half of the RingBuffer file) then an independent page (physical disk file) is allocated to cache the write-sets.
    • Again there are two types of pages:
      • Page with standard size: As defined by gcache.page_size (default=128M).
      • Page with non-standard page size: If the transaction is large enough not to fit into a standard page, then a non-standard page is created for the transaction. Let’s say gcache.page_size=1M and transaction write_set = 1.5M, then a separate page (in turn on-disk file) will be created with a size of 1.5M.

How long are on-demand pages retained? This is controlled using the following two variables:

  • gcache.keep_pages_size
    • keep_pages_size defines the total size of allocated pages to keep. For example, if keep_pages_size = 10M, then N pages adding up to 10M can be retained. If the N pages add up to more than 10M, pages are removed from the start of the queue until the size falls below the threshold. A size of 0 means don’t retain any pages.
  • gcache.keep_pages_count (PXC specific)
    • But before pages are actually removed, a second check is done, based on page count. Let’s say keep_pages_count = N+M; then even though the N pages add up to 10M, they are retained, because the page-count threshold has not yet been hit. (The exception to this is non-standard pages at the start of the queue.)

So in short, both conditions must be satisfied before a page is removed. The recommendation is to use whichever condition is applicable in the user environment.

Where are GCache files located?

The default location is the data directory, but this can be changed by setting gcache.dir. Given the temporary nature of the file and its iterative read/write cycle, it may be wise to place these files on a faster-IO disk. Also, the default name of the file is gcache.cache; this is configurable by setting gcache.name.
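Putting the knobs above together, a my.cnf sketch (the path and sizes are illustrative only, not recommendations):

```
# my.cnf fragment (illustrative values)
[mysqld]
wsrep_provider_options = "gcache.dir=/fast-ssd/galera;gcache.name=gcache.cache;gcache.size=2G;gcache.page_size=128M;gcache.keep_pages_size=512M"
```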

What if one of the nodes is DESYNCED and PAUSED?

If a node desyncs, it will continue to receive write-sets and apply them, so there is no major change in gcache handling.

If the node is desynced and paused, it can’t apply write-sets and needs to keep caching them. This will, of course, affect the desynced/paused node, which will continue to create on-demand page stores. Since one of the cluster nodes can’t proceed, it will not emit a “last committed” message; in turn, the other nodes in the cluster (which could otherwise purge the entries) will continue to retain the write-sets, even though they are not desynced and paused.

What it means for a transaction to have a high number of lock struct(s)

Latest Forum Posts - November 16, 2016 - 4:22pm

What would be the potential cause(s) for a transaction to have a large number of lock struct(s)? (The DB version is Percona 5.6.25-73.1.)



TRANSACTION 5277811249, ACTIVE 74 sec inserting
mysql tables in use 2, locked 2
3850 lock struct(s), heap size 570920, 102473 row lock(s), undo log entries 102493

It shows that there are only 2 tables in use, so I am trying to see why the lock struct(s) number would be so high...
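One way to inspect this while the transaction is still open is the INFORMATION_SCHEMA.INNODB_TRX table, whose columns mirror the SHOW ENGINE INNODB STATUS counters (available in MySQL/Percona Server 5.6):

```sql
-- One lock struct can cover many row locks, so an insert touching several
-- indexes on a big table can legitimately accumulate thousands of structs.
SELECT trx_id,
       trx_state,
       trx_tables_locked,
       trx_lock_structs,    -- the "lock struct(s)" figure
       trx_rows_locked,
       trx_rows_modified
FROM   INFORMATION_SCHEMA.INNODB_TRX
ORDER  BY trx_lock_structs DESC;
```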


Is Docker Good for Your Database?

Latest MySQL Performance Blog posts - November 16, 2016 - 8:27am

This blog post reviews the appropriateness of Docker and other container solutions for your database environment.

A few weeks back, I wrote a fairly high-level blog post about containers. It covered what you should consider when thinking about using Docker, rkt, LXC, etc. I hope you’ve taken the chance to give it a quick read. It’s a good way to understand some of the disciplines you need to consider before moving to a new technology. However, it sparked a conversation in our Solutions Engineering team. Hopefully, the same one that you’re having in your organization: should customers run their database in containers?

Before we start, I’ll admit that Percona uses containers. Percona Monitoring and Management (PMM for short) presents all of the pretty graphs and query analytics by running in a Docker container. We made that choice because the integration between the components is where we could provide the most value to users. Docker lets us distribute a single ready-to-go unit of awesomeness. In short, it has huge potential on the application side of your environment. 

However, for databases… here are some of our recommendations:

Quick n Dirty

Decision = NOT FOR DBs (as it sits right now)

This is not the case for every environment; it is the default that we think is the best recommendation for the majority of our customers. Please note that I am only making this recommendation for your database. If you’re using microservices for your application today, then it could make sense to containerize your database, depending on the load characteristics of your database, your scaling needs, and the skill set you currently have.


Lack of Synergy

Before you decide to shoot me, please take some time to understand where we’re coming from. First of all, people designed container solutions to deal with stateless applications that have ephemeral data. Containers spin up a quick microservice and then destroy it. This includes all the components of that container (including its cache and data). The transient nature of containers is because all of the components and services of that container are considered to be part of the container (essentially it’s all or nothing). Serving the container a data volume owned by the underlying OS by punching a hole through the container can be very challenging. Current methods are too unreliable for most databases.

Most of the development effort put into the various solutions had one goal in mind: statelessness. There are solutions that can help keep your data persistent, but they are evolving very quickly. From what we can tell, they require a level of complexity that negates any efficiency gains through increased operational complexity (and risk). To further my point, this is precisely the conclusion that we’ve come to time and again when we’ve reviewed any “real world” information about the use of containers (especially Docker).

They’re Just Not Stable Yet

These container solutions are meant for quick development and deployment of applications that are broken into tiny components: microservices. Normally, these applications evolve very quickly in organizations that are very software/developer driven. That seems to be how these container solutions (again, especially Docker) are developed as well. New features are pushed out with little testing and design. The main focus seems to be the latest featureset and being first to market. They “beg for forgiveness” instead of “ask for permission.” On top of that, backward compatibility (from what we can tell) is a distant concern (and even that might be an overstatement). This means that you’re going to have to have a mature Continuous Delivery and testing environment as well as a known and tested image repository for your containers.

These are awesome tools to have for the right use cases, but they take time, money, resources and experience. In speaking with many of our customers, this is just not where they’re at as an organization. Their businesses aren’t designed around software development, and they simply don’t have the checkbooks to support the resources needed to keep this hungry machine fed. Rather, they are looking for something stable and performant that can keep their users happy 24×7. I know that we can give them a performant, highly available environment that requires much less management if we strip out containers.

Is There Hope?

Absolutely. In fact, there’s a lot more than hope: there are companies running containers (including databases) at massive scale today! These are the types of companies with very mature processes, whose software development is a core part of their business plan and value proposition. You probably know who I’m talking about: Uber, Google, Facebook (there are more; these are just a few). There’s even a good rundown from Joyent of how you can get persistence in containers. But as I said before, the complexity needed to get the basic features necessary to keep your data alive and available (the most basic use of a database) is much too high. When containers have a better and more stable solution for persistent storage volumes, they will be one step closer to being ready, in my opinion. Even then, containerizing databases is probably unnecessary for most organizations that aren’t dealing with large-scale deployments (50+ nodes) with wildly varying workloads.

Don’t Leave Us Hanging…

I realize that the statement “you’re probably not ready to containerize your database” does not constitute a solution, so here it is: the Solutions Engineering team (SolEng for short) has you covered. Dimitri Vanoverbeke is in the middle of writing a great blog series on configuration management. Configuration management solutions can greatly increase the repeatability of your infrastructure, and make sure that your IT/App Dev processes are reflected reliably in the physical configuration of your environment. Automating this process can lead to great gains, but it should build on a mature development/testing process as part of your application development lifecycle. The marriage of process and technology creates stable applications and happy customers.

Besides configuration management, some services can make the life of your operations team much easier: service discovery and health checking come to mind. My favorite solution is Consul, which we use extensively in PMM for configuration and service metadata. Consul can make sure that your frontend applications and backend infrastructure are working from a real-time snapshot of the state of your services.
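At its simplest, a health check like the ones Consul runs is just a periodic probe that a service port still accepts connections (a TCP check). Here is a minimal sketch of that idea in Python; the `tcp_check` function name is ours for illustration, and this is not Consul’s actual implementation:

```python
import socket

def tcp_check(host, port, timeout=2.0):
    """Return True if host:port accepts a TCP connection, in the spirit
    of a Consul-style TCP health check (illustrative sketch only)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False

# Probe a local listener to see both outcomes.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))      # let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

print(tcp_check("127.0.0.1", port))  # port is listening: healthy
listener.close()
print(tcp_check("127.0.0.1", port))  # nothing listening: unhealthy
```

Consul layers service metadata, scheduling, and failure propagation on top of probes like this, which is exactly what makes frontends able to react to backend state in real time.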


There is a lot to think about when it comes to managing an environment, especially when your application develops at a quick pace. With the crafty use of available solutions, you can reduce the overhead that goes into every release. On top of that, you can increase resiliency and availability. If you need our help, please reach out. We’d love to help you!

PXC replication from 5.7 master to 5.5 slave

Latest Forum Posts - November 16, 2016 - 4:02am
Currently I am running Percona XtraDB Cluster 5.5. To switch smoothly to 5.7, I need to set up bidirectional replication between one of the cluster nodes and a standalone PXC 5.7 node, so that I can switch production to 5.7 and switch back if any problems occur.

Config file for 5.5:

log_slave_updates = TRUE
server-id = XXXX
wsrep_auto_increment_control = off
auto-increment-offset = 1
auto-increment-increment = 10

Config file for 5.7:

log_slave_updates = FALSE
server-id = YYYY
wsrep_auto_increment_control = off
auto-increment-offset = 3
auto-increment-increment = 10
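The auto-increment settings in these two config files are what keep writes on the two masters from colliding: with auto-increment-increment = 10, each server only generates IDs congruent to its auto-increment-offset modulo 10. A quick sketch of that arithmetic (plain Python standing in for MySQL’s ID generation, with a made-up helper name):

```python
def next_ids(offset, increment, count, start=0):
    """Simulate AUTO_INCREMENT under auto_increment_offset /
    auto_increment_increment: each new ID is the smallest value greater
    than the previous one that is congruent to offset mod increment."""
    ids, last = [], start
    for _ in range(count):
        step = (offset - last) % increment or increment
        last += step
        ids.append(last)
    return ids

node55 = next_ids(offset=1, increment=10, count=5)  # [1, 11, 21, 31, 41]
node57 = next_ids(offset=3, increment=10, count=5)  # [3, 13, 23, 33, 43]
print(node55, node57)
assert not set(node55) & set(node57)  # the two masters never collide
```

With wsrep_auto_increment_control = off, Galera stops managing these variables itself, so the manually chosen offsets (1 and 3) stay in effect on both sides.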

Replication from 5.5(master) to 5.7(slave) works just fine.

Replication from 5.7(master) to 5.5(slave) produces an error

161116 14:09:27 [Note] WSREP: ready state reached
161116 14:09:27 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000082' at position 150, relay log './host55-relay-bin.000001' position: 4
161116 14:09:27 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 27, event_type: 35
161116 14:09:27 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
161116 14:09:27 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594
161116 14:09:27 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000083' position 123

Output from mysqlbinlog host55-relay-bin.000004:

/*!40019 SET @@session.max_insert_delayed_threads=0*/;
# at 4
#161116 14:05:43 server id 4310 end_log_pos 107 Start: binlog v 4, server v 5.5.41-37.0-55-log created 161116 14:05:43
# at 107
#700101 3:00:00 server id 4313 end_log_pos 0 Rotate to mysql-bin.000083 pos: 4
# at 150
#161116 14:06:43 server id 4313 end_log_pos 123 Start: binlog v 4, server v 5.7.14-8-57-log created 161116 14:06:43 at startup
ERROR: Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 27, event_type: 35
ERROR: Could not read entry at offset 269: Error in log format or read error.
# End of log file
ROLLBACK /* added by mysqlbinlog */;

How can I fix this to make replication work in both directions? Thanks

Webinar Thursday, November 17: MySQL High Availability with Percona XtraDB Cluster 5.7

Latest MySQL Performance Blog posts - November 15, 2016 - 10:37am

Join Percona’s Percona XtraDB Cluster Lead Software Engineer Krunal Bauskar for a webinar on Thursday, November 17, 2016, at 7:30 am PST on MySQL High Availability with Percona XtraDB Cluster 5.7.

Percona XtraDB Cluster 5.7 is our brand new MySQL 5.7 compatible Galera-based high availability (HA) solution. Whether you’re new to MySQL clustering technology, or experienced with Galera-based replication, this tutorial provides great insights into working with the software, including:

  • New and unique features of Percona XtraDB Cluster 5.7, including Cluster Safe Mode, instrumentation with Performance Schema, and extended support for encrypted tablespaces in multi-master topologies
  • Seamless integration with ProxySQL for better HA and read/write splitting
  • Improved security with native data at rest encryption and secure networking
  • Native integration with Docker, optimized for the container world
  • Monitoring with Percona Monitoring and Management (PMM)
  • Improved stability with many critical bug fixes and improved error messaging

This tutorial will demonstrate how to set up XtraDB Cluster, complete with High Availability Proxy and Monitoring, as well as perform the most important MySQL high availability management operations.

Register for this webinar here.

Krunal Bauskar, Percona XtraDB Cluster Lead Software Engineer

Krunal joined Percona in September 2015. Before joining Percona, he worked as part of the InnoDB team at MySQL/Oracle. He authored most of the temporary table revamp work, in addition to many other features. In the past, he worked with Yahoo! Labs researching big data issues, as well as working for a database startup that is now part of Teradata. His interests mainly include data management at any scale, which he has been working on for more than a decade now.

Server summary:mysql summary - Exit Status 1

Latest Forum Posts - November 15, 2016 - 10:17am
PMM version 1.0.5. Grafana seems fine for the OS and Percona MySQL metrics.

Did a pmm-admin remove mysql:queries, then added back

pmm-mysql-queries-42001.log shows:

# Version: percona-qan-agent 1.0.5
# Basedir: /usr/local/percona/qan-agent
# Listen:
# PID: 24463
# API:
# UUID: 015a594c599943cd62fb5b3f9ad37c00
09:43:40.458053 main.go:165: Starting agent...
09:43:40.458517 main.go:343: Agent is ready
09:43:40.462348 main.go:207: API is ready
09:44:04.674202 ERROR agent exit status 1
09:44:04.674264 WARNING agent Cmd fail: Reply[Cmd:GetMySQLSummary Error:'exit status 1' Id:eb6dc548-ac1b-4b9a-7d2d-9f94f7e813b5]
09:44:48.460546 WARNING data-sender Waiting for API to ack qan_1479231720025795596: read tcp> i/o timeout
09:47:30.728374 ERROR agent exit status 1
09:47:30.728464 WARNING agent Cmd fail: Reply[Cmd:GetMySQLSummary Error:'exit status 1' Id:adcb7031-93fa-445f-4357-9c6ef3dc3315]

All communication is via LAN, with *MY* firewall rules allowing everything and anything (still digging into what Docker does as far as chains and rules).

On the server, qan-api.log (not sure about this warning yet):

09:44:51 mysql.go:178: WARNING: %!s(int=30) duplicate query class metrics: start_ts='2016-11-15 17:41:00.00059533 +0000 UTC': MySQL ab743f3216584e9e4c4c6f6e191ef3df
09:44:51 mysql.go:242: WARNING: duplicate global metrics: start_ts='2016-11-15 17:41:00.00059533 +0000 UTC': MySQL ab743f3216584e9e4c4c6f6e191ef3df

What am I missing? Any ideas?

Percona Monitoring and Management 1.0.6 is now available

Latest MySQL Performance Blog posts - November 15, 2016 - 9:07am

Percona announces the release of Percona Monitoring and Management 1.0.6 on November 15, 2016.

The instructions for installing or upgrading Percona Monitoring and Management 1.0.6 are available in the documentation. Detailed release notes are available here.

New in PMM Server:

  • Prometheus 1.2.2
  • External static files are now local for PMM home page
  • Metrics Monitor improvements:
    • Added Amazon RDS OS Metrics dashboard and CloudWatch data source.
    • Added the PMM Server host to metrics monitoring.
    • Refactored MongoDB dashboards.
    • Added File Descriptors graph to System Overview dashboard.
    • Added Mountpoint Usage graph to Disk Space dashboard.
  • Query Analytics improvements:
    • QAN data is now purged correctly.
    • QAN data retention is now configurable with the QUERIES_RETENTION option. The default is eight days.
    • Various small fixes to Query Analytics.

New in PMM Client:

  • Fixes for mysql:queries service using Performance Schema as query source:
    • Fixed crash when DIGEST_TEXT is NULL.
    • Removed iteration over all query digests on startup.
    • Added sending of query examples to QAN if available (depends on the workload).
  • Added query source information for mysql:queries service in pmm-admin list output.
  • Added purge command to purge metrics data on the server.
  • Updated mongodb_exporter with RocksDB support and various fixes.
  • Removed --nodetype and --replset flags for mongodb:metrics. The --cluster flag is now optional.
    It is recommended to re-add the mongodb:metrics service and purge existing MongoDB metrics using the purge command.
  • Enabled monitoring of file descriptors (requires re-adding linux:metrics service).
  • Improved full uninstallation when PMM Server is unreachable.
  • Added time drift check between server and client to pmm-admin check-network output.

Live demo of PMM is available at

We welcome your feedback and questions on our PMM forum.

About Percona Monitoring and Management
Percona Monitoring and Management is an open-source platform for managing and monitoring MySQL and MongoDB performance. It is developed by Percona in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

Setting up Percona Xtrabackup with MariaDB 10.1 - sync on donor fails

Latest Forum Posts - November 15, 2016 - 4:18am

I am trying to set up a MySQL cluster with two machines. Setting up the donor works just as expected; however, if I restart mysql on the joiner, I get the following log entries on the donor:

WSREP_SST: [INFO] Logging all stderr of SST/Innobackupex to syslog (20161114 17:21:59.971)
Nov 14 17:21:59 hostname -wsrep-sst-donor: Streaming with xbstream
Nov 14 17:21:59 hostname -wsrep-sst-donor: Using socat as streamer
Nov 14 17:21:59 hostname -wsrep-sst-donor: Using /tmp/tmp.skLia96dpY as innobackupex temporary directory
Nov 14 17:21:59 hostname -wsrep-sst-donor: Streaming GTID file before SST
Nov 14 17:21:59 hostname -wsrep-sst-donor: Evaluating xbstream -c ${INFO_FILE} | socat -u stdio TCP:; RC=( ${PIPESTATUS[@]} )
Nov 14 17:21:59 hostname -wsrep-sst-donor: 2016/11/14 17:21:59 socat[23738] E connect(12, AF=2, 16): Connection refused
Nov 14 17:21:59 hostname -wsrep-sst-donor: Error while getting data from donor node: exit codes: 141 1
Nov 14 17:21:59 hostname -wsrep-sst-donor: Cleanup after exit with status:32
Nov 14 17:21:59 hostname -wsrep-sst-donor: Cleaning up temporary directories
Nov 14 17:21:59 hostname mysqld[16906]: 2016-11-14 17:21:59 139679923865344 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --binlog '/var/log/mysql/mariadb-bin' --gtid '3446d879-aa8e-11e6-9f87-ae867f6719f6:0' --gtid-domain-id '0'
Nov 14 17:21:59 hostname mysqld[16906]: 2016-11-14 17:21:59 139679923865344 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --binlog '/var/log/mysql/mariadb-bin' --gtid '3446d879-aa8e-11e6-9f87-ae867f6719f6:0' --gtid-domain-id '0': 32 (Broken pipe)
Nov 14 17:21:59 hostname mysqld[16906]: 2016-11-14 17:21:59 139679923865344 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --binlog '/var/log/mysql/mariadb-bin' --gtid '3446d879-aa8e-11e6-9f87-ae867f6719f6:0' --gtid-domain-id '0'
Nov 14 17:22:00 hostname mysqld[16906]: 2016-11-14 17:22:00 139680953546496 [Warning] WSREP: 0.0 (hostname): State transfer to 1.0 (mariadb2-stg-aws-eu-west-1b) failed: -32 (Broken pipe)
Nov 14 17:22:00 hostname mysqld[16906]: 2016-11-14 17:22:00 139680953546496 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 0)

I am using the xtrabackup version from the Percona repository. However, I am not able to get beyond the log messages above.

Any help is greatly appreciated,
