Buy Percona SupportBuy Now

MariaDB 10.2 CHECK and DEFAULT clauses

In this blog post, we’ll look at the MariaDB 10.2 CHECK and DEFAULT clauses.

MariaDB 10.2 includes some long-awaited features. In this blog, we are going to discuss the improvements to some table definitions: the DEFAULT clause and the CHECK constraints. These clauses describe columns default values and rules for data validation.

Note that MariaDB 10.2 is still in alpha stage. This article describes the current state of these features, which could change before MariaDB 10.2 becomes GA.

The DEFAULT clause

The DEFAULT clause has always been supported in MariaDB/MySQL, but traditionally it only accepted literal values (like “hello world” or “2”). MariaDB 10.2 removes this limitation, so DEFAULT can now accept most SQL expressions. For example:

  • fiscal_year SMALLINT DEFAULT (YEAR(NOW()))
  • valid_until DATE DEFAULT (NOW() + INTERVAL 1 YEAR)
  • owner VARCHAR(100) DEFAULT (USER())

Additionally, MariaDB 10.2 allows you to set a DEFAULT value for the TEXT and BLOB columns. This was not possible in previous versions. While this might look like a small detail, it can be hard to add a column to an existing table that is used by production applications, if it cannot have a default value.

The DEFAULT clause has some very reasonable limitations. For example, it cannot contain a subquery or a stored function. An apparently strange limitation is that we can mention another column in DEFAULT only if it comes first in the CREATE TABLE command.

Note that DEFAULT can make use of non-deterministic functions even if the binary log uses the STATEMENT format. In this case, default non-deterministic values will be logged in the ROW format.

CHECK constraints

CHECK constraints are SQL expressions that are checked when a row is inserted or updated. If this expression result is false (0, empty string, empty date) or NULL, the statement will fail with an error. The error message states which CHECK failed in a way that is quite easy to parse:

ERROR 4022 (23000): CONSTRAINT `consistent_dates` failed for `test`.`author`

Some example of CHECK constraints:

  • CONSTRAINT non_empty_name CHECK (CHAR_LENGTH(name) > 0)
  • CONSTRAINT consistent_dates CHECK (birth_date IS NULL OR death_date IS NULL OR birth_date < death_date)
  • CONSTRAINT past_date CHECK (birth_date < NOW())

A possible trick is checking that a column is different from its default value. This forces users to assign values explicitly.

CHECK constraints cannot be added or altered. It is only possible to drop them. This is an important limitation for production servers.

Another limitation is that CHECK metadata are not accessible via the INFORMATION_SCHEMA. The only way to find out if a table has CHECK clauses is parsing the output of SHOW CREATE TABLE.

The exact behavior of CHECK constraints in a replication environment depends on the master binary log format. If it is STATEMENT, the slaves will apply CHECK constraints to events received from the master. If it is ROW, only the master will need to apply constraints, because failed statements will not be replicated.

Thus, in all cases, we recommend having identical constraints on master and slaves, and only using deterministic constraints.


While I didn’t run a professional benchmark, I can say that both DEFAULT and CHECK clauses don’t have a noticeable impact on a simple test where we insert one million rows (on my local machine).

However, these clauses evaluate an SQL expression each time a row is inserted or updated. The overhead is at least equal to the SQL expression performance. If high-performing writes are important, you will probably not want to use complex data validation.

To check how fast an expression is, we can use the BENCHMARK() function:

MariaDB [(none)]> SELECT BENCHMARK(10000000, (555 / 100 * 20)); +---------------------------------------+ | BENCHMARK(10000000, (555 / 100 * 20)) | +---------------------------------------+ | 0 | +---------------------------------------+ 1 row in set (1.36 sec) MariaDB [(none)]> SELECT BENCHMARK(100000000, MD5('hello world')); +------------------------------------------+ | BENCHMARK(100000000, MD5('hello world')) | +------------------------------------------+ | 0 | +------------------------------------------+ 1 row in set (14.84 sec)

In this example, we executed the specified expressions ten million times. BENCHMARK() always returns 0, but what we want to check is the execution time. We can see for example that evaluating MD5(‘hello world’) takes less than 0.000002 seconds. In some cases, we may want to retry the same expressions with different parameters (longer strings, higher numbers, etc.) to check if the execution times varies.

Unfortunately, we don’t have a status variable which tells us how many times MariaDB evaluated CHECK clauses. If our workload performs many writes, that variable could help us to find out if CHECK constraints are slowing down inserts. Maybe the MariaDB team can take this as a suggestion for the future.

MySQL Query Analytics configuration not found

Lastest Forum Posts - 15 hours 20 min ago
Hi, all.
I've started queries:
pmm-admin list
queries host 42001 YES root:***@unix(/var/lib/mysql/mysql.sock) query_source=slowlog
mysql host 42002-42004 YES root:***@unix(/var/lib/mysql/mysql.sock)

But in QueryAnalytics I don't see any queries, instead, I see message:

QAN API error: "MySQL Query Analytics configuration not found".
Check the /var/log/qan-api.log file in docker container for more information.

docker exec 1af5acecd98b cat /var/log/qan-api.log
Percona Query Analytics API is running...
2016/07/29 08:50:34 config.go:79: Loading /usr/local/percona/qan-api/src/
2016/07/29 08:50:34 config.go:119: API root dir /usr/local/percona/qan-api/src/
INFO 2016/07/29 08:50:34 revel.go:206: Initialized Revel v0.12.0dev (2015-01-19) for >= go1.3
INFO 2016/07/29 08:50:34 main.go:31: Running revel server
Listening on
INFO 2016/07/29 09:30:28 ws.go:107: [Agent.Cmd] agent_id=2 0.0.9 http://host: connected

What did I wrong?

Can't start a joiner due to SST problems

Lastest Forum Posts - 16 hours 16 min ago
I've successfully bootstraped a Cluster (mysqld Ver 5.6.30-76.3-56 for debian-linux-gnu on x86_64 (Percona XtraDB Cluster (GPL), Release rel76.3, Revision aa929cb, WSREP version 25.16, wsrep_25.16)).

When I start a second node there is a problem with SST via xtrabackup-v2:
Code: 2016-07-29 09:31:58 12359 [Note] WSREP: New cluster view: global state: 3f3843a3-54aa-11e6-af38-5b145f899741:195614, view# 4: Primary, number of nodes: 2, my index: 1, protocol version 3 2016-07-29 09:31:58 12359 [Warning] WSREP: Gap in state sequence. Need state transfer. 2016-07-29 09:31:58 12359 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'XYZ.40.39.13' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '12359' '' ' 2016-07-29 09:31:58 12359 [ERROR] execlp() failed: Permission denied 2016-07-29 09:31:58 12359 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'XYZ.40.39.13' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '12359' '' Read: '(null)' 2016-07-29 09:31:58 12359 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'XYZ.40.39.13' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '12359' '' : 1 (Operation not permitted) 2016-07-29 09:31:58 12359 [ERROR] WSREP: Failed to prepare for 'xtrabackup-v2' SST. Unrecoverable. 2016-07-29 09:31:58 12359 [ERROR] Aborting There is no special config, just:
Code: wsrep_provider = /usr/lib/ wsrep_cluster_name = "legacy_cluster" wsrep_cluster_address = "gcomm://ZYX.4.123.242,XYZ.40.39.13" wsrep_node_name = legacy6 wsrep_node_address = XYZ.40.39.13 #wsrep_sst_method = xtrabackup-v2 #wsrep_sst_auth = sst:secret How could I further debug this problem? I've also tried SST with rsync, but got the same error messages.

Percona Monitoring and Management 1.0.2 Beta

Latest MySQL Performance Blog posts - July 28, 2016 - 12:39pm

Percona announces the release of Percona Monitoring and Management 1.0.2 Beta on 28 July 2016.

Like prior versions, PMM is distributed through Docker Hub and is free to download. Full instructions for download and installation of the server and client are available in the documentation.

Notable changes to the tool include:

  • Upgraded to Grafana 3.1.0.
  • Upgraded to Prometheus 1.0.1.
  • Set default metrics retention to 30 days.
  • Eliminated port 9001. Now the container uses only one configurable port, 80 by default.
  • Eliminated the need to specify ADDRESS variable when creating Docker container.
  • Completely re-wrote pmm-admin with more functions.
  • Added ability to stop all services using the new pmm-admin.
  • Added support to name instances using the new pmm-admin.
  • Query Analytics Application updates:
    • Redesigned queries profile table
    • Redesigned metrics table
    • Redesigned instance settings page
    • Added sparkline charts
    • Added ability to show more than ten queries
  • Various updates for MongoDB dashboards.

The full release notes are available in the documentation. The documentation also includes details on installation and architecture.

A demonstration of the tool has been set up at

We have also implemented forums for the discussion of PMM.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Some screen shots of the updates:

Note the new sparkline that shows the current load in context (so you know if the number is higher/normal/lower than normal), and the option to “Load next 10 queries” at the bottom of the listing.

Our admin tool was completely re-written with new functions:

pmm-admin –help output


pmm-admin list command output


pmm-admin check-network output, which provides information on the status of the client’s network connection to the server.

Upcoming Webinar August 2 10:00 am PDT: MySQL and Ceph

Latest MySQL Performance Blog posts - July 28, 2016 - 11:27am

Join Brent Compton, Kyle Bader and Yves Trudeau on August 2, 2016 at 10 am PDT (UTC-7) for a MySQL and Ceph webinar.

Many operators select OpenStack as their control plane of choice for providing both internal and external IT services. The OpenStack user survey repeatedly shows Ceph as the dominant backend for providing persistent storage volumes through OpenStack Cinder. When building applications and repatriating old workloads, developers are discovering the need to provide OpenStack infrastructure database services. Given MySQL’s ubiquity, and it’s reliance on persistent storage, it is of utmost importance to understand how to achieve the performance demanded by today’s applications. Databases like MySQL can be incredibly IO intensive, and Ceph offers a great opportunity to go beyond the limitations presented by a single scale-up system. Since Ceph provides a mutable object store with atomic operations, could MySQL store InnoDB pages directly in Ceph?

This talk reviews the general architecture of Ceph, and then discusses benchmark results from small to mid-size Ceph clusters. These benchmarks lead to the development of prescriptive guidance around tuning Ceph storage nodes (OSDs), the impact the amount of physical memory, and the presence of SSDs, high-speed networks or RAID controllers.

Click here to register now. Speakers: Brent Compton
Director Storage Solution Architectures, Red Hat
Brent Compton is Director Storage Solution Architectures at Red Hat. He leads the team responsible for building Ceph and Gluster storage reference architectures with Red Hat Storage partners. Before Red Hat, Brent was responsible for emerging non-volatile memory software technologies at Fusion-io. Previous enterprise software leadership roles include VP Product Management at Micromuse (now IBM Tivoli Netcool) and Product Marketing Director within HP’s OpenView software division. Brent also served as Director Middleware Development Platforms at the LDS Church and as CIO at Joint Commission International. Brent has a tight-knit family, and can be found on skis or a mountain bike whenever possible. Kyle Bader
Sr Solution Architect, Red Hat
Kyle Bader, a Red Hat senior architect, provides expertise in the design and operation of petabyte-scale storage systems using Ceph. He joined Red Hat as part of the 2014 Inktank acquisition. As a senior systems engineer at DreamHost, he helped implement, operate, and design Ceph and OpenStack-based systems for DreamCompute and DreamObjects cloud products. Yves Trudeau
Principal Architect
Yves is a Principal Consultant at Percona, specializing in MySQL High-Availability and scaling solutions. Before joining Percona in 2009, he worked as a senior consultant for MySQL AB and Sun Microsystems, assisting customers across North America with NDB Cluster and Heartbeat/DRBD technologies. Yves holds a Ph.D. in Experimental Physics from Université de Sherbrooke. He lives in Québec, Canada with his wife and three daughters.

Percona Live Europe Amsterdam 2016 Tutorial Schedule is Up!

Latest MySQL Performance Blog posts - July 27, 2016 - 3:09pm

This blog post lists the Percona Live Europe Amsterdam 2016 tutorial schedule.

We are excited to announce that the tutorial schedule for the Percona Live Europe Amsterdam Open Source Database Conference 2016 is up!

The Percona Live Europe Amsterdam Open Source Database Conference is the premier event for the diverse and active open source community, as well as businesses that develop and use open source software. The conferences have a technical focus with an emphasis on the core topics of MySQL, MongoDB, and other open source databases. Tackling subjects such as analytics, architecture and design, security, operations, scalability and performance, Percona Live provides in-depth discussions for your high-availability, IoT, cloud, big data and other changing business needs.

Tackling subjects such as analytics, architecture and design, security, operations, scalability and performance, Percona Live Europe provides in-depth discussions for your high-availability, IoT, cloud, big data and other changing business needs. This conference is an opportunity to network with peers and technology professionals by bringing together accomplished DBA’s, system architects and developers from around the world to share their knowledge and experience – all to help you learn how to tackle your open source database challenges in a whole new way. These tutorials are a must for any data performance professional!

The Percona Live Europe Open Source Database Conference is October 3-5 at the Mövenpick Hotel Amsterdam City Centre.

Click through to the tutorial link right now, look them over, and pick which sessions you want to attend. Discounted passes available below!

Tutorial List: Early Bird Discounts

Just a reminder to everyone out there: our Early Bird discount rate for the Percona Live Europe Amsterdam Open Source Database Conference is only available ‘til August 8, 2016, 11:30 pm PST! This rate gets you all the excellent and amazing opportunities that Percona Live offers, at a very reasonable price!

Sponsor Percona Live

Become a conference sponsor! We have sponsorship opportunities available for this annual MySQL, MongoDB and open source database event. Sponsors become a part of a dynamic and growing ecosystem and interact with hundreds of DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solutions vendors, and entrepreneurs who attend the event.

Multi-Master Replication Manager for MySQL

Lastest Forum Posts - July 27, 2016 - 12:23pm
1.- Is there a course or training that you provide for this solution?

2.-Is this solution supports MyISAM storage engine? this question is important.

3.- Is Redhat (RHEL) is compatible with this solution?

systemd and multiple instances

Lastest Forum Posts - July 27, 2016 - 11:52am

I can see from this page how to setup multiple instances using systemd, but, my issue is how can I can use the effective my.cnf for each instance?

Say I have the following in /etc/my.cnf:

.. config ..

.. config ..

If I want to run mysql_upgrade for example, I can't because it doesn't understand which instance I'm using if I do mysql_upgrade --defaults-file=/etc/my.cnf because there doesn't seem to be a way to target each instance specifically.

Monitoring MongoDB with Nagios

Latest MySQL Performance Blog posts - July 27, 2016 - 7:08am

In this blog, we’ll discuss monitoring MongoDB with Nagios.

There is a significant amount of talk around graphing MongoDB metrics using things like Prometheus, Data Dog, New Relic, and Ops Manager from MongoDB Inc. However, I haven’t noticed a lot of talk around “What MongoDB alerts should I be setting up?”

While building out Percona’s remote DBA service for MongoDB, I looked at Prometheus’s AlertManager. After reviewing it, I’m not sure it’s quite ready to be used exclusively. We needed to decide quickly if there are better Nagios checks on the market, or did I need to write my own?

In the end, we settled on a hybrid approach. There are some good frameworks, but we need to create or tweak some of the things needed for an “SEV 1-” or “SEV 2-” type issue (which are most important to me). One of the most common problems for operations, Ops, DevOps, DBA teams and most engineering is alert spam. As such I wanted to be very careful to only alert on the things pointing to immediate dangers or current outages. As a result, we have now added to the GitHub for Percona Monitoring Plugins. Since we use Grafana and Prometheus for metrics and graphing, there are no accompanying Catci information templates. In the future, we’ll need to decide how this will change PMP overtime. In the meantime, we wanted to make the tool available now and worry about some of the issues later on.

As part of this push, I want to give you some real world examples of how you might use this tool. There are many options available to you, and Nagios is still a bit green in regards to making those options as user-friendly as our tools are.

Usage: [options] Options: -h, --help show this help message and exit -H HOST, --host=HOST The hostname you want to connect to -P PORT, --port=PORT The port mongodb is running on -u USER, --user=USER The username you want to login as -p PASSWD, --password=PASSWD The password you want to use for that user -W WARNING, --warning=WARNING The warning threshold you want to set -C CRITICAL, --critical=CRITICAL The critical threshold you want to set -A ACTION, --action=ACTION The action you want to take. Valid choices are (check_connections, check_election, check_lock_pct, check_repl_lag, check_flushing, check_total_indexes, check_balance, check_queues, check_cannary_test, check_have_primary, check_oplog, check_index_ratio, check_connect) Default: check_connect -s SSL, --ssl=SSL Connect using SSL -r REPLICASET, --replicaset=REPLICASET Connect to replicaset -c COLLECTION, --collection=COLLECTION Specify the collection in check_cannary_test -d DATABASE, --database=DATABASE Specify the database in check_cannary_test -q QUERY, --query=QUERY Specify the query, only used in check_cannary_test --statusfile=STATUS_FILENAME File to current store state data in for delta checks --backup-statusfile=STATUS_FILENAME_BACKUP File to previous store state data in for delta checks --max-stale=MAX_STALE Age of status file to make new checks (seconds)

There seems to be a huge amount going on here, but let’s break it down into a few categories:

  • Connection options
  • Actions
  • Action options
  • Status options

Hopefully, this takes some of the scariness out of the script above.

Connection options
  • Host / Port Number
    • Pretty simple, this is just the host you want to connect to and what TCP port it is listening on.
  • Username and Password
    • Like with Host/Port, this is some of your normal and typical Mongo connection field options. If you do not set both the username and password, the system will assume auth was disabled.
  • SSL
    • This is mostly around the old SSL support in Mongo clients (which was a boolean). This tool needs updating to support the more modern SSL connection options. Use this as a “deprecated” feature that might not work on newer versions.
  • ReplicaSet
    • Very particular option that is only used for a few checks and verifies that the connection uses a replicaset connection. Using this option lets the tool automatically find a primary node for you, and is helpful to some checks specifically around replication and high availability (HA):
      • check_election
      • check_repl_lag
      • check_cannary_test
      • chech_have_primary
      • check_oplog
Actions and what they mean
  • check_connections
    • This parameter refers to memory usage, but beyond that you need to know if your typical connections suddenly double. This indicates something unexpected happened in the application or database and caused everything to reconnect. It often takes up to 10 minutes for those old connections to go away.
  • check_election
    • This uses the status file options we will cover in a minute, but it checks to see if the primary from the last check differs from the current found primary. If so, it alerts. This check should only have a threshold of one before it alarms (as an alert means an HA event occurred).
  • check_lock_pct
    • MMAP only, this engine has a write lock on the whole collection/database depending on the version. This is a crucial metric to determine if MMAP writes are blocking reads, meaning you need to scale the DB layer in some way.
  • check_repl_lag
    • Checks the replication stream to understand how lagged a given node is the primary. To accomplish this, it uses a fake record in the test DB to cause a write. Without this, a read-only system would look lagged artificially as no new oplog entries get created.
  • check_flushing
    • A common issue with MongoDB is very long flush times, causing a system halt. This is a caused by your disk subsystem not keeping up, and then the DB having to wait on flushing to make sure writes get correctly journaled.
  • check_total_indexes
    • The more indexes you have, the more the planner has to work to determine which index is a good fit. This increases the risk that the recovery of a failure will take a long time. This is due to the way a restore builds indexes and how MongoDB can only make one index at a time.
  • check_balance
    • While MongoDB should keep things in balance across a cluster, many things can happen: jumbo chunks, a disabled balancer being, constantly attempting to move the same chunk but failing, and even adding/removing sharding. This alert is for these cases, as an imbalance means some records might get served faster than others. It is purely based on the chunk count that the MongoDB balancer is also based on, which is not necessarily the same as disk usage.
  • check_queues
    • No matter what engine you have selected, a backlog of sustained reads or writes indicates your DB layer is unable to keep up with demand. It is important in these cases to send an alert if the rate is maintained. You might notice this is also in our Prometheus exporter for graphics as both trending and alerting are necessary to watch in a MongoDB system.
  • check_cannary_test
    • This is a typical query for the database and then used to set critical/warning levels based on the latency of the returned query. While not as accurate as full synthetic transactions, queries through the application are good to measure response time expectations and SLAs.
  • check_have_primary
    • If we had an HA event but failed to get back up quickly, it’s important to know if a new primary is causing writes to error on the system. This check simply determines if the replica set has a primary, which means it can handle reads and writes.
  • check_oplog
    • This check is all about how much oplog history you have. This is much like measuring how much history you have in MySQL blogs. The reason this is important is when recovering from a backup and performing a point in time recovery, you can use the current oplog if the oldest timestamp in the oplog is newer than the backup timestamp. As a result, this is normal three times the backup interval you use to guarantee that you have plenty of time to find the newest recovery and then do the recovery.
  • check_index_ratio
    • This is an older metric that modern MongoDB versions will not find useful, but in the past, it was a good way to understand the percentage of queries not handled by an index.
  • check_connect
    • A very basic check to ensure it can connect (and optionally login) to MongoDB and verify the server is working.
Status File options

These options rarely need to be changed but are present in case you want to store the status on an SHM mount point to avoid actual disk writes.

  • statusfile
    • This is where a copy of the current rs.status, serverStatus and other command data is stored
  • backup-statusfile
    • Like status_file, but status_file is moved here when a new check is done. These two objects can then be compared to find the delta between two checkpoints.
  • max-stale
    • This is the amount of age for which an old file is still valid. Deltas older then this aren’t allowed and exist to protect the system from will assumption when a statusfile is hours or days old.

If you have any questions on how to use these parameters, feel free to let us know. In the code, there is also a defaults dictionary for most of these options so that in many cases setting warning and critical level are not needed.

Can't restart one node

Lastest Forum Posts - July 27, 2016 - 4:16am
Hi everyone, I'm nervous as can be. This cluster has been working fine for a year, after I somehow reset it to position zero and lost everything. (Thus the nervousness).

I have one server (in a cluster of 3) that is showing disk errors, so I took one down for needed maintenance in advance of doing the disk swaps.

Now it won't come back.

I swear nothing has changed, but as this system is incredibly reliable 'til it's not, I can't guarantee.

The lines that are confusing me in the error log are:
Connecting to MySQL server host: localhost, user: (null), password: not set, port: 0, socket: /var/lib/mysql/mysql.sock
-innobackupex-backup: Using server version 5.6.28-76.1-56-log
-innobackupex-backup: innobackupex version 2.3.4 based on MySQL server 5.6.24 Linux (x86_64) (revision id: e80c779)
-innobackupex-backup: xtrabackup: uses posix_fadvise().
-innobackupex-backup: xtrabackup: cd to /var/lib/mysql
-innobackupex-backup: xtrabackup: open files limit requested 0, set to 5000
-innobackupex-backup: xtrabackup: using the following InnoDB configuration:
-innobackupex-backup: xtrabackup: innodb_data_home_dir = ./
-innobackupex-backup: xtrabackup: innodb_data_file_path = ibdata1:12M:autoextend
-innobackupex-backup: xtrabackup: innodb_log_group_home_dir = ./
-innobackupex-backup: xtrabackup: innodb_log_files_in_group = 2
-innobackupex-backup: xtrabackup: innodb_log_file_size = 20971520
-innobackupex-backup: xtrabackup: using O_DIRECT
-innobackupex-backup: 160727 07:01:40 >> log scanned up to (70923364024)
-innobackupex-backup: xtrabackup: Generating a list of tablespaces
-innobackupex-backup: 2016-07-27 07:01:40 7f6534131740 InnoDB: Operating system error number 13 in a file operation.
-innobackupex-backup: InnoDB: The error means mysqld does not have the access rights to
-innobackupex-backup: InnoDB: the directory.
-wsrep-sst-donor: innobackupex finished with error: 1. Check /var/lib/mysql//innobackup.backup.log
-wsrep-sst-donor: Cleanup after exit with status:22

If I read this correctly, it's trying to log in with no username or password, when I have it clearly listed in the /etc/my.cnf
wsrep_sst_auth = x:x

It also says it can't access anything in /var/lib/mysql, but that directory is owned by mysql:mysql. I even changed it to 777 to no avail.

There is no /var/lib/mysql/innobackup.backup.log - so something is definitely weird. The double slash part also seems odd - there is no trailing slash in /etc/my.cnf

Please point me in the right direction. My pants are down, 1 of 3 in the cluster is down, and one of the remaining 2 has a wonky disk. I don't want another disaster like last year!

Testing Samsung storage in tpcc-mysql benchmark of Percona Server

Latest MySQL Performance Blog posts - July 26, 2016 - 11:00am

This blog post will detail the results of Samsung storage in tpcc-mysql benchmark using Percona Server.

I had an opportunity to test different Samsung storage devices under tpcc-mysql benchmark powered by Percona Server 5.7. You can find a summary with details here

I have in my possession:

  • Samsung 850 Pro, 2TB: This is a SATA device and is positioned as consumer-oriented, something that you would use in a high-end user desktop. As of this post, I estimate the price of this device as around $430/TB.
  • Samsung SM863, 1.92TB: this device is also a SATA, and positioned for a server usage. The current price is about $600/TB. 
  • Samsung PM1725, 800GB: This is an NVMe device, in a 2.5″ form factor, but it requires a connection to a PCIe slot, which I had to allocate in my server. The device is high-end, oriented for server-side and demanding workloads. The current price is about $1300/TB.

I am going to use 1000 warehouses in the tpcc-mysql benchmarks, which corresponds roughly to a data size of 100GB.

This benchmark varies the innodb_buffer_pool_size from 5GB to 115GB. With 5GB buffer pool size only a very small portion of data fits into memory, which results in intensive foreground IO reads and intensive background IO writes. With 115GB almost all data fits into memory, which results in very small (or almost zero) IO reads and moderate background IO writes.

All buffer pool sizes in the middle of the interval correspond to resulting IO reads and writes. For example, we can see the read to write ratio on the chart below (received for the PM1725 device) with different buffer pool sizes:

We can see that for the 5GB buffer pool size we have 56000 read IOPs operations and 32000 write IOPs. For 115GB, the reads are minimal at about 300 IOPS and the background writes are at the 20000 IOPs level. Reads gradually decline with the increasing buffer pool size.

The charts are generated with the Percona Monitoring and Management tools.


Let’s review the results. The first chart shows measurements taken every one second, allowing us to see the trends and stalls.

If we take averages, the results are:

In table form (the results are in new order transactions per minute (NOTPM)):

bp, GB pm1725 sam850 sam863 pm1725 / sam863 pm1725 / sam850 5 42427.57 1931.54 14709.69 2.88 21.97 15 78991.67 2750.85 31655.18 2.50 28.72 25 108077.56 5156.72 56777.82 1.90 20.96 35 122582.17 8986.15 93828.48 1.31 13.64 45 127828.82 12136.51 123979.99 1.03 10.53 55 130724.59 19547.81 127971.30 1.02 6.69 65 131901.38 27653.94 131020.07 1.01 4.77 75 133184.70 38210.94 131410.40 1.01 3.49 85 133058.50 39669.90 131657.16 1.01 3.35 95 133553.49 39519.18 132882.29 1.01 3.38 105 134021.26 39631.03 132126.29 1.01 3.38 115 134037.09 39469.34 132683.55 1.01 3.40 Conclusion

The Samsung 850 obviously can’t keep with the more advanced SM863 and PM1725. The PM1725 shows a greater benefit with smaller buffer pool sizes. In cases using large amounts of memory, there is practically no difference with SM863. The reason is that with big buffer pool sizes, MySQL does not push IO subsystem much to use all the PM1725 performance.

For the reference, my.cnf file is

[mysqld] datadir=/var/lib/mysql socket=/tmp/mysql.sock ssl=0 symbolic-links=0 sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES # general thread_cache_size=2000 table_open_cache = 200000 table_open_cache_instances=64 back_log=1500 query_cache_type=0 max_connections=4000 # files innodb_file_per_table innodb_log_file_size=15G innodb_log_files_in_group=2 innodb_open_files=4000 innodb_io_capacity=10000 loose-innodb_io_capacity_max=12000 innodb_lru_scan_depth=1024 innodb_page_cleaners=32 # buffers innodb_buffer_pool_size= 200G innodb_buffer_pool_instances=8 innodb_log_buffer_size=64M # tune innodb_doublewrite= 1 innodb_support_xa=0 innodb_thread_concurrency=0 innodb_flush_log_at_trx_commit= 1 innodb_flush_method=O_DIRECT_NO_FSYNC innodb_max_dirty_pages_pct=90 join_buffer_size=32K sort_buffer_size=32K innodb_use_native_aio=0 innodb_stats_persistent = 1 # perf special innodb_adaptive_flushing = 1 innodb_flush_neighbors = 0 innodb_read_io_threads = 16 innodb_write_io_threads = 8 innodb_purge_threads=4 innodb_adaptive_hash_index=0 innodb_change_buffering=none loose-innodb-log_checksum-algorithm=crc32 loose-innodb-checksum-algorithm=strict_crc32 loose-innodb_sched_priority_cleaner=39 loose-metadata_locks_hash_instances=256

Clean shutdown after xtrabackup completion

Lastest Forum Posts - July 26, 2016 - 7:47am
I'm using xtrabackup to do an online data migration between 2 servers (xtrabackup piped to another server). The complication is that we have 3 slaves that i'd like to keep in sync during the process. The easiest way I can think of is to have xtrabackup shutdown the master server after the backup completes so I end up with 2 servers that are identical, then i can just bring the new master online and tell the slaves to talk to it without having to refresh them. Is this possible?

Innobackupex extremely slow running since 5 days

Lastest Forum Posts - July 26, 2016 - 6:33am

I am trying to backup a 3TB mysql database on on EC2 instance using below command

innobackupex --user=root --password='xxxxxxxxxxxxx' --stream=tar /srv/mysqldata_new/dumps/rds3data | gzip -9 | split -d --bytes=322122547200 - /srv/mysqldata_new/dumps/rds3data/backup.tar.gz
Below is the output that is running continuously

>> log scanned up to (880937057492) >> log scanned up to (880937057492) >> log scanned up to (880937057492)

Its been 5 days since I start the backup and till now 340 GB backed up. Can someone help me on this? How to find estimation time for the backup to complete or how to check percentage of backup completed? Is there any way to speed up the process? I have 3 big tables of size 1 TB,1 TB and 700 GB.

Sai Mandadi

How Does One Reset a Node?

Lastest Forum Posts - July 25, 2016 - 5:01pm
Throughout my development process, I need to reset nodes to either re-cluster, or use for different purposes. How does one reset the "galera" cluster state? Regardless of what I do, the grastate.dat file is always populated with the last cluster information (seqno, etc).

Is there any command or set of commands one can run to reset a node to forget about its last clustering? "RESET MASTER", "flush-logs", etc?

It is as simple as just deleting the file? Are there other files that must be removed/reset/etc?

Percona XtraBackup 2.4.4 is now available

Latest MySQL Performance Blog posts - July 25, 2016 - 11:05am

Percona announces the GA release of Percona XtraBackup 2.4.4 on July 25th, 2016. You can download it from our download site and from apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

New Features:

  • Percona XtraBackup has been rebased on MySQL 5.7.13.

Bugs Fixed:

  • Percona XtraBackup reported the difference in the actual size of the system tablespace and the size which was stored in the tablespace header. This check is now skipped for tablespaces with autoextend support. Bug fixed #1550322.
  • Because Percona Server 5.5 and MySQL 5.6 store the LSN offset for large log files at different places inside the redo log header, Percona XtraBackup was trying to guess which offset is better to use by trying to read from each one and compare the log block numbers and assert lsn_chosen == 1 when both LSNs looked correct, but they were different. Fixed by improving the server detection. Bug fixed #1568009.
  • Percona XtraBackup didn’t correctly detect when tables were both compressed and encrypted. Bug fixed #1582130.
  • Percona XtraBackup would crash if the keyring file was empty. Bug fixed #1590351.
  • Backup couldn’t be prepared when the size in cache didn’t match the physical size. Bug fixed #1604299.
  • Free Software Foundation address in copyright notices was outdated. Bug fixed #1222777.
  • Backup process would fail if the datadir specified on the command-line was not the same as one that is reported by the server. Percona XtraBackup now allows the datadir from my.cnf override the one from SHOW VARIABLES. xtrabackup prints a warning that they don’t match, but continues. Bug fixed #1526467.
  • With upstream change of maximum page size from 16K to 64K, the size of incremental buffer became 1G. Which increased the requirement to 1G of RAM in order to prepare the backup. While in fact there is no need to allocate such a large buffer for smaller pages. Bug fixed #1582456.
  • Backup process would fail on MariaDB Galera cluster operating in GTID mode if binary logs were in non-standard directory. Bug fixed #1517629.

Other bugs fixed: #1583717, #1583954, and #1599397.

Release notes with all the bugfixes for Percona XtraBackup 2.4.4 are available in our online documentation. Please report any bugs to the launchpad bug tracker.

MongoDB Consistent Backups

Latest MySQL Performance Blog posts - July 25, 2016 - 10:35am

In this post, I’m going to discuss MongoDB consistent backups, and how to achieve them.

You might have read before that MongoDB backup is not consistent. But what if I told you there is a tool that could make them consistent. What if this tool also would make it cluster-wide consistent, automatically compress the backup, become the first step toward continually incremental recording, notify your monitoring system and upload the backup to cloud storage for you?

It’s all TRUE!

Recently Percona-Labs created a new repository aimed at exactly these issues. We hope it will eventually grow into something that becomes part of the officially supported tools (like Percona Toolkit and  Percona’s Xtrabackup utility). Before we get into how it works, let’s talk about why we need it and its key highlights. Then (for all the engineering types reading this) we can discuss what is does and why.

Why do we need a consistent backup tool?

The first thing to note is you absolutely can’t have a consistent backup on a working system unless your node is in a replicaset. (You could even have a single node replicaset for this to be accurate.) Why? Consistency requires an operations log to say what changes occurred from the first point in the backup to the last point. This lets us ensure we are consistent to the end timestamp of the backup. We are unable to verify consistency when the MongoDB backup started without the ability to take a “snapshot” of data and then save the data while other changes occur. MongoDB does not have ACID-like isolation in this way. However, it can be consistent to the backup endpoint by applying any deltas at the end of the backup restore process.

You might say, “but mongodump already provides --oplog for this feature.” You are right: it does, and it works great if you only have a single replicaset to backup. When we bring sharding into the mix, however, things get vastly more complicated. It ignores that flag and hits your primaries:

In the diagram above you can see the backup and oplog recording for the first shard ended long before the second shard. As such, the consistency point needed is nowhere close to being covered by the red line. Even if all your shards are the same size, there would be some level of variance due to network, disk, CPU and memory speeds. The new tool helps you here by keeping track of the dumps, but also by having a thread recording the oplog for all shards until the last shard finishes. This ensures that all shards can be synced to the point in time where the last shard finished. At that moment in time, we have a consistent backup across all the shards. As you can see below, the oplog finished watching both shards after the last shard finish. On recovery, they remain in sync.

You might ask, “well what about the meta-data stored in the config servers.” This is a great quest, as the behavior differs in our tool depending on if you’re using MongoDB 3.2’s new Config Servers as a replica set feature, or a legacy config server approach.

In the legacy mode, we fsyncAndLock the config servers just long enough to record a server config data dump. Then we stop the oplog tailer threads for all the shards. After that, and after the oplog tailers finish, we unlock the config server. This ensures we remove the race conditions that could occur if it took longer than expected to close an oplog cursor. However, if we run in 3.2 mode, the config servers act just like another shard. They get dumped at the same time, and the oplog just gets tailed until we complete the data shard dumps. The newest features available to MongoDB Community, MongoDB Enterprise, and Percona Server for MongoDB 3.2 make the process much simpler.

Key Takeaways from new tool

  1. Not yet an official Percona tool, but being used already by people as it’s just a wrapper to run multiple mongo dumps for you.
  2. If you execute the make setup, it outputs a single binary file that needs only python2.7 installed on your database system, even though under the hood it’s running many python modules in a virtualenv
  3. Dumps all shard in parallel and keeps tailing the oplog until all dumps are complete
  4. Handled backing up metadata for old and new config server topologies
  5. Can currently upload to S3, but more cloud storage is coming
  6. Backups compressed by default
  7. Uses the cluster_name,  time, and shard_name to make backup paths look like  /cluster1/<timestamp>/shard1.tgz, helping you keep things organized and letting you remove old backups by timestamp and cluster name.

Desired Roadmap

  • Mature into an officially support Percona product like  Xtrabackup
  • Fully Opensource and welcoming community improvements
  • Extending uploading to  CloudFiles by Rackspace, Azure ZRS, Google Cloud Storage and more
  • Complementary documentation on restores but can just natively use mongorestore tool also
  • Modular backup methods to extend to mongodump, LVM snapshots, ISCSI, EBS snapshots, MongoDB commands and more
  • Encryption before saving to disk
  • Partial backups and restores limit to specific databases and collections
  • Offline backup querying

Please be sure to check out the GitHub @mongodb_consistent_backup and log any issues or features requests.

Feel free to reach out to me on Twitter @dbmurphy_data or @percona with any questions or suggestions as well.

Mysql skipped some transactions from binary logs after recovered from defunct mode

Lastest Forum Posts - July 25, 2016 - 3:51am
On one of the slave we faced some issue like it skipped some transactions. After recovered of the EBS issue,

Due to this EBS issue mysql went to defunct mode. when it recovered we observed it skipped 16 mins transactions on binary log.And started replication from skipped position. Can you please confirm why this happened.

This salve is the master of 4 other slaves. So all slaves also affected due to this.

Moving from gtid on_permissive to on

Lastest Forum Posts - July 25, 2016 - 1:24am

I'm trying to move from 5.6 to 5.7. As part of that I want to have GTID fully on once I get to 5.7.

Right now I've got the following working setup:

master1 (5.6.19, gtid off)
- slave1m1 (5.6.19, gtid off, master = master1)
- newmaster1 (5.7.13 gtid on_permissive, master = master1)
- newslave1m1 (5.7.13 gtid on_permissive, master = newmaster1)

Now I've got more slaves to add which is fine but my question is I know I need on_permissve to get it replicating from a gtid off server but what steps should I follow to get the slaves of the newmaster1 fully gtid = on and then the newmaster1 gtid = on.

A full outage will need to occur at some point so that's OK if needed.


The donor node gets locked up during SST

Lastest Forum Posts - July 25, 2016 - 12:34am
Hi there,
Recently moved to percona xtradb cluster 5.6 but currently operating on standalone mode due to some issues with the cluster.
It seems the donor node gets blocked when a second node tries to join the cluster and full SST happens.
Using the mysqlcheck script on port 9200 from haproxy, 503 Service Unavailable is being thrown during this period. When SST is over, then the cluster services normally.
I understood the donor should be operable during this time from the documentation.
Can someone clarify if this is a bug or expected bahaviour please?

Error message about corrupted mysql.user in DB-Log after Restore of innobackupex bak

Lastest Forum Posts - July 22, 2016 - 2:57am
Hi all,

i did some tests with recovery of a database out of innobackupex backups.

- DB: 5.7.13
- innobackupex version 2.4.3
- Backup via innobackupex --stream=tar | ssh host "cat - > backup.tar"
- Restore on other location via tar xvif backup.tar
- innobackupex --apply-log
- correcting file ownership and permissions
- starting restored database

After that I get reproducible the following error message in the db log:

2016-07-22T09:38:03.215194Z 0 [ERROR] /opt/mysql/bin/mysqld: Table './mysql/user' is marked as crashed and should be repaired
2016-07-22T09:38:03.215396Z 0 [Warning] Checking table: './mysql/user'
2016-07-22T09:38:03.215441Z 0 [ERROR] 1 client is using or hasn't closed the table properly

Can someone help me as I couldn't find a hint for a solution?

Best regards

General Inquiries

For general inquiries, please send us your question and someone will contact you.