
remove channel for show slave status

Latest Forum Posts - 3 hours 45 min ago
Hello, I have configured 2 channels in replication; after that, I stopped one of them.
How can I remove the output for this channel from SHOW SLAVE STATUS?
RESET SLAVE FOR CHANNEL does not help for me.
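
For reference, a hedged suggestion (this assumes MySQL 5.7 multi-source replication; 'channel_1' below is a placeholder for your channel name): RESET SLAVE FOR CHANNEL only clears the channel's replication position, while RESET SLAVE ALL FOR CHANNEL removes the channel definition entirely, so it no longer shows up in SHOW SLAVE STATUS:

$ mysql -e "STOP SLAVE FOR CHANNEL 'channel_1';"
$ mysql -e "RESET SLAVE ALL FOR CHANNEL 'channel_1';"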

Tuning Linux for MongoDB: Automated Tuning on Redhat and CentOS

Latest MySQL Performance Blog posts - December 8, 2016 - 2:34pm

In a previous blog post: “Tuning Linux for MongoDB,” I covered several tunings for an efficient MongoDB deployment on Linux in Production. This post expands on that one.

While I felt “Tuning Linux for MongoDB” was a very useful blog post that produces a great baseline tuning, it bugged me how much effort and how many touch-points were required to achieve an efficient Linux installation for MongoDB. More importantly, I noticed some cases where the tunings (for example, changes to the disk I/O scheduler in /etc/udev.d) were ignored on some recent RedHat and CentOS versions. With these issues in mind, I started to investigate better solutions for achieving the tuned baseline.

Tuned

In RedHat (and thus CentOS) 7.0, a daemon called “tuned” was introduced as a unified system for applying tunings to Linux. tuned operates with simple, file-based tuning “profiles” and provides an admin command-line interface named “tuned-adm” for applying, listing and even recommending tuned profiles.
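
A quick tour of that interface (a sketch; it assumes the tuned package is installed and the tuned service is running):

$ tuned-adm list                          # list the available profiles
$ tuned-adm active                        # show the currently active profile
$ tuned-adm recommend                     # ask tuned which profile it recommends for this host
$ sudo tuned-adm profile network-latency  # apply a profile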

Some operational benefits of tuned:

  • File-based configuration – Profile tunings are contained in a single, consolidated file
  • Swappable profiles – Profiles are easily changed back/forth
  • Standards compliance – Using tuned profiles ensures tunings are not overridden or ignored

Note: If you use configuration management systems like Puppet, Chef, Salt, Ansible, etc., I suggest you configure those systems to deploy tunings via tuned profiles instead of applying tunings directly, as tuned will likely start to fight this automation, overriding the changes.

The default available tuned profiles (as of RedHat 7.2.1511) are:

  • balanced
  • desktop
  • latency-performance
  • network-latency
  • network-throughput
  • powersave
  • throughput-performance
  • virtual-guest
  • virtual-host

The profiles that are generally interesting for database usage are:

  • latency-performance

    “A server profile for typical latency performance tuning. This profile disables dynamic tuning mechanisms and transparent hugepages. It uses the performance governor for p-states through cpuspeed, and sets the I/O scheduler to deadline.”

  • throughput-performance

    “A server profile for typical throughput performance tuning. It disables tuned and ktune power saving mechanisms, enables sysctl settings that improve the throughput performance of your disk and network I/O, and switches to the deadline scheduler. CPU governor is set to performance.”

  • network-latency – Includes “latency-performance,” disables transparent_hugepages, disables NUMA balancing and enables some latency-based network tunings.
  • network-throughput – Includes “throughput-performance” and increases network stack buffer sizes.
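
To see exactly what a shipped profile does, and to check the live kernel settings these profiles touch, you can inspect the profile file and sysfs directly (the paths below are the usual ones on RedHat/CentOS 7; /dev/sdb is a placeholder disk):

$ cat /usr/lib/tuned/latency-performance/tuned.conf          # shipped profile definition
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor  # current CPU governor
$ cat /sys/block/sdb/queue/scheduler                         # current I/O scheduler
$ cat /sys/kernel/mm/transparent_hugepage/enabled            # transparent hugepages state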

I find “network-latency” is the closest match to our recommended tunings, but some additional changes are still required.

The good news is tuned was designed to be flexible, so I decided to make a MongoDB-specific profile: enter “tuned-percona-mongodb”.

tuned-percona-mongodb

tuned-percona-mongodb: https://github.com/Percona-Lab/tuned-percona-mongodb

“tuned-percona-mongodb” is a performance-focused tuned profile for MongoDB on Linux, and is currently considered experimental (no guarantees/warranties). It’s hosted in our Percona-Lab GitHub repo.

tuned-percona-mongodb applies the following tunings (from the previous tuning article) on a Redhat/CentOS 7+ host:

  • Disabling of transparent huge pages
  • Kernel network tunings (sysctls)
  • Virtual memory dirty ratio changes (sysctls)
  • Virtual memory “swappiness” (sysctls)
  • Block-device readahead settings (on all disks except /dev/sda by default)
  • Block-device I/O scheduler (on all disks except /dev/sda by default)
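
In tuned.conf terms, a profile of this kind is structured roughly as follows (an illustrative sketch only, not the actual Percona-Lab file; the values shown are placeholders):

[main]
include=network-latency

[vm]
transparent_hugepages=never

[sysctl]
vm.swappiness=1
vm.dirty_ratio=15
vm.dirty_background_ratio=5

The include line layers the custom tunings on top of an existing profile, which is part of why swapping profiles back and forth stays cheap.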

The profile also applies a few additional tunings that our previous tuning article didn’t cover.

After a successful deployment of this profile, only these recommendations are outstanding:

  1. Filesystem type and mount options:
    Tuned does not handle filesystem mount options; these need to be set manually in /etc/fstab (see the example entry after this list). To quickly summarize: we recommend the XFS or EXT4 filesystem type for MongoDB data when using the MMAPv1 or RocksDB storage engines, and XFS ONLY when using WiredTiger. For all filesystems, the mount options “rw,noatime” will reduce some activity.
  2. NUMA disabling or interleaving:
    Tuned does not handle NUMA settings and these still need to be handled via the MongoDB init script or the BIOS on/off switch.
  3. Linux ulimits:
    Tuned does not set Linux ulimit settings. However, Percona Server for MongoDB RPM packages do this for you at startup! See “LimitNOFILE” and “LimitNPROC” in “/usr/lib/systemd/system/mongod.service” for more information.
  4. NTP server:
    Tuned does not handle installation of RPM packages or enabling of services. You will need to install the “ntp” package and enable/start the “ntpd” service manually:

    sudo yum install ntp
    sudo systemctl enable ntpd
    sudo systemctl start ntpd
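
For item 1, a sketch of the corresponding /etc/fstab entry (the device, mount point and filesystem below are placeholders; adjust for your layout):

/dev/sdb1  /var/lib/mongo  xfs  rw,noatime  0 0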

tuned-percona-mongodb: Installation

The installation of this profile is as simple as checking out the repository with a “git” command and then running “sudo make enable”. Full output:

$ git clone https://github.com/Percona-Lab/tuned-percona-mongodb
$ cd tuned-percona-mongodb
$ sudo make enable
if [ -d /etc/tuned ]; then
    cp -dpR percona-mongodb /etc/tuned/percona-mongodb;
    echo "### 'tuned-percona-mongodb' is installed. Enable with 'make enable'.";
else
    echo "### ERROR: cannot find tuned config dir at /etc/tuned!";
    exit 1;
fi
### 'tuned-percona-mongodb' is installed. Enable with 'make enable'.
tuned-adm profile percona-mongodb
tuned-adm active
Current active profile: percona-mongodb

In the example above you can see “percona-mongodb” is now the active tuned profile on the system (mentioned on the last output line).

The tuned profile files are installed to “/etc/tuned/percona-mongodb”, as seen here:

$ ls -alh /etc/tuned/percona-mongodb/*.*
-rwxrwxr-x. 1 root root  677 Nov 22 20:00 percona-mongodb.sh
-rw-rw-r--. 1 root root 1.4K Nov 22 20:00 tuned.conf

Let’s check that the “deadline” I/O scheduler is now the current scheduler on any disk that isn’t /dev/sda (“sdb” used below):

$ cat /sys/block/sdb/queue/scheduler
noop [deadline] cfq

Transparent huge pages should be disabled (it is!):

$ cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
$ cat /sys/kernel/mm/transparent_hugepage/defrag
always madvise [never]

Block-device readahead should be 32 (16kb) on /dev/sdb (looks good!):

$ blockdev --getra /dev/sdb
32

That was easy!

tuned-percona-mongodb: Uninstallation

To uninstall the profile, run “sudo make uninstall” in the GitHub checkout directory:

if [ -d /etc/tuned/percona-mongodb ]; then
    echo "### Disabling tuned profile 'tuned-percona-mongodb'";
    echo "### Changing tuned profile to 'latency-performance', adjust if necessary after!";
    tuned-adm profile latency-performance;
    tuned-adm active;
else
    echo "tuned-percona-mongodb profile not installed!";
fi
### Disabling tuned profile 'tuned-percona-mongodb'
### Changing tuned profile to 'latency-performance', adjust if necessary after!
Current active profile: latency-performance
if [ -d /etc/tuned/percona-mongodb ]; then rm -rf /etc/tuned/percona-mongodb; fi

Note: the uninstallation enables the “latency-performance” tuned profile; change this after the uninstall if needed.

To confirm the uninstallation, let’s check if the block-device readahead is set back to default (256/128kb):

$ sudo blockdev --getra /dev/sdb
256

Uninstall complete.

Conclusion

So far, tuned shows a lot of promise for tuning Linux for MongoDB, providing a single, consistent interface for tuning the Linux operating system. I would like to see the documentation for tuned improve, although the tool is simple enough that documentation is rarely needed.

As mentioned, after applying “tuned-percona-mongodb” you still need to configure an NTP server, NUMA (in some cases) and the filesystem type and tunings manually. Even so, this method greatly reduces the time, effort and room for mistakes.

If you have any issues with this profile for tuning Linux for MongoDB, or have any questions, please create a Github issue at this URL: https://github.com/Percona-Lab/tuned-percona-mongodb/issues/new.

Too many connections error

Latest Forum Posts - December 8, 2016 - 11:26am
I spun up a pmm 1.0.6 server. After enabling qan on about 60 servers I get errors like:

2016/12/08 19:20:13 server.go:2317: http: response.WriteHeader on hijacked connection
2016/12/08 19:20:13 server.go:2317: http: response.Write on hijacked connection
2016/12/08 19:20:13.252 127.0.0.1 500 705.712µs WS /agents/f47e5df7c29446a04ea13530da9a21d2/data
ERROR 2016/12/08 19:20:14 init.go:229: auth agent: auth.MySQLHandler.GetAgentId: dbm.Open: Error 1040: Too many connections

Is this a parameter I can change?

Any ideas about how many mysql instances I could support with a single pmm server if I only run qan?

Thanks!

Percona Server for MongoDB 3.2.11-3.1 is now available

Latest Forum Posts - December 8, 2016 - 3:02am
Percona announces the release of Percona Server for MongoDB 3.2.11-3.1 on December 7, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Java database connector for Percona XtraDB

Latest Forum Posts - December 7, 2016 - 9:17pm
Hi All. Are there any recommendations for the Java database connector for Percona XtraDB cluster other than the JDBC from Oracle?

First MongoDB replica-set Configuration for MySQL DBAs

Latest MySQL Performance Blog posts - December 7, 2016 - 3:23pm

In this blog post, we will work on the first replica-set configuration for MySQL DBAs. We will map as many names as possible and compare how the databases work.

Replica-sets are the most common MongoDB deployment nowadays. One of the most frequent questions is: how do you deploy a replica-set? In this blog, we'll compare the setup of a MongoDB replica-set to standard MySQL master-slave replication without GTID.

replica-set

The replica-set usually consists of 3+ instances on different hosts that communicate with each other through both dedicated connections and heartbeat packets. The heartbeats check the other instances’ health in order to keep the replica-set highly available. The names are slightly different: “primary” corresponds to “master” in MySQL, and “secondary” corresponds to “slave.” MongoDB supports only a single primary, unlike MySQL, which can have more than one master depending on how you set it up.

master-slave

Unlike MySQL, MongoDB does not use files to replicate between instances (such as binary log or relay log files). All the statements that should be replicated are in the oplog.rs collection. This collection is a capped collection, which means it holds a limited amount of data; when it becomes full, new content replaces the oldest documents. The amount of data that oplog.rs can keep is called the “oplog window,” and it is measured in seconds. If a secondary node is delayed for longer than the oplog can handle, a new initial sync is needed. The same happens in MySQL when a slave tries to read binary logs that have been deleted.

When the replica-set is initialized, all the inserts, updates and deletes are saved in a database called “local” in a collection called oplog.rs. The replica-set initialization can be compared to enabling bin logs in the MySQL configuration.
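
To see the oplog window and the most recent replicated operation on a running node, you can use the mongo shell from the command line (a sketch; the host and port are placeholders):

$ mongo localhost:27017 --eval 'rs.printReplicationInfo()'
$ mongo localhost:27017/local --eval 'printjson(db.oplog.rs.find().sort({$natural: -1}).limit(1).next())'

The first command prints the configured oplog size and the “log length start to end,” which is the oplog window discussed above; the second prints the newest entry in oplog.rs.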

Now let’s point out the most important differences between such databases: the way they handle replication, and how they keep high availability.

For standard MySQL replication, we need to enable the binlog in the config file, perform a backup, note the binlog position, restore this backup on a server with a different server id, and finally start the slave thread on the slave. On the other hand, in MongoDB you only need a primary that has been previously configured with the replSet parameter, and then add the new secondaries with the same replSet parameter. No backup needed, no restore needed, no oplog position needed.

Unlike MySQL, MongoDB is capable of electing a new primary when the primary fails. This process is called an election, and each instance votes for a new primary based on how up-to-date the candidates are, without human intervention. This is why at least three instances are necessary for a reliable production replica-set. The election is based on votes, and for a secondary to become primary it needs the majority of votes – at least two out of three votes/boxes are required. We can also have an arbiter dedicated to voting only – it does not hold any data, it only takes part in elections. Most drivers are capable of following a primary change automatically: we only need to pass the replica-set name in the connection string, and with this information drivers map the primary and secondaries on the fly using the result of rs.config().
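
Such a connection string might look as follows (the hosts and replica-set name here match the lab built later in this post; swap in your own):

mongodb://mongo32:27017,mongo32:27018,mongo32:27019/test?replicaSet=rs01

With replicaSet in the URI, the driver treats the listed hosts only as seeds: it discovers the actual members from the replica-set configuration and re-discovers the primary after an election.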

Note: There are a few tools capable of emulating this behavior in MySQL. One example is: https://www.percona.com/blog/2016/09/02/mha-quickstart-guide/

Maintaining Replica-sets

After deploying a replica-set, we should monitor it. There are a couple of commands that show not only the available hosts, but also the replication status; some of them let us edit the replication configuration as well.

The command rs.status() will show all the details of the replication, such as the replica-set name, all the hosts that belong to this replica-set, and their status. This command is similar to “show slave hosts” in MySQL.

In addition, the command rs.printSlaveReplicationInfo() shows how delayed the secondaries are. It can be compared to “show slave status” in MySQL.

Replica-sets can be managed online by the command rs.config(). Passing the replica-set name as a parameter in the mongod process, or in the config file, is the only necessary action to start a replica-set. All the other configs can be managed using rs.config().
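
For example, to raise a member’s priority online (a sketch; run it against the primary, and note that member index 1 is a placeholder):

$ mongo --eval 'cfg = rs.conf(); cfg.members[1].priority = 2; rs.reconfig(cfg)'

rs.conf() returns the current configuration document, and rs.reconfig() applies the edited copy to the whole replica-set.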

Step-by-Step How to Start Your First Replica-Set:

Follow these instructions to start a test replica-set with three nodes, using all the commands we’ve talked about.

For a production installation, please follow instructions on how to use our repositories here.

Download Percona Server for MongoDB:

$ cd ~
$ wget https://www.percona.com/downloads/percona-server-mongodb-3.2/percona-server-mongodb-3.2.10-3.0/binary/tarball/percona-server-mongodb-3.2.10-3.0-trusty-x86_64.tar.gz
$ tar -xvzf percona-server-mongodb-3.2.10-3.0-trusty-x86_64.tar.gz
$ mv percona-server-mongodb-3.2.10-3.0 mongodb

Create folders:

$ cd mongodb/bin
$ mkdir data1 data2 data3

Generate the config files:

(This is a simple config file, and almost all parameters are the default, so please edit the database directory first.)

for i in {1..3}; do
echo 'storage:
  dbPath: "'$(pwd)'/data'$i'"
systemLog:
  destination: file
  path: "'$(pwd)'/data'$i'/mongodb.log"
  logAppend: true
processManagement:
  fork: true
net:
  port: '$(( 27017 + $i - 1 ))'
replication:
  replSetName: "rs01"' > config$i.cfg
done

Starting the MongoDB instances:

  • Before initializing any MongoDB instance, confirm that the config files exist:

percona@mongo32:~/mongodb/bin$ ls -lah *.cfg
config1.cfg  config2.cfg  config3.cfg

  • Then start the mongod process, and repeat for the others:

percona@mongo32:~/mongodb/bin$ ./mongod -f config1.cfg
2016-11-10T16:56:12.854-0200 I STORAGE [main] Counters: 0
2016-11-10T16:56:12.855-0200 I STORAGE [main] Use SingleDelete in index: 0
about to fork child process, waiting until server is ready for connections.
forked process: 1263
child process started successfully, parent exiting
percona@mongo32:~/mongodb/bin$ ./mongod -f config2.cfg
2016-11-10T16:56:21.992-0200 I STORAGE [main] Counters: 0
2016-11-10T16:56:21.993-0200 I STORAGE [main] Use SingleDelete in index: 0
about to fork child process, waiting until server is ready for connections.
forked process: 1287
child process started successfully, parent exiting
percona@mongo32:~/mongodb/bin$ ./mongod -f config3.cfg
2016-11-10T16:56:24.250-0200 I STORAGE [main] Counters: 0
2016-11-10T16:56:24.250-0200 I STORAGE [main] Use SingleDelete in index: 0
about to fork child process, waiting until server is ready for connections.
forked process: 1310
child process started successfully, parent exiting

Initializing a replica-set:

  • Connect to the first MongoDB:

$ ./mongo
> rs.initiate()
{
	"info2" : "no configuration specified. Using a default configuration for the set",
	"me" : "mongo32:27017",
	"ok" : 1
}

  • Add the new members:

rs01:PRIMARY> rs.add('mongo32:27018') // replace with your hostname; localhost is not allowed
{ "ok" : 1 }
rs01:PRIMARY> rs.add('mongo32:27019')
{ "ok" : 1 }
rs01:PRIMARY> rs.status()
{
	"set" : "rs01",
	"date" : ISODate("2016-11-10T19:40:08.190Z"),
	"myState" : 1,
	"term" : NumberLong(1),
	"heartbeatIntervalMillis" : NumberLong(2000),
	"members" : [
		{
			"_id" : 0,
			"name" : "mongo32:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 2636,
			"optime" : { "ts" : Timestamp(1478806805, 1), "t" : NumberLong(1) },
			"optimeDate" : ISODate("2016-11-10T19:40:05Z"),
			"electionTime" : Timestamp(1478804218, 2),
			"electionDate" : ISODate("2016-11-10T18:56:58Z"),
			"configVersion" : 3,
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "mongo32:27018",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 44,
			"optime" : { "ts" : Timestamp(1478806805, 1), "t" : NumberLong(1) },
			"optimeDate" : ISODate("2016-11-10T19:40:05Z"),
			"lastHeartbeat" : ISODate("2016-11-10T19:40:07.129Z"),
			"lastHeartbeatRecv" : ISODate("2016-11-10T19:40:05.132Z"),
			"pingMs" : NumberLong(0),
			"syncingTo" : "mongo32:27017",
			"configVersion" : 3
		},
		{
			"_id" : 2,
			"name" : "mongo32:27019",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 3,
			"optime" : { "ts" : Timestamp(1478806805, 1), "t" : NumberLong(1) },
			"optimeDate" : ISODate("2016-11-10T19:40:05Z"),
			"lastHeartbeat" : ISODate("2016-11-10T19:40:07.130Z"),
			"lastHeartbeatRecv" : ISODate("2016-11-10T19:40:06.239Z"),
			"pingMs" : NumberLong(0),
			"configVersion" : 3
		}
	],
	"ok" : 1
}

  • Check replication lag:

$ mongo
rs01:PRIMARY> rs.printSlaveReplicationInfo()
source: mongo32:27018
	syncedTo: Thu Nov 10 2016 17:40:05 GMT-0200 (BRST)
	0 secs (0 hrs) behind the primary
source: mongo32:27019
	syncedTo: Thu Nov 10 2016 17:40:05 GMT-0200 (BRST)
	0 secs (0 hrs) behind the primary

  • Start an election:

$ mongo
rs01:PRIMARY> rs.stepDown()
2016-11-10T17:41:27.271-0200 E QUERY [thread1] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '127.0.0.1:27017':
DB.prototype.runCommand@src/mongo/shell/db.js:135:1
DB.prototype.adminCommand@src/mongo/shell/db.js:153:16
rs.stepDown@src/mongo/shell/utils.js:1182:12
@(shell):1:1
2016-11-10T17:41:27.274-0200 I NETWORK [thread1] trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2016-11-10T17:41:27.275-0200 I NETWORK [thread1] reconnect 127.0.0.1:27017 (127.0.0.1) ok
rs01:SECONDARY>
rs01:SECONDARY> rs.status()
{
	"set" : "rs01",
	"date" : ISODate("2016-11-10T19:41:39.280Z"),
	"myState" : 2,
	"term" : NumberLong(2),
	"heartbeatIntervalMillis" : NumberLong(2000),
	"members" : [
		{
			"_id" : 0,
			"name" : "mongo32:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 2727,
			"optime" : { "ts" : Timestamp(1478806805, 1), "t" : NumberLong(1) },
			"optimeDate" : ISODate("2016-11-10T19:40:05Z"),
			"configVersion" : 3,
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "mongo32:27018",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 135,
			"optime" : { "ts" : Timestamp(1478806805, 1), "t" : NumberLong(1) },
			"optimeDate" : ISODate("2016-11-10T19:40:05Z"),
			"lastHeartbeat" : ISODate("2016-11-10T19:41:37.155Z"),
			"lastHeartbeatRecv" : ISODate("2016-11-10T19:41:37.155Z"),
			"pingMs" : NumberLong(0),
			"configVersion" : 3
		},
		{
			"_id" : 2,
			"name" : "mongo32:27019",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 94,
			"optime" : { "ts" : Timestamp(1478806897, 1), "t" : NumberLong(2) },
			"optimeDate" : ISODate("2016-11-10T19:41:37Z"),
			"lastHeartbeat" : ISODate("2016-11-10T19:41:39.151Z"),
			"lastHeartbeatRecv" : ISODate("2016-11-10T19:41:38.354Z"),
			"pingMs" : NumberLong(0),
			"electionTime" : Timestamp(1478806896, 1),
			"electionDate" : ISODate("2016-11-10T19:41:36Z"),
			"configVersion" : 3
		}
	],
	"ok" : 1
}
rs01:SECONDARY> exit

Shut down instances:

$ killall mongod

Hopefully, this was helpful. Please post any questions in the comments section.

How to reset Percona XtraDB Cluster on all nodes?

Latest Forum Posts - December 7, 2016 - 2:03pm
Hello,

I have a problem with cluster testing. As the documentation says, I installed 3 nodes and connected them. All worked. After that I manually crashed all nodes, then brought up two nodes (assuming that the 3rd node is fully dead).

I am using CentOS 7, and after a node restarts the mysql service starts automatically (systemctl enable mysql).

I see the following states on the nodes:
systemctl status mysql
Loaded: loaded (/usr/lib/systemd/system/mysql.service; enabled; vendor preset: disabled)
Active: activating (start-post) since Thu 2016-12-08 01:38:33 EET; 1h 57min left
Process: 926 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 984 (mysqld_safe); : 985 (mysql-systemd)
CGroup: /system.slice/mysql.service
├─ 984 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
├─1566 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --wsr...
└─control
├─ 985 /bin/bash -ue /usr/bin/mysql-systemd start-post 984
└─4168 sleep 1


They are waiting for the 3rd node. But as I said, I assume that it is fully crashed. So I tried to manually restart the first node as a new donor node.

First of all I tried to stop a node:
systemctl stop mysql

But it doesn't want to stop. I checked the status one more time:
Loaded: loaded (/usr/lib/systemd/system/mysql.service; enabled; vendor preset: disabled)
Active: deactivating (stop-sigterm) (Result: exit-code)
Process: 6950 ExecStop=/usr/bin/mysql-systemd stop (code=exited, status=2)
Process: 985 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=1/FAILURE)
Process: 926 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 984 (mysqld_safe)
CGroup: /system.slice/mysql.service
├─ 984 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
└─1566 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --wsrep-provider=/usr/lib64/galera3/libgalera_smm.so --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock --wsrep_start_position=43de3d74-bca8-11e6-a178-57b39b925285:9

Dec 08 01:38:33 GlusterDC1_1 systemd[1]: Starting Percona XtraDB Cluster...
Dec 08 01:38:35 GlusterDC1_1 mysqld_safe[984]: 2016-12-07T23:38:35.462867Z mysqld_safe Logging to '/var/log/mysqld.log'.
Dec 08 01:38:35 GlusterDC1_1 mysqld_safe[984]: 2016-12-07T23:38:35.708324Z mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Dec 08 01:38:35 GlusterDC1_1 mysqld_safe[984]: 2016-12-07T23:38:35.849329Z mysqld_safe Skipping wsrep-recover for 43de3d74-bca8-11e6-a178-57b39b925285:9 pair
Dec 08 01:38:35 GlusterDC1_1 mysqld_safe[984]: 2016-12-07T23:38:35.852782Z mysqld_safe Assigning 43de3d74-bca8-11e6-a178-57b39b925285:9 to wsrep_start_position
Dec 07 23:53:41 GlusterDC1_1 mysql-systemd[985]: ERROR!
Dec 07 23:53:41 GlusterDC1_1 systemd[1]: mysql.service: control process exited, code=exited status=1
Dec 07 23:53:41 GlusterDC1_1 mysql-systemd[6950]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 07 23:53:41 GlusterDC1_1 mysql-systemd[6950]: ERROR! mysql already dead
Dec 07 23:53:41 GlusterDC1_1 systemd[1]: mysql.service: control process exited, code=exited status=2


Seems that something failed. OK, I decided to disable the mysql service and restart the node. After the node had been restarted, I used:
systemctl start mysql@bootstrap.service

To bootstrap a new cluster. But node 2 doesn't want to connect to it. OK, I brought back node 3 and used:
SET GLOBAL wsrep_provider_options='pc.bootstrap=true';

Seems that nothing works. I stopped all three nodes, and now all nodes are waiting for something and don't work.

How to reset all nodes and rejoin them together?
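
For reference, a hedged sketch of the usual recovery procedure when every node is down (assuming PXC on CentOS 7 as described above, with default paths): find the node with the most advanced state, bootstrap it, and let the others rejoin via SST/IST:

# On each node, note the seqno to find the most advanced one:
$ cat /var/lib/mysql/grastate.dat
# On the node with the highest seqno, bootstrap a new cluster:
$ sudo systemctl start mysql@bootstrap.service
# On the remaining nodes, a normal start makes them rejoin:
$ sudo systemctl start mysql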

Sincerely,
Alexandr

Percona Server for MongoDB 3.2.11-3.1 is now available

Latest MySQL Performance Blog posts - December 7, 2016 - 9:07am

Percona announces the release of Percona Server for MongoDB 3.2.11-3.1 on December 7, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.2.11-3.1 is an enhanced, open-source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.2 protocol and drivers. It extends MongoDB with MongoRocks, Percona Memory Engine, and PerconaFT storage engine, as well as enterprise-grade features like external authentication and audit logging at no extra cost. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: We deprecated the PerconaFT storage engine. It will not be available in future releases.

This release is based on MongoDB 3.2.11 and includes the following additional fixes:

  • PSMDB-93: Fixed hang during shutdown of mongod when started with the --storageEngine=PerconaFT and --nojournal options
  • PSMDB-92: Added Hot Backup to Ubuntu/Debian packages
  • PSMDB-83: Updated default configuration file to include recommended settings templates for various storage engines
  • Added support for Ubuntu 16.10 (Yakkety Yak)
  • Added binary tarballs for Ubuntu 16.04 LTS (Xenial Xerus)

The release notes are available in the official documentation.

How to delete double host entries

Latest Forum Posts - December 7, 2016 - 6:40am
I have host entries with different hostnames for the same hosts.

The original hosts were created with the name <hostname>, and then the servers were reinstalled and added as <hostname.domainname>.

Now I have two entries for each host in Grafana/Prometheus.

How can I remove an entry?

... and I'm a little bit confused:

PMM client is not configured, missing config file. Please make sure you have run 'pmm-admin config'.

How is it possible to get data without configured clients?

negative count() results

Latest Forum Posts - December 7, 2016 - 6:32am
We're experiencing some weird results with indices and count() in PSMDB (PerconaFT engine) on collections with high read/write access.
At first we got some wrong numbers: count() returned 3, but there were really 94 entries.

So we decided to do a reIndex() on that collection. After the reindex, we get negative counts, e.g. -88 when only 2 documents are in the collection!

There seems to be something wrong with the indices! E.g.:

db.reports.count() // Returns 29
db.reports.find() // Returns 46 entries

db.reports.count({ "count": { $gte: 0 } }) // Returns 46
db.reports.find({ "count": { $gte: 0 } }) // Returns 46 entries
(The 'count' field is a non-indexed field.)

Any ideas what's wrong? Re-indexing didn't fix it! Also stats() on that collection returns a negative size:

"ns" : "data.mails",
"count" : -88,
"size" : -2532158,
"avgObjSize" : 0,
"storageSize" : 805306368,
"capped" : false,

Best, Tom

PS: some additional data
OS: Deb8
ReplicaSet 3 nodes
PSMDB 3.2.10-3.0
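
For what it's worth, one way to cross-check such numbers (a sketch; 'data' and 'reports' stand in for the database and collection names from the post): count() without a predicate may be answered from collection metadata, while an aggregation counts by scanning the actual documents:

$ mongo localhost:27017/data --eval 'printjson(db.reports.aggregate([{ $group: { _id: null, n: { $sum: 1 } } }]).toArray())'

If the two numbers disagree, the metadata-based count is the suspect one.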

Webinar Thursday, December 8: Virtual Columns in MySQL and MariaDB

Latest MySQL Performance Blog posts - December 6, 2016 - 4:14pm

Please join Federico Razzoli, Consultant at Percona, on Thursday, December 8, 2016, at 8 AM PT / 11 AM ET (UTC – 8) as he presents Virtual Columns in MySQL and MariaDB.

MariaDB 5.2 and MySQL 5.7 introduced virtual columns, with different implementations. Their features and limitations are similar, but not identical. The main difference is that only MySQL allows you to build an index on a non-persistent column.

In this talk, we’ll present some use cases for virtual columns. These cases include query simplification and UNIQUE constraints based on an SQL expression. In particular, we will see how to use them to index JSON data in MySQL, or dynamic columns in MariaDB.

Performance and limitations will also be discussed.

Sign up for the webinar here.

Federico Razzoli is a relational databases lover and open source supporter. He is a MariaDB Community Ambassador and wrote “Mastering MariaDB” in 2014. Currently, he works for Percona as a consultant.

How do I do a clean reinstallation of the PMM client?

Latest Forum Posts - December 6, 2016 - 3:29pm
I’m having difficulty configuring the PMM client on a database server. One approach I’ve taken is to uninstall and reinstall the client, however the uninstall script seems to be leaving around configuration which is then picked up during the reinstall. How do I clean out the old configuration before reinstallation?

$ ls /etc/init.d | grep perc # nothing
$ ps -ef | grep perc # nothing

$ s find / -name 'pmm*'
/var/log/pmm-queries-exporter-42001.log
/var/log/pmm-mysql-exporter-42003.log
/var/log/pmm-mysql-exporter-42004.log
/var/log/pmm-mysql-exporter-42002.log
/root/pmm-client-1.0.3-x86_64
/root/pmm-client-1.0.3-x86_64/bin/pmm-admin
/root/pmm-client.tar.gz
/tmp/pmm-client-install.log

$ s find / -name 'percona*'
/usr/share/doc/percona-toolkit-2.2.19
/usr/share/doc/percona-release-0.1
/usr/share/doc/percona-xtrabackup-24-2.4.4
/usr/share/man/man1/percona-toolkit.1p.gz
/usr/share/percona-server
/usr/local/percona
/etc/yum.repos.d/percona-release.repo
/var/lib/yum/repos/x86_64/6/percona-release-noarch
/var/lib/yum/repos/x86_64/6/percona-release-x86_64
/var/tmp/yum-sonia-J3Dyd6/x86_64/6/percona-release-noarch
/var/tmp/yum-sonia-J3Dyd6/x86_64/6/percona-release-x86_64
/var/tmp/yum-root-BF1ppj/percona-release-0.1-3.noarch.rpm
/var/cache/yum/x86_64/6/percona-release-noarch
/var/cache/yum/x86_64/6/percona-release-x86_64
/root/pmm-client-1.0.3-x86_64/percona-qan-agent-1.0.0-20160805.869bf3d-x86_64.tar.gz
/root/pmm-client-1.0.3-x86_64/percona-qan-agent-1.0.0-20160805.869bf3d-x86_64
/root/pmm-client-1.0.3-x86_64/percona-qan-agent-1.0.0-20160805.869bf3d-x86_64/bin/percona-qan-agent-installer
/root/pmm-client-1.0.3-x86_64/percona-qan-agent-1.0.0-20160805.869bf3d-x86_64/bin/percona-qan-agent
/root/pmm-client-1.0.3-x86_64/percona-qan-agent-1.0.0-20160805.869bf3d-x86_64/init.d/percona-qan-agent
/root/percona-toolkit-2.2.18
/root/percona-toolkit-2.2.18/docs/percona-toolkit.pod
/root/percona-toolkit.tar.gz
/tmp/percona-version-check

$ s ./install 3.4.5.6:8080 4.5.6.7
[1/3] Installing pmm-admin...
[2/3] Installing Query Analytics Agent...
[3/3] Installing Prometheus exporters...

Done installing PMM client. Next steps:


# —> some old configuration seems to be hanging around:

$ s pmm-admin list
pmm-admin 1.0.3

PMM Server | 3.4.5.6:8080
Client Name | database1
Client Address | 4.5.6.7
Service manager | unix-systemv

--------------- -------- ------------ -------- ------------------------------------------- ---------------------
METRIC SERVICE NAME CLIENT PORT RUNNING DATA SOURCE OPTIONS
--------------- -------- ------------ -------- ------------------------------------------- ---------------------
queries database1 42001 NO root:***@unix(/var/run/mysqld/mysqld.sock) query_source=slowlog

$ s pmm-admin remove queries database1
Error removing queries database1: "service" failed: exit status 1, pmm-queries-exporter-42001: unrecognized service

$ ls /etc/init.d | grep perc # nothing

Network check is also unhappy:

$ s pmm-admin check-network --no-emoji
PMM Network Status

Server | 1.2.3.4:8080
Client | 4.5.6.7

* Client > Server
--------------- -------------
SERVICE CONNECTIVITY
--------------- -------------
Consul API OK
QAN API OK
Prometheus API OK

Connection duration | 680.806µs
Request duration | 941.322µs
Full round trip | 1.622128ms

* Server > Client
-------- -------- ---------------------- -------------
METRIC NAME PROMETHEUS ENDPOINT REMOTE STATE
-------- -------- ---------------------- -------------
mysql database1 4.5.6.7:42003 PROBLEM
mysql database1 4.5.6.7:42004 PROBLEM

For endpoints in problem state, please check if the corresponding service is running ("pmm-admin list").
If all endpoints are down here and "pmm-admin list" shows all services are up,
please check the firewall settings to see whether this system allows incoming connections for the address:port in question.

Firewall allows all traffic on the required 10.0.0.0/8 network:

$ s iptables -nL
Chain INPUT (policy DROP)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
ACCEPT icmp -- 0.0.0.0/0 0.0.0.0/0 icmp type 8
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:22 state NEW,ESTABLISHED
ACCEPT tcp -- 10.0.0.0/8 0.0.0.0/0 tcp state NEW,ESTABLISHED
ACCEPT udp -- 10.0.0.0/8 0.0.0.0/0 udp state NEW,ESTABLISHED
LOGGING all -- 0.0.0.0/0 0.0.0.0/0

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0

Chain LOGGING (1 references)
target prot opt source destination
LOG all -- 0.0.0.0/0 0.0.0.0/0 limit: avg 2/min burst 5 LOG flags 0 level 4 prefix `IPTables-Dropped: '
DROP all -- 0.0.0.0/0 0.0.0.0/0

QAN API error: "agent not connected". Check whether percona-qan-agent is started.

Latest Forum Posts - December 6, 2016 - 2:25pm
I have the PMM server running, and the docker containers are up. But I'm getting the following error message in the web interface, and the init script referred to doesn't exist. I get it when I click on the settings wheel under a database; also there's no data for the database.

https://4.5.6.7/qan/#/management/mysql/deadbeefdeadbeef

QAN API error: "agent not connected".
Check whether percona-qan-agent is started.
sudo /etc/init.d/percona-qan-agent start|stop|restart|status

$ ls -1 /etc/init.d | grep perc # nothing

$ cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)

$ s netstat -tanp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 5.6.7.8:53 5.6.7.8:* LISTEN 1095/named
tcp 0 0 5.6.7.8:3030 5.6.7.8:* LISTEN 2204/ruby
tcp 0 0 5.6.7.8:22 5.6.7.8:* LISTEN 1001/sshd
tcp 0 0 5.6.7.8:25 5.6.7.8:* LISTEN 1629/master
tcp 0 0 5.6.7.8:953 5.6.7.8:* LISTEN 1095/named
tcp 0 0 5.6.7.8:56696 5.6.7.8:5672 ESTABLISHED 2204/ruby
tcp 0 0 5.6.7.8:22 5.6.7.8:56391 ESTABLISHED 11230/sshd: sonia [
tcp 0 36 5.6.7.8:22 5.6.7.8:12441 ESTABLISHED 27318/sshd: sonia [
tcp 0 0 5.6.7.8:43176 5.6.7.8:4505 ESTABLISHED 12783/python
tcp6 0 0 :::8080 :::* LISTEN 2966/docker-proxy
tcp6 0 0 :::80 :::* LISTEN 2863/docker-proxy
tcp6 0 0 ::1:53 :::* LISTEN 1095/named
tcp6 0 0 :::22 :::* LISTEN 1001/sshd
tcp6 0 0 ::1:25 :::* LISTEN 1629/master
tcp6 0 0 ::1:953 :::* LISTEN 1095/named
tcp6 0 0 :::443 :::* LISTEN 2849/docker-proxy

pmm-* tools not found

Latest Forum Posts - December 6, 2016 - 10:41am
Hello all,

I am running PMM Server using docker toolbox in Windows 7. I finished step 1-3 and was able to verify all web links are working (https://www.percona.com/doc/percona-...t/install.html)

However, I am unable to find any pmm-* tools in order to start data collection for MySQL:

sudo pmm-admin add mysql

In fact, both sudo and pmm-admin are not found. Does anyone know what happened? Do I need to install pmm-client? I am testing this to check the MySQL on a remote host. Any help is appreciated. Thanks, benny

innobackupex fails on different positions

Latest Forum Posts - December 6, 2016 - 5:56am
Hello,

I want to take a backup with innobackupex to add a new slave.

I execute:
innobackupex --user=xxxx --password=xxxx .
But it fails at different positions:

=========
first run:
=========
...
161206 09:20:39 [01] Copying ./_session_em_at/v_2016_11_24.ibd to /var/lib/mysql/.tmp/2016-12-06_09-18-45/_session_em_at/v_2016_11_24.ibd
161206 09:20:40 [01] ...done
161206 09:20:40 [01] Copying ./_session_em_at/FTS_00000000000ea744_BEING_DELETED_CACHE.ibd to /var/lib/mysql/.tmp/2016-12-06_09-18-45/_session_em_at/FTS_00000000000ea744_BEING_DELETED_CACHE.ibd
161206 09:20:40 [01] ...done
161206 09:20:40 [01] Copying ./_session_em_at/FTS_00000000000ec3c9_00000000001b985e_INDEX_6.ibd to /var/lib/mysql/.tmp/2016-12-06_09-18-45/_session_em_at/FTS_00000000000ec3c9_00000000001b985e_INDEX_6.ibd
InnoDB: Last flushed lsn: 4472480740170 load_index lsn 4472506073978
[FATAL] InnoDB: An optimized(without redo logging) DDLoperation has been performed. All modified pages may not have been flushed to the disk yet.
PXB will not be able take a consistent backup. Retry the backup operation
2016-12-06 09:20:40 0x7f7d6c35f700 InnoDB: Assertion failure in thread 140176663115520 in file ut0ut.cc line 916
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/...-recovery.html
InnoDB: about forcing recovery.
08:20:40 UTC - xtrabackup got signal 6 ;
This could be because you hit a bug or data is corrupted.
This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x10000
innobackupex(my_print_stacktrace+0x3b)[0xc7183b]
innobackupex(handle_fatal_signal+0x281)[0xa8da01]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x113d0)[0x7f7d6eaa33d0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7f7d6d11c418]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7f7d6d11e01a]
innobackupex[0x6f1b41]
innobackupex(_ZN2ib5fatalD1Ev+0x145)[0x9d62c5]
innobackupex[0x942177]
innobackupex(_Z19recv_parse_log_recsm7store_tb+0x281)[0x946df1]
innobackupex[0x713558]
innobackupex[0x7137b8]
innobackupex[0x713c8b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76fa)[0x7f7d6ea996fa]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f7d6d1edb5d]

=========
second run:
=========
...
161206 09:40:46 [01] Copying ./_session_em_at/FTS_0000000000007cae_000000000001365a_INDEX_5.ibd to /var/lib/mysql/.tmp/2016-12-06_09-36-26/_session_em_at/FTS_0000000000007cae_000000000001365a_INDEX_5.ibd
161206 09:40:46 [01] ...done
161206 09:40:46 [01] Copying ./_session_em_at/v_2016_11_30.ibd to /var/lib/mysql/.tmp/2016-12-06_09-36-26/_session_em_at/v_2016_11_30.ibd
InnoDB: Last flushed lsn: 4472662375842 load_index lsn 4472756690590
[FATAL] InnoDB: An optimized(without redo logging) DDLoperation has been performed. All modified pages may not have been flushed to the disk yet.
PXB will not be able take a consistent backup. Retry the backup operation
2016-12-06 09:40:47 0x7f910132d700 InnoDB: Assertion failure in thread 140260767094528 in file ut0ut.cc line 916
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/...-recovery.html
InnoDB: about forcing recovery.
08:40:47 UTC - xtrabackup got signal 6 ;
This could be because you hit a bug or data is corrupted.
This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x10000
innobackupex(my_print_stacktrace+0x3b)[0xc7183b]
innobackupex(handle_fatal_signal+0x281)[0xa8da01]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x113d0)[0x7f9103a713d0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7f91020ea418]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7f91020ec01a]
innobackupex[0x6f1b41]
innobackupex(_ZN2ib5fatalD1Ev+0x145)[0x9d62c5]
innobackupex[0x942177]
innobackupex(_Z19recv_parse_log_recsm7store_tb+0x281)[0x946df1]
innobackupex[0x713558]
innobackupex[0x7137b8]
innobackupex[0x713c8b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76fa)[0x7f9103a676fa]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f91021bbb5d]


During the second run, the error occurred later.

Does anyone have an idea?


my system:
Ubuntu 16.04.1
Percona Server 5.7.12-5-1.xenial
xtrabackup 2.4.4-1.xenial

Percona Live 2017 Open Source Database Conference Tutorial Schedule is Live!

Latest MySQL Performance Blog posts - December 5, 2016 - 12:43pm

We are excited to announce that the tutorial schedule for the Percona Live 2017 Open Source Database Conference is up!

The Percona Live Open Source Database Conference 2017 is April 24th – 27th, at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

Click through to the tutorial link right now, look them over, and pick which sessions you want to attend. Discounted passes available below!

Tutorial List

Early Bird Discounts

Just a reminder to everyone out there: our Early Bird discount rate for the Percona Live Open Source Database Conference 2017 is only available ‘til January 8, 2017, 11:30 pm PST! This rate gets you all the excellent and amazing opportunities that Percona Live offers, at a very reasonable price!

Sponsor Percona Live

Become a conference sponsor! We have sponsorship opportunities available for this annual MySQL, MongoDB and open source database event. Sponsors become a part of a dynamic and growing ecosystem and interact with hundreds of DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solutions vendors, and entrepreneurs who attend the event.

MongoDB Troubleshooting: My Top 5

Latest MySQL Performance Blog posts - December 5, 2016 - 11:38am

In this blog post, I’ll discuss my top five go-to tips for MongoDB troubleshooting.

Every DBA has a war chest of their go-to solutions for any support issues they run into for a specific technology. MongoDB is no different. Even if you have picked it because it’s a good fit and it runs well for you, things will change. When things change – sometimes there is a new version of your application, or a new version of the database itself – you need to have a solid starting place.

To help new DBAs, I like to point out my top five things that cover the bulk of the requests a DBA might need to work on.

Common greps to use

This one is all about ways to pare down the error log and make it a bit more manageable. The error log is a slew of information, and sometimes, without grep, it's challenging to correlate some events.

Is an index being built?

As a DBA you will often get a call saying the database has “stopped.” The developer might say, “I didn’t change anything.” Looking at the error log is a great first port of call. With this particular grep, you just want to see if all index builds were done, if a new index was built and is still building, or if an index was removed. This will catch all of the cases in question.

$ grep -i index mongod.log
2016-11-11T17:08:53.731+0000 I INDEX [conn458] build index on: samples.col1 properties: { v: 1, key: { friends: 1.0 }, name: "friends_1", ns: "samples.col1" }
2016-11-11T17:08:53.733+0000 I INDEX [conn458] building index using bulk method
2016-11-11T17:08:56.045+0000 I - [conn458] Index Build: 24700/1000000 2%
2016-11-11T17:08:59.004+0000 I - [conn458] Index Build: 61000/1000000 6%
2016-11-11T17:09:02.001+0000 I - [conn458] Index Build: 103200/1000000 10%
2016-11-11T17:09:05.013+0000 I - [conn458] Index Build: 130800/1000000 13%
2016-11-11T17:09:08.013+0000 I - [conn458] Index Build: 160300/1000000 16%
2016-11-11T17:09:11.039+0000 I - [conn458] Index Build: 183100/1000000 18%
2016-11-11T17:09:14.009+0000 I - [conn458] Index Build: 209400/1000000 20%
2016-11-11T17:09:17.007+0000 I - [conn458] Index Build: 239400/1000000 23%
2016-11-11T17:09:20.010+0000 I - [conn458] Index Build: 264100/1000000 26%
2016-11-11T17:09:23.001+0000 I - [conn458] Index Build: 286800/1000000 28%
2016-11-11T17:09:30.783+0000 I - [conn458] Index Build: 298900/1000000 29%
2016-11-11T17:09:33.015+0000 I - [conn458] Index Build: 323900/1000000 32%
2016-11-11T17:09:36.000+0000 I - [conn458] Index Build: 336600/1000000 33%
2016-11-11T17:09:39.000+0000 I - [conn458] Index Build: 397000/1000000 39%
2016-11-11T17:09:42.000+0000 I - [conn458] Index Build: 431900/1000000 43%
2016-11-11T17:09:45.002+0000 I - [conn458] Index Build: 489100/1000000 48%
2016-11-11T17:09:48.003+0000 I - [conn458] Index Build: 551200/1000000 55%
2016-11-11T17:09:51.004+0000 I - [conn458] Index Build: 567700/1000000 56%
2016-11-11T17:09:54.004+0000 I - [conn458] Index Build: 589600/1000000 58%
2016-11-11T17:10:00.929+0000 I - [conn458] Index Build: 597800/1000000 59%
2016-11-11T17:10:03.008+0000 I - [conn458] Index Build: 633100/1000000 63%
2016-11-11T17:10:06.001+0000 I - [conn458] Index Build: 647200/1000000 64%
2016-11-11T17:10:09.008+0000 I - [conn458] Index Build: 660000/1000000 66%
2016-11-11T17:10:12.001+0000 I - [conn458] Index Build: 672300/1000000 67%
2016-11-11T17:10:15.009+0000 I - [conn458] Index Build: 686000/1000000 68%
2016-11-11T17:10:18.001+0000 I - [conn458] Index Build: 706100/1000000 70%
2016-11-11T17:10:21.006+0000 I - [conn458] Index Build: 731400/1000000 73%
2016-11-11T17:10:24.006+0000 I - [conn458] Index Build: 750900/1000000 75%
2016-11-11T17:10:27.000+0000 I - [conn458] Index Build: 773900/1000000 77%
2016-11-11T17:10:30.000+0000 I - [conn458] Index Build: 821800/1000000 82%
2016-11-11T17:10:33.026+0000 I - [conn458] Index Build: 843800/1000000 84%
2016-11-11T17:10:36.008+0000 I - [conn458] Index Build: 874000/1000000 87%
2016-11-11T17:10:43.854+0000 I - [conn458] Index Build: 896600/1000000 89%
2016-11-11T17:10:46.009+0000 I - [conn458] Index Build: 921800/1000000 92%
2016-11-11T17:10:49.000+0000 I - [conn458] Index Build: 941600/1000000 94%
2016-11-11T17:10:52.011+0000 I - [conn458] Index Build: 955700/1000000 95%
2016-11-11T17:10:55.007+0000 I - [conn458] Index Build: 965500/1000000 96%
2016-11-11T17:10:58.046+0000 I - [conn458] Index Build: 985200/1000000 98%
2016-11-11T17:11:01.002+0000 I - [conn458] Index Build: 995000/1000000 99%
2016-11-11T17:11:13.000+0000 I - [conn458] Index: (2/3) BTree Bottom Up Progress: 8216900/8996322 91%
2016-11-11T17:11:14.021+0000 I INDEX [conn458] done building bottom layer, going to commit
2016-11-11T17:11:14.023+0000 I INDEX [conn458] build index done. scanned 1000000 total records. 140 secs
2016-11-11T17:11:14.035+0000 I COMMAND [conn458] command samples.$cmd command: createIndexes { createIndexes: "col1", indexes: [ { ns: "samples.col1", key: { friends: 1.0 }, name: "friends_1" } ] } keyUpdates:0 writeConflicts:0 numYields:0 reslen:173 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, MMAPV1Journal: { acquireCount: { w: 9996326 }, acquireWaitCount: { w: 1054 }, timeAcquiringMicros: { w: 811319 } }, Database: { acquireCount: { w: 1, W: 1 } }, Collection: { acquireCount: { W: 1 } }, Metadata: { acquireCount: { W: 12 } }, oplog: { acquireCount: { w: 1 } } } 140306ms

What’s happening right now?

Like the index example above, this helps you remove many of the messages you might not care about, or want to block off. MongoDB does have some useful sub-component tags in the logs, such as “ReplicationExecutor” and “connXXX”, that can be helpful, but I find it more useful to remove the noisy lines than to filter on the log facility types. In this example, I opted not to add “| grep -v connection” – typically I will look at the log with connections first to see if they are acting funny, and then filter them out to see the core of what is happening. If you only want to see the long queries and commands, replace “ms” with “connection” to make them easier to find.

$ grep -v conn mongod.log | grep -v auth | grep -vi health | grep -v ms
2016-11-11T14:41:06.376+0000 I REPL [ReplicationExecutor] This node is localhost:28001 in the config
2016-11-11T14:41:06.377+0000 I REPL [ReplicationExecutor] transition to STARTUP2
2016-11-11T14:41:06.379+0000 I REPL [ReplicationExecutor] Member localhost:28003 is now in state STARTUP
2016-11-11T14:41:06.383+0000 I REPL [ReplicationExecutor] Member localhost:28002 is now in state STARTUP
2016-11-11T14:41:06.385+0000 I STORAGE [FileAllocator] allocating new datafile /Users/dmurphy/Github/dbmurphy/MongoDB32Labs/labs/rs2-1/local.1, filling with zeroes...
2016-11-11T14:41:06.586+0000 I STORAGE [FileAllocator] done allocating datafile /Users/dmurphy/Github/dbmurphy/MongoDB32Labs/labs/rs2-1/local.1, size: 256MB, took 0.196 secs
2016-11-11T14:41:06.610+0000 I REPL [ReplicationExecutor] transition to RECOVERING
2016-11-11T14:41:06.614+0000 I REPL [ReplicationExecutor] transition to SECONDARY
2016-11-11T14:41:08.384+0000 I REPL [ReplicationExecutor] Member localhost:28003 is now in state STARTUP2
2016-11-11T14:41:08.386+0000 I REPL [ReplicationExecutor] Standing for election
2016-11-11T14:41:08.388+0000 I REPL [ReplicationExecutor] Member localhost:28002 is now in state STARTUP2
2016-11-11T14:41:08.390+0000 I REPL [ReplicationExecutor] not electing self, localhost:28002 would veto with 'I don't think localhost:28001 is electable because the member is not currently a secondary (mask 0x8)'
2016-11-11T14:41:08.391+0000 I REPL [ReplicationExecutor] not electing self, we are not freshest
2016-11-11T14:41:10.387+0000 I REPL [ReplicationExecutor] Standing for election
2016-11-11T14:41:10.389+0000 I REPL [ReplicationExecutor] replSet info electSelf
2016-11-11T14:41:10.393+0000 I REPL [ReplicationExecutor] received vote: 1 votes from localhost:28003
2016-11-11T14:41:10.395+0000 I REPL [ReplicationExecutor] replSet election succeeded, assuming primary role
2016-11-11T14:41:10.396+0000 I REPL [ReplicationExecutor] transition to PRIMARY
2016-11-11T14:41:10.631+0000 I REPL [rsSync] transition to primary complete; database writes are now permitted
2016-11-11T14:41:12.390+0000 I REPL [ReplicationExecutor] Member localhost:28003 is now in state SECONDARY
2016-11-11T14:41:12.393+0000 I REPL [ReplicationExecutor] Member localhost:28002 is now in state SECONDARY

versus

2016-11-11T14:41:12.393+0000 I REPL [ReplicationExecutor] Member localhost:28002 is now in state SECONDARY
2016-11-11T14:41:36.433+0000 I NETWORK [conn3] end connection 127.0.0.1:65497 (1 connection now open)
2016-11-11T14:41:36.433+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49191 #8 (3 connections now open)
2016-11-11T14:41:36.490+0000 I NETWORK [conn2] end connection 127.0.0.1:65496 (1 connection now open)
2016-11-11T14:41:36.490+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49192 #9 (3 connections now open)
2016-11-11T14:41:54.480+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49257 #10 (3 connections now open)
2016-11-11T14:41:54.486+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49258 #11 (4 connections now open)
2016-11-11T14:42:06.493+0000 I NETWORK [conn8] end connection 127.0.0.1:49191 (3 connections now open)
2016-11-11T14:42:06.494+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49290 #12 (5 connections now open)
2016-11-11T14:42:06.550+0000 I NETWORK [conn9] end connection 127.0.0.1:49192 (3 connections now open)
2016-11-11T14:42:06.550+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49294 #13 (5 connections now open)
2016-11-11T14:42:36.550+0000 I NETWORK [conn12] end connection 127.0.0.1:49290 (3 connections now open)
2016-11-11T14:42:36.550+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49336 #14 (5 connections now open)
2016-11-11T14:42:36.601+0000 I NETWORK [conn13] end connection 127.0.0.1:49294 (3 connections now open)
2016-11-11T14:42:36.601+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49339 #15 (5 connections now open)
2016-11-11T14:43:06.607+0000 I NETWORK [conn14] end connection 127.0.0.1:49336 (3 connections now open)
2016-11-11T14:43:06.608+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49387 #16 (5 connections now open)
2016-11-11T14:43:06.663+0000 I NETWORK [conn15] end connection 127.0.0.1:49339 (3 connections now open)
2016-11-11T14:43:06.663+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49389 #17 (5 connections now open)
2016-11-11T14:43:36.655+0000 I NETWORK [conn16] end connection 127.0.0.1:49387 (3 connections now open)
2016-11-11T14:43:36.656+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49436 #18 (5 connections now open)
2016-11-11T14:43:36.718+0000 I NETWORK [conn17] end connection 127.0.0.1:49389 (3 connections now open)
2016-11-11T14:43:36.719+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49439 #19 (5 connections now open)
2016-11-11T14:44:06.705+0000 I NETWORK [conn18] end connection 127.0.0.1:49436 (3 connections now open)
2016-11-11T14:44:06.705+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49481 #20 (5 connections now open)
2016-11-11T14:44:06.786+0000 I NETWORK [conn19] end connection 127.0.0.1:49439 (3 connections now open)
2016-11-11T14:44:06.786+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49483 #21 (5 connections now open)
2016-11-11T14:44:36.757+0000 I NETWORK [conn20] end connection 127.0.0.1:49481 (3 connections now open)
2016-11-11T14:44:36.757+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49526 #22 (5 connections now open)
2016-11-11T14:44:36.850+0000 I NETWORK [conn21] end connection 127.0.0.1:49483 (3 connections now open)

Did any elections happen? Why did they happen?

While this isn't the most common command to run, it is very helpful if you aren't using Percona Monitoring and Management (PMM) to track the historical frequency of elections. In this example, we ask grep for up to 20 lines before and after the word "SECONDARY", which typically brackets the point where a step-down or election takes place. Then you can see whether, around that time, a command was issued, a network error occurred, there was a heartbeat failure, or some other such scenario.

grep -i SECONDARY -A20 -B20 mongod.log
2016-11-11T14:44:38.622+0000 I COMMAND [conn22] Attempting to step down in response to replSetStepDown command
2016-11-11T14:44:38.625+0000 I REPL [ReplicationExecutor] transition to SECONDARY
2016-11-11T14:44:38.627+0000 I NETWORK [conn10] end connection 127.0.0.1:49253 (4 connections now open)
2016-11-11T14:44:38.627+0000 I NETWORK [conn11] end connection 127.0.0.1:49254 (4 connections now open)
2016-11-11T14:44:38.630+0000 I NETWORK [thread1] trying reconnect to localhost:27001 (127.0.0.1) failed
2016-11-11T14:44:38.628+0000 I NETWORK [conn22] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:49506]
2016-11-11T14:44:38.630+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49529 #25 (5 connections now open)
2016-11-11T14:44:38.633+0000 I NETWORK [thread1] reconnect localhost:27001 (127.0.0.1) ok
2016-11-11T14:44:40.567+0000 I REPL [ReplicationExecutor] replSetElect voting yea for localhost:27002 (1)
2016-11-11T14:44:42.223+0000 I REPL [ReplicationExecutor] Member localhost:27002 is now in state PRIMARY
2016-11-11T14:44:44.314+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:49538 #26 (4 connections now open)

Is replication lagged, and do I have enough oplog?

Always write a single test document just to ensure replication has a recent write:

db.getSiblingDB('repltest').col.insert({x:1});
db.getSiblingDB('repltest').dropDatabase();

Checking lag information:

rs1:PRIMARY> db.printSlaveReplicationInfo()
source: localhost:27002
    syncedTo: Fri Nov 11 2016 17:11:14 GMT+0000 (GMT)
    0 secs (0 hrs) behind the primary
source: localhost:27003
    syncedTo: Fri Nov 11 2016 17:11:14 GMT+0000 (GMT)
    0 secs (0 hrs) behind the primary
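
If you'd rather have the lag as raw numbers (for scripting or alerting), here is a minimal shell sketch of the same check built on rs.status(); it assumes the set currently has a primary:

// Find the current primary; optimeDate and stateStr are standard rs.status() fields.
var status = rs.status();
var primary = status.members.filter(function(m) { return m.stateStr === 'PRIMARY'; })[0];
status.members.forEach(function(m) {
    // Date subtraction yields milliseconds; convert to seconds of lag.
    print(m.name + ': ' + (primary.optimeDate - m.optimeDate) / 1000 + 's behind the primary');
});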

Oplog Size and Range:

rs1:PRIMARY> db.printReplicationInfo()
configured oplog size:   192MB
log length start to end: 2154secs (0.6hrs)
oplog first event time:  Fri Nov 11 2016 16:35:20 GMT+0000 (GMT)
oplog last event time:   Fri Nov 11 2016 17:11:14 GMT+0000 (GMT)
now:                     Fri Nov 11 2016 17:16:46 GMT+0000 (GMT)
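
To get that window as a single number, a hedged sketch that reads the first and last oplog entries directly (ts.t is the seconds component of the oplog Timestamp):

var oplog = db.getSiblingDB('local').oplog.rs;
// $natural order is insertion order, so these are the oldest and newest entries.
var first = oplog.find().sort({$natural: 1}).limit(1).next();
var last = oplog.find().sort({$natural: -1}).limit(1).next();
print(((last.ts.t - first.ts.t) / 3600).toFixed(1) + ' hours of oplog window');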

Taming the profiler

MongoDB packs tons of data into the profiler and explain output. I have highlighted some key fields to know:

{ "queryPlanner" : { "mongosPlannerVersion" : 1, "winningPlan" : { "stage" : "SINGLE_SHARD", "shards" : [ { "shardName" : "rs3", "connectionString" : "rs3/localhost:29001,localhost:29002,localhost:29003", "serverInfo" : { "host" : "Davids-MacBook-Pro-2.local", "port" : 29001, "version" : "3.0.11", "gitVersion" : "48f8b49dc30cc2485c6c1f3db31b723258fcbf39" }, "plannerVersion" : 1, "namespace" : "blah.foo", "indexFilterSet" : false, "parsedQuery" : { "name" : { "$eq" : "Bob" } }, "winningPlan" : { "stage" : "COLLSCAN", "filter" : { "name" : { "$eq" : "Bob" } }, "direction" : "forward" }, "rejectedPlans" : [ ] } ] } }, "executionStats" : { "nReturned" : 0, "executionTimeMillis" : 0, "totalKeysExamined" : 0, "totalDocsExamined" : 1, "executionStages" : { "stage" : "SINGLE_SHARD", "nReturned" : 0, "executionTimeMillis" : 0, "totalKeysExamined" : 0, "totalDocsExamined" : 1, "totalChildMillis" : NumberLong(0), "shards" : [ { "shardName" : "rs3", "executionSuccess" : true, "executionStages" : { "stage" : "COLLSCAN", "filter" : { "name" : { "$eq" : "Bob" } }, "nReturned" : 0, "executionTimeMillisEstimate" : 0, "works" : 3, "advanced" : 0, "needTime" : 2, "needFetch" : 0, "saveState" : 0, "restoreState" : 0, "isEOF" : 1, "invalidates" : 0, "direction" : "forward", "docsExamined" : 1 } } ] }, "allPlansExecution" : [ { "shardName" : "rs3", "allPlans" : [ ] } ] }, "ok" : 1 }

Metric – Description

Filter – The formulated query that was run. Right above it you can find the parsed query; these should be the same. It's useful to know what the engine was ultimately sent.

nReturned – The number of documents returned via the cursor to the client running the query/command.

executionTimeMillis – This used to be called "ms"; it is how long the operation took. Typically you would watch this the way you would a slow query in any other database.

total(Keys|Docs)Examined – Unlike nReturned, this is how many index keys/documents were scanned. Worth watching, since not all indexes have perfect coverage, and sometimes you scan many documents only to find no results.

stage – While poorly named, this tells you whether a collection scan (table scan) or an index was used to answer a given operation. In the case of an index, it will give the index's name.

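To pull these fields from a live system rather than a one-off explain, one approach (the 100ms threshold here is just an example value, not a recommendation) is to turn on the profiler and query system.profile for recent slow operations:

// Level 1 profiles only operations slower than the given threshold (100ms here).
db.setProfilingLevel(1, 100);
// The profiler writes into the capped system.profile collection in each database.
db.system.profile.find({millis: {$gt: 100}}).sort({ts: -1}).limit(5).pretty();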

CurrentOp and killOp explained

When using db.currentOp() to see what is running, I frequently call db.currentOp(true) so that I can see everything, not just a limited set of operations. This makes currentOp look and act much more like SELECT * FROM information_schema.processlist in MySQL. One significant difference that commonly catches a new DBA off guard is how killing operations works in MySQL versus MongoDB. While Mongo does have a handy db.killOp(<op_id>) function, it is important to know that, unlike MySQL – which immediately kills the thread running the query – MongoDB behaves differently. When you run killOp(), MongoDB appends "killed: true" to the operation's document structure. Only at the next yield (if one occurs) will the operation be told to quit. This is also how a shutdown works: if the server seems like it's not shutting down, it might be waiting for an operation to yield and notice the shutdown request.
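
As a quick sketch of that workflow (the opid below is hypothetical, and the five-second cutoff is arbitrary):

// Show everything, including idle connections and system operations.
db.currentOp(true);
// List operations that have been running for more than five seconds.
db.currentOp().inprog.filter(function(op) {
    return op.secs_running && op.secs_running > 5;
});
// Flag an operation to stop at its next yield point; 12345 is a made-up
// opid taken from the currentOp output above.
db.killOp(12345);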

I'm not arguing that this is bad or good, just that it is different from MySQL and something you should be aware of. One thing to note, however, is that MongoDB has great built-in HA. Sometimes it is better to cause an election and let the drivers gracefully handle things, rather than running the killOp command (unless it's a write – then you should always try to use killOp).
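
For reference, triggering that election from the shell is a one-liner; the 60-second window below is just an example value:

// Ask the primary to step down and not seek re-election for 60 seconds,
// letting the set elect a new primary while drivers fail over gracefully.
rs.stepDown(60);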

Conclusion

I hope you have found some of this insightful. Look for future posts from the MongoDB team covering other areas we like to look at (or other parts of the system) when helping ourselves and our clients get to the root of an issue.

PMM Server to monitor Solaris database client

Latest Forum Posts - December 5, 2016 - 10:04am
Hi, I came across the PMM monitoring server. We run most of our databases on Solaris, and I understand there is no PMM client for Solaris. I want to know whether I can use PMM Server to monitor our Solaris MySQL databases. We can monitor the OS metrics with another tool, and all I need is to monitor the MySQL metrics. Is PMM Server itself sufficient? Of course, PMM Server would be installed on a RH machine, monitoring the Solaris server remotely.

Adding server by hostname instead of IP ok?

Latest Forum Posts - December 4, 2016 - 10:28pm
Is using a hostname supported instead of an IP? We have a mesos/marathon deployment, so the server container does not have a dedicated IP; rather, it is routed via haproxy. I can browse to the server container fine, but when I try to add it via the client I get:

Unable to connect to PMM server by address: pmm-server.prdmesos.<redacted>.com


Even though the server is reachable it does not look to be PMM server.
Check if the configured address is correct.

I can also ping the address pmm-server.prdmesos.<redacted>.com:

$ ping pmm-server.prdmesos.<redacted>.com
PING pmm-server.prdmesos.<redacted>.com (10.1.202.165) 56(84) bytes of data.

Any ideas?


