
Feed aggregator

No backup speed increase using more memory

Latest Forum Posts - December 8, 2014 - 1:51am
Hi

I was trying to run a backup of a single database using:

innobackupex --no-timestamp --databases="databasename" /backup/mysql

If I am not wrong, that uses the default 100MB memory value, and the backup takes 18 minutes to finish.

Then I ran it using:

innobackupex --no-timestamp --use-memory=8G --databases="databasename" /backup/mysql

and it takes exactly the same time: 18 minutes.

Why don't I get any backup speed increase?

Thanks

Cluster nodes keep crashing

Latest Forum Posts - December 7, 2014 - 12:43am
I have a 3-node cluster. At some point it went bad, so after that it was started again, with one node being started with the 'bootstrap-pxc' option, and the two others just syncing to it.
Now I have two nodes crashing all the time, while the node which was started first is always ok.

On the crashing nodes, in the log I see the following:

2014-12-04 14:58:05 16834 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 204996287)
2014-12-04 14:58:05 16834 [Warning] WSREP: RBR event 3 Delete_rows apply warning: 1615, 204996287
2014-12-04 14:58:05 16834 [Warning] WSREP: failed to replay trx: source: 8c76e5c3-7b99-11e4-b260-4be15515cbc1 version: 3 local: 1 state: REPLAYING flags: 1 conn_id: 43055 trx_id: 1783377670 seqnos (l: 80620, g: 204996287, s: 204996283, d: 204996171, ts: 21382320446607328)
2014-12-04 14:58:05 16834 [Warning] WSREP: Failed to apply trx 204996287 4 times
2014-12-04 14:58:05 16834 [ERROR] WSREP: trx_replay failed for: 6, query: void
2014-12-04 14:58:05 16834 [ERROR] Aborting

This error almost always shows 'Delete_rows apply warning', however the general log shows no delete queries at all.
The average life of a crashing node is about 15-20 minutes; however, when I run it inside gdb trying to catch the problem, it works for days without any problems, which makes the situation even more strange.

Percona Agent Proxy Support

Latest Forum Posts - December 5, 2014 - 11:01am
I receive this error while installing the Percona Agent:
Error: API connection timeout (10s): Get https://v2-cloud-api.percona.com/ping: dial tcp 162.220.4.204:443: i/o timeout

We can't get out on 443 without using our proxy. Can I install the agent somehow so that it uses the proxy?

Streamlined Percona XtraDB Cluster (or anything) testing with Consul and Vagrant

Latest MySQL Performance Blog posts - December 5, 2014 - 4:00am
Introducing Consul

I’m always interested in what Mitchell Hashimoto and Hashicorp are up to, I typically find their projects valuable.  If you’ve heard of Vagrant, you know their work.

I recently became interested in a newer project they have called ‘Consul’.  Consul is a bit hard to describe.  It is (in part):

  • Highly consistent metadata store (a bit like Zookeeper)
  • A monitoring system (a lightweight Nagios)
  • A service discovery system, both DNS and HTTP-based (think of something like haproxy, but instead of TCP load balancing, it provides DNS lookups that return only healthy services)
What this has to do with Percona XtraDB Cluster

I’ve had some more complex testing for Percona XtraDB Cluster (PXC) to do on my plate for quite a while, and I started to explore Consul as a tool to help with this.  I already have Vagrant setups for PXC, but ensuring all the nodes are healthy, kicking off tests, gathering results, etc. were still difficult.

So, my loose goals for Consul are:

  • A single dashboard to ensure my testing environment is healthy
  • Ability to adapt to any size environment — 3 node clusters up to 20+
  • Coordinate starting and stopping load tests running on any number of test clients
  • Have the ability to collect distributed test results

I’ve succeeded on some of these fronts with a Vagrant environment I’ve been working on. This spins up:

  • A Consul cluster (default is a single node)
  • Test server(s)
  • A PXC cluster

Additionally, it integrates the Test servers and PXC nodes with Consul such that:

  • The servers set up a Consul agent in client mode pointed at the Consul cluster
  • Additionally, they set up a local DNS forwarder that sends all DNS requests for the ‘.consul’ domain to the local agent, to be serviced by the Consul cluster (roughly like the dnsmasq sketch after this list)
  • The servers register services with Consul that run local health checks
  • The test server(s) set up a ‘watch’ in Consul that waits for a Consul ‘event’ to start sysbench.
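
For illustration, the DNS forwarding piece can be as small as a one-line dnsmasq rule (a sketch, assuming dnsmasq as the forwarder and the Consul agent's default DNS port of 8600; the exact provisioning in my Vagrantfile may differ):

# /etc/dnsmasq.d/10-consul (hypothetical path)
# Forward every query under the .consul domain to the local Consul agent
server=/consul/127.0.0.1#8600

All other lookups continue to go to the normal upstream resolvers, so only the ‘.consul’ namespace is serviced by Consul.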
Seeing it in action

Once I run my ‘vagrant up’, I get a consul UI I can connect to on my localhost at port 8501:

Consul’s Node Overview

I can see all 5 of my nodes.  I can check the services and see that test1 is failing one health check because sysbench isn’t running yet:

Consul reporting sysbench is not running.

This is expected, because I haven’t started testing yet. I can see that my PXC cluster is healthy:

Health checks are using clustercheck from the PXC package


Involving Percona Cloud Tools in the system

So far, so good.  This Vagrant configuration (if I provide a PERCONA_AGENT_API_KEY in my environment) also registers my test servers with Percona Cloud Tools, so I can see data being reported there for my nodes:

Percona Cloud Tool’s Dashboard for a single node

So now I am ready to begin my test.  To do so, I simply need to issue a consul event from any of the nodes:

jayj@~/Src/pxc_consul [507]$ vagrant ssh consul1
Last login: Wed Nov 26 14:32:38 2014 from 10.0.2.2
[root@consul1 ~]# consul event -name='sysbench_update_index'
Event ID: 7c8aab42-fd2e-de6c-cb0c-1de31c02ce95

My pre-configured watchers on my test node know what to do with that event and launch sysbench, and Consul confirms that sysbench is indeed running.
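
The watch itself lives in the Consul agent’s configuration on the test servers; here is a minimal sketch of the idea (the handler script path is an assumption for illustration, not necessarily what my Vagrant provisioning installs):

{
  "watches": [
    {
      "type": "event",
      "name": "sysbench_update_index",
      "handler": "/usr/local/bin/start_sysbench.sh"
    }
  ]
}

When the ‘sysbench_update_index’ event fires, the agent invokes the handler, which kicks off the sysbench run.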

And I can indeed see traffic start to come in on Percona Cloud Tools.

I have testing traffic limited for my example, but that’s easily tunable via the Vagrantfile.  To show something a little more impressive, here’s a 5-node cluster hitting around 2500 tps total throughput.

So to summarize thus far:
  • I can spin up any size cluster I want and verify it is healthy with Consul’s UI
  • I can spin up any number of test servers and kick off sysbench on all of them simultaneously
Another big trick of Consul’s

So far so good, but let me point out a few things that may not be obvious.  If you check the Vagrantfile, I use a Consul hostname in a few places.  First, on the test servers:

# sysbench setup
'tables' => 1,
'rows' => 1000000,
'threads' => 4 * pxc_nodes,
'tx_rate' => 10,
'mysql_host' => 'pxc.service.consul'

then again on the PXC server configuration:

# PXC setup
"percona_server_version" => pxc_version,
'innodb_buffer_pool_size' => '1G',
'innodb_log_file_size' => '1G',
'innodb_flush_log_at_trx_commit' => '0',
'pxc_bootstrap_node' => (i == 1 ? true : false),
'wsrep_cluster_address' => 'gcomm://pxc.service.consul',
'wsrep_provider_options' => 'gcache.size=2G; gcs.fc_limit=1024',

Notice ‘pxc.service.consul’.  This hostname is provided by Consul and resolves to all the IPs of the current servers both having and passing the ‘pxc’ service health check:

[root@test1 ~]# host pxc.service.consul
pxc.service.consul has address 172.28.128.7
pxc.service.consul has address 172.28.128.6
pxc.service.consul has address 172.28.128.5

So I am using this to my advantage in two ways:

  1. My PXC cluster bootstraps the first node automatically, but all the other nodes use this hostname for their wsrep_cluster_address.  This means: no specific hostnames or IPs in the my.cnf file, and this hostname will always be up to date with what nodes are active in the cluster, which is precisely the list that should be in the wsrep_cluster_address at any given moment.
  2. My test servers connect to this hostname, therefore they always know where to connect, and they will round-robin (if I have enough sysbench threads and PXC nodes) to different nodes based on the response of the DNS lookup, which returns 3 of the active nodes in a different order each time.
(Some of) The Issues

This is still a work in progress and there are many improvements that could be made:

  • I’m relying on PCT to collect my data, but it’d be nice to utilize Consul’s central key/value store to store results of the independent sysbench runs.
  • Consul’s leader election could be used to help the cluster determine which node should bootstrap on first startup. I am assuming node1 should bootstrap.
  • A variety of bugs in various software still make this sometimes a bit clunky to manage.  Here is a sample:
    • Consul events sometimes don’t fire in the current release (though it looks to be fixed soon)
    • PXC joining nodes sometimes get stuck, putting speed bumps into the automated deploy.
    • Automated installs of percona-agent (which sends data to Percona Cloud Tools) are straightforward, except when different cluster nodes clobber each other’s credentials.

So, in summary, I am happy with how easily Consul integrates and I’m already finding it useful for a product in its 0.4.1 release.

The post Streamlined Percona XtraDB Cluster (or anything) testing with Consul and Vagrant appeared first on MySQL Performance Blog.

MySQL and OpenStack deep dive: Dec. 10 webinar

Latest MySQL Performance Blog posts - December 4, 2014 - 1:59pm

Fact: MySQL is the most commonly used database in OpenStack deployments. Of course that includes a number of MySQL variants – standard MySQL by Oracle, MariaDB, Percona Server, MySQL Galera, Percona XtraDB Cluster, etc.

However, there are many misconceptions and myths around the pros and cons of these MySQL flavors. Join me and my friend Jay Pipes of Mirantis next Wednesday (Dec. 10) at 10 a.m. Pacific and we’ll dispel some of these myths and provide a clearer picture of the strengths and weaknesses of each of these flavors.

This free Percona webinar, titled “MySQL and OpenStack Deep Dive,” will also illuminate the pitfalls to avoid when migrating between MySQL flavors – and what architectural information to take into account when planning your OpenStack MySQL database deployments.

We’ll also discuss replication topologies and techniques, and explain how the Galera Cluster variants differ from standard MySQL replication.

Finally, in the latter part of the session, we’ll take a deep dive into MySQL database performance analysis, diving into the results of a Rally run showing a typical Nova workload. In addition, we’ll use Percona Toolkit’s famed pt-query-digest tool to determine if a synchronously replicated database cluster like the free Percona XtraDB Cluster is a good fit for certain OpenStack projects.

The webinar is free but I encourage you to register now to reserve your spot. See you Dec. 10! In the meantime, learn more about the new annual OpenStack Live Conference and Expo which debuts April 13-14 in the heart of Silicon Valley. If you register now you’ll save with Early Bird pricing. However, one lucky webinar attendee will win a full pass! So be sure to register for next week’s webinar now for your chance to win! (Click here!) The winner will be announced at the end of the webinar.

The post MySQL and OpenStack deep dive: Dec. 10 webinar appeared first on MySQL Performance Blog.

IST or SST interrupted

Latest Forum Posts - December 4, 2014 - 6:24am
Hi,

I have a 3-node cluster, each node is in a different datacenter.

Sometimes connectivity is intermittent between two nodes. When this happens, one node goes into recovery while a second becomes the donor. The remaining node stays online as expected.

The problem is that, if the connectivity breaks again while the donor is repairing the failed node, the 3 nodes seem to hang in their respective states. That is, the donor never leaves donor mode, and the failed node is never recovered.

The only solution is to kill the donor manually (a 'mysql stop' just hangs) and then recover the donor first, but that means the entire cluster is effectively down, as the last remaining node then becomes a donor itself. Obviously this is not ideal.

So, the question is: what is the expected behaviour under these conditions? I would expect that if a recovery is interrupted, the donor would bring itself back online, and the recovery of the failed node would begin again when connectivity is restored. Is that correct?

Sneak peek at the Percona Live MySQL Conference & Expo 2015

Latest MySQL Performance Blog posts - December 4, 2014 - 5:55am

You know you’ll be there so why not save some $$ by registering now for the Percona Live MySQL Conference & Expo 2015 (April 13-16 in Santa Clara, Calif.). Super Saver registration discounts are available through Dec. 14 at 11:30 p.m. PST. (That’s just 10 days away!)

What to expect this year? The Percona Live 2015 conference committee is putting together another fantastic event for the global MySQL community’s next meeting in April. The full conference agenda will be announced in January, but the initial roster includes:

  • Sunny Bains, Senior Engineering Manager at Oracle; “InnoDB 5.7- What’s New”
  • Yoshinori Matsunobu, Database Engineer at Facebook; “Fast Master Failover Without Data Loss”
  • Jeremy Cole, Senior Systems Engineer at Google, Inc.; “Exploring Your Data With InnoDB Explorer”
  • Tom Krouper, Staff Database Administrator at Twitter; “Upgrading to MySQL 5.6 @ Scale”
  • Jenni Snyder, Database Administrator at Yelp; “Schema changes multiple times a day? OK!”
  • Ike Walker, Database Architect at Flite; “Assembling the Perfect MySQL Toolbox”
  • Jean-François Gagné, Senior System Engineer/Architect at Booking.com; “Binlog Servers at Booking.com”
  • Jeremy Glick, Lead DBA at MyDBAteam, and Andrew Moore, MySQL DBA at Percona; “Using MySQL Audit Plugins and Elasticsearch ELK”
  • Tomáš Komenda, Team Lead and Database Specialist, and Lukáš Putna, Senior Developer and Database Specialist at Seznam.cz; “MySQL and HBase Ecosystem for Real-time Big Data Overviews”
  • Alexander Rubin, Principal Consultant at Percona; “Advanced MySQL Query Tuning”

And while the call for papers deadline has expired, there are still sponsorship opportunities available for the world’s largest annual MySQL event. Sponsors become a part of a dynamic and growing ecosystem and interact with more than 1,000 DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solutions vendors, and entrepreneurs who attend the event.

Current sponsors include:

  • Diamond Plus: VMware
  • Gold: Codership, Pythian
  • Silver: Box, SpringbokSQL, Yelp
  • Exhibit Only: FoundationDB, Severalnines, Tokutek, VividCortex
  • Other Sponsors: MailChimp
  • Media Sponsor: Database Trends & Applications, Datanami, InfoQ, Linux Journal, O’Reilly Media

Percona Live 2015 will feature a variety of formal tracks and sessions related to High Availability, DevOps, Programming, Performance Optimization, Replication and Backup, MySQL in the Cloud, MySQL and NoSQL, MySQL Case Studies, Security, and What’s New in MySQL.

As usual the conference will be held in the heart of Silicon Valley at the Hyatt Regency Santa Clara and Santa Clara Convention Center. But this year Percona has also unveiled OpenStack Live 2015, a new conference that will run in parallel with Percona Live MySQL Conference & Expo 2015 on April 13 and 14.

And don’t forget, Super Saver registration discounts are available through Dec. 14 at 11:30 p.m. PST. I hope to see you in Santa Clara!

The post Sneak peek at the Percona Live MySQL Conference & Expo 2015 appeared first on MySQL Performance Blog.

mysql with pool of threads : explanation on execution priority strategy

Latest Forum Posts - December 4, 2014 - 5:28am
Hi percona members,

I have one production server with Percona-Server 56-5.6.20-rel68

my configuration is simple :
<my.cnf>
[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
#skip_bdb
key_buffer_size = 32M
max_allowed_packet = 32M
sort_buffer_size = 4M
read_buffer_size = 4M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 12
query_cache_size =256M
query_cache_limit = 8M
wait_timeout = 604800
max_connections = 1000
tmp_table_size = 64M
max_heap_table_size = 64M
thread_concurrency = 8
lower-case-table-names=1
thread_handling=pool-of-threads
thread_pool_size=8
datadir = /var/lib/mysql
innodb_data_file_path = ibdata1:10M:autoextend
innodb_buffer_pool_size = 3G
innodb_log_buffer_size = 8M
innodb_log_file_size = 256M
innodb_log_files_in_group = 3
innodb_lock_wait_timeout = 120
innodb_flush_log_at_trx_commit = 2
innodb_lru_scan_depth=2500
innodb_flush_neighbors=0
innodb_io_capacity = 2500
innodb_io_capacity_max= 5000
innodb_file_format = Barracuda
innodb_checksum_algorithm = crc32
innodb_file_per_table = true
innodb_doublewrite=1
innodb_flush_method=O_DIRECT_NO_FSYNC
</my.cnf>

so, I use pool-of-threads with many default values like thread_pool_high_prio_mode = transactions.
At 99%, I'm really happy with this implementation (I migrated from MySQL 5.1 two months ago), but I encounter a strange behaviour
with one batch that creates short bottlenecks on all applications. Once per hour, this batch makes many requests (500/1000 rqs) during 30/40s.
These requests are simple selects with a where clause on the primary key, each returning one row.

<example rqs>
# Schema: tatex_agence Last_errno: 0 Killed: 0
# Query_time: 0.000204 Lock_time: 0.000113 Rows_sent: 1 Rows_examined: 1 Rows_affected: 0
# Bytes_sent: 2979 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
# InnoDB_trx_id: 8CFDA4B9
# QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: No Filesort_on_disk: No Merge_passes: 0
# InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 1
</example rqs>

What I don't understand is that while this batch runs, other clients of the database can't execute requests like
inserts or complex selects; only simple selects (primary key in the where clause, no join, no temp, no sort, ...) get executed.
So, according to me, as I have no lock on the table and no cpu/ram congestion... my only explanation is that
thread_pool_high_prio_mode = transactions in this case is too strict and only processes "open transactions" and
simple requests.
If my analysis is good, I can put in place some workarounds like:
- manage transactions / commits on the batch to force new transactions
- just add a small sleep between all the selects
- make one big select with an in (key1, key2) in the where clause (see the sketch below)

But I'm interested to know if my analysis is good and whether there is a pure mysql tuning solution?
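
To illustrate that third workaround, the idea is to collapse a burst of single-row lookups into one statement (table and column names here are invented for the example):

-- instead of 500/1000 of these in a burst:
SELECT * FROM orders WHERE id = 42;
SELECT * FROM orders WHERE id = 43;

-- one batched lookup:
SELECT * FROM orders WHERE id IN (42, 43, 57, 91);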

br

help me with update records in mysql database

Latest Forum Posts - December 3, 2014 - 7:46pm
I have a problem.
I've been trying for a long time to make an update in PHP/MySQL to change the data,
but every time I do not manage to make it work with a form.
It only works if I hardcode the values, like this: ($sql = "UPDATE users SET username = 'value' WHERE id = 10";)
So it only works when I put in the value and the id myself.

But I want an HTML form where I can indicate what I want to change and which id it applies to.

I have tried for so long that I've lost heart, so I would like someone to help me.

Please make the same database with the same records as mine, write the code, test that it works, and show me.

my database name: webtest
my table called: users
my records are called:

id, int(11), AUTO_INCREMENT
username, varchar(255)
password, varchar(255)
first_name, varchar(255)
last_name, varchar(255)
email, varchar(255)
Age, int(11)

Look, my update.php is like this now:
<?php
$servername = "localhost";
$username = "root";
$password = ".....";
$dbname = "webtest";


$conn = new mysqli($servername, $username, $password, $dbname);

if ($conn->connect_error) {
die("Connection failed: " . $conn->connect_error);
}

$sql = "UPDATE users SET password='cotton candy' WHERE id=10";

if ($conn->query($sql) === TRUE) {
echo "Record updated successfully";
} else {
echo "Error updating record: " . $conn->error;
}

$conn->close();
?>
But now I still have to go into the PHP file to change the value or the id.
I looked on sites and YouTube for how to put it in a simple HTML form,
but it still does not work.

I want it in an HTML form.
I want that when I enter the ID, the data of the user appears and I can change any value separately.

Please help me out!!
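
For reference, a minimal sketch of a form-driven version using mysqli prepared statements (the form field names "id" and "username" are assumptions):

<form action="update.php" method="post">
  ID: <input type="number" name="id">
  New username: <input type="text" name="username">
  <input type="submit" value="Update">
</form>

<?php
// update.php, receiving the values from the HTML form via POST
$conn = new mysqli("localhost", "root", ".....", "webtest");
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
}

// Values come from the form instead of being hardcoded
$id = (int) $_POST['id'];
$new_username = $_POST['username'];

// A prepared statement keeps the form input out of the SQL string
$stmt = $conn->prepare("UPDATE users SET username = ? WHERE id = ?");
$stmt->bind_param("si", $new_username, $id);

if ($stmt->execute()) {
    echo "Record updated successfully";
} else {
    echo "Error updating record: " . $stmt->error;
}

$stmt->close();
$conn->close();
?>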




InnoDB: Operating system error number 24 in a file operation

Latest Forum Posts - December 3, 2014 - 4:39pm
Hi Guys,

I was trying to make a MySQL backup using xtrabackup and I kept getting the following error:

7f9c33d0a720 InnoDB: Operating system error number 24 in a file operation.
InnoDB: Error number 24 means 'Too many open files'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.6/...ror-codes.html
InnoDB: Error: could not open single-table tablespace file ./optiweber2/top_keyword_urls#P#p61.ibd
InnoDB: We do not continue the crash recovery, because the table may become
InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it.
InnoDB: To fix the problem and start mysqld:
InnoDB: 1) If there is a permission problem in the file and mysqld cannot
InnoDB: open the file, you should modify the permissions.
InnoDB: 2) If the table is not needed, or you can restore it from a backup,
InnoDB: then you can remove the .ibd file, and InnoDB will do a normal
InnoDB: crash recovery and ignore that table.
InnoDB: 3) If the file system or the disk is broken, and you cannot remove
InnoDB: the .ibd file, you can set innodb_force_recovery > 0 in my.cnf
InnoDB: and force InnoDB to continue crash recovery here.
innobackupex: Error: The xtrabackup child process has died at /usr/bin/innobackupex line 2672.

However, the issue is resolved when I simply increase the 'open-files-limit' in my.cnf from

open-files-limit = 131072

to

open-files-limit = 151072

Please note, MySQL has not picked up these changes; it is merely an edit to the my.cnf file.
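
For what it's worth, the value the running server actually uses can be checked with:

mysql -e "SHOW GLOBAL VARIABLES LIKE 'open_files_limit'"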

And, my system's open file hard limit is set to 131072 as well.

~ # cat /etc/security/limits.conf

* soft nofile 131072
* hard nofile 131072

~ # ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 118674
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 131072
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 118674
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

The solution did not make sense. Posting it here to get some insight.

Thanks,
Ashim

Auto-bootstrapping an all-down cluster with Percona XtraDB Cluster

Latest MySQL Performance Blog posts - December 3, 2014 - 4:01am

A new feature in recent Percona XtraDB Cluster (PXC) releases is the ability for an existing cluster to auto-bootstrap after an all-node-down event.  Suppose you lose power on all nodes simultaneously, or something else similar happens to your cluster. Traditionally, this meant manually re-bootstrapping the cluster, but not any more.

How it works

Given the above all-down situation, if all nodes are able to restart and see each other such that they all agree on what the state was and that all nodes have returned, then the nodes will decide that it is safe for them to recover PRIMARY state as a whole.

This requires:

  • All nodes went down hard — that is: a kill -9, kernel panic, server power failure, or similar event
  • All nodes from the last PRIMARY component are restarted and are able to see each other again.
Demonstration

Suppose I have a 3-node cluster in a stable state. I then kill all nodes simultaneously (simulating a power failure or similar event):

[root@node1 ~]# killall -9 mysqld
[root@node2 ~]# killall -9 mysqld
[root@node3 ~]# killall -9 mysqld

I can see that each node maintained a state file in its datadir called ‘gvwstate.dat’. This contains the last known view of the cluster:

[root@node1 ~]# cat /var/lib/mysql/gvwstate.dat
my_uuid: 78caedfe-75a5-11e4-ac69-fb694ee06530
#vwbeg
view_id: 3 78caedfe-75a5-11e4-ac69-fb694ee06530 9
bootstrap: 0
member: 78caedfe-75a5-11e4-ac69-fb694ee06530 0
member: 87da2387-75a5-11e4-900f-ba49ecdce584 0
member: 8a25acd8-75a5-11e4-9e22-97412a1263ac 0
#vwend

This file will not exist on a node if it was shut down cleanly, only if the mysqld was uncleanly terminated. This file should exist and be the same on all the nodes for the auto-recovery to work.
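
As a quick sanity check before restarting (my suggestion, not part of the built-in mechanics), you can compare the recorded view across the nodes; the view_id and member lines should match on every node, while my_uuid is node-specific:

[root@node1 ~]# grep -E '^(view_id|member)' /var/lib/mysql/gvwstate.dat
view_id: 3 78caedfe-75a5-11e4-ac69-fb694ee06530 9
member: 78caedfe-75a5-11e4-ac69-fb694ee06530 0
member: 87da2387-75a5-11e4-900f-ba49ecdce584 0
member: 8a25acd8-75a5-11e4-9e22-97412a1263ac 0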

I can now restart all 3 nodes more or less at the same time. Note that none of these nodes are bootstrapping and all of the nodes have the wsrep_cluster_address set to a proper list of the nodes in the cluster:

[root@node1 ~]# service mysql start
[root@node2 ~]# service mysql start
[root@node3 ~]# service mysql start

I can indeed see that they all start successfully and enter the primary state:

[root@node1 ~]# mysql -e "show global status like 'wsrep_cluster%'"
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 0                                    |
| wsrep_cluster_size       | 3                                    |
| wsrep_cluster_state_uuid | 1ba6f69a-759b-11e4-89ba-62a713a26cd1 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
[root@node2 ~]# mysql -e "show global status like 'wsrep_cluster%'"
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 0                                    |
| wsrep_cluster_size       | 3                                    |
| wsrep_cluster_state_uuid | 1ba6f69a-759b-11e4-89ba-62a713a26cd1 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
[root@node3 ~]# mysql -e "show global status like 'wsrep_cluster%'"
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 0                                    |
| wsrep_cluster_size       | 3                                    |
| wsrep_cluster_state_uuid | 1ba6f69a-759b-11e4-89ba-62a713a26cd1 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+

Checking the logs, I can see this indication that the feature is working:

2014-11-26 19:59:36 1809 [Note] WSREP: promote to primary component
2014-11-26 19:59:36 1809 [Note] WSREP: view(view_id(PRIM,78caedfe,13) memb {
    78caedfe,0
    87da2387,0
    8a25acd8,0
} joined {
} left {
} partitioned {
})
2014-11-26 19:59:36 1809 [Note] WSREP: save pc into disk
2014-11-26 19:59:36 1809 [Note] WSREP: clear restored view

Changing this behavior

This feature is enabled by default, but you can toggle it off with the pc.recovery setting in the wsrep_provider_options.
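
For example, to turn it off you would put something like this in my.cnf on each node (wsrep_provider_options is a single string, so merge it with any provider options you already set):

wsrep_provider_options = "pc.recovery=false"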

This feature helps cover an edge case where manual bootstrapping was necessary in the past to recover properly. This feature was added in Percona XtraDB Cluster version 5.6.19, but was broken due to this bug.  It was fixed in PXC 5.6.21.

The post Auto-bootstrapping an all-down cluster with Percona XtraDB Cluster appeared first on MySQL Performance Blog.

How to integrate handlersocket with xtradbcluster

Latest Forum Posts - December 3, 2014 - 1:10am
I have a setup with five machines (keepalived + haproxy + xtradbcluster), and it works well.
I want to integrate handlersocket with xtradbcluster.
I have installed handlersocket in xtradbcluster and I am sure it works on a single node!
I have modified the file haproxy.cfg as follows:


listen mysql-cluster 0.0.0.0:33306
mode tcp
balance roundrobin
option httpchk
server 192.168.10.131 192.168.10.131:9998 check port 9200 inter 12000 rise 3 fall 3
server 192.168.10.132 192.168.10.132:9998 check port 9200 inter 12000 rise 3 fall 3
server 192.168.10.143 192.168.10.143:9998 check port 9200 inter 12000 rise 3 fall 3

listen mysql-cluster-failover :43306
mode tcp
balance leastconn
option httpchk
server 192.168.10.131 192.168.10.131:9999 check port 9200
server 192.168.10.132 192.168.10.132:9999 check port 9200 backup
server 192.168.10.143 192.168.10.143:9999 check port 9200 backup




Do I need some modification in xtradbcluster( with 3 nodes)?

I found another problem, as follows:
https://bugs.launchpad.net/percona-x...r/+bug/1375897
How can I fix it?
I am looking forward to your reply.
Many thanks!

Best way for search tags

Latest Forum Posts - December 3, 2014 - 12:52am
I have one string(255) column filled by one word (a tag name), for example radio or music. What is the best way (best search result and search time) to find tags? The user can input "ra" or "rad" and the word radio must be present in the results. Thank you.

Is the Memcache extension baked in like Handlersocket was?

Latest Forum Posts - December 2, 2014 - 1:12pm
We've been invested in the Handlersocket capabilities in Percona for a while, but it seems with Oracle deciding to prop up memcache instead, the delayed integration of Handlersocket by Percona in 5.6 until just recently, and the seemingly low profile it's getting everywhere else, it just feels like the memcache route will win out in the end. Correct me if I'm wrong!

Our application is abstracted from the DB enough that it won't be too much work to write a new memcache layer to replace the handlersocket one but I'm having a hard time finding information on the memcache plugin beyond the interface itself. My biggest question is: is the memcache extension for mysql baked in like handlersocket was?

Thanks,

Kevin.

Error The server quit without updating PID file

Latest Forum Posts - December 2, 2014 - 7:29am
Hello,
I have Percona XtraDB Cluster 5.6 with Ubuntu Server 14.04.1 LTS.
I have 3 nodes. Everything worked perfectly for 5 months.
Today I found all 3 nodes with the mysql server stopped.
When I try to start the mysql service I receive this error:

* Starting MySQL (Percona XtraDB Cluster) database server mysqld
* The server quit without updating PID file (/var/run/mysqld/mysqld.pid).

This error occurs on all nodes.
Can you help me ?

My my.cnf file has this configuration:


#
# MySQL Configuration
#

[mysql]

# CLIENT #
port = 3306
socket = /var/run/mysqld/mysqld.sock


[mysqld]

# GENERAL #
user = mysql
default_storage_engine = InnoDB
socket = /var/run/mysqld/mysqld.sock
pid-file = /var/run/mysqld/mysqld.pid
bind-address = 0.0.0.0

# MyISAM #
key-buffer-size = 32M
myisam-recover = FORCE,BACKUP


# SAFETY #
max-allowed-packet = 16M
max-connect-errors = 1000000


# DATA STORAGE #
#basedir = /usr
datadir = /var/lib/mysql
#tmpdir = /tmp


# BINARY LOGGING #
log-bin = /var/lib/mysql/mysql-bin
expire-logs-days = 3
sync-binlog = 1


# CACHES AND LIMITS #
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 500
thread-cache-size = 50
open-files-limit = 65535
table-definition-cache = 4096
table-open-cache = 4096


# INNODB #
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 256M
innodb-flush-log-at-trx-commit = 1
innodb-file-per-table = 1
innodb-buffer-pool-size = 2G


# LOGGING #
log-error = /var/lib/mysql/mysql-error.log
log-queries-not-using-indexes = 1
slow-query-log = 1
slow-query-log-file = /var/lib/mysql/mysql-slow.log


# Percona XtraDB Cluster Settings #
wsrep_provider = /usr/lib/libgalera_smm.so
wsrep_cluster_address = gcomm://10.11.20.78,10.11.20.79,10.11.20.80
binlog_format = ROW
wsrep_node_address = 10.12.20.80
wsrep_sst_method = xtrabackup-v2
wsrep_cluster_name = cluster-mysql
wsrep_sst_auth = "xxxxx"


!includedir /etc/mysql/conf.d/




Tnx
Manuel

How to sync table schema?

Latest Forum Posts - December 2, 2014 - 6:37am
Hi,

From what I read in the pt-table-sync documentation, it is not possible to sync table structure from server1 db.table to server2 db.table? Are there any workarounds or alternatives to sync the table structure?

Thanks

Mysql Incremental Backup and restored

Latest Forum Posts - December 2, 2014 - 2:52am
I want to take a MySQL incremental backup and restore it on another server. Right now I am taking the incremental backup, but it is not getting restored. Can you help, or send me detailed step-by-step instructions?

Tips from the trenches for over-extended MySQL DBAs

Latest MySQL Performance Blog posts - December 2, 2014 - 12:00am

This post is a follow-up to my November 19 webinar, “Tips from the Trenches: A Guide to Preventing Downtime for the Over-Extended DBA,” during which I described some of the most common reasons DBAs experience avoidable downtime. The session was aimed at the “over-stretched DBA,” identified as the MySQL DBA short on time, or an engineer of another discipline without deep knowledge of the MySQL system. The over-stretched DBA may be prone to making fundamental mistakes that cause downtime: poor response time, operations that cause blocking on important data, or administrative mishaps through the lack of best-practice monitoring and alerting. (You can download my slides and view the recorded webinar here.)

Monitor the things
One of the aides to keeping the system up and running is ensuring that your finger is on the pulse of the environment. Here on the Percona Managed Services team, we leverage Percona Monitoring Plugins (open source plugins for Nagios, Cacti and Zabbix) to ensure we have visibility of our client’s operations. Having a handle on basics such as disk space, memory usage and MySQL operational metrics ensures that we avoid trivial downtime that would affect the client’s uptime or worse, their bottom line.

Road Blocks
One of the most common reasons that an application is unable to serve data to its end user is that access to a table is being blocked due to another ongoing operation. This can be blamed on a variety of sources: backups, schema changes, poor configuration and long running transactions can all lend themselves to costly blocking. Understanding the impact of actions on a MySQL server can be the difference between a happy end user and a frustrated one.

During the webinar I made reference to some resources and techniques that can assist the over-extended DBA in avoiding downtime, and here are some highlights…

Monitoring and Alerting
It’s important that you have some indications that something is reaching its capacity. It might be the disk, connections to MySQL or auto_increment limit on a highly used table. There is quite the landscape to cover but here are a handful of helpful tools:
* Percona Monitoring Plugins
* Monyog
* New Relic

Query Tuning
Poorly performing SQL can be indicative that the configuration is incorrect, that there’s a missing index or that your development team needs a quick lesson on MySQL anti-patterns. Arm yourself with proof that the SQL statements are substandard using these resources and work with the source to make things more efficient:
* Percona Cloud Tools
* pt-query-digest, explain, indexes
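
As a concrete starting point, a first pass over the slow query log with pt-query-digest is a one-liner (the log path is an assumption; use whatever your slow_query_log_file is set to):

pt-query-digest /var/lib/mysql/mysql-slow.log > slow-report.txt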

High Availability
If you need to ensure that your application survives hiccups such as hardware failure or network impairment, a well deployed HA solution will give you the peace of mind that you can quickly mitigate bumps in the road.
* MHA
* Percona XtraDB Cluster, Galera
* Percona Replication Manager
* LinuxHA/Corosync/DRBD

Backups
A wise man once said: “A backup today saves you tomorrow.” Covering all bases can be the difference between recovering from a catastrophic failure and job hunting. Mixing logical, physical and incremental backups while adding in some offsite copies can provide you with the safety net in the event that a small mistake like a dropped table occurs or, worse, all working copies of data and backups are lost in a SAN failure. It happens, so be prepared.
* Percona XtraBackup
* mydumper
* mysqldump
* mysqlbinlog (5.6)
* mylvmbackup

We had some great questions from the attendees and regrettably were unable to answer them all, so here are some of them with my response.

Q: I use MySQL on Amazon RDS. Isn’t much of the operations automated or do these tips still apply?
A: It’s not completely automated. There are still challenges to address and configuration opportunities, but understanding the limitations of RDS is key. For example, the location and size of the tmpdir is something you are unable to customise on RDS. You would typically review this config in a production environment if your workload required it. Any costly queries that perform operations requiring tmp area to sort (think OLAP) might not be a good fit on RDS due to this limitation. Getting to know the limitations around hosted or DBaaS services is time well spent to avoid explaining what keeps taking the application down in peak hours.

Q: What other parts of Percona Toolkit do you recommend for MySQL operations?
A: Percona Toolkit is a well-evolved suite of tools that all MySQL DBAs should familiarize themselves with. In particular I will fit many tools into my weekly workflow:

Operations

  • pt-online-schema-change
  • pt-table-checksum
  • pt-table-sync

Troubleshooting

  • pt-stalk
  • pt-pmp
  • pt-config-diff

Knowledge Gathering

  • pt-summary
  • pt-mysql-summary
  • pt-duplicate-key-checker

The key with Percona Toolkit is that many common tasks or problems that could cause you to reinvent the wheel are covered, mature and production ready. As with any tool, you should always read the label, or in this case the documentation, so you’re well aware of what the tools can do, the risks, and the features that you can make use of.

Q: HA – are there any solutions that you would stay away from?
A: Using any particular HA solution is going to be another R&D exercise. You will need to understand the tradeoffs and configuration options and compare between products. Some might have a higher TCO or lack functionality. Once the chosen solution is implemented, it’s pertinent that the engineers understand the technology so they can troubleshoot or utilize the functionality when failover needs to be instigated. I like HA solutions that are fast to fail over to; some entail starting MySQL from cold.

Q: You mentioned having tested backups. How do you perform this?
A: Percona’s method is using a dedicated host with access to the backup files. Then with a combination of mysqlsandbox and pt-table-checksum we can discover if we trust the files we capture for disaster recovery. Many people underestimate the importance of this task.

Q: Percona Cloud Tools – how much does it cost?
A: Right now it’s a free service. Visit cloud.percona.com for more information, but in a nutshell Percona Cloud Tools is a hosted service providing access to query performance insights for all MySQL users.

Q: Is there API access to Percona Cloud Tools for application integration?
A: There is currently not a public API available. It is on the roadmap, though. We’d be interested to hear more about your use case so please sign up for the service and try it out. After signing in, all pages include a Feedback link to share your thoughts and ideas such as how you’d like to use a public API.

Q: Can you use MHA with Percona XtraDB Cluster?
A: MHA is not something that can be used with Percona XtraDB Cluster (PXC). It’s common to partner PXC with HAProxy for making sure your writes are going to the appropriate node.

Q: Can MHA make automatic failover? If MHA has automatic failover, what do you recommend? Configure it for automatic failover?
A: MHA can make an automatic failover. Personally I prefer managed failover. When working with automated failover it’s important that failback is manual to avoid “flapping.” “Splitbrain” is an ailment that you don’t want to suffer from as well and auto failover removes the human judgment from the decision to relocate operations from a failed node onto a standby node. If you are going to vote for an automatic failover it is advised to test all potential failure scenarios and to employ a STONITH method to really ensure that the unresponsive node is not serving read/write traffic.

Q: What is the best way to detect database blocking from DML statements? Is there a tool that will show blocking after the fact so you don’t have to catch it real-time?
A: Once again, Percona has a tool called pt-deadlock-logger that can detect and log deadlocks. Detecting locking can be achieved using “SHOW ENGINE INNODB STATUS” or utilizing the information_schema.innodb_locks table. Some engineering might be required for this to be logged but those resources exist for use.
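
For instance, a minimal pt-deadlock-logger run that polls a server and records anything it finds into a table might look like this (the host, schema and table names are placeholders):

pt-deadlock-logger h=localhost --dest h=localhost,D=percona,t=deadlocks --create-dest-table --run-time=24h --daemonize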

Q: Since you mentioned tinkering with ELK I was wondering if you had any tips on good Kibana dashboards to build to monitor MySQL databases/clusters?
A: ELK is something that I’m looking to publish some information on soon so watch this space!

Thanks again everyone for the great questions! And as a reminder, you can download my slides and view the recorded webinar here.

The post Tips from the trenches for over-extended MySQL DBAs appeared first on MySQL Performance Blog.
