Read performance (Percona < Master/Slave) ?

  • #1

    Hi there,

    I have a strange problem, and I don't even know whether Percona or MySQL is responsible for it, but maybe you can help me nevertheless.

    We have set up a test cluster consisting of three Percona nodes. All seems quite well, and we would like to move production from our MySQL master/slave replication setup to the cluster very soon. But we have some questions about the performance of the cluster, especially for SELECTs; write performance seems okay. According to our measurements, SELECT/SHOW statements on the test cluster take three to four times longer than on our production system.

    For example, a "SHOW TABLE STATUS;" varies between 180 and 400 seconds. On the production system it never takes more than 60 seconds.

    Our question now: Is it normal that the Percona cluster is slower than a master/slave environment? Of course there is no query cache anymore, but we expected reads to perform slightly better and rather expected some issues with write performance.


    What additionally seems rather strange to me is the following: during a "SHOW TABLE STATUS;", for example, the I/O load is very heavy... so I think the database has some problems with its data storage files (which are SAN volumes). Is there any way for me to check this?

    --> The first two of the 8 cores during the query:
    Cpu0 : 2.7%us, 1.3%sy, 0.0%ni, 11.0%id, 84.9%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu1 : 0.3%us, 0.3%sy, 0.0%ni, 33.5%id, 65.8%wa, 0.0%hi, 0.0%si, 0.0%st
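
    The top lines above already tell the story: almost all of the missing CPU time is spent in I/O wait (%wa), not user time, which points at the storage rather than the server. A small sketch to pull the %wa column out of such output (sample values hard-coded from the lines above):

```shell
# Extract the %wa (I/O wait) field from 'top'-style CPU lines.
# High %wa with low %us means the cores are stalled on disk, not computing.
top_snippet='Cpu0 : 2.7%us, 1.3%sy, 0.0%ni, 11.0%id, 84.9%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.3%us, 0.3%sy, 0.0%ni, 33.5%id, 65.8%wa, 0.0%hi, 0.0%si, 0.0%st'

echo "$top_snippet" | awk '{
  for (i = 1; i <= NF; i++)
    if ($i ~ /%wa,?$/) { sub(/%wa.*/, "", $i); print $1, "iowait:", $i "%" }
}'
# prints: Cpu0 iowait: 84.9%
#         Cpu1 iowait: 65.8%
```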


    All nodes have 8 cores and are equipped with 24 GB of memory. They all have an 8 GB ramdisk for temporary tables.
    The database (production and test) is around 70 GB and completely InnoDB. The tables are split across several files (file_per_table) on the volumes.

    The test-cluster nodes are running with "mysqld Ver 5.5.28-55-log for Linux on x86_64 (Percona XtraDB Cluster (GPL), wsrep_23.7.r3821)".

    The production environment servers are running with:
    W: "mysqld Ver 5.1.63-0+squeeze1-log for debian-linux-gnu on x86_64 ((Debian))"
    RO: "mysqld Ver 5.1.63-0+squeeze1 for debian-linux-gnu on x86_64 ((Debian))"

    All servers are running Debian 6.0.6 x64.


    On the production system, one server handles the INSERTs/UPDATEs and the results are replicated to the second server. All SELECTs (unless they are fired inside a write transaction, of course) go against the slave.

    On the new Percona cluster we have one node responsible for handling the INSERTs/UPDATEs (and the transactions). All three nodes, on the other hand, can handle simple SELECTs; these are distributed to all of them via haproxy round-robin. (For the performance tests, haproxy was of course omitted entirely; all queries ran directly against the servers.)
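
    For reference, a round-robin haproxy frontend for the three read nodes might look like the fragment below. Only 192.168.0.151 and 192.168.0.152 appear in the configs in this thread, so node3's IP and the check user are made up for illustration; this is a sketch, not the actual configuration.

```haproxy
listen mysql-read
    bind 0.0.0.0:3307
    mode tcp
    balance roundrobin
    option mysql-check user haproxy_check   # check user is hypothetical
    server node1 192.168.0.151:3306 check
    server node2 192.168.0.152:3306 check
    server node3 192.168.0.153:3306 check   # third node IP assumed
```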


    The following my.cnf is the same for all nodes of the Percona test-cluster (only comments removed):

    [client]
    port = 3306
    socket = /var/run/mysqld/mysqld.sock

    [mysqld_safe]
    socket = /var/run/mysqld/mysqld.sock
    nice = 0
    log-error = /var/log/mysql/mysql.err

    [mysqld]
    user = mysql
    pid-file = /var/run/mysqld/mysqld.pid
    socket = /var/run/mysqld/mysqld.sock
    port = 3306
    basedir = /usr
    datadir = /var/lib/mysql
    tmpdir = /var/lib/ramdisk
    skip-external-locking
    bind-address = 192.168.0.151
    key_buffer_size = 64M
    max_allowed_packet = 1024M
    max_connect_errors = 1000000
    thread_stack = 192K
    thread_cache_size = 50
    myisam_recover = BACKUP
    tmp_table_size = 128M
    max_heap_table_size = 128M
    query_cache_type = 0
    query_cache_size = 0
    max_connections = 500
    open_files_limit = 65535
    table_definition_cache = 4096
    table_open_cache = 8192
    gdb
    wsrep_provider = /usr/lib64/libgalera_smm.so
    #wsrep_provider_options = "gmcast.listen_addr=tcp://192.168.0.151;gcache.size=1G"
    #wsrep_cluster_address = gcomm://
    wsrep_provider_options = "gmcast.listen_addr=tcp://0.0.0.0;gcache.size=1G"
    wsrep_cluster_address = gcomm://192.168.0.152
    wsrep_sst_receive_address = 192.168.0.151
    wsrep_slave_threads = 16
    wsrep_sst_auth = userass
    wsrep_sst_method = xtrabackup
    wsrep_cluster_name = dbclus01
    wsrep_node_name = node1
    wsrep_node_address = 192.168.0.151
    binlog_format = ROW
    default_storage_engine = InnoDB
    innodb_autoinc_lock_mode = 2
    innodb_file_per_table = 1
    innodb_log_buffer_size = 8M
    innodb_log_files_in_group = 4
    innodb_log_file_size = 256M
    innodb_buffer_pool_size = 10G
    innodb_flush_method = O_DIRECT
    innodb_flush_log_at_trx_commit = 0
    expire_logs_days = 10
    slow_query_log = 1
    slow_query_log_file = /var/log/mysql/mysql-slow.log

    [mysqldump]
    quick
    quote-names
    max_allowed_packet = 16M

    [mysql]

    [isamchk]
    key_buffer = 16M




    The Master node of the production environment:

    [client]
    port = 3306
    socket = /var/run/mysqld/mysqld.sock

    [mysqld_safe]
    socket = /var/run/mysqld/mysqld.sock
    nice = 0

    [mysqld]
    user = mysql
    pid-file = /var/run/mysqld/mysqld.pid
    socket = /var/run/mysqld/mysqld.sock
    port = 3306
    basedir = /usr
    datadir = /var/lib/mysql
    tmpdir = /var/lib/ramdisk
    language = /usr/share/mysql/english
    skip-external-locking
    key_buffer = 2M
    max_allowed_packet = 16M
    thread_stack = 192K
    thread_cache_size = 30
    myisam-recover = BACKUP
    max_connections = 300
    table_cache = 2048
    query_cache_limit = 1M
    query_cache_size = 128M
    log_slow_queries = /var/log/mysql/mysql-slow.log
    long_query_time = 5000
    server-id = 1
    binlog_format = ROW
    log_bin = /var/log/mysql/mysql-bin.log
    expire_logs_days = 10
    binlog_do_db = xyz
    innodb_log_buffer_size = 8M
    innodb_log_file_size = 6M
    innodb_buffer_pool_size = 6144M
    innodb_flush_method = O_DIRECT

    [mysqldump]
    quick
    quote-names
    max_allowed_packet = 16M

    [isamchk]
    key_buffer = 2M

    The Slave ReadOnly node of the production environment:

    [client]
    port = 3306
    socket = /var/run/mysqld/mysqld.sock

    [mysqld_safe]
    socket = /var/run/mysqld/mysqld.sock
    nice = 0

    [mysqld]
    user = mysql
    pid-file = /var/run/mysqld/mysqld.pid
    socket = /var/run/mysqld/mysqld.sock
    port = 3306
    basedir = /usr
    datadir = /var/lib/mysql
    tmpdir = /var/lib/ramdisk
    language = /usr/share/mysql/english
    skip-external-locking
    key_buffer = 512M
    max_allowed_packet = 16M
    thread_stack = 192K
    thread_cache_size = 8
    myisam-recover = BACKUP
    table_cache = 20480
    thread_concurrency = 8
    sort_buffer_size = 4M
    read_buffer_size = 8M
    read_rnd_buffer_size = 8M
    myisam_sort_buffer_size = 64M
    query_cache_size = 1600M
    server-id = 2
    master-host = 192.168.0.51
    master-user = user
    master-password = pass
    master-port = 3306
    report-host = 192.168.0.52
    expire_logs_days = 10
    max_binlog_size = 100M
    ignore_builtin_innodb
    plugin-load = innodb=ha_innodb_plugin.so
    plugin_dir = /usr/lib/mysql/plugin
    skip-slave-start
    innodb_data_home_dir = /var/lib/mysql
    innodb_data_file_path = ibdata1:10M:autoextend
    innodb_log_group_home_dir = /var/lib/mysql
    innodb_log_files_in_group = 2
    innodb_log_file_size = 6M
    innodb_buffer_pool_size = 10240M
    innodb_flush_method = O_DIRECT
    innodb_file_per_table = 1

    [mysqldump]
    quick
    quote-names
    max_allowed_packet = 16M

    [isamchk]
    key_buffer = 16M

  • #2
    We are experiencing almost the same problem. We have a Percona cluster on 3 nodes (CentOS 6, 64-bit).
    INSERTs seem okay, but SELECT statements are up to ten times slower than on comparable master/slave servers.

    "SHOW TABLE STATUS" on all 3 nodes takes around 1.8 seconds; on a comparable non-Percona MySQL server it takes around 0.08 seconds.

    Testing performance with mysqlslap takes around 21 seconds on the Percona cluster; on the MySQL server it takes around 5 seconds.

    The only notable difference in the configuration files is that the Percona cluster nodes don't have the SQL query cache enabled.

    regards,
    Klemen



    • #3
      Hi,

      we believe we have found the cause of our bad read performance: our SAN volumes were the issue, with disk latency somewhere around 40 ms. The reason was a defunct deduplication process on the volumes. Now the latency is around 10 ms during work hours, and performance is much better, only a little weaker than the production system. We think the production system benefits from the SAN cache.
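
      For anyone who wants to check disk latency like this without SAN-side tooling, one crude probe is to time synchronous writes on the affected volume. A sketch (the target path is just an example; point it at the datadir volume):

```shell
# Time 100 synchronous 4 KiB writes to estimate per-write latency.
# oflag=dsync forces each write to hit stable storage before returning,
# so the elapsed time reflects device latency rather than page-cache speed.
target=./latency_probe.tmp   # example path; use a file on the SAN volume
start=$(date +%s%N)
dd if=/dev/zero of="$target" bs=4k count=100 oflag=dsync 2>/dev/null
end=$(date +%s%N)
rm -f "$target"
echo "avg write latency: $(( (end - start) / 100 / 1000000 )) ms"
```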

      Meanwhile we are planning to take the cluster into production within the next few weeks. I hope to keep you informed...

