Announcement

Announcement Module
Collapse
No announcement yet.

Poor Galera performance

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • Poor Galera performance

    Hi,

    I'm new to Galera, however I was able to setup 3 node cluster
    (using MariaDB or Percona). My problem is that when I go from single node to Galera (for example
    single node with wsrep* in my.cnf) performance have decreased about 10 fold (without wsrep node
    1000% faster).
    What may be wrong with my Galera setup ?
    thank you,
    Sergey

  • #2
    Hi Sergey,

    Can you post your my.cnf/wsrep.cnf configurations?

    --Dave

    Comment


    • #3
      Of course. I forgot to say - this slow down only for write queries. Galera reading performance is good. In production environment I have good hardware (Xeon cpu, RAID and so on). I made experiments on Virtual machines (usual Celeron CPU, SATA HDD) and have found the same slow down at transition from stand-alone server to Galera.

      [client]
      port = 3306
      socket = /var/run/mysqld/mysqld.sock

      [mysqld_safe]
      socket = /var/run/mysqld/mysqld.sock
      #nice = 0

      [mysqld]
      #CLUSTER
      wsrep_provider=/usr/lib/galera/libgalera_smm.so
      wsrep_provider_options="gcache.size=2G; gcache.page_size=1G;gcs.fc_limit = 256; gcs.fc_factor = 0.99;"
      wsrep_slave_threads= 32
      innodb_doublewrite=1
      query_cache_size=0
      query_cache_type=0
      wsrep_node_name="Gandalf"
      #wsrep_node_incoming_address=10.0.0.1
      #wsrep_sst_receive_address=10.0.0.1
      innodb_locks_unsafe_for_binlog=1
      wsrep_cluster_address=gcomm://10.0.0.1,10.0.0.6,10.0.0.3
      binlog_format=ROW
      auto_increment_offset = 1
      auto_increment_increment = 1
      wsrep_auto_increment_control = 0
      default_storage_engine=InnoDB
      innodb_autoinc_lock_mode=2
      wsrep_node_address=10.0.0.1
      wsrep_sst_method=xtrabackup-v2
      wsrep_cluster_name=DB_CLUSTER
      wsrep_sst_auth="sstuserroot:842655"
      #wsrep_sst_method=rsync
      wsrep_replicate_myisam=1

      #GENERAL
      user = mysql
      pid-file = /var/run/mysqld/mysqld.pid
      socket = /var/run/mysqld/mysqld.sock
      port = 3306
      basedir = /usr
      datadir = /var/lib/mysql
      tmpdir = /tmp
      lc-messages-dir = /usr/share/mysql
      skip-name-resolve
      default_time_zone='+04:00'
      collation-server = utf8_general_ci
      init-connect='SET NAMES utf8;'
      #init-connect='SET GLOBAL sql_mode="";'
      #init-connect='SET SESSION sql_mode="";'
      #init-connect='SET sql_mode="";'
      default-storage-engine = InnoDB
      character-set-server = utf8
      bind-address = 0.0.0.0
      #bind-address = 127.0.0.1

      #THREADS
      thread_handling = pool-of-threads
      #thread_pool_high_prio_mode = transactions
      thread_pool_size = 16
      thread-cache-size = 400

      # MyISAM #
      key-buffer-size = 32M
      myisam-recover = FORCE,BACKUP

      # SAFETY #
      max-allowed-packet = 16M
      max-connect-errors = 1000000

      # CACHES AND LIMITS #
      tmp-table-size = 32M
      max-heap-table-size = 32M
      query-cache-type = 0
      query-cache-size = 0
      max-connections = 1000
      open-files-limit = 65535
      table-definition-cache = 4096
      table-open-cache = 1900

      # INNODB #
      innodb-log-files-in-group = 2
      innodb-log-file-size = 128M
      innodb-flush-log-at-trx-commit = 1
      innodb-file-per-table = 1
      innodb-buffer-pool-size = 32G

      # LOGGING #
      log-error = /var/log/mysql/mysql-error.log
      #log-queries-not-using-indexes = 1
      #slow-query-log = 1
      #slow-query-log-file = /var/log/mysql/mysql-slow.log

      [mysqldump]
      quick
      quote-names
      max_allowed_packet = 16M

      [mysql]
      #no-auto-rehash # faster start of mysql but no tab completition

      [isamchk]
      key_buffer = 16M

      #
      # * IMPORTANT: Additional settings that can override those from this file!
      # The files must end with '.cnf', otherwise they'll be ignored.
      #
      !includedir /etc/mysql/conf.d/

      Comment


      • #4
        Hi Sergey,

        Thanks for posting. Here's a couple of performance tips for starters:

        * wsrep_slave_threads= 32

        From the README-wsrep file, this number should be low per CPU core. To be safe, I'd say start with 2 threads per core, Use

        cat /proc/cpuinfo | grep -c '^processor\s*:'

        and multiply by 2 for a starting value. You may want to reduce this to 1 thread per core or lower.

        * wsrep_provider_options="gcache.size=2G; gcache.page_size=1G;gcs.fc_limit = 256; gcs.fc_factor = 0.99;"

        Refer to this guide http://www.fromdual.com/gcache_size_in_galera_cluster for these values.

        Also:

        - rule out a network bandwidth issue, use the iperf tool (http://iperf.sourceforge.net/) to measure the bandwidth between your nodes.

        - make sure that values aren't being overridden by configuration files in your /etc/mysql/conf.d directory.

        Comment


        • #5
          Thank you, Dave.

          All is ok on my system, including bandwith issue.
          No one can explain me why Galera so slow on write queries....
          (at least for query "insert into test.tmptable values ('test1','test2','test3')")

          Comment


          • #6
            Hi Sergey,

            Can you provide some general specifications on the write operation? table size, row length, column definition, indexes and avg/max write operations per second would help. I could put together a randgen simulation with your configuration above.

            --Dave

            Comment


            • #7
              One other question, what is your overall transaction size?

              Comment


              • #8
                Hi

                Within my environment, the most dominant tuning parameter was repl.commit_order(although we may not change this parameter)
                http://www.percona.com/doc/percona-x...l.commit_order

                This configuration seems to serialize the transaction in each replicated nodes, scalability for cuncurrent transaction may lost significantly
                This occurred even when the cluster node is just one node.

                To judge whether the problem is caused by galera cluster or not, just to turn off all the wsrep configuration(comment out or so on) and benchmark may help you.

                thank you.

                Comment


                • #9
                  Thank you very much.

                  Yes, I made benchmarks (before posting) with VMWare environment (Ubuntu and so on).
                  Expert from SeveralNines made his benchmark with sysbench (olpt), also using VMWare and he
                  have got ~400% slow down.

                  So, from my experience I can't view database clustering as good solution for performance increase.
                  For example if I will setup 4 nodes I will have performance of 1 node.
                  If I will setup for example 20 nodes I don't know value of performance I will have. May be I will have 4xperformance of 1 node,
                  but may be less than 4x.

                  Comment


                  • #10
                    Hi

                    * scaling for read traffic is very effective and easy(this almost depends on CPU resource)
                    * scaling for write seems difficult and I also need the effective and simple strategy.

                    thank you

                    Comment


                    • #11
                      In most scenarios, moving from a virtualized environment to dedicated hardware would be a logical first scaling step prior to considering multi-master replication. So a performance benchmark of a pxc/galera multi-master virtualized environment may not be practical. Aside from overall performace scaling, there is also the high availability use case in which vitalization may play a role. That being said, there are some tuning steps regarding threading and transaction size that may also be effecting your test results.

                      If you're running a PXC cluster as a VM guest you should definitely reduce your slave threads to match 2x max cores (or lower) defined in the guest. Also, if you are running very large write transactions such as multiple inserts, updates or LOAD DATA there is a known issue. While there's no defined maximum regarding transaction size, a full writeset is contained in a single buffer in memory which can cause serious performance issues involving large memory allocations and slave transfers. Watch memory utilization on the guest receiving the initial transaction from the application using top to see if there's a large memory allocation and/or swapping occurring.

                      There are some tuning parameters for this case. If a large writeset is created by a LOAD DATA, you can use wsrep_load_data_splitting=on to reduce resident memory usage. Also, you can define wsrep_max_ws_rows and wsrep_max_ws_size which will set limits on the transaction size. These parameters will cause replication rejections which will be logged. If they are, you need to modify your code to split up the apps large writesets.
                      Last edited by dbennett455; 08-10-2014, 01:55 PM.

                      Comment


                      • #12
                        Thank you. It is very interesting.

                        Comment

                        Working...
                        X