Circular replication between mysql server and xtradb cluster

  • Circular replication between mysql server and xtradb cluster

    Hi,

    I have 2 locations: a corporate office, which has a server with mysql 5.6.17 installed. This houses the UI where users enter data so that the website can be updated. It also holds all the data, about 1TB so far, so that reports can be generated.

    The other location is on AWS. It's a percona cluster with mysql 5.6.15-56. This holds data about visitors and what they do on the site. We currently get 20-50 million clicks a day, and instead of holding all of that data on AWS and racking up major $$$ in HD space, every day the data is transferred from the AWS DB to the corporate office DB. At midnight, the DB on the AWS server is cleared. That's why only a small DB needs to be on the AWS instances of the cluster.

    I have set up circular replication between the corporate office and the master of the xtradb cluster. Replication between the two seems to work fine, and replication within the cluster also seems to work fine.
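
    For completeness, this is roughly how the two masters point at each other (just a sketch; the hosts, replication user and log coordinates below are placeholders, not my real values):
    Code:
    -- On the corporate office server (Site A), pointing at cluster node 1 (Site B):
    CHANGE MASTER TO
      MASTER_HOST='10.0.1.153',           -- cluster node 1 (placeholder)
      MASTER_USER='repl',                 -- replication user (placeholder)
      MASTER_PASSWORD='********',
      MASTER_LOG_FILE='mysql-bin.000001', -- placeholder coordinates
      MASTER_LOG_POS=4;
    START SLAVE;

    -- On cluster node 1 (Site B), pointing back at the corporate office server:
    CHANGE MASTER TO
      MASTER_HOST='corp-db.example.com',  -- Site A server (placeholder)
      MASTER_USER='repl',
      MASTER_PASSWORD='********',
      MASTER_LOG_FILE='mysql-bin.000001',
      MASTER_LOG_POS=4;
    START SLAVE;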

    However, it gets sticky when I do an insert on another node in the cluster, which then gets replicated to the master in the cluster and then over to the corporate server. For some reason the row then tries to get replicated back to the master in the cluster, and I get this:
    Last_SQL_Error: Could not execute Write_rows event on table tracking.foo_B; Duplicate entry '2' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000001, end_log_pos 488
    But when I do an insert at the Master of the cluster, it gets replicated with no problems.
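
    To get the slave moving again after each occurrence I can skip the offending event, roughly like this (standard MySQL 5.6 without GTIDs; it only unblocks the slave, it does not fix the loop):
    Code:
    -- on the stuck slave
    STOP SLAVE;
    SET GLOBAL sql_slave_skip_counter = 1;  -- skip the one duplicate-key event
    START SLAVE;
    -- then check that Last_SQL_Error is cleared
    SHOW SLAVE STATUS\G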

    So let me break this down:

    - 1 mysql instance, Site A.
    - 1 xtradb cluster, Site B.
    - Circular replication between the mysql instance and master of the xtradb cluster.
    - With an insert on each master (site A or site B), no issues. Data gets replicated to all nodes.
    - With an insert on another server of the xtradb cluster (node 2, node 3), the data gets replicated to all nodes. The xtradb master replicates to its slave (site A). Yet the master (site A) then replicates back to the slave (site B) and I get the duplicate entry error.

    Another thing to be aware of:

    Tables from Site A get replicated one way ---> Site B.
    Tables from Site B get replicated one way ---> Site A.

    There are no cross transactions on the same table. Site A <----> Site B.

    I have tried putting in replicate-ignore-table, which I thought would solve the issue. However, when it's placed on the master of the cluster and a transaction occurs on another node, that transaction does not get replicated on the primary node.
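
    For reference, this is the kind of filter I mean (a sketch only; the table names are just examples, not my full list):
    Code:
    ## replication filters on the slave side (my.cnf) - example table names
    replicate-ignore-table      = tracking.foo_B
    ## or with a pattern:
    replicate-wild-ignore-table = tracking.foo_%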

    Site A config:
    Code:
    ## General
    ignore-db-dir                        = lost+found
    datadir                              = /var/lib/mysql
    socket                               = /var/lib/mysql/mysql.sock
    tmpdir                               = /var/lib/mysqltmp
    skip-name-resolve
    
    ## Cache
    #table-cache                         = 150
    table-definition-cache               = 4096
    table-open-cache                     = 4096
    #table-open-cache-instances          = 1
    thread-cache-size                    = 150
    query-cache-type                     = off
    query-cache-size                     = 0
    query-cache-limit                    = 0
    
    ## Per-thread Buffers
    #join-buffer-size                    = 512K
    #read-buffer-size                    = 512K
    #read-rnd-buffer-size                = 512K
    #sort-buffer-size                    = 512K
    
    ## Temp Tables
    #max-heap-table-size                 = 64M
    #tmp-table-size                      = 32M
    
    ## Networking
    #interactive-timeout                 = 3600
    max-connections                      = 400
    max-connect-errors                   = 1000000
    max-allowed-packet                   = 16M
    skip-name-resolve
    wait-timeout                         = 600
    
    ## InnoDB
    #innodb-buffer-pool-size             = 256M
    innodb-file-format                   = Barracuda
    #innodb-file-per-table               = 1
    #innodb-flush-method                 = O_DIRECT
    #innodb-log-file-size                = 128M
    
    ## Replication and PITR
    binlog-format                        = ROW
    expire-logs-days                     = 7
    log-bin                              = mysql-bin
    log-slave-updates                    = 1
    #max-binlog-size                     = 128M
    #read-only                           = 1
    relay-log                            = relay-bin
    relay-log-space-limit                = 16G
    server-id                            = 1
    Site B config:
    Code:
    [client]
    port   = 3306
    socket = /var/run/mysqld/mysqld.sock
    
    [mysqld]
    user = mysql
    port                     = 3306
    datadir                 = /opt/mysql/data
    socket                   = /var/run/mysqld/mysqld.sock
    max_connections         = 500
    transaction-isolation   = 'READ-COMMITTED'
    auto_increment_increment    = 1
    auto_increment_offset = 1
    
    # caching and optimisations
    join_buffer_size   = 128
    #table_cache        = 96
    table_open_cache   = 64
    thread_cache_size  = 128 # see http://cl.ly/142s250i1S0o
    thread_stack       = 256K
    tmp_table_size     = 256M
    max_allowed_packet = 500M
    read_buffer_size   = 128K
    slow-query-log-file = /var/log/mysql/slow-queries.log # file must first be created and owned by the MySQL user - see http://cl.ly/142T211c0Z35
    long_query_time     = 5
    skip-external-locking = 1
    skip-name-resolve = 1
    
    #========#
    # InnoDB #
    #========#
     
    innodb_flush_method                 = ALL_O_DIRECT # see http://cl.ly/040b1B1H1I2C
    innodb_buffer_pool_size             = 5G # for 8GB RAM - it should be ~70-80% of the available RAM.
    innodb_log_buffer_size              = 8M
    innodb_log_file_size                = 50M # DO see http://cl.ly/3A3l2y170J29 and http://cl.ly/133W402C050v for more details on how to change this setting safely
    innodb_support_xa                   = 0 # OK with Galera synchronous replication
    innodb_flush_log_at_trx_commit      = 0 # speeds things up, can lose 1 second worth of transactions if MySQL crashes. This is fine for us at the moment.
    innodb_doublewrite                  = 1 # ensures an incremental state transfer will be tried if possible - see http://cl.ly/1E0w1v0z1m0n
    innodb_file_per_table = 1
    innodb_locks_unsafe_for_binlog = 1 # minimises risk of phantom updates/selects
    innodb_autoinc_lock_mode       = 2 # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    
    #===================#
    # MySQL Replication #
    #===================#
    
    log-bin = mysql-bin
    relay-log = relay-bin
    log-slave-updates = 1
    
    
    #================================#
    # Galera synchronous replication #
    #================================#
     
    binlog_format                  = row # In order for Galera to work correctly binlog format should be ROW
    server-id                      = 2 # change on each node! - not strictly required for Galera, but it's safer to have a unique id on each node in case normal replication is also used
    default_storage_engine         = InnoDB # MyISAM storage engine has only experimental support
    wsrep_cluster_name             = "percona-east-01"
    wsrep_node_name                = "node1" # change this on each node!
    wsrep_node_address   = 10.0.1.153 # Node #1 address
    wsrep_provider                 = /usr/lib64/libgalera_smm.so # Path to Galera library
    wsrep_cluster_address          = gcomm://10.0.1.153,10.0.3.125,10.0.5.233 # set to "gcomm://" to reinitialise (reset) a node; otherwise ensure it is pointing to a valid node.  Cluster connection URL contains the IPs of node#1, node#2 and node#3
    wsrep_slave_threads            = 2 # recommended: 4 slave thread per core - see http://cl.ly/0z0l0K3v3Y0j
    wsrep_retry_autocommit         = 2 # how many times to retry deadlocked autocommits
    wsrep_convert_LOCK_to_trx      = 0 # convert locking sessions into transactions; see http://cl.ly/2c0N3B0f1v41
    wsrep_certify_nonPK            = 1 # Generate fake primary keys for non-PK tables (required for multi-master and parallel applying operation)
    wsrep_sst_method=rsync # rsync or xtrabackup; remember to create the user 'mysql'@'localhost' on each node, with all privileges for the xtrabackup-based SST to work.  see http://cl.ly/153r0A2k0G1N
    
    [mysqldump]
    quick
    max_allowed_packet = 16M
    
    [mysql]
    no-auto-rehash
    prompt="(\u@\h) [\d]>\_"
    It's a bit of a shame that there are no wsrep options for ignoring tables, rather than relying on replicate-do-db. That way I would be able to use that functionality between the master servers in circular replication.

    If anyone has any ideas, please let me know how to resolve this!

    Thanks
    Last edited by no1youknowz; 05-25-2014, 09:04 PM.

  • #2
    So right now, I have stopped my cluster. I am unable to resolve the situation here. To go ahead with the project, I am going to do circular replication with just the 2 masters and forgo the replication on the AWS side with the cluster.



    • #3
      Well, I have temporarily given up on circular replication between a master server in site A and a master server in a cluster in site B. It seems it's not possible due to duplicate key issues whenever the other nodes have an update. I have kind of resolved the problem by having a slave in site A and a master in site B: site A updates itself and then does an update over VPN, via a backend process, to a server in site B, which gets replicated over to the other nodes. Not ideal, but at least both sites are up to date with UI updates and I can still have real-time data for reporting.

      It would be interesting to know if circular replication between masters is possible. Did Percona actually test this configuration, or is only a slave possible?

      Looking at this, http://www.severalnines.com/blog/rep...galera-cluster, it just shows one way...

      What does Percona have to say?



      • #4
        The duplicate key thing may be resolved by setting the auto-increment settings. I am not super familiar with them, but it's just a thought. Check this thread for a similar conversation about the increment stuff: http://www.percona.com/forums/questi...-autoincrement
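
        Something like this on each writing master, so the keys they generate can never collide (just a sketch; the exact values depend on how many writers there are, and Galera may already manage these via wsrep_auto_increment_control):
        Code:
        ## Site A master (sketch)
        auto_increment_increment = 2
        auto_increment_offset    = 1

        ## Site B / cluster master (sketch)
        auto_increment_increment = 2
        auto_increment_offset    = 2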





        Next, what will happen on the secondary site if it needed to become primary? New data will be present on the Site B master after some time. How will this data get back to Site A when you need to fail back to Site A as the primary target? I need this to work on my stack in the future... that's why I ask.

        Site A (master) ----replication----> Site A (slave)
        Site A (slave) ----replication----> Site B (master)



        • #5
          Originally posted by shockwavecs View Post
          The duplicate key thing may be resolved by setting the auto increment settings. I am not super familiar with them, but just a thought. Check this thread for similar conversation about the increment stuff: http://www.percona.com/forums/questi...-autoincrement
          Thanks, but it's nothing to do with the auto-increment. You would have known this because I said the tables would not be cross-replicating. Let me explain again so you understand.

          PXC side
          Node 2 does an insert.
          Node 1 gets the data.
          Node 1 replicates the data to its slave.

          Corporate site
          Server (as a slave) receives the data and inserts it into the database.

          Now this is the problem...

          Server (now as a master) decides to send BACK the data to the PXC side.

          PXC side
          Node 1 (as the slave) receives the data back and tries to apply it. Oh dear, we have existing records. ERROR! Duplicate key, etc.

          Now do you understand?

          Note!!! This problem does not happen if the insert is done on either master, Node 1 or the corporate master. Then there are no problems; perhaps the log files have a record of those events? Maybe when another node does an insert, nothing gets appended to a log file? Yes, I already tried with log-sync-etc...
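
          For anyone who wants to dig into this, one way to see whose server id the looping row is carrying (a sketch; the binlog path and file name are examples - use the file shown by SHOW SLAVE STATUS):
          Code:
          # on the corporate server (Site A): which server id does the incoming event carry?
          mysqlbinlog --verbose /var/lib/mysql/mysql-bin.000001 | grep -i 'server id'

          # compare with the local id on each server / node
          mysql -e "SHOW VARIABLES LIKE 'server_id'"
          mysql -e "SHOW SLAVE STATUS\G" | grep -i 'Master_Server_Id'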

          Originally posted by shockwavecs View Post
          Next, What will happen on the secondary site if it needed to become primary? New data will be present on Site B Master after some time. How will this data get back to Site A when you need to fail back to Site A and primary target? I need this to work on my stack in the future...that's why I ask.

          Site A (master) ----replication----> Site A (slave)
          Site A (slave) ----replication----> Site B (master)
          Did you not read? What is wrong here? I already said: there is no failover between sites. If Site A goes down, it doesn't matter. If Site B goes down, who cares either. There is ONE WAY replication going on each side.

          Why is there one-way replication? Because I have someone amending the UI; when they update something, I want that update replicated to Site B. When Site B gets data, I want it replicated to Site A so that replication is in real time and people viewing their screens see real-time data!

          If Site A is down and no one can run reports, who cares. I'll fix it when I have time.
          If Site B is down and no one can access the site, again who cares. Pay more money and maybe we'll get another geo location.

          Right now that isn't part of my problem. I am deferring that problem for later on.

          Thanks for replying tho!



          • #6
            Turn off log-slave-updates on site-a or use the same server_id for all nodes in site_b
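
            In my.cnf terms that is roughly one of these (a sketch, not exact config files):
            Code:
            ## Option 1: on the Site A server, stop writing replicated events back to its binlog
            #log-slave-updates = 1      <-- remove or comment this out

            ## Option 2: give every node in the Site B cluster the same id
            ## (set the same value on node1, node2 and node3)
            server-id = 2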



            • #7
              Originally posted by mgriffin View Post
              Turn off log-slave-updates on site-a or use the same server_id for all nodes in site_b
              ^ This worked!

              I tested both scenarios and they worked.

              1) All nodes in the cluster have the same server-id, and in Site A the server had log-slave-updates enabled.
              2) All nodes in the cluster have a different server-id, and in Site A the server DID NOT have log-slave-updates.

              Thank you mgriffin for this advice and for the other advice you have given me!

