Replication + Galera = Timeout?

  • Replication + Galera = Timeout?

    I am having an issue with Galera that I'm attempting to track down. I have set up a three-node cluster, with the 'first' node getting replication data from an existing, remote SQL server.

    Things run okay for a few days, and then, seemingly at random, the other nodes lose their connection to the 'first' node and MySQL crashes completely (as in, it's no longer running):

    120726 16:10:35 [Note] WSREP: (0b066c90-d4fb-11e1-0800-86a2cf43aaf6, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://172.30.0.163:4567
    120726 16:10:36 [Note] WSREP: (0b066c90-d4fb-11e1-0800-86a2cf43aaf6, 'tcp://0.0.0.0:4567') reconnecting to df4e387f-d4e2-11e1-0800-2e6080299165 (tcp://172.30.0.163:4567), attempt 0
    120726 16:10:37 [Note] WSREP: remote endpoint tcp://172.30.0.163:4567 changed identity df4e387f-d4e2-11e1-0800-2e6080299165 -> 1bc3616b-d777-11e1-0800-4a369f7eed28
    120726 16:10:37 [Note] WSREP: (0b066c90-d4fb-11e1-0800-86a2cf43aaf6, 'tcp://0.0.0.0:4567') turning message relay requesting off
    120726 16:11:08 [Note] WSREP: evs:roto(0b066c90-d4fb-11e1-0800-86a2cf43aaf6, GATHER, view_id(REG,0b066c90-d4fb-11e1-0800-86a2cf43aaf6,11)) suspecting node: df4e387f-d4e2-11e1-0800-2e6080299165
    120726 16:11:08 [Note] WSREP: (0b066c90-d4fb-11e1-0800-86a2cf43aaf6, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://172.30.0.163:4567

    ... and a little later:

    120726 16:11:38 [Warning] WSREP: Failed to report last committed 16744356, -107 (Transport endpoint is not connected)
    120726 16:11:38 [Note] WSREP: Received NON-PRIMARY.
    120726 16:11:38 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 2
    120726 16:11:38 [Note] WSREP: Flow-control interval: [12, 23]
    120726 16:11:38 [Note] WSREP: Received NON-PRIMARY.
    120726 16:11:38 [Note] WSREP: New cluster view: global state: 97e9eadb-d1fa-11e1-0800-d054b2ba0044:16744400, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 2
    120726 16:11:38 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    120726 16:11:38 [Note] WSREP: New cluster view: global state: 97e9eadb-d1fa-11e1-0800-d054b2ba0044:16744400, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 2
    120726 16:11:38 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    120726 16:11:38 [Note] WSREP: New cluster view: global state: 97e9eadb-d1fa-11e1-0800-d054b2ba0044:16744400, view# -1: non-Primary, number of nodes: 2, my index: 0, protocol version 2
    120726 16:11:38 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.


    On the node that hard-crashes:

    120726 16:10:34 mysqld_safe Number of processes running now: 0
    120726 16:10:34 mysqld_safe mysqld restarted
    120726 16:10:35 [Note] Flashcache bypass: disabled
    120726 16:10:35 [Note] Flashcache setup error is : ioctl failed

    120726 16:10:35 [Warning] You need to use --log-bin to make --log-slave-updates work.
    120726 16:10:35 [Note] WSREP: Read nil XID from storage engines, skipping position init
    120726 16:10:35 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
    120726 16:10:36 [Note] WSREP: wsrep_load(): Galera 2.1(r113) by Codership Oy loaded succesfully.
    120726 16:10:36 [Note] WSREP: Found saved state: 97e9eadb-d1fa-11e1-0800-d054b2ba0044:-1
    120726 16:10:36 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
    120726 16:10:36 [Note] WSREP: Passing config to GCS: base_host = 172.30.0.163; evs.consensus_timeout = PT1M; evs.inactive_check_period = PT10S; evs.inactive_timeout = PT1M; evs.keepalive_period = PT3S; evs.send_window = 1024; evs.suspect_timeout = PT30S; evs.user_send_window = 512; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 1G; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
    120726 16:10:36 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
    120726 16:10:36 [Note] WSREP: wsrep_sst_grab()
    120726 16:10:36 [Note] WSREP: Start replication
    120726 16:10:36 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
    120726 16:10:36 [Note] WSREP: (1bc3616b-d777-11e1-0800-4a369f7eed28, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
    120726 16:10:36 [Note] WSREP: (1bc3616b-d777-11e1-0800-4a369f7eed28, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
    120726 16:10:36 [Note] WSREP: EVS version 0
    120726 16:10:36 [Note] WSREP: PC version 0
    120726 16:10:36 [Note] WSREP: gcomm: connecting to group 'galeraprimary', peer 'galera1.torreycommerce.net:4567'
    120726 16:10:36 [Note] WSREP: (1bc3616b-d777-11e1-0800-4a369f7eed28, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://172.30.0.154:4567
    120726 16:10:36 [Note] WSREP: (1bc3616b-d777-11e1-0800-4a369f7eed28, 'tcp://0.0.0.0:4567') turning message relay requesting off
    120726 16:11:06 [Note] WSREP: view((empty))
    120726 16:11:06 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
    at gcomm/src/pc.cpp:connect():148
    120726 16:11:06 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():195: Failed to open backend connection: -110 (Connection timed out)
    120726 16:11:06 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1290: Failed to open channel 'galeraprimary' at 'gcomm://galera1.torreycommerce.net:4567': -110 (Connection timed out)
    120726 16:11:06 [ERROR] WSREP: gcs connect failed: Connection timed out
    120726 16:11:06 [ERROR] WSREP: wsrep::connect() failed: 6
    120726 16:11:06 [ERROR] Aborting

    120726 16:11:06 [Note] WSREP: Service disconnected.
    120726 16:11:07 [Note] WSREP: Some threads may fail to exit.


    Any ideas what is going on here?

  • #2
    No one has any ideas? This is still an issue for me... random timeouts that aren't (or rather can't be) network related.



    • #3
      I just did some reading, and it's very possible that there is still a bug when log_bin is enabled. I'll try disabling it and see if that fixes the disconnects. If it does, should I file a bug report with the Codership team or with Percona?



      • #4
        Okay.. no solution. Node(s) still randomly crash. What's the next step here? Hopefully someone responds.



        • #5
          Hi,

          Could you check the connectivity between your nodes?

          Can you connect from the crashed node to galera1.torreycommerce.net on port 4567? (You can try with telnet.)
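Where telnet is not installed, the same reachability test can be sketched in Python. This is only an illustration, not part of the original advice: the function name `can_connect` and the 5-second default timeout are assumptions, and a successful TCP connection shows only that the port is reachable, not that the Galera group communication layer is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from the crashed node, `can_connect("galera1.torreycommerce.net", 4567)` should return True if the port is reachable; a False result would point to a firewall, routing, or name-resolution problem.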

          regards,



          • #6
            I have the same problem. I have a 4-node cluster, and I tried to benchmark it with sysbench using this command:
            /usr/bin/sysbench --num-threads=1000 --max-requests=100000 --test=oltp --oltp-table-size=1000000 --oltp-dist-type=gaussian --db-driver=mysql --mysql-db=test --mysql-host=10.2.95.110 --mysql-user=mike --mysql-password=xxx run

            The test started without problems and ran for about 20 minutes, then suddenly finished with an error:

            Client server:
            ALERT: failed to execute mysql_stmt_execute(): Err2013 Lost connection to MySQL server during query
            FATAL: database error, exiting...
            (last message repeated 1 times)

            DB server in error.log:
            120823 11:17:34 [ERROR] Slave SQL: Could not execute Write_rows event on table test.sbtest; Duplicate entry '662808' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 946, Error_code: 1062
            120823 11:17:34 [Warning] WSREP: RBR event 6 apply warning: 121, seqno: 366187
            120823 11:17:34 [ERROR] WSREP: Failed to apply trx: source: 4ffc956c-ecee-11e1-0800-395778ab141a version: 2 local: 0 state: CERTIFYING flags: 1 conn_id: 877 trx_id: 290036706 seqnos (l: 368276, g: 366187, s: 365133, d: 365158, ts: 1345713434143156242)
            120823 11:17:34 [ERROR] WSREP: Failed to apply app buffer: �5P, seqno: 366187, status: WSREP_FATAL
            120823 11:17:34 [ERROR] WSREP: Node consistency compromized, aborting...
            120823 11:17:34 [Note] WSREP: Closing send monitor...
            120823 11:17:34 [Warning] WSREP: Failed to report last committed 365882, -77 (File descriptor in bad state)
            120823 11:17:34 [Note] WSREP: Closed send monitor.
            120823 11:17:34 [Note] WSREP: gcomm: terminating thread
            120823 11:17:34 [Note] WSREP: gcomm: joining thread
            120823 11:17:34 [Note] WSREP: gcomm: closing backend
            120823 11:17:34 [Note] WSREP: view(view_id(NON_PRIM,4ffc956c-ecee-11e1-0800-395778ab141a,4 ) memb {
            120823 11:17:34 [Note] WSREP: view((empty))
            120823 11:17:34 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
            120823 11:17:34 [Note] WSREP: gcomm: closed
            120823 11:17:34 [Note] WSREP: Flow-control interval: [8, 16]
            120823 11:17:34 [Note] WSREP: Received NON-PRIMARY.
            120823 11:17:34 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 366251)
            120823 11:17:34 [Note] WSREP: Received self-leave message.
            120823 11:17:34 [Note] WSREP: Flow-control interval: [0, 0]
            120823 11:17:34 [Note] WSREP: Received SELF-LEAVE. Closing connection.
            120823 11:17:34 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 366251)
            120823 11:17:34 [Note] WSREP: RECV thread exiting 0: Success
            120823 11:17:34 [Note] WSREP: recv_thread() joined.
            120823 11:17:34 [Note] WSREP: Closing slave action queue.
            120823 11:17:34 [Note] WSREP: /usr/sbin/mysqld: Terminated.
            120823 11:17:34 mysqld_safe Number of processes running now: 0
            120823 11:17:34 mysqld_safe mysqld restarted
            120823 11:17:34 [Warning] WSREP: Failed to guess the value of wsrep_node_address variable.You need to set it explicitly.
            120823 11:17:34 [Warning] option 'max_prepared_stmt_count': unsigned value 999999999 adjusted to 1048576
            120823 11:17:34 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
            120823 11:17:34 [Note] WSREP: wsrep_load(): Galera 23.2.1(r129) by Codership Oy loaded succesfully.
            120823 11:17:34 [Warning] WSREP: Failed to autoguess base node address
            120823 11:17:34 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
            120823 11:17:34 [Note] WSREP: Reusing existing '/www/mysql//galera.cache'.
            120823 11:17:34 [Note] WSREP: Passing config to GCS: gcache.dir = /www/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /www/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
            120823 11:17:34 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
            120823 11:17:34 [Note] WSREP: wsrep_sst_grab()
            120823 11:17:34 [Note] WSREP: Start replication
            120823 11:17:34 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
            120823 11:17:34 [Note] WSREP: protonet asio version 0
            120823 11:17:34 [Note] WSREP: backend: asio
            120823 11:17:34 [Note] WSREP: GMCast version 0
            120823 11:17:34 [Note] WSREP: (5fd57905-ed03-11e1-0800-acbd3f1f3485, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
            120823 11:17:34 [Note] WSREP: (5fd57905-ed03-11e1-0800-acbd3f1f3485, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
            120823 11:17:34 [Note] WSREP: EVS version 0
            120823 11:17:34 [Note] WSREP: PC version 0
            120823 11:17:34 [Note] WSREP: gcomm: connecting to group 'kluster', peer '10.2.94.48:4567'
            120823 11:17:34 [Note] WSREP: (5fd57905-ed03-11e1-0800-acbd3f1f3485, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.2.94.47:4567
            120823 11:17:34 [Note] WSREP: (5fd57905-ed03-11e1-0800-acbd3f1f3485, 'tcp://0.0.0.0:4567') cleaning up established 0x2139e90 which is duplicate of 0x214bca0
            120823 11:17:35 [Note] WSREP: remote endpoint tcp://10.2.94.47:4567 changed identity 765d20b2-ecee-11e1-0800-2ccf1082a406 -> 600a896a-ed03-11e1-0800-d117a8b642f6
            120823 11:17:35 [Note] WSREP: (5fd57905-ed03-11e1-0800-acbd3f1f3485, 'tcp://0.0.0.0:4567') turning message relay requesting off
            120823 11:17:40 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:40 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:41 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:41 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:42 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:42 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:43 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:43 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:44 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:44 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:45 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:45 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:46 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:46 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:47 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:47 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) suspecting node: 765d20b2-ecee-11e1-0800-2ccf1082a406
            120823 11:17:48 [Note] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed0ive
            120823 11:18:03 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:04 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:04 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:04 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:04 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:04 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:04 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:04 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:04 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:04 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:05 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:05 [Warning] WSREP: evs:roto(5fd57905-ed03-11e1-0800-acbd3f1f3485, GATHER, view_id(TRANS,5fd57905-ed03-11e1-0800-acbd3f1f3485,0)) source 600a896a-ed03-11e1-0800-d117a8b642f6 is not supposed to be representative
            120823 11:18:05 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:05 [Note] WSREP: node 765d20b2-ecee-11e1-0800-2ccf1082a406 marked with nil view id and suspected in all present join messages, declaring inactive
            120823 11:18:05 [Note] WSREP: view((empty))
            120823 11:18:05 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
            120823 11:18:05 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():195: Failed to open backend connection: -110 (Connection timed out)
            120823 11:18:05 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1290: Failed to open channel 'kluster' at 'gcomm://10.2.94.48:4567': -110 (Connection timed out)
            120823 11:18:05 [ERROR] WSREP: gcs connect failed: Connection timed out
            120823 11:18:05 [ERROR] WSREP: wsrep::connect() failed: 6
            120823 11:18:05 [ERROR] Aborting
            120823 11:18:05 [Note] WSREP: Service disconnected.
            120823 11:18:06 [Note] WSREP: Some threads may fail to exit.
            120823 11:18:06 [Note] /usr/sbin/mysqld: Shutdown complete
            120823 11:18:06 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended


            Is it true that Galera closes the backend when a duplicate entry is found? After that, MySQL and Galera are restarted, but the backend is still closed. Galera then fails with the error "120823 11:18:05 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)" and MySQL is stopped.

            Do you have any idea how to resolve this problem?

            Thanks.

            Lukas



            • #7
              jlondon: what IP address does the hostname galera1.torreycommerce.net resolve to?

              Is it a valid IP address for connecting to the cluster communication layer?

              Lucas: similar question for a similar error: Is this IP address: 10.2.94.48 a valid IP address for connecting to the cluster communication layer?
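One way to answer the hostname question is to resolve the name on the affected node itself. The following is a minimal sketch in Python; the function name `resolve_ipv4` is hypothetical, and the sketch looks at IPv4 addresses only.

```python
import socket
from typing import List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_ipv4("galera1.torreycommerce.net")` on the crashed node and comparing the result with the peer addresses shown in the logs would show whether the name points at an address the cluster actually listens on.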



              • #8
                The IP 10.2.94.48 is specified in wsrep.cnf as

                wsrep_cluster_address=gcomm://10.2.94.48:4567

                The main node, which has this IP, has the following in its wsrep.cnf:

                wsrep_cluster_address=gcomm://
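The difference between the two settings can be made explicit with a small parser. This is a minimal sketch in Python, assuming the Galera convention of that era that a bare `gcomm://` tells a node to bootstrap a new cluster instead of joining an existing one. The function name `parse_cluster_address` is hypothetical and not part of any Galera tooling.

```python
from typing import List


def parse_cluster_address(value: str) -> List[str]:
    """Return the peer list from a wsrep_cluster_address value.

    An empty list means the address is a bare 'gcomm://'.
    """
    prefix = "gcomm://"
    if not value.startswith(prefix):
        raise ValueError(f"expected a gcomm:// address, got {value!r}")
    # Drop any '?option=value' suffix, then split the comma-separated peers.
    peers = value[len(prefix):].split("?", 1)[0]
    return [p.strip() for p in peers.split(",") if p.strip()]


for setting in ("gcomm://10.2.94.48:4567", "gcomm://"):
    peers = parse_cluster_address(setting)
    role = f"joins via {peers}" if peers else "bootstraps a new cluster"
    print(f"{setting!r}: {role}")
```

If that convention applies here, a main node restarted with `gcomm://` still in its wsrep.cnf would form a fresh cluster rather than rejoin the other nodes, which may be worth ruling out.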

                Thanks for your reply.

                Lucas



                • #9
                  I'm hitting this issue too. Has anybody found the cause or a solution?



                  • #10
                    Hi,

                    I tested again with MySQL 5.5 and I think it is OK.

                    I did it with galera-23.2.2-amd64.deb and mysql-server-wsrep-5.5.23-23.6-amd64.deb.
                    I built a 4-node cluster and benchmarked it with sysbench using 10000 threads. The cluster worked fine, and no MySQL instance crashed or reported any error. The client worked fine too, and I did not see the error from my previous comment.

                    Does anyone have the same experience? Or has anyone else tested the new Galera replication?

                    Lucas

