Cluster won't start after rebooting all servers

  • Cluster won't start after rebooting all servers

    We set up a cluster on 3 nodes and tested everything.

    The next day, we rebooted all 3 nodes and now the cluster won't start.

    We have tried starting one node with --wsrep-cluster-address="gcomm://" but it does not work.

    Logs
    ==> /var/log/mysqld.log <==
    130726 10:31:15 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
    130726 10:31:15 mysqld_safe WSREP: Running position recovery with --log_error= --pid-file=/var/lib/mysql/node1-recover.pid
    130726 10:31:38 mysqld_safe WSREP: Failed to recover position:

    ==> //var/lib/mysql/node1.err <==
    130726 10:31:15 InnoDB: Initializing buffer pool, size = 128.0G
    130726 10:31:24 InnoDB: Completed initialization of buffer pool
    130726 10:31:24 InnoDB: highest supported file format is Barracuda.
    130726 10:31:30 InnoDB: Waiting for the background threads to start
    130726 10:31:31 Percona XtraDB (http://www.percona.com) 5.5.31-rel30.3 started; log sequence number 1598630
    130726 10:31:31 [Note] WSREP: Recovered position: 90fd09f4-f40e-11e2-8efc-f65252f54aba:13
    130726 10:31:31 InnoDB: Starting shutdown...
    130726 10:31:38 InnoDB: Shutdown completed; log sequence number 1598630
    130726 10:31:38 [Note] /usr/sbin/mysqld: Shutdown complete

    [root@node1 mysql]# rpm -qa | grep Percona
    Percona-Server-shared-51-5.1.70-rel14.8.580.rhel6.x86_64
    Percona-XtraDB-Cluster-shared-5.5.31-23.7.5.438.rhel6.x86_64
    Percona-XtraDB-Cluster-galera-2.6-1.152.rhel6.x86_64
    Percona-XtraDB-Cluster-client-5.5.31-23.7.5.438.rhel6.x86_64
    Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64

  • #2
    Did you shut down the nodes gracefully with an init script or just kill the process?
    Have you tried bootstrapping all 3 nodes? It's possible the one you're trying to bootstrap with gcomm:// has bad data. Try another one.
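One way to decide which node to try first is to compare the position each node reports in its error log (the "WSREP: Recovered position: <uuid>:<seqno>" line shown in the first post). The helper below is only a sketch: the function name `recovered_seqno` is made up for illustration, and it assumes the log line format matches the one quoted above.

```shell
# Hypothetical helper (not part of Percona XtraDB Cluster):
# reads an error log on stdin and prints the seqno from the last
# "WSREP: Recovered position: <uuid>:<seqno>" line it finds.
recovered_seqno() {
  sed -n 's/.*WSREP: Recovered position: [0-9a-f-]*:\(-\{0,1\}[0-9][0-9]*\)[[:space:]]*$/\1/p' | tail -n 1
}
```

Run it on each node, for example `recovered_seqno < /var/lib/mysql/node1.err`, and try bootstrapping the node that reports the highest number first.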

    • #3
      The servers were rebooted using the reboot command, so I believe the OS shut down the cluster gracefully using the init script.

      We have tried starting all 3 nodes using gcomm:// one by one but none of them start.

      This seems to be a big problem, as we now fear that it could occur in production and we won't be able to start the cluster. I wonder if Percona cluster is even fit for production use?

      • #4
        I can confirm that SELinux was causing the problem.
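For anyone who wants to check whether SELinux is blocking mysqld on their own system, the audit log is the usual place to look. The snippet below is a rough sketch, not something taken from this thread: the function name `mysqld_avc_denials` is invented, and it assumes the standard audit record format (`type=AVC ... denied ... comm="mysqld"`) and the default RHEL audit log path.

```shell
# Hypothetical helper: reads audit log text on stdin and prints only the
# AVC denial records whose command name starts with "mysqld"
# (this also matches mysqld_safe).
mysqld_avc_denials() {
  grep 'type=AVC' | grep 'denied' | grep 'comm="mysqld'
}
```

Typical use would be `mysqld_avc_denials < /var/log/audit/audit.log` (assumed default location). If it prints nothing, SELinux is probably not what is stopping the node; `getenforce` shows whether SELinux is currently enforcing at all.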

        • #5
          SELinux strikes again!

          • #6
            Yes, after disabling SELinux, it works with the error "140213 12:15:32 mysqld_safe WSREP: Failed to recover position: "
