Announcement

Announcement Module
Collapse
No announcement yet.

One node in the cluster gets stopped

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • One node in the cluster gets stopped

    I faced the issue that one of my node in the cluster keeps getting stopped after these errors.

    [ERROR] Slave SQL: Could not execute Write_rows event on table portal-serve.wp_options; Duplicate entry 'rewrite_rules' for key 'option_name', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 8652, Error_code: 1062

    What's wrong with it ?

    Thanks

  • #2
    Paste here result of:
    SHOW CREATE TABLE portal-serve.wp_options\G

    Also, what is your exact cluster/replication topology now?

    Comment


    • #3
      Here is the result.

      *************************** 1. row ***************************
      Table: wp_options
      Create Table: CREATE TABLE `wp_options` (
      `option_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
      `option_name` varchar(64) NOT NULL DEFAULT '',
      `option_value` longtext NOT NULL,
      `autoload` varchar(20) NOT NULL DEFAULT 'yes',
      PRIMARY KEY (`option_id`),
      UNIQUE KEY `option_name` (`option_name`)
      ) ENGINE=InnoDB AUTO_INCREMENT=2816511 DEFAULT CHARSET=utf8
      1 row in set (0.00 sec)


      The replication topology is master-master.

      Comment


      • #4
        on the nodes what do your recent /var/lib/mysql/GRA_ files show?

        use mysqlbinlog to look at the files: http://www.mysqlperformanceblog.com/...ra_-log-files/

        download GRA-header
        cat GRA-header > outputfile.log
        cat /var/lib/mysql/GRA_<number>.log >> outputfile.log
        mysqlbinlog outputfile.log

        Comment


        • #5
          This is what I get from GRA log.

          Code:
          /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
          /*!40019 SET @@session.max_insert_delayed_threads=0*/;
          /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
          DELIMITER /*!*/;
          # at 4
          #120715  2:45:56 server id 1  end_log_pos 107   Start: binlog v 4, server v 5.5.25-debug-log created 120715  2:45:56 at startup
          # Warning: this binlog is either in use or was not closed properly.
          ROLLBACK/*!*/;
          BINLOG '
          NHUCUA8BAAAAZwAAAGsAAAABAAQANS41LjI1LWRlYnVnLWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAA
          AAAAAAAAAAAAAAAAAAA0dQJQEzgNAAgAEgAEBAQEEgAAVAAEGggAAAAICAgCAA==
          '/*!*/;
          # at 107
          #140610 11:14:23 server id 2  end_log_pos 87    Query   thread_id=28338 exec_time=0     error_code=0
          SET TIMESTAMP=1402416863/*!*/;
          SET @@session.pseudo_thread_id=28338/*!*/;
          SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
          SET @@session.sql_mode=1073741824/*!*/;
          SET @@session.auto_increment_increment=3, @@session.auto_increment_offset=1/*!*/;
          /*!\C utf8 *//*!*/;
          SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
          SET @@session.lc_time_names=0/*!*/;
          SET @@session.collation_database=DEFAULT/*!*/;
          BEGIN
          /*!*/;
          # at 194
          #140610 11:14:23 server id 2  end_log_pos 160   Table_map: `portal-serve`.`wp_options` mapped to number 855
          ERROR: Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 8494, event_type: 30
          WARNING: The range of printed events ends with a row event or a table map event that does not have the STMT_END_F flag set. This might be because the last statement was not fully written to the log, or because you are using a --stop-position or --stop-datetime that refers to an event in the middle of a statement. The event(s) from the partial statement have not been written to output.
          DELIMITER ;
          # End of log file
          ROLLBACK /* added by mysqlbinlog */;
          /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
          /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
          Any idea what could be the reason ?

          Comment


          • #6
            To get the GRA (binary) log decoded, you need to use "--base64-output=DECODE-ROWS --verbose" options with mysqlbinlog.
            Also, the header file attached in that blog post works only wih logs from PXC 5.5.x, for 5.6.x you will need appropriate header.

            Btw. by master-master do you mean galera replication between 2 (more) nodes? Do you also use standard MySQL replication anywhere in the topology?
            What is the PXC version here?

            Comment


            • #7
              Hi

              I think I have a similar problem. We have a three node Xtradb Cluster (5.6.15-25.5-759.wheezy). Clients Connect to only two nodes via a ipvs load balancer.

              Every week we import two times a bunch of data (1500-3000 rows) in a single transaction to only one of the two active nodes. Sometimes we got this bug while we import the data. Sometimes not.
              Befor we installed the PXC we used the version directy from galera (5.1.62-23.4 with galera-23.2.1) and got the bug only one time in a year. After we switched to PXC on the 16 june we got the bug three times (23 june, 7 july and 10 july).

              If the bug occur we can start the failed node and do the import once again an it runs without a error.

              Here is the Log from today:
              Code:
              2014-07-10 10:29:16 29911 [ERROR] Slave SQL: Could not execute Write_rows event on table lottohelden.kkb; Duplicate entry '946741' for key 'letzte_nummer', Error_code: 1062; handler error HA
              _ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 278, Error_code: 1062
              2014-07-10 10:29:16 29911 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 11963863
              2014-07-10 10:29:16 29911 [Warning] WSREP: Failed to apply app buffer: seqno: 11963863, status: 1
                       at galera/src/trx_handle.cpp:apply():340
              Retrying 2th time
              2014-07-10 10:29:16 29911 [ERROR] Slave SQL: Could not execute Write_rows event on table lottohelden.kkb; Duplicate entry '946741' for key 'letzte_nummer', Error_code: 1062; handler error HA
              _ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 278, Error_code: 1062
              2014-07-10 10:29:16 29911 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 11963863
              2014-07-10 10:29:16 29911 [Warning] WSREP: Failed to apply app buffer: seqno: 11963863, status: 1
                       at galera/src/trx_handle.cpp:apply():340
              Retrying 3th time
              2014-07-10 10:29:16 29911 [ERROR] Slave SQL: Could not execute Write_rows event on table lottohelden.kkb; Duplicate entry '946741' for key 'letzte_nummer', Error_code: 1062; handler error HA
              _ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 278, Error_code: 1062
              2014-07-10 10:29:16 29911 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 11963863
              2014-07-10 10:29:16 29911 [Warning] WSREP: Failed to apply app buffer: seqno: 11963863, status: 1
                       at galera/src/trx_handle.cpp:apply():340
              Retrying 4th time
              2014-07-10 10:29:16 29911 [ERROR] Slave SQL: Could not execute Write_rows event on table lottohelden.kkb; Duplicate entry '946741' for key 'letzte_nummer', Error_code: 1062; handler error HA
              _ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 278, Error_code: 1062
              2014-07-10 10:29:16 29911 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 11963863
              2014-07-10 10:29:16 29911 [ERROR] WSREP: Failed to apply trx: source: 6afdc17d-0674-11e4-8528-13d0bc22aafe version: 3 local: 0 state: APPLYING flags: 1 conn_id: 44231 trx_id: 415884858 seqno
              s (l: 664181, g: 11963863, s: 11963862, d: 11963858, ts: 1978184313228671)
              2014-07-10 10:29:16 29911 [ERROR] WSREP: Failed to apply trx 11963863 4 times
              2014-07-10 10:29:16 29911 [ERROR] WSREP: Node consistency compromized, aborting...
              2014-07-10 10:29:16 29911 [Note] WSREP: Closing send monitor...
              2014-07-10 10:29:16 29911 [Note] WSREP: Closed send monitor.
              2014-07-10 10:29:16 29911 [Note] WSREP: gcomm: terminating thread
              2014-07-10 10:29:16 29911 [Note] WSREP: gcomm: joining thread
              2014-07-10 10:29:16 29911 [Note] WSREP: gcomm: closing backend
              
              2014-07-10 10:29:16 29911 [Note] WSREP: view(view_id(NON_PRIM,6afdc17d-0674-11e4-8528-13d0bc22aafe,22) memb {
                      f4d2e14b-0673-11e4-99b4-d28700569b85,0
              } joined {
              } left {
              } partitioned {
                      6afdc17d-0674-11e4-8528-13d0bc22aafe,0
                      a89cedd7-0674-11e4-8e61-ab5da27ec321,0
              })
              2014-07-10 10:29:16 29911 [Note] WSREP: view((empty))
              2014-07-10 10:29:16 29911 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
              2014-07-10 10:29:16 29911 [Note] WSREP: gcomm: closed
              2014-07-10 10:29:16 29911 [Note] WSREP: Flow-control interval: [16, 16]
              2014-07-10 10:29:16 29911 [Note] WSREP: Received NON-PRIMARY.
              2014-07-10 10:29:16 29911 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 11963863)
              2014-07-10 10:29:16 29911 [Note] WSREP: Received self-leave message.
              2014-07-10 10:29:16 29911 [Note] WSREP: Flow-control interval: [0, 0]
              2014-07-10 10:29:16 29911 [Note] WSREP: Received SELF-LEAVE. Closing connection.
              2014-07-10 10:29:16 29911 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 11963863)
              2014-07-10 10:29:16 29911 [Note] WSREP: RECV thread exiting 0: Success
              2014-07-10 10:29:16 29911 [Note] WSREP: recv_thread() joined.
              2014-07-10 10:29:16 29911 [Note] WSREP: Closing replication queue.
              2014-07-10 10:29:16 29911 [Note] WSREP: Closing slave action queue.
              2014-07-10 10:29:16 29911 [Note] WSREP: /usr/sbin/mysqld: Terminated.
              Aborted
              140710 10:29:16 mysqld_safe Number of processes running now: 0
              140710 10:29:16 mysqld_safe WSREP: not restarting wsrep node automatically
              140710 10:29:16 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
              Here the output from the GRA_*.log file:

              Code:
              mysqlbinlog --base64-output=DECODE-ROWS -vvv /root/GRA_2_10902541.log
              /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
              /*!40019 SET @@session.max_insert_delayed_threads=0*/;
              /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
              DELIMITER /*!*/;
              # at 4
              #120715  9:45:56 server id 1  end_log_pos 107     Start: binlog v 4, server v 5.5.25-debug-log created 120715  9:45:56 at startup
              # Warning: this binlog is either in use or was not closed properly.
              ROLLBACK/*!*/;
              # at 107
              #140707 10:23:29 server id 0  end_log_pos 80     Query    thread_id=335318    exec_time=0    error_code=0
              SET TIMESTAMP=1404721409/*!*/;
              SET @@session.pseudo_thread_id=335318/*!*/;
              SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
              SET @@session.sql_mode=1073741824/*!*/;
              SET @@session.auto_increment_increment=3, @@session.auto_increment_offset=3/*!*/;
              /*!\C utf8 *//*!*/;
              SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
              SET @@session.lc_time_names=0/*!*/;
              SET @@session.collation_database=DEFAULT/*!*/;
              BEGIN
              /*!*/;
              # at 187
              #140707 10:23:29 server id 0  end_log_pos 149     Table_map: `lottohelden`.`kkb` mapped to number 74
              ERROR: Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 129, event_type: 30
              WARNING: The range of printed events ends with a row event or a table map event that does not have the STMT_END_F flag set. This might be because the last statement was not fully written to the log, or because you are using a --stop-position or --stop-datetime that refers to an event in the middle of a statement. The event(s) from the partial statement have not been written to output.
              DELIMITER ;
              # End of log file
              ROLLBACK /* added by mysqlbinlog */;
              /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
              /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
              The create statement:
              Code:
              Create Table: CREATE TABLE `kkb` (
                `nummer` bigint(20) unsigned NOT NULL DEFAULT '0',
                `letzte_nummer` bigint(20) unsigned DEFAULT NULL,
                `user` varchar(12) NOT NULL DEFAULT '',
                `betrag` bigint(20) NOT NULL DEFAULT '0',
                `gk` tinyint(6) unsigned NOT NULL DEFAULT '0',
                `gk_ref` bigint(20) unsigned NOT NULL DEFAULT '0',
                `gv_typ` tinyint(6) unsigned NOT NULL DEFAULT '0',
                `gv_ref` bigint(20) unsigned NOT NULL DEFAULT '0',
                `saldo` bigint(20) NOT NULL DEFAULT '0',
                `alter_saldo` bigint(20) NOT NULL DEFAULT '0',
                `gebucht_um` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
                `gebucht_von` varchar(12) NOT NULL DEFAULT '',
                `is_storno` tinyint(1) unsigned NOT NULL DEFAULT '0',
                `storniert_nummer` bigint(20) unsigned DEFAULT NULL,
                `kommentar` bigint(20) unsigned DEFAULT NULL,
                `auszahlbar` bigint(20) NOT NULL DEFAULT '0',
                PRIMARY KEY (`nummer`),
                UNIQUE KEY `storniert_nummer` (`storniert_nummer`),
                UNIQUE KEY `letzte_nummer` (`letzte_nummer`),
                KEY `gebucht_um` (`gebucht_um`),
                KEY `gk` (`gk`),
                KEY `gv_typ` (`gv_typ`),
                KEY `user` (`user`),
                KEY `gv_ref` (`gv_ref`),
                KEY `gk_ref` (`gk_ref`),
                KEY `is_storno` (`is_storno`),
                KEY `gk_ref_2` (`gk_ref`,`gk`),
                KEY `user_2` (`user`,`gk`,`nummer`)
              ) ENGINE=InnoDB DEFAULT CHARSET=utf8
              I found this Post with a similar error message.

              http://www.mysqlperformanceblog.com/...ra_-log-files/

              But Frederic wrote: "In this case it’s obvious why it failed but it’s not always the case" and this is not very helpfull for me ;-(

              Do you need any additional information?

              regards
              ralph
              Last edited by ralph234272; 07-10-2014, 05:41 AM.

              Comment


              • #8
                Ralph, you may be hitting this bug: https://bugs.launchpad.net/codership-mysql/+bug/1280896 or similar.
                Are you using parallel slaves >1?
                In any case, upgrading to 5.6.19 may help.

                Comment

                Working...
                X