Announcement

Announcement Module
Collapse
No announcement yet.

pt-online-schema-change is not working right

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • pt-online-schema-change is not working right

    1.
    Table t1 is the table has 4289656 rows. No any other people use this table.

    CREATE TABLE `t1` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `val1` tinyint ,
    PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

    mysql> select count(*) from t1;
    +----------+
    | count(*) |
    +----------+
    | 4289656 |
    +----------+
    1 row in set (0.94 sec)

    2. In session 1, add one column val2 using pt-online-schema-change. it will take some time.

    # ./pt-online-schema-change --user=root --alter "ADD COLUMN val2 tinyint not null default 0" D=test,t=t1 --print --execute

    3. During pt-online-schema-change running, open session 2, modify vaule of primary key. SQLs are below:

    mysql> update t1 set id=10000000 where id=3000000;
    Query OK, 1 row affected (0.28 sec)
    Rows matched: 1 Changed: 1 Warnings: 0

    mysql> select * from t1 where id=10000000;
    +----------+------+
    | id | val1 |
    +----------+------+
    | 10000000 | 127 |
    +----------+------+
    1 row in set (0.00 sec)

    mysql> select * from t1 where id=3000000;
    Empty set (0.00 sec)

    mysql> update t1 set id=3000000 where id=10000000;
    Query OK, 1 row affected (0.00 sec)
    Rows matched: 1 Changed: 1 Warnings: 0

    mysql> select * from t1 where id=10000000;
    Empty set (0.00 sec)

    mysql> select * from t1 where id=3000000;
    +---------+------+
    | id | val1 |
    +---------+------+
    | 3000000 | 127 |
    +---------+------+
    1 row in set (0.00 sec)


    Things looks right.

    4. After pt-online-schema-change completed, query again in session 2:

    mysql> select * from t1 where id=10000000;
    +----------+------+------+
    | id | val1 | val2 |
    +----------+------+------+
    | 10000000 | 127 | 0 |
    +----------+------+------+
    1 row in set (0.00 sec)

    mysql> select count(*) from t1;
    +----------+
    | count(*) |
    +----------+
    | 4289657 |--- before add column, count(*)=4289656
    +----------+
    1 row in set (1.00 sec)


    Why the row with id=10000000 exists in table? Anybody can figure it out?


    ========================================
    Output of pt-online-schema-change:
    # ./pt-online-schema-change --user=root --alter "ADD COLUMN val2 tinyint not null default 0" D=test,t=t1 --print --execute
    No slaves found. See --recursion-method if host has slaves.
    Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
    Operation, tries, wait:
    copy_rows, 10, 0.25
    create_triggers, 10, 1
    drop_triggers, 10, 1
    swap_tables, 10, 1
    update_foreign_keys, 10, 1
    Altering `test`.`t1`...
    Creating new table...
    CREATE TABLE `test`.`_t1_new` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `val1` tinyint(4) DEFAULT NULL,
    PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=4289657 DEFAULT CHARSET=utf8
    Created new table test._t1_new OK.
    Altering new table...
    ALTER TABLE `test`.`_t1_new` ADD COLUMN val2 tinyint not null default 0
    Altered `test`.`_t1_new` OK.
    2014-07-04T10:41:29 Creating triggers...
    CREATE TRIGGER `pt_osc_test_t1_del` AFTER DELETE ON `test`.`t1` FOR EACH ROW DELETE IGNORE FROM `test`.`_t1_new` WHERE `test`.`_t1_new`.`id` <=> OLD.`id`
    CREATE TRIGGER `pt_osc_test_t1_upd` AFTER UPDATE ON `test`.`t1` FOR EACH ROW REPLACE INTO `test`.`_t1_new` (`id`, `val1`) VALUES (NEW.`id`, NEW.`val1`)
    CREATE TRIGGER `pt_osc_test_t1_ins` AFTER INSERT ON `test`.`t1` FOR EACH ROW REPLACE INTO `test`.`_t1_new` (`id`, `val1`) VALUES (NEW.`id`, NEW.`val1`)
    2014-07-04T10:41:29 Created triggers OK.
    2014-07-04T10:41:29 Copying approximately 4290498 rows...
    INSERT LOW_PRIORITY IGNORE INTO `test`.`_t1_new` (`id`, `val1`) SELECT `id`, `val1` FROM `test`.`t1` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= ?)) AND ((`id` <= ?)) LOCK IN SHARE MODE /*pt-online-schema-change 2127 copy nibble*/
    SELECT /*!40001 SQL_NO_CACHE */ `id` FROM `test`.`t1` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= ?)) ORDER BY `id` LIMIT ?, 2 /*next chunk boundary*/
    Copying `test`.`t1`: 96% 00:01 remain
    2014-07-04T10:42:00 Copied rows OK.
    2014-07-04T10:42:00 Swapping tables...
    RENAME TABLE `test`.`t1` TO `test`.`_t1_old`, `test`.`_t1_new` TO `test`.`t1`
    2014-07-04T10:42:00 Swapped original and new tables OK.
    2014-07-04T10:42:00 Dropping old table...
    DROP TABLE IF EXISTS `test`.`_t1_old`
    2014-07-04T10:42:01 Dropped old table `test`.`_t1_old` OK.
    2014-07-04T10:42:01 Dropping triggers...
    DROP TRIGGER IF EXISTS `test`.`pt_osc_test_t1_del`;
    DROP TRIGGER IF EXISTS `test`.`pt_osc_test_t1_upd`;
    DROP TRIGGER IF EXISTS `test`.`pt_osc_test_t1_ins`;
    2014-07-04T10:42:01 Dropped triggers OK.
    Successfully altered `test`.`t1`.
    Last edited by gavin.zhang; 07-03-2014, 11:25 PM.

  • #2
    Hi Gavin,

    pt-online-schema-change uses triggers to apply the changes on the table being altered to the new table. You can see the UPDATE trigger was defined as such:

    CREATE TRIGGER `pt_osc_test_t1_upd` AFTER UPDATE ON `test`.`t1` FOR EACH ROW REPLACE INTO `test`.`_t1_new` (`id`, `val1`) VALUES (NEW.`id`, NEW.`val1`)

    When you ran:
    update t1 set id=10000000 where id=3000000;

    The triggered event is:

    REPLACE INTO `test`.`_t1_new`(`id`,`val1`) VALUES (10000000, 127);

    Since id 1000000 didn't exists, this query acted as an INSERT. And note, the id 30000000 record remained unchanged.

    Then you ran:
    update t1 set id=3000000 where id=10000000;

    The triggered event is:
    REPLACE INTO `test`.`_t1_new`(`id`,`val1`) VALUES (30000000, 127);

    Because id 30000000 existed, in this case, the existing row first got deleted, then REPLACE acted like an INSERT.

    So that end result is that you had one more row of id 10000000 in the post-altered table.

    In short, it is a limitation of pt-online-schema-change. It doesn't support concurrent update on primary key value(or unique key if no primary key).

    Comment


    • #3
      Thank you, Song.

      Comment

      Working...
      X