Announcement

Announcement Module
Collapse
No announcement yet.

SQL statement hangs while replicating on slave (5.5.21)

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • SQL statement hangs while replicating on slave (5.5.21)

    Hi,

    We're running the Percona version of MySQL, version 5.5.21, and we're hitting an odd replication issue.

    We run the following long-running update (with some filled in values) and find that the CREATE TEMPORARY TABLE consistently hangs when running on one of our slaves.

    CREATE TEMPORARY TABLE (SELECT csid, count(csid) AS ${COUNT_COLUMN} FROM ${dataset}.locations AS loc, ${dataset}.entityid_to_csid AS e2c WHERE e2c.entityid=loc.entityid AND docid>=${MINDOC} AND docid<=${MAXDOC} GROUP BY csid)
    UNION ALL
    (SELECT csid, count(csid) AS ${COUNT_COLUMN} FROM ${dataset}.images AS img, ${dataset}.entityid_to_csid AS e2c WHERE e2c.entityid=img.entityid AND docid>=${MINDOC} AND docid<=${MAXDOC} GROUP BY csid)


    (table schema defs below)

    This is running on an Ubuntu 11.0.4 server.

    I've checked the MySQL and Percona release histories for more recent builds, but haven't found anything that exactly describes my issue. There's a replication bug fixed in 5.5.22 that sounds like it could be related but it's tenuous.

    Also worth noting that we get a 100% disk I/O on the hanging slave, and that a different slave hangs each time. The replication status appears OK except that the rows read keeps resetting to zero, and the process status is reported as "Sending data". When the update runs correctly, we see a message indicating that the data is being sent to the temporary table instead.

    Any ideas on how we can resolve this would be greatly appreciated. Also apologies for the cross posting to different forums, I'm not quite sure of the best place to ask this question.

    Thanks,
    Jim from Digital Science


    CREATE TABLE locations (
    id INTEGER NOT NULL AUTO_INCREMENT,
    docid INTEGER NOT NULL,
    entityid INTEGER NOT NULL,
    fieldid INTEGER NOT NULL,
    rank INTEGER NOT NULL,
    startoffset INTEGER NOT NULL,
    endoffset INTEGER NOT NULL,
    isExemplified TINYINT(1) DEFAULT 0,
    PRIMARY KEY(id),
    INDEX(docid,fieldid,rank),
    INDEX(fieldid),
    INDEX(entityid),
    UNIQUE(docid,entityid,fieldid,rank,startoffset),
    FOREIGN KEY (docid) REFERENCES documents(docid)
    ) DEFAULT CHARSET=ascii MAX_ROWS=1000000000 ENGINE=INNODB;

    CREATE TABLE entityid_to_csid (
    entityid INTEGER NOT NULL,
    csid INTEGER default NULL,
    fragment BOOLEAN default NULL,
    connected BOOLEAN default NULL,
    radical BOOLEAN default NULL,
    element BOOLEAN default 0,
    molweight FLOAT default NULL,
    INDEX(entityid),
    INDEX(csid),
    UNIQUE(entityid,csid)
    ) DEFAULT CHARSET=ascii MAX_ROWS=100000000 ENGINE=INNODB;

  • #2
    Thinking it's 100% IO, is somehow your MySQL data on a shared storage?
    Our documentation has a lot of answers about common questions on Percona software, have you checked there before posting that question here? http://www.percona.com/forums/core/i...lies/smile.png

    Join us at the annual Percona Live MySQL Users Conference - http://www.percona.com/live/mysql-conference-2014/

    Comment


    • #3
      The storage on all three machines (master and two slaves) is provided by Amazon Web Services. We run all instances on EC2 instances, and use ephemeral volumes for storage.

      Comment


      • #4
        Hi, did you find the reason for these hangs? I have a similar case, the only difference is that the slave hangs on random queries... inserts, updates, no relation between them at all. When it hangs i cannot stop slave, i cannot exit mysqld. Only kill -9 works. This happened after upgrading to 5.5.27-28.0.

        Comment


        • #5
          I can confirm thi issue:

          Upgrade to 5.5.27-rel28.0-291.squeeze was perfomed on Aug 24, and few hours later replication hangs at statement:

          "CREATE TABLE rating_delta SELECT user_to, SUM(summa/18 * IF(operation = 'down', -1, 1)) delta FROM user_votes, user_rating WHERE user_from = user_id AND user_id = '4289612' GROUP BY user_to"

          I had to kill -9 mysqld, replication start with GLOBAL SQL_SLAVE_SKIP_COUNTER = 1 and hangs few INSERT's later

          Slave config:

          [mysqld]
          ###### General settings ######
          max_allowed_packet = 64M
          innodb_file_per_table
          innodb_flush_log_at_trx_commit = 0
          skip-name-resolve
          bind-address=0.0.0.0
          ssl-ca = /etc/mysql/ssl/ca-cert.pem
          ssl-cert = /etc/mysql/ssl/server-cert.pem
          ssl-key = /etc/mysql/ssl/server-key.pem
          ###### Replication and binlogs ######
          server-id = 2
          replicate-do-db = socnet
          replicate-do-db = forum
          replicate-do-db = seolandings
          replicate-do-db = playzzima
          replicate-do-db = playfw
          replicate-do-db = fmdb
          replicate-do-db = fwgifts
          replicate-do-db = guildsforum
          replicate-do-db = nivallab
          slave-skip-errors=1062
          ###### Performance ######
          thread_stack = 128K
          thread_cache_size = 2048
          thread_concurrency = 8
          max_connections = 200
          open-files-limit = 65535
          table_cache = 50000
          query_cache_limit = 1M
          query_cache_size = 256M
          innodb-use-sys-malloc = 0


          Master server is 5.5.19-rel24.0-204.squeeze

          Both hosts runs Debian 6 amd64

          After kill -9 for slave, I had to remove .idb file and then it works for some time, perhaps for 10 or 100 statements, randomly.
          Please, help, I have no space to keeps binlogs!

          Comment


          • #6
            We're experiencing the same issue on our replication slave, using statement-based replication. The replication thread will hang on a query and be stuck in the UPDATE or INIT states. The mysqld process uses minimal CPU.

            We're unable to stop using SLAVE STOP (hangs indefinitely), and can't shutdown the server normally (must use kill -9). In addition, SHOW STATUS commands also hang while this issue is occurring.

            This started happening to us with an upgrade to the package Percona-Server-server-55-5.5.25a-rel27.1.277.rhel5.x86_64 and is continuing with Percona-Server-server-55-5.5.27-rel28.0.291.rhel5.x86_64.

            This message showed up in our mysql error logs the first time the error happened:

            InnoDB: Warning: a long semaphore wait:--Thread 1097607488 has waited at buf0lru.c line 1102 for 241.00 seconds the semaphore:Mutex at 0x2aaab10a38c0 '&buf_pool->mutex', lock var 1waiters flag 1InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:InnoDB: Pending preads 0, pwrites 0InnoDB: ###### Diagnostic info printed to the standard error stream


            When shutting down, the process begins normally, but hangs on slave I/O exit:


            120827 18:30:53 [Note] /usr/sbin/mysqld: Normal shutdown120827 18:30:53 [Note] Event Scheduler: Purging the queue. 0 events120827 18:30:53 [Note] Slave I/O thread killed while reading event120827 18:30:53 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000739', position 515926767--- HANGS HERE ---


            At this point we need to kill -9 the mysqld/mysqld_safe processes and restart the server.

            Running Centos 5.6 with kernel package kernel-2.6.18-308.13.1.el5.centos.plus.x86_64 (as of our percona 5.5.27 update), with plenty of available disk space. Let me know if you need any more information, and thanks for a great product!

            Comment


            • #7
              Upgrading to yum version Percona-Server-server-55-5.5.27-rel28.1.296.rhel5.x86_64 seems to have fixed the issue *thanks!

              Comment

              Working...
              X