"error writing communication packets" on percona, but not vanilla mysql server

  • "error writing communication packets" on percona, but not vanilla mysql server

    I routinely restore a snapshot made with mysqldump (--innodb-optimize-keys-enabled), but as its size grows, I'm experiencing a problem with the Percona distro that doesn't occur with the equivalent vanilla one.

    Specifically, I get a timeout (in the log: "error writing communication packets") while restoring the biggest tables - for example:


Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 16205204
Avg_row_length: 80
Data_length: 1302331392
Max_data_length: 0
Index_length: 1610137600
Data_free: 7340032
Auto_increment: 139395716
Collation: utf8_general_ci

    I've read the available information carefully, but after trying the most obvious things, I'm stuck.
    Specifically, these are the timeout-related variables:


+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| connect_timeout            | 10       |
| delayed_insert_timeout     | 300      |
| innodb_lock_wait_timeout   | 120      |
| innodb_rollback_on_timeout | OFF      |
| interactive_timeout        | 28800    |
| lock_wait_timeout          | 31536000 |
| net_read_timeout           | 1200     |
| net_write_timeout          | 1200     |
| slave_net_timeout          | 3600     |
| wait_timeout               | 28800    |
+----------------------------+----------+

    I'm calling this a Percona distro problem because it never happens with the vanilla server, while it always does with Percona.

    The error often happens while the innodb indexes are built, but not necessarily - sometimes also while INSERTing.

    System details follow:

    - Ubuntu precise x86-64 (also experienced on a Hardy x86-64)
    - Percona 5.5.27-rel28.1, with expand_fast_index_creation = 1
    - Vanilla 5.5.27
    - configuration is shared between both
    - skip-networking is enabled
    - InnoDB buffer pool: 1 GB

  • #2
    Hi,

    Can you provide the "SHOW ENGINE INNODB STATUS" output while you are restoring the biggest tables, along with the full MySQL error log? It would help in understanding this issue.

    Regards,
    Nil



    • #3
      A sample follows, collected just after an error. I couldn't find any pattern in the errors' occurrence; sometimes they happen shortly after starting the restore.

      Statistics have been collected while running the percona distro.


=====================================
120925 13:45:59 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 21 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 156 1_second, 156 sleeps, 15 10_second, 77 background, 77 flush
srv_master_thread log flush and writes: 287
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 308, signal count 318
Mutex spin waits 1416, rounds 6310, OS waits 187
RW-shared spins 104, rounds 2972, OS waits 99
RW-excl spins 1, rounds 629, OS waits 20
Spin rounds per wait: 4.46 mutex, 28.58 RW-shared, 629.00 RW-excl
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] , ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
2599 OS file reads, 56566 OS file writes, 1504 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 326.51 writes/s, 3.90 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 597, seg size 599, 0 merges
merged operations: insert 0, delete mark 0, delete 0
discarded operations: insert 0, delete mark 0, delete 0
Hash table size 4425293, node heap has 2 buffer(s)
31313.70 hash searches/s, 626.30 non-hash searches/s
---
LOG
---
Log sequence number 722833317016
Log flushed up to   722833317016
Last checkpoint at  722833317016
Max checkpoint age    162184828
Checkpoint age target 157116553
Modified age 0
Checkpoint age 0
0 pending log writes, 0 pending chkp writes
1443 log i/o's done, 4.29 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 2206203904; in additional pool allocated 0
Internal hash tables (constant factor + variable factor)
    Adaptive hash index 35439328 (35402344 + 36984)
    Page hash           2213656 (buffer pool 0 only)
    Dictionary cache    9682270 (8851984 + 830286)
    File system         195752 (82672 + 113080)
    Lock system         5313792 (5313416 + 376)
    Recovery system     0 (0 + 0)
Dictionary memory allocated 830286
Buffer pool size        131071
Buffer pool size, bytes 2147467264
Free buffers            77564
Database pages          53505
Old database pages      19730
Modified db pages       0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 1491, created 52014, written 53006
0.00 reads/s, 125.47 creates/s, 320.13 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 53505, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
---OLDEST VIEW---
Normal read view
Read view low limit trx n 6EEC3F
Read view up limit trx id 6EEC3F
Read view low limit trx id 6EEC3F
Read view individually stored trx ids:
-----------------
Main thread process no. 9131, id 140187853780736, state: flushing log
Number of rows inserted 18538828, updated 0, deleted 0, read 0
31689.40 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
------------
TRANSACTIONS
------------
Trx id counter 6EEC3F
Purge done for trx's n < 6EEAEA undo n < 0
History list length 1641
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 18, OS thread handle 0x7f800635c700, query id 955 localhost root
show engine innodb status
----------------------------
END OF INNODB MONITOR OUTPUT
============================



      • #4
        How large is the mysqldump file that you are reloading?

        I would try (temporarily) increasing your net_read_timeout, net_write_timeout, and max_allowed_packet variables. When a mysqldump reload dies, it's generally because of one of those.
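
        If it were the timeouts, raising them for the duration of the reload would rule it out cheaply. A minimal sketch via the mysql client; the values are illustrative rather than recommendations, and the SUPER privilege is assumed:

```shell
# Temporarily raise the variables named above before the reload.
mysql -u root -e "
  SET GLOBAL net_read_timeout   = 3600;
  SET GLOBAL net_write_timeout  = 3600;
  SET GLOBAL max_allowed_packet = 64 * 1024 * 1024;  -- 64 MB
"
# GLOBAL changes only affect connections opened afterwards,
# so reconnect (or restart the reload) after setting them.
```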

        As a side note, have you looked into using Percona XtraBackup for your backup / restore process instead of mysqldump? With little effort you could script a daily backup-and-restore process. We reload around 10 servers daily with scrubbed data using this method for performance testing / QA use, and it works great.
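
        A minimal sketch of such a cycle with XtraBackup's innobackupex wrapper, assuming a stock Ubuntu layout; the backup directory is an example, not a convention:

```shell
BACKUP_DIR=/backups/nightly                 # example location
innobackupex --no-timestamp "$BACKUP_DIR"   # take a hot backup
innobackupex --apply-log "$BACKUP_DIR"      # make it consistent

# To restore: stop mysqld, empty the datadir, copy back, fix ownership.
service mysql stop
innobackupex --copy-back "$BACKUP_DIR"
chown -R mysql:mysql /var/lib/mysql
service mysql start
```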



        • #5
          I'm making some guesses here, so I'll keep them separate.

          ===================

          I don't think it's a matter of timeouts; I sometimes get the "connection lost" after a very short time.
          I've just gotten it 26 seconds after starting the restore, but today it also happened after 5 seconds.

          Interestingly, the problem sometimes seems to happen when I issue a SHOW PROCESSLIST from another client.

          This is what I see in the logs, using --log-warnings=2:


120927 17:00:55 [Warning] mysqld: Forcing close of thread 5 user: 'root'
120927 17:00:55 [Warning] mysqld: Forcing close of thread 4 user: 'root'
120927 17:00:56 InnoDB: Starting shutdown...


          ===================

          Now, getting back to the request for details.

          Here are the current mysqld-related settings:


+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| max_allowed_packet         | 16777216 |
| connect_timeout            | 10       |
| delayed_insert_timeout     | 300      |
| innodb_lock_wait_timeout   | 120      |
| innodb_rollback_on_timeout | OFF      |
| interactive_timeout        | 28800    |
| lock_wait_timeout          | 31536000 |
| net_read_timeout           | 1200     |
| net_write_timeout          | 1200     |
| slave_net_timeout          | 3600     |
| wait_timeout               | 28800    |
+----------------------------+----------+


          I'm aware of the xtrabackup tool; the connection errors, though, have also happened on our webapp connections, so the problem is not restricted to restoring the dump - from this perspective, the dump is just a convenient means of reproducing it.
          The dump is 15 GB, anyway.



          • #6
            Ah, so it's a bigger issue than just with mysqldump.

            The log portion you posted shows MySQL shutting down and closing the remaining connections. Did you intentionally shut down the server? If not, I'd check the log file to see if MySQL started back up right after that, as it could be getting killed and then restarted by mysqld_safe. And if that is the case, my first guess would be that the OS is killing the process for high memory usage.
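
            Two quick ways to check that theory, assuming the usual Ubuntu log locations:

```shell
# Did the OOM killer fire around the failure time?
grep -i 'out of memory' /var/log/syslog

# Was mysqld restarted? A small Uptime means a recent (re)start.
mysql -u root -e "SHOW GLOBAL STATUS LIKE 'Uptime'"
```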



            • #7
              I actually get those errors without the mysqld crash. For example this is the log from today:

              121002 15:52:35 [Warning] Aborted connection 59 to db: 'production_1001' user: 'root' host: 'localhost' (Got an error writing communication packets)

              The error happened in the (un)usual way: executing a dump restore on one client, then issuing SHOW PROCESSLIST on another.
              The Percona server has been running for one day (and still is).
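
              The reproduction described here can be scripted; the dump file, schema name, and credentials below are placeholders:

```shell
# Client 1: run the restore in the background (names are placeholders).
mysql -u root production < dump.sql &
RESTORE_PID=$!

# Client 2: issue SHOW PROCESSLIST repeatedly while the restore runs;
# this is the condition under which the aborted connections appear.
while kill -0 "$RESTORE_PID" 2>/dev/null; do
    mysql -u root -e 'SHOW PROCESSLIST' > /dev/null
    sleep 1
done
wait "$RESTORE_PID"
```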



              • #8
                By the way, the server has plenty of memory:


free -g
             total       used       free     shared    buffers     cached
Mem:            15         13          2          0          0          8
-/+ buffers/cache:          4         11
Swap:            0          0          0



                • #9
                  I've just reproduced it again, in the (un)usual condition of executing a SHOW PROCESSLIST in a separate client session, while restoring the dump. MySQL did not crash.

                  I tried to reproduce the same error on the standard MySQL build, but couldn't manage to do it.

                  Error:


                  121010 10:49:37 [Warning] Aborted connection 2 to db: 'production_1010' user: 'root' host: 'localhost' (Got an error writing communication packets)
