  • strange semaphore locking issues, just started

    Running mysql 5.5.21 server on centos 5.7 (we upgraded from 5.0 to 5.5 about two weeks ago). It's a fairly high-usage implementation, but we weren't seeing any issues until Friday, when the server appeared to "hang." Saw this in the innodb status:


    ----------
    SEMAPHORES
    ----------
    OS WAIT ARRAY INFO: reservation count 5716301, signal count 312799955
    --Thread 1233332544 has waited at btr0sea.c line 882 for 0.00 seconds the semaphore:
    S-lock on RW-latch at 0x2ab05721de58 created in file btr0sea.c line 178
    number of readers 6, waiters flag 0, lock_word: ffffa
    Last time read locked in file btr0sea.c line 882
    Last time write locked in file /export/home/pb2/build/sb_0-4846558-1328009422.24/rpm/BUILD/mysql-5.5.21/mysql-5.5.21/storage/innobase/btr/btr0sea.c line 633
    --Thread 1241786688 has waited at buf0buf.c line 2250 for 0.00 seconds the semaphore:
    Mutex at 0x20d0cae8 created file buf0buf.c line 1159, lock var 0
    waiters flag 0
    --Thread 1237031232 has waited at btr0sea.c line 882 for 0.00 seconds the semaphore:
    S-lock on RW-latch at 0x2ab05721de58 created in file btr0sea.c line 178
    number of readers 2, waiters flag 0, lock_word: ffffe
    Last time read locked in file btr0sea.c line 882
    Last time write locked in file /export/home/pb2/build/sb_0-4846558-1328009422.24/rpm/BUILD/mysql-5.5.21/mysql-5.5.21/storage/innobase/btr/btr0sea.c line 633
    --Thread 1252354368 has waited at btr0sea.c line 633 for 0.00 seconds the semaphore:
    X-lock on RW-latch at 0x2ab05721de58 created in file btr0sea.c line 178
    number of readers 3, waiters flag 0, lock_word: ffffd
    Last time read locked in file btr0sea.c line 882
    Last time write locked in file /export/home/pb2/build/sb_0-4846558-1328009422.24/rpm/BUILD/mysql-5.5.21/mysql-5.5.21/storage/innobase/btr/btr0sea.c line 633
    Mutex spin waits 2455650660, rounds 8103916598, OS waits 1919724
    RW-shared spins 112419621, rounds 356817756, OS waits 1176007
    RW-excl spins 15709334, rounds 424093944, OS waits 579485
    Spin rounds per wait: 3.30 mutex, 3.17 RW-shared, 27.00 RW-excl


    After some research, it appeared that the innodb_adaptive_hash_index should be turned off, according to this article:
    http://docs.oracle.com/cd/E17952_01/refman-5.5-en/innodb-performance-adaptive_hash_index.html
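
    (For reference, in 5.5 the variable is dynamic, so it can be toggled at runtime without a restart -- a minimal sketch, assuming a user with the SUPER privilege:)

      -- check the current setting
      SHOW GLOBAL VARIABLES LIKE 'innodb_adaptive_hash_index';
      -- disable / re-enable at runtime; also set it in my.cnf so it persists across restarts
      SET GLOBAL innodb_adaptive_hash_index = OFF;
      SET GLOBAL innodb_adaptive_hash_index = ON;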


    I disabled it on Friday night. Early this morning there were similar symptoms (it looks like the database was hanging). The innodb status showed this (I've cut out some of the lines because there are a lot):


    --Thread 1292212544 has waited at buf0buf.c line 2250 for 0.00 seconds the semaphore:
    Mutex at 0x44daae8 created file buf0buf.c line 1159, lock var 1
    waiters flag 0
    --Thread 1314404672 has waited at buf0buf.c line 2250 for 0.00 seconds the semaphore:
    Mutex at 0x44daae8 created file buf0buf.c line 1159, lock var 0
    waiters flag 0
    --Thread 1248885056 has waited at buf0buf.c line 2250 for 0.00 seconds the semaphore:
    Mutex at 0x44daae8 created file buf0buf.c line 1159, lock var 1
    waiters flag 0
    --Thread 1327614272 has waited at buf0buf.c line 2250 for 0.00 seconds the semaphore:
    Mutex at 0x44daae8 created file buf0buf.c line 1159, lock var 0
    waiters flag 0
    Mutex spin waits 537805198, rounds 3607532492, OS waits 3101709
    RW-shared spins 4823382, rounds 12144073, OS waits 71610
    RW-excl spins 722301, rounds 22876801, OS waits 26941
    Spin rounds per wait: 6.71 mutex, 2.52 RW-shared, 31.67 RW-excl



    I immediately turned the innodb_adaptive_hash_index back on and things settled down to normal.

    Any idea what could be causing this and how to fix it? And how can I monitor for it?

  • #2
    I think you need to capture more information when this happens. I would use pt-stalk. Ensure that oprofile and GDB are installed and enabled -- you will need the data from them.
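
    A minimal sketch of the kind of invocation I mean (run it as root; the threshold here is just an example -- tune it to your workload):

      # watch Threads_running and dump diagnostics (including gdb and oprofile data) when it spikes
      pt-stalk --daemonize --dest /var/lib/pt-stalk \
               --function status --variable Threads_running --threshold 100 \
               --collect-gdb --collect-oprofile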

    • #3
      I've spent several hours with pt-stalk. I'm not understanding how to collect data from it (and it keeps asking for my password every single time). Any pointers you could give me with it?

      • #4
        You should generally run it as root, with default arguments except for customizing the trigger and, in this case, ensuring that oprofile and gdb data are collected. To avoid entering a password, put the password in root's /root/.my.cnf file. I usually run it as a daemon, too.
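
        For example, a /root/.my.cnf along these lines keeps the tools from prompting (just a sketch -- substitute your real credentials):

          [client]
          user=root
          password=your_password_here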

        You can also create a configuration file for pt-stalk; see http://www.percona.com/doc/percona-toolkit/2.0/configuration_files.html
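
        Something like this in /etc/percona-toolkit/pt-stalk.conf, for instance (a sketch -- option names go in the file without the leading dashes, and the destination and threshold are just placeholders):

          daemonize
          dest=/var/lib/pt-stalk
          variable=Threads_running
          threshold=100
          collect-gdb
          collect-oprofile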

        • #5
          @bytemare did you find anything in your research? It seems we are hitting the same issue and don't have an explanation.

          • #6
            After a lot of research, I came up with the following, which seems to be helping:
            innodb_thread_concurrency = 16 (in 5.5 it's set to 0, which is unlimited) -- this setting made a huge difference
            innodb_buffer_pool_instances = 2 (introduced in 5.5; note our server has 64GB of RAM and buffer_pool_size is 40GB)

            Basically, I collected status whenever we noticed increased load on our server. There were a lot of semaphore issues, like above, and I also noticed threads_running = 254 during the first incident. So tuning that down made a huge difference. I started with 32, but lowered it to 16 -- it seemed to smooth out the spikes in load. I still saw issues with the buf0buf.c thing at times, so that's when I divided the buffer pool in two. So far, after 1 week, no issues.
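
            For reference, the relevant part of our my.cnf now looks roughly like this (a sketch -- the sizes are specific to our 64GB box):

              [mysqld]
              innodb_buffer_pool_size = 40G
              # split the buffer pool to reduce contention on the buffer pool mutex
              innodb_buffer_pool_instances = 2
              # cap the threads running inside InnoDB; extra threads queue instead of thrashing
              innodb_thread_concurrency = 16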

            I still look at show engine innodb mutex and see a lot of waits, so I think it's not tuned as much as it could be, but the load on the server is much better and we don't seem to be running into any mutex lock issues.
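
            For monitoring, this is roughly what I check by hand when load climbs (sketch):

              -- semaphore waits and overall engine state
              SHOW ENGINE INNODB STATUS\G
              -- per-mutex wait counters
              SHOW ENGINE INNODB MUTEX;
              -- a spike here was our first symptom
              SHOW GLOBAL STATUS LIKE 'Threads_running';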

            • #7
              Thanks for your reply. We pinned it down to the same issue; the bugs are listed here:

              https://bugs.launchpad.net/percona-server/+bug/1035892
              https://bugs.launchpad.net/percona-server/+bug/1007268
              https://bugs.launchpad.net/percona-server/+bug/1040735

              • #8
                @bytemare,

                I am facing the same issue with btr0sea.c

                OS WAIT ARRAY INFO: reservation count 248796, signal count 10462982
                --Thread 1218939200 has waited at btr0sea.c line 1508 for 0.0000 seconds the semaphore:
                S-lock on RW-latch at 0x29ea4b18 'btr_search_latch_part[i]'
                number of readers 0, waiters flag 0, lock_word: 100000
                Last time read locked in file btr0sea.c line 918
                Last time write locked in file /home/jenkins/workspace/percona-server-5.5-rpms/label_exp/centos5-64/target/BUILD/Percona-Server-5.5.22-rel25.2/Percona-Server-5.5.22-rel25.2/storage/innobase/btr/btr0sea.c line 669

                Can you tell me the reason for setting these parameters:

                innodb_thread_concurrency = 16
                innodb_buffer_pool_instances = 2

                Currently my RAM is 35 GB and we are using an 18 GB buffer pool (as we get lots of system memory issues). Our current settings are:
                innodb_thread_concurrency = 0
                innodb_buffer_pool_instances = 4
                No. of CPU = 4

                • #9
                  Hi. This is my understanding of the parameters: first, innodb_thread_concurrency -- if you set this to 0, you get an unlimited number of operating system threads within InnoDB. If you have performance issues, a lot of threads can get queued up, and eventually the OS has to swap them out, which is expensive. So we found it was better to set it to 16 and just let additional threads wait.

                  As for innodb_buffer_pool_instances, when you have a lot of RAM you can eventually get contention on the buffer pool, which will also appear to stall the db. Breaking it up into two instances reduces that contention. I don't think you need this set to 4; I would eliminate it (you only have 18GB assigned to your buffer pool) or set it to two. As for the thread concurrency, there are some interesting analyses, I believe from Percona, that show peak performance at around 32 or even 16 concurrent threads on a busy server.
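
                  So for your box (35 GB of RAM, an 18 GB buffer pool, 4 CPUs), a starting point might look something like this -- just a sketch to experiment with, not a guarantee:

                    [mysqld]
                    innodb_buffer_pool_size = 18G
                    # one or two instances is plenty for an 18GB pool
                    innodb_buffer_pool_instances = 2
                    # cap runnable threads inside InnoDB rather than leaving it unlimited (0)
                    innodb_thread_concurrency = 16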

                  • #10
                    Is this a known issue, and has it been accepted by Percona?
                    As per bugs.mysql.com/bug.php?id=66402, the issue seems to be on the Percona side, not upstream.
