This looks like a bug in the non-lazy drop-table path -- lock ordering is not preserved, which leads to a deadlock (I presume tables were being dropped around that time?).
Does setting innodb_lazy_drop_table=ON help? (It is a dynamic variable, so no server restart is needed.)
Regarding the lock wait itself (in the non-lazy case):
--Thread 140476533987072 has waited at buf0buf.c line 2529 for 303.00 seconds the semaphore:
S-lock on RW-latch at 0x416f048 '&buf_pool->page_hash_latch'
a writer (thread id 140476533987072) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0buf.c line 2529
Last time write locked in file /home/jenkins/workspace/percona-server-5.5-debs/label_exp/debian6-64/target/Percona-Server-5.5.24-rel26.0/storage/innobase/buf/buf0lru.c line 629
The waiter and the thread being waited on are the same: 140476533987072. The reason is that the drop-table path (buf0lru.c:629) also reaches buf0buf.c:2529 -- both are rooted in buf_LRU_remove_all_pages.
In other words,
at buf0buf.c:2529 in buf_page_get_gen:
rw_lock_s_lock(&buf_pool->page_hash_latch); --> the thread is already holding the X-latch here, which it acquired at buf0lru.c:629
btr_search_drop_page_hash_when_freed already accounts for this kind of lock ordering, but I believe it checks only block->lock:
"""
/* If the caller has a latch on the page, then the caller must
have a x-latch on the page and it must have already dropped
the hash index for the page. Because of the x-latch that we
are possibly holding, we cannot s-latch the page, but must
(recursively) x-latch it, even though we are only reading. */
"""