Percona Server with XtraDB

Mysql using 100% cpu with no activity

Reported by Nathan March on 2012-08-28
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server
Undecided
Unassigned
5.5
Undecided
Unassigned

Bug Description

Originally discovered on percona 5.5.24, upgraded to 5.5.27 and bug persists. We see mysqld sitting in top at 100% cpu saturating a single core, with nothing running in mysql. This happens immediately after starting mysql, even with skip-networking and skip-slave-start.

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
32137 mysql 20 0 3829m 320m 5768 S 102 8.0 0:09.32 mysqld

metrixdb2 ~ # cat /etc/mysql/my.cnf
[mysqld]
server-id = 3
expire_logs_days = 5
log-bin = metrixdb2-bin
relay-log = metrixdb2-relay-bin

skip-networking
skip-name-resolve
skip-slave-start
log-warnings

max_connections = 255
open_files_limit = 16384
key_buffer = 128M

read_buffer_size = 1M
read_rnd_buffer_size = 4M
sort_buffer_size = 1M
tmp_table_size = 64M
max_heap_table_size = 64M
myisam_sort_buffer_size = 128M

table_open_cache = 16384
thread_cache_size = 64

max_allowed_packet = 32M
bulk_insert_buffer_size = 16M

query_cache_type = 1
query_cache_limit = 8M
query_cache_size = 256M

tmpdir = /var/mysql-tmp

slow_query_log = 1
long_query_time = 2
slow_query_log_file = /var/lib/mysql/slow.log

innodb_file_per_table
innodb_buffer_pool_size = 3G
innodb_additional_mem_pool_size = 20M
innodb_log_file_size = 256M
innodb_log_buffer_size = 8M
innodb_data_file_path = ibdata1:100M:autoextend
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency = 0

innodb_io_capacity = 400
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_adaptive_flushing = 1

[mysqldump]
quick
max_allowed_packet=32M

[myisamchk]
key_buffer=256M
sort_buffer=256M
read_buffer=2M
write_buffer=2M

mysql> Bye
metrixdb2 ~ # service mysql restart
* Restarting MySQL... [ OK ]
metrixdb2 ~ # mysqladmin proc
+----+------+-----------+----+---------+------+-------+------------------+-----------+---------------+-----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
+----+------+-----------+----+---------+------+-------+------------------+-----------+---------------+-----------+
| 1 | root | localhost | | Query | 0 | | show processlist | 0 | 0 | 1 |
+----+------+-----------+----+---------+------+-------+------------------+-----------+---------------+-----------+

metrixdb2 ~ # mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.27-log MySQL Community Server (GPL)

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show engine innodb status \G
*************************** 1. row ***************************
  Type: InnoDB
  Name:
Status:
=====================================
120828 14:13:15 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 11 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 1 1_second, 1 sleeps, 0 10_second, 1 background, 1 flush
srv_master_thread log flush and writes: 11
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 13, signal count 13
Mutex spin waits 12, rounds 352, OS waits 9
RW-shared spins 2, rounds 60, OS waits 2
RW-excl spins 0, rounds 0, OS waits 0
Spin rounds per wait: 29.33 mutex, 30.00 RW-shared, 0.00 RW-excl
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (read thread)
I/O thread 7 state: waiting for completed aio requests (read thread)
I/O thread 8 state: waiting for completed aio requests (read thread)
I/O thread 9 state: waiting for completed aio requests (read thread)
I/O thread 10 state: waiting for completed aio requests (write thread)
I/O thread 11 state: waiting for completed aio requests (write thread)
I/O thread 12 state: waiting for completed aio requests (write thread)
I/O thread 13 state: waiting for completed aio requests (write thread)
I/O thread 14 state: waiting for completed aio requests (write thread)
I/O thread 15 state: waiting for completed aio requests (write thread)
I/O thread 16 state: waiting for completed aio requests (write thread)
I/O thread 17 state: waiting for completed aio requests (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0, 0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0, 0, 0, 0, 0] ,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
454 OS file reads, 3 OS file writes, 3 OS fsyncs
4.00 reads/s, 16384 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 14, seg size 16, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 6375023, node heap has 0 buffer(s)
0.00 hash searches/s, 23.27 non-hash searches/s
---
LOG
---
Log sequence number 833482097861
Log flushed up to 833482097861
Last checkpoint at 833482097861
Max checkpoint age 434154333
Checkpoint age target 420587011
Modified age 0
Checkpoint age 0
0 pending log writes, 0 pending chkp writes
8 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 3309305856; in additional pool allocated 0
Internal hash tables (constant factor + variable factor)
    Adaptive hash index 51004400 (51000184 + 4216)
    Page hash 3188312 (buffer pool 0 only)
    Dictionary cache 12848608 (12751472 + 97136)
    File system 110832 (82672 + 28160)
    Lock system 7969872 (7969496 + 376)
    Recovery system 0 (0 + 0)
Dictionary memory allocated 97136
Buffer pool size 196607
Buffer pool size, bytes 3221209088
Free buffers 196166
Database pages 441
Old database pages 0
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 441, created 0, written 0
3.82 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 977 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 441, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
---OLDEST VIEW---
Normal read view
Read view low limit trx n:o 28479000
Read view up limit trx id 28479000
Read view low limit trx id 28479000
Read view individually stored trx ids:
-----------------
Main thread process no. 32222, id 47103927834384, state: flushing log
Number of rows inserted 0, updated 0, deleted 0, read 0
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
------------
TRANSACTIONS
------------
Trx id counter 28479000
Purge done for trx's n:o < 28478E34 undo n:o < 0
History list length 543
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 2, OS thread handle 0x2ad73d0b0710, query id 4 localhost root
show engine innodb status
----------------------------
END OF INNODB MONITOR OUTPUT
============================

1 row in set (0.00 sec)

mysql>

Log seems to indicate nothing unusual:

2012-08-28 14:13:02.416209500
2012-08-28 14:13:03.486882500 120828 14:13:03 [Note] Flashcache bypass: disabled
2012-08-28 14:13:03.486929500 120828 14:13:03 [Note] Flashcache setup error is : open flash device failed
2012-08-28 14:13:03.486931500
2012-08-28 14:13:03.513644500 120828 14:13:03 InnoDB: The InnoDB memory heap is disabled
2012-08-28 14:13:03.513699500 120828 14:13:03 InnoDB: Mutexes and rw_locks use GCC atomic builtins
2012-08-28 14:13:03.513741500 120828 14:13:03 InnoDB: Compressed tables use zlib 1.2.5
2012-08-28 14:13:03.513783500 120828 14:13:03 InnoDB: Using Linux native AIO
2012-08-28 14:13:03.562324500 120828 14:13:03 InnoDB: Initializing buffer pool, size = 3.0G
2012-08-28 14:13:03.862050500 120828 14:13:03 InnoDB: Completed initialization of buffer pool
2012-08-28 14:13:03.879582500 120828 14:13:03 InnoDB: highest supported file format is Barracuda.
2012-08-28 14:13:04.203033500 120828 14:13:04 InnoDB: Waiting for the background threads to start
2012-08-28 14:13:05.203989500 120828 14:13:05 Percona XtraDB (http://www.percona.com) 1.1.8-27.0 started; log sequence number 833482097861
2012-08-28 14:13:05.225676500 120828 14:13:05 [Note] Event Scheduler: Loaded 0 events
2012-08-28 14:13:05.225695500 120828 14:13:05 [Note] /usr/sbin/mysqld: ready for connections.
2012-08-28 14:13:05.225696500 Version: '5.5.27-log' socket: '/tmp/mysql.sock' port: 0 MySQL Community Server (GPL)

This database has never really been used, only setup + replicated from a primary database server which shows no problems.

Nathan March (nmarch) wrote :
Download full text (8.8 KiB)

Some time later, mysql has now hung on a simple insert statement from replication. I'm assuming there's some sort of internal data file corruption here that's gone undetected, let me know please if it'd be useful to see it happening.

2012-08-28 16:29:25.005361500 InnoDB: Warning: a long semaphore wait:
2012-08-28 16:29:25.005459500 --Thread 47404182935312 has waited at buf0buf.c line 3667 for 241.00 seconds the semaphore:
2012-08-28 16:29:25.005485500 Mutex at 0x2577c78 '&buf_pool->mutex', lock var 1
2012-08-28 16:29:25.005486500 waiters flag 1
2012-08-28 16:29:25.005740500 InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
2012-08-28 16:29:25.005768500 InnoDB: Pending preads 0, pwrites 0
2012-08-28 16:29:33.537933500
2012-08-28 16:29:33.537935500 =====================================
2012-08-28 16:29:33.537936500 120828 16:29:33 INNODB MONITOR OUTPUT
2012-08-28 16:29:33.537937500 =====================================
2012-08-28 16:29:33.537938500 Per second averages calculated from the last 60 seconds
2012-08-28 16:29:33.537939500 -----------------
2012-08-28 16:29:33.537940500 BACKGROUND THREAD
2012-08-28 16:29:33.537941500 -----------------
2012-08-28 16:29:33.537942500 srv_master_thread loops: 7319 1_second, 7304 sleeps, 705 10_second, 378 background, 378 flush
2012-08-28 16:29:33.537976500 srv_master_thread log flush and writes: 8093
2012-08-28 16:29:33.537977500 ----------
2012-08-28 16:29:33.537978500 SEMAPHORES
2012-08-28 16:29:33.537979500 ----------
2012-08-28 16:29:33.537980500 OS WAIT ARRAY INFO: reservation count 38142, signal count 38077
2012-08-28 16:29:33.537981500 --Thread 47404182935312 has waited at buf0buf.c line 3667 for 249.00 seconds the semaphore:
2012-08-28 16:29:33.537982500 Mutex at 0x2577c78 '&buf_pool->mutex', lock var 1
2012-08-28 16:29:33.537988500 waiters flag 1
2012-08-28 16:29:33.537989500 --Thread 47404177839888 has waited at buf0flu.c line 1887 for 242.00 seconds the semaphore:
2012-08-28 16:29:33.537990500 Mutex at 0x2577c78 '&buf_pool->mutex', lock var 1
2012-08-28 16:29:33.537991500 waiters flag 1
2012-08-28 16:29:33.537992500 Mutex spin waits 41706, rounds 1214545, OS waits 36676
2012-08-28 16:29:33.537993500 RW-shared spins 1167, rounds 33805, OS waits 1028
2012-08-28 16:29:33.537994500 RW-excl spins 38, rounds 4457, OS waits 125
2012-08-28 16:29:33.537999500 Spin rounds per wait: 29.12 mutex, 28.97 RW-shared, 117.29 RW-excl
2012-08-28 16:29:33.538000500 --------
2012-08-28 16:29:33.538001500 FILE I/O
2012-08-28 16:29:33.538001500 --------
2012-08-28 16:29:33.538002500 I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
2012-08-28 16:29:33.538011500 I/O thread 1 state: waiting for completed aio requests (log thread)
2012-08-28 16:29:33.538013500 I/O thread 2 state: waiting for completed aio requests (read thread)
2012-08-28 16:29:33.538014500 I/O thread 3 state: waiting for completed aio requests (read thread)
2012-08-28 16:29:33.538015500 I/O thread 4 state: waiting for completed aio requests (read thread)
2012-08-28 16:29:33.538016500 I/O thread 5 state: waiting for completed aio requests (read thread)
2012-08-28 16:29:33.538021500 I/O thread 6 stat...

Read more...

The hang while replicating seems to match bug 1040735.

For the startup issue, could you take GDB stacktraces while that is happening? For that, you'd need to install the debug symbols and GDB, and do
gdb -ex "set pagination 0" -ex "thread apply all bt" --batch -p $(pidof mysqld)

Nathan March (nmarch) wrote :

Sure thing. I had trouble getting it to build with debug symbols (turns on Werror and doesn't make it through compilation) but grabbed your binary package (Percona-Server-5.5.27-rel28.0-291.Linux.x86_64) and reproduced on that.

Nathan March (nmarch) wrote :

Here's an additional one a minute or two later in case it's useful

@Nathan,

Were these stacktraces taken during period of high CPU or the lock contention mentioned in #1? There doesn't seem to be much going on based on your stacktraces.

While the comment#1 is about 1040735, the high CPU is not related to that. Can you provide oprofile output? (you can use pt-stalk from percona-toolkit for this).

Also, can you confirm you are not hitting the ntp bug -- http://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/ and https://access.redhat.com/knowledge/articles/15145 (this guess innocuous high mysqld cpu usage).

In both these stacktraces the server is almost idle. The CPU is iterating over InnoDB internal sync structures for monitoring threads. This should not take a long time.

Does this happen during the startup before the server starts accepting connections or afterwards? If afterwards, what happens when you try to connect during this time?

Nathan March (nmarch) wrote :

Argh, it was indeed the leap second bug. Should have figured that out, thanks for the help.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers