Waiting for master thread to be suspended

Bug #1671437 reported by Josef Vogt
24
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Percona XtraBackup moved to https://jira.percona.com/projects/PXB
Status tracked in 2.4
2.4
Fix Released
High
Sergei Glushchenko

Bug Description

Hi

From time to time xtrabackup fails and does not finish. In the logs I can see the following message:
...
InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Creating shared tablespace for temporary tables
InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
InnoDB: Starting in background the rollback of uncommitted transactions
InnoDB: Rolling back trx with id 548932575, 1 rows to undo
InnoDB: Rollback of trx with id 548932575 completed
InnoDB: Rollback of non-prepared transactions completed
InnoDB: File './ibtmp1' size is now 12 MB.
InnoDB: 96 redo rollback segment(s) found. 1 redo rollback segment(s) are active.
InnoDB: 32 non-redo rollback segment(s) are active.
InnoDB: page_cleaner: 1000ms intended loop took 34245m23
InnoDB: page_cleaner: 1000ms intended loop took 34245m23

xtrabackup: Recovered WSREP position: 618d728d-3c47-11e6-94c0-527265a4c0cd:147540320
xtrabackup: starting shutdown with innodb_fast_shutdown = 1
InnoDB: FTS optimize thread exiting.
InnoDB: Starting shutdown...
InnoDB: Waiting for master thread to be suspended
InnoDB: Waiting for master thread to be suspended
InnoDB: Waiting for master thread to be suspended
(this message is repeated over and over, for multiple days)
...

Seems as something is preventing the shutdown to succeed.
How can I find out what it is?

The backups are executed as following:
- Stage One
/usr/bin/innobackupex --no-timestamp --no-lock --password="pwd" /srv/data/backup/`date +%Y%m%d-%H%M%S`

- Stage Two
/usr/bin/innobackupex --apply-log /srv/data/backup/`date +%Y%m%d-%H%M%S`

Best regards
Joe

Revision history for this message
Jericho Rivera (jericho-rivera) wrote :

Can you share the xtrabackup --version used for stage one and stage two? Please share my.cnf from source of the backup.

Changed in percona-xtrabackup:
status: New → Incomplete
Revision history for this message
Josef Vogt (joe81) wrote :

# xtrabackup --version
xtrabackup version 2.4.5 based on MySQL server 5.7.13 Linux (x86_64) (revision id: e41c0be)

There is no special my.cnf for xtrabackup, only the one for the MariaDB Server. And in this file, there is no special [xtrabackup] section.

The mysld part of my.cnf:
[mysqld]
# General
datadir=/srv/data/live/data
socket=/var/lib/mysql/mysql.sock
symbolic-links=0

# network stuff
bind-address=0.0.0.0
wait_timeout=600
max_connections=5000
max_user_connections=1000
max_allowed_packet=16M

# engine stuff
default-storage-engine=InnoDB
innodb_flush_log_at_trx_commit=1
innodb_buffer_pool_size=2G
innodb_log_file_size=256M
innodb_log_buffer_size=4M
innodb_thread_concurrency=8
innodb_flush_method=O_DIRECT
innodb_file_per_table=1
innodb_autoinc_lock_mode=2

# Binlog
binlog_format=row

# Logging
general-log=on
log-output=file
general-log-file=/var/log/mysql/queries.log
log-error=/var/log/mysql/mysqld.log
slow_query_log=on
slow_query_log_file=/var/log/mysql/slow_queries.log

# PID File
pid-file=/var/run/mysqld/mysqld.pid

Thanks
Joe

Changed in percona-xtrabackup:
status: Incomplete → New
Revision history for this message
Andrew Mc (amcq) wrote :

I seem to be having this same issue as well:

$ innobackupex --version
innobackupex version 2.4.6 Linux (x86_64) (revision id: 8ec05b7)

$ mysql --version
mysql Ver 15.1 Distrib 10.2.4-MariaDB, for Linux (x86_64) using readline 5.1

# innobackupex --apply-log /root/my-backup/2017-03-24_12-12-54
170324 12:23:40 innobackupex: Starting the apply-log operation

IMPORTANT: Please check that the apply-log run completes successfully.
           At the end of a successful apply-log run innobackupex
           prints "completed OK!".

innobackupex version 2.4.6 based on MySQL server 5.7.13 Linux (x86_64) (revision id: 8ec05b7)
....
....
....
InnoDB: Progress in percents: 1InnoDB: File './ibtmp1' size is now 12 MB.
InnoDB: 96 redo rollback segment(s) found. 1 redo rollback segment(s) are active.
InnoDB: 32 non-redo rollback segment(s) are active.
InnoDB: page_cleaner: 1000ms intended loop took 492004ms. The settings might not be optimal. (flushed=0 and evicted=0, during the time.)
InnoDB: page_cleaner: 1000ms intended loop took 492004ms. The settings might not be optimal. (flushed=0 and evicted=0, during the time.)
InnoDB: xtrabackup: Last MySQL binlog file position 332937698, file name ./binlog.000001

xtrabackup: Recovered WSREP position: 00000000-0000-0000-0000-000000000000:-1
xtrabackup: starting shutdown with innodb_fast_shutdown = 1
InnoDB: FTS optimize thread exiting.
InnoDB: Starting shutdown...
 2 3 4 5 6 7InnoDB: Waiting for 1 active transactions to finish
 8InnoDB: Waiting for 1 active transactions to finish
 9 10InnoDB: Waiting for 1 active transactions to finish
 11 12InnoDB: Waiting for 1 active transactions to finish
 13 14InnoDB: Waiting for 1 active transactions to finish
 15InnoDB: Waiting for 1 active transactions to finish
 16 17InnoDB: Waiting for 1 active transactions to finish

It's now been sitting like this (up to #81) for about an hour.

Revision history for this message
Jericho Rivera (jericho-rivera) wrote :

Can any of the reports provide the full backup and prepare logs if you can reproduce the problem.

Thanks!

Changed in percona-xtrabackup:
status: New → Incomplete
Revision history for this message
Josef Vogt (joe81) wrote :

As this only happens from time to time, I have to wait until to problem occurs again.

Revision history for this message
Sergei Glushchenko (sergei.glushchenko) wrote :

Indeed, this happens sometimes because of the race condition in InnoDB shutdown process. This simple test case reproduces the issue (need to run several times to hit the bug).

function start_uncomitted_transaction()
{
    run_cmd $MYSQL $MYSQL_ARGS sakila <<EOF
START TRANSACTION;
DELETE FROM payment;
SELECT SLEEP(10000);
EOF
}

start_server
load_sakila

start_uncomitted_transaction &
transaction=$!

sleep 5

xtrabackup --backup --target-dir=$topdir/backup

kill -SIGKILL $transaction
stop_server

xtrabackup --prepare --target-dir=$topdir/backup

Revision history for this message
Ovais Tariq (ovais-tariq) wrote :
Download full text (7.4 KiB)

I have hit the same issue. Xtrabackup has been in the state "InnoDB: Waiting for master thread to be suspended" for a very long time.

Below are the stack traces:

root@schemadock362-sjc1:~# nsenter -t 23995 -m -p quickstack -f -p 18928
2017-06-01 05:01:01 154382: Reading process symbols..
2017-06-01 05:01:01 155473: No debug symbols found on /usr/bin/xtrabackup
2017-06-01 05:01:01 201782: Gathering stack traces..
2017-06-01 05:01:01 202629: Printing stack traces..
2017-06-01 05:01:01 202644: Total Traced Time: 0.458 milliseconds
2017-06-01 05:01:01 202650: Average Traced Time Per LWP: 0.038 milliseconds
2017-06-01 05:01:01 202653: Longest Traced LWP: LWP 18981, 0.057 milliseconds

Thread 12 (LWP 18981):
#01 0x00007f73be1ac0b0 in pthread_cond_wait@@GLIBC_2.3.2 () from /tmp/buildd/glibc-2.19/nptl/../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
#02 0x000000000076a8d6 in os_event::wait_low () from /usr/bin/xtrabackup
#03 0x000000000077b9d6 in os_aio_handler () from /usr/bin/xtrabackup
#04 0x00000000008e2c62 in fil_aio_wait () from /usr/bin/xtrabackup
#05 0x000000000075f4e8 in io_handler_thread () from /usr/bin/xtrabackup
#06 0x00007f73be1a80ca in start_thread () from /tmp/buildd/glibc-2.19/nptl/pthread_create.c:309

Thread 11 (LWP 18982):
#01 0x00007f73be1ac0b0 in pthread_cond_wait@@GLIBC_2.3.2 () from /tmp/buildd/glibc-2.19/nptl/../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
#02 0x000000000076a8d6 in os_event::wait_low () from /usr/bin/xtrabackup
#03 0x000000000077b9d6 in os_aio_handler () from /usr/bin/xtrabackup
#04 0x00000000008e2c62 in fil_aio_wait () from /usr/bin/xtrabackup
#05 0x000000000075f4e8 in io_handler_thread () from /usr/bin/xtrabackup
#06 0x00007f73be1a80ca in start_thread () from /tmp/buildd/glibc-2.19/nptl/pthread_create.c:309

Thread 10 (LWP 18983):
#01 0x00007f73be1ac0b0 in pthread_cond_wait@@GLIBC_2.3.2 () from /tmp/buildd/glibc-2.19/nptl/../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
#02 0x000000000076a8d6 in os_event::wait_low () from /usr/bin/xtrabackup
#03 0x000000000077b9d6 in os_aio_handler () from /usr/bin/xtrabackup
#04 0x00000000008e2c62 in fil_aio_wait () from /usr/bin/xtrabackup
#05 0x000000000075f4e8 in io_handler_thread () from /usr/bin/xtrabackup
#06 0x00007f73be1a80ca in start_thread () from /tmp/buildd/glibc-2.19/nptl/pthread_create.c:309

Thread 9 (LWP 18984):
#01 0x00007f73be1ac0b0 in pthread_cond_wait@@GLIBC_2.3.2 () from /tmp/buildd/glibc-2.19/nptl/../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
#02 0x000000000076a8d6 in os_event::wait_low () from /usr/bin/xtrabackup
#03 0x000000000077b9d6 in os_aio_handler () from /usr/bin/xtrabackup
#04 0x00000000008e2c62 in fil_aio_wait () from /usr/bin/xtrabackup
#05 0x000000000075f4e8 in io_handler_thread () from /usr/bin/xtrabackup
#06 0x00007f73be1a80ca in start_thread () from /tmp/buildd/glibc-2.19/nptl/pthread_create.c:309

Thread 8 (LWP 18985):
#01 0x00007f73be1ac0b0 in pthread_cond_wait@@GLIBC_2.3.2 () from /tmp/buildd/glibc-2.19/nptl/../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
#02 0x000000000076a8d6 in os_event::wait_low () from /usr/bin/xtrabackup
#0...

Read more...

Revision history for this message
Ovais Tariq (ovais-tariq) wrote :

XtraBackup version that I used to take the backup and prepare the backup is 2.4.7.

The prepare command is as follows:
/usr/bin/xtrabackup --prepare --use-memory=20444M --target-dir=/var/lib/mysql

Revision history for this message
Sergei Glushchenko (sergei.glushchenko) wrote :
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXB-499

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.