main-loop: WARNING: I/O thread spun for 1000 iterations

Bug #1481272 reported by Sibiao Luo
60
This bug affects 11 people
Affects Status Importance Assigned to Milestone
QEMU
Undecided
Unassigned
qemu (Ubuntu)
Undecided
Unassigned
Xenial
Undecided
Unassigned
Artful
Undecided
Unassigned
Bionic
Undecided
Unassigned

Bug Description

I compile the latest qemu to launch a VM but the monitor output the "main-loop: WARNING: I/O thread spun for 1000 iterations".

# /usr/local/bin/qemu-system-x86_64 -name rhel6 -S -no-kvm -m 1024M -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid c9dd2a5c-40f2-fd3d-3c54-9cd84f8b9174 -rtc base=utc -drive file=/home/samba-share/ubuntu.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=disk,serial=425618d4-871f-4021-bc5d-bcd7f1b5ca9c,bootindex=0 -vnc :1 -boot menu=on -monitor stdio
QEMU 2.3.93 monitor - type 'help' for more information
(qemu) c
(qemu) main-loop: WARNING: I/O thread spun for 1000 iterations <-----------------------

qemu]# git branch -v
* master e95edef Merge remote-tracking branch 'remotes/sstabellini/tags/xen-migration-2.4-tag' into staging

Revision history for this message
Sibiao Luo (sibiaoluo) wrote :

host kernel version: 2.6.32-504.23.4.el6.x86_64

Revision history for this message
Sibiao Luo (sibiaoluo) wrote :

(qemu) info version
2.3.93

Revision history for this message
Stefan Hajnoczi (stefanha) wrote : Re: [Qemu-devel] [Bug 1481272] [NEW] main-loop: WARNING: I/O thread spun for 1000 iterations

On Tue, Aug 4, 2015 at 10:56 AM, Sibiao Luo <email address hidden> wrote:
> I compile the latest qemu to launch a VM but the monitor output the
> "main-loop: WARNING: I/O thread spun for 1000 iterations".

This this warning message is new, please use git-bisect(1) to
determine which commit caused it to appear:

https://git-scm.com/book/en/v2/Git-Tools-Debugging-with-Git#Binary-Search
https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html

Once you've found the commit that caused the warning to appear, please
post the details.

Stefan

Revision history for this message
Sibiao Luo (sibiaoluo) wrote :
Download full text (6.0 KiB)

Good catch, thanks stefanha for your kindly reminds with such good tools (git-bisect) to determine which commit caused this problem. I'm very sorry that i did not try it as your instruction timely, mainly that i focus on the openstack(nova&cinder) currently in the new company and have to adapt to the new work flow & role as quickly as possible. According to me trying that 05e514b1d4d5bd4209e2c8bbc76ff05c85a235f3 is the first bad commit pushed by pbonzini.

My trying results as following:
master 074a992 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging <---> latest version fail
qemu-2.3-stable dfa83a6 Update version for 2.3.1 release <---> qemu-2.3-stable good

[root@PEK1000012301 qemu]# git bisect start
[root@PEK1000012301 qemu]# git bisect bad 074a992
[root@PEK1000012301 qemu]# git bisect good dfa83a6
Bisecting: a merge base must be tested
[e5b3a24181ea0cebf1c5b20f44d016311b7048f0] Update version for v2.3.0 release
<<<compile QEMU with this branch and try to launch VM to test as comment#0>>>

[root@PEK1000012301 qemu]# git bisect good
Bisecting: 1105 revisions left to test after this (roughly 10 steps)
[afa25c4bb5bd0732dca4aa0691fd4682d242925f] Merge remote-tracking branch 'remotes/kraxel/tags/pull-sdl-20150611-1' into staging
<<<compile QEMU with this branch and try to launch VM to test as comment#0>>>

[root@PEK1000012301 qemu]# git bisect good
Bisecting: 552 revisions left to test after this (roughly 9 steps)
[922f893e57da24bc80db3e79bea56485d1c111fa] ahci: assert is_ncq for process_ncq
<<<compile QEMU with this branch and try to launch VM to test as comment#0>>>

[root@PEK1000012301 qemu]# git bisect good
Bisecting: 274 revisions left to test after this (roughly 8 steps)
[711dc6f36b74fe65a6e5a1847f1152717d887f8a] Merge remote-tracking branch 'remotes/cody/tags/jtc-for-upstream-pull-request' into staging
<<<compile QEMU with this branch and try to launch VM to test as comment#0>>>

[root@PEK1000012301 qemu]# git bisect good
Bisecting: 138 revisions left to test after this (roughly 7 steps)
[226d007dbd75ec8d0f12d0f9e1ce66caf55d49e4] gdbstub: Set current CPU on interruptions
<<<compile QEMU with this branch and try to launch VM to test as comment#0>>>

[root@PEK1000012301 qemu]# git bisect bad
Bisecting: 67 revisions left to test after this (roughly 6 steps)
[b9c46307996856d03ddc1527468ff5401ac03a79] Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2015-07-21-tag' into staging
<<<compile QEMU with this branch and try to launch VM to test as comment#0>>>

[root@PEK1000012301 qemu]# git bisect good
Bisecting: 32 revisions left to test after this (roughly 5 steps)
[f793d97e454a56d17e404004867985622ca1a63b] Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
<<<compile QEMU with this branch and try to launch VM to test as comment#0>>>

[root@PEK1000012301 qemu]# git bisect bad
Bisecting: 17 revisions left to test after this (roughly 4 steps)
[80adb8fcad4778376a11d394a9e01516819e2327] tcg/aarch64: use 32-bit offset for 32-bit softmmu emulation
<<<compile QEMU with this branch and try to launch VM to test...

Read more...

Revision history for this message
Sibiao Luo (sibiaoluo) wrote :

[root@PEK1000012301 qemu]# git bisect log
git bisect start
# bad: [074a9925e1cfd659d5376dcaccd1436d3840e611] Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
git bisect bad 074a9925e1cfd659d5376dcaccd1436d3840e611
# good: [dfa83a6bae960e3e3a3186264d75790cfd727bce] Update version for 2.3.1 release
git bisect good dfa83a6bae960e3e3a3186264d75790cfd727bce
# good: [e5b3a24181ea0cebf1c5b20f44d016311b7048f0] Update version for v2.3.0 release
git bisect good e5b3a24181ea0cebf1c5b20f44d016311b7048f0
# good: [afa25c4bb5bd0732dca4aa0691fd4682d242925f] Merge remote-tracking branch 'remotes/kraxel/tags/pull-sdl-20150611-1' into staging
git bisect good afa25c4bb5bd0732dca4aa0691fd4682d242925f
# good: [922f893e57da24bc80db3e79bea56485d1c111fa] ahci: assert is_ncq for process_ncq
git bisect good 922f893e57da24bc80db3e79bea56485d1c111fa
# good: [711dc6f36b74fe65a6e5a1847f1152717d887f8a] Merge remote-tracking branch 'remotes/cody/tags/jtc-for-upstream-pull-request' into staging
git bisect good 711dc6f36b74fe65a6e5a1847f1152717d887f8a
# bad: [226d007dbd75ec8d0f12d0f9e1ce66caf55d49e4] gdbstub: Set current CPU on interruptions
git bisect bad 226d007dbd75ec8d0f12d0f9e1ce66caf55d49e4
# good: [b9c46307996856d03ddc1527468ff5401ac03a79] Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2015-07-21-tag' into staging
git bisect good b9c46307996856d03ddc1527468ff5401ac03a79
# bad: [f793d97e454a56d17e404004867985622ca1a63b] Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
git bisect bad f793d97e454a56d17e404004867985622ca1a63b
# bad: [80adb8fcad4778376a11d394a9e01516819e2327] tcg/aarch64: use 32-bit offset for 32-bit softmmu emulation
git bisect bad 80adb8fcad4778376a11d394a9e01516819e2327
# bad: [dc94bd9166af5236a56bd5bb06845911915a925c] Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
git bisect bad dc94bd9166af5236a56bd5bb06845911915a925c
# good: [6493c975af75be5b8d9ade954239bdf5492b7911] aio-win32: reorganize polling loop
git bisect good 6493c975af75be5b8d9ade954239bdf5492b7911
# good: [21a03d17f2edb1e63f7137d97ba355cc6f19d79f] AioContext: fix broken placement of event_notifier_test_and_clear
git bisect good 21a03d17f2edb1e63f7137d97ba355cc6f19d79f
# bad: [05e514b1d4d5bd4209e2c8bbc76ff05c85a235f3] AioContext: optimize clearing the EventNotifier
git bisect bad 05e514b1d4d5bd4209e2c8bbc76ff05c85a235f3

Revision history for this message
Arghya Banerjee (arbanerj) wrote :

We are hitting this bug as well. With Ceph storage back end running in OpenStack environment.

Ubuntu 14.04
Kernel - 3.13.0-52-generic
Qemu: QEMU emulator version 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1.11)

Please provide a workaround or solution or if a patch is being worked on.

Thank you,

Arghya

Revision history for this message
Paolo Bonzini (bonzini) wrote :

Please try removing the "notified = true;" line from main-loop.c's os_host_main_loop_wait, and see if the warning comes out forever or just a few times.

Revision history for this message
Bruce Hohl (brucehohl) wrote :

I have experienced this behavior (main-loop: WARNING: I/O thread spun for 1000 iterations) and the resulting degraded performance. The VM becomes very unresponsive but eventually recovers. My setup:

Ubuntu 15.10 | qemu-system-x86_64 --version QEMU emulator version 2.3.0 (Debian 1:2.3+dfsg-5ubuntu9)
HW = Dell Precision M6600 with Intel(R) Core(TM) i7-2760QM CPU

I agree with other comments that it seems related to high disk I/O as the problem seems to occur during VM install or other VM operations which read/write large amounts of data.

Revision history for this message
T-artem (t-artem) wrote :

Count me in: Windows 10 Pro 64 November 2015 update is totally unusable because of this bug.

Some people say their VM recovers over time but mine doesn't apparently - I've been waiting for 15 minutes and the QEmu window is just dead.

Revision history for this message
Martin von Gagern (gagern) wrote :

I see the same with a Windows 10 guest on a Gentoo host. I tried setting nonblocking = 0; in vl.c as https://rafalcieslak.wordpress.com/2015/11/20/qemu-main-loop-warning-io-thread-spun-for-1000-iterations/ suggests, but that didn't solve the problem for me: the message is still there.

Revision history for this message
Marcin Mielniczuk (marmistrz) wrote :

I experience the same while running Minix guest under an Arch Linux host. Sometimes during heavy I/O the whole machine freezes with this error message.

Revision history for this message
Marcin Mielniczuk (marmistrz) wrote :

The command used is: qemu-system-x86_64 -drive file=../minix-img/minix_work.img -net user,hostfwd=tcp::22222-:22 -net nic,model=virtio -m 1024M
qemu 1.8.0 in this case, I'll test against 1.8.1 when it gets into the repos.

Revision history for this message
Marcin Mielniczuk (marmistrz) wrote :

Apparently, the solution is: https://rafalcieslak.wordpress.com/2015/11/20/qemu-main-loop-warning-io-thread-spun-for-1000-iterations/

I haven't experienced the hang within a couple of days, it's fairly irregular for me to reproduce it.

Revision history for this message
Peter Maydell (pmaydell) wrote :

I think that's a workaround, not a fix for whatever the underlying bug is.

Revision history for this message
Andreas Gustafsson (gson) wrote :

http://gnats.netbsd.org/52184 looks like it may be related.

Revision history for this message
Andreas Gustafsson (gson) wrote :

I no longer see the "WARNING: I/O thread spun for 1000 iterations" message. A bisection showed that it disappeared with the following commit:

commit e330c118f2a5a5365409b123cd0dd2c7d575bf05
Author: Paolo Bonzini <email address hidden>
Date: Fri Mar 3 11:51:07 2017 +0100

    main-loop: remove now unnecessary optimization

    This optimization is not necessary anymore, because the vCPU now drops
    the I/O thread lock even with TCG. Drop it to simplify the code and
    avoid the "I/O thread spun for 1000 iterations" warning.

    Reviewed-by: Alex Bennée <email address hidden>
    Reviewed-by: Edgar E. Iglesias <email address hidden>
    Signed-off-by: Paolo Bonzini <email address hidden>

The amount of system time consumed by the qemu process, which had incrased 300-fold with commit 05e514b1d4d5bd4209e2c8bbc76ff05c85a235f3, is also back to normal.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Nice, thanks for the update.
I checked a few recent logs and agree that the message is gone since 2.9

Some might still encounter slow guests (for other reasons) and trigger the message, but the most common TCG emulation case is solved by this since Artful and later.

I think we don't want to backport all the prereqs (I/O thread locking changes) to Xenial that allow to add this change here.

Changed in qemu (Ubuntu Xenial):
status: New → Won't Fix
Changed in qemu (Ubuntu Artful):
status: New → Fix Released
Changed in qemu (Ubuntu Bionic):
status: New → Fix Released
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Updated the Ubuntu tasks on this bug, leaving the qemu state to a triager of the project.

Revision history for this message
Thomas Huth (th-huth) wrote :

Ok, closing this for upstream, too, according to the previous comments.

Changed in qemu:
status: New → Fix Released
Revision history for this message
Afsah Anwar (afsah) wrote :

I see the same issue with qemu version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.31).

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers