qemu hangs on live-migration with storage(virtual machine: windows Server 2008 with only one disk )

Bug #1434779 reported by youyunyehe
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Expired
Undecided
Unassigned

Bug Description

Hi,
We are using “Windows Server 2008” with qemu-kvm-2.0.1 (linux kernel:3.10.0) as a host of a VM.
Using drive_mirror to do live-migration on the same host for different disks
Local disk: /sf/data/local/
Shared disk(iscsi): /sf/other/local/ --- the disk is busy, the IO rate is about 30MB/s
qemu-system-x86_64 -k en-us -m 2048 -smp 2 -vnc :3100 -usbdevice tablet -boot c -enable-kvm -drive file=/sf/data/local/vm-disk-1.qcow2,if=none,id=drive-ide0,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=100

Step 1:
Start migration: Drive_mirror -f drive-ide0 /sf/other/local/test.qcow2

Step 2:
When detect the migration has completed, then send cmd: block_job_complete -f drive-ide0

Step 3:
send cmd: info status
What surprised me is that the qemu monitor reports can’t be connected.

Then find bellows:
The qemu process hangs on the mirror_run->bdrv_drain_all->aio_poll->qemu_poll_ns->ppoll (),
None events were received and poll forever. I don’t know why the aio can’t be responsed. This case is hardly to
be generated but it really happens sometimes . I’m looking forward to getting help from you .
The stack capture snapshot:

Thanks
MyAngle

Revision history for this message
youyunyehe (youyunyehe) wrote :
Revision history for this message
Stefan Hajnoczi (stefanha) wrote : Re: [Qemu-devel] [Bug 1434779] [NEW] qemu hangs on live-migration with storage(virtual machine: windows Server 2008 with only one disk )

On Sat, Mar 21, 2015 at 07:44:37AM -0000, youyunyehe wrote:
> We are using “Windows Server 2008” with qemu-kvm-2.0.1 (linux kernel:3.10.0) as a host of a VM.

Please give the exact qemu-kvm package version and Linux distro name.
The code in qemu.git/master is slightly different so maybe the issue no
longer happens, but it would be worth looking at the exact source code
your qemu-kvm is built from.

I have CCed Jeff Cody, who maintains drive-mirror.

> Using drive_mirror to do live-migration on the same host for different disks
> Local disk: /sf/data/local/
> Shared disk(iscsi): /sf/other/local/ --- the disk is busy, the IO rate is about 30MB/s
> qemu-system-x86_64 -k en-us -m 2048 -smp 2 -vnc :3100 -usbdevice tablet -boot c -enable-kvm -drive file=/sf/data/local/vm-disk-1.qcow2,if=none,id=drive-ide0,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=100
>
> Step 1:
> Start migration: Drive_mirror -f drive-ide0 /sf/other/local/test.qcow2
>
> Step 2:
> When detect the migration has completed, then send cmd: block_job_complete -f drive-ide0
>
> Step 3:
> send cmd: info status
> What surprised me is that the qemu monitor reports can’t be connected.
>
> Then find bellows:
> The qemu process hangs on the mirror_run->bdrv_drain_all->aio_poll->qemu_poll_ns->ppoll (),
> None events were received and poll forever. I don’t know why the aio can’t be responsed. This case is hardly to
> be generated but it really happens sometimes . I’m looking forward to getting help from you .
> The stack capture snapshot:

Revision history for this message
halsey (halsey-pian) wrote :

> -----Original Message-----
> From: <email address hidden> [mailto:<email address hidden>]
> On Behalf Of Stefan Hajnoczi
> Sent: 2015年3月23日 21:32
> To: Bug 1434779
> Cc: <email address hidden>
> Subject: Re: [Qemu-devel] [Bug 1434779] [NEW] qemu hangs on live-migration with storage(virtual machine: windows Server 2008
> with only one disk )
>
> On Sat, Mar 21, 2015 at 07:44:37AM -0000, youyunyehe wrote:
> > We are using “Windows Server 2008” with qemu-kvm-2.0.1 (linux kernel:3.10.0) as a host of a VM.
>
> Please give the exact qemu-kvm package version and Linux distro name.
> The code in qemu.git/master is slightly different so maybe the issue no longer happens, but it would be worth looking at the exact
> source code your qemu-kvm is built from.

[Halsey] I also recreated this issue today, qemu hangs at bdrv_rw-co() -> bdrv_prwv_co() -> aio_poll(0 -> qemu_poll_ns() -> ppoll(). I'm just call bdrv_write to write data to qcow2 file. My qemu version is qemu-2.2.0.

> I have CCed Jeff Cody, who maintains drive-mirror.
>
> > Using drive_mirror to do live-migration on the same host for
> > different disks Local disk: /sf/data/local/
> > Shared disk(iscsi): /sf/other/local/ --- the disk is busy, the IO rate is about 30MB/s
> > qemu-system-x86_64 -k en-us -m 2048 -smp 2 -vnc :3100 -usbdevice
> > tablet -boot c -enable-kvm -drive
> > file=/sf/data/local/vm-disk-1.qcow2,if=none,id=drive-ide0,cache=none,a
> > io=native -device
> > ide-hd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=100
> >
> > Step 1:
> > Start migration: Drive_mirror -f drive-ide0
> > /sf/other/local/test.qcow2
> >
> > Step 2:
> > When detect the migration has completed, then send cmd:
> > block_job_complete -f drive-ide0
> >
> > Step 3:
> > send cmd: info status
> > What surprised me is that the qemu monitor reports can’t be connected.
> >
> > Then find bellows:
> > The qemu process hangs on the
> > mirror_run->bdrv_drain_all->aio_poll->qemu_poll_ns->ppoll (), None
> > events were received and poll forever. I don’t know why the aio can’t be responsed. This case is hardly to be generated but it really
> happens sometimes . I’m looking forward to getting help from you .
> > The stack capture snapshot:

Revision history for this message
Thomas Huth (th-huth) wrote :

Looking through old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays?

Changed in qemu:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for QEMU because there has been no activity for 60 days.]

Changed in qemu:
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.