On Thu, 04/21 08:34, Fam Zheng wrote:
> On Wed, 04/20 22:03, Max Reitz wrote:
> > On 20.04.2016 20:09, Max Reitz wrote:
> > > On 20.04.2016 02:03, Matthew Schumacher wrote:
> > >> Max,
> > >>
> > >> Qemu still crashes for me, but the debug is again very different. When
> > >> I attach to the qemu process from gdb, it is unable to provide a
> > >> backtrace when it crashes. The log file is different too. Any ideas?
> > >>
> > >> qemu-system-x86_64: block.c:2307: bdrv_replace_in_backing_chain:
> > >> Assertion `!bdrv_requests_pending(old)' failed.
> > >
> > > This message is exactly the same as you saw in 2.5.1, so I guess we've
> > > at least averted a regression in 2.6.0.
> >
> > I get the same message in 2.5.0, in 2.4.0 it's "Co-routine re-entered
> > recursively". 2.3.0 works fine.
> >
> > Bisecting the regression between 2.3.0 and 2.4.0 interestingly yields
> > 48ac0a4df84662f as the problematic commit, but I can't imagine that this
> > is the root issue. The effective change it brings is that for active
> > commits, the buf_size is no longer the same as the granularity, but the
> > default mirror buf_size instead.
> >
> > When forcing buf_size to the granularity, the issue first appears with
> > commit 3f09bfbc7bee812 (after 2.4.0, before 2.5.0), which is much less
> > surprising, because this is the one that introduced the assertion in the
> > first place.
> >
> > However, I still don't think the assertion is the problem but the fact
> > that the guest device can still send requests after bdrv_drained_begin().
>
> Thanks for debugging this.
>
> bdrv_drained_begin isn't effective because the guest notifier handler is not
> registered as "external":
>
> virtio_queue_set_host_notifier_fd_handler
> event_notifier_set_handler
> qemu_set_fd_handler
> aio_set_fd_handler(ctx, fd,
> is_external, /* false */
> ...)
>
>
> is_external SHOULD be true here.
>
This patch survives the reproducer I have on top of master (also submitted to
qemu-devel for 2.6):
---
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index f745c4a..002c2c6 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1829,10 +1829,11 @@ void virtio_queue_set_host_notifier_fd_handler(VirtQueue *vq, bool assign, bool set_handler)
{
if (assign && set_handler) {
- event_notifier_set_handler(&vq->host_notifier,
- virtio_queue_host_notifier_read);
+ aio_set_event_notifier(qemu_get_aio_context(), &vq->host_notifier,
+ true, virtio_queue_host_notifier_read);
} else {
- event_notifier_set_handler(&vq->host_notifier, NULL);
+ aio_set_event_notifier(qemu_get_aio_context(), &vq->host_notifier,
+ true, NULL);
}
if (!assign) {
/* Test and clear notifier before after disabling event,
On Thu, 04/21 08:34, Fam Zheng wrote: in_backing_ chain: requests_ pending( old)' failed. begin() . queue_set_ host_notifier_ fd_handler set_handler fd_handler( ctx, fd,
> On Wed, 04/20 22:03, Max Reitz wrote:
> > On 20.04.2016 20:09, Max Reitz wrote:
> > > On 20.04.2016 02:03, Matthew Schumacher wrote:
> > >> Max,
> > >>
> > >> Qemu still crashes for me, but the debug is again very different. When
> > >> I attach to the qemu process from gdb, it is unable to provide a
> > >> backtrace when it crashes. The log file is different too. Any ideas?
> > >>
> > >> qemu-system-x86_64: block.c:2307: bdrv_replace_
> > >> Assertion `!bdrv_
> > >
> > > This message is exactly the same as you saw in 2.5.1, so I guess we've
> > > at least averted a regression in 2.6.0.
> >
> > I get the same message in 2.5.0, in 2.4.0 it's "Co-routine re-entered
> > recursively". 2.3.0 works fine.
> >
> > Bisecting the regression between 2.3.0 and 2.4.0 interestingly yields
> > 48ac0a4df84662f as the problematic commit, but I can't imagine that this
> > is the root issue. The effective change it brings is that for active
> > commits, the buf_size is no longer the same as the granularity, but the
> > default mirror buf_size instead.
> >
> > When forcing buf_size to the granularity, the issue first appears with
> > commit 3f09bfbc7bee812 (after 2.4.0, before 2.5.0), which is much less
> > surprising, because this is the one that introduced the assertion in the
> > first place.
> >
> > However, I still don't think the assertion is the problem but the fact
> > that the guest device can still send requests after bdrv_drained_
>
> Thanks for debugging this.
>
> bdrv_drained_begin isn't effective because the guest notifier handler is not
> registered as "external":
>
> virtio_
> event_notifier_
> qemu_set_fd_handler
> aio_set_
> is_external, /* false */
> ...)
>
>
> is_external SHOULD be true here.
>
This patch survives the reproducer I have on top of master (also submitted to
qemu-devel for 2.6):
---
diff --git a/hw/virtio/ virtio. c b/hw/virtio/ virtio. c virtio. c virtio. c queue_set_ host_notifier_ fd_handler( VirtQueue *vq, bool assign,
bool set_handler) set_handler( &vq->host_ notifier, queue_host_ notifier_ read); event_notifier( qemu_get_ aio_context( ), &vq->host_notifier, queue_host_ notifier_ read); set_handler( &vq->host_ notifier, NULL); event_notifier( qemu_get_ aio_context( ), &vq->host_notifier,
index f745c4a..002c2c6 100644
--- a/hw/virtio/
+++ b/hw/virtio/
@@ -1829,10 +1829,11 @@ void virtio_
{
if (assign && set_handler) {
- event_notifier_
- virtio_
+ aio_set_
+ true, virtio_
} else {
- event_notifier_
+ aio_set_
+ true, NULL);
}
if (!assign) {
/* Test and clear notifier before after disabling event,