virtio-serial loses writes when used over virtio-mmio

Bug #1224444 reported by Richard Jones
This bug affects 1 person
Affects: QEMU
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

virtio-serial appears to lose writes, but only when used on top of virtio-mmio. The scenario is this:

/home/rjones/d/qemu/arm-softmmu/qemu-system-arm \
    -global virtio-blk-device.scsi=off \
    -nodefconfig \
    -nodefaults \
    -nographic \
    -M vexpress-a15 \
    -machine accel=kvm:tcg \
    -m 500 \
    -no-reboot \
    -kernel /home/rjones/d/libguestfs/tmp/.guestfs-1001/kernel.27944 \
    -dtb /home/rjones/d/libguestfs/tmp/.guestfs-1001/dtb.27944 \
    -initrd /home/rjones/d/libguestfs/tmp/.guestfs-1001/initrd.27944 \
    -device virtio-scsi-device,id=scsi \
    -drive file=/home/rjones/d/libguestfs/tmp/libguestfsLa9dE2/scratch.1,cache=unsafe,format=raw,id=hd0,if=none \
    -device scsi-hd,drive=hd0 \
    -drive file=/home/rjones/d/libguestfs/tmp/.guestfs-1001/root.27944,snapshot=on,id=appliance,cache=unsafe,if=none \
    -device scsi-hd,drive=appliance \
    -device virtio-serial-device \
    -serial stdio \
    -chardev socket,path=/home/rjones/d/libguestfs/tmp/libguestfsLa9dE2/guestfsd.sock,id=channel0 \
    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
    -append 'panic=1 mem=500M console=ttyAMA0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm-256color'

After the guest starts up, a daemon writes 4 bytes to a virtio-serial socket. The host side reads these 4 bytes correctly and writes a 64 byte message. The guest never sees this message.

I enabled virtio-mmio debugging, and this is what is printed (## = my comment):

## guest opens the socket:
trying to open virtio-serial channel '/dev/virtio-ports/org.libguestfs.channel.0'
virtio_mmio: virtio_mmio_write offset 0x50 value 0x3
opened the socket, sock = 3
udevadm settle
## guest writes 4 bytes to the socket:
virtio_mmio: virtio_mmio_write offset 0x50 value 0x5
virtio_mmio: virtio_mmio setting IRQ 1
virtio_mmio: virtio_mmio_read offset 0x60
virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
virtio_mmio: virtio_mmio setting IRQ 0
sent magic GUESTFS_LAUNCH_FLAG
## host reads 4 bytes successfully:
main_loop libguestfs: recv_from_daemon: received GUESTFS_LAUNCH_FLAG
libguestfs: [14605ms] appliance is up
Guest launched OK.
## host writes 64 bytes to socket:
libguestfs: writing the data to the socket (size = 64)
waiting for next request
libguestfs: data written OK
## hangs here forever with guest in read() call never receiving any data

I am using qemu from git today (2d1fe1873a984).

Some notes:

- It's not 100% reproducible. Sometimes everything works fine, although it fails "often" (eg > 2/3rds of the time).
- KVM is being used.
- We've long used virtio-serial on x86 and have never seen anything like this.

This is what the output looks like when it doesn't fail:

trying to open virtio-serial channel '/dev/virtio-ports/org.libguestfs.channel.0'
virtio_mmio: virtio_mmio_write offset 0x50 value 0x3
opened the socket, sock = 3
udevadm settle
virtio_mmio: virtio_mmio_write offset 0x50 value 0x5
virtio_mmio: virtio_mmio setting IRQ 1
virtio_mmio: virtio_mmio_read offset 0x60
virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
virtio_mmio: virtio_mmio setting IRQ 0
sent magic GUESTlibguestfs: recv_from_daemon: received GUESTFS_LAUNCH_FLAG
libguestfs: [14710ms] appliance is up
Guest launched OK.
libguestfs: writing the data to the socket (size = 64)
FS_LAUNCH_FLAG
main_loop waiting for next request
libguestfs: data written OK
virtio_mmio: virtio_mmio_write offset 0x50 value 0x2
virtio_mmio: virtio_mmio setting IRQ 1
virtio_mmio: virtio_mmio setting IRQ 1
virtio_mmio: virtio_mmio_write offset 0x50 value 0x2
virtio_mmio: virtio_mmio_read offset 0x60
virtio_mmio: virtio_mmio setting IRQ 1
virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
virtio_mmio: virtio_mmio setting IRQ 0
virtio_mmio: virtio_mmio_read offset 0x60
virtio_mmio: virtio_mmio_write offset 0x64 value 0x0
virtio_mmio: virtio_mmio setting IRQ 0
virtio_mmio: virtio_mmio_read offset 0x60
virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
[... more virtio-mmio lines omitted ...]
virtio_mmio: virtio_mmio_write offset 0x64 value 0x0
virtio_mmio: virtio_mmio setting IRQ 1
virtio_mmio: virtio_mmio_read offset 0x60
virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
virtio_mmio: virtio_mmio setting IRQ 0
guestfsd: main_loop: new request, len 0x3c
virtio_mmio: virtio_mmio_write offset 0x50 value 0x4
0000: 20 00 f5 f5 00 00 00 04 00 00 00 d2 00 00 00 00 | ...............|virtio_mmio: virtio_mmio_write offset 0x50 value 0x2
virtio_mmio: virtio_mmio setting IRQ 1

virtio_mmio: virtio_mmio_read offset 0x60
virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
virtio_mmio: virtio_mmio setting IRQ 0
0010: 00 12 34 00 00 00 00 00 00 00 00 00 00 00 00 00 |..4.............|
0020: 00 00 00 00 00 00 00 00 00 00 00 08 2f 64 65 76 |............/dev|
0030: 2f 73 64 61 00 00 00 03 6d 62 72 00 |/sda....mbr. |
virtio_mmio: virtio_mmio_write offset 0x50 value 0x2
virtio_mmio: virtio_mmio setting IRQ 1
virtio_mmio: virtio_mmio_read offset 0x60
virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
virtio_mmio: virtio_mmio setting IRQ 0
virtio_mmio: virtio_mmio_write offset 0x50 value 0x2
virtio_mmio: virtio_mmio setting IRQ 1
virtio_mmio: virtio_mmio_read offset 0x60
virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
virtio_mmio: virtio_mmio setting IRQ 0

Revision history for this message
Richard Jones (rjones-redhat) wrote :

strace -f of qemu when it fails.

Notes:

 - fd = 6 is the Unix domain socket connected to virtio-serial
 - only one 4 byte write occurs to this socket (expected guest -> host communication)
 - the socket isn't read at all (even though the library on the other side has written)
 - the socket is never added to any poll/ppoll syscall, so it's no wonder that qemu never sees any data on the socket
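
The last note can be illustrated outside qemu. The following is a minimal sketch, not qemu code: `poll_for_input()` and `demo_unwatched_socket()` are hypothetical helper names, and a `socketpair()` stands in for the real Unix domain socket. It shows that 64 bytes already queued on a socket go unnoticed by `poll()` as long as the descriptor is not part of the polled set.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask poll() about `fd`, either with the fd in the polled set (watch != 0)
 * or with an empty set (watch == 0). Returns poll()'s result. */
static int poll_for_input(int fd, int watch, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    assert(timeout_ms >= 0);
    return poll(&pfd, watch ? 1 : 0, timeout_ms);
}

/* Hypothetical demo: sv[0] stands in for the library side, sv[1] for the
 * descriptor qemu holds (fd 6 in the strace). Returns 1 if the unwatched
 * poll times out and the watched poll reports input, 0 otherwise, and -1
 * on setup failure. */
int demo_unwatched_socket(void)
{
    int sv[2];
    char msg[64];
    int unwatched, watched;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* The "library" writes a 64 byte message; the write itself succeeds. */
    memset(msg, 0, sizeof msg);
    if (write(sv[0], msg, sizeof msg) != (ssize_t)sizeof msg) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    unwatched = poll_for_input(sv[1], 0, 100); /* fd not in the set */
    watched = poll_for_input(sv[1], 1, 100);   /* fd in the set */

    close(sv[0]);
    close(sv[1]);
    return unwatched == 0 && watched == 1;
}
```

Calling `demo_unwatched_socket()` should return 1: the data is sitting in the socket buffer the whole time, and only the second `poll()` call, which actually lists the descriptor, reports it.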

Revision history for this message
Richard Jones (rjones-redhat) wrote :

Recall this bug only happens intermittently. This is an strace -f of qemu when it happens to work.

Notes:

 - fd = 6 is the Unix domain socket
 - the expected number of recvmsg and write calls occur, all with the correct sizes
 - this time qemu adds the socket to ppoll

Revision history for this message
Richard Jones (rjones-redhat) wrote :

I can reproduce this bug on a second ARM machine which doesn't have KVM (ie. using TCG). Note it's still linked to virtio-mmio.

Revision history for this message
Laszlo Ersek (Red Hat) (lersek) wrote : Re: [Qemu-devel] [Bug 1224444] [NEW] virtio-serial loses writes when used over virtio-mmio

On 09/12/13 14:04, Richard Jones wrote:

> + -chardev socket,path=/home/rjones/d/libguestfs/tmp/libguestfsLa9dE2/guestfsd.sock,id=channel0 \

Is this a socket that libguestfs pre-creates on the host-side?

> the socket is never added to any poll/ppoll syscall, so it's no
> wonder that qemu never sees any data on the socket

This should be happening:

qemu_chr_open_socket() [qemu-char.c]
  unix_connect_opts() [util/qemu-sockets.c]
    qemu_socket()
    connect()
  qemu_set_nonblock() [util/oslib-posix.c]
  qemu_chr_open_socket_fd()
    socket_set_nodelay() [util/osdep.c]
    io_channel_from_socket()
      g_io_channel_unix_new()
    tcp_chr_connect()
      io_add_watch_poll()
        g_source_new()
        g_source_attach()
        g_source_unref()
      qemu_chr_be_generic_open()

io_add_watch_poll() should make sure the fd is polled starting with the
next main loop iteration.

Interestingly, even in the "successful" case, there's a slew of ppoll()
calls between connect() returning 6, and the first ppoll() that actually
covers fd=6.

Laszlo

Revision history for this message
Richard Jones (rjones-redhat) wrote :

> Is this a socket that libguestfs pre-creates on the host-side?

Yes it is:
https://github.com/libguestfs/libguestfs/blob/master/src/launch-direct.c#L208

You mention a scenario that might cause this. But that appears to be when the socket is opened. Note that the guest did send 4 bytes successfully (received OK at the host). The lost write occurs when the host next tries to send a message back to the guest.

Revision history for this message
Laszlo Ersek (Red Hat) (lersek) wrote : Re: [Qemu-devel] [Bug 1224444] Re: virtio-serial loses writes when used over virtio-mmio

On 09/16/13 16:39, Richard Jones wrote:
>> Is this a socket that libguestfs pre-creates on the host-side?
>
> Yes it is:
> https://github.com/libguestfs/libguestfs/blob/master/src/launch-direct.c#L208
>
> You mention a scenario that might cause this. But that appears to be
> when the socket is opened. Note that the guest did send 4 bytes
> successfully (received OK at the host). The lost write occurs when the
> host next tries to send a message back to the guest.

Which is the first point at which a GLib event loop context that is
broken only for reading would be exposed.

In other words, if the action

  register fd 6 for reading in the GLib main loop context

fails, that wouldn't prevent qemu from *writing* to the UNIX domain socket.

In both traces, the IO-thread (thread-id 8488 in the successful case,
and thread-id 7586 in the failing case) is the one opening / registering
etc. fd 6. The IO-thread is also the one calling ppoll().

However, all write(6, ...) syscalls are issued by one of the VCPU
threads (thread-id 8490 in the successful case, and thread-id 7588 in
the failing case).

Hmmmm. Normally (as in, virtio-pci), when a VCPU thread (running KVM)
executes guest code that sends data to the host via virtio, KVM kicks
the "host notifier" eventfd.

Once this "host notifier" eventfd is kicked, the IO thread should do:

  virtio_queue_host_notifier_read()
    virtio_queue_notify_vq()
      vq->handle_output()
        handle_output() [hw/char/virtio-serial-bus.c]
          do_flush_queued_data()
            vsc->have_data()
              flush_buf() [hw/char/virtio-console.c]
                qemu_chr_fe_write()
                  ... goes to the unix domain socket ...

When virtio-mmio is used though, the same seems to happen in VCPU thread:

  virtio_mmio_write()
    virtio_queue_notify()
      virtio_queue_notify_vq()
        ...same as above...

A long shot:

(a) With virtio-pci:

(a1) guest writes to virtio-serial port,
(a2) KVM sets the host notifier eventfd "pending",
(a3) the IO thread sees that in the main loop / ppoll(), and copies the
data to the UNIX domain socket (the backend),
(a4) host-side libguestfs reads the data and responds,
(a5) the IO-thread reads the data from the UNIX domain socket,
(a6) the IO-thread pushes the data to the guest.

(b) with virtio-mmio:

(b1) guest writes to virtio-serial port,
(b2) the VCPU thread in qemu reads the data (virtio-mmio) and copies it
to the UNIX domain socket,
(b3) host-side libguestfs reads the data and responds,
(b4) the IO-thread is not (yet?) ready to read the data from the UNIX
domain socket.
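
Steps (a2) and (a3) can be sketched in isolation. This is a hedged illustration and not qemu's implementation: the names `notifier_kick()`, `io_loop_iteration()` and `demo_notifier_path()` are made up for this example, a Linux `eventfd()` plays the "host notifier", and a `socketpair()` plays the backend socket. The point is only that in path (a) the write to the backend happens inside the loop that also does the polling.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* "Kick" the notifier: mark it pending so that a poll() on it wakes up. */
static int notifier_kick(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof one) == (ssize_t)sizeof one ? 0 : -1;
}

/* One iteration of a toy I/O loop: wait for the notifier, then forward
 * the queued bytes to the backend socket. Returns 1 if data was forwarded,
 * 0 if the notifier was not pending before the timeout, -1 on error. */
static int io_loop_iteration(int efd, int backend_fd,
                             const void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = efd, .events = POLLIN, .revents = 0 };
    uint64_t count;
    int n;

    assert(buf != NULL);
    n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;
    /* Reading the eventfd clears its pending state. */
    if (read(efd, &count, sizeof count) != (ssize_t)sizeof count)
        return -1;
    return write(backend_fd, buf, len) == (ssize_t)len ? 1 : -1;
}

/* Returns 1 if nothing is forwarded before the kick and the 4 byte
 * payload arrives at the peer after it; 0 otherwise, -1 on setup failure. */
int demo_notifier_path(void)
{
    static const char payload[4] = { 'a', 'b', 'c', 'd' }; /* placeholder */
    char got[4];
    int sv[2], efd, idle, forwarded = 0, ok = 0;

    efd = eventfd(0, 0);
    if (efd < 0)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        close(efd);
        return -1;
    }

    idle = io_loop_iteration(efd, sv[1], payload, sizeof payload, 50);
    if (notifier_kick(efd) == 0) {
        forwarded = io_loop_iteration(efd, sv[1], payload, sizeof payload, 50);
        ok = idle == 0 && forwarded == 1 &&
             read(sv[0], got, sizeof got) == (ssize_t)sizeof got &&
             memcmp(got, payload, sizeof got) == 0;
    }

    close(efd);
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

In path (b) there is no such notifier: the equivalent of the final `write()` would be issued directly by the caller (the VCPU thread), so the polling loop is not woken as a side effect.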

I can't quite pin it down, but I think that in the virtio-pci case, the
fact that everything runs through the IO-thread automatically serializes
the connection to the UNIX domain socket (and its addition to the GLib
main loop context) with the message from the guest. Due to the KVM
eventfd (the "host notifier") everything goes through the same ppoll().
Maybe it doesn't enforce any theoretical serialization, it might just
add a sufficiently long delay that there's never a problem in practice.

Whereas in the virtio-mmio case, the initial write to the UNIX domain
socket, and the response from...


Revision history for this message
Peter Maydell (pmaydell) wrote : Re: [Qemu-devel] [Bug 1224444] Re: virtio-serial loses writes when used over virtio-mmio

On 16 September 2013 17:13, Laszlo Ersek <email address hidden> wrote:
> Hmmmm. Normally (as in, virtio-pci), when a VCPU thread (running KVM)
> executes guest code that sends data to the host via virtio, KVM kicks
> the "host notifier" eventfd.

What happens in the virtio-pci without eventfd case?
(eg virtio-pci on a non-x86 host)

Also, IIRC Alex said they'd had an annoying "data gets lost"
issue with the s390 virtio transports too...

-- PMM

Revision history for this message
Richard Jones (rjones-redhat) wrote :

> What happens if you add a five second delay to libguestfs,
> before writing the response?

No change. Still hangs in the same place.

Revision history for this message
Laszlo Ersek (Red Hat) (lersek) wrote :

On 09/17/13 10:09, Peter Maydell wrote:
> On 16 September 2013 17:13, Laszlo Ersek <email address hidden> wrote:
>> Hmmmm. Normally (as in, virtio-pci), when a VCPU thread (running KVM)
>> executes guest code that sends data to the host via virtio, KVM kicks
>> the "host notifier" eventfd.
>
> What happens in the virtio-pci without eventfd case?
> (eg virtio-pci on a non-x86 host)

I'm confused. I think Anthony or Michael could answer better.

There's at least three cases here I guess (KVM + eventfd, KVM without
eventfd (enforceable eg. with the "ioeventfd" property for virtio
devices), and TCG). We're probably talking about the third case.

I think we end up in

  virtio_pci_config_ops.write == virtio_pci_config_write
    virtio_ioport_write()
      virtio_queue_notify()
        ... the "usual" stuff ...

As far as I know TCG supports exactly one VCPU thread but that's still
separate from the IO-thread. In that case the above could trigger the
problem similarly to virtio-mmio I guess...

I think we should debug into GLib, independently of virtio. What annoys
me mostly is the huge number of ppoll()s in Rich's trace between
connecting to the UNIX domain socket and actually checking it for
read-readiness. The fd in question should show up in the first ppoll()
after connect().

My email might not make any sense. Sorry.
Laszlo

Revision history for this message
Richard Jones (rjones-redhat) wrote :

> There's at least three cases here I guess (KVM + eventfd, KVM without
> eventfd (enforceable eg. with the "ioeventfd" property for virtio
> devices), and TCG). We're probably talking about the third case.

To clarify on this point: I have reproduced this bug on two different ARM
machines, one using KVM and one using TCG.

In both cases they are ./configure'd without any special ioeventfd-related
options, which appears to mean CONFIG_EVENTFD=y (in both cases).

In both cases I'm using a single vCPU.

Revision history for this message
Laszlo Ersek (Red Hat) (lersek) wrote : Re: [Qemu-devel] [Bug 1224444] Re: virtio-serial loses writes when used over virtio-mmio

On 09/17/13 11:51, Richard Jones wrote:
>> There's at least three cases here I guess (KVM + eventfd, KVM without
>> eventfd (enforceable eg. with the "ioeventfd" property for virtio
>> devices), and TCG). We're probably talking about the third case.
>
> To clarify on this point: I have reproduced this bug on two different ARM
> machines, one using KVM and one using TCG.
>
> In both cases they are ./configure'd without any special ioeventfd-related
> options, which appears to mean CONFIG_EVENTFD=y (in both cases).
>
> In both cases I'm using a single vCPU.
>

I think I have a theory now; it's quite convoluted.

The problem is a deadlock in ppoll() that is *masked* by unrelated file
descriptor traffic in all of the apparently working cases.

I wrote some ad-hoc debug patches, and this is the log leading up to the
hang:

  io_watch_poll_prepare: chardev:channel0 was_active:0 now_active:0
  qemu_poll_ns: timeout=4281013151888
  poll entry #0 fd 3
  poll entry #1 fd 5
  poll entry #2 fd 0
  poll entry #3 fd 11
  poll entry #4 fd 4
  trying to open virtio-serial channel '/dev/virtio-ports/org.libguestfs.channel.0'
  opened the socket, sock = 3
  udevadm settle
  libguestfs: recv_from_daemon: received GUESTFS_LAUNCH_FLAG
  libguestfs: [21734ms] appliance is up
  Guest launched OK.
  libguestfs: writing the data to the socket (size = 64)
  sent magic GUESTFS_LAUNCH_FLAG
  main_loop waiting for next request
  libguestfs: data written OK
  <HANG>

Setup call tree for the backend (ie. the UNIX domain socket):

   1 qemu_chr_open_socket() [qemu-char.c]
   2   unix_connect_opts() [util/qemu-sockets.c]
   3     qemu_socket()
   4     connect()
   5   qemu_chr_open_socket_fd() [qemu-char.c]
   6     io_channel_from_socket()
   7       g_io_channel_unix_new()
   8     tcp_chr_connect()
   9       io_add_watch_poll()
  10         g_source_new()
  11         g_source_attach()

This part connects to libguestfs's UNIX domain socket (the new socket
file descriptor, returned on line 3, is fd 6), and it registers a few
callbacks. Notably, the above doesn't try to add fd 6 to the set of
polled file descriptors.

Then, the setup call tree for the frontend (the virtio-serial port) is
as follows:

  12 virtconsole_initfn() [hw/char/virtio-console.c]
  13   qemu_chr_add_handlers() [qemu-char.c]

This reaches into the chardev (ie. the backend referenced by the
frontend, label "channel0"), and sets further callbacks.

The following seems to lead up to the hang:

  14 os_host_main_loop_wait() [main-loop.c]
  15   glib_pollfds_fill()
  16     g_main_context_prepare()
  17       io_watch_poll_prepare() [qemu-char.c]
  18         chr_can_read() [hw/char/virtio-console.c]
  19           virtio_serial_guest_ready() [hw/char/virtio-serial-bus.c]
  20
  21             if (use_multiport(port->vser) && !port->guest_connected) {
  22                 return 0;
  23             }
  24
  25             virtqueue_get_avail_bytes()
  26         g_io_create_watch() // conditionally
  27   qemu_poll_ns() [qemu-timer.c]
  28     ppoll()

Line 15: glib_pollfds_fill() prepares the array of file descriptors for
polling. As first step,

Line 16: it calls g_main_context_prepar...
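
The analysis above is cut off, so the following is only one possible reading of it, sketched without GLib or qemu. It assumes that the backend fd is added to the poll set only when a "can read" callback says the guest is ready (compare `was_active:0 now_active:0` in the debug log), and that this decision is taken once per loop iteration, before the poll call. All names (`gated_watch`, `gated_prepare()`, `loop_iteration()`, `demo_gated_watch()`) are hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Toy stand-in for a gated watch: the backend fd is polled only while
 * the frontend reports that it can accept data. */
struct gated_watch {
    int fd;
    int (*can_read)(void *opaque);
    void *opaque;
};

/* Hypothetical readiness callback: reads a flag owned by the caller. */
static int guest_ready_cb(void *opaque)
{
    return *(int *)opaque;
}

/* Prepare step: decide, once per iteration, whether the fd is polled.
 * Returns the number of pollfd entries filled in (0 or 1). */
static nfds_t gated_prepare(const struct gated_watch *w, struct pollfd *pfd)
{
    if (w->can_read(w->opaque) <= 0)
        return 0;
    pfd->fd = w->fd;
    pfd->events = POLLIN;
    pfd->revents = 0;
    return 1;
}

/* One loop iteration: prepare, then poll whatever prepare selected. */
static int loop_iteration(const struct gated_watch *w, int timeout_ms)
{
    struct pollfd pfd = { .fd = -1, .events = 0, .revents = 0 };
    nfds_t n = gated_prepare(w, &pfd);

    assert(timeout_ms >= 0);
    return poll(&pfd, n, timeout_ms);
}

/* Returns 1 if the first iteration misses the queued data and the second
 * one sees it; 0 otherwise, -1 on setup failure. */
int demo_gated_watch(void)
{
    int sv[2], guest_ready = 0, first, second;
    char msg[64];
    struct gated_watch w;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    w.fd = sv[1];
    w.can_read = guest_ready_cb;
    w.opaque = &guest_ready;

    memset(msg, 0, sizeof msg);
    if (write(sv[0], msg, sizeof msg) != (ssize_t)sizeof msg) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    first = loop_iteration(&w, 50);  /* guest not ready: fd not polled */
    guest_ready = 1;                 /* readiness changes, nothing wakes poll */
    second = loop_iteration(&w, 50); /* the next prepare step picks the fd up */

    close(sv[0]);
    close(sv[1]);
    return first == 0 && second == 1;
}
```

With a 50 ms timeout the second iteration recovers quickly. With a very large timeout, such as the `timeout=4281013151888` printed by `qemu_poll_ns` in the log, the first `poll()` would return only when some other descriptor became ready, which would be consistent with unrelated file descriptor traffic masking the problem.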

Revision history for this message
Richard Jones (rjones-redhat) wrote :

FWIW I am able to reproduce this quite easily on aarch64 too.

My test program is:
https://github.com/libguestfs/libguestfs/blob/master/tests/qemu/qemu-speed-test.c

and you use it like this:
qemu-speed-test --virtio-serial-upload

(You can also test virtio-serial downloads and a few other things, but those don't appear to deadlock)

Slowing down the upload, even just by enabling debugging, is sufficient to make the problem go away most of the time.

I am testing with qemu from git (f45c56e0166e86d3b309ae72f4cb8e3d0949c7ef).

Revision history for this message
Richard Jones (rjones-redhat) wrote :

I don't know how to close bugs in launchpad, but this one can be closed
for a couple of reasons:

(1) I benchmarked virtio-mmio the other day using qemu-speed-test on aarch64
and I did not encounter the bug.

(2) aarch64 has supported virtio-pci for a while, so virtio-mmio is effectively
obsolete.

Revision history for this message
Richard Jones (rjones-redhat) wrote :

Fixed upstream, see previous comment.

Changed in qemu:
status: New → Fix Released