Comment 5 for bug 1894804

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi,
interesting bug report - thanks lee, Kashyap and Sean - as well as Danil for taking a look already.

If this would always fail no unplugs would work ever which I knew can't be right as I test that. So we need to find what is different...

@Openstack people - is that reliably triggering at the very first hot-detach for you or more like "happening in one of a million detaches triggered by the CI"?

First I simplified the case from "massive openstack CI" to just a few commands and small XMLs...
So I tried to cut ceph out of the equation here and compared hot remove between Bionic and Focal qemu/libvirt when removing a local image file.

$ qemu-img create -f qcow2 /var/lib/uvtool/libvirt/images/testdisk.qcow 20m
Disk:
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/uvtool/libvirt/images/testdisk.qcow'/>
      <target dev='vdc' bus='virtio'/>
    </disk>

The systems have just this guest each and the log debug config is like:
 /etc/libvirt/libvirtd.conf:
 > log_filters="3:qemu 3:libvirt 4:object 2:json 4:event 3:util"
 > log_outputs="2:file:/var/log/libvirtd.log"
 $ rm /var/log/libvirtd.log
 $ systemctl restart libvirtd

Detach via:
 $ virsh detach-disk f-on-b vdc --live

I compared the monitor streams in those two cases but they both deliver DEVICE_DELETED for the
/machine/peripheral/virtio-disk2/virtio-backend as well as /machine/peripheral/virtio-disk2.

@Jamespage could you get me a system with at least one auxiliary ceph disk that I can attach/detach to check if this might be in any way ceph specific?

Details in case we need them for comparison later on ...:

## Bionic ##

libvirt checks devices before removal:

2020-09-21 11:00:19.287+0000: 5755: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7fdf94001430 msg={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-12"}
2020-09-21 11:00:19.288+0000: 5686: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7fdf94001430 reply={"return": [{"name": "virtio-disk2", "type": "child<virtio-blk-pci>"}, {"name": "virtio-disk1", "type": "child<virtio-blk-pci>"}, {"name": "virtio-disk0", "type": "child<virtio-blk-pci>"}, {"name": "video0", "type": "child<cirrus-vga>"}, {"name": "serial0", "type": "child<isa-serial>"}, {"name": "balloon0", "type": "child<virtio-balloon-pci>"}, {"name": "net0", "type": "child<virtio-net-pci>"}, {"name": "usb", "type": "child<piix3-usb-uhci>"}, {"name": "type", "type": "string"}], "id": "libvirt-12"}

Then send removal command

2020-09-21 11:00:48.672+0000: 5691: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7fdf94001430 msg={"execute":"device_del","arguments":{"id":"virtio-disk2"},"id":"libvirt-13"}
2020-09-21 11:00:48.673+0000: 5686: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7fdf94001430 reply={"return": {}, "id": "libvirt-13"}
020-09-21 11:00:48.721+0000: 5686: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7fdf94001430 event={"timestamp": {"seconds": 1600686048, "microseconds": 720908}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/virtio-disk2/virtio-backend"}}
2020-09-21 11:00:48.723+0000: 5686: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7fdf94001430 event={"timestamp": {"seconds": 1600686048, "microseconds": 723091}, "event": "DEVICE_DELETED", "data": {"device": "virtio-disk2", "path": "/machine/peripheral/virtio-disk2"}}

^^ both DEVICE_DELETED occur in response to the "device_del".

vv Now also remove the device, but that already happened implicitly

2020-09-21 11:00:48.723+0000: 5691: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7fdf94001430 msg={"execute":"human-monitor-command","arguments":{"command-line":"drive_del drive-virtio-disk2"},"id":"libvirt-14"}
2020-09-21 11:00:48.724+0000: 5686: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7fdf94001430 reply={"return": "Device 'drive-virtio-disk2' not found\r\n", "id": "libvirt-14"}

In the follow up query virtio-disk2 is removed

2020-09-21 11:00:48.875+0000: 5691: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7fdf94001430 msg={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-15"}
2020-09-21 11:00:48.876+0000: 5686: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7fdf94001430 reply={"return": [{"name": "virtio-disk1", "type": "child<virtio-blk-pci>"}, {"name": "virtio-disk0", "type": "child<virtio-blk-pci>"}, {"name": "video0", "type": "child<cirrus-vga>"}, {"name": "serial0", "type": "child<isa-serial>"}, {"name": "balloon0", "type": "child<virtio-balloon-pci>"}, {"name": "net0", "type": "child<virtio-net-pci>"}, {"name": "usb", "type": "child<piix3-usb-uhci>"}, {"name": "type", "type": "string"}], "id": "libvirt-15"}

## Focal ##

libvirt checks devices before removal:

2020-09-21 11:00:20.856+0000: 6395: info : qemuMonitorSend:993 : QEMU_MONITOR_SEND_MSG: mon=0x7f463800b3e0 msg={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-12"}
2020-09-21 11:00:20.857+0000: 6319: info : qemuMonitorJSONIOProcessLine:239 : QEMU_MONITOR_RECV_REPLY: mon=0x7f463800b3e0 reply={"return": [{"name": "type", "type": "string"}, {"name": "pci.7", "type": "child<pcie-root-port>"}, {"name": "pci.4", "type": "child<pcie-root-port>"}, {"name": "serial0", "type": "child<isa-serial>"}, {"name": "pci.1", "type": "child<pcie-root-port>"}, {"name": "virtio-disk1", "type": "child<virtio-blk-pci>"}, {"name": "video0", "type": "child<qxl-vga>"}, {"name": "balloon0", "type": "child<virtio-balloon-pci>"}, {"name": "pci.6", "type": "child<pcie-root-port>"}, {"name": "usb", "type": "child<qemu-xhci>"}, {"name": "pci.3", "type": "child<pcie-root-port>"}, {"name": "virtio-disk0", "type": "child<virtio-blk-pci>"}, {"name": "pci.5", "type": "child<pcie-root-port>"}, {"name": "channel0", "type": "child<virtserialport>"}, {"name": "pci.2", "type": "child<pcie-root-port>"}, {"name": "virtio-disk2", "type": "child<virtio-blk-pci>"}, {"name": "net0", "type": "child<virtio-net-pci>"}, {"name": "virtio-serial0", "type": "child<virtio-serial-pci>"}], "id": "libvirt-12"}
2020-09-21 11:00:46.509+0000: 6319: info : qemuMonitorJSONIOProcessLine:234 : QEMU_MONITOR_RECV_EVENT: mon=0x7f463800b3e0 event={"timestamp": {"seconds": 1600686046, "microseconds": 508981}, "event": "RTC_CHANGE", "data": {"offset": 1}}

# send device_del

2020-09-21 11:00:47.928+0000: 6323: info : qemuMonitorSend:993 : QEMU_MONITOR_SEND_MSG: mon=0x7f463800b3e0 msg={"execute":"device_del","arguments":{"id":"virtio-disk2"},"id":"libvirt-13"}
# list
2020-09-21 11:00:47.929+0000: 6319: info : qemuMonitorJSONIOProcessLine:239 : QEMU_MONITOR_RECV_REPLY: mon=0x7f463800b3e0 reply={"return": {}, "id": "libvirt-13"}

# check devices again right after (disk is still there)

2020-09-21 11:00:52.929+0000: 6323: info : qemuMonitorSend:993 : QEMU_MONITOR_SEND_MSG: mon=0x7f463800b3e0 msg={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-14"}
2020-09-21 11:00:52.930+0000: 6319: info : qemuMonitorJSONIOProcessLine:239 : QEMU_MONITOR_RECV_REPLY: mon=0x7f463800b3e0 reply={"return": [{"name": "type", "type": "string"}, {"name": "pci.7", "type": "child<pcie-root-port>"}, {"name": "pci.4", "type": "child<pcie-root-port>"}, {"name": "serial0", "type": "child<isa-serial>"}, {"name": "pci.1", "type": "child<pcie-root-port>"}, {"name": "virtio-disk1", "type": "child<virtio-blk-pci>"}, {"name": "video0", "type": "child<qxl-vga>"}, {"name": "balloon0", "type": "child<virtio-balloon-pci>"}, {"name": "pci.6", "type": "child<pcie-root-port>"}, {"name": "usb", "type": "child<qemu-xhci>"}, {"name": "pci.3", "type": "child<pcie-root-port>"}, {"name": "virtio-disk0", "type": "child<virtio-blk-pci>"}, {"name": "pci.5", "type": "child<pcie-root-port>"}, {"name": "channel0", "type": "child<virtserialport>"}, {"name": "pci.2", "type": "child<pcie-root-port>"}, {"name": "virtio-disk2", "type": "child<virtio-blk-pci>"}, {"name": "net0", "type": "child<virtio-net-pci>"}, {"name": "virtio-serial0", "type": "child<virtio-serial-pci>"}], "id": "libvirt-14"}

A bit later both DEVICE_DELETED show up:

2020-09-21 11:00:54.118+0000: 6319: info : qemuMonitorJSONIOProcessLine:234 : QEMU_MONITOR_RECV_EVENT: mon=0x7f463800b3e0 event={"timestamp": {"seconds": 1600686054, "microseconds": 117896}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/virtio-disk2/virtio-backend"}}
2020-09-21 11:00:54.120+0000: 6319: info : qemuMonitorJSONIOProcessLine:234 : QEMU_MONITOR_RECV_EVENT: mon=0x7f463800b3e0 event={"timestamp": {"seconds": 1600686054, "microseconds": 120016}, "event": "DEVICE_DELETED", "data": {"device": "virtio-disk2", "path": "/machine/peripheral/virtio-disk2"}}

Some follow on cleanup (what was the HMP command on bionic)

2020-09-21 11:00:54.122+0000: 6409: info : qemuMonitorSend:993 : QEMU_MONITOR_SEND_MSG: mon=0x7f463800b3e0 msg={"execute":"blockdev-del","arguments":{"node-name":"libvirt-1-format"},"id":"libvirt-15"}
2020-09-21 11:00:54.126+0000: 6319: info : qemuMonitorJSONIOProcessLine:239 : QEMU_MONITOR_RECV_REPLY: mon=0x7f463800b3e0 reply={"return": {}, "id": "libvirt-15"}
2020-09-21 11:00:54.126+0000: 6409: info : qemuMonitorSend:993 : QEMU_MONITOR_SEND_MSG: mon=0x7f463800b3e0 msg={"execute":"blockdev-del","arguments":{"node-name":"libvirt-1-storage"},"id":"libvirt-16"}
2020-09-21 11:00:54.129+0000: 6319: info : qemuMonitorJSONIOProcessLine:239 : QEMU_MONITOR_RECV_REPLY: mon=0x7f463800b3e0 reply={"return": {}, "id": "libvirt-16"}