ISST-LTE:KVM:Ubuntu1804:BostonLC:boslcp3g5: libvirt fails to check for duplicate address in hotplug xml and causes the guest to go to shutoff state

Bug #1840872 reported by bugproxy on 2019-08-21
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
The Ubuntu-power-systems project
High
Canonical Server Team
libvirt (Ubuntu)
Undecided
Ubuntu on IBM Power Systems Bug Triage
Bionic
Undecided
Unassigned
Disco
Undecided
Unassigned

Bug Description

[Impact]

 * Attaching a device to an already consumed address never was and will
   work. But the current error handling might affect the device currently
   at that address.
   Think: "Attach A at 1, Attach B at 1 => Fail clean up slot 1"

 * Upstream has a fix for this which better detects the bad case which
   helps to avoid this becoming an issue.

[Test Case]

Create two images to test with:
qemu-img create -f qcow2 /var/lib/libvirt/images/test1.img 1G
qemu-img create -f qcow2 /var/lib/libvirt/images/test2.img 1G

Create a guest:
$ uvt-simplestreams-libvirt --verbose sync --source http://cloud-images.ubuntu.com/daily arch=amd64 label=daily release=bionic
$ uvt-kvm create --password ubuntu tests-scsi-hotplug arch=amd64 release=bionic label=daily

Then edit and restart the guest to use a scsi disk, add:
     <disk type='file' device='disk'>
       <driver name='qemu' type='qcow2'/>
       <source file='/var/lib/libvirt/images/test1.img'/>
       <target dev='sda' bus='scsi'/>
       <address type='drive' controller='0' bus='0' target='0' unit='0'/>
     </disk>
     <controller type='scsi' index='0' model='virtio-scsi'>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
     </controller>

Finally create a hotplug XML file with the second disk referring to the same address
cat > hpdisk.xml << EOF
     <disk type='file' device='disk'>
       <driver name='qemu' type='qcow2'/>
       <source file='/var/lib/libvirt/images/test2.img'/>
       <target dev='sdb' bus='scsi'/>
       <address type='drive' controller='0' bus='0' target='0' unit='0'/>
     </disk>
EOF

Then try to attach this device to the guest
$ virsh attach-device tests-scsi-hotplug hpdisk.xml

Without the fix you will get (which has further negative implications, but is enough for the test):
root@b:~# virsh attach-device tests-scsi-hotplug hpdisk.xml
error: Failed to attach device from hpdisk.xml
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'scsi0-0-0-0' for device

With the fix you will get:
root@b:~# virsh attach-device tests-scsi-hotplug hpdisk.xml
error: Failed to attach device from hpdisk.xml
error: Requested operation is not valid: Domain already contains a disk with that address

[Regression Potential]

 * Attaching a device to an already consumed address already failed and
   will continue to fail. So no use case will change significantly due to
   the intended change.
   The regression one could think of is that the duplicate check would
   match for an allowed device attachment and block it, but from my review
   I think that will not happen.

[Other Info]

 * n/a

---

== Comment: #0 - INDIRA P. JOGA - 2018-05-24 09:32:53 ==
Problem Description:
===================
libvirt fails to check for duplicate address in hotplug xml & error out before even trying to hotplug anything.

Steps to re-create:

1. boslcp4 is up with BMC:1.20 & PNOR: 20180420 levels
2. Guest is up with kernel
root@boslcp3g5:~# uname -a
Linux boslcp3g5 4.4.0-122-generic #146-Ubuntu SMP Mon Apr 23 15:33:25 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
root@boslcp3g5:~# uname -r
4.4.0-122-generic

3. boslcp3g5 guest is running with LTP run (45 hours) & stress-ng aio class ( 15 minutes).

4. Hotplug xml

root@boslcp3:/home# cat hp-disk.xml
<disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/home/boslcp3g5_hpblk1'/>
      <target dev='sdd' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>

5.Tried hotplug with duplicate address

root@boslcp3:/home# virsh attach-device boslcp3g5 hp-disk.xml --live

error: Failed to attach device from hp-disk.xml
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'scsi0-0-0-2' for device

root@boslcp3:/home#
root@boslcp3:/home# virsh list --all
 Id Name State
----------------------------------------------------
 3 boslcp3g2 running
 4 boslcp3g3 running
 7 boslcp3g4 running
 - boslcp3g1 shut off
 - boslcp3g5 shut off

6. Libvirt fails to check the duplicate address in hotplug xml & error before even trying to hotplug anything

== Comment: #5 - Daniel Henrique Barboza - 2019-03-26 15:43:55 ==
The fix for this bug was pushed upstream:

commit f1d6585300001c7b23b8796a0faa4411c3531996
Author: Daniel Henrique Barboza <email address hidden>
Date: Fri Mar 15 18:06:45 2019 -0300

    domain_conf: check device address before attach

After this patch, Libvirt is checking for duplicated address before trying to do hotplug operations.

bugproxy (bugproxy) on 2019-08-21
tags: added: architecture-ppc64le bugnameltc-168186 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → libvirt (Ubuntu)
Changed in ubuntu-power-systems:
importance: Undecided → High
assignee: nobody → Canonical Server Team (canonical-server)

Thanks for the bug,
the commit is in libvirt 5.2 and therefore in Eoan already.

I'm working on a few other libvirt changes that I'd like to group into one SRU.
So give me a few days to check if the others are ready in time (uploaded together) or not (then this will most likely be uploaded soon).

Changed in libvirt (Ubuntu Bionic):
status: New → Triaged
Changed in libvirt (Ubuntu Disco):
status: New → Triaged
Changed in libvirt (Ubuntu):
status: New → Fix Released
Changed in ubuntu-power-systems:
status: New → Triaged

This feels a bit like a trap due to:
commit 22dc3e94c24b4d9a6c28beda91b9b283eb9b0251
Author: Michal Privoznik <email address hidden>
Date: Thu Apr 11 15:40:51 2019 +0200

    Revert "domain_conf: check device address before attach"

    This reverts commit f1d6585300001c7b23b8796a0faa4411c3531996.

    Turns out, this caused a regression. There is this (perhaps less
    known) semantic of virDomainAttachDevice() where if the device
    the API is trying to attach is a CDROM/floppy that is already in
    the domain the attach request is handled as 'change the media in
    the drive'.

    We have a better fix anyways.

    Signed-off-by: Michal Privoznik <email address hidden>
    Tested-by: Daniel Henrique Barboza <email address hidden>
    Reviewed-by: Jim Fehlig <email address hidden>

Instead one should use:
commit 89237d534f0fe950d06a2081089154160c6c2224
Author: Michal Privoznik <email address hidden>
Date: Thu Apr 11 15:44:14 2019 +0200

    conf: Expose virDomainSCSIDriveAddressIsUsed

    This function checks if given drive address is already present in
    passed domain definition. Expose the function as it will be used
    shortly.

    Signed-off-by: Michal Privoznik <email address hidden>
    Tested-by: Daniel Henrique Barboza <email address hidden>
    Reviewed-by: Jim Fehlig <email address hidden>

commit ddc72f99027b063feaf34e5fda89916b6b2c8943
Author: Michal Privoznik <email address hidden>
Date: Thu Apr 11 15:45:27 2019 +0200

    qemu_hotplug: Check for duplicate drive addresses

    This tries to fix the same problem as f1d65853000 but it's doing
    so in a less invasive way.

------- Comment From <email address hidden> 2019-08-21 06:35 EDT-------
Christian,

You're right. My original fix was pushed upstream, then a couple of weeks later problems were found during tests using libvirt-tck.

In the end we decided to revert my patch for a more subtle fix from Michal in the 2 patches you mentioned:

conf: Expose virDomainSCSIDriveAddressIsUsed
qemu_hotplug: Check for duplicate drive addresses

I forgot to update the bug after the revert was done. Sorry for the confusion.

Daniel

description: updated

Tested on x86 and confirmed - added my tests steps to the SRU template which needed those anyway.

No offense, but for future bugs:
  "boslcp3g5 guest is running with LTP run (45 hours) & stress-ng aio class ( 15 minutes)."
is:
a) neither a good step by step list how to reproduce
b) nor needed at all for this bug
Testcases are better the simpler they are :-)

NP Daniel, that is exactly the set of changes that I have prepared here then.

Thanks for confirming thou!

Changes reviewed and tested in the regression test set over night (combined with the other changes I intend to group on the SRU). Uploaded to -unapproved for the SRU team to consider.

Changed in libvirt (Ubuntu Disco):
status: Triaged → In Progress
Changed in libvirt (Ubuntu Bionic):
status: Triaged → In Progress
Changed in ubuntu-power-systems:
status: Triaged → In Progress

Hello bugproxy, or anyone else affected,

Accepted libvirt into disco-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/libvirt/5.0.0-1ubuntu2.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-disco to verification-done-disco. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-disco. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in libvirt (Ubuntu Disco):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-disco

All autopkgtests for the newly accepted libvirt (5.0.0-1ubuntu2.5) for disco have finished running.
The following regressions have been reported in tests triggered by the package:

nova/2:19.0.1-0ubuntu2.1 (armhf)
virt-top/unknown (armhf)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/disco/update_excuses.html#libvirt

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

FYI - the reported test issues are resolved by now.

Download full text (11.0 KiB)

Did the test and the upgrade to proposed confirming the expected result.

root@d:~# uvt-kvm create --password ubuntu tests-scsi-hotplug arch=amd64 release=bionic label=daily
Warning: using --password from the command line is not secure and should be used for debugging only.
root@d:~# virsh shutdown tests-scsi-hotplug
Domain tests-scsi-hotplug is being shutdown
root@d:~# virsh list
 Id Name State
--------------------

root@d:~# virsh edit tests-scsi-hotplug

Select an editor. To change later, run 'select-editor'.
  1. /bin/nano <---- easiest
  2. /usr/bin/vim.basic
  3. /usr/bin/vim.tiny
  4. /bin/ed

Choose 1-4 [1]: 2
Domain tests-scsi-hotplug XML configuration edited.

root@d:~# cat > hpdisk.xml << EOF
> <disk type='file' device='disk'>
> <driver name='qemu' type='qcow2'/>
> <source file='/var/lib/libvirt/images/test2.img'/>
> <target dev='sdb' bus='scsi'/>
> <address type='drive' controller='0' bus='0' target='0' unit='0'/>
> </disk>
> EOF
root@d:~# virsh start tests-scsi-hotplug
Domain tests-scsi-hotplug started

root@d:~# virsh list
 Id Name State
------------------------------------
 3 tests-scsi-hotplug running

root@d:~# virsh attach-device tests-scsi-hotplug hpdisk.xml
error: Failed to attach device from hpdisk.xml
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'scsi0-0-0-0' for device

root@d:~# vim /etc/apt/sources.list
root@d:~# apt update
Hit:1 http://archive.ubuntu.com/ubuntu disco InRelease
Get:2 http://security.ubuntu.com/ubuntu disco-security InRelease [97.5 kB]
Get:3 http://archive.ubuntu.com/ubuntu disco-updates InRelease [97.5 kB]
Ign:4 http://ddebs.ubuntu.com disco InRelease
Hit:5 http://ppa.launchpad.net/lucaskanashiro/disco-openldap-slapd-segfault-1838370/ubuntu disco InRelease
Ign:6 http://ddebs.ubuntu.com disco-updates InRelease
Ign:7 http://ddebs.ubuntu.com disco-proposed InRelease
Get:8 http://archive.ubuntu.com/ubuntu disco-backports InRelease [88.8 kB]
Hit:9 http://ddebs.ubuntu.com disco Release
Get:10 http://ddebs.ubuntu.com disco-updates Release [34.7 kB]
Get:11 http://archive.ubuntu.com/ubuntu disco-proposed InRelease [255 kB]
Hit:12 http://ppa.launchpad.net/paelzer/bug-1840956-binfmt-containers/ubuntu disco InRelease
Get:13 http://ddebs.ubuntu.com disco-proposed Release [34.7 kB]
Hit:14 http://ppa.launchpad.net/sbeattie/lp1842701/ubuntu disco InRelease
Get:15 http://ddebs.ubuntu.com disco-updates Release.gpg [819 B]
Get:16 http://ddebs.ubuntu.com disco-proposed Release.gpg [819 B]
Get:17 http://security.ubuntu.com/ubuntu disco-security/main amd64 Packages [202 kB]
Get:18 http://security.ubuntu.com/ubuntu disco-security/main amd64 c-n-f Metadata [4640 B]
Get:19 http://security.ubuntu....

tags: added: verification-done verification-done-disco
removed: verification-needed verification-needed-disco

Hello bugproxy, or anyone else affected,

Accepted libvirt into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/libvirt/4.0.0-1ubuntu8.13 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in libvirt (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
removed: verification-done
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
bugproxy (bugproxy) on 2019-09-12
tags: added: targetmilestone-inin18045
removed: targetmilestone-inin---
Download full text (3.3 KiB)

Testing bionic (don't be confused by the guest named disco)

Pre:
ubuntu@riccioli:~$ virsh attach-device disco hpdisk.xml
error: Failed to attach device from hpdisk.xml
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'scsi0-0-0-0' for device

Upgrade:
$ sudo apt install libvirt-daemon-system
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libvirt-clients libvirt-daemon libvirt-daemon-driver-storage-rbd libvirt0
Suggested packages:
  libvirt-daemon-driver-storage-gluster libvirt-daemon-driver-storage-sheepdog libvirt-daemon-driver-storage-zfs numad radvd auditd systemtap nfs-common zfsutils pm-utils
The following packages will be upgraded:
  libvirt-clients libvirt-daemon libvirt-daemon-driver-storage-rbd libvirt-daemon-system libvirt0
5 upgraded, 0 newly installed, 0 to remove and 14 not upgraded.
Need to get 4116 kB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 libvirt-daemon-driver-storage-rbd amd64 4.0.0-1ubuntu8.13 [15.4 kB]
Get:2 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 libvirt-daemon-system amd64 4.0.0-1ubuntu8.13 [80.7 kB]
Get:3 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 libvirt-daemon amd64 4.0.0-1ubuntu8.13 [2176 kB]
Get:4 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 libvirt-clients amd64 4.0.0-1ubuntu8.13 [596 kB]
Get:5 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 libvirt0 amd64 4.0.0-1ubuntu8.13 [1248 kB]
Fetched 4116 kB in 1s (4660 kB/s)
Preconfiguring packages ...
(Reading database ... 71127 files and directories currently installed.)
Preparing to unpack .../libvirt-daemon-driver-storage-rbd_4.0.0-1ubuntu8.13_amd64.deb ...
Unpacking libvirt-daemon-driver-storage-rbd (4.0.0-1ubuntu8.13) over (4.0.0-1ubuntu8.12) ...
Preparing to unpack .../libvirt-daemon-system_4.0.0-1ubuntu8.13_amd64.deb ...
Unpacking libvirt-daemon-system (4.0.0-1ubuntu8.13) over (4.0.0-1ubuntu8.12) ...
Preparing to unpack .../libvirt-daemon_4.0.0-1ubuntu8.13_amd64.deb ...
Unpacking libvirt-daemon (4.0.0-1ubuntu8.13) over (4.0.0-1ubuntu8.12) ...
Preparing to unpack .../libvirt-clients_4.0.0-1ubuntu8.13_amd64.deb ...
Unpacking libvirt-clients (4.0.0-1ubuntu8.13) over (4.0.0-1ubuntu8.12) ...
Preparing to unpack .../libvirt0_4.0.0-1ubuntu8.13_amd64.deb ...
Unpacking libvirt0:amd64 (4.0.0-1ubuntu8.13) over (4.0.0-1ubuntu8.12) ...
Setting up libvirt0:amd64 (4.0.0-1ubuntu8.13) ...
Setting up libvirt-daemon (4.0.0-1ubuntu8.13) ...
Setting up libvirt-clients (4.0.0-1ubuntu8.13) ...
Setting up libvirt-daemon-system (4.0.0-1ubuntu8.13) ...
virtlockd.service is a disabled or a static unit, not starting it.
Setting up libvirt-daemon dnsmasq configuration.
Setting up libvirt-daemon-driver-storage-rbd (4.0.0-1ubuntu8.13) ...
Processing triggers for systemd (237-3ubuntu10.29) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Processing triggers for ureadahead (0.100.0-21) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...

Post:
ubuntu@...

Read more...

tags: added: verification-done verification-done-bionic
removed: verification-needed verification-needed-bionic

The verification of the Stable Release Update for libvirt has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 5.0.0-1ubuntu2.5

---------------
libvirt (5.0.0-1ubuntu2.5) disco; urgency=medium

  * d/p/ubuntu/lp-1840872-*: avoid hotplug issues with duplicate device
    addresses (LP: #1840872)

 -- Christian Ehrhardt <email address hidden> Wed, 21 Aug 2019 11:15:43 +0200

Changed in libvirt (Ubuntu Disco):
status: Fix Committed → Fix Released

Hi, as I was asked on IRC lets clarify things here about when to expect the Bionic portion to be released.
The Bionic part of this is in proposed only for 5 days. Both bugs affected by it are verified, that means per [1] that we'd need at least 2 more days in proposed and then one of the SRU Team members picking it up for the final release.

Since there is a sprint going on atm, I'd more expect early next week than late friday (which is when the 7 days are fulfilled)

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 4.0.0-1ubuntu8.13

---------------
libvirt (4.0.0-1ubuntu8.13) bionic; urgency=medium

  * d/p/ubuntu/lp-1840872-*: avoid hotplug issues with duplicate device
    addresses (LP: #1840872)
  * d/p/ubuntu/lp-1840745-*: add amd ssbd / no-ssbd features (LP: #1840745)

 -- Christian Ehrhardt <email address hidden> Wed, 21 Aug 2019 11:08:29 +0200

Changed in libvirt (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers