[SRU] Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a new instance

Bug #1633120 reported by Chinmaya Dwibedy
This bug affects 7 people
Affects                   Status         Importance  Assigned to
OpenStack Compute (nova)  Fix Released   Medium      sean mooney
  Ocata                   Fix Committed  Medium      sean mooney
  Pike                    Fix Committed  Medium      sean mooney
  Queens                  Fix Committed  Medium      sean mooney
  Rocky                   Fix Committed  Medium      sean mooney
Ubuntu Cloud Archive      Fix Released   Undecided   Unassigned
  Mitaka                  Won't Fix      High        Unassigned
  Ocata                   Fix Released   High        Unassigned
  Queens                  Fix Released   Undecided   Unassigned
  Rocky                   Fix Released   Undecided   Unassigned
  Stein                   Fix Released   Undecided   Unassigned
nova (Ubuntu)             Fix Released   Undecided   Unassigned
  Xenial                  Won't Fix      High        Unassigned
  Bionic                  Fix Released   Undecided   Unassigned
  Cosmic                  Fix Released   Undecided   Unassigned
  Disco                   Fix Released   Undecided   Unassigned
  Eoan                    Fix Released   Undecided   Unassigned

Bug Description

[Impact]
This patch is required to prevent nova from accidentally marking pci_device allocations as deleted when it misreads the passthrough whitelist.

[Test Case]
* deploy openstack (any version that supports sriov)
* single compute node configured for SR-IOV with at least one device in pci_passthrough_whitelist
* create a vm and attach sriov port
* remove device from pci_passthrough_whitelist and restart nova-compute
* check that pci_devices allocations have not been marked as deleted

[Regression Potential]
None anticipated
----------------------------------------------------------------------------
Upon trying to create a VM instance (say A) with one QAT VF, it fails with the following error: "Requested operation is not valid: PCI device 0000:88:04.7 is in use by driver QEMU, domain instance-00000081". Note that PCI device 0000:88:04.7 is already assigned to another VM (say B). We have installed the OpenStack Mitaka release on a CentOS 7 system. It has two Intel QAT devices, with 32 VFs available per QAT/DH895xCC device. Of the 64 VFs, only 8 are allocated (to VM instances); the rest should be available.
Nevertheless, the nova scheduler tries to assign an already-in-use SRIOV VF to the new instance, which then fails. It appears that the nova database is not tracking which VFs have already been taken. If I shut down VM B, then VM A boots up, and vice versa; the two instances cannot run simultaneously because of this issue.

We should always be able to create as many instances with the requested PCI devices as there are available VFs.

Please feel free to let me know if additional information is needed. Can anyone suggest why nova tries to assign a PCI device that has already been assigned? Is there a way to resolve this issue? Thank you in advance for your support and help.

[root@localhost ~(keystone_admin)]# lspci -d:435
83:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
88:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
[root@localhost ~(keystone_admin)]#

[root@localhost ~(keystone_admin)]# lspci -d:443 | grep "QAT Virtual Function" | wc -l
64
[root@localhost ~(keystone_admin)]#

[root@localhost ~(keystone_admin)]# mysql -u root nova -e "SELECT hypervisor_hostname, address, instance_uuid, status FROM pci_devices JOIN compute_nodes ON compute_nodes.id=compute_node_id" | grep 0000:88:04.7
localhost 0000:88:04.7 e10a76f3-e58e-4071-a4dd-7a545e8000de allocated
localhost 0000:88:04.7 c3dbac90-198d-4150-ba0f-a80b912d8021 allocated
localhost 0000:88:04.7 c7f6adad-83f0-4881-b68f-6d154d565ce3 allocated
localhost.nfv.benunets.com 0000:88:04.7 0c3c11a5-f9a4-4f0d-b120-40e4dde843d4 allocated
[root@localhost ~(keystone_admin)]#

[root@localhost ~(keystone_admin)]# grep -r e10a76f3-e58e-4071-a4dd-7a545e8000de /etc/libvirt/qemu
/etc/libvirt/qemu/instance-00000081.xml: <uuid>e10a76f3-e58e-4071-a4dd-7a545e8000de</uuid>
/etc/libvirt/qemu/instance-00000081.xml: <entry name='uuid'>e10a76f3-e58e-4071-a4dd-7a545e8000de</entry>
/etc/libvirt/qemu/instance-00000081.xml: <source file='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/disk'/>
/etc/libvirt/qemu/instance-00000081.xml: <source path='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/console.log'/>
/etc/libvirt/qemu/instance-00000081.xml: <source path='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/console.log'/>
[root@localhost ~(keystone_admin)]#
[root@localhost ~(keystone_admin)]# grep -r 0c3c11a5-f9a4-4f0d-b120-40e4dde843d4 /etc/libvirt/qemu
/etc/libvirt/qemu/instance-000000ab.xml: <uuid>0c3c11a5-f9a4-4f0d-b120-40e4dde843d4</uuid>
/etc/libvirt/qemu/instance-000000ab.xml: <entry name='uuid'>0c3c11a5-f9a4-4f0d-b120-40e4dde843d4</entry>
/etc/libvirt/qemu/instance-000000ab.xml: <source file='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/disk'/>
/etc/libvirt/qemu/instance-000000ab.xml: <source path='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/console.log'/>
/etc/libvirt/qemu/instance-000000ab.xml: <source path='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/console.log'/>
[root@localhost ~(keystone_admin)]#

On the controller, it appears there are duplicate PCI device entries in the database:

MariaDB [nova]> select hypervisor_hostname,address,count(*) from pci_devices JOIN compute_nodes on compute_nodes.id=compute_node_id group by hypervisor_hostname,address having count(*) > 1;
+---------------------+--------------+----------+
| hypervisor_hostname | address | count(*) |
+---------------------+--------------+----------+
| localhost | 0000:05:00.0 | 3 |
| localhost | 0000:05:00.1 | 3 |
| localhost | 0000:83:01.0 | 3 |
| localhost | 0000:83:01.1 | 3 |
| localhost | 0000:83:01.2 | 3 |
| localhost | 0000:83:01.3 | 3 |
| localhost | 0000:83:01.4 | 3 |
| localhost | 0000:83:01.5 | 3 |
| localhost | 0000:83:01.6 | 3 |
| localhost | 0000:83:01.7 | 3 |
| localhost | 0000:83:02.0 | 3 |
| localhost | 0000:83:02.1 | 3 |
| localhost | 0000:83:02.2 | 3 |
| localhost | 0000:83:02.3 | 3 |
| localhost | 0000:83:02.4 | 3 |
| localhost | 0000:83:02.5 | 3 |
| localhost | 0000:83:02.6 | 3 |
| localhost | 0000:83:02.7 | 3 |
| localhost | 0000:83:03.0 | 3 |
| localhost | 0000:83:03.1 | 3 |
| localhost | 0000:83:03.2 | 3 |
| localhost | 0000:83:03.3 | 3 |
| localhost | 0000:83:03.4 | 3 |
| localhost | 0000:83:03.5 | 3 |
| localhost | 0000:83:03.6 | 3 |
| localhost | 0000:83:03.7 | 3 |
| localhost | 0000:83:04.0 | 3 |
| localhost | 0000:83:04.1 | 3 |
| localhost | 0000:83:04.2 | 3 |
| localhost | 0000:83:04.3 | 3 |
| localhost | 0000:83:04.4 | 3 |
| localhost | 0000:83:04.5 | 3 |
| localhost | 0000:83:04.6 | 3 |
| localhost | 0000:83:04.7 | 3 |
| localhost | 0000:88:01.0 | 3 |
| localhost | 0000:88:01.1 | 3 |
| localhost | 0000:88:01.2 | 3 |
| localhost | 0000:88:01.3 | 3 |
| localhost | 0000:88:01.4 | 3 |
| localhost | 0000:88:01.5 | 3 |
| localhost | 0000:88:01.6 | 3 |
| localhost | 0000:88:01.7 | 3 |
| localhost | 0000:88:02.0 | 3 |
| localhost | 0000:88:02.1 | 3 |
| localhost | 0000:88:02.2 | 3 |
| localhost | 0000:88:02.3 | 3 |
| localhost | 0000:88:02.4 | 3 |
| localhost | 0000:88:02.5 | 3 |
| localhost | 0000:88:02.6 | 3 |
| localhost | 0000:88:02.7 | 3 |
| localhost | 0000:88:03.0 | 3 |
| localhost | 0000:88:03.1 | 3 |
| localhost | 0000:88:03.2 | 3 |
| localhost | 0000:88:03.3 | 3 |
| localhost | 0000:88:03.4 | 3 |
| localhost | 0000:88:03.5 | 3 |
| localhost | 0000:88:03.6 | 3 |
| localhost | 0000:88:03.7 | 3 |
| localhost | 0000:88:04.0 | 3 |
| localhost | 0000:88:04.1 | 3 |
| localhost | 0000:88:04.2 | 3 |
| localhost | 0000:88:04.3 | 3 |
| localhost | 0000:88:04.4 | 3 |
| localhost | 0000:88:04.5 | 3 |
| localhost | 0000:88:04.6 | 3 |
| localhost | 0000:88:04.7 | 3 |
+---------------------+--------------+----------+
66 rows in set (0.00 sec)

MariaDB [nova]>

Revision history for this message
Jon Proulx (jproulx) wrote :

I ran into a very similar issue with GPU passthrough (stable/mitaka from the Ubuntu Cloud Archive on 14.04).

In my case there was a config management bug on my end which removed the active devices from the nova DB; when the config was fixed, nova created new "available" records for all the devices, including the ones currently in use.

I think nova should check whether duplicate "deleted" records exist and undelete them, checking whether the assigned instance (if there is one) still exists: if it does, leave the device assigned; if it doesn't, mark the resource as available in addition to undeleting it.

example DB state:
> SELECT created_at,deleted_at,deleted,id,compute_node_id,address,status,instance_uuid FROM pci_devices WHERE address='0000:09:00.0';
+---------------------+---------------------+---------+----+-----------------+--------------+-----------+--------------------------------------+
| created_at | deleted_at | deleted | id | compute_node_id | address | status | instance_uuid |
+---------------------+---------------------+---------+----+-----------------+--------------+-----------+--------------------------------------+
| 2016-07-06 00:12:30 | 2016-10-13 21:04:53 | 4 | 4 | 90 | 0000:09:00.0 | allocated | 9269391a-4ce4-4c8d-993d-5ad7a9c3879b |
| 2016-10-18 18:01:35 | NULL | 0 | 12 | 90 | 0000:09:00.0 | available | NULL |
+---------------------+---------------------+---------+----+-----------------+--------------+-----------+--------------------------------------+

In this case instance ID 9269391a-4ce4-4c8d-993d-5ad7a9c3879b did exist and was using PCI 09:00.0 but it was associated in the deleted row.

I only had three devices affected by this (and in use), so I could fix it by hand relatively easily. I wonder if the SRIOV issue is the same.
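
The reconciliation proposed above can be sketched in plain Python. This is illustrative only, not nova code: the row dicts and the `existing_instances` set are stand-ins for the real pci_devices rows and instance lookup.

```python
def reconcile(rows, existing_instances):
    """Undelete duplicate 'deleted' pci_devices rows for one address.

    rows: dicts with 'deleted', 'status' and 'instance_uuid' keys.
    existing_instances: set of instance UUIDs that still exist.
    """
    for row in rows:
        if not row["deleted"]:
            continue
        row["deleted"] = False  # undelete the record
        if row["instance_uuid"] in existing_instances:
            # the instance still holds the device: keep it allocated
            row["status"] = "allocated"
        else:
            # no such instance any more: free the device
            row["status"] = "available"
            row["instance_uuid"] = None
    return rows
```

Applied to the DB state above, the deleted allocated row for 0000:09:00.0 would be undeleted and kept allocated, since its instance still exists; the duplicate "available" row would then need to be dropped separately.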

Jon Proulx (jproulx)
Changed in nova:
status: New → Confirmed
tags: added: pci
Revision history for this message
Frode Nordahl (fnordahl) wrote :

I believe the current Nova PCI implementation is susceptible to getting out of sync in multiple scenarios. We have seen this too, and suspect rows ended up in the 'deleted' state because of lost messages and/or other operational events occurring at the same time as instance life-cycle events.

Tracking down all the places this disconnect might happen seems like an impossible task, and I believe we should focus on:

a) a means for operators to force a refresh of PCI devices from a compute node, which could easily be backported to previous OpenStack versions

b) improved handling of instance PCI attachments in the periodic refresh of compute nodes

Frode Nordahl (fnordahl)
summary: Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a new
- instance (openstack-mitaka)
+ instance
Revision history for this message
Sean Dague (sdague) wrote : Re: Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a new instance

Automatically discovered version mitaka in description. If this is incorrect, please update the description to include 'nova version: ...'

tags: added: openstack-version.mitaka
Revision history for this message
Matt Riedemann (mriedem) wrote :

Seems to me that a very brute force way to prevent deleting allocated pci device records would be to raise an exception here:

https://github.com/openstack/nova/blob/def4b17934a3b2cf783d0177d6a9632916dfd10f/nova/objects/pci_device.py#L244

if self.instance_uuid is not None. I don't know what is setting the PciDevice.status to REMOVED/DELETED in the stack, but clearly it's wrong and we should guard against it.
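
That guard could look something like this. This is a hypothetical, heavily simplified stand-in for nova's PciDevice object, not the real class:

```python
class PciDeviceInvalidStatus(Exception):
    pass


class PciDevice:
    """Simplified stand-in for nova.objects.pci_device.PciDevice."""

    def __init__(self, address, instance_uuid=None):
        self.address = address
        self.instance_uuid = instance_uuid
        self.status = "allocated" if instance_uuid else "available"

    def remove(self):
        # the suggested brute-force guard: never delete a record that
        # still has an instance attached to it
        if self.instance_uuid is not None:
            raise PciDeviceInvalidStatus(
                "PCI device %s is allocated to instance %s; refusing to "
                "remove it" % (self.address, self.instance_uuid))
        self.status = "removed"
```

With this guard, removing a free device succeeds while removing an allocated one raises, so a bad whitelist read can no longer silently delete in-use records.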

Revision history for this message
Matt Riedemann (mriedem) wrote :

This might be where the status is changed to REMOVED:

https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L177

Revision history for this message
Matt Riedemann (mriedem) wrote :

Yup, the reporter of duplicate bug 1809040 reported seeing that warning:

https://paste.ubuntu.com/p/GVJQqMSTrM/

2018-12-18 20:32:45.051 4961 WARNING nova.pci.manager [req-88cfd6bc-a25e-498b-972b-b9fa539a8e82 - - - - -] Trying to remove device with allocated ownership 9a1800dd-ab4d-4075-a999-dbc67cfc41e4 because of PCI device 23:0000:85:00.1 is allocated instead of ['available']: PciDeviceInvalidStatus: PCI device 23:0000:85:00.1 is allocated instead of ['available']

which aligns with a deleted allocated pci device record:

https://paste.ubuntu.com/p/Pn76QVmwqr/

So we should probably just change https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L177 to a continue statement.
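
The eventual fix follows that shape: skip in-use devices instead of force-removing them when they drop out of the whitelist. A simplified sketch of that direction (the dicts below stand in for nova's PciDevice objects; this is not the actual _set_hvdevs code):

```python
import logging

LOG = logging.getLogger(__name__)


def sync_devices(tracked_devices, whitelisted_addresses):
    """Drop tracked devices no longer whitelisted, except in-use ones."""
    kept = []
    for dev in tracked_devices:
        if dev["address"] in whitelisted_addresses:
            kept.append(dev)
            continue
        if dev["status"] in ("claimed", "allocated"):
            # the key change: warn and skip instead of force-removing
            LOG.warning("Unable to remove device with status %s and "
                        "ownership %s; skipping",
                        dev["status"], dev["instance_uuid"])
            kept.append(dev)
            continue
        # free and no longer whitelisted: forget about it
    return kept
```

A device that is merely "available" still disappears when it leaves the whitelist, which is the intended behaviour; only claimed or allocated devices survive the sync.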

Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → High
assignee: nobody → sean mooney (sean-k-mooney)
status: Confirmed → Triaged
importance: High → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/626381

Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/626381
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=26c41eccade6412f61f9a8721d853b545061adcc
Submitter: Zuul
Branch: master

commit 26c41eccade6412f61f9a8721d853b545061adcc
Author: Sean Mooney <email address hidden>
Date: Wed Dec 19 19:40:05 2018 +0000

    PCI: do not force remove allocated devices

    In the ocata release the pci_passthrough_whitelist
    was moved from the [DEFAULT] section of the nova.conf
    to the [pci] section and renamed to passthrough_whitelist.

    On upgrading if the operator chooses to migrate the config
    value to the new section it is not uncommon
    to forget to rename the config value.
    Similarly if an operator is updateing the whitelist and
    mistypes the value it can also lead to the whitelist
    being ignored.

    As a result of either error the nova compute agent
    would delete all database entries for a host regardless of
    if the pci device was in use by an instance. If this occurs
    the only recorse for an operator is to delete and recreate
    the guest on that host after correcting the error or manually
    restore the database to backup or otherwise consistent state.

    This change alters the _set_hvdevs function to not force
    remove allocated or claimed devices if they are no longer
    present in the pci whitelist.

    Closes-Bug: #1633120
    Change-Id: I6e871311a0fa10beaf601ca6912b4a33ba4094e0
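
The misconfiguration the commit describes looks like this in nova.conf. This is a minimal example: the device IDs match the Intel QAT VFs from the original report (vendor 8086, device 0443), and the exact whitelist entry an operator uses may differ.

```ini
# Pre-Ocata location (no longer read once the option moved):
# [DEFAULT]
# pci_passthrough_whitelist = {"vendor_id": "8086", "product_id": "0443"}

# Ocata and later: both the section and the option name changed.
[pci]
passthrough_whitelist = {"vendor_id": "8086", "product_id": "0443"}
```

Leaving the entry under [DEFAULT], or mistyping the option name, made nova see an empty whitelist, which before this fix caused it to delete all pci_devices rows for the host, allocated or not.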

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/635071

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/635072

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/635074

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/635075

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.openstack.org/635071
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9f9f372f33310ba6d57067fb0200aaa3e8d02978
Submitter: Zuul
Branch: stable/rocky

commit 9f9f372f33310ba6d57067fb0200aaa3e8d02978
Author: Sean Mooney <email address hidden>
Date: Wed Dec 19 19:40:05 2018 +0000

    PCI: do not force remove allocated devices

    (Commit message identical to the master commit above.)

    Closes-Bug: #1633120
    Change-Id: I6e871311a0fa10beaf601ca6912b4a33ba4094e0
    (cherry picked from commit 26c41eccade6412f61f9a8721d853b545061adcc)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/635072
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=955ecf26c57d1eeb2463fb342ffb6b6114e6ce24
Submitter: Zuul
Branch: stable/queens

commit 955ecf26c57d1eeb2463fb342ffb6b6114e6ce24
Author: Sean Mooney <email address hidden>
Date: Wed Dec 19 19:40:05 2018 +0000

    PCI: do not force remove allocated devices

    (Commit message identical to the master commit above.)

    Closes-Bug: #1633120
    Change-Id: I6e871311a0fa10beaf601ca6912b4a33ba4094e0
    (cherry picked from commit 26c41eccade6412f61f9a8721d853b545061adcc)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/635074
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=239bdd0fd243a97f3cf36999f9dbdd99b6b4ab6a
Submitter: Zuul
Branch: stable/pike

commit 239bdd0fd243a97f3cf36999f9dbdd99b6b4ab6a
Author: Sean Mooney <email address hidden>
Date: Wed Dec 19 19:40:05 2018 +0000

    PCI: do not force remove allocated devices

    (Commit message identical to the master commit above.)

    Conflicts:
      nova/pci/manager.py

    Closes-Bug: #1633120
    Change-Id: I6e871311a0fa10beaf601ca6912b4a33ba4094e0
    (cherry picked from commit 26c41eccade6412f61f9a8721d853b545061adcc)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.0.0rc1

This issue was fixed in the openstack/nova 19.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.10

This issue was fixed in the openstack/nova 17.0.10 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.2.0

This issue was fixed in the openstack/nova 18.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/635075
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5c5a6b93a07b0b58f513396254049c17e2883894
Submitter: Zuul
Branch: stable/ocata

commit 5c5a6b93a07b0b58f513396254049c17e2883894
Author: Sean Mooney <email address hidden>
Date: Wed Dec 19 19:40:05 2018 +0000

    PCI: do not force remove allocated devices

    (Commit message identical to the master commit above.)

    Conflicts:
      nova/pci/manager.py

    Closes-Bug: #1633120
    Change-Id: I6e871311a0fa10beaf601ca6912b4a33ba4094e0
    (cherry picked from commit 26c41eccade6412f61f9a8721d853b545061adcc)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.1.8

This issue was fixed in the openstack/nova 16.1.8 release.

summary: - Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a new
- instance
+ [SRU] Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a
+ new instance
description: updated
tags: added: sts-sru-needed
Revision history for this message
Corey Bryant (corey.bryant) wrote :

Cosmic is EOL so let's just fix this in Rocky.

Changed in nova (Ubuntu Cosmic):
status: New → Won't Fix
Revision history for this message
Corey Bryant (corey.bryant) wrote :

This is fixed in Ubuntu in all packages newer than Ocata.

Changed in nova (Ubuntu Eoan):
status: New → Fix Released
Changed in nova (Ubuntu Disco):
status: New → Fix Released
Changed in nova (Ubuntu Cosmic):
status: Won't Fix → Fix Released
Changed in nova (Ubuntu Bionic):
status: New → Fix Released
Changed in nova (Ubuntu Xenial):
importance: Undecided → High
status: New → Triaged
Revision history for this message
Corey Bryant (corey.bryant) wrote :

I'm not sure how much this is needed in Ubuntu Mitaka. It seems from the commit message that this is triggered mostly by the move of pci_passthrough_whitelist from the [DEFAULT] section of nova.conf to the [pci] section in Ocata, where it was renamed to passthrough_whitelist.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

OK, I chatted with Edward Hope-Morley offline and he confirmed this is being hit on Mitaka as well.

Revision history for this message
Corey Bryant (corey.bryant) wrote : Please test proposed package

Hello Chinmaya, or anyone else affected,

Accepted nova into ocata-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:ocata-proposed
  sudo apt-get update

Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ocata-needed to verification-ocata-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ocata-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-ocata-needed
Revision history for this message
Edward Hope-Morley (hopem) wrote :

Xenial Ocata verified using [Test Case]

Test output: https://pastebin.ubuntu.com/p/5gnDJBz5J4/

tags: added: verification-ocata-done
removed: verification-ocata-needed
Revision history for this message
Edward Hope-Morley (hopem) wrote :

The fix is not backportable to Mitaka, so abandoning:

$ git-deps -e mitaka-eol 5c5a6b93a07b0b58f513396254049c17e2883894^!
c2c3b97259258eec3c98feabde3b411b519eae6e

$ git-deps -e mitaka-eol c2c3b97259258eec3c98feabde3b411b519eae6e^!
a023c32c70b5ddbae122636c26ed32e5dcba66b2
74fbff88639891269f6a0752e70b78340cf87e9a
e83842b80b73c451f78a4bb9e7bd5dfcebdefcab
1f259e2a9423a4777f79ca561d5e6a74747a5019
b01187eede3881f72addd997c8fd763ddbc137fc
49d9433c62d74f6ebdcf0832e3a03e544b1d6c83

Changed in nova (Ubuntu Xenial):
status: Triaged → Won't Fix
Revision history for this message
Corey Bryant (corey.bryant) wrote : Update Released

The verification of the Stable Release Update for nova has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

This bug was fixed in the package nova - 2:15.1.5-0ubuntu1~cloud4
---------------

 nova (2:15.1.5-0ubuntu1~cloud4) xenial-ocata; urgency=medium
 .
   * d/p/pci-do-not-force-remove-allocated-devices.patch: Cherry-picked
     from upstream to prevent forced removal of allocated PCI devices
     (LP: #1633120).

tags: added: sts-sru-done
removed: sts-sru-needed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova ocata-eol

This issue was fixed in the openstack/nova ocata-eol release.
