32bit pci domain number is not supported

Bug #1897528 reported by Nobuhiro MIKI on 2020-09-28
This bug affects 1 person
Affects                    Importance   Assigned to
OpenStack Compute (nova)   High         Balazs Gibizer
Train                      Undecided    Unassigned
Ussuri                     Undecided    Unassigned
Victoria                   Undecided    Unassigned

Bug Description

A device with a PCI domain number wider than 16 bits exists on a compute node.
On that node, the nova-compute service fails to start.
This happened on the rocky version, but it should also happen at commit d01972b272 on the master branch [1].
Other projects, such as libvirt, have already addressed this issue [2].

$ less /var/log/nova/compute.log
[req-0a9e4f23-0576-456e-add5-5bd6e7d707d6 - - - - -] Error updating resources for node XXXX.co.jp.: PciConfigInvalidWhitelist: Invalid PCI devices Whitelist config: property domain (10000) is greater than the maximum allowable value (FFFF).
Traceback (most recent call last):
  File "/opt/nova/lib/python2.7/site-packages/nova/compute/manager.py", line 7994, in _update_available_resource_for_node
    rt.update_available_resource(context, nodename)
  File "/opt/nova/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 721, in update_available_resource
    self._update_available_resource(context, resources)
  File "/opt/nova/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
    return f(*args, **kwargs)
  File "/opt/nova/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 744, in _update_available_resource
    self._init_compute_node(context, resources)
  File "/opt/nova/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 572, in _init_compute_node
    self._setup_pci_tracker(context, cn, resources)
  File "/opt/nova/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 599, in _setup_pci_tracker
    dev_json)
  File "/opt/nova/lib/python2.7/site-packages/nova/pci/manager.py", line 120, in update_devices_from_hypervisor_resources
    if self.dev_filter.device_assignable(dev):
  File "/opt/nova/lib/python2.7/site-packages/nova/pci/whitelist.py", line 86, in device_assignable
    if spec.match(dev):
  File "/opt/nova/lib/python2.7/site-packages/nova/pci/devspec.py", line 281, in match
    dev_dict.get('parent_addr'))])
  File "/opt/nova/lib/python2.7/site-packages/nova/pci/devspec.py", line 238, in match
    pci_addr_obj = PhysicalPciAddress(pci_addr)
  File "/opt/nova/lib/python2.7/site-packages/nova/pci/devspec.py", line 87, in __init__
    self._set_pci_dev_info('domain', MAX_DOMAIN, '%04x')
  File "/opt/nova/lib/python2.7/site-packages/nova/pci/devspec.py", line 66, in _set_pci_dev_info
    {'property': prop, 'attr': a, 'max': maxval})
PciConfigInvalidWhitelist: Invalid PCI devices Whitelist config: property domain (10000) is greater than the maximum allowable value (FFFF).
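The check that produces this error can be sketched roughly as follows. This is a minimal illustration, not nova's actual code: `parse_domain` and `InvalidPciDomain` are hypothetical names, and only the 0xFFFF limit and the wording of the message are taken from the log above.

```python
MAX_DOMAIN = 0xFFFF  # 16-bit limit, matching the FFFF in the error message


class InvalidPciDomain(ValueError):
    """Hypothetical stand-in for nova's PciConfigInvalidWhitelist."""


def parse_domain(pci_addr):
    """Return the domain of a 'domain:bus:slot.func' address as an int.

    Raises InvalidPciDomain when the domain does not fit in 16 bits.
    """
    domain_str = pci_addr.split(":")[0]
    domain = int(domain_str, 16)
    if domain > MAX_DOMAIN:
        raise InvalidPciDomain(
            "property domain (%s) is greater than the maximum "
            "allowable value (%X)." % (domain_str, MAX_DOMAIN))
    return domain
```

With this sketch, "0000:d7:16.1" parses to domain 0, while "10000:01:00.0" raises, because 0x10000 is one above the 16-bit maximum.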

$ lspci | tail
0000:d7:16.1 Performance counters: Intel Corporation Device 2088 (rev 07)
0000:d7:16.4 System peripheral: Intel Corporation Sky Lake-E M2PCI Registers (rev 07)
0000:d7:16.5 Performance counters: Intel Corporation Device 2088 (rev 07)
0000:d7:17.0 System peripheral: Intel Corporation Sky Lake-E M2PCI Registers (rev 07)
0000:d7:17.1 Performance counters: Intel Corporation Device 2088 (rev 07)
10000:00:00.0 PCI bridge: Intel Corporation Sky Lake-E PCI Express Root Port A (rev 07)
10000:00:01.0 PCI bridge: Intel Corporation Sky Lake-E PCI Express Root Port B (rev 07)
10000:00:02.0 PCI bridge: Intel Corporation Sky Lake-E PCI Express Root Port C (rev 07)
10000:00:03.0 PCI bridge: Intel Corporation Sky Lake-E PCI Express Root Port D (rev 07)
10000:01:00.0 Non-Volatile memory controller: Toshiba America Info Systems Device 0110 (rev 01)
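To check whether a host exposes such devices, the lspci output can be scanned for domains above 0xFFFF. The following is a rough sketch that assumes lspci prints the domain prefix, as it does in the output above; `wide_domain_devices` is a hypothetical helper, not part of nova.

```python
import re

# Leading "domain:bus:slot.function" of an lspci line that includes the domain.
ADDR_RE = re.compile(
    r"^([0-9a-fA-F]+):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])\s")


def wide_domain_devices(lspci_output, max_domain=0xFFFF):
    """Return the addresses whose PCI domain exceeds max_domain."""
    found = []
    for line in lspci_output.splitlines():
        match = ADDR_RE.match(line)
        if match and int(match.group(1), 16) > max_domain:
            found.append(line.split()[0])
    return found
```

Lines without a domain prefix (for example "d7:16.1 ...") do not match the pattern and are skipped.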

[1] https://opendev.org/openstack/nova/src/commit/d01972b272c0beade35e76aa84965c5c21124db9/nova/pci/devspec.py#L27
[2] https://gitlab.com/libvirt/libvirt/-/commit/d19c21429fd1acfdade23c903804d6e279d1e00e

tags: added: compute libvirt pci
Balazs Gibizer (balazs-gibizer) wrote :

The libvirt fix[2] landed in libvirt 5.7.0. Do we need, or do we already have, a similar fix in qemu as well?

If both qemu and libvirt support 32bit PCI domains then I don't see any reason why nova should reject 32bit PCI domains.

Could you please confirm that QEMU works with these 32bit PCI domains? I'm setting this bug to Incomplete until the QEMU support is confirmed.

[2] https://gitlab.com/libvirt/libvirt/-/commit/d19c21429fd1acfdade23c903804d6e279d1e00e

Changed in nova:
status: New → Incomplete
Balazs Gibizer (balazs-gibizer) wrote :

Looking at the QEMU code, it seems to me that QEMU would simply reject PCI devices with a domain bigger than 0xFFFF [1]. I'm not sure if this limitation is also valid for QEMU with accel=kvm.

[1] https://github.com/qemu/qemu/blob/f2a1cf9180f63e88bb38ff21c169da97c3f2bad5/hw/core/qdev-properties.c#L993

Nobuhiro MIKI (nmiki) wrote :

Thanks for the confirmation.

I tried it by actually running qemu with accel=kvm.
In my environment (linux 3.10.0-957.27.2.el7.x86_64), the vfio-pci driver
failed to bind the 32bit PCI domain device before booting the VM. So, it
seems to me that there are a few other things we need to check besides
QEMU to assign 32bit PCI domain device to the VM. Please let me know if
the procedure is wrong.

Anyway, as long as nova doesn't support 32bit PCI domain numbers,
the nova-compute service itself won't be able to start on such a host,
regardless of whether we actually use a 32bit PCI domain device or not.
So, I'd like to send a patch if possible.

# 16bit PCI domain number, booted successfully
$ sudo qemu-system-x86_64 -machine pc-i440fx-2.11,accel=kvm -nographic \
  -device vfio-pci,host=0000:5e:02.5 cirros-0.5.1-x86_64-disk.img

# 32bit PCI domain number, failed to boot
$ dmesg
# vfio-pci: probe of 10000:01:00.0 failed with error -22
$ sudo qemu-system-x86_64 -machine pc-i440fx-2.11,accel=kvm -nographic \
  -device vfio-pci,host=10000:01:00.0 cirros-0.5.1-x86_64-disk.img

Balazs Gibizer (balazs-gibizer) wrote :

Thanks for confirming that qemu also fails with such a PCI device.

Does nova only fail to start because you added a 32bit PCI address to the whitelist?

I'm a bit reluctant to allow adding 32bit PCI addresses to the whitelist, as it suggests that the deployment supports them, but right now qemu will fail when you want to boot a VM with such a device. So I think first we have to make sure that there is end-to-end support for 32bit addresses, and then we can enable the use of them in nova.

Nobuhiro MIKI (nmiki) wrote :

This issue is reproduced even when only a 16bit PCI domain address is
registered in pci.passthrough_whitelist. Note that it is not reproduced if
pci.passthrough_whitelist is empty.

So I think there may be a bug in the PCI domain address matching part.

2020-10-08 11:23:13.316 10735 DEBUG oslo_service.service [req-bef9c36c-53c2-45ef-9b8e-4e3f602172cb - - - - -] pci.alias = [] log_opt_values /opt/nova/lib/python2.7/site-packages/oslo_config/cfg.py:3027
2020-10-08 11:23:13.317 10735 DEBUG oslo_service.service [req-bef9c36c-53c2-45ef-9b8e-4e3f602172cb - - - - -] pci.passthrough_whitelist = ['{ "address": "0000:5e:01.5" }'] log_opt_values /opt/nova/lib/python2.7/site-packages/oslo_config/cfg.py:3027
2020-10-08 11:23:14.142 10735 ERROR nova.compute.manager [req-e22b43ef-0f19-4ba1-9711-87a7ea9ca8f1 - - - - -] Error updating resources for node XXXXXX.jp.: PciConfigInvalidWhitelist: Invalid PCI devices Whitelist config: property domain (10000) is greater than the maximum allowable value (FFFF).

Balazs Gibizer (balazs-gibizer) wrote :

Now that is interesting. So in a system where there are devices with a 32bit PCI domain, nova still fails even though they are not listed in the passthrough_whitelist. I'm setting this back to New. I think it would make sense to try to create a reproduction unit test. My feeling is that we iterate over all the host devices when filtering with the whitelist, and there we encounter the 32bit domain.

Changed in nova:
status: Incomplete → New
Balazs Gibizer (balazs-gibizer) wrote :

I tested my above assumption and it seems it is the root of the problem.

The nova.pci.devspec.WhitelistPciAddress.match()[1] is called for every PCI device on the host at startup, including the ones with a 32bit domain. It reuses the nova.pci.devspec.PciAddressSpec._set_pci_dev_info()[2] utility with maxval=0xFFFF for domains. So I can confirm that if the host has a PCI device with a 32bit domain then nova-compute will refuse to start. If the passthrough_whitelist config is empty then there is a shortcut in the startup sequence[3] that skips the iteration of the PCI devices on the host, hence the problem disappears.

Marking this bug Confirmed. I will try to push a fix soon.

[1] https://github.com/openstack/nova/blob/2745e685376abbc4c32516837f6074a3de23aa24/nova/pci/devspec.py#L217
[2] https://github.com/openstack/nova/blob/2745e685376abbc4c32516837f6074a3de23aa24/nova/pci/devspec.py#L51
[3] https://github.com/openstack/nova/blob/2745e685376abbc4c32516837f6074a3de23aa24/nova/compute/manager.py#L1393

Changed in nova:
status: New → Confirmed
importance: Undecided → High
Balazs Gibizer (balazs-gibizer) wrote :

To be precise, the shortcut I mentioned in comment #7 is a shortcut because an empty whitelist creates zero device specifications, and when the host devices are iterated each one is matched against the existing specs. If no spec exists, there is no attempt to match and therefore no PCI address parsing happens.
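
The behaviour described above can be illustrated with a small sketch. It is a simplified model, not nova's code: `DeviceSpec`, `parse_domain` and `device_assignable` here are hypothetical stand-ins for the whitelist matching path seen in the traceback.

```python
MAX_DOMAIN = 0xFFFF  # 16-bit limit applied to PCI domains


def parse_domain(pci_addr):
    """Parse the domain of 'domain:bus:slot.func'; reject domains over 16 bits."""
    domain = int(pci_addr.split(":")[0], 16)
    if domain > MAX_DOMAIN:
        raise ValueError("domain %X exceeds %X" % (domain, MAX_DOMAIN))
    return domain


class DeviceSpec(object):
    """One whitelist entry (hypothetical, address-only)."""

    def __init__(self, address):
        self.address = address

    def match(self, dev):
        # The host device address is parsed here, so a 32bit domain on the
        # host raises even though the spec itself is a 16bit address.
        parse_domain(dev["address"])
        return dev["address"] == self.address


def device_assignable(specs, dev):
    """True if any spec matches; with no specs, match() is never called."""
    return any(spec.match(dev) for spec in specs)
```

With an empty `specs` list the generator inside `any()` yields nothing, so no address is parsed and the 32bit domain device is silently reported as not assignable. With a single 16bit spec, the same host device triggers the parsing error.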

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/756696

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/756697

Changed in nova:
assignee: nobody → Balazs Gibizer (balazs-gibizer)
Changed in nova:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 23.0.0.0rc1

This issue was fixed in the openstack/nova 23.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/791767

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/791768

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/791770

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/791771

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/792116

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/792117

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/nova/+/791767
Committed: https://opendev.org/openstack/nova/commit/0354d4d9f47354e2b4fc0b2343c27e734fe2e494
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 0354d4d9f47354e2b4fc0b2343c27e734fe2e494
Author: Balazs Gibizer <email address hidden>
Date: Thu Oct 8 14:13:38 2020 +0200

    Reproduce bug 1897528

    The nova-compute fails to start if the hypervisor has PCI addresses
    32bit domain.

    Change-Id: I48dcb7faa17fe9f8346445a1746cff5845baf358
    Related-Bug: #1897528
    (cherry picked from commit 976ac722d36439d16ea4ec1bf5037c482c89ef55)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/nova/+/791768
Committed: https://opendev.org/openstack/nova/commit/90ffc553d7f4152a6a4a8708787150d3c3c40b03
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 90ffc553d7f4152a6a4a8708787150d3c3c40b03
Author: Balazs Gibizer <email address hidden>
Date: Thu Oct 8 14:27:44 2020 +0200

    Ignore PCI devices with 32bit domain

    Nova and QEMU[1] supports PCI devices with a PCI address that has 16 bit
    domain. However there are hypervisors that reports PCI addresses with
    32 bit domain. While today we cannot assign these to guests this should
    not prevent the nova-compute service to start.

    This patch changes the PCI manager to ignore such PCI devices.

    Please note that this patch does not change fact that nova does not
    allow specifying PCI addresses with 32bit domain in the
    [pci]/passthrough_whitelist configuration. Such configuration is still
    rejected at nova-compute service startup.

    Closes-Bug: #1897528

    [1] https://github.com/qemu/qemu/blob/f2a1cf9180f63e88bb38ff21c169da97c3f2bad5/hw/core/qdev-properties.c#L993

    Change-Id: I59a0746b864610b6a314078cf5661d3d2b84b1d4
    (cherry picked from commit 8c9d6fc8f073cde78b79ae259c9915216f5d59b0)
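
The approach described in this commit message, ignoring host devices with a 32bit domain instead of failing, might look roughly like the sketch below. This is not the merged patch: `has_supported_domain` and `filter_host_devices` are hypothetical names, and the device dicts carry only an "address" key for illustration.

```python
import logging

LOG = logging.getLogger(__name__)

MAX_DOMAIN = 0xFFFF  # 16-bit PCI domain limit


def has_supported_domain(address):
    """True if the domain part of 'domain:bus:slot.func' fits in 16 bits."""
    return int(address.split(":")[0], 16) <= MAX_DOMAIN


def filter_host_devices(devices):
    """Drop devices with a 32bit domain before whitelist matching."""
    supported = []
    for dev in devices:
        if has_supported_domain(dev["address"]):
            supported.append(dev)
        else:
            # Skip instead of raising, so the service can still start.
            LOG.debug("Skipping PCI device %s: 32bit domain is not "
                      "supported", dev["address"])
    return supported
```

Filtering before matching keeps the strict validation for addresses written in the whitelist config, while devices merely reported by the hypervisor no longer abort startup.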

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/nova/+/791770
Committed: https://opendev.org/openstack/nova/commit/8e9859b95c537738d97ade41a8d09670de27ef8d
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 8e9859b95c537738d97ade41a8d09670de27ef8d
Author: Balazs Gibizer <email address hidden>
Date: Thu Oct 8 14:13:38 2020 +0200

    Reproduce bug 1897528

    The nova-compute fails to start if the hypervisor has PCI addresses
    32bit domain.

    Change-Id: I48dcb7faa17fe9f8346445a1746cff5845baf358
    Related-Bug: #1897528
    (cherry picked from commit 976ac722d36439d16ea4ec1bf5037c482c89ef55)
    (cherry picked from commit 0354d4d9f47354e2b4fc0b2343c27e734fe2e494)

tags: added: in-stable-ussuri