SR-IOV shared PCI numa not working

Bug #1795920 reported by Satish Patel
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: sean mooney

Bug Description

Folks,

I'm building an SR-IOV capable compute node on HP 360g8 hardware with a
QLogic interface card. The compute node has 32 cores and 32 GB of memory.

Problem:

When I launch vm-1 (with a 16 vCPU flavor) on OpenStack, it launches
successfully on NUMA node 0 and works great. But when I launch vm-2
with the same flavor, it starts and then shuts itself down within a few
seconds. In short, I am not able to launch an instance on NUMA node 1,
because my PCIe card is attached to NUMA node 0, which I can see in the
lstopo output.
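
For readers who want to confirm the same thing without lstopo: on Linux the kernel exposes the NUMA node of each PCI device in sysfs, in the file /sys/bus/pci/devices/<address>/numa_node. The following is only a minimal sketch of reading that file. The PCI address 0000:03:00.0 is a made-up placeholder, not the reporter's actual device, and the sysfs_root parameter exists only to make the function easy to test.

```python
from pathlib import Path


def pci_numa_node(pci_address: str, sysfs_root: str = "/sys") -> int:
    """Return the NUMA node the kernel reports for a PCI device.

    The kernel writes -1 when it has no affinity information for the device.
    """
    node_file = (
        Path(sysfs_root) / "bus" / "pci" / "devices" / pci_address / "numa_node"
    )
    return int(node_file.read_text().strip())


if __name__ == "__main__":
    address = "0000:03:00.0"  # hypothetical address, replace with your NIC's
    try:
        print(address, "-> NUMA node", pci_numa_node(address))
    except FileNotFoundError:
        print(address, "is not present on this host")
```

A value of 0 for the SR-IOV card would match the situation described here, where the NIC hangs off NUMA node 0.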

So at present I am going to lose half of my compute capacity if this is
a real problem, because I can't use NUMA node 1 to launch SR-IOV instances.

After some googling I found this link:
https://blueprints.launchpad.net/nova/+spec/share-pci-device-between-numa-nodes

According to this link, if I set
hw:pci_numa_affinity_policy=preferred in the flavor, it should allow me
to spin up instances across the NUMA nodes, but somehow it's not working
and I am still not able to spin up an instance (it starts the instance,
but then the instance shuts itself down).

Any idea what is wrong here?

--------------

If I remove aggregate_instance_extra_specs:pinned='true', hw:cpu_policy='dedicated', and hw:pci_numa_affinity_policy='preferred' from my flavor, then it allows me to spin up machines across NUMA nodes.

How do I make SR-IOV work with CPU pinning when the PCI device has to be shared across NUMA nodes?
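
To make the flavor setup above concrete, here is a small sketch that turns a set of extra specs into an `openstack flavor set` command line. It is illustrative only: the flavor name sriov.16vcpu is hypothetical, and the list of accepted policy values (required, legacy, preferred) is my understanding of what Nova accepted once the fix referenced later in this thread merged. At the time of the original report the flavor key was not honoured, which is what this bug is about.

```python
import shlex

# Assumption: the policy values Nova accepted when flavor support first landed.
VALID_PCI_NUMA_POLICIES = {"required", "legacy", "preferred"}


def flavor_set_command(flavor: str, extra_specs: dict) -> list:
    """Build the argv for `openstack flavor set` from a dict of extra specs."""
    policy = extra_specs.get("hw:pci_numa_affinity_policy")
    if policy is not None and policy not in VALID_PCI_NUMA_POLICIES:
        raise ValueError(f"unknown PCI NUMA affinity policy: {policy!r}")
    argv = ["openstack", "flavor", "set"]
    for key, value in sorted(extra_specs.items()):
        argv += ["--property", f"{key}={value}"]
    argv.append(flavor)
    return argv


if __name__ == "__main__":
    specs = {
        "aggregate_instance_extra_specs:pinned": "true",
        "hw:cpu_policy": "dedicated",
        "hw:pci_numa_affinity_policy": "preferred",
    }
    # "sriov.16vcpu" is a hypothetical flavor name.
    print(shlex.join(flavor_set_command("sriov.16vcpu", specs)))
```

Printing the command instead of running it keeps the sketch safe to try on a machine without OpenStack credentials.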

Changed in nova:
status: New → Confirmed
sean mooney (sean-k-mooney) wrote :

Looking at the code, I believe this is because the feature was not fully implemented.
I will work with the core team to see if this can be reproposed for the Stein release.

Satish Patel (satish-txt) wrote :

Can we backport this feature to the Queens or Pike release once it is successfully implemented in the Stein release?

Changed in nova:
assignee: nobody → sean mooney (sean-k-mooney)
status: Confirmed → In Progress
sean mooney (sean-k-mooney) wrote :

Note that as this bug is actually a feature request, we will track its completion as part of this blueprint: https://blueprints.launchpad.net/nova/+spec/sriov-numa-affinity-policy-via-flavor-and-image

Satish Patel (satish-txt) wrote :

I had the exact same issue last year when I was building an OpenStack cloud with SR-IOV, and this is what I did as a workaround:

openstack flavor set --property hw:numa_nodes=2 <flavor>

It will let you spin up a VM across NUMA nodes.
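
As a rough illustration of what this workaround asks for: with hw:numa_nodes=2 the guest gets two virtual NUMA nodes, and, as far as I understand Nova's default when no explicit per-node CPU or memory lists are given, vCPUs and memory are divided evenly between them. The sketch below only models that even split. It is not Nova code, and the 16384 MB memory figure is an assumed value, since the thread does not state the flavor's RAM.

```python
def split_guest_numa(vcpus: int, memory_mb: int, nodes: int) -> list:
    """Divide vCPUs and memory evenly across `nodes` guest NUMA cells."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if vcpus % nodes or memory_mb % nodes:
        # An uneven split would need explicit per-node settings instead.
        raise ValueError("vcpus and memory_mb must divide evenly by nodes")
    cpus_per_cell = vcpus // nodes
    mem_per_cell = memory_mb // nodes
    return [
        {
            "cell": i,
            "cpus": list(range(i * cpus_per_cell, (i + 1) * cpus_per_cell)),
            "memory_mb": mem_per_cell,
        }
        for i in range(nodes)
    ]


if __name__ == "__main__":
    # 16 vCPUs as in the report; 16384 MB is an assumed flavor size.
    for cell in split_guest_numa(16, 16384, 2):
        print(cell)
```

On a two-node host, a guest laid out this way spans both host nodes, so one of its cells sits on the node the NIC is attached to. That is presumably why the strict affinity check stops rejecting the instance.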

sean mooney (sean-k-mooney) wrote :

I am currently proposing https://blueprints.launchpad.net/nova/+spec/vm-scoped-sriov-numa-affinity,
which is addressed by this spec https://review.opendev.org/#/c/683174/ and this code https://review.opendev.org/#/c/674072/, to address this issue for Ussuri.

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/674072
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8c7224172641c6194582ca4cf7ce11e907df50aa
Submitter: Zuul
Branch: master

commit 8c7224172641c6194582ca4cf7ce11e907df50aa
Author: Sean Mooney <email address hidden>
Date: Thu Aug 1 15:00:07 2019 +0000

    support pci numa affinity policies in flavor and image

    This addresses bug #1795920 by adding support for
    defining a pci numa affinity policy via the flavor
    extra specs or image metadata properties enabling
    the policies to be applied to neutron sriov port
    including hardware offloaded ovs.

    Closes-Bug: #1795920
    Related-Bug: #1805891
    Implements: blueprint vm-scoped-sriov-numa-affinity
    Change-Id: Ibd62b24c2bd2dd208d0f804378d4e4f2bbfdaed6
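
The commit message says the policy can come from either the flavor extra specs or the image metadata properties. A plausible way to combine the two sources is sketched below. This is a simplified model and not Nova's actual implementation: the image property name hw_pci_numa_affinity_policy, the set of valid values, and the rule that conflicting flavor and image values are rejected are assumptions based on my reading of the feature.

```python
from typing import Optional

# Assumption: policy values accepted when this change merged.
VALID_POLICIES = {"required", "legacy", "preferred"}


def resolve_pci_numa_policy(flavor_specs: dict, image_props: dict) -> Optional[str]:
    """Pick the PCI NUMA affinity policy from flavor and image settings.

    Returns None when neither source sets a policy, in which case the
    deployment's default behaviour would apply.
    """
    flavor_policy = flavor_specs.get("hw:pci_numa_affinity_policy")
    image_policy = image_props.get("hw_pci_numa_affinity_policy")
    if flavor_policy and image_policy and flavor_policy != image_policy:
        raise ValueError("flavor and image request different PCI NUMA policies")
    policy = flavor_policy or image_policy
    if policy is not None and policy not in VALID_POLICIES:
        raise ValueError(f"invalid PCI NUMA affinity policy: {policy!r}")
    return policy


if __name__ == "__main__":
    print(resolve_pci_numa_policy({"hw:pci_numa_affinity_policy": "preferred"}, {}))
    print(resolve_pci_numa_policy({}, {"hw_pci_numa_affinity_policy": "required"}))
```

Rejecting a mismatch, rather than silently preferring one source, keeps a flavor and an image from disagreeing without the operator noticing.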

Changed in nova:
status: In Progress → Fix Released