SR-IOV shared PCI numa not working

Bug #1795920 reported by Satish Patel on 2018-10-03
This bug affects 1 person
Affects: OpenStack Compute (nova)
Importance: Undecided
Assigned to: sean mooney

Bug Description

Folks,

I'm building an SR-IOV-capable compute node on HP 360g8 hardware with a
QLogic interface card. The compute node has 32 cores and 32GB of memory.

Problem:

When I launch vm-1 (with a 16-vCPU flavor) on OpenStack, it launches
successfully on the numa0 node and works great. But when I launch vm-2
with the same flavor, it starts and then shuts itself down within a few
seconds. In short, I am not able to launch an instance on the numa1 node
because my PCIe device is attached to the numa0 node, which I can see in
the lstopo output.
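
For anyone reproducing this, the NIC's NUMA placement can also be read from sysfs rather than from lstopo. Below is a minimal sketch, assuming a Linux host that exposes /sys/class/net/<iface>/device/numa_node for the PCI device; the function name pci_numa_node and its sysfs_root parameter are made up for illustration and are not part of any OpenStack tool.

```python
from pathlib import Path
from typing import Optional


def pci_numa_node(iface: str, sysfs_root: str = "/sys") -> Optional[int]:
    """Return the host NUMA node of the PCI device behind `iface`.

    Returns None when the sysfs file is missing or unreadable, or when
    the kernel reports a negative value (no NUMA information).
    """
    path = Path(sysfs_root) / "class" / "net" / iface / "device" / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None
    return node if node >= 0 else None
```

Calling pci_numa_node with the name of the SR-IOV physical function's interface on the compute node should return 0 on a host like the one described here, where the card hangs off numa0.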

So at present I am going to lose half of my compute capacity if this is
a real problem, because I can't use numa1 to launch SR-IOV instances.

After some googling I found this link:
https://blueprints.launchpad.net/nova/+spec/share-pci-device-between-numa-nodes

According to this link, if I set
hw:pci_numa_affinity_policy=preferred in the flavor, it should allow me
to spin up instances across NUMA nodes. But somehow it's not working,
and I am still not able to spin up an instance (the instance starts but
then shuts itself down).
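
To make the expected behaviour concrete, here is a simplified model of the three PCI NUMA affinity policies (required, legacy, preferred) as I understand them from the Nova documentation. It is a sketch only and not Nova's actual code: device_usable is a hypothetical helper, and the real scheduler and PCI manager consider much more than this.

```python
from typing import Optional, Set

POLICIES = ("required", "legacy", "preferred")


def device_usable(policy: str,
                  device_node: Optional[int],
                  instance_nodes: Set[int]) -> bool:
    """Decide whether a PCI device may be given to an instance.

    device_node is the host NUMA node of the device (None if unknown);
    instance_nodes are the host NUMA nodes the instance is placed on.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    if policy == "preferred":
        # Best effort: affinity is desirable but never blocks the device.
        return True
    if device_node is None:
        # Only 'legacy' tolerates devices without NUMA information.
        return policy == "legacy"
    # 'required' and 'legacy' both insist on a matching NUMA node.
    return device_node in instance_nodes
```

In this model, a NIC on numa0 and an instance pinned to numa1 gives device_usable("legacy", 0, {1}) == False, while device_usable("preferred", 0, {1}) == True, which is the behaviour I was expecting from the flavor property.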

Any idea what is wrong here?

--------------

If I remove aggregate_instance_extra_specs:pinned='true', hw:cpu_policy='dedicated', and hw:pci_numa_affinity_policy='preferred' from my flavor, then it allows me to spin up machines across NUMA nodes.

How do I make SR-IOV work with CPU pinning when the PCI device has to be shared across NUMA nodes?

Changed in nova:
status: New → Confirmed
sean mooney (sean-k-mooney) wrote :

Looking at the code, I believe this is because the feature was not fully implemented.
I will work with the core team to see if this can be reproposed for the Stein release.

Satish Patel (satish-txt) wrote :

Can we backport this feature to the Queens or Pike release once it is successfully implemented in the Stein release?

Changed in nova:
assignee: nobody → sean mooney (sean-k-mooney)
status: Confirmed → In Progress
sean mooney (sean-k-mooney) wrote :

Note that as this bug is actually a feature request, we will track its completion as part of this blueprint: https://blueprints.launchpad.net/nova/+spec/sriov-numa-affinity-policy-via-flavor-and-image

Satish Patel (satish-txt) wrote :

I had the exact same issue last year when I was building an OpenStack cloud with SR-IOV, and this is what I did as a workaround:

openstack flavor set --property hw:numa_nodes=2 <flavor>

It will let you spin up VMs across NUMA nodes.
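
A rough sketch of why this helps, assuming the guest's vCPUs are divided evenly across the requested guest NUMA nodes when no explicit hw:numa_cpus.N mapping is given, and that each guest node lands on a different host node. split_vcpus is a hypothetical helper for illustration, not a Nova function.

```python
from typing import List


def split_vcpus(vcpus: int, numa_nodes: int) -> List[List[int]]:
    """Divide guest vCPU ids evenly across guest NUMA nodes."""
    if numa_nodes < 1 or vcpus % numa_nodes != 0:
        raise ValueError("vCPU count must divide evenly across NUMA nodes")
    per_node = vcpus // numa_nodes
    return [
        list(range(node * per_node, (node + 1) * per_node))
        for node in range(numa_nodes)
    ]
```

With the 16-vCPU flavor from this report, split_vcpus(16, 2) yields two groups of 8 vCPUs, so each instance needs only 8 pinned cores per host node and one of its guest nodes can sit on numa0 next to the NIC.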

sean mooney (sean-k-mooney) wrote :

I am currently proposing https://blueprints.launchpad.net/nova/+spec/vm-scoped-sriov-numa-affinity
which is addressed by this spec https://review.opendev.org/#/c/683174/ and this code https://review.opendev.org/#/c/674072/ to address this issue for Ussuri.

Reviewed: https://review.opendev.org/674072
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8c7224172641c6194582ca4cf7ce11e907df50aa
Submitter: Zuul
Branch: master

commit 8c7224172641c6194582ca4cf7ce11e907df50aa
Author: Sean Mooney <email address hidden>
Date: Thu Aug 1 15:00:07 2019 +0000

    support pci numa affinity policies in flavor and image

    This addresses bug #1795920 by adding support for
    defining a pci numa affinity policy via the flavor
    extra specs or image metadata properties enabling
    the policies to be applied to neutron sriov port
    including hardware offloaded ovs.

    Closes-Bug: #1795920
    Related-Bug: #1805891
    Implements: blueprint vm-scoped-sriov-numa-affinity
    Change-Id: Ibd62b24c2bd2dd208d0f804378d4e4f2bbfdaed6
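
As an illustration of what "via the flavor extra specs or image metadata properties" could look like in practice, here is a minimal sketch of picking the effective policy from the two sources. It assumes the flavor key hw:pci_numa_affinity_policy and the image property hw_pci_numa_affinity_policy, and it assumes that conflicting values are rejected rather than one silently winning. resolve_pci_numa_policy is a hypothetical name, and this is not the code from the commit.

```python
from typing import Mapping, Optional

VALID_POLICIES = ("required", "legacy", "preferred")


def resolve_pci_numa_policy(flavor_extra_specs: Mapping[str, str],
                            image_properties: Mapping[str, str]) -> Optional[str]:
    """Return the requested PCI NUMA affinity policy, or None if unset."""
    flavor_policy = flavor_extra_specs.get("hw:pci_numa_affinity_policy")
    image_policy = image_properties.get("hw_pci_numa_affinity_policy")
    # Assumption: a mismatch between flavor and image is an error.
    if flavor_policy and image_policy and flavor_policy != image_policy:
        raise ValueError("flavor and image request different policies")
    policy = flavor_policy or image_policy
    if policy is not None and policy not in VALID_POLICIES:
        raise ValueError(f"invalid policy: {policy!r}")
    return policy
```

For the flavor in this report, resolve_pci_numa_policy({"hw:pci_numa_affinity_policy": "preferred"}, {}) returns "preferred"; with neither source set it returns None, leaving the choice to whatever default the deployment applies.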

Changed in nova:
status: In Progress → Fix Released