nova: SR-IOV for PF requirements

Bug #1585777 reported by Derek Ditch
This bug affects 1 person
Affects: kolla-ansible
Status: Incomplete
Importance: Wishlist
Assigned to: Bharat Kunwar
Milestone: none

Bug Description

I needed to perform PCI passthrough with nova-compute in Mitaka. As far as I can tell, in newer kernels and/or OpenStack releases, this equates to SR-IOV using Physical Functions (PF). The older KVM PCI passthrough does not appear to be supported under Mitaka (and seems to have to be done somewhat manually anyway).

I had to overcome a lot of hurdles for this to work. Ultimately, Kolla (and also Nova!) needs better docs about how this is supposed to work in a modern release.

I put the following into `/etc/kolla/config/nova.conf`, which gets pushed to all nova containers when running `kolla-ansible reconfigure`:

```
[DEFAULT]
#debug = True
pci_alias = {"name": "capture_nic", "vendor_id": "8086", "product_id": "10fb", "device_type": "type-PF"}
pci_passthrough_whitelist = [{"vendor_id": "8086", "product_id": "10fb", "address": "07:00.*"}]

scheduler_available_filters=nova.scheduler.filters.all_filters
scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,CoreFilter,PciPassthroughFilter
```

I specified the address since I had more than one device on the system that matched that vendor_id and product_id. One could put only the address, but I was explicit for docs purposes. If you have different devices to pass on different hosts, you should put the relevant config in `/etc/kolla/config/nova/{{ inventory_hostname }}/nova.conf`.
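Since both options take JSON values, a quick sanity check before running `kolla-ansible reconfigure` can catch quoting mistakes. The following is only a rough sketch: the `matches` and `whitelisted` helpers are hypothetical, the glob comparison via `fnmatch` is a simplification and not how Nova itself evaluates the whitelist, and it assumes the whitelist address is written without a PCI domain prefix, as in the config above.

```python
import json
from fnmatch import fnmatch

# Values copied from the nova.conf snippet above.
ALIAS = '{"name": "capture_nic", "vendor_id": "8086", "product_id": "10fb", "device_type": "type-PF"}'
WHITELIST = '[{"vendor_id": "8086", "product_id": "10fb", "address": "07:00.*"}]'


def matches(entry, device):
    """Return True if every key in the whitelist entry matches the device."""
    for key, wanted in entry.items():
        actual = device.get(key, "")
        if key == "address":
            # Drop an optional PCI domain prefix such as "0000:" before globbing.
            if actual.count(":") == 2:
                actual = actual.split(":", 1)[1]
            if not fnmatch(actual, wanted):
                return False
        elif actual.lower() != wanted.lower():
            return False
    return True


def whitelisted(whitelist_json, device):
    """Return True if any whitelist entry matches the device."""
    return any(matches(entry, device) for entry in json.loads(whitelist_json))


alias = json.loads(ALIAS)  # raises ValueError if the JSON is malformed
# Hypothetical devices: same vendor/product, different bus addresses.
nic = {"vendor_id": "8086", "product_id": "10fb", "address": "0000:07:00.0"}
other = {"vendor_id": "8086", "product_id": "10fb", "address": "0000:81:00.0"}
print(alias["device_type"], whitelisted(WHITELIST, nic), whitelisted(WHITELIST, other))
```

The second device shares the vendor and product IDs but sits at a different address, which is the situation that made the explicit `address` key necessary here.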

This enabled the scheduler to actually find my device (the key was `device_type`). Once the request got to the compute host, however, IOMMU was not enabled, even though I had followed the KVM VT-d guides referenced on the OpenStack PCI passthrough wiki page. I found a great walkthrough here: https://bugzilla.redhat.com/attachment.cgi?id=1020593. The key was to add `intel_iommu=on` to the kernel command line.
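To confirm on a compute host that the setting took effect after a reboot, something like the sketch below may help. It assumes a Linux host and reads the standard `/proc/cmdline` and `/sys/kernel/iommu_groups` locations; the helper names are made up for illustration, and an empty group list is only a hint that the IOMMU is inactive, not proof.

```python
import os


def read_cmdline(path="/proc/cmdline"):
    """Return the running kernel's command line, or "" if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


def iommu_requested(cmdline):
    """True if intel_iommu=on appears as a separate kernel argument."""
    return "intel_iommu=on" in cmdline.split()


def iommu_group_count(root="/sys/kernel/iommu_groups"):
    """Number of IOMMU groups the kernel exposes; 0 if the directory is missing."""
    try:
        return len(os.listdir(root))
    except OSError:
        return 0


if __name__ == "__main__":
    print("intel_iommu=on on command line:", iommu_requested(read_cmdline()))
    print("IOMMU groups found:", iommu_group_count())
```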

Lastly, libvirt required access to the device through a device node under `/dev/vfio/`. After a couple of iterations, I added "/dev:/dev" in `ansible/roles/nova/tasks/start_compute.yml` for both the nova-libvirt and the nova-compute containers. This is probably overkill, but it works for now. We can probably narrow this scope.
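As a starting point for narrowing the scope, the sketch below works out which nodes under `/dev/vfio/` correspond to a given PCI device by following the `iommu_group` symlink in sysfs. The `vfio_nodes` helper and the address `0000:07:00.0` are illustrative assumptions, and whether exposing only these nodes is sufficient for the nova-libvirt and nova-compute containers has not been verified here.

```python
import os


def vfio_nodes(pci_address, sysfs_root="/sys", dev_root="/dev"):
    """Return the VFIO container node and the group node for a PCI device.

    Raises LookupError if the device has no iommu_group link, which usually
    means the device does not exist or the IOMMU is not enabled.
    """
    link = os.path.join(sysfs_root, "bus/pci/devices", pci_address, "iommu_group")
    if not os.path.islink(link):
        raise LookupError("no IOMMU group for " + pci_address)
    # The symlink target ends in the numeric group ID.
    group = os.path.basename(os.path.realpath(link))
    return [
        os.path.join(dev_root, "vfio", "vfio"),
        os.path.join(dev_root, "vfio", group),
    ]


if __name__ == "__main__":
    try:
        # Hypothetical address matching the "07:00.*" whitelist entry.
        print(vfio_nodes("0000:07:00.0"))
    except LookupError as exc:
        print(exc)
```

The group node typically appears only once a device in that group is bound to `vfio-pci`, so mounting the `/dev/vfio` directory as a whole, rather than individual nodes, may be the more practical middle ground.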

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/322334

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/322335

Swapnil Kulkarni (coolsvap-deactivatedaccount) wrote :

Please provide appropriate justification for it to be part of stable releases. We can then cherrypick the change to mitaka.

Changed in kolla:
assignee: nobody → Derek Ditch (derek-ditch+launchpad)
importance: Undecided → Wishlist
milestone: none → newton-1
no longer affects: kolla/mitaka
Derek Ditch (dcode) wrote :

Justification: SR-IOV passthrough is a feature available in the upstream Mitaka release. Without this patch, this functionality cannot be achieved using Kolla. Additionally, there is conflicting documentation on how to perform SR-IOV passthrough, because the way Nova handles it has changed over time. The proposed documentation gives the user background on the concepts and explains how to accomplish it using Kolla.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on kolla (master)

Change abandoned by Derek Ditch (<email address hidden>) on branch: master
Review: https://review.openstack.org/325440
Reason: Not needed for backport. Pending commit to master.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on kolla (stable/mitaka)

Change abandoned by Derek Ditch (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/322334
Reason: Abandon the backport of docs.

Changed in kolla:
status: New → Triaged
Changed in kolla:
milestone: newton-1 → newton-2
Changed in kolla:
milestone: newton-2 → newton-3
Changed in kolla:
milestone: newton-3 → ocata-1
Changed in kolla:
milestone: ocata-1 → ocata-2
Changed in kolla:
milestone: ocata-2 → ocata-3
Changed in kolla:
milestone: ocata-3 → ocata-rc1
Changed in kolla:
milestone: ocata-rc1 → pike-1
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Dave Walker (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/322335
Reason: Needs to hit master first. Not touched in 9 months.

Changed in kolla:
milestone: pike-2 → pike-3
Changed in kolla:
milestone: pike-3 → pike-rc1
Changed in kolla:
milestone: pike-rc1 → queens-1
Changed in kolla:
milestone: queens-2 → queens-3
Charlie Kang (charlie-kang) wrote :

This review could resolve this, as it adds documentation for enabling SR-IOV in Neutron:

https://review.openstack.org/#/c/498112/

Changed in kolla:
milestone: queens-3 → queens-rc1
Changed in kolla:
milestone: queens-rc1 → queens-rc2
Changed in kolla:
milestone: queens-rc2 → rocky-1
Changed in kolla:
milestone: rocky-2 → rocky-3
Radosław Piliszek (yoctozepto) wrote :

Hi Bharat, could you advise whether this is still an issue that needs fixing?

Changed in kolla:
status: Triaged → Incomplete
affects: kolla → kolla-ansible
Changed in kolla-ansible:
milestone: rocky-3 → none
assignee: Derek Ditch (derek-ditch+launchpad) → Bharat Kunwar (brtknr)