Attaching SR-IOV NIC to VM fails with KeyError: 'pci_slot'

Bug #1708433 reported by Helena on 2017-08-03
This bug affects 3 people
Affects                   Importance  Assigned to
OpenStack Compute (nova)  Medium      Matt Riedemann
Pike                      Medium      Unassigned
Queens                    Medium      Matt Riedemann
Rocky                     Medium      Matt Riedemann

Bug Description

Trace back:

2017-08-03 12:03:50.064 DEBUG nova.network.os_vif_util [req-f1414ec4-6df7-46d8-9c97-f678c0f94d77 demo admin] No conversion for VIF type hw_veb yet from (pid=134902) nova_to_osvif_vif /opt/stack/nova/nova/network/os_vif_util.py:435
2017-08-03 12:03:50.119 ERROR oslo_messaging.rpc.server [req-f1414ec4-6df7-46d8-9c97-f678c0f94d77 demo admin] Exception during message handling: KeyError: 'pci_slot'
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server Traceback (most recent call last):
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 160, in _process_incoming
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 213, in dispatch
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _do_dispatch
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server result = func(ctxt, **new_args)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/exception_wrapper.py", line 76, in wrapped
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server function_name, call_dict, binary)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server self.force_reraise()
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/exception_wrapper.py", line 67, in wrapped
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server return f(self, context, *args, **kw)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/compute/manager.py", line 211, in decorated_function
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info())
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server self.force_reraise()
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/compute/manager.py", line 199, in decorated_function
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/compute/manager.py", line 5166, in attach_interface
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server network_info[0])
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 1443, in attach_interface
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server self.vif_driver.plug(instance, vif)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/virt/libvirt/vif.py", line 794, in plug
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server func(instance, vif)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/virt/libvirt/vif.py", line 650, in plug_hw_veb
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server vif['profile']['pci_slot'],
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server KeyError: 'pci_slot'
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server

Steps to recreate:
- Create a VM on the compute node of a multi-node deployment.
- Attach a direct/macvtap-bound SR-IOV port:
   openstack server add port VM1 port1

Results:
- The above traceback is found in the n-cpu service on the compute node.

Sean Dague (sdague) wrote :

is there a specific requirement on the kind of hardware used?

"No conversion for VIF type hw_veb yet" starts the stack trace, so I wonder if it's an unknown type.

tags: added: sriov
tags: added: pci
Changed in nova:
status: New → Incomplete
sean mooney (sean-k-mooney) wrote :

This was done on a Fortville (Intel XL710).
This should be a supported NIC.

A slight update on the above: we were able to boot VMs with the SR-IOV port if it was passed
on the nova boot command line, but this error occurs if we try to attach an SR-IOV port to a running VM.

My guess is that we are not populating pci_slot in the port's binding profile when we call attach, but we are when we boot with an SR-IOV port.
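
To illustrate that guess, here is a minimal sketch of the lookup that fails in the traceback. The VIF dicts and the PCI address are made-up stand-ins, and `get_pci_slot` is a hypothetical helper, not a nova function. The only thing taken from the trace is the `vif['profile']['pci_slot']` access in `plug_hw_veb`.

```python
# Illustrative sketch only: plain dicts stand in for nova's VIF model.

def lookup_like_traceback(vif):
    """Same style of access as the failing line in plug_hw_veb."""
    return vif['profile']['pci_slot']


def get_pci_slot(vif):
    """Hypothetical defensive lookup: returns None when the slot is absent."""
    profile = vif.get('profile') or {}
    return profile.get('pci_slot')


# Assumed shape of a VIF when the port is passed at boot (address is invented).
booted_vif = {'type': 'hw_veb', 'profile': {'pci_slot': '0000:05:10.1'}}

# Assumed shape on attach, if the binding profile was never populated.
attached_vif = {'type': 'hw_veb', 'profile': {}}

print(get_pci_slot(booted_vif))
print(get_pci_slot(attached_vif))

try:
    lookup_like_traceback(attached_vif)
except KeyError as exc:
    print('KeyError: %s' % exc)
```

If the guess is right, the boot path and the attach path differ only in whether the profile carries `pci_slot` by the time the VIF is plugged.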

sean mooney (sean-k-mooney) wrote :

Yes it is... I thought that was fixed literally years ago.

Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired
Andreas Karis (akaris) wrote :

Just posting a "me too" ;-)

Running into the same issue:

2018-04-05 17:39:46.933 237964 DEBUG nova.network.os_vif_util [req-36b544f4-91a6-442e-a30d-6148220d1449 d7530d1d970f48b2b19cf1f2a5289a4a 5516f95420f14e1885fde8449654a412 - - -] No conversion for VIF type hw_veb yet nova_to_osvif_vif /usr/lib/python2.7/site-packages/nova/network/os_vif_util.py:416

2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server [req-36b544f4-91a6-442e-a30d-6148220d1449 d7530d1d970f48b2b19cf1f2a5289a4a 5516f95420f14e1885fde8449654a412 - - -] Exception during message handling
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 75, in wrapped
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server function_name, call_dict, binary)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server self.force_reraise()
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 66, in wrapped
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 216, in decorated_function
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info())
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server self.force_reraise()
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-04-05 17:39:46.950 237964 ER...


melanie witt (melwitt) wrote :

Someone in the nova channel just reported this same bug running queens version 17.0.4:

[20:09:16] <sapd> Hi everyone. I got this error when attach a SR-IOV port to instance http://paste.openstack.org/show/726723/

Re-opening this bug as it sounds like it's still unsolved, at least as of queens. We need to investigate it more.

Changed in nova:
status: Expired → New
Matt Riedemann (mriedem) wrote :

Nova doesn't support hot-plugging SR-IOV ports to existing instances, see the spec:

https://review.openstack.org/#/c/139910/

The compute API should likely fail-fast rather than trying to attach these types of ports and failing in obscure ways.

melanie witt (melwitt) wrote :

Setting to Triaged to acknowledge we could use a bug fix here to fail fast in the API since attaching SRIOV NIC to existing instances is not currently supported.

Changed in nova:
importance: Undecided → Medium
status: New → Triaged
Matt Riedemann (mriedem) wrote :

I'll post a patch to make this a fast failure in the API when trying to attach sriov ports to existing instances. Support for this in the API would require a microversion to indicate when nova actually supports it (if we ever support it) and it would require RPC API version checks on the nova-compute service (and some compute drivers might not support it at all).

Fix proposed to branch: master
Review: https://review.openstack.org/591898

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Triaged → In Progress
sapd (saphi070) wrote :

I would like to develop this feature. How can I start?

Reviewed: https://review.openstack.org/591898
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=68011c40ae2ab0900674408a88f62a60a802fef7
Submitter: Zuul
Branch: master

commit 68011c40ae2ab0900674408a88f62a60a802fef7
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 15 13:33:16 2018 +0800

    Explicitly fail if trying to attach SR-IOV port

    Attaching SR-IOV ports to existing instances is not supported
    since the compute service does not perform any kind of PCI
    device allocation, so we should fail fast with a clear error
    if attempted. Note that the compute RPC API "attach_interface"
    method is an RPC call from nova-api to nova-compute so the error
    raised here will result in a 400 response to the user.

    Blueprint sriov-interface-attach-detach would need to be
    implemented to support this use case, and could arguably involve
    a microversion to indicate when the feature was made available.

    A related neutron docs patch https://review.openstack.org/594325
    is posted for mentioning the limitation with SR-IOV port attach
    as well.

    Change-Id: Ibbf2bd3cdd45bcd61eebff883c30ded525b2495d
    Closes-Bug: #1708433
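
The fail-fast idea in the commit message above can be sketched roughly as follows. This is not the code from the patch: the exception class, the function name, and the set of vnic types are assumptions made for illustration. The bug only mentions direct and macvtap ports, so `direct-physical` is included here as a guess. The port is modelled as a plain dict with neutron's `binding:vnic_type` attribute.

```python
# Hypothetical sketch of a fail-fast check; names are not taken from nova.

# vnic types treated as SR-IOV here. 'direct-physical' is an assumption;
# the bug report itself only mentions direct and macvtap ports.
SRIOV_VNIC_TYPES = frozenset(['direct', 'macvtap', 'direct-physical'])


class SriovAttachNotSupported(Exception):
    """Raised before any plugging is attempted for an SR-IOV port."""


def check_port_attachable(port):
    """Reject SR-IOV ports up front instead of failing later with KeyError."""
    vnic_type = port.get('binding:vnic_type', 'normal')
    if vnic_type in SRIOV_VNIC_TYPES:
        raise SriovAttachNotSupported(
            'Attaching SR-IOV port %s to an existing instance is not '
            'supported.' % port.get('id'))


# A normal port passes the check; a direct port is rejected immediately.
check_port_attachable({'id': 'port0', 'binding:vnic_type': 'normal'})
try:
    check_port_attachable({'id': 'port1', 'binding:vnic_type': 'direct'})
except SriovAttachNotSupported as exc:
    print(exc)
```

Raising a dedicated exception early means the caller can map it to a clear client error, rather than surfacing an unrelated KeyError from the virt driver.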

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.openstack.org/605118
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e1d55af4089fe6b76680285e36069ab0f57404ab
Submitter: Zuul
Branch: stable/rocky

commit e1d55af4089fe6b76680285e36069ab0f57404ab
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 15 13:33:16 2018 +0800

    Explicitly fail if trying to attach SR-IOV port

    Attaching SR-IOV ports to existing instances is not supported
    since the compute service does not perform any kind of PCI
    device allocation, so we should fail fast with a clear error
    if attempted. Note that the compute RPC API "attach_interface"
    method is an RPC call from nova-api to nova-compute so the error
    raised here will result in a 400 response to the user.

    Blueprint sriov-interface-attach-detach would need to be
    implemented to support this use case, and could arguably involve
    a microversion to indicate when the feature was made available.

    A related neutron docs patch https://review.openstack.org/594325
    is posted for mentioning the limitation with SR-IOV port attach
    as well.

    Change-Id: Ibbf2bd3cdd45bcd61eebff883c30ded525b2495d
    Closes-Bug: #1708433
    (cherry picked from commit 68011c40ae2ab0900674408a88f62a60a802fef7)

This issue was fixed in the openstack/nova 18.0.2 release.
