Attaching SR-IOV NIC to VM fails with KeyError: 'pci_slot'

Bug #1708433 reported by Helena
This bug affects 3 people
Affects                   Status         Importance  Assigned to     Milestone
OpenStack Compute (nova)  Fix Released   Medium      Matt Riedemann
Pike                      Fix Released   Medium      Elod Illes
Queens                    In Progress    Medium      Matt Riedemann
Rocky                     Fix Committed  Medium      Matt Riedemann

Bug Description

Traceback:

2017-08-03 12:03:50.064 DEBUG nova.network.os_vif_util [req-f1414ec4-6df7-46d8-9c97-f678c0f94d77 demo admin] No conversion for VIF type hw_veb yet from (pid=134902) nova_to_osvif_vif /opt/stack/nova/nova/network/os_vif_util.py:435
2017-08-03 12:03:50.119 ERROR oslo_messaging.rpc.server [req-f1414ec4-6df7-46d8-9c97-f678c0f94d77 demo admin] Exception during message handling: KeyError: 'pci_slot'
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server Traceback (most recent call last):
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 160, in _process_incoming
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 213, in dispatch
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _do_dispatch
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server result = func(ctxt, **new_args)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/exception_wrapper.py", line 76, in wrapped
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server function_name, call_dict, binary)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server self.force_reraise()
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/exception_wrapper.py", line 67, in wrapped
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server return f(self, context, *args, **kw)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/compute/manager.py", line 211, in decorated_function
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info())
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server self.force_reraise()
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/compute/manager.py", line 199, in decorated_function
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/compute/manager.py", line 5166, in attach_interface
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server network_info[0])
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 1443, in attach_interface
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server self.vif_driver.plug(instance, vif)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/virt/libvirt/vif.py", line 794, in plug
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server func(instance, vif)
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server File "/opt/stack/nova/nova/virt/libvirt/vif.py", line 650, in plug_hw_veb
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server vif['profile']['pci_slot'],
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server KeyError: 'pci_slot'
2017-08-03 12:03:50.119 TRACE oslo_messaging.rpc.server

Steps to recreate:
- Create a VM on the compute node of a multi-node deployment.
- Attach a direct/macvtap-bound SR-IOV port:
   openstack server add port VM1 port1

Results:
- The above traceback is found in the n-cpu service on the compute node.

Tags: pci sriov
Sean Dague (sdague) wrote :

Is there a specific requirement on the kind of hardware used?

"No conversion for VIF type hw_veb yet" starts the stack trace, so I wonder if it's an unknown type.

tags: added: sriov
tags: added: pci
Changed in nova:
status: New → Incomplete
sean mooney (sean-k-mooney) wrote :

This was done on a Fortville NIC (Intel XL710), which should be a supported NIC.

A slight update on the above: we were able to boot VMs with the SR-IOV port if it was passed
on the nova boot command line, but this error occurs if we try to attach an SR-IOV port to a running VM.

My guess is that we are not populating pci_slot in the port binding_profile when we call attach, but we are when we boot with an SR-IOV port.
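
To illustrate the guess above, here is a minimal sketch of how a VIF whose profile lacks pci_slot produces the KeyError from the traceback. The two VIF dicts and both helper functions are hypothetical stand-ins written for this example (the PCI address is made up); only the vif['profile']['pci_slot'] lookup is taken from the traceback.

```python
# Hypothetical VIF dicts modelled on the traceback; the values are invented.
vif_at_boot = {"type": "hw_veb", "profile": {"pci_slot": "0000:05:10.1"}}
vif_at_attach = {"type": "hw_veb", "profile": {}}


def plug_like_driver(vif):
    """Mimic the direct indexing seen in the traceback (may raise KeyError)."""
    return vif["profile"]["pci_slot"]


def get_pci_slot(vif):
    """Defensive lookup: return the PCI address, or None if it is absent."""
    return (vif.get("profile") or {}).get("pci_slot")
```

With these definitions, plug_like_driver(vif_at_attach) raises KeyError: 'pci_slot', while get_pci_slot(vif_at_attach) returns None so the caller can report a clearer error.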

Jan Gutter (jangutter) wrote :
sean mooney (sean-k-mooney) wrote :

Yes it is... I thought that was fixed literally years ago.

Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired
Andreas Karis (akaris) wrote :

Just posting a "me too" ;-)

Running into the same issue:

2018-04-05 17:39:46.933 237964 DEBUG nova.network.os_vif_util [req-36b544f4-91a6-442e-a30d-6148220d1449 d7530d1d970f48b2b19cf1f2a5289a4a 5516f95420f14e1885fde8449654a412 - - -] No conversion for VIF type hw_veb yet nova_to_osvif_vif /usr/lib/python2.7/site-packages/nova/network/os_vif_util.py:416

2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server [req-36b544f4-91a6-442e-a30d-6148220d1449 d7530d1d970f48b2b19cf1f2a5289a4a 5516f95420f14e1885fde8449654a412 - - -] Exception during message handling
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 75, in wrapped
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server function_name, call_dict, binary)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server self.force_reraise()
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 66, in wrapped
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw)
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 216, in decorated_function
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info())
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server self.force_reraise()
2018-04-05 17:39:46.950 237964 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-04-05 17:39:46.950 237964 ER...


melanie witt (melwitt) wrote :

Someone in the nova channel just reported this same bug running queens version 17.0.4:

[20:09:16] <sapd> Hi everyone. I got this error when attach a SR-IOV port to instance http://paste.openstack.org/show/726723/

Re-opening this bug as it sounds like it's still unsolved, at least as of queens. We need to investigate it more.

Changed in nova:
status: Expired → New
Matt Riedemann (mriedem) wrote :

Nova doesn't support hot-plugging SR-IOV ports to existing instances, see the spec:

https://review.openstack.org/#/c/139910/

The compute API should likely fail-fast rather than trying to attach these types of ports and failing in obscure ways.
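
A minimal sketch of what such a fail-fast check could look like, assuming the port is a dict carrying Neutron's binding:vnic_type attribute. The function, the exception class and the tuple of SR-IOV vnic types are hypothetical names chosen for this example; this is not the code from the eventual patch.

```python
# Illustrative only: these names are hypothetical, not nova's actual code.
SRIOV_VNIC_TYPES = ("direct", "macvtap", "direct-physical")


class AttachSRIOVPortNotSupported(Exception):
    """Raised when an SR-IOV port is attached to an existing instance."""


def check_port_attachable(port):
    """Reject SR-IOV ports up front instead of failing later in the driver."""
    vnic_type = port.get("binding:vnic_type", "normal")
    if vnic_type in SRIOV_VNIC_TYPES:
        raise AttachSRIOVPortNotSupported(
            "Attaching SR-IOV port %s (vnic_type=%s) to an existing "
            "instance is not supported." % (port.get("id"), vnic_type))
```

Checking the vnic type before any plugging is attempted means the user sees a clear message instead of the KeyError buried in the compute log.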

melanie witt (melwitt) wrote :

Setting to Triaged to acknowledge we could use a bug fix here to fail fast in the API since attaching SRIOV NIC to existing instances is not currently supported.

Changed in nova:
importance: Undecided → Medium
status: New → Triaged
Matt Riedemann (mriedem) wrote :

I'll post a patch to make this a fast failure in the API when trying to attach sriov ports to existing instances. Support for this in the API would require a microversion to indicate when nova actually supports it (if we ever support it) and it would require RPC API version checks on the nova-compute service (and some compute drivers might not support it at all).
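
As a rough sketch of the gating described above, assuming versions are plain "major.minor" strings: the feature would be offered only when the API microversion, the compute RPC version and the driver all allow it. Every name and version number here is a hypothetical placeholder, not a real nova value.

```python
# All version numbers below are hypothetical placeholders, not real nova values.
MIN_API_MICROVERSION = (2, 999)
MIN_COMPUTE_RPC_VERSION = (9, 99)


def parse_version(text):
    """Turn a 'major.minor' string into a comparable (major, minor) tuple."""
    major, minor = text.split(".")
    return int(major), int(minor)


def sriov_attach_supported(api_microversion, compute_rpc_version,
                           driver_supports_it):
    """Return True only if the API, compute service and driver all qualify."""
    return (
        parse_version(api_microversion) >= MIN_API_MICROVERSION
        and parse_version(compute_rpc_version) >= MIN_COMPUTE_RPC_VERSION
        and bool(driver_supports_it)
    )
```

Versions are compared as integer tuples rather than strings so that, for example, "2.10" sorts after "2.9".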

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/591898

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Triaged → In Progress
sapd (saphi070) wrote :

I would like to develop this feature. How can I start?

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/591898
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=68011c40ae2ab0900674408a88f62a60a802fef7
Submitter: Zuul
Branch: master

commit 68011c40ae2ab0900674408a88f62a60a802fef7
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 15 13:33:16 2018 +0800

    Explicitly fail if trying to attach SR-IOV port

    Attaching SR-IOV ports to existing instances is not supported
    since the compute service does not perform any kind of PCI
    device allocation, so we should fail fast with a clear error
    if attempted. Note that the compute RPC API "attach_interface"
    method is an RPC call from nova-api to nova-compute so the error
    raised here will result in a 400 response to the user.

    Blueprint sriov-interface-attach-detach would need to be
    implemented to support this use case, and could arguably involve
    a microversion to indicate when the feature was made available.

    A related neutron docs patch https://review.openstack.org/594325
    is posted for mentioning the limitation with SR-IOV port attach
    as well.

    Change-Id: Ibbf2bd3cdd45bcd61eebff883c30ded525b2495d
    Closes-Bug: #1708433

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/605118

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.openstack.org/605118
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e1d55af4089fe6b76680285e36069ab0f57404ab
Submitter: Zuul
Branch: stable/rocky

commit e1d55af4089fe6b76680285e36069ab0f57404ab
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 15 13:33:16 2018 +0800

    Explicitly fail if trying to attach SR-IOV port

    Attaching SR-IOV ports to existing instances is not supported
    since the compute service does not perform any kind of PCI
    device allocation, so we should fail fast with a clear error
    if attempted. Note that the compute RPC API "attach_interface"
    method is an RPC call from nova-api to nova-compute so the error
    raised here will result in a 400 response to the user.

    Blueprint sriov-interface-attach-detach would need to be
    implemented to support this use case, and could arguably involve
    a microversion to indicate when the feature was made available.

    A related neutron docs patch https://review.openstack.org/594325
    is posted for mentioning the limitation with SR-IOV port attach
    as well.

    Change-Id: Ibbf2bd3cdd45bcd61eebff883c30ded525b2495d
    Closes-Bug: #1708433
    (cherry picked from commit 68011c40ae2ab0900674408a88f62a60a802fef7)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/607729

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.2

This issue was fixed in the openstack/nova 18.0.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.0.0rc1

This issue was fixed in the openstack/nova 19.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.10

This issue was fixed in the openstack/nova 17.0.10 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.opendev.org/695408

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.opendev.org/695408
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=af5df70f617b03f86cd3dc898c889a1825630c9e
Submitter: Zuul
Branch: stable/pike

commit af5df70f617b03f86cd3dc898c889a1825630c9e
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 15 13:33:16 2018 +0800

    Explicitly fail if trying to attach SR-IOV port

    Attaching SR-IOV ports to existing instances is not supported
    since the compute service does not perform any kind of PCI
    device allocation, so we should fail fast with a clear error
    if attempted. Note that the compute RPC API "attach_interface"
    method is an RPC call from nova-api to nova-compute so the error
    raised here will result in a 400 response to the user.

    Blueprint sriov-interface-attach-detach would need to be
    implemented to support this use case, and could arguably involve
    a microversion to indicate when the feature was made available.

    A related neutron docs patch https://review.openstack.org/695409
    is posted for mentioning the limitation with SR-IOV port attach
    as well.

    Conflicts due to not having
    Ifcc327f9f97e57d3d6f0db7045b56ffe60203eb9 and
    I4440a19370da9807cc8c32b681542c7048c9977e in Pike.

    Change-Id: Ibbf2bd3cdd45bcd61eebff883c30ded525b2495d
    Closes-Bug: #1708433
    (cherry picked from commit 68011c40ae2ab0900674408a88f62a60a802fef7)
    (cherry picked from commit e1d55af4089fe6b76680285e36069ab0f57404ab)
    (cherry picked from commit 7827890421a61a4e2029939cacf4e3fecb95282a)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova pike-eol

This issue was fixed in the openstack/nova pike-eol release.
