[sriov] Unable to change the VF state for i350 interface

Bug #1934957 reported by Liu Xie
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
neutron | Won't Fix | Low | Unassigned |

Bug Description

When sriov-nic-agent sets the VF link state, the following exception is logged:
2021-07-08 06:15:47.773 34 DEBUG oslo.privsep.daemon [-] privsep: Exception during request[139820149013392]: Operation not supported on interface eno4, namespace None. _process_cmd /usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py:490
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 263, in _run_iproute_link
    return ip.link(command, index=idx, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pyroute2/iproute/linux.py", line 1360, in link
    msg_flags=msg_flags)
  File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 376, in nlm_request
    return tuple(self._genlm_request(*argv, **kwarg))
  File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 869, in nlm_request
    callback=callback):
  File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 379, in get
    return tuple(self._genlm_get(*argv, **kwarg))
  File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 704, in get
    raise msg['header']['error']
pyroute2.netlink.exceptions.NetlinkError: (95, 'Operation not supported')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 485, in _process_cmd
    ret = func(*f_args, **f_kwargs)
  File "/usr/local/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 249, in _wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 403, in set_link_vf_feature
    return _run_iproute_link("set", device, namespace=namespace, vf=vf_config)
  File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 265, in _run_iproute_link
    _translate_ip_device_exception(e, device, namespace)
  File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 237, in _translate_ip_device_exception
    namespace=namespace)
neutron.privileged.agent.linux.ip_lib.InterfaceOperationNotSupported: Operation not supported on interface eno4, namespace None.
2021-07-08 06:15:47.773 34 DEBUG oslo.privsep.daemon [-] privsep: reply[139820149013392]: (5, 'neutron.privileged.agent.linux.ip_lib.InterfaceOperationNotSupported', ('Operation not supported on interface eno4, namespace None.',)) _call_back /usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py:511
2021-07-08 06:15:47.774 24 WARNING neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-661d08fb-983f-4632-9eb4-91585a557753 - - - - -] Device fa:16:3e:66:e4:91 does not support state change: neutron.privileged.agent.linux.ip_lib.InterfaceOperationNotSupported: Operation not supported on interface eno4, namespace None.

However, VM network traffic is unaffected. We use an i350 interface, and I found these discussions about the i350 [1][2]. Since this exception has no impact on VM traffic, perhaps it can be ignored when the interface is an i350.

[1]https://sourceforge.net/p/e1000/bugs/653/
[2]https://community.intel.com/t5/Ethernet-Products/On-SRIOV-interface-I350-unable-to-change-the-VF-state-from-auto/td-p/704769

version:
neutron-sriov-nic-agent version 17.1.3

Bence Romsics (bence-romsics) wrote :

Thank you for your bug report.

I believe this is a reasonable request. However, since the real bug (or missing feature) is in the igb/igbvf drivers, I would suggest a workaround that is the least intrusive for users of other drivers. For example (just thinking out loud):

Maybe introduce a new config option such as suppress_igbvf_vf_state_enable_error. It would default to False, in which case the agent would behave as it does today. When set to True, sriov-agent would ignore the exception and proceed.
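To make the proposal concrete, here is a minimal sketch of the suppress-on-flag behaviour. It is not Neutron code: `set_vf_state`, its `set_state` callable and the `suppress_unsupported` argument are hypothetical names, and the exception class is a local stand-in for `neutron.privileged.agent.linux.ip_lib.InterfaceOperationNotSupported` seen in the traceback.

```python
import logging

LOG = logging.getLogger(__name__)


class InterfaceOperationNotSupported(Exception):
    """Local stand-in for the Neutron exception of the same name."""


def set_vf_state(set_state, device, vf_index, state,
                 suppress_unsupported=False):
    """Apply a VF link state, optionally ignoring "not supported" errors.

    set_state is any callable taking (device, vf_index, state); it is an
    assumption of this sketch, not an existing Neutron function.
    Returns True if the state was applied, False if the error was suppressed.
    """
    try:
        set_state(device, vf_index, state)
    except InterfaceOperationNotSupported:
        if not suppress_unsupported:
            # Default behaviour: propagate the error, as the agent does today.
            raise
        LOG.warning("Device %s VF %s does not support state change; "
                    "ignoring as configured", device, vf_index)
        return False
    return True
```

With the flag left at its default, the error still propagates, so users of other drivers would see no change in behaviour.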

Of course, if the driver version is available in a distro-independent way, using it is another option. If it is not available, we could not tell when to stop ignoring the exception once the bug is fixed in the driver.
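As a rough illustration of the driver lookup, the sketch below reads the driver name and module version from sysfs. The helper names and the overridable root paths are hypothetical. It assumes the usual Linux layout where `/sys/class/net/<dev>/device/driver` is a symlink to the bound driver. `/sys/module/<driver>/version` exists only when the module declares a version, so it may well be absent, which is exactly the uncertainty described above.

```python
import os


def get_driver_name(device, sysfs_root="/sys/class/net"):
    """Return the kernel driver bound to a network device, or None."""
    link = os.path.join(sysfs_root, device, "device", "driver")
    try:
        # The symlink target ends with the driver name, e.g. ".../drivers/igb".
        return os.path.basename(os.readlink(link))
    except OSError:
        return None


def get_module_version(driver, module_root="/sys/module"):
    """Return the module's declared version string, or None if not exposed."""
    path = os.path.join(module_root, driver, "version")
    try:
        with open(path) as version_file:
            return version_file.read().strip()
    except OSError:
        return None
```

A caller could combine the two, for example applying the workaround only when `get_driver_name("eno4")` returns `"igb"`. Whether a version string is reliable enough to gate on would still need checking per distro.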

What do you think? Do you maybe want to upload a fix?

Changed in neutron:
status: New → Triaged
importance: Undecided → Low
Slawek Kaplonski (slaweq) wrote :

We discussed this issue in our team meeting today: https://meetings.opendev.org/meetings/networking/2021/networking.2021-07-20-14.00.log.html
Our conclusion is that this is a bug in Intel's driver and we shouldn't try to fix it or work around it in Neutron. It should be fixed in the driver's code, so I'm going to close this bug.

Changed in neutron:
status: Triaged → Won't Fix