[RFE] SR-IOV agent depends on mac addresses for getting bound ports

Bug #1841067 reported by Adrian Chiris on 2019-08-22
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Medium
Adrian Chiris

Bug Description

SR-IOV agent depends on the administrative MAC address of the VF to determine which ports are managed by it.
this dependency should be removed as it relies on nova virt drivers to set the administrative mac address.
For macvtap ports, setting the administrative mac address is not necessary but neutron requires it.

a recent cleanup in Nova[1] caused VMs with macvtap port to not spawn as SR-IOV agent did not recognize the port as being configured.

A revert was proposed[2], but the long term solution should be in Neutron.

It should be noted that this in not a new issuem, but rather a historic design/implementation decision of SR-IOV agent.

This can be regarded as a (very old bug) exposed by the recent cleanup in nova or as an enhancement to the existing code that removes the dependency of MAC address from the agent.

[1]https://review.opendev.org/#/c/666631/
[2]https://review.opendev.org/#/c/675776/

tags: added: sriov-pci-pt
Changed in neutron:
status: New → Confirmed
importance: Undecided → Medium
Akihiro Motoki (amotoki) on 2019-08-27
tags: added: nova
Akihiro Motoki (amotoki) wrote :
tags: added: rfe
summary: - SR-IOV agent depends on mac addresses for getting bound ports
+ [RFE] SR-IOV agent depends on mac addresses for getting bound ports
Akihiro Motoki (amotoki) wrote :

Marked as RFE per discussion in the neutron meeting

Akihiro Motoki (amotoki) on 2019-08-27
tags: added: rfe-confirmed
removed: rfe

Hello:

First of all, let me set some limitations and conditions:
- We still support kernels < 3.13. That means "upper_macvtap" shouldn't be in the FS.
- As described in the bug, we need to remove the macvtap dependency to an administrative MAC in the VF under the macvtap NIC.

I think [1] goes in the good direction, with the exception of relaying on the FS to retrieve the macvtap, as done in the upper patch [2]. We still can't still check always the macvtap interface from the FS. Apart from this, I think the patch is correct:
- When retrieving the PCI device info ("get_pci_device"), there is a double check to try first to retrieve the VF information and then, if the first fails, the macvtap info.
- The code split between VF and macvtap ("is_assigned_vf_direct", "is_assigned_vf_macvtap") IMO is a good idea.
- In the macvtap leaf, we retrieve the device MAC in a different way by reading the macvtap device MAC instead of the VF mac (the main problem here!).

IMO, we should restore this patch, rebase it on top of master and continue working on it.

[1] https://review.opendev.org/#/c/676713/
[2] https://review.opendev.org/#/c/677095

Adrian Chiris (adrian.chiris) wrote :

So, the end goal in this RFE is to remove sriov agent dependency on mac addresses entirely.
This will require some changes in neutron server as today mac address is required when agent wants to get device details.

IMO, the agent should rely on PCI address (and request details from neutron server based on PCI address) instead of MAC address.

As an intermediate step, since macvtap is broken today, we can either:

1. revert the commit in nova that exposed the issue [1]
2. move forward with the change in SR-IOV agent [2]

IMO we should do 1. and pursue the long term solution as this RFE suggests.
however if the community prefers 2. i'm OK with reworking patch[2].

In regards to the kernel <3.13 limitation, the only issue today is with CentOS 7, since CentOS 8 is around the corner, for U release we will not have this limitation as according to TC OpenStack needs to support the latest major distribution[3].

On a side note, once this limitation is removed, we can significantly reduce the amount of calls to 'ip link show' as proposed by [4], but this change is not related to this RFE.

I will be in Shanghai for the summit, we can maybe discuss this briefly there.

[1] https://review.opendev.org/#/c/675776/
[2] https://review.opendev.org/#/c/676713
[3] https://governance.openstack.org/tc/reference/project-testing-interface.html#linux-distributions
[4] https://review.opendev.org/#/c/677095

Slawek Kaplonski (slaweq) wrote :

Thx Rodolfo and Adrian to look at this. Based on Your comments and on the fact that we are now in Ussuri cycle already I think that we are good to go with https://review.opendev.org/#/c/677095.
So lets mark this RFE as triaged and discuss it on one of the next drivers meetings.

tags: added: rfe-triaged
removed: rfe-confirmed

Hello:

Once https://review.opendev.org/#/c/677095 is merged, we need to add an upgrade check. This upgrade check, if the SR-IOV agent is present, should check the kernel version and send a warning in case kernel <3.13.

That was discussed on 15/11/2019, during the drivers meeting.

Regards.

Slawek Kaplonski (slaweq) wrote :

We discussed this RFE on today's drivers meeting (http://eavesdrop.openstack.org/meetings/neutron_drivers/2019/neutron_drivers.2019-11-15-14.01.log.html#l-19) and we approved it.
So You can go with implementation of this RFE now :)
Please just note that we were discussing that some release note and upgrade check to warn administrators about minumum required kernel version should be added to neutron too.

I think that upgrade check can check if there are any SR-IOV agents in deployment and if there are any, than print warning about minimum required version of kernel on nodes.

tags: added: rfe-approved
removed: rfe-triaged
Adrian Chiris (adrian.chiris) wrote :

Slawek & Rodolfo, thx for discussing this RFE in the driver's meeting.
Unfortunately i wasn't able to attend.

I went over the IRC discussions, So I will ping you guys about adding an upgrade check for [1]
Also i have restored [2], will try to get some bandwidth to work on it as most of the work was done already.

[1] https://review.opendev.org/#/c/677095
[2] https://review.opendev.org/#/c/676713/

Changed in neutron:
milestone: none → ussuri-1
Adrian Chiris (adrian.chiris) wrote :

Got some bandwidth, Will work on [1] during December :)

[1] https://review.opendev.org/#/c/676713/

Reviewed: https://review.opendev.org/694757
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=a8b5afa41ec8ccda1ace6ebb85bf94c8a9ea829a
Submitter: Zuul
Branch: master

commit a8b5afa41ec8ccda1ace6ebb85bf94c8a9ea829a
Author: Adrian Chiris <email address hidden>
Date: Mon Nov 18 14:13:16 2019 +0200

    Add upgrade check for NIC Switch agent

    This commit adds an upgrade check for NIC switch agent
    alerting operators to ensure relevant hosts have kernel >= 3.13.

    This check is introduced due to recent cleanup in NIC switch
    agent code.

    As of U release, Supported CentOS major was updated to 8 so it
    is not expected that any deployment will have hosts running
    kernel < 3.13. This check is added as an extra precaution.

    This check should be removed in 1-2 cycles. A TODO was added to
    reflect that.

    Change-Id: I34003b3c2f0ac23185d036c9e47dc1c8662d6ce7
    Related-bug: #1841067

Changed in neutron:
assignee: nobody → Adrian Chiris (adrian.chiris)
milestone: ussuri-1 → ussuri-2
Adrian Chiris (adrian.chiris) wrote :

So,
the PS that where proposed [1][2][3] rids SR-IOV agent from depending on administrative mac address set by Nova for macvtap ports.

the final step would be to remove the mac address dependency altogether.

[1] https://review.opendev.org/#/c/677095
[2] https://review.opendev.org/694757
[3] https://review.opendev.org/#/c/676713/

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers