neutron-sriov-agent failing to start

Bug #1903638 reported by Michał Ajduk
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
charm-ovn-chassis | Fix Released | High | Unassigned |

Bug Description

The charm configures only the OVN mechanism driver for SR-IOV, which requires capability=switchdev and a readable /sys/class/net/<iface>/phys_switch_id. The Intel X710 driver does not provide a readable phys_switch_id.

This driver requires switchdev mode for the port types direct (SR-IOV VF passthrough) and direct-physical (PF passthrough):
/usr/lib/python3/dist-packages/neutron/common/ovn/constants.py:
EXTERNAL_PORT_TYPES = (portbindings.VNIC_DIRECT,
                       portbindings.VNIC_DIRECT_PHYSICAL,
                       portbindings.VNIC_MACVTAP)

/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py:761
        if (vnic_type in ovn_const.EXTERNAL_PORT_TYPES and
                ovn_const.PORT_CAP_SWITCHDEV not in capabilities):
            LOG.debug("Refusing to bind port due to unsupported vnic_type: %s "
                      "with no switchdev capability", vnic_type)
            return
However, the X710 driver does not support this mode on kernels 4.15 and 5.4.0-52, with X710 driver versions 2.1.14-k and 2.8.20-k (HWE). Switchdev mode requires this file to be readable:
    pf_path = "/sys/class/net/%s" % pf_ifname
    pf_sw_id_file = os.path.join(pf_path, "phys_switch_id")

However, this operation is not supported in the configurations mentioned above:
cat: /sys/class/net/ens3f0/phys_switch_id: Operation not supported
cat: /sys/class/net/ens3f1/phys_switch_id: Operation not supported
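
The same check can be sketched in Python. This is a minimal illustration, not Neutron's code: the function name `phys_switch_id` and the `sysfs_root` parameter are made up here so the check can also be run against a scratch directory. It treats any `OSError` (including "Operation not supported") as "no switchdev".

```python
import os


def phys_switch_id(ifname, sysfs_root="/sys/class/net"):
    """Return the interface's phys_switch_id, or None if it cannot be read.

    Hypothetical helper: the name and the sysfs_root parameter are
    assumptions for illustration only.
    """
    path = os.path.join(sysfs_root, ifname, "phys_switch_id")
    try:
        with open(path) as handle:
            value = handle.read().strip()
    except OSError:
        # Covers a missing file as well as a read() that the driver
        # rejects, as in the cat output above.
        return None
    # An empty file is treated the same as an unreadable one.
    return value or None
```

On the hosts described above, `phys_switch_id("ens3f0")` would be expected to return `None`, since the read fails.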

Legacy SR-IOV mode is the only mode supported by sriovnicswitch:
/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/mech_sriov/mech_driver/mech_driver.py
        if (vnic_type == portbindings.VNIC_DIRECT and
                'switchdev' in capabilities):
            LOG.debug("Refusing to bind due to unsupported vnic_type: %s "
                      "with switchdev capability", portbindings.VNIC_DIRECT)
            return
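
To make the contrast between the two quoted checks explicit, here is a small standalone sketch. It models only the two `if` conditions quoted above, using plain strings instead of the `portbindings` constants. The function names are hypothetical, and the real drivers perform other checks that are not modelled here.

```python
# String values standing in for the portbindings constants.
VNIC_DIRECT = "direct"
VNIC_DIRECT_PHYSICAL = "direct-physical"
VNIC_MACVTAP = "macvtap"
EXTERNAL_PORT_TYPES = (VNIC_DIRECT, VNIC_DIRECT_PHYSICAL, VNIC_MACVTAP)
PORT_CAP_SWITCHDEV = "switchdev"


def ovn_refuses(vnic_type, capabilities):
    # Mirrors the quoted OVN check: an external port type without the
    # switchdev capability is refused.
    return (vnic_type in EXTERNAL_PORT_TYPES and
            PORT_CAP_SWITCHDEV not in capabilities)


def sriovnicswitch_refuses(vnic_type, capabilities):
    # Mirrors the quoted sriovnicswitch check: a direct port with the
    # switchdev capability is refused.
    return (vnic_type == VNIC_DIRECT and
            PORT_CAP_SWITCHDEV in capabilities)
```

Under this simplified model, a `direct` port without the switchdev capability is refused by the first check and accepted by the second, which is the case that applies to the X710 here.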

Unfortunately, the charm does not start the Neutron SR-IOV agent and uses only the OVN mechanism driver, so SR-IOV cannot be used.

Revision history for this message
Frode Nordahl (fnordahl) wrote :

Switchdev mode is used for hardware offload.

Regular SR-IOV is also supported by the charm by enabling a configuration option and adding an optional relation.

Please refer to the documentation [0] for more information.

0: https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-ovn.html#sr-iov-for-networking-support

Changed in charm-ovn-chassis:
status: New → Incomplete
Michał Ajduk (majduk)
description: updated
Michał Ajduk (majduk)
summary: - SR-IOV native OVN driver does not work on Intel X710
+ neutron-sriov-agent failing to start
Michał Ajduk (majduk) wrote :

Thank you for pointing to the documentation. I used this documentation in the first place. I have updated the wording, though: the bottom line is that the neutron-sriov-agent is not started, although it is installed.

My charm configuration is below:
  pci-passthrough-whitelist: &pci-passthrough-whitelist '[{ "devname": "ens3f0", "physical_network": "sriovfabric"},{ "devname": "ens3f1", "physical_network": "sriovfabric"}]'
  sriov-numvfs: &sriov-numvfs "ens3f0:64 ens3f0:64"
  sriov-device-mappings: &sriov-device-mappings "sriovfabric:ens3f0 sriovfabric:ens3f1"
  sriov-bridge-mappings: &sriov-bridge-mappings "dcfabric:br-data sriovfabric:br-data"
  sriov-data-port: &sriov-data-port "br-data:bond1"

  ovn-chassis-sriov:
    charm: cs:ovn-chassis
    num_units: 0
    bindings:
      "": *oam-space
      data: *overlay-space
      certificates: *internal-space
    options:
      ovn-bridge-mappings: *sriov-bridge-mappings
      bridge-interface-mappings: *sriov-data-port
      enable-sriov: true
      sriov-device-mappings: *sriov-device-mappings
      sriov-numvfs: *sriov-numvfs
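
As a side check on the values above, the following sketch compares the devices named in `sriov-device-mappings` with those named in `sriov-numvfs`. It assumes only the space-separated `key:value` format visible in the pasted configuration. The helper names are hypothetical and this is not how the charm itself parses these options.

```python
def parse_pairs(value):
    """Split a space-separated 'a:b c:d' string into (a, b) tuples."""
    return [tuple(item.split(":", 1)) for item in value.split()]


def devices_without_numvfs(device_mappings, numvfs):
    """Return mapped devices that have no entry in sriov-numvfs."""
    # sriov-device-mappings entries are physnet:device.
    mapped = {dev for _physnet, dev in parse_pairs(device_mappings)}
    # sriov-numvfs entries are device:count.
    sized = {dev for dev, _count in parse_pairs(numvfs)}
    return sorted(mapped - sized)
```

Run against the values pasted above, `devices_without_numvfs("sriovfabric:ens3f0 sriovfabric:ens3f1", "ens3f0:64 ens3f0:64")` returns `["ens3f1"]`, because `sriov-numvfs` lists ens3f0 twice. Whether that is a typo in the paste or in the deployed bundle is not clear from this report.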

Nova PCI passthrough is set:
juju config nova-compute-kvm-sriov pci-passthrough-whitelist
[{ "devname": "ens3f0", "physical_network": "sriovfabric"},{ "devname": "ens3f1", "physical_network": "sriovfabric"}]

Relation is also in place:
juju add-relation ovn-chassis-sriov:amqp rabbitmq-server:amqp
cannot add relation "ovn-chassis-sriov:amqp rabbitmq-server:amqp": relation ovn-chassis-sriov:amqp

However, taking all this into account, I believe the problem is different. The SR-IOV agent is not running:
service neutron-sriov-agent status
● neutron-sriov-agent.service - OpenStack Neutron SRIOV Plugin Agent
   Loaded: loaded (/lib/systemd/system/neutron-sriov-agent.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Tue 2020-11-10 09:03:33 UTC; 1h 10min ago
 Main PID: 64855 (code=exited, status=0/SUCCESS)

Nov 10 09:03:31 cmp4az2cz20300kvc systemd[1]: Started OpenStack Neutron SRIOV Plugin Agent.
Nov 10 09:03:33 cmp4az2cz20300kvc neutron-sriov-agent[64855]: argument --debug: Invalid Boolean value:

That is because:
head /etc/neutron/neutron.conf
###############################################################################
# [ WARNING ]
# Configuration file maintained by Juju. Local changes may be overwritten.
# Config managed by ovn-chassis charm
###############################################################################
[DEFAULT]
debug = # This setting is empty.

This is rendered from: {{ options.debug }}

But the charm does not seem to set this option anywhere in the code.
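
The failure mode can be reproduced in isolation with a short sketch. It is an approximation under stated assumptions: `render_debug_line` stands in for the template expression, and `parse_strict_bool` imitates a strict boolean option parser with an assumed set of accepted values. Neither is the charm's or Neutron's actual code.

```python
import configparser

# Assumed accepted spellings for a strict boolean option.
TRUE_VALUES = {"true", "1", "on", "yes"}
FALSE_VALUES = {"false", "0", "off", "no"}


def parse_strict_bool(raw):
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError("Invalid Boolean value: %r" % raw)


def render_debug_line(options):
    # Stand-in for "debug = {{ options.debug }}": an option the charm
    # does not define renders as an empty string.
    return "[DEFAULT]\ndebug = %s\n" % options.get("debug", "")


def read_debug(conf_text):
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parse_strict_bool(parser["DEFAULT"]["debug"])
```

With the option defined, `read_debug(render_debug_line({"debug": False}))` parses cleanly. With it missing, `read_debug(render_debug_line({}))` raises `ValueError`, which is the same kind of failure as the "Invalid Boolean value" error in the agent log.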

Changed in charm-ovn-chassis:
status: Incomplete → New
Frode Nordahl (fnordahl)
Changed in charm-ovn-chassis:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Frode Nordahl (fnordahl)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ovn-chassis (master)

Fix proposed to branch: master
Review: https://review.opendev.org/762113

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ovn-chassis (master)

Reviewed: https://review.opendev.org/762113
Committed: https://git.openstack.org/cgit/x/charm-ovn-chassis/commit/?id=f02046456b7984c257a10d25c135258f6a692e7b
Submitter: Zuul
Branch: master

commit f02046456b7984c257a10d25c135258f6a692e7b
Author: Frode Nordahl <email address hidden>
Date: Tue Nov 10 11:35:40 2020 +0100

    Fix SR-IOV support, restore `debug` configuration option

    The `debug` configuration option was removed in
    commit 2924a7e6835c108683f2fff531a64d2ea4459b2a which broke the
    SR-IOV support.

    Change-Id: Ica62114dcc4d68549d18bb8242ae57e87aba87d1
    Closes-Bug: #1903638

Changed in charm-ovn-chassis:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ovn-chassis (stable/20.10)

Fix proposed to branch: stable/20.10
Review: https://review.opendev.org/762197

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ovn-chassis (stable/20.10)

Reviewed: https://review.opendev.org/762197
Committed: https://git.openstack.org/cgit/x/charm-ovn-chassis/commit/?id=6bc2935efff787b66ac2d095f05df085e4f964fb
Submitter: Zuul
Branch: stable/20.10

commit 6bc2935efff787b66ac2d095f05df085e4f964fb
Author: Frode Nordahl <email address hidden>
Date: Tue Nov 10 11:35:40 2020 +0100

    Fix SR-IOV support, restore `debug` configuration option

    The `debug` configuration option was removed in
    commit 2924a7e6835c108683f2fff531a64d2ea4459b2a which broke the
    SR-IOV support.

    Change-Id: Ica62114dcc4d68549d18bb8242ae57e87aba87d1
    Closes-Bug: #1903638
    (cherry picked from commit f02046456b7984c257a10d25c135258f6a692e7b)

Frode Nordahl (fnordahl)
Changed in charm-ovn-chassis:
status: Fix Committed → Fix Released
assignee: Frode Nordahl (fnordahl) → nobody
Michael Skalka (mskalka) wrote :

Removing the critical subscription as a fix has been found & released.
