SR-IOV interface configuration is not preserved across reboots

Bug #1697572 reported by Frode Nordahl
This bug affects 1 person
Affects: OpenStack Neutron Open vSwitch Charm
Status: Fix Released
Importance: High
Assigned to: Frode Nordahl
Milestone: 17.08

Bug Description

The SR-IOV interface configuration is currently done in one-shot mode: either after installation of the charm or after a change to the config option 'sriov-numvfs'.

After a reboot of the node the charm is running on, neither of these happens, and the system's SR-IOV interfaces are left unconfigured.

Other means of doing this:
- kernel or module parameters (ixgbe.max_vfs=N)
- separate init script
- rc.local
- use a hook that will most likely be executed after reboot
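
As a rough illustration of the "separate init script" option above, the following Python sketch writes a desired VF count through the kernel's sysfs interface at boot. It is not taken from the charm: the function name `configure_vfs` and its `sysfs_net` parameter are hypothetical, and the sketch assumes the standard `sriov_numvfs` attribute under `/sys/class/net/<iface>/device/`.

```python
import os


def configure_vfs(desired, sysfs_net="/sys/class/net"):
    """Set sriov_numvfs to `desired` on every SR-IOV capable interface.

    Returns the sorted list of interface names whose value was changed.
    `sysfs_net` is a parameter only so the function can be tried against
    a scratch directory instead of the real sysfs tree.
    """
    changed = []
    for iface in sorted(os.listdir(sysfs_net)):
        numvfs_path = os.path.join(sysfs_net, iface, "device", "sriov_numvfs")
        if not os.path.isfile(numvfs_path):
            continue  # interface has no SR-IOV attribute, skip it
        with open(numvfs_path) as f:
            current = int(f.read().strip())
        if current == desired:
            continue  # already configured, nothing to do
        if current != 0:
            # The kernel refuses to go from one non-zero VF count to
            # another, so the count is reset to 0 first.
            with open(numvfs_path, "w") as f:
                f.write("0")
        if desired != 0:
            with open(numvfs_path, "w") as f:
                f.write(str(desired))
        changed.append(iface)
    return changed
```

On a real system this would need to run as root and before any service that consumes the VFs is started.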

Frode Nordahl (fnordahl)
summary: - SR-IOV interface configuration is not preserved accross reboots of unit
+ SR-IOV interface configuration is not preserved across reboots
Frode Nordahl (fnordahl) wrote :

After deployment and first install of charm, before reboot:
$ lspci -nn|grep Eth
02:00.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe [14e4:1657] (rev 01)
02:00.1 Ethernet controller [0200]: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe [14e4:1657] (rev 01)
02:00.2 Ethernet controller [0200]: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe [14e4:1657] (rev 01)
02:00.3 Ethernet controller [0200]: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe [14e4:1657] (rev 01)
04:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
04:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
04:10.0 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:10.1 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:10.2 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:10.3 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:10.4 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:10.5 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:10.6 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:10.7 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:11.0 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:11.1 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:11.2 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:11.3 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:11.4 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:11.5 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:11.6 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:11.7 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:12.0 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:12.1 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:12.2 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:12.3 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controller Virtual Function [8086:1515] (rev 01)
04:12.4 Ethernet controller [0200]: Intel Corporation X540 Ethernet Controlle...

Frode Nordahl (fnordahl) wrote :

Post-reboot unit-neutron-openvswitch-0.log:
2017-06-22 09:56:19 INFO juju.cmd supercommand.go:63 running jujud [2.2.0.1 gc go1.8]
2017-06-22 09:56:19 DEBUG juju.cmd supercommand.go:64 args: []string{"/var/lib/juju/tools/unit-neutron-openvswitch-0/jujud", "unit", "--data-dir", "/var/lib/juju", "--unit-name", "neutron-openvswitch/0", "--debug"}
2017-06-22 09:56:19 DEBUG juju.agent agent.go:533 read agent config, format "2.0"
2017-06-22 09:56:19 INFO juju.jujud unit.go:141 unit agent unit-neutron-openvswitch-0 start (2.2.0.1 [gc])
2017-06-22 09:56:19 DEBUG juju.worker runner.go:319 start "api"
2017-06-22 09:56:19 INFO juju.worker runner.go:477 start "api"
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "logging-config-updater" manifold worker stopped: "migration-inactive-flag" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:485 "agent" manifold worker started
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "hook-retry-strategy" manifold worker stopped: "migration-inactive-flag" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "proxy-config-updater" manifold worker stopped: "migration-inactive-flag" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.apicaller connect.go:102 connecting with current password
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "log-sender" manifold worker stopped: "api-caller" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "migration-inactive-flag" manifold worker stopped: "api-caller" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:485 "migration-fortress" manifold worker started
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "logging-config-updater" manifold worker stopped: <nil>
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "migration-minion" manifold worker stopped: "api-caller" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "uniter" manifold worker stopped: "migration-inactive-flag" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:485 "api-config-watcher" manifold worker started
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "leadership-tracker" manifold worker stopped: "migration-inactive-flag" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "metric-sender" manifold worker stopped: "migration-inactive-flag" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "metric-collect" manifold worker stopped: "migration-inactive-flag" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "metric-spool" manifold worker stopped: "migration-inactive-flag" not running: dependency not available
2017-06-22 09:56:19 DEBUG juju.worker.dependency engine.go:499 "api-address-updater" manifold worker stopped: "migration-inactive-flag" not running: dependency not availabl...

Frode Nordahl (fnordahl) wrote :

Quote from upstream Juju docs: "Note that the charm's software should be configured so as to persist through reboots without further intervention on juju's part."

Given that, I guess we should probably install and maintain an init script?

Kernel parameters are complex for us to change and will vary from vendor to vendor.

Making the Juju hook do it after reboot would stop this from working for the end user in the event that Juju is removed.

Frode Nordahl (fnordahl) wrote :

A first step is to find the cause of the configuration currently not happening in the config-changed hook after reboot. I will propose a fix to address that.

Frode Nordahl (fnordahl) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/#/c/476506/

Changed in charm-neutron-openvswitch:
status: New → In Progress
assignee: nobody → Frode Nordahl (fnordahl)
Changed in charm-neutron-openvswitch:
milestone: none → 17.08
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-neutron-openvswitch (master)

Reviewed: https://review.openstack.org/476506
Committed: https://git.openstack.org/cgit/openstack/charm-neutron-openvswitch/commit/?id=4ffbc2fe25400abf55719a370f3a2cd37f90c99d
Submitter: Jenkins
Branch: master

commit 4ffbc2fe25400abf55719a370f3a2cd37f90c99d
Author: Frode Nordahl <email address hidden>
Date: Wed Aug 23 12:35:47 2017 +0200

    Fix handling of SR-IOV interface configuration

    SR-IOV interfaces are currently only configured on charm
    installation and not after subsequent reboots.

    The VFs need to be configured before the Neutron SR-IOV
    agent is started. Charms should also really not be involved
    in boot time system configuration. Due to these factors
    this commit adds an init script and corresponding systemd
    unit file and upstart job to handle the boot-time configuration.

    Keep configure_sriov function for runtime configuration. Add
    warning about runtime configuration disrupting network service.

    Add restart of Neutron SR-IOV agent after runtime configuration.

    Cap value of sriov-numvfs at each interface's sriov_totalvfs value.

    Change-Id: I7bde7217bf027db09ded35a262c214ccb11d6d86
    Closes-Bug: #1697572
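
The capping behaviour described in the commit message can be illustrated with a short Python sketch. This is not the code from the commit: the helper names `read_totalvfs` and `capped_numvfs` are hypothetical, and the only assumption about the system is the standard `sriov_totalvfs` sysfs attribute, which reports the maximum number of VFs a device supports.

```python
import os


def read_totalvfs(iface, sysfs_net="/sys/class/net"):
    """Return the maximum VF count the kernel reports for `iface`."""
    path = os.path.join(sysfs_net, iface, "device", "sriov_totalvfs")
    with open(path) as f:
        return int(f.read().strip())


def capped_numvfs(requested, totalvfs):
    """Clamp a requested VF count to the range 0..totalvfs."""
    return max(0, min(requested, totalvfs))
```

A caller would then write `capped_numvfs(requested, read_totalvfs(iface))` to `sriov_numvfs` rather than the raw requested value, so that a single 'sriov-numvfs' setting can be applied to devices with different limits.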

Changed in charm-neutron-openvswitch:
status: In Progress → Fix Committed
James Page (james-page)
Changed in charm-neutron-openvswitch:
status: Fix Committed → Fix Released