Deployment task 'netconfig' incorrectly configures SR-IOV nics on second run

Bug #1558427 reported by Artem Panchenko on 2016-03-17
Affects: Fuel for OpenStack
Assigned to: Vladimir Eremin

Bug Description

Fuel version info (9.0 liberty):

Re-deployment fails on the 'sriov_iommu_check' task, because 'netconfig' doesn't properly configure NICs that have SR-IOV enabled:

root@node-2:~# echo 63 > /sys/class/net/enp1s0f0/device/sriov_numvfs
root@node-2:~# echo 63 > /sys/class/net/enp1s0f1/device/sriov_numvfs
root@node-2:~# ruby /etc/puppet/modules/osnailyfacter/modular/netconfig/sriov_iommu_check.rb
OK: SR-IOV and IOMMU are properly configured for enp1s0f0 interface
OK: SR-IOV and IOMMU are properly configured for enp1s0f1 interface
root@node-2:~# puppet apply -d /etc/puppet/modules/osnailyfacter/modular/netconfig/netconfig.pp &> /tmp/puppet.log
root@node-2:~# echo $?
root@node-2:~# cat /sys/class/net/enp1s0f0/device/sriov_numvfs /sys/class/net/enp1s0f1/device/sriov_numvfs
root@node-2:~# ifdown enp1s0f1
root@node-2:~# ifup enp1s0f1
root@node-2:~# cat /sys/class/net/enp1s0f1/device/sriov_numvfs

I added some debug logs to '/etc/puppet/modules/l23network/lib/puppet/provider/l2_port/sriov.rb' and got this:

root@node-2:~# grep -E 'Setting numvfs for|Value of numvfs for' /tmp/puppet.log
Debug: L2_port[enp1s0f0](provider=sriov): Value of numvfs for 'enp1s0f0' is different '63' != '0'
Debug: L2_port[enp1s0f0](provider=sriov): Setting numvfs for 'enp1s0f0' to '0'
Debug: L2_port[enp1s0f0](provider=sriov): Setting numvfs for 'enp1s0f0' to '0'
Debug: L2_port[enp1s0f1](provider=sriov): Value of numvfs for 'enp1s0f1' is different '63' != '0'
Debug: L2_port[enp1s0f1](provider=sriov): Setting numvfs for 'enp1s0f1' to '0'
Debug: L2_port[enp1s0f1](provider=sriov): Setting numvfs for 'enp1s0f1' to '0'

If I set 'sriov_numvfs' to 0 for all SR-IOV NICs and run netconfig.pp again, the VFs are configured properly, because the following code isn't executed:

Steps to reproduce:

1. Create cluster with VLAN segmentation
2. Add 1 controller node and 1 compute node with NICs which support SR-IOV
3. Enable SR-IOV on some compute's NICs, set VFs number to max value (sriov_numvfs == sriov_totalvfs)
4. Deploy environment
5. SSH to compute and run 'puppet apply /etc/puppet/modules/osnailyfacter/modular/netconfig/netconfig.pp'

Expected result: sriov_numvfs value for SR-IOV enabled NICs isn't changed

Actual result: sriov_numvfs is set to 0 for all SR-IOV enabled NICs
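
The expected/actual check above can be scripted. This is a minimal sketch, assuming the sysfs layout used in the transcript; `read_numvfs` and `changed_numvfs` are hypothetical helper names, and the `sysfs_root` parameter exists only so the helpers can be pointed at a fake directory tree.

```ruby
# Hypothetical helper: read the current VF count for one interface.
def read_numvfs(iface, sysfs_root = '/sys/class/net')
  File.read(File.join(sysfs_root, iface, 'device', 'sriov_numvfs')).strip.to_i
end

# Hypothetical helper: snapshot the VF counts, run the given block,
# then return { iface => [before, after] } for every count that changed.
def changed_numvfs(ifaces, sysfs_root = '/sys/class/net')
  before = ifaces.map { |i| [i, read_numvfs(i, sysfs_root)] }.to_h
  yield
  ifaces.each_with_object({}) do |iface, changed|
    after = read_numvfs(iface, sysfs_root)
    changed[iface] = [before[iface], after] if after != before[iface]
  end
end

# Example use on a compute node (left commented out, it needs real NICs):
#
#   manifest = '/etc/puppet/modules/osnailyfacter/modular/netconfig/netconfig.pp'
#   changed = changed_numvfs(%w[enp1s0f0 enp1s0f1]) do
#     system('puppet', 'apply', manifest)
#   end
#   abort "sriov_numvfs changed: #{changed.inspect}" unless changed.empty?
```

An empty result corresponds to the expected behaviour; with this bug present the result would list both interfaces going from 63 to 0.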

Diagnostic snapshot:

Artem Panchenko (apanchenko-8) wrote :

It also looks like this issue affects the configure_default_route.pp task:

2016-03-16 18:58:26 +0000 Scope(Class[main]) (notice): MODULAR: netconfig.pp
2016-03-16 18:58:27 +0000 /Stage[main]/Main/L23network::L2::Port[enp1s0f0]/L23_stored_config[enp1s0f0]/sriov_numvfs (notice): sriov_numvfs changed '63' to '63'
2016-03-16 18:58:29 +0000 /Stage[main]/Main/L23network::L2::Port[enp1s0f1]/L23_stored_config[enp1s0f1]/sriov_numvfs (notice): sriov_numvfs changed '63' to '63'
2016-03-16 18:59:08 +0000 Scope(Class[main]) (notice): MODULAR: connectivity_tests.pp
2016-03-16 18:59:15 +0000 Scope(Class[main]) (notice): MODULAR: sriov_iommu_check.pp
2016-03-16 18:59:21 +0000 Scope(Class[main]) (notice): MODULAR: firewall.pp
2016-03-16 18:59:32 +0000 Scope(Class[main]) (notice): MODULAR: hosts.pp
2016-03-16 19:11:13 +0000 Scope(Class[main]) (notice): MODULAR: compute.pp
2016-03-16 19:13:58 +0000 Scope(Class[main]) (notice): MODULAR: openstack-network/common-config.pp
2016-03-16 19:14:34 +0000 Scope(Class[main]) (notice): MODULAR: openstack-network/plugins/ml2.pp
2016-03-16 19:14:44 +0000 Scope(Class[main]) (notice): MODULAR: openstack-network/agents/l3.pp
2016-03-16 19:14:50 +0000 Scope(Class[main]) (notice): MODULAR: openstack-network/agents/sriov.pp
2016-03-16 19:15:06 +0000 Scope(Class[main]) (notice): MODULAR: openstack-network/agents/metadata.pp
2016-03-16 19:15:13 +0000 Scope(Class[main]) (notice): MODULAR: openstack-network/compute-nova.pp
2016-03-16 19:15:22 +0000 Scope(Class[main]) (notice): MODULAR: enable_compute.pp
2016-03-16 19:21:28 +0000 Scope(Class[main]) (notice): MODULAR: dns-client.pp
2016-03-16 19:21:36 +0000 Scope(Class[main]) (notice): MODULAR: cgroups.pp
2016-03-16 19:21:49 +0000 Scope(Class[main]) (notice): MODULAR: configure_default_route.pp
2016-03-16 19:21:51 +0000 /Stage[main]/Main/L23network::L2::Port[enp1s0f0]/L23_stored_config[enp1s0f0]/sriov_numvfs (notice): sriov_numvfs changed '63' to '63'
2016-03-16 19:21:54 +0000 /Stage[main]/Main/L23network::L2::Port[enp1s0f1]/L23_stored_config[enp1s0f1]/sriov_numvfs (notice): sriov_numvfs changed '63' to '63'
2016-03-16 19:22:00 +0000 Scope(Class[main]) (notice): MODULAR: hosts.pp
2016-03-16 19:22:02 +0000 Scope(Class[main]) (notice): MODULAR: ntp-client.pp

So currently, after a successful deployment, 'sriov_numvfs' is set to 0 on all compute nodes with SR-IOV enabled NICs.
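
To see which tasks touch 'sriov_numvfs' without reading the whole log, the notices can be grouped under the preceding "MODULAR:" line. A rough sketch, assuming the log format shown above; `tasks_touching_numvfs` is a hypothetical name and the sample is a shortened copy of the lines from this comment.

```ruby
# Hypothetical helper: map each MODULAR task to the interfaces whose
# sriov_numvfs property was reported as changed while that task ran.
def tasks_touching_numvfs(log_lines)
  current_task = nil
  touched = Hash.new { |hash, key| hash[key] = [] }
  log_lines.each do |line|
    if (m = line.match(/MODULAR: (\S+)/))
      current_task = m[1]
    elsif current_task &&
          (m = line.match(%r{L23_stored_config\[([^\]]+)\]/sriov_numvfs}))
      touched[current_task] << m[1]
    end
  end
  touched
end

# Shortened sample taken from the log in this comment.
SAMPLE_LOG = <<~LOG
  2016-03-16 18:58:26 +0000 Scope(Class[main]) (notice): MODULAR: netconfig.pp
  2016-03-16 18:58:27 +0000 /Stage[main]/Main/L23network::L2::Port[enp1s0f0]/L23_stored_config[enp1s0f0]/sriov_numvfs (notice): sriov_numvfs changed '63' to '63'
  2016-03-16 18:59:08 +0000 Scope(Class[main]) (notice): MODULAR: connectivity_tests.pp
  2016-03-16 19:21:49 +0000 Scope(Class[main]) (notice): MODULAR: configure_default_route.pp
  2016-03-16 19:21:51 +0000 /Stage[main]/Main/L23network::L2::Port[enp1s0f0]/L23_stored_config[enp1s0f0]/sriov_numvfs (notice): sriov_numvfs changed '63' to '63'
  2016-03-16 19:22:00 +0000 Scope(Class[main]) (notice): MODULAR: hosts.pp
LOG

tasks_touching_numvfs(SAMPLE_LOG.lines).each do |task, ifaces|
  puts "#{task}: #{ifaces.join(', ')}"
end
```

For the sample, only netconfig.pp and configure_default_route.pp are listed, which matches the two tasks named above.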

Changed in fuel:
importance: Undecided → High
tags: added: l23network
Changed in fuel:
status: New → Confirmed
description: updated
Vladimir Eremin (yottatsa) wrote :

I'm not sure it's duplicate for

Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Vladimir Eremin (yottatsa)
Vladimir Eremin (yottatsa) wrote :

This fails because vendor_specific is empty after a re-apply (no changes):

Debug: L2_port[enp1s0f0](provider=sriov): FLUSH properties: L2_port[enp1s0f0] {:vendor_specific=>{}}

And there it's converted to 0.
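
A small illustration of how an empty hash can end up as 0 in Ruby. This is not the provider code; the `:sriov_numvfs` key inside vendor_specific and the variable names are assumptions made for the example.

```ruby
# Illustrative only: not the actual l23network provider code.
# On a no-change re-apply the flushed properties carry an empty hash.
vendor_specific = {}

requested = vendor_specific[:sriov_numvfs] # nil, because the key is absent
desired   = requested.to_i                 # nil.to_i is 0 in Ruby
current   = 63                             # value read from sysfs in the report

if current != desired
  # prints: Value of numvfs is different '63' != '0'
  puts "Value of numvfs is different '#{current}' != '#{desired}'"
end
```

So a missing value and an explicit request for zero VFs become indistinguishable once `to_i` has been applied.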

Fix proposed to branch: master

Changed in fuel:
status: Confirmed → In Progress
tags: added: area-library

Submitter: Jenkins
Branch: master

commit 7f9bebc073c01c8cefcd9483024905c33b535c0b
Author: Vladimir Eremin <email address hidden>
Date: Thu Mar 17 14:06:57 2016 +0300

    Fix SR-IOV re-apply

    * vendor_specific update fixed
    * check new sriov_numvfs for nil fixed
    * prefetch fixed
    * also, conventional "FLUSH properties" debug added

    Change-Id: I8ebbbff278e448ee20fad7826bf5b1f7c11ea610
    Closes-Bug: #1558427
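
For readers without the change at hand, the "check new sriov_numvfs for nil" item can be pictured roughly as below. This is a sketch under assumptions and not the code from the commit: `numvfs_to_write` is a hypothetical helper, and the `:sriov_numvfs` key inside vendor_specific is assumed.

```ruby
# Hypothetical helper, not the code from the actual commit.
# Returns the VF count to write, or nil when nothing should be written.
def numvfs_to_write(current, vendor_specific)
  requested = vendor_specific[:sriov_numvfs]
  # Property absent from this flush: leave the NIC untouched instead of
  # letting nil be coerced to 0.
  return nil if requested.nil?

  requested = requested.to_i
  requested == current ? nil : requested
end

numvfs_to_write(63, {})                    # nil: nothing requested, keep 63
numvfs_to_write(63, { sriov_numvfs: 63 })  # nil: already at the requested value
numvfs_to_write(0,  { sriov_numvfs: 63 })  # 63: first-time configuration
numvfs_to_write(63, { sriov_numvfs: 0 })   # 0: explicit request to remove VFs
```

The point of the guard is that an explicit 0 still works, while an absent value no longer resets the VFs.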

Changed in fuel:
status: In Progress → Fix Committed
Artem Panchenko (apanchenko-8) wrote :


cat /etc/fuel_build_id:
cat /etc/fuel_build_number:
cat /etc/fuel_release:
cat /etc/fuel_openstack_version:
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':

Changed in fuel:
status: Fix Committed → Fix Released