Interface configuration issue with RHEL compute using bonding + lacp + vlan for control-data interface

Bug #1593200 reported by kalagesan
This bug affects 1 person
Affects             Status          Importance  Assigned to
Juniper Openstack   Status tracked in Trunk
R2.20               Fix Committed   High        Ignatious Johnson Christopher
R2.21.x             Fix Committed   High        Ignatious Johnson Christopher
R2.22.x             Fix Committed   High        Ignatious Johnson Christopher
R3.0                Fix Committed   High        Ignatious Johnson Christopher
R3.0.2.x            Fix Committed   High        Ignatious Johnson Christopher
Trunk               Fix Committed   High        Ignatious Johnson Christopher

Bug Description

The customer uses bonding + LACP + VLAN for the management/control-data interface
on their RHEL compute node. The interface name is bond1.61, and its configuration file
is /etc/sysconfig/network-scripts/ifcfg-bond1.61.

Contrail version 2.21.2 build 36 / RHEL 7.1

During vrouter installation, a provisioning script recreates the interface configuration file mapped to vhost0. It inserts a "HWADDR" entry into the configuration of a physical interface and a "SUBCHANNELS" entry into the configuration of a bonding interface.

https://github.com/Juniper/contrail-provisioning/blob/R2.21.x/contrail_provisioning/compute/network.py#L154
-----------------------------------------------------------------
        if os.path.isdir ('/sys/class/net/%s/bonding' % dev):
            bond = True
(snip)
        if bond:
            new_f_lines.append('SUBCHANNELS=1,2,3\n')
        else:
            new_f_lines.append('HWADDR=%s\n' % mac)
-----------------------------------------------------------------

Since the script does not check whether the parent of a VLAN interface is a bond or a physical interface,
bond1.61 is treated as a non-bond interface and HWADDR is inserted into ifcfg-bond1.61.

However, it is problematic for the configuration of a bonding interface's child to carry one slave's MAC address as HWADDR. If the operating system recognizes the other
slave first, the MAC address of the bond interface can differ from the one stored in the child interface's configuration file, and the child interface then fails to come up.

If this happens during the final reboot after setup_only_vrouter_node, the reboot_node task times out and aborts. So the installation task should check whether the control-data interface is a child of a bonding interface.
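
A minimal sketch of the kind of check this implies. It is not the actual contrail-provisioning code. It assumes the VLAN interface follows the "parent.vid" naming seen here (bond1.61), so a VLAN named differently (for example vlan61) would not be recognized. The helper names vlan_parent, is_bond and needs_hwaddr and the sysfs_root parameter are hypothetical and only for illustration.

```python
import os


def vlan_parent(dev):
    """Return the parent name for a 'parent.vid' style VLAN name, else None.

    Assumption: the VLAN interface is named <parent>.<numeric id>.
    """
    parent, sep, vid = dev.rpartition('.')
    if sep and parent and vid.isdigit():
        return parent
    return None


def is_bond(dev, sysfs_root='/sys/class/net'):
    """Same test the quoted script uses: a bond has a 'bonding' directory."""
    return os.path.isdir(os.path.join(sysfs_root, dev, 'bonding'))


def needs_hwaddr(dev, sysfs_root='/sys/class/net'):
    """Decide whether HWADDR should be written to ifcfg-<dev>.

    A bond gets no HWADDR, and neither does a VLAN child, because a VLAN
    interface takes its MAC address from its parent.
    """
    if is_bond(dev, sysfs_root):
        return False
    if vlan_parent(dev) is not None:
        return False
    return True
```

With a check like this, needs_hwaddr('bond1.61') would be false regardless of which slave the bond takes its MAC address from, so the pinned address could never go stale.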

I have also set this up in my lab, and the issue can be reproduced with the following step:

Install vrouter on a RHEL compute node that uses bonding + LACP + VLAN for its control-data interface.

The issue is that HWADDR is inserted into the configuration file of a VLAN interface that is
a child of a bonded interface. The provisioning script can detect a bonded parent interface, but not
a child of one.

Before the installation of vrouter, the interface configuration is as follows.

--------------------------------------------------
[root@lb3bp-ssdd0001n ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens1f0
TYPE="Ethernet"
BOOTPROTO="none"
DEVICE="ens1f0"
MASTER="bond1"
SLAVE="yes"
ETHTOOL_OPTS="-G "ens1f0" rx 1000"
[root@lb3bp-vscp0001n ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens2f0
TYPE="Ethernet"
BOOTPROTO="none"
DEVICE="ens2f0"
MASTER="bond1"
SLAVE="yes"
ETHTOOL_OPTS="-G "ens2f0" rx 1000"
[root@lb3bp-vscp0001n ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond0
TYPE="Bond"
BOOTPROTO="none"
DEVICE="bond0"
ONBOOT="yes"
BONDING_OPTS="mode=4 miimon=100 lacp_rate=1 xmit_hash_policy=layer2+3"
[root@lb3bp-vscp0001n ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1.61
TYPE="Vlan"
VLAN="yes"
BOOTPROTO="static"
DEFROUTE="yes"
DEVICE="bond1.61"
ONBOOT="yes"
IPADDR="10.0.128.72"
NETMASK="255.255.255.192"
--------------------------------------------------

After the installation (running "fab reboot_node"), ifcfg-bond1.61 is changed as follows.

--------------------------------------------------
[root@lb3bp-ssdp0001n ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1.61
#Contrail bond1.61
DEVICE=bond1.61
TYPE="Vlan"
VLAN="yes"
DEFROUTE="yes"
ONBOOT="yes"
NM_CONTROLLED=no
HWADDR=8c:dc:d4:b7:41:20
--------------------------------------------------

A HWADDR statement is inserted, and "8c:dc:d4:b7:41:20" is the MAC address of ens1f0.
This behavior is not desirable, so the customer has commented it out for now.

--------------------------------------------------
[root@lb3bp-vscp0001n ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1.61
#Contrail bond1.61
DEVICE=bond1.61
TYPE="Vlan"
VLAN="yes"
DEFROUTE="yes"
ONBOOT="yes"
NM_CONTROLLED=no
#HWADDR=8c:dc:d4:b7:41:20

[root@lb3bp-vscp0001n ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 3c:a8:2a:12:f8:70 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 3c:a8:2a:12:f8:71 brd ff:ff:ff:ff:ff:ff
4: ens2f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP qlen 1000
link/ether 8c:dc:d4:b7:41:20 brd ff:ff:ff:ff:ff:ff
5: eno3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 3c:a8:2a:12:f8:72 brd ff:ff:ff:ff:ff:ff
6: eno4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 3c:a8:2a:12:f8:73 brd ff:ff:ff:ff:ff:ff
7: ens2f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
link/ether 8c:dc:d4:b7:41:21 brd ff:ff:ff:ff:ff:ff
8: ens1f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP qlen 1000
link/ether 8c:dc:d4:b7:41:20 brd ff:ff:ff:ff:ff:ff
9: ens1f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
link/ether 8c:dc:d4:b7:41:21 brd ff:ff:ff:ff:ff:ff
10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 8c:dc:d4:b7:41:21 brd ff:ff:ff:ff:ff:ff
11: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 8c:dc:d4:b7:41:20 brd ff:ff:ff:ff:ff:ff
12: pkt1: <UP,LOWER_UP> mtu 65535 qdisc noqueue state UNKNOWN
link/void ba:a3:33:0c:7c:00 brd 00:00:00:00:00:00
13: pkt3: <UP,LOWER_UP> mtu 65535 qdisc noqueue state UNKNOWN
link/void 76:5a:0f:2b:1d:41 brd 00:00:00:00:00:00
14: pkt2: <UP,LOWER_UP> mtu 65535 qdisc noqueue state UNKNOWN
link/void 82:1c:2c:75:e7:bc brd 00:00:00:00:00:00
15: bond0.62@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 8c:dc:d4:b7:41:21 brd ff:ff:ff:ff:ff:ff
inet 10.0.129.72/26 brd 10.0.129.127 scope global bond0.62
valid_lft forever preferred_lft forever
16: bond0.63@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 8c:dc:d4:b7:41:21 brd ff:ff:ff:ff:ff:ff
inet 10.0.130.72/26 brd 10.0.130.127 scope global bond0.63
valid_lft forever preferred_lft forever
17: bond0.64@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 8c:dc:d4:b7:41:21 brd ff:ff:ff:ff:ff:ff
inet 10.0.131.72/26 brd 10.0.131.127 scope global bond0.64
valid_lft forever preferred_lft forever
18: bond1.61@bond1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 8c:dc:d4:b7:41:20 brd ff:ff:ff:ff:ff:ff
19: bond1.65@bond1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 8c:dc:d4:b7:41:20 brd ff:ff:ff:ff:ff:ff
inet 10.0.132.72/26 brd 10.0.132.127 scope global bond1.65
valid_lft forever preferred_lft forever
20: vhost0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 8c:dc:d4:b7:41:20 brd ff:ff:ff:ff:ff:ff
inet 10.0.128.72/26 brd 10.0.128.127 scope global vhost0
valid_lft forever preferred_lft forever
21: pkt0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 500
link/ether ae:2f:64:31:9d:ab brd ff:ff:ff:ff:ff:ff
34: tapd750580c-e8: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9000 qdisc pfifo_fast state DOWN qlen 500
link/ether fe:e2:0b:02:ca:33 brd ff:ff:ff:ff:ff:ff
:
[root@lb3bp-vscp0001n ~]# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2+3 (2)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 2
Actor Key: 33
Partner Key: 1
Partner Mac Address: 54:4b:8c:40:4d:00

Slave Interface: ens1f0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 8c:dc:d4:b7:41:20
Aggregator ID: 1
Slave queue ID: 0

Slave Interface: ens2f0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 8c:dc:d4:b7:41:54
Aggregator ID: 1
Slave queue ID: 0
--------------------------------------------------
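
To illustrate the mismatch visible in this output: the HWADDR that was written to ifcfg-bond1.61 equals the permanent address of only one of the two slaves. The following is a small sketch, assuming the /proc/net/bonding text format shown above. The function name permanent_hwaddrs is hypothetical, and the sample is a shortened copy of the output.

```python
import re


def permanent_hwaddrs(bonding_text):
    """Map each slave name to its permanent MAC, given /proc/net/bonding/<bond> text."""
    slaves = {}
    current = None
    for line in bonding_text.splitlines():
        line = line.strip()
        m = re.match(r'Slave Interface:\s*(\S+)', line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r'Permanent HW addr:\s*(\S+)', line)
        if m and current is not None:
            slaves[current] = m.group(1).lower()
    return slaves


# Shortened copy of the bond1 output from this report.
sample = """\
Slave Interface: ens1f0
MII Status: up
Permanent HW addr: 8c:dc:d4:b7:41:20

Slave Interface: ens2f0
MII Status: up
Permanent HW addr: 8c:dc:d4:b7:41:54
"""

addrs = permanent_hwaddrs(sample)
pinned = '8c:dc:d4:b7:41:20'  # HWADDR value written to ifcfg-bond1.61
matching = [name for name, mac in addrs.items() if mac == pinned]
print(matching)
```

Only ens1f0 matches the pinned value, so if the bond came up with the address of ens2f0, the HWADDR in the child configuration would no longer agree with the bond.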

Tags: provisioning
kalagesan (kalagesan)
description: updated
Changed in juniperopenstack:
importance: Undecided → High
assignee: nobody → Ignatious Johnson Christopher (ijohnson-x)
tags: added: provisioning
kalagesan (kalagesan)
information type: Proprietary → Public
Ignatious Johnson Christopher (ijohnson-x) wrote :

I tried to reproduce the issue with the R2.21-43 build but did not hit it, because at the following line
<https://github.com/Juniper/contrail-provisioning/blob/R2.22.x/contrail_provisioning/compute/common.py#L286>

we move the ifcfg-<dev> script of the interface hijacked by vhost0 to ifcfg-<dev>.rpmsave.

The only scenario that would cause this issue is as follows:
1. Configure the bond/vlan interface configuration manually
2. Provision compute using FAB
3. Assume compute provisioning fails for some reason before the reboot, but after the hijacked interface configuration file has been moved to .rpmsave
4. Reconfigure the bond/vlan interface configuration manually (ifcfg-<dev> is recreated)
5. Provision compute using FAB again to hit the issue

I can handle this scenario in my fix. Before that, can you confirm whether the above steps were executed to hit this issue?

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/21560
Submitter: Ignatious Johnson Christopher (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/21561
Submitter: Ignatious Johnson Christopher (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/21563
Submitter: Ignatious Johnson Christopher (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/21565
Submitter: Ignatious Johnson Christopher (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/21566
Submitter: Ignatious Johnson Christopher (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/21560
Committed: http://github.org/Juniper/contrail-provisioning/commit/9d1a01ae2b06809947fdce961a43d5fef3fd773f
Submitter: Zuul
Branch: master

commit 9d1a01ae2b06809947fdce961a43d5fef3fd773f
Author: Ignatious Johnson Christopher <email address hidden>
Date: Wed Jun 29 15:53:02 2016 -0700

No need to specify HWADDR in the vlan interface configuration file.
It dynamically gets it from the parent interface.

Change-Id: I40d24e116787cb92018307a3db62f213807b4c95
Closes-Bug: 1593200

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0.2.x

Review in progress for https://review.opencontrail.org/21684
Submitter: Ignatious Johnson Christopher (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/21561
Committed: http://github.org/Juniper/contrail-provisioning/commit/917ab23bca9673879fc3dbc6b16863127061a646
Submitter: Zuul
Branch: R3.0

commit 917ab23bca9673879fc3dbc6b16863127061a646
Author: Ignatious Johnson Christopher <email address hidden>
Date: Wed Jun 29 16:29:17 2016 -0700

No need to specify HWADDR in the vlan interface configuration file.
It dynamically gets it from the parent interface.

Change-Id: I1c797c8b8e2340271cdec837c7aeb8ce8bcd027b
Closes-Bug: 1593200

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/21565
Committed: http://github.org/Juniper/contrail-provisioning/commit/f47a4c1ba13ffc5c0919ff908d9c90a47d15c19a
Submitter: Zuul
Branch: R2.22.x

commit f47a4c1ba13ffc5c0919ff908d9c90a47d15c19a
Author: Ignatious Johnson Christopher <email address hidden>
Date: Wed Jun 29 15:53:02 2016 -0700

No need to specify HWADDR in the vlan interface configuration file.
It dynamically gets it from the parent interface.

Change-Id: I40d24e116787cb92018307a3db62f213807b4c95
Closes-Bug: 1593200

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/21566
Committed: http://github.org/Juniper/contrail-provisioning/commit/14c9c18534c2862d02dcd6f482b43e1896c63fa0
Submitter: Zuul
Branch: R2.20

commit 14c9c18534c2862d02dcd6f482b43e1896c63fa0
Author: Ignatious Johnson Christopher <email address hidden>
Date: Wed Jun 29 15:53:02 2016 -0700

No need to specify HWADDR in the vlan interface configuration file.
It dynamically gets it from the parent interface.

Change-Id: I40d24e116787cb92018307a3db62f213807b4c95
Closes-Bug: 1593200

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/21684
Committed: http://github.org/Juniper/contrail-provisioning/commit/2b658db4f50fbc4cb3f4975c2b3cce75eb76d227
Submitter: Zuul
Branch: R3.0.2.x

commit 2b658db4f50fbc4cb3f4975c2b3cce75eb76d227
Author: Ignatious Johnson Christopher <email address hidden>
Date: Wed Jun 29 16:29:17 2016 -0700

No need to specify HWADDR in the vlan interface configuration file.
It dynamically gets it from the parent interface.

Change-Id: I1c797c8b8e2340271cdec837c7aeb8ce8bcd027b
Closes-Bug: 1593200

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/21563
Committed: http://github.org/Juniper/contrail-provisioning/commit/a4f5041a66df48e6c30284a1a7688e09e862f707
Submitter: Zuul
Branch: R2.21.x

commit a4f5041a66df48e6c30284a1a7688e09e862f707
Author: Ignatious Johnson Christopher <email address hidden>
Date: Wed Jun 29 15:53:02 2016 -0700

No need to specify HWADDR in the vlan interface configuration file.
It dynamically gets it from the parent interface.

Change-Id: I40d24e116787cb92018307a3db62f213807b4c95
Closes-Bug: 1593200
