VMs with no mgmt IP assigned when data I/F is ae balanced

Bug #1797392 reported by Nimalini Rasa
Affects    Status   Importance  Assigned to     Milestone
StarlingX  Invalid  Medium      Steven Webster

Bug Description

Brief Description
-----------------
VMs do not get a management IP assigned when the data interface is an aggregated Ethernet (AE) interface in balanced mode.

Severity
--------
Major

Steps to Reproduce
------------------
1. boot up 10 VMs on a system with ae balanced data i/f
2. ping VMs

Expected Behavior
------------------
All VMs are reachable

Actual Behavior
----------------
Ping failed

Reproducibility
---------------
yes

System Configuration
--------------------
2+4+20 system
Compute node configured with AE balanced mode data port. The corresponding TOR switch ports are configured as a LAG as well.

Branch/Pull Time/Commit
-----------------------
stx.2018.10 release branch build as of 2018-10-01_20-18-00

Timestamp/Logs
--------------
Wed Oct 3 20:10:21 2018: (6/208) 192.168.136.8 is not reachable on initial ping
Wed Oct 3 20:10:21 2018: (6/208) 192.168.136.23 is not reachable on initial ping
Wed Oct 3 20:10:21 2018: (0/208) 192.168.136.6 is not reachable on initial ping
Wed Oct 3 20:10:21 2018: (9/208) 192.168.136.9 is not reachable on initial ping
Wed Oct 3 20:10:21 2018: (9/208) 192.168.136.21 is not reachable on initial ping
Wed Oct 3 20:10:21 2018: (9/208) 192.168.136.4 is not reachable on initial ping
Wed Oct 3 20:10:21 2018: (9/208) 192.168.136.27 is not reachable on initial ping
Wed Oct 3 20:10:21 2018: (20/208) 192.168.136.13 is not reachable on initial ping
Wed Oct 3 20:10:21 2018: (30/208) 192.168.136.3 is not reachable on initial ping
Wed Oct 3 20:10:21 2018: (62/208) 192.168.136.11 is not reachable on initial ping
Wed Oct 3 20:10:21 2018: (83/208) 192.168.136.20 is not reachable on initial ping

Ghada Khalil (gkhalil) wrote :

This may be an interaction with the ToR switch. The issue can be avoided by setting up the AE with LACP. Needs further investigation. Targeting stx.2019.03.

Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
tags: added: stx.2019.03 stx.networking
Changed in starlingx:
assignee: nobody → Steven Webster (swebster-wr)
Steven Webster (swebster-wr) wrote :

It looks like OVS recommends not configuring the upstream hardware switch ports as a bond at all for balance-slb:

"On the upstream switch, do not configure the interfaces as a bond"

man ovs-vswitchd.conf.db

...
...
       The following types of bonding will work with any kind of upstream switch. On the upstream switch, do not configure the interfaces as a bond:

              balance-slb
                     Balances flows among slaves based on source MAC address and output VLAN, with periodic rebalancing as traffic patterns change.

              active-backup
                     Assigns all flows to one slave, failing over to a backup slave when the active slave is disabled. This is the only bonding mode in which interfaces may be plugged into different upstream switches.
...
...
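
As an illustration of the balance-slb description quoted above, here is a minimal Python sketch of choosing a slave from the source MAC address and output VLAN. It is not OVS code: the `pick_slave` helper, the CRC32 hash, and the interface names are assumptions made for the example, and the periodic rebalancing mentioned in the man page is not modelled.

```python
import zlib


def pick_slave(src_mac: str, vlan: int, slaves: list[str]) -> str:
    """Map a (source MAC, VLAN) pair to one slave of the bond.

    Hypothetical helper for illustration only; not taken from OVS.
    """
    if not slaves:
        raise ValueError("bond has no slaves")
    # Normalise the MAC so the same address always produces the same key.
    key = f"{src_mac.lower()}/{vlan}".encode()
    # Deterministic hash, so a given MAC/VLAN pair always uses the same link.
    return slaves[zlib.crc32(key) % len(slaves)]


slaves = ["eth0", "eth1"]  # assumed interface names
for mac in ("fa:16:3e:00:00:01", "fa:16:3e:00:00:02", "fa:16:3e:00:00:03"):
    print(mac, "->", pick_slave(mac, 100, slaves))
```

In this model every frame from a given source MAC and VLAN leaves on the same link, which fits the man page statement that balance-slb works with any upstream switch without a switch-side bond.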

Steven Webster (swebster-wr) wrote :

From the OVS docs:

"Open vSwitch avoids packet duplication by accepting multicast and broadcast packets on only the active slave, and dropping multicast and broadcast packets on all other slaves."

This is indeed what was seen in the lab. So it's likely the upstream switch was only sending the broadcast down one link of the bond, and that link happened to be the inactive slave.
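
The failure mode described above can be sketched as a small Python model. It encodes only the rule quoted from the OVS docs (broadcast and multicast accepted on the active slave, dropped elsewhere). The `bond_delivers` function and the interface names are hypothetical, and treating unicast as always accepted is a simplification.

```python
def bond_delivers(is_broadcast: bool, rx_slave: str, active_slave: str) -> bool:
    """Return True if the bond passes a received frame up to the bridge.

    Models only the quoted rule: multicast/broadcast frames are accepted
    on the active slave and dropped on every other slave.
    """
    if is_broadcast:
        return rx_slave == active_slave
    return True  # simplification: unicast handling is not modelled


active = "eth0"  # assumed: eth0 is the bond's active slave
# With a switch-side LAG, the switch sends each broadcast down one member link.
for lag_choice in ("eth0", "eth1"):
    outcome = "delivered" if bond_delivers(True, lag_choice, active) else "dropped"
    print(f"broadcast via {lag_choice}: {outcome}")
```

When the switch ports are not in a LAG, the switch floods a broadcast to both ports, so one copy always arrives on the active slave.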

Ghada Khalil (gkhalil)
Changed in starlingx:
status: Triaged → In Progress
Ghada Khalil (gkhalil)
description: updated
Ghada Khalil (gkhalil) wrote :

The next step is to configure the switch ports for systems with AE balanced mode based on the recommendations in the OVS documentation.

description: updated
Ken Young (kenyis)
tags: added: stx.2019.05
removed: stx.2019.03
Ghada Khalil (gkhalil) wrote :

It was confirmed that with the switch ports un-LAG'ed, there do not seem to be any issues for OVS in balance-slb mode. In summary, this issue was resolved by a network configuration change in the lab. It's not a software issue.

Changed in starlingx:
status: In Progress → Invalid
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05