Traffic control sets 1G mgmt network to 10G

Bug #1799486 reported by Yang Liu
This bug affects 1 person
Affects    Status   Importance  Assigned to  Milestone
StarlingX  Invalid  Medium      Ran An

Bug Description

Brief Description
-----------------
Traffic control sets the mgmt network rate to 10G even though both the actual link capacity and the config controller ini file specify 1G.

Severity
--------
Major

Steps to Reproduce
------------------
1. Configure a system with 1G mgmt interface
mgmt interface in config.ini:
[LOGICAL_INTERFACE_1]
LAG_INTERFACE = N
INTERFACE_MTU = 1500
INTERFACE_LINK_CAPACITY = 1000
INTERFACE_PORTS = eno2

[MGMT_NETWORK]
CIDR = 192.168.204.0/24
MULTICAST_CIDR=239.1.1.0/28
DYNAMIC_ALLOCATION = Y
LOGICAL_INTERFACE = LOGICAL_INTERFACE_1

[wrsroot@controller-0 ~(keystone_admin)]$ cat /sys/class/net/eno2/speed
1000

2. Check traffic control for mgmt network
[wrsroot@controller-0 ~(keystone_admin)]$ tc class show dev eno2
class htb 1:40 parent 1:1 leaf 40: prio 4 rate 1Gbit ceil 10Gbit burst 15125b cburst 0b
class htb 1:10 parent 1:1 leaf 10: prio 3 rate 1Gbit ceil 10Gbit burst 15125b cburst 0b
class htb 1:1 root rate 10Gbit ceil 10Gbit burst 13750b cburst 0b
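
For test automation, the root class rate can be pulled out of the `tc class show` output and compared against the link speed. The following is only a minimal sketch: the function name `root_class_rates_mbit` is hypothetical, and it assumes the output format shown above (an HTB root class line with Kbit/Mbit/Gbit suffixes).

```python
import re

# Multipliers to Mbit/s for the rate suffixes seen in the output above.
_UNIT_TO_MBIT = {"Kbit": 0.001, "Mbit": 1.0, "Gbit": 1000.0}

# Matches e.g. "class htb 1:1 root rate 10Gbit ceil 10Gbit ..."
_ROOT_CLASS = re.compile(
    r"^class htb \S+ root rate (\d+(?:\.\d+)?)([KMG]bit) "
    r"ceil (\d+(?:\.\d+)?)([KMG]bit)"
)


def root_class_rates_mbit(tc_output):
    """Return (rate, ceil) in Mbit/s for the HTB root class, or None."""
    for line in tc_output.splitlines():
        match = _ROOT_CLASS.match(line.strip())
        if match:
            rate = float(match.group(1)) * _UNIT_TO_MBIT[match.group(2)]
            ceil = float(match.group(3)) * _UNIT_TO_MBIT[match.group(4)]
            return rate, ceil
    return None  # no root class line found
```

Applied to the output above, the returned rate can be compared directly with the value in /sys/class/net/eno2/speed, which is also in Mbit/s.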

Expected Behavior
------------------
The root rate and ceiling rate should be 1G instead of 10G,
e.g.,
class htb 1:1 root rate 1000Mbit ceil 1000Mbit burst 15125b cburst 1375b

Actual Behavior
----------------
- class htb 1:1 root rate 10Gbit ceil 10Gbit burst 13750b cburst 0b
- Also in puppet log, cgcs_tc_setup.sh was setting it to 10G:
2018-10-21T00:45:29.067 Notice: 2018-10-21 00:45:29 +0000 /Stage[main]/Platform::Interfaces/Network_config[eno2]/options: defined 'options' as 'LINKDELAY => 20, post_up => /usr/local/bin/cgcs_tc_setup.sh eno2 mgmt 10000 > /dev/null'
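
The speed handed to cgcs_tc_setup.sh can also be extracted from such a log line programmatically. This is a rough sketch only: `tc_setup_args` is a hypothetical helper, and it assumes the argument order seen in the line above (interface, network type, speed in Mbit/s).

```python
import re

# Matches "cgcs_tc_setup.sh <interface> <network type> <speed>"
_TC_SETUP = re.compile(r"cgcs_tc_setup\.sh\s+(\S+)\s+(\S+)\s+(\d+)")


def tc_setup_args(log_line):
    """Return (interface, network_type, speed_mbit), or None if absent."""
    match = _TC_SETUP.search(log_line)
    if match is None:
        return None
    return match.group(1), match.group(2), int(match.group(3))
```

For the puppet.log line above this yields the interface eno2, the network type mgmt, and the speed 10000.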

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Any system with 1G mgmt interface and 2+ nodes.

Branch/Pull Time/Commit
-----------------------
stx.18.10 as of "2018-10-19_01-52-00", as well as stx master.

Timestamp/Logs
--------------
puppet.log:
2018-10-21T00:45:29.067 Notice: 2018-10-21 00:45:29 +0000 /Stage[main]/Platform::Interfaces/Network_config[eno2]/options: defined 'options' as 'LINKDELAY => 20, post_up => /usr/local/bin/cgcs_tc_setup.sh eno2 mgmt 10000 > /dev/null'

Ghada Khalil (gkhalil)
summary: - STX: Traffic control sets 1G mgmt network to 10G
+ Traffic control sets 1G mgmt network to 10G
Ghada Khalil (gkhalil)
tags: added: stx.networking
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
Ghada Khalil (gkhalil) wrote :

Targeting stx.2019.03 - this will have no impact unless the system is in an overload condition. It is also specific to a particular configuration (1G mgmt network). Therefore, it is not serious enough to block stx.2018.10

tags: added: stx.2019.03
Changed in starlingx:
status: New → Triaged
assignee: nobody → Bruce Jones (brucej)
Ghada Khalil (gkhalil) wrote :

Please include <email address hidden> on the gerrit review

Yang Liu (yliu12) wrote :

controller-0 logs attached.

Bruce Jones (brucej) wrote :

Cindy please assign an engineer to work this bug, thanks! Consult with Forrest and his team if needed.

Changed in starlingx:
assignee: Bruce Jones (brucej) → Cindy Xie (xxie1)
Ran An (an.ran)
Changed in starlingx:
assignee: Cindy Xie (xxie1) → Ran An (an.ran)
Ran An (an.ran)
Changed in starlingx:
status: Triaged → Confirmed
Ran An (an.ran) wrote :

1. This has been reproduced by manually configuring the system with a 1G mgmt interface.
2. An in-depth study of config controller and puppet is under way.

Ran An (an.ran) wrote :

In the latest code (master), the network speed is hard-coded to 10G.
I will provide a fix patch later.

Changed in starlingx:
status: Confirmed → In Progress
Ran An (an.ran) wrote :

After discussion with the TL of stx-config: setting the network speed to 10G is hard-coded by design, as specified in the story https://storyboard.openstack.org/#!/story/2003087.

Further discussion is ongoing.

Ran An (an.ran)
Changed in starlingx:
status: In Progress → Invalid
status: Invalid → Opinion
Ran An (an.ran) wrote :

After discussion with the TL of stx-config, we will leave this bug open to "adjust the behavior of the script to use the hard coded value as a fallback value if it can’t determine the actual link speed."

Since we no longer support explicit configuration of the mgmt network speed, we will remove the related configuration code; that work is tracked by https://bugs.launchpad.net/starlingx/+bug/1805320

Changed in starlingx:
assignee: Ran An (an.ran) → nobody
Ghada Khalil (gkhalil)
Changed in starlingx:
status: Opinion → Confirmed
Ghada Khalil (gkhalil) wrote :

The TC script still needs to be updated to use the hard-coded value as a fall-back if it can't determine the actual link state. Assigning back to Ran to implement this solution.
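
The requested behavior (prefer the actual link speed, fall back to the hard-coded value when it cannot be determined) could look roughly like the sketch below. This is not the cgcs_tc_setup.sh implementation, which is a shell script; the function name, the constant name, and the `sysfs_root` parameter are hypothetical and only illustrate the logic.

```python
from pathlib import Path

# Hard-coded value discussed in this bug, in Mbit/s.
FALLBACK_SPEED_MBIT = 10000


def effective_link_speed(dev, fallback=FALLBACK_SPEED_MBIT,
                         sysfs_root="/sys/class/net"):
    """Return the link speed of dev in Mbit/s, or fallback if unknown."""
    speed_file = Path(sysfs_root) / dev / "speed"
    try:
        speed = int(speed_file.read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not a number: speed is unknown.
        return fallback
    # A non-positive value (e.g. -1 while the link is down) is also unknown.
    return speed if speed > 0 else fallback
```

With this logic, a system whose /sys/class/net/eno2/speed reads 1000 would get 1000, and 10000 would be used only when the file cannot be read or holds no positive value.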

Changed in starlingx:
assignee: nobody → Ran An (an.ran)
Ghada Khalil (gkhalil) wrote :

Just to clarify, the acronym "TC" above is referring to Traffic Controls, not Test Case.

Ken Young (kenyis)
tags: added: stx.2019.05
removed: stx.2019.03
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05
Ghada Khalil (gkhalil)
tags: added: stx.retestneeded
Ran An (an.ran) wrote :

1. This issue is by design.
2. As for "The TC script still needs to be updated to use the hard-coded value as a fall-back if it can't determine the actual link state" mentioned in #9, the current code already meets this requirement.

Changed in starlingx:
status: Confirmed → Invalid
Yang Liu (yliu12) wrote :

Hi Ran, could you please let me know how to check whether it failed to determine the actual link state and therefore used a fallback value? That is, does any command or log indicate this? We can then update the test logic accordingly.

Ran An (an.ran) wrote :

Hi Yang,
Please check the Linux file "/etc/sysconfig/network-scripts/<mgmt net>:1" and see whether the value of "post_up" is "/usr/local/bin/cgcs_tc_setup.sh <mgmt net> mgmt 10000 > /dev/null".
If so, then whenever the actual link speed is less than 10000, the fallback value 10000 is used to set traffic control on <mgmt net>.

However, I did not find any logs about the traffic control settings in my environment.

Yang Liu (yliu12) wrote :

Ok, so I will update the testcase to consider any <10G mgmt link speed as 10G.
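
The updated test expectation could be expressed roughly as follows. This is a sketch only: the names are hypothetical, and the branch for links of 10G or faster is an assumption, since this thread only describes the behavior for links slower than 10G.

```python
# Hard-coded mgmt speed passed to cgcs_tc_setup.sh, in Mbit/s.
HARD_CODED_MGMT_SPEED_MBIT = 10000


def expected_mgmt_tc_rate_mbit(actual_link_speed_mbit):
    """Return the root rate the test should expect, in Mbit/s."""
    if actual_link_speed_mbit < HARD_CODED_MGMT_SPEED_MBIT:
        # Per the comments above: any mgmt link below 10G is treated as 10G.
        return HARD_CODED_MGMT_SPEED_MBIT
    # ASSUMPTION: links at or above 10G are expected at their actual speed.
    return actual_link_speed_mbit
```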

tags: removed: stx.retestneeded