Traffic controls limit bandwidth when infra consolidated with mgmt

Bug #1807055 reported by Ghada Khalil
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Steven Webster

Bug Description

Brief Description
-----------------
On systems configured with a 10G mgmt network and an infrastructure network configured as a vlan, traffic controls limit the bandwidth on the link. This can cause live migrations of large/busy VMs to fail intermittently due to reduced infrastructure network bandwidth.

Severity
--------
Major

Steps to Reproduce
------------------
On system with a 10G mgmt network and the infra network defined as a vlan on top of the mgmt interface, run the iperf tool on the infra interface to determine the link bandwidth.
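The measurement step can be sketched as below. This is only an illustration: the helper name `build_iperf_cmd` and the addresses are hypothetical placeholders, and it assumes an iperf server (`iperf -s`) is already listening on the peer's infra address. The helper prints the client command rather than running it, so it can be checked before use.

```shell
#!/bin/bash
# Hypothetical helper (not part of StarlingX): build an iperf client command.
#   $1 = infra address of the peer that is running "iperf -s"
#   $2 = local infra address to bind to, so the test uses the infra vlan
#   $3 = test duration in seconds (defaults to 30)
build_iperf_cmd() {
    local peer="$1"
    local local_addr="$2"
    local duration="${3:-30}"
    # -c: client mode, -B: bind to local address, -t: duration,
    # -f M: report results in MBytes/sec
    echo "iperf -c ${peer} -B ${local_addr} -t ${duration} -f M"
}

# Placeholder addresses from the documentation range; substitute the real
# infra addresses of the two hosts.
build_iperf_cmd 192.0.2.3 192.0.2.2
```

Binding to the local infra address is one way to make sure the traffic leaves through the vlan interface rather than the untagged mgmt interface.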

Expected Behavior
------------------
iperf should report around 1 GB/s of bandwidth

Actual Behavior
----------------
iperf reports around 236 MB/s of bandwidth

Reproducibility
---------------
100% Reproducible in this configuration.
Note: The bandwidth is as expected when testing with a dedicated (non-vlan) infra interface.

System Configuration
--------------------
Multi-node system with a 10G mgmt network and the infra network defined as a vlan on top of the mgmt interface

Branch/Pull Time/Commit
-----------------------
any

Timestamp/Logs
--------------
Not Required. Issue is reproducible.

Ghada Khalil (gkhalil)
summary: - Intermittent live migration failures due to reduced infrastructure network bandwidth
+ Traffic controls limit bandwidth when infra consolidated with mgmt
Changed in starlingx:
assignee: nobody → Steven Webster (swebster-wr)
status: New → Triaged
importance: Undecided → High
tags: added: stx.2019.03 stx.networking
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-integ (master)

Fix proposed to branch: master
Review: https://review.openstack.org/623571

Changed in starlingx:
status: Triaged → In Progress
Ghada Khalil (gkhalil) wrote :

The gerrit review was merged on Dec 11/2018
https://review.openstack.org/#/c/623571/

Marking this launchpad as "Fix Released" (not sure why the status was not updated automatically by gerrit)

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-integ (f/centos76)

Fix proposed to branch: f/centos76
Review: https://review.openstack.org/625068

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-integ (f/centos76)

Reviewed: https://review.openstack.org/625068
Committed: https://git.openstack.org/cgit/openstack/stx-integ/commit/?id=e25c1acc9b5c77f1bab00288d74ca6df0a8640c3
Submitter: Zuul
Branch: f/centos76

commit 920fcb818c3dd8b0945e6d7bd2371dfb71790f60
Author: zhipengl <email address hidden>
Date: Wed Dec 12 19:42:40 2018 +0800

    Remove last patch of iscsi-initiator-utils

    As we see in the patch, it changes %dir to %ghost to avoid RPM audit.
    If we move the config file mod change to config package and use RPM
    instead of SRPM, we have no audit issue anymore and can ignore related
    change.
    Deployment test pass and related file check pass!

    Story: 2003768
    Task: 28459
    Depends-on: https://review.openstack.org/#/c/624584/

    Change-Id: Ic23ccd740520e1942b3118a84cb03aef5f388332
    Signed-off-by: zhipengl <email address hidden>

commit 52bef031ac6e52c73a0a6a680b0ef31b99baac71
Author: Alex Kozyrev <email address hidden>
Date: Tue Dec 11 13:42:07 2018 -0500

    Provide a way to set mem_stats_period_seconds in puppet-nova.

    There is no support of mem_stats_period_seconds in puppet-nova now.
    We need to add a way to set it to 0 to disable QEMU memory balloon statistics.
    The intention is to help with cyclictest spikes due to stats collection.

    Depends-On: Iaea1962601755736688f2deb61730ab1d548b8b1
    Change-Id: I1fe3dfede1a5a07ddb5adaff1095206ffe5f6340
    Closes-bug: 1803615
    Signed-off-by: Alex Kozyrev <email address hidden>

commit 01f5fdd274ac0bc02528b4630dacaf3ca10eb27a
Author: Steven Webster <email address hidden>
Date: Wed Dec 5 15:29:33 2018 -0500

    Traffic control: fix TC filters for vlan sub-interface

    Sometime after kernel 3.10.0-514.16.1.X, tc filter commands no longer
    match 802.1q packets when the filter protocol is set to 'ip'.

    This poses a problem for a consolidated (eg. infra w/ vlan over
    management) interface configuration.

    The tc filter will operate properly on the vlan interface, but all
    traffic will go to the default qdisc (low priority) when it arrives
    with a vlan tag at the sub-interface.

    This commit sets the filter protocol to '802.1q' in the case of a
    subinterface with a vlan tagged interface ontop of it.

    Some bashate cleanup has also been done on this file.

    Closes-Bug: #1807055
    Change-Id: I457faa2b56bbd270c104cc0313ffe3cc1bfd4db3
    Signed-off-by: Steven Webster <email address hidden>

commit 2ec4482fc766bd583df422c2df5939a2707c7996
Author: zhipengl <email address hidden>
Date: Tue Dec 11 22:51:33 2018 +0800

    Refactor meta patch for facter package

    Merge 2 meta patches as the first meta patch is just overwritted by
    second one.
    Build pass!

    Story: 2003768
    Task: 28458

    Change-Id: I02ccadafa5381c82bcace340f6c399af38aeecc7
    Signed-off-by: zhipengl <email address hidden>

commit 11a4f7a6964bd96f22a02f3394fc2d62447480fa
Author: Eric MacDonald <email address hidden>
Date: Mon Dec 10 19:02:18 2018 -0500

    Package log_functions.sh into platform-util

    The log_functions.sh script file wa...


tags: added: in-f-centos76
Ghada Khalil (gkhalil) wrote :

Re-opening as a similar fix is required for remotelogging_tc_setup.sh

Changed in starlingx:
status: Fix Released → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to stx-integ (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/625784

OpenStack Infra (hudson-openstack) wrote : Related fix merged to stx-integ (master)

Reviewed: https://review.openstack.org/625784
Committed: https://git.openstack.org/cgit/openstack/stx-integ/commit/?id=61b8055a14f61851b9f70c76849bbb4f8f28ed55
Submitter: Zuul
Branch: master

commit 61b8055a14f61851b9f70c76849bbb4f8f28ed55
Author: Steven Webster <email address hidden>
Date: Mon Dec 17 12:22:48 2018 -0500

    Fix remote logging traffic control filter priority

    Previous commit 01f5fdd made a required change to filter
    infrastructure traffic on the management interface with an 802.1q
    protocol in the case of a consolidated interface.

    However, this has caused the remote logging tc script to have a
    failure. The script tries to install 'ip' protocol filters at the
    same priority as the 802.1q filters, which is rejected by the
    kernel.

    This commit detects a consolidated interface situation and bumps
    the priority of the remote logging tc filter priority on the
    management interface, similarly to what is done in the main
    cgcs_tc_setup script.

    The file has also been cleaned up to pass bashate.

    Related-Bug: #1807055
    Change-Id: Id11625c0f9bcbf109f574563ff284d4a36bc6377
    Signed-off-by: Steven Webster <email address hidden>
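The two commit messages describe a pair of decisions: use the '802.1q' filter protocol on an interface that carries a vlan-tagged interface, and keep 'ip' filters from a second script off the priority already used by the 802.1q filters. A minimal sketch of that logic follows. The function names, the `parent 1:0` handle, and the "bump by one" rule are assumptions made for illustration and are not taken from cgcs_tc_setup or remotelogging_tc_setup.sh; the match expression is left out on purpose.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not the StarlingX scripts.

# Print the tc filter protocol for an interface.
#   $1 = "yes" if a vlan-tagged interface sits on top of this interface
filter_protocol() {
    if [ "$1" = "yes" ]; then
        echo "802.1q"
    else
        echo "ip"
    fi
}

# Print the priority for 'ip' filters installed by a second script.
#   $1 = base priority, $2 = "yes" on a consolidated interface
# Assumption: the bump is base + 1, so the 'ip' filters do not share a
# priority with the 802.1q filters.
ip_filter_prio() {
    local base="$1"
    if [ "$2" = "yes" ]; then
        echo $((base + 1))
    else
        echo "${base}"
    fi
}

# Print the leading part of a tc filter command (without a match clause).
#   $1 = device, $2 = protocol, $3 = priority
tc_filter_prefix() {
    echo "tc filter add dev $1 parent 1:0 protocol $2 prio $3"
}

# Consolidated case: main filters use 802.1q at prio 1, and the
# second script's 'ip' filters move to the bumped priority.
tc_filter_prefix eth0 "$(filter_protocol yes)" 1
tc_filter_prefix eth0 ip "$(ip_filter_prio 1 yes)"
```

The helpers only print command prefixes, so the sketch can be run on any host without touching the live qdisc configuration.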

Ghada Khalil (gkhalil) wrote :

Additional commit merged to stx master on Dec 19. Setting the status to "Fix Released"

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to stx-integ (f/centos76)

Related fix proposed to branch: f/centos76
Review: https://review.openstack.org/626688

OpenStack Infra (hudson-openstack) wrote : Related fix merged to stx-integ (f/centos76)

Reviewed: https://review.openstack.org/626688
Committed: https://git.openstack.org/cgit/openstack/stx-integ/commit/?id=26cb76275997f060064d439e050384bead77b21b
Submitter: Zuul
Branch: f/centos76

commit acc1863b269fa974cd6c19b31c224dd88154e09d
Author: zhipengl <email address hidden>
Date: Tue Dec 11 00:30:56 2018 +0800

    Refactor source code patches for dhcp package

    3 source patches can be removed.
    2 patches adds support for wrs_install_uuid in the dhclient script.
    This added script part just copy the whole content of dhclient-enter-hooks.
    Following this script part, it will call this hook script if the hook
    exist under /etc/. However, our hook file existed in /etc/dhcp/ folder will
    be called by sbin/dhclient-script as well. I'd like to use dhcp config
    package to creat /etc/dhclient-enter-hooks soft linked to
    /etc/dhcp/dhclient-enter-hooks, so that it can call dhclient script and
    no need to add this 2 patches.

    Support-disable-nsupdate.patch can be removed as we already fixed port
    conflict issue in https://review.openstack.org/#/c/622711/

    Deployment test pass and related script file check pass!

    Story: 2004473
    Task: 28164

    Change-Id: If50ae697062a7d0c8a2831fbcc0f5641aaa41ec7
    Signed-off-by: zhipengl <email address hidden>

commit 61b8055a14f61851b9f70c76849bbb4f8f28ed55
Author: Steven Webster <email address hidden>
Date: Mon Dec 17 12:22:48 2018 -0500

    Fix remote logging traffic control filter priority

    Previous commit 01f5fdd made a required change to filter
    infrastructure traffic on the management interface with an 802.1q
    protocol in the case of a consolidated interface.

    However, this has caused the remote logging tc script to have a
    failure. The script tries to install 'ip' protocol filters at the
    same priority as the 802.1q filters, which is rejected by the
    kernel.

    This commit detects a consolidated interface situation and bumps
    the priority of the remote logging tc filter priority on the
    management interface, similarly to what is done in the main
    cgcs_tc_setup script.

    The file has also been cleaned up to pass bashate.

    Related-Bug: #1807055
    Change-Id: Id11625c0f9bcbf109f574563ff284d4a36bc6377
    Signed-off-by: Steven Webster <email address hidden>

commit 4dd1d96eddc84433ee3f6cf6f61db5b71a2d3b4c
Author: zhipengl <email address hidden>
Date: Sat Dec 15 01:34:18 2018 +0800

    Fix SFTP service is not working issue

    The root cause is that sftp path in sshd_config is not right.
    It should be changed from /usr/libexec/sftp-server
    to /usr/libexec/openssh/sftp-server

    Verified in my deployment environment
    sftp can connect to controller.

    Closes-Bug: 1808054

    Change-Id: Ia8d00abc1f18bc3b46faadd87f8ed153a446b7b0
    Signed-off-by: zhipengl <email address hidden>

commit 43514ea7fbd18d518511a165b59c82b7e20ebd8d
Author: Kwan, Louie <email address hidden>
Date: Wed Dec 12 15:54:30 2018 -0500

    [Enhancement] Add system active alarms in collect logs

    Currently the collect tool does not c...


Ken Young (kenyis)
tags: added: stx.2019.05
removed: stx.2019.03
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05