Low network performance in Shaker north-south scenario

Bug #1593752 reported by Ilya Shakhat
Affects               Status          Importance   Assigned to        Milestone
Mirantis OpenStack    Status tracked in 10.0.x
  10.0.x              Confirmed       High         MOS Neutron
  9.x                 Fix Released    High         Alexander Adamov

Bug Description

The Shaker north-south scenario (openstack/perf_l3_north_south) measures throughput between two instances located in different networks plugged into different routers. The source instance has a private address only; the destination is accessed via a floating IP. Measured throughput is on the order of tens of kilobytes per second instead of gigabits per second.

Description of the environment: MOS 9, ISO 415 and ISO 481, Neutron ML2, VxLAN, DVR, 6-node cluster. However, the issue was not present on ISO 401 deployed on KVM.

The same can be reproduced manually with iperf3. The throughput between an instance with a floating IP and another instance with a floating IP is good.

Steps to reproduce: run the Shaker openstack/perf_l3_north_south scenario (a command-line sketch is given at the end of this description)

Expected results: throughput should be on the order of 1 Gbit/s for a 1 Gbit/s interface

Actual result:
ubuntu@shaker-aompes-master-1:~$ iperf3 -c 172.20.157.25
Connecting to host 172.20.157.25, port 5201
[ 4] local 10.1.0.6 port 57555 connected to 172.20.157.25 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 54.6 KBytes 447 Kbits/sec 2 1.37 KBytes
[ 4] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec 8 9.56 KBytes
[ 4] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec 0 9.56 KBytes

Reproducibility: always

Workaround: n/a

Impact: almost no network connectivity from the instance
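
For illustration, a minimal sketch of how the scenario can be launched from a host with Shaker installed and OpenStack credentials sourced; the endpoint address and report file name below are placeholders, not values from this environment:

$ source openrc
$ shaker --server-endpoint 172.20.156.1:5999 \
    --scenario openstack/perf_l3_north_south \
    --report report.html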

Ilya Shakhat (shakhat)
Changed in mos:
milestone: none → 9.0
assignee: nobody → MOS Neutron (mos-neutron)
importance: Undecided → Critical
Revision history for this message
Alexander Ignatov (aignatov) wrote :

Work is in progress.

Changed in mos:
status: New → Confirmed
Ilya Shakhat (shakhat)
description: updated
Dina Belova (dbelova)
tags: added: area-neutron
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

User impact:

Bandwidth of 10 Kb per second from VMs to the Internet is not acceptable at all.

The priority is Critical because such low network performance is not acceptable for a MOS release.

Revision history for this message
Dina Belova (dbelova) wrote :

Reproduced on #481 ISO as well.

Ilya Shakhat (shakhat)
description: updated
Revision history for this message
Ilya Shakhat (shakhat) wrote :

On KVM-based deployment with ISO 495 the issue is not reproduced:

================= ======== ======== ========
Metric Min Avg Max
================= ======== ======== ========
bandwidth, Mbit/s 139.99 297.92 419.79
retransmits 9 332
================= ======== ======== ========

Revision history for this message
Oleg Bondarev (obondarev) wrote :

So on the Intel lab we see poor network performance when running iperf from a VM without a floating IP against a public iperf server (https://iperf.fr/iperf-servers.php).
VMs with a floating IP (DVR router) have no such issue (in this case traffic goes to the external network directly from the compute node).

For non-DVR routers the problem exists for both floating-IP and non-floating-IP VMs (in this case traffic goes through the controller either way).
So we can say this is not a DVR issue.

I also tried to run iperf directly from router namespaces on controllers against public iperf servers: it works fine, bandwidth is ok.
I also tried to run iperf from VMs against the router gateway on the controller: it also works fine in all cases (floating/non-floating, DVR/non-DVR).
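
For reference, a rough sketch of these checks; names and addresses are placeholders, and the namespace is assumed to be qrouter-<router-id> (or snat-<router-id> for a DVR router on the controller):

root@node-1:~# ip netns                                            # find the router namespace
root@node-1:~# ip netns exec qrouter-<router-id> iperf3 -c <public-iperf-server>
ubuntu@vm:~$ iperf3 -c <router-gateway-ip>                         # from the VM against the router gateway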

Revision history for this message
Eugene Nikanorov (enikanorov) wrote :

Turning TX offload off on the physical interface of the external network fixes the issue completely.

This could have been done at the point of deployment using the Fuel UI, which has specific checkboxes for NIC features.

Considering this, the bug is not critical because it has both pre- and post-deployment workarounds.

Revision history for this message
Alexander Ignatov (aignatov) wrote :

Moved to High as per comment #6.

Revision history for this message
Elena Ezhova (eezhova) wrote :

It turns out that it is enough to turn off TSO; the TX on / TSO off combination works fine.
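
For illustration, a sketch of how this combination can be checked and applied with ethtool on a controller, assuming eth1 is the public interface (the interface name is a placeholder):

root@node-1:~# ethtool -k eth1 | egrep 'tcp-segmentation-offload|tx-checksumming'
root@node-1:~# ethtool -K eth1 tso off   # turn TSO off
root@node-1:~# ethtool -K eth1 tx on     # keep TX checksum offload on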

Revision history for this message
Ilya Shakhat (shakhat) wrote :

The issue affects egress traffic only and is most probably caused by enforced checksum verification (packets with a wrong checksum don't pass through).

TCP egress
==========
    bandwidth:
      avg: 0.049431371688842776
      max: 0.4261589050292969
      min: 0
      unit: Mbit/s

TCP ingress
===========
    bandwidth:
      avg: 4052.971162160238
      max: 7042.360305786133
      min: 2267.122268676758
      unit: Mbit/s
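
One way to check the checksum hypothesis is to capture egress traffic with verbose checksum reporting on the controller's public interface; a sketch, assuming eth1 is the public interface and 5201 is the iperf3 port (both placeholders):

root@node-1:~# tcpdump -i eth1 -nn -vvv 'tcp port 5201'
# with -vvv tcpdump marks each TCP segment's checksum as (correct) or (incorrect)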

Revision history for this message
Alexander Duyck (alexander-duyck) wrote :

Could we record the kernel version and i40e driver version for this so I can determine if this is still an issue in the latest kernel and/or drivers?

Revision history for this message
Sergey Matov (smatov) wrote :

Alex, I'll duplicate here everything I've described in the email so that it is pinned to this bug.

root@node-8:~# uname -r
3.13.0-88-generic

root@node-8:~# modinfo i40e
filename: /lib/modules/3.13.0-88-generic/updates/dkms/i40e.ko
version: 1.3.47

On the controller node I've also tried i40e 1.5.18.

Starting from the default configuration (TSO and GRO turned on on all devices), we changed TSO and GRO on different devices and measured performance.

================================================  ======================
Changes against default                           Iperf TCP default test
================================================  ======================
Default deployment                                22.8 Kbits/sec
GRO off on controller private NICs (bond slaves)  1.80 Gbits/sec
TSO off on controller public interface            5.19 Gbits/sec
================================================  ======================

The only place where we have an MTU mismatch is the FIP namespace of the destination compute node.

root@node-10:~# ip netns e fip-54905d4f-9b66-4c2e-ab9d-4105ea459fed ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: fpr-5faaca01-9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 8e:9b:95:a4:f8:1f brd ff:ff:ff:ff:ff:ff
30: fg-1336f4fb-cb: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether fa:16:3e:ba:a7:cb brd ff:ff:ff:ff:ff:ff

I dumped traffic with TSO on and off on both public interfaces (destination compute and controller), just to see the difference:

TSO ON (low speed)

Controller:

06:04:56.915792 IP (tos 0x0, ttl 62, id 29440, offset 0, flags [DF], proto TCP (6), length 1450)
    172.20.156.140.53025 > 172.20.156.138.5001: Flags [.], cksum 0x369f (correct), seq 1398:2796, ack 1, win 441, options [nop,nop,TS val 19235505 ecr 30734267], length 1398
06:04:56.916241 IP (tos 0x0, ttl 62, id 48079, offset 0, flags [DF], proto TCP (6), length 64)
    172.20.156.138.5001 > 172.20.156.140.53025: Flags [.], cksum 0x9aba (correct), seq 1, ack 2796, win 699, options [nop,nop,TS val 30734601 ecr 19235505,nop,nop,sack 1 {12582:15378}], length 0
06:04:56.916660 IP (tos 0x0, ttl 62, id 29441, offset 0, flags [DF], proto TCP (6), length 1450)
    172.20.156.140.53025 > 172.20.156.138.5001: Flags [.], cksum 0xf9ce (correct), seq 18174:19572, ack 1, win 441, options [nop,nop,TS val 19235505 ecr 30734601], length 1398
06:04:56.916754 IP (tos 0x0, ttl 62, id 29442, offset 0, flags [DF], proto TCP (6), length 1450)
    172.20.156.140.53025 > 172.20.156.138.5001: Flags [.], cksum 0xf256 (correct), seq 19572:20970, ack 1, win 441, options [nop,nop,TS val 19235505 ecr 30734601], length 1398
06:04:56.916884 IP (tos 0x0, ttl 62, id 48080, offset 0, flags [DF], proto TCP (6), length 72)
    172.20.156.138.5001 > 172.20.156.140.53025: Flags [.], cksum 0x617f (correct), seq 1, ack 2796, win 743, options [nop,nop,TS val 30734601 ecr 19235505,nop,nop,sack 2 {18174:19572}{12582:15378}], length 0
06:04:56.916907 IP (tos 0x0, ttl 62, id 48081, offset 0, flags [DF], proto TCP (6), length 72)
    172.20.156.138.5001 > 172.20.156.140.53025: Flags [.], cksum 0x5bdd (correct), seq 1, ack 2796, win 787, options [nop,nop,TS val 30734601 ecr 19...


Revision history for this message
Ilya Shakhat (shakhat) wrote :

Workaround for the issue: in the Fuel UI, on all controllers, on the physical interface that carries the "Public" network role, switch the "tcp-segmentation-offload" mode to "disabled" (tx-tcp-segmentation, tx-tcp-ecn-segmentation and tx-tcp6-segmentation should be disabled automatically as well).
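
As a post-deployment alternative to the Fuel UI, roughly the same change can be made with ethtool on each controller, assuming eth1 is the public interface (the name is a placeholder; the setting does not survive a reboot unless made persistent):

root@node-1:~# ethtool -K eth1 tx-tcp-segmentation off
root@node-1:~# ethtool -K eth1 tx-tcp-ecn-segmentation off
root@node-1:~# ethtool -K eth1 tx-tcp6-segmentation off
root@node-1:~# ethtool -k eth1 | grep tcp-segmentation   # verify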

With the workaround north-south traffic is ok:

TCP egress
==========
    bandwidth:
      avg: 5413.628151702881
      max: 7189.483642578125
      min: 3609.7335815429688
      unit: Mbit/s

TCP ingress
===========
    bandwidth:
      avg: 5608.663052622477
      max: 5799.884796142578
      min: 5383.405685424805
      unit: Mbit/s

tags: added: release-notes
Revision history for this message
Eugene Nikanorov (enikanorov) wrote :

Isn't it an Invalid bug per comment #13?

Revision history for this message
Elena Ezhova (eezhova) wrote :

It is worth describing this limitation in the docs/release notes. In brief, low network performance for north-south traffic reproduces only when XL710 NICs are used and, as an Intel engineer confirmed, is most probably related to kernel/driver problems. The workaround is to turn off TSO on the physical interface of the external network.

Revision history for this message
Alexander Adamov (aadamov) wrote :
tags: added: release-notes-done
removed: release-notes
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Fix verified as "documented": not a real fix in the product, but an important note in the user documentation.
