Low network performance in Shaker north-south scenario

Bug #1593752 reported by Ilya Shakhat
Affects               Status          Importance   Assigned to        Milestone
Mirantis OpenStack    Status tracked in 10.0.x
  10.0.x              Confirmed       High         MOS Neutron
  9.x                 Fix Released    High         Alexander Adamov

Bug Description

The Shaker north-south scenario (openstack/perf_l3_north_south) measures throughput between two instances located in different networks plugged into different routers. The source instance has a private address only; the destination is accessed via a floating IP. Measured throughput is on the order of tens of kilobytes per second instead of gigabits per second.

Description of the environment: MOS 9, ISO 415 and ISO 481, Neutron ML2, VxLAN, DVR, 6-node cluster. However, the issue was not present on ISO 401 deployed on KVM.

The same can be reproduced manually with iperf3. The throughput between an instance with a floating IP and another instance with a floating IP is good.

Steps to reproduce: run the Shaker openstack/perf_l3_north_south scenario (a command-line sketch is given at the end of this description)

Expected results: throughput should be on the order of 1 Gbit/s for a 1 Gbit/s interface

Actual result:
ubuntu@shaker-aompes-master-1:~$ iperf3 -c 172.20.157.25
Connecting to host 172.20.157.25, port 5201
[ 4] local 10.1.0.6 port 57555 connected to 172.20.157.25 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 54.6 KBytes 447 Kbits/sec 2 1.37 KBytes
[ 4] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec 8 9.56 KBytes
[ 4] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec 0 9.56 KBytes

Reproducibility: always

Workaround: n/a

Impact: almost no network connectivity from the instance
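
For illustration, a minimal sketch of how the scenario can be launched from a host with Shaker installed and OpenStack credentials sourced; the endpoint address and report file name below are placeholders, not values from this environment:

$ source openrc
$ shaker --server-endpoint 172.20.156.1:5999 \
    --scenario openstack/perf_l3_north_south \
    --report report.html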

Ilya Shakhat (shakhat)
Changed in mos:
milestone: none → 9.0
assignee: nobody → MOS Neutron (mos-neutron)
importance: Undecided → Critical
Revision history for this message
Alexander Ignatov (aignatov) wrote :

Work is in progress.

Changed in mos:
status: New → Confirmed
Ilya Shakhat (shakhat)
description: updated
Dina Belova (dbelova)
tags: added: area-neutron
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

User impact:

Bandwidth of 10 Kb per second from VMs to the Internet is not acceptable at all.

The priority is Critical because such low network performance is not acceptable for a MOS release.

Revision history for this message
Dina Belova (dbelova) wrote :

Reproduced on #481 ISO as well.

Ilya Shakhat (shakhat)
description: updated
Revision history for this message
Ilya Shakhat (shakhat) wrote :

On KVM-based deployment with ISO 495 the issue is not reproduced:

================= ======== ======== ========
Metric Min Avg Max
================= ======== ======== ========
bandwidth, Mbit/s 139.99 297.92 419.79
retransmits 9 332
================= ======== ======== ========

Revision history for this message
Oleg Bondarev (obondarev) wrote :

So on the Intel lab we see poor network performance when running iperf from a VM without a floating IP against a public iperf server (https://iperf.fr/iperf-servers.php).
VMs with a floating IP (DVR router) have no such issue (in this case traffic goes to the external network directly from the compute node).

For non-DVR routers the problem exists for both floating-IP and non-floating-IP VMs (in this case traffic goes through the controller either way).
So we can say this is not a DVR issue.

I also tried to run iperf directly from router namespaces on controllers against public iperf servers: it works fine, bandwidth is ok.
I also tried to run iperf from VMs against the router gateway on the controller: it also works fine in all cases (floating/non-floating, DVR/non-DVR).
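
For reference, a rough sketch of these checks; names and addresses are placeholders, and the namespace is assumed to be qrouter-<router-id> (or snat-<router-id> for a DVR router on the controller):

root@node-1:~# ip netns                                            # find the router namespace
root@node-1:~# ip netns exec qrouter-<router-id> iperf3 -c <public-iperf-server>
ubuntu@vm:~$ iperf3 -c <router-gateway-ip>                         # from the VM against the router gateway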

Revision history for this message
Eugene Nikanorov (enikanorov) wrote :

Turning TX offload off on the physical interface of the external network fixes the issue completely.

This could have been done at the point of deployment using the Fuel UI, which has specific checkboxes for NIC features.

Considering this, the bug is not critical because it has both pre- and post-deployment workarounds.

Revision history for this message
Alexander Ignatov (aignatov) wrote :

Moved to High as per comment #6.

Revision history for this message
Elena Ezhova (eezhova) wrote :

It turns out that it is enough to turn off TSO; the TX on / TSO off combination works fine.
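
For illustration, a sketch of how this combination can be checked and applied with ethtool on a controller, assuming eth1 is the public interface (the interface name is a placeholder):

root@node-1:~# ethtool -k eth1 | egrep 'tcp-segmentation-offload|tx-checksumming'
root@node-1:~# ethtool -K eth1 tso off   # turn TSO off
root@node-1:~# ethtool -K eth1 tx on     # keep TX checksum offload on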

Revision history for this message
Ilya Shakhat (shakhat) wrote :

The issue affects egress traffic only and is most probably caused by enforced checksum verification (packets with a wrong checksum don't pass through).

TCP egress
==========
    bandwidth:
      avg: 0.049431371688842776
      max: 0.4261589050292969
      min: 0
      unit: Mbit/s

TCP ingress
===========
    bandwidth:
      avg: 4052.971162160238
      max: 7042.360305786133
      min: 2267.122268676758
      unit: Mbit/s
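
One way to check the checksum hypothesis is to capture egress traffic with verbose checksum reporting on the controller's public interface; a sketch, assuming eth1 is the public interface and 5201 is the iperf3 port (both placeholders):

root@node-1:~# tcpdump -i eth1 -nn -vvv 'tcp port 5201'
# with -vvv tcpdump marks each TCP segment's checksum as (correct) or (incorrect)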

Revision history for this message
Alexander Duyck (alexander-duyck) wrote :

Could we record the kernel version and i40e driver version for this so I can determine if this is still an issue in the latest kernel and/or drivers?

Revision history for this message
Sergey Matov (smatov) wrote :

Alex, I'll duplicate here everything I've described in the email so that it is pinned to this bug.

root@node-8:~# uname -r
3.13.0-88-generic

root@node-8:~# modinfo i40e
filename: /lib/modules/3.13.0-88-generic/updates/dkms/i40e.ko
version: 1.3.47

On the controller node I've also tried i40e 1.5.18.

Starting from the default configuration (TSO and GRO turned on on all devices), we changed TSO and GRO on different devices and measured performance.

================================================  ======================
Changes against default                           Iperf TCP default test
================================================  ======================
Default deployment                                22.8 Kbits/sec
GRO off on controller private NICs (bond slaves)  1.80 Gbits/sec
TSO off on controller public interface            5.19 Gbits/sec
================================================  ======================

The only place where we have an MTU mismatch is the FIP namespace of the destination compute node.

root@node-10:~# ip netns e fip-54905d4f-9b66-4c2e-ab9d-4105ea459fed ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: fpr-5faaca01-9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 8e:9b:95:a4:f8:1f brd ff:ff:ff:ff:ff:ff
30: fg-1336f4fb-cb: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether fa:16:3e:ba:a7:cb brd ff:ff:ff:ff:ff:ff

I dumped traffic with TSO on and off on both public interfaces (destination compute and controller), just to see the difference:

TSO ON (low speed)

Controller:

06:04:56.915792 IP (tos 0x0, ttl 62, id 29440, offset 0, flags [DF], proto TCP (6), length 1450)
    172.20.156.140.53025 > 172.20.156.138.5001: Flags [.], cksum 0x369f (correct), seq 1398:2796, ack 1, win 441, options [nop,nop,TS val 19235505 ecr 30734267], length 1398
06:04:56.916241 IP (tos 0x0, ttl 62, id 48079, offset 0, flags [DF], proto TCP (6), length 64)
    172.20.156.138.5001 > 172.20.156.140.53025: Flags [.], cksum 0x9aba (correct), seq 1, ack 2796, win 699, options [nop,nop,TS val 30734601 ecr 19235505,nop,nop,sack 1 {12582:15378}], length 0
06:04:56.916660 IP (tos 0x0, ttl 62, id 29441, offset 0, flags [DF], proto TCP (6), length 1450)
    172.20.156.140.53025 > 172.20.156.138.5001: Flags [.], cksum 0xf9ce (correct), seq 18174:19572, ack 1, win 441, options [nop,nop,TS val 19235505 ecr 30734601], length 1398
06:04:56.916754 IP (tos 0x0, ttl 62, id 29442, offset 0, flags [DF], proto TCP (6), length 1450)
    172.20.156.140.53025 > 172.20.156.138.5001: Flags [.], cksum 0xf256 (correct), seq 19572:20970, ack 1, win 441, options [nop,nop,TS val 19235505 ecr 30734601], length 1398
06:04:56.916884 IP (tos 0x0, ttl 62, id 48080, offset 0, flags [DF], proto TCP (6), length 72)
    172.20.156.138.5001 > 172.20.156.140.53025: Flags [.], cksum 0x617f (correct), seq 1, ack 2796, win 743, options [nop,nop,TS val 30734601 ecr 19235505,nop,nop,sack 2 {18174:19572}{12582:15378}], length 0
06:04:56.916907 IP (tos 0x0, ttl 62, id 48081, offset 0, flags [DF], proto TCP (6), length 72)
    172.20.156.138.5001 > 172.20.156.140.53025: Flags [.], cksum 0x5bdd (correct), seq 1, ack 2796, win 787, options [nop,nop,TS val 30734601 ecr 19...


Revision history for this message
Ilya Shakhat (shakhat) wrote :

Workaround for the issue: in the Fuel UI, on all controllers, on the physical interface that carries the "Public" network role, switch the "tcp-segmentation-offload" mode to "disabled" (tx-tcp-segmentation, tx-tcp-ecn-segmentation and tx-tcp6-segmentation should be disabled automatically as well).
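
As a post-deployment alternative to the Fuel UI, roughly the same change can be made with ethtool on each controller, assuming eth1 is the public interface (the name is a placeholder; the setting does not survive a reboot unless made persistent):

root@node-1:~# ethtool -K eth1 tx-tcp-segmentation off
root@node-1:~# ethtool -K eth1 tx-tcp-ecn-segmentation off
root@node-1:~# ethtool -K eth1 tx-tcp6-segmentation off
root@node-1:~# ethtool -k eth1 | grep tcp-segmentation   # verify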

With the workaround north-south traffic is ok:

TCP egress
==========
    bandwidth:
      avg: 5413.628151702881
      max: 7189.483642578125
      min: 3609.7335815429688
      unit: Mbit/s

TCP ingress
===========
    bandwidth:
      avg: 5608.663052622477
      max: 5799.884796142578
      min: 5383.405685424805
      unit: Mbit/s

tags: added: release-notes
Revision history for this message
Eugene Nikanorov (enikanorov) wrote :

Isn't it an Invalid bug per comment #13?

Revision history for this message
Elena Ezhova (eezhova) wrote :

It is worth describing this limitation in the docs/release notes. In brief, low network performance for north-south traffic reproduces only when XL710 NICs are used and, as an Intel engineer confirmed, is most probably related to kernel/driver problems. The workaround is to turn off TSO on the physical interface of the external network.

Revision history for this message
Alexander Adamov (aadamov) wrote :
tags: added: release-notes-done
removed: release-notes
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Fix verified as "documented": not a real fix in the product, but an important note in the user documentation.
