VM-to-VM 4 Gbps, expected 9-10

Bug #1445684 reported by Gregory Elkinbard
This bug affects 1 person
Affects | Status | Importance | Assigned to
Fuel for OpenStack | Fix Released | High | Sergey Vasilenko
6.0.x | Invalid | High | Fuel Library (Deprecated)

Bug Description

Scale lab has done a series of performance tests on the network.

Here is the link to the data from these runs:
http://mos-scale.vm.mirantis.net/test_results/build_6.1-192/jenkins-11_env_run_shaker-8-MSK-2015-03-26-13:50:21-2015-03-26-14:46:38/l2.html

Since users run VMs for services such as LBaaS and other things that pass traffic to the rest of the VMs, there is a requirement to ensure that in VLAN mode (no GRE) we can get up to 9-10 Gbps in one direction, and around 19 Gbps in total if iperf is run in both directions at once.
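
A minimal sketch of the measurement being asked for, assuming the classic iperf (iperf2) client/server and two VMs on different compute nodes (the server address 10.2.0.9 is only an example):

# on the server VM
iperf -s

# on the client VM: one direction, expected to approach 9-10 Gbps on a 10GbE path
iperf -c 10.2.0.9 -t 60

# bidirectional: -d runs both directions simultaneously, so the two streams
# together should sum to roughly 19 Gbps
iperf -c 10.2.0.9 -t 60 -d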

Mike Scherbakov (mihgen)
summary: - Mirantis OpenStack Network Performance is 50% bellow wire speed
+ Fuel Network Performance is 50% bellow wire speed
Mike Scherbakov (mihgen)
summary: - Fuel Network Performance is 50% bellow wire speed
+ VM-to-VM 4 Gpbs, expected 9-10
Changed in fuel:
milestone: none → 6.1
description: updated
Mike Scherbakov (mihgen)
summary: - VM-to-VM 4 Gpbs, expected 9-10
+ VM-to-VM 4 Gbps, expected 9-10
Revision history for this message
Nastya Urlapova (aurlapova) wrote :

Sergey, could you provide your opinion on this issue, or should we consult Alex Ignatov about it?

Changed in fuel:
importance: Undecided → High
assignee: nobody → Sergey Vasilenko (xenolog)
Changed in fuel:
status: New → Confirmed
Revision history for this message
Aleksandr Shaposhnikov (alashai8) wrote :

Affected versions:
6.0 release; 6.1 build 309 was also tested and shows the same behavior.

Steps to reproduce:

1. Deploy an OpenStack cluster using Ubuntu + HA + VLAN + Neutron. It is assumed that each compute node has at least one 10Gb NIC for tenant networks.
2. Start two VMs on different compute nodes.
3. Add the required rules to the security group: either allow all TCP (and optionally UDP, if UDP also needs to be tested) traffic within the tenant, or open just the specific TCP/UDP ports (5000, 5001); a CLI sketch is given after the expected-behavior block below.
4. Run iperf -s on one of the VMs and iperf -c <IP of first VM> on the second one.

Observed behavior:
Throughput is around 4-4.5 Gbit/s.

Expected behavior:
The sales team would like to see 90-95% of wire speed here.
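
A hedged sketch of steps 3-4 using the Neutron CLI of that release; the security group name "default" and port 5001 are assumptions:

# open the iperf port (TCP, and UDP if UDP is also tested) in the tenant's default security group
neutron security-group-rule-create --direction ingress --protocol tcp --port-range-min 5001 --port-range-max 5001 default
neutron security-group-rule-create --direction ingress --protocol udp --port-range-min 5001 --port-range-max 5001 default

# on the first VM (server)
iperf -s -p 5001
# on the second VM (client)
iperf -c <IP of first VM> -p 5001 -t 60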

tags: added: l23network
tags: added: scale
Revision history for this message
Vladimir Kuklin (vkuklin) wrote :

All the related fixes have been merged:

https://review.openstack.org/#/c/181397/
https://review.openstack.org/#/c/181134/
https://review.openstack.org/#/c/182571/

Test results with these fixes applied show nearly wire-speed performance.

I am marking this bug as Fix Committed.

Changed in fuel:
status: Confirmed → Fix Committed
Revision history for this message
Sergey Vasilenko (xenolog) wrote :

While comparing network performance between 6.0 and 6.1 on a lab with 2 compute nodes and a 10GbE switch, we got the following results for the VM-to-VM cases:

6.0-centos, MTU:1500 -- 2.9 gbit/s
6.0-centos, MTU:9000 -- 7.7 gbit/s

6.0-ubuntu, MTU:1500 -- 7.6 gbit/s
6.0-ubuntu, MTU:9000 -- 9.8 gbit/s

6.1-ubuntu, MTU:1500 -- 9.2 gbit/s
6.1-ubuntu, MTU:9000 -- 9.8 gbit/s

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Folks, we should double-check the results. As the related bug shows (https://bugs.launchpad.net/fuel/+bug/1447619/comments/33), there are places in the networking configuration where fixes should be applied. Alexander Nevenchaniy and Sergii Golovatiuk may provide more details, as they discovered these issues. Briefly:
* the IRQ steering setup is poor,
* some interfaces/bridges have txqueuelen:0, which heavily impacts CPU load,
* the rx/tx queue lengths of the HW interfaces are configured in a non-optimal way.

Revision history for this message
Alexander Nevenchannyy (anevenchannyy) wrote :

Folks,

1) We need to increase txqueuelen on some bridges to at least 1000.
For example:
p_br-prv-0 Link encap:Ethernet HWaddr ca:a0:28:19:8d:bd
          inet6 addr: fe80::c8a0:28ff:fe19:8dbd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:152526367 errors:0 dropped:7590012 overruns:0 frame:0
          TX packets:214336421 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:177639911221 (177.6 GB) TX bytes:256069399236 (256.0 GB)

For my configuration the affected interfaces are: p_br-prv-0, p_br-floating-0, eth2.140, br-storage, br-prv, br-int, br-fw-admin, br-floating, br-ex.

2) Under high network load one CPU sits at 100% (ksoftirqd driven by the net_rx function), so we need to enable packet steering:
for i in `seq 0 *CPU_NUM*`; do echo ff > /sys/class/net/eth2/queues/rx-$i/rps_cpus ; done
More info at: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/network-rps.html
3) As discussed with Vova Kuklin, we need to increase net.core.netdev_max_backlog to at least 262144.
4) At the moment the 10Gb/s Ethernet cards have txqueuelen 512; we should set this to the maximum the hardware supports. A combined sketch of all four items follows below.
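
A combined sketch of items 1-4 above, assuming eth2 is the 10GbE tenant NIC and the interface list matches item 1; the concrete values (1000, 262144, the ring sizes) are starting points to be tuned per node, not verified settings:

# 1) raise txqueuelen on the software bridges/ports that currently report 0
for dev in p_br-prv-0 p_br-floating-0 eth2.140 br-storage br-prv br-int br-fw-admin br-floating br-ex; do
  ip link set dev "$dev" txqueuelen 1000
done

# 2) enable receive packet steering (RPS) on every RX queue of the 10GbE NIC;
#    "ff" spreads softirq processing over the first 8 CPUs
for q in /sys/class/net/eth2/queues/rx-*; do
  echo ff > "$q/rps_cpus"
done

# 3) increase the per-CPU input backlog
sysctl -w net.core.netdev_max_backlog=262144

# 4) raise the NIC txqueuelen and, if supported, the hardware ring sizes
#    (check the maximum with "ethtool -g eth2"; 4096 is an assumption)
ip link set dev eth2 txqueuelen 10000
ethtool -G eth2 rx 4096 tx 4096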

Changed in fuel:
status: Fix Committed → Confirmed
Changed in fuel:
status: Confirmed → Fix Committed
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

We decided to file a separate bug for this.

Revision history for this message
Leontii Istomin (listomin) wrote :

1. The switches have been configured with jumbo frames.
2. MTU was set to 9000 using the Fuel UI.
3. A dedicated 10G interface is used for the private network.
Test:
netperf-wrapper -H 10.2.0.9 -l 60 -s 1 -f csv tcp_download
Result:
Ping (ms) ICMP:
  max: 1.46470810209
  mean: 1.032960349805072
  unit: ms
  min: 0.39013074506
TCP download:
  max: 9864.03160586
  mean: 9464.499048547048
  unit: Mbps
  min: 8619.53629759

Tested with build 521.
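
A small sketch for verifying that jumbo frames actually survive end-to-end before running the throughput test; 10.2.0.9 is the target from the netperf-wrapper command above, 8972 = 9000 minus 28 bytes of IP/ICMP headers, and the VM interface name eth0 is an assumption:

# inside the VM: confirm the interface MTU
ip link show eth0

# ping with "don't fragment" and a payload that only fits in a 9000-byte frame;
# if this fails, jumbo frames are not configured along the whole path
ping -M do -s 8972 -c 3 10.2.0.9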

Changed in fuel:
status: Fix Committed → Fix Released
Revision history for this message
Leontii Istomin (listomin) wrote :
Revision history for this message
Gregory Elkinbard (gelkinbard) wrote :

Please test with 64-byte and 200-byte (voice) packets.
Please provide the total packet rate, in millions of packets per second,
that OVS can handle.
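
One hedged way to approximate these numbers with the tools already used in this report: iperf in UDP mode with a fixed datagram size reports the datagram count, which divided by the test duration gives packets per second. Note that -l sets the UDP payload, so the on-wire frame is larger than 64/200 bytes; the offered bandwidth below is an assumption:

# on the server VM
iperf -s -u
# on the client VM: small datagrams at a high offered rate
iperf -c <IP of first VM> -u -b 10000M -l 64 -t 60
iperf -c <IP of first VM> -u -b 10000M -l 200 -t 60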

Revision history for this message
Jay Pipes (jaypipes) wrote :

Greg, what does this have to do with OVS?

Revision history for this message
Gregory Elkinbard (gelkinbard) wrote :

Jay, we have different types of workloads on OpenStack, so testing only 1500-byte web packets and 9000-byte storage packets is not enough. NFV workloads will see short frames, especially NFV workloads that serve voice traffic. I've asked to add several other packet frame sizes that are routinely tested by other vendors, so we can see how OpenStack, and specifically OVS, performs.
