MOS 9.2 ovs-dpdk performance test

Bug #1705435 reported by Xiwen Deng
Affects: Fuel for OpenStack
Status: Invalid
Importance: High
Assigned to: Xiwen Deng
Milestone: 9.x-updates

Bug Description

In MOS 9.2, when running the RFC 2544 zero frame loss test, the DPDK performance result is low.

The environment has two DPDK interfaces and a VM. The VM has two NICs, each with two queues, and the DPDK interfaces are configured with two queues as well.
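
Roughly how such a two-queue setup is usually configured on OVS-DPDK and in the guest (a minimal sketch, assuming the standard n_rxq interface option and the usual image property for guest multiqueue):

ovs-vsctl set Interface dpdk0 options:n_rxq=2
ovs-vsctl set Interface dpdk1 options:n_rxq=2

and for the VM NICs: hw_vif_multiqueue_enabled=true on the image, then ethtool -L eth0 combined 2 inside the guest.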

Environment configuration is shown below:
top -p `pidof ovs-vswitchd` -H -d1
  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15924 root 10 -10 27.159g 296408 9956 R 99.9 0.1 34:34.80 pmd222
15925 root 10 -10 27.159g 296408 9956 R 99.9 0.1 34:34.80 pmd219
15922 root 10 -10 27.159g 296408 9956 R 99.8 0.1 34:34.79 pmd223
15923 root 10 -10 27.159g 296408 9956 R 99.8 0.1 34:34.79 pmd224
15928 root 10 -10 27.159g 296408 9956 R 99.8 0.1 34:34.80 pmd218
15929 root 10 -10 27.159g 296408 9956 R 99.8 0.1 34:34.79 pmd221
15930 root 10 -10 27.159g 296408 9956 R 99.8 0.1 34:34.79 pmd220
15931 root 10 -10 27.159g 296408 9956 R 99.8 0.1 34:34.80 pmd217

root@compute-3:~# ovs-vsctl get open_vswitch . other_config
{dpdk-extra="-n 2 --vhost-owner libvirt-qemu:kvm --vhost-perm 0664", dpdk-init="true", dpdk-lcore-mask="0x400", dpdk-socket-mem="8192,1", max-idle="50000", pmd-cpu-mask="0x1e0001e000"}
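
For reference, these values are typically applied with ovs-vsctl, e.g. (a sketch using the values shown above):

ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1e0001e000
ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem="8192,1"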

root@compute-3:~# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 35:
 isolated : true
 port: vhu82a81743-88 queue-id: 1
pmd thread numa_id 1 core_id 33:
 isolated : true
 port: dpdk0 queue-id: 1
pmd thread numa_id 1 core_id 14:
 isolated : true
 port: dpdk1 queue-id: 0
pmd thread numa_id 1 core_id 15:
 isolated : true
 port: vhu82a81743-88 queue-id: 0
pmd thread numa_id 1 core_id 16:
 isolated : true
 port: vhueee4b3fb-32 queue-id: 0
pmd thread numa_id 1 core_id 34:
 isolated : true
 port: dpdk1 queue-id: 1
pmd thread numa_id 1 core_id 13:
 isolated : true
 port: dpdk0 queue-id: 0
pmd thread numa_id 1 core_id 36:
 isolated : true
 port: vhueee4b3fb-32 queue-id: 1

root@compute-3:~# ovs-appctl dpif-netdev/pmd-stats-show
pmd thread numa_id 1 core_id 35:
 emc hits:0
 megaflow hits:0
 avg. subtable lookups per hit:0.00
 miss:0
 lost:0
 polling cycles:183391851052 (100.00%)
 processing cycles:0 (0.00%)
pmd thread numa_id 1 core_id 33:
 emc hits:5169955
 megaflow hits:0
 avg. subtable lookups per hit:0.00
 miss:500
 lost:0
 polling cycles:123787408521 (80.77%)
 processing cycles:29463516747 (19.23%)
 avg cycles per packet: 29639.74 (153250925268/5170455)
 avg processing cycles per packet: 5698.44 (29463516747/5170455)
main thread:
 emc hits:3
 megaflow hits:0
 avg. subtable lookups per hit:0.00
 miss:0
 lost:0
 polling cycles:21534522 (99.88%)
 processing cycles:25472 (0.12%)
 avg cycles per packet: 7186664.67 (21559994/3)
 avg processing cycles per packet: 8490.67 (25472/3)
pmd thread numa_id 1 core_id 14:
 emc hits:5160183
 megaflow hits:0
 avg. subtable lookups per hit:0.00
 miss:502
 lost:2
 polling cycles:125545461034 (80.41%)
 processing cycles:30583715341 (19.59%)
 avg cycles per packet: 30253.58 (156129176375/5160685)
 avg processing cycles per packet: 5926.29 (30583715341/5160685)
pmd thread numa_id 1 core_id 15:
 emc hits:0
 megaflow hits:0
 avg. subtable lookups per hit:0.00
 miss:0
 lost:0
 polling cycles:182558896290 (100.00%)
 processing cycles:0 (0.00%)
pmd thread numa_id 1 core_id 16:
 emc hits:0
 megaflow hits:0
 avg. subtable lookups per hit:0.00
 miss:0
 lost:0
 polling cycles:182211680516 (100.00%)
 processing cycles:0 (0.00%)
pmd thread numa_id 1 core_id 34:
 emc hits:5162238
 megaflow hits:0
 avg. subtable lookups per hit:0.00
 miss:501
 lost:1
 polling cycles:123743090602 (80.82%)
 processing cycles:29366438623 (19.18%)
 avg cycles per packet: 29656.65 (153109529225/5162739)
 avg processing cycles per packet: 5688.15 (29366438623/5162739)
pmd thread numa_id 1 core_id 13:
 emc hits:5167918
 megaflow hits:0
 avg. subtable lookups per hit:0.00
 miss:500
 lost:0
 polling cycles:125398694242 (80.39%)
 processing cycles:30587837876 (19.61%)
 avg cycles per packet: 30180.71 (155986532118/5168418)
 avg processing cycles per packet: 5918.22 (30587837876/5168418)
pmd thread numa_id 1 core_id 36:
 emc hits:0
 megaflow hits:0
 avg. subtable lookups per hit:0.00
 miss:0
 lost:0
 polling cycles:182655421216 (100.00%)
 processing cycles:0 (0.00%)

From the pmd-stats-show output we can see that some of the PMD cores (15, 16, 35, 36) are polling but never processing packets. Only four PMD cores actually process packets.

Why do only four PMD cores process packets?

Denis Meltsaykin (dmeltsaykin) wrote :

Xiwen, could you please provide an example of the expected behavior? How low is the DPDK performance? Do you experience packet loss, heavy jitter, or maybe increased latency? What are the throughput numbers? Are you measuring forwarding inside a VM, or is it bridging? Are there any iptables/ebtables rules inside this VM? It would also be great to show some kind of deployment scheme. Please answer all the questions above.

Changed in fuel:
assignee: nobody → Xiwen Deng (deng-xiwen)
status: New → Incomplete
importance: Undecided → High
milestone: none → 9.x-updates
Xiwen Deng (deng-xiwen) wrote :

Hello Denis,

My DPDK performance test setup consists of a physical traffic generator and a compute node. The compute node has two 10G NICs. A VM with 5 cores and 8 GB of RAM runs on the compute node, and testpmd is running inside the VM.

We run the RFC 2544 test with packet sizes of 64, 128, 256, 512, and 1024 bytes.

The test results are below:
https://drive.google.com/file/d/0By7pI-rd3Q4hNjVwV21HaW1vRkU/view?usp=sharing

From the test results we can see that the frame loss for 64-byte packets is about 4.867% in the zero-frame-loss test. Most of the packet loss happens at the DPDK interfaces.

The VM doesn't have any iptables rules.

I don't understand why there are 8 PMD cores but only 4 of them process packets.
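
One way to confirm which queues actually receive traffic is to clear the PMD statistics, run the traffic for a while, and dump them again (standard ovs-appctl commands):

ovs-appctl dpif-netdev/pmd-stats-clear
ovs-appctl dpif-netdev/pmd-stats-show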

Dmitry Teselkin (teselkin-d) wrote :

Hello,

According to the documentation [1], RX/TX queues can't be shared among multiple logical cores; each queue may only be processed by one core. So if you have 4 queues in total, only 4 cores will process packets.

[1] http://dpdk.org/doc/guides-16.04/prog_guide/poll_mode_drv.html#generalities
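
In practice this means that to spread the NIC load over more PMD cores, more RX queues have to be configured on the dpdk ports, for example (a sketch using the port names from this report; the right queue count depends on the NIC and the traffic profile):

ovs-vsctl set Interface dpdk0 options:n_rxq=4
ovs-vsctl set Interface dpdk1 options:n_rxq=4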

Dmitry Teselkin (teselkin-d) wrote :

Avoiding lock contention is a key issue in a multi-core environment. To address this issue, PMDs are designed to work with per-core private resources as much as possible. For example, a PMD maintains a separate transmit queue per-core, per-port, if the PMD is not DEV_TX_OFFLOAD_MT_LOCKFREE capable. In the same way, every receive queue of a port is assigned to and polled by a single logical core (lcore).

[1] http://dpdk.org/doc/guides/prog_guide/poll_mode_drv.html

Changed in fuel:
status: Incomplete → Invalid