DPDK with OVS cannot allocate more than 6 CPUs

Bug #1719387 reported by Richard
This bug affects 1 person
Affects: dpdk (Ubuntu) | Status: Invalid | Importance: Undecided | Assigned to: Unassigned

Bug Description

OS: Ubuntu 17.10, updated to the latest version
Hardware: ThunderX 2S system (Gigabyte R150)

Reproduce procedure
====
DPDKDEV1=0006:01:00.1
DPDKDEV2=0006:01:00.2

dpdk-devbind -b vfio-pci $DPDKDEV1
dpdk-devbind -b vfio-pci $DPDKDEV2

sysctl -w vm.nr_hugepages=24
umount /dev/hugepages
mount -t hugetlbfs none /dev/hugepages
grep HugePages_ /proc/meminfo

pkill ovs
sleep 5

rm -rf /etc/openvswitch/*
rm -rf /var/run/openvswitch/*
rm -rf /var/log/openvswitch/*

ovsdb-tool create /etc/openvswitch/conf.db /usr/share/openvswitch/vswitch.ovsschema
ovsdb-server --remote=punix:/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach --log-file

ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"

ovs-vsctl --no-wait init
TXQ=2 ovs-vswitchd --pidfile --detach --log-file

ovs-vsctl del-br br0
ovs-vsctl --log-file=/var/log/openvswitch/ovs-ctl.log add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=${DPDKDEV1}
ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk options:dpdk-devargs=${DPDKDEV2}

ovs-ofctl add-flow br0 in_port=1,action=output:2
ovs-ofctl add-flow br0 in_port=2,action=output:1

ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6
ovs-vsctl set Interface dpdk0 options:n_rxq=2
ovs-vsctl set Interface dpdk1 options:n_rxq=2

====
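A side note on the hugepage setup above (this sketch is mine, not part of the original report): vm.nr_hugepages=24 only covers the requested dpdk-socket-mem="1024,1024" (2048 MB total) if the default hugepage size is large enough, which depends on the platform. A pure-shell sketch with a hypothetical helper name to compute the pages needed for a given page size:

```shell
# Hypothetical helper: hugepages needed for a total allocation.
# Usage: pages_needed TOTAL_MB PAGESIZE_MB
pages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))   # round up
}

# dpdk-socket-mem="1024,1024" asks for 2048 MB in total.
pages_needed 2048 2     # 2 MB pages: 1024 pages needed, so nr_hugepages=24 would be far too few
pages_needed 2048 512   # 512 MB pages: 4 pages needed, so 24 is plenty
```

Checking the actual page size via `grep Hugepagesize /proc/meminfo` (as the procedure's meminfo grep already hints at) tells you which case applies.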
If pmd-cpu-mask=0x6 is replaced with a mask selecting more than 6 CPUs, such as pmd-cpu-mask=0xFF0,
this command stops working (it never finishes).
But with 6 or fewer CPUs it works fine.
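For reference (not part of the original report): pmd-cpu-mask is a bitmask of CPU cores, so 0x6 selects cores 1 and 2 while 0xFF0 selects cores 4 through 11. A pure-shell sketch, with a hypothetical helper name, that expands a mask into the core ids it enables:

```shell
# Hypothetical helper: expand a pmd-cpu-mask value into CPU core ids.
mask_to_cpus() {
    val=$(( $1 ))           # arithmetic expansion accepts 0x-prefixed hex
    cpu=0; cpus=""
    while [ "$val" -ne 0 ]; do
        if [ $(( val & 1 )) -eq 1 ]; then
            cpus="$cpus $cpu"
        fi
        cpu=$(( cpu + 1 )); val=$(( val >> 1 ))
    done
    echo $cpus              # unquoted on purpose: trims the leading space
}

mask_to_cpus 0x6    # -> 1 2                  (2 PMD threads)
mask_to_cpus 0xFF0  # -> 4 5 6 7 8 9 10 11    (8 PMD threads)
```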

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi Richard,
what happens if you do so?

The openvswitch log should have all the OVS and DPDK messages.

Could you capture the error it creates when you use "0x7"?

Changed in dpdk (Ubuntu):
status: New → Incomplete
Richard (richliu) wrote :

0x7 is fine, nothing happens.

If I use 0xFF0 (more than 6 CPUs), the command does not finish and just stops.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi Richard,
I also got to cross-check this bug now.
I cannot reproduce your issue on my current setup.
Here are some logs, and at the end the recommended next steps.

# prep
ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
ovs-vsctl set Open_vSwitch . "other_config:dpdk-alloc-mem=2048"
ovs-vsctl set Open_vSwitch . "other_config:dpdk-extra=--vhost-owner libvirt-qemu:kvm --vhost-perm 0666"
# restart
update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk
systemctl restart openvswitch-switch
# add bridge and dev
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:04:00.1

# check defaults
$ sudo pidstat -p 3670 -u -t 5
Average: UID TGID TID %usr %system %guest %wait %CPU CPU Command
Average: 0 3670 - 100,00 0,13 0,00 0,00 100,00 - ovs-vswitchd
Average: 0 - 3670 0,07 0,07 0,00 0,00 0,13 - |__ovs-vswitchd
[...]
Average: 0 - 3868 0,07 0,07 0,00 0,00 0,13 - |__revalidator27
Average: 0 - 3916 100,00 0,00 0,00 0,00 100,00 - |__pmd28

1 PMD thread, nothing unexpected yet (I have one numa node, so default pmd mask is 1).

# Note n-dpdk-rxqs is no more since OVS>2.5
# ovs-vsctl set Open_vSwitch . "other_config:n-dpdk-rxqs=10"
# Instead the config is per device like
# ovs-vsctl set Interface dpdk0 options:n_rxq=8

# iterate (re)-configure queues and mask
# MASK = CPU mask, NUM = expected threads, so set queues accordingly
MASK="1011"; HEX=$(printf '0x%x\n' $((2#${MASK})))
NUM=0; val=$((2#${MASK}))
while [ ${val} -ne 0 ]; do
    if [ $((${val} & 1)) -eq 1 ]; then ((NUM++)); fi
    val=$((val>>1))
done
echo "Mask: $MASK => Hex: $HEX => Num: $NUM"
ovs-vsctl set Interface dpdk0 options:n_rxq=${NUM}
ovs-vsctl set Open_vSwitch . "other_config:pmd-cpu-mask=${HEX}"

I see it correctly e.g.:
- adding a queue when I go from 11 -> 1011
  dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 3 created.
- removing a queue when going from 1011 -> 1001
  dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 1 destroyed.
- I can fill up all 12 Cores I have atm
  dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 9 created.
  dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 11 created.
  dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 8 created.
  dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 7 created.
  dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 10 created.
  dpif_netdev|INFO|There are 12 pmd threads on numa node 0

You can have fewer queues than your mask opens PMDs for.
But this will likely be inefficient. So e.g. the following works but isn't recommended:
$ ovs-vsctl set Interface dpdk0 options:n_rxq=6
$ ovs-vsctl set Open_vSwitch . "other_config:pmd-cpu-mask=0xdef"
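To make the imbalance in that example concrete, here is a pure-shell sketch (the helper name is mine, not an OVS command): pmd-cpu-mask=0xdef has 10 set bits, i.e. 10 PMD threads, but only 6 rx queues exist, so roughly 4 PMDs end up with nothing to poll:

```shell
# Hypothetical helper: count set bits in a CPU mask (= PMD threads).
popcount() {
    val=$(( $1 )); n=0
    while [ "$val" -ne 0 ]; do
        n=$(( n + (val & 1) )); val=$(( val >> 1 ))
    done
    echo $n
}

PMDS=$(popcount 0xdef)   # 10 PMD threads from mask 0xdef
RXQS=6                   # n_rxq=6 on the single dpdk port
if [ "$PMDS" -gt "$RXQS" ]; then
    echo "$(( PMDS - RXQS )) PMD threads will sit idle"
fi
```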

If I set your example of 1 -> 111111110000 (more than 6), things still work fine.
Also, changing the mask is rather robust against mistakes; if I add more CPUs than I actually have, those are just ignored.
So on a 12-CPU system 111111110000 (12) -> 11111111000000 (14) just igno...


Changed in dpdk (Ubuntu):
status: Incomplete → Invalid
Richard (richliu) wrote :

I will use your log to check again.
