OVS+DPDK segfault at the host, after running "ovs-vsctl set interface dpdk0 options:n_rxq=2 " within a KVM Guest
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
dpdk (Ubuntu) | Expired | High | Unassigned |
openvswitch (Ubuntu) | Expired | High | Unassigned |
Bug Description
Guys,
It is possible to crash the OVS+DPDK instance running on the host from inside a KVM guest!
All you need to do is enable multi-queue; then, from a KVM guest, you can kill the OVS running on the host...
* Hardware requirements (might be exaggerated, but this is what I have):
1 Dell server with 2 dedicated 10G NICs, plus another 1 or 2 1G NICs for management, apt-get, ssh, etc.;
1 IXIA traffic generator - 10G in both directions.
* Steps to reproduce, at a glance:
1- Deploy Ubuntu on the host;
a. Grub options /etc/default/grub:
-
GRUB_CMDLINE_
-
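The GRUB_CMDLINE_ line above is truncated in the report. A typical kernel command line for an OVS+DPDK host reserves 1G hugepages at boot and isolates the cores used by PMD threads; the values below are illustrative assumptions, not the reporter's exact settings:

```shell
# /etc/default/grub -- illustrative example only; the reporter's exact line is cut off above
GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=1-3 iommu=pt intel_iommu=on"
# apply with: sudo update-grub && sudo reboot
```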
2- Install OVS with DPDK;
3- Configure DPDK, 1G Hugepages, PCI IDs and create the OVS bridges for a VM:
a. /etc/default/
-
DPDK_OPTS='--dpdk -c 0x1 -n 4 -m 2048,0 --vhost-owner libvirt-qemu:kvm --vhost-perm 0664'
-
b. /etc/dpdk/
-
pci 0000:06:00.0 uio_pci_generic
pci 0000:06:00.1 uio_pci_generic
-
NOTE: those PCI devices are located at NUMA Node 0.
c. DPDK Hugepages /etc/dpdk/
-
NR_1G_PAGES=32
-
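Whether a hugepage reservation like the one above actually succeeded can be checked from procfs; this is standard Linux, nothing specific to this report:

```shell
# Confirm the hugepage pool requested via NR_1G_PAGES was reserved;
# HugePages_Total should match the requested count when 1G pages are the default size.
grep -i '^hugepages' /proc/meminfo
```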
d. OVS Bridges:
ovs-vsctl add-br br0 -- set bridge br0 datapath_
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl add-port br0 vhost-user1 -- set Interface vhost-user1 type=dpdkvhostuser
ovs-vsctl add-br br1 -- set bridge br1 datapath_
ovs-vsctl add-port br1 dpdk1 -- set Interface dpdk1 type=dpdk
ovs-vsctl add-port br1 vhost-user2 -- set Interface vhost-user2 type=dpdkvhostuser
ip link set dev br0 up
ip link set dev br1 up
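The two add-br commands above are truncated after `datapath_`. A DPDK-backed bridge needs OVS's userspace datapath, so the full form of those lines is presumably:

```shell
# Presumed full form of the truncated bridge-creation commands:
# datapath_type=netdev selects the userspace (DPDK) datapath.
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-br br1 -- set bridge br1 datapath_type=netdev
```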
4- On the host, enable multi-queue and add more CPU cores to the OVS+DPDK PMD threads:
ovs-vsctl set Open_vSwitch . other_config:
ovs-vsctl set Open_vSwitch . other_config:
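Both other_config lines above are truncated. Given the bug title, this host-side step presumably sets a PMD core mask and requests two RX queues on the dpdk ports; the values below are illustrative reconstructions, not the reporter's exact commands:

```shell
# Illustrative reconstruction (the exact keys/values are cut off in the report):
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6   # pin PMD threads to cores 1-2
ovs-vsctl set Interface dpdk0 options:n_rxq=2                # the command from the bug title
ovs-vsctl set Interface dpdk1 options:n_rxq=2
```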
5- Deploy Ubuntu in the VM; full libvirt XML:
a. ubuntu-16.01-1 XML:
https:/
b. /etc/default/grub:
-
GRUB_CMDLINE_
-
6- Install OVS with DPDK;
7- Configure DPDK, 1G Hugepages, PCI IDs and create the OVS bridges within the VM:
NOTE: Do NOT enable multi-queue inside the VM yet; you'll see that, so far, it still works!
a. /etc/default/
-
DPDK_OPTS='--dpdk -c 0x1 -n 4 -m 1024 --pci-blacklist 0000:00:03.0 --pci-blacklist 0000:00:04.0'
-
b. /etc/dpdk/
-
pci 0000:00:05.0 uio_pci_generic
pci 0000:00:06.0 uio_pci_generic
-
c. DPDK Hugepages /etc/dpdk/
-
NR_1G_PAGES=1
-
d. OVS Bridge:
ovs-vsctl add-br ovsbr -- set bridge ovsbr datapath_
ovs-vsctl add-port ovsbr dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl add-port ovsbr dpdk1 -- set Interface dpdk1 type=dpdk
ip link set dev ovsbr up
NOTE 1: So far, so good! But no multi-queue yet!
NOTE 2: Sometimes you can crash ovs-vswitchd on the host right here!!!
8- In the VM, add more CPU cores to the OVS+DPDK PMD threads:
(2 cores):
ovs-vsctl set Open_vSwitch . other_config:
or:
(4 cores):
ovs-vsctl set Open_vSwitch . other_config:
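The other_config key is truncated above; the setting being varied between the 2-core and 4-core cases is presumably pmd-cpu-mask, a hex bitmask where bit N set means PMD threads may run on core N. A small sketch of how such masks are derived (core choices are illustrative, not taken from the report):

```shell
# Build a PMD core mask from individual core IDs: bit N set = core N usable.
# Cores 1-2 -> 0x6; cores 1-4 -> 0x1e (illustrative core choices).
two_core_mask=$(printf '0x%x' $(( (1 << 1) | (1 << 2) )))
four_core_mask=$(printf '0x%x' $(( (1 << 1) | (1 << 2) | (1 << 3) | (1 << 4) )))
echo "$two_core_mask $four_core_mask"
# e.g. ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=$two_core_mask
```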
9- Enable multi-queue before starting up DPDK and OVS; run this inside the VM:
systemctl disable dpdk
systemctl disable openvswitch-switch
reboot
ethtool -L ens5 combined 4
ethtool -L ens6 combined 4
service dpdk start
service openvswitch-switch start
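Before restarting the services, the queue change can be verified: `ethtool -l` (lowercase) prints the current and maximum combined queue counts for the virtio NICs (interface names as in the report):

```shell
# Confirm the guest virtio interfaces now expose 4 combined queues
ethtool -l ens5
ethtool -l ens6
```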
BOOM!!!
10- Error log from the host (ovs-vswitchd + DPDK crashed):
https:/
IMPORTANT NOTES:
* Sometimes ovs-vswitchd crashes on the host even without enabling multi-queue in the VM!
** Also, even weirder: I have a proprietary DPDK app (an L2 bridge for DPI) that uses multi-queue automatically, and it does NOT crash the ovs-vswitchd running on the host! I can use my DPDK app with multi-queue, but I can't do the same with OVS+DPDK.
So, if I replace "ubuntu16.
Cheers!
Thiago
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: openvswitch-
ProcVersionSign
Uname: Linux 4.4.0-22-generic x86_64
ApportVersion: 2.20.1-0ubuntu2
Architecture: amd64
Date: Sat Apr 30 18:04:16 2016
SourcePackage: openvswitch
UpgradeStatus: Upgraded to xenial on 2016-04-07 (23 days ago)
summary:
- OVS+DPDK crashes at the host, right after starting another OVS+DPDK inside of a KVM Guest, if multi-queue is enabled
+ OVS+DPDK crashes at the host, right after starting another OVS+DPDK inside of a KVM Guest, easier to reproduce if multi-queue is enabled
Changed in openvswitch (Ubuntu):
status: Triaged → Confirmed
Changed in dpdk (Ubuntu):
status: New → Confirmed
importance: Undecided → High
Very interesting, Thiago,
thanks for reporting - the steps to reproduce are detailed, so I should be able to work on this.
As I said in the mail thread, it would be great if you could report this to the upstream DPDK & OVS dev lists and keep me on CC.
Quite often such things turn out to be known issues.
I'll try to reproduce in the meantime ...