OpenvSwitch with DPDK brings all VirtIO NICs down, software reboot also doesn't work.

Bug #1568627 reported by Thiago Martins
This bug affects 1 person
Affects Status Importance Assigned to Milestone
openvswitch (Ubuntu)
Fix Released
Low
Unassigned

Bug Description

Guys,

 As soon as we start OpenvSwitch with DPDK inside a KVM guest, it brings down all the VirtIO NICs, including the VirtIO NICs that are not bound to DPDK!

 The problem is easy to reproduce. I'm seeing it on two different hosts, a Dell server and a MacBook Pro, both running Xenial on the host and in the VM.

 Also, if the OpenvSwitch DPDK options are enabled in /etc/default/openvswitch-switch, soft reboot stops working! Very weird...

 Steps to reproduce:

 1- At the host, install Ubuntu KVM by running:

 sudo apt install ubuntu-virt

 2- Create two extra Libvirt networks, called "subscriber" and "internet", like this:

tmartins@blade:~$ virsh net-list
 Name State Autostart Persistent
----------------------------------------------------------
 default active yes yes
 internet active yes yes
 subscriber active yes yes

 3- Create a Xenial VM with 3 vNICs (VirtIO) and wire them to the 3 networks above.

 4- Inside of the Xenial VM, you might see the PCI IDs like this:

00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
00:09.0 Ethernet controller: Red Hat, Inc Virtio network device
00:0a.0 Ethernet controller: Red Hat, Inc Virtio network device

 5- Install OpenvSwitch with DPDK:

 sudo apt install openvswitch-switch-dpdk

 6- Configure DPDK:

 sudo vi /etc/dpdk/dpdk.conf # Enable NR_2M_PAGES=64

 7- Select the PCI devices that you want to bind to DPDK drivers:

 sudo vi /etc/dpdk/interfaces

pci 0000:00:09.0 uio_pci_generic
pci 0000:00:0a.0 uio_pci_generic

 8- Reboot it

 NOTE: After rebooting, running "dpdk_nic_bind --status" shows that the last two VirtIO NICs are using DPDK-compatible drivers. So far, so good...

 9- Configure OpenvSwitch to use DPDK

 run:

 update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk

 sudo vi /etc/default/openvswitch-switch

 Uncomment the last line, like this:

DPDK_OPTS='--dpdk -c 0x1 -n 4'

 10- Hit the bug!

 Run:

 service openvswitch-switch stop

 service openvswitch-switch start # BOOM!!!

 From this point, there are two problems:

1- All VirtIO NICs lose connectivity: not only the DPDK ones, but also the vNICs using the regular Linux VirtIO drivers!

2- Soft reboot does not work.

 That's it! What am I missing here?

 Workaround:

 1- Disable the OpenvSwitch DPDK options in /etc/default/openvswitch-switch (just comment the line out) and force-reboot the VM. After this, VM connectivity is restored (on its first vNIC) and soft reboot works again.

 I made a video about this! It is on Youtube:

 https://youtu.be/sSBqYUVsO8U

 I really appreciate any help!

Cheers!
Thiago

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: openvswitch-switch-dpdk 2.5.0-0ubuntu1
ProcVersionSignature: Ubuntu 4.4.0-18.34-generic 4.4.6
Uname: Linux 4.4.0-18-generic x86_64
ApportVersion: 2.20.1-0ubuntu1
Architecture: amd64
Date: Sun Apr 10 18:25:34 2016
InstallationDate: Installed on 2016-02-01 (69 days ago)
InstallationMedia: Ubuntu-Server 16.04 LTS "Xenial Xerus" - Alpha amd64 (20160111)
SourcePackage: openvswitch
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Thiago Martins (martinx) wrote :

Guys,

 I tested the same setup on a Trusty KVM guest with Linux 4.4 (same Xenial kernel), but with DPDK 2.0 and OpenvSwitch 2.4 from the Liberty Cloud Archive, and there it works!!

 At least, the Trusty guest doesn't lose connectivity and soft reboot works too! Precisely where Xenial fails...

 So, something is wrong with OpenvSwitch 2.5 when combined with DPDK 2.2 (the Xenial combo).

 Happy hacking!

Thanks!
Thiago

Revision history for this message
Thiago Martins (martinx) wrote :

WAIT! I replied too fast about Trusty...

On Trusty with Cloud Archive Liberty enabled, and even after this:

update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk

And this:

# grep DPDK_OPTS /etc/default/openvswitch-switch
DPDK_OPTS='--dpdk -c 0x1 -n 4'

The command:

service openvswitch-switch stop
service openvswitch-switch start

 does NOT bring up "ovs-vswitchd" with the "dpdk" options!

 So, this test with Trusty is invalid.

 I don't know how to enable OVS with DPDK on Trusty with the Liberty Cloud Archive. The procedure might be different.

 I'll keep researching about this topic...

Cheers!
Thiago
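
A minimal sketch of one way to check this, assuming procps "ps" is available in the guest. The helper name has_dpdk_flag is hypothetical; it only inspects a command-line string for the --dpdk option and does not verify that DPDK initialised correctly.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given ovs-vswitchd command line
# contains the --dpdk option as a separate argument.
has_dpdk_flag() {
    case " $1 " in
        *" --dpdk "*) return 0 ;;
        *) return 1 ;;
    esac
}

# procps: print only the argument string of any running ovs-vswitchd,
# joined onto one line in case several processes match.
cmdline=$(ps -o args= -C ovs-vswitchd 2>/dev/null | tr '\n' ' ')

if [ -z "$cmdline" ]; then
    echo "ovs-vswitchd is not running"
elif has_dpdk_flag "$cmdline"; then
    echo "ovs-vswitchd is running with --dpdk"
else
    echo "ovs-vswitchd is running without --dpdk"
fi
```

Running "update-alternatives --display ovs-vswitchd" alongside this shows which binary the alternative currently points to.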

Revision history for this message
James Page (james-page) wrote :

Hi Thiago

DPDK can use the virtio-pci devices directly without them being bound to the userspace driver. As a result, the default configuration will try to consume all virtio-pci devices configured on the system; make sure to blacklist the primary network adapter so that DPDK does not try to use it.
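
A sketch of what such a blacklist entry could look like for the setup in this report. It assumes that the primary adapter is the 00:03.0 device, that the packaged ovs-vswitchd hands the options after --dpdk to the DPDK EAL, and that the EAL option -b (long form --pci-blacklist) is available, as it is in the DPDK 2.x series. The helper name build_dpdk_opts is hypothetical and only prints the line to place in /etc/default/openvswitch-switch.

```shell
#!/bin/sh
# Hypothetical helper: print a DPDK_OPTS line that blacklists the given
# PCI addresses so DPDK leaves those devices to the kernel driver.
# Assumption: the EAL accepts -b <domain:bus:dev.func> once per device.
build_dpdk_opts() {
    opts="--dpdk -c 0x1 -n 4"
    for addr in "$@"; do
        opts="$opts -b $addr"
    done
    printf "DPDK_OPTS='%s'\n" "$opts"
}

# 0000:00:03.0 is the first VirtIO NIC from the report (assumed primary).
build_dpdk_opts 0000:00:03.0
```

The printed line is the stock DPDK_OPTS from the report with one -b entry appended per device that should stay under the kernel virtio driver.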

Revision history for this message
Thiago Martins (martinx) wrote : Re: [Bug 1568627] Re: OpenvSwitch with DPDK brings all VirtIO NICs down, software reboot also doesn't work.

Awesome! I'm gonna try it... Thanks James! :-D


Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi Thiago,
as James said, this is known DPDK behaviour and is not specific to OpenvSwitch.

Any other DPDK tool like l2fwd or such would have killed you just the same.
We documented it in the serverguide.
No release for 16.04 yet but you can take a look at http://bazaar.launchpad.net/~ubuntu-core-doc/serverguide/trunk/view/head:/serverguide/C/network-config.xml

Maybe you can bzr branch lp:serverguide and render it for yourself until it is released.
It contains all the pitfalls I found so far.

Most of the time we also added notes to the .conf files, but not yet in this case.
James - what do you think, should we add a warning in /etc/default/openvswitch-switch?

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Setting to Triaged but only low as it is documented, just not in that conf file.

Changed in openvswitch (Ubuntu):
status: New → Triaged
status: Triaged → Invalid
importance: Undecided → Low
status: Invalid → Triaged
Revision history for this message
James Page (james-page) wrote :

Marking this bug as Fix Released - newer versions of OVS don't slurp all capable devices by default any longer.
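
For context, a sketch of how DPDK ports are named explicitly in later OVS releases (2.7 and newer, to the best of my knowledge), where DPDK is enabled through the database instead of --dpdk on the command line and each port carries its own PCI address in options:dpdk-devargs. The bridge name br0 and the port names dpdk-p0 and dpdk-p1 are placeholders, the PCI addresses are the ones from the report, and the hypothetical helper only prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (do not execute) the ovs-vsctl
# commands that attach explicitly named PCI devices to a userspace bridge.
print_dpdk_port_cmds() {
    bridge=$1
    shift
    echo "ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true"
    echo "ovs-vsctl add-br $bridge -- set bridge $bridge datapath_type=netdev"
    i=0
    for addr in "$@"; do
        echo "ovs-vsctl add-port $bridge dpdk-p$i -- set Interface dpdk-p$i type=dpdk options:dpdk-devargs=$addr"
        i=$((i + 1))
    done
}

# The two VirtIO NICs that the report binds to uio_pci_generic.
print_dpdk_port_cmds br0 0000:00:09.0 0000:00:0a.0
```

With this style of configuration only the devices listed in dpdk-devargs are handed to DPDK, so the primary adapter stays untouched unless it is added as a port.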

Changed in openvswitch (Ubuntu):
status: Triaged → Fix Released