Ownership/Permissions of vhost_user sockets for openvswitch-dpdk make them unusable by libvirt/qemu/kvm

Bug #1546565 reported by Christian Ehrhardt  on 2016-02-17
18
This bug affects 3 people
Affects Status Importance Assigned to Milestone
dpdk (Ubuntu)
High
Unassigned
Xenial
High
Unassigned
neutron-openvswitch (Juju Charms Collection)
High
James Page
openvswitch (Ubuntu)
High
Unassigned
Xenial
High
Unassigned

Bug Description

As of today the vhost_user sockets created by openvswitch have root:root file ownership.
In fact creation is actually done by code the DPDK lib, but the path is passed to it from openvswitch.

The API called to DPDK has no notion of ownership/groups.
It just "inherits" what the current running process has.
But due to LP:1546556 the process ownership/group can't be changed the usual way openvsiwtch would when using dpdk.

KVM as invoked by libvirt will run under libvirt-qemu:kvm and will thereby be unable to access these sockets.

The current workaround is:
   1. wait after start of openvswitch (only then the sockets exist)
   2. chown all created vhost_iuser sockets that are to be used
      e.g. sudo chown libvirt-qemu /var/run/openvswitch/vhost-user-1
   3. if one wants to separate vhost_user sockets from the "rest" of openvswitch /var/run files use e.g.:
      DPDK_OPTS='[...] -vhost_sock_dir /var/run/openvswitch-vhost [...]
   X. this has to be redone every start/restart of oepnvswitch
   Y. if permissions are changed in a way that openvswitch can no more remove them on shutdown they won't re-initialize properly on the next start

That is a severe shortcoming and not really applicable to a supported production environment.
There are discussions ongoing about providing an option to specify owner/group/permissions of vhost_user sockets which would solve the issue.
Unfortunately the patch series is blocked by a wider discussion about moving the dpdk configuration to the ovsdb (which makes sense, but stalls the acceptance of the patches providing the interface to modify permissions.

Link to the last thread about moving dpdk config to ovsdb: http://comments.gmane.org/gmane.network.openvswitch.devel/59186
Link to the last thread about making vhost_user socket user/group configurable - patch 4&5 of this: http://openvswitch.org/pipermail/dev/2015-December/063568.html
But as mentioned it was decided to get the db config discussion done first.

It is unsure if the patches once final will make it into openvswitch 2.5 - it would be great if they would.
But even if not they shouldn't appear too much after and we might be able to cherry pick them?

Changed in dpdk (Ubuntu):
status: New → Triaged
Changed in openvswitch-dpdk (Ubuntu):
status: New → Triaged
importance: Undecided → High
Changed in dpdk (Ubuntu):
importance: Undecided → High

Discussion slowed down and patch series grew - still not committed.
=> https://<email address hidden>/msg57821.html

The way the discussion went this included more and more of the change from commandline to ovsdb config for dpdk.
That makes it uncertain if it will get into branch 2.5

Had a discussion with James Page today.
He will look into it (backporting, hack, other solution) in his work regarding charming that stuff.
Big thanks for that!

We agreed that will discuss (or be happy about the fix) again in approx 10 days from now.

Changed in openvswitch-dpdk (Ubuntu):
milestone: none → ubuntu-16.04
assignee: nobody → James Page (james-page)
Thiago Martins (martinx) wrote :

Looking forward for a proper fix to this problem! Very annoying... :-P

OVS+DPDK on Ubuntu 16.04 looks impressive!

James Page (james-page) wrote :

Adding libvirt task as well as one solution to this is for libvirt to fix the permissions on the socket at the point of vm creation, as it does for disk permissions...

Changed in openvswitch-dpdk (Ubuntu):
assignee: James Page (james-page) → nobody

FYI - Solution for DPDK has passed tests now.
I sent it to DPDK for review and to get their general opinion on it, but we might carry it to be able to work with openvswitch-2.5 as is anyway.

To be toally fixed this will need an DPDK upload (soon in the queue) to work.

We will also update openvswitch e.g. the examples in /etc/defaults/openvswitch that I'll start to prepare now - this is to fix the other portion of the bug in openvswitch.
But this portion will likely be uploaded together with the 2.5.1 point release which fixes some serious openvswitch-dpdk issues anyway.

no longer affects: libvirt (Ubuntu)

This change is supposed to be uploaded together with the merge of the next point release as requested in bug 1573091.
This will fix several critical issues, and without those in place this update of the "how to config" isn't too important yet.

So no need to sponsor this now (in case the bot auto assigns sponsors later).

The attachment "Debdiff for the recommended changes to the examples in /etc/defaults/openvswitch" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch

--vhost-perm 0664 would make even more sense.

Xenial is released, so we are back in SRU mode.
Therefore I add the matching SRU Template for the upload of 2.2.0ubuntu8 which is in the unapproved queue atm.

[Impact]

 * Openvswitch-DPDK not usable without unacceptable workarounds
 * This provides a stable way to get it set up which will be picked up by our packaged openvswitch
 * We also drive a long term solution upstream at ovs/dpdk to drop the delta somewhen in the future

[Test Case]

 * Creating vhost_user sockets with ovs-dpdk always ends up as root:root

[Regression Potential]

 * no change when the dpdk EAL parameters are not set, so low regression potential
 * passed ADT tests on i368/amd64/amd64-lowmem and our full CI (https://code.launchpad.net/~ubuntu-server/ubuntu/+source/dpdk-testing/+git/dpdk-testing)

Martin Pitt (pitti) wrote :

Christian, please upload the packge to yakkety too, otherwise this cannot progress to xenial-updates.

Changed in dpdk (Ubuntu Xenial):
status: New → Fix Committed
tags: added: verification-needed

Hello ChristianEhrhardt, or anyone else affected,

Accepted dpdk into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/dpdk/2.2.0-0ubuntu8 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in dpdk (Ubuntu Xenial):
importance: Undecided → High
Changed in openvswitch-dpdk (Ubuntu Xenial):
importance: Undecided → High
milestone: none → ubuntu-16.04.1
Changed in openvswitch-dpdk (Ubuntu):
milestone: ubuntu-16.04 → none

FYI - Verified in Proposed.

Next I need to prep some Y tests to reasonably request an upload to Yakkety to allow migration as Martin indicated.

tags: added: verification-done-xenial

Thanks Martin for making me aware of the Y upload needed,
I did ensure that it passed sbuild and adt in Y environment, these passed, so now the debdiff for Y in this attachment.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package dpdk - 2.2.0-0ubuntu9

---------------
dpdk (2.2.0-0ubuntu9) yakkety; urgency=medium

  * d/p/ubuntu-backport-[36-37] fix virtio issues (LP: #1570195):
    - don't let DPDK initialize virtio devices still in use by the kernel
    - this avoids conflicts between kernel and dpdk usage of those devices
    - an admin now has to unbind/bind devices as on physical hardware
    - this is in the dpdk 16.04 release and delta can then be dropped
    - d/dpdk-doc.README.Debian update for changes in virtio-pci handling
    - d/dpdk.interfaces update for changes in virtio-pci handling
  * d/p/ubuntu-backport-38... fix for memory leak (LP: #1570466):
    - call vhost_destroy_device on removing vhost user ports to fix memory leak
    - this likely is in the dpdk 16.07 release and delta can then be dropped
  * d/p/ubuntu-fix-vhost-user-socket-permission.patch fox (LP: #1546565):
    - when vhost_user sockets are created they are owner:group of the process
    - the DPDK api to create those has no way to specify owner:group
    - to fix that without breaking the API and potential workaround code in
      consumers of the library like openvswitch 2.6 for example. This patch
      adds an EAL commandline option to specify user:group created vhost_user
      sockets should have.

 -- Christian Ehrhardt <email address hidden> Wed, 27 Apr 2016 07:52:48 -0500

Changed in dpdk (Ubuntu):
status: Triaged → Fix Released
Thiago Martins (martinx) wrote :

Hey guys,

Sorry to ask this here but, how to use the fix ?

I have the proposed repository enabled, the latest DPDK that contains the fix is installed (apt full-upgrade on top of Proposed) but, look:

---
virsh start ubuntu16.04-1
error: Failed to start domain ubuntu16.04-1
error: internal error: process exited while connecting to monitor: 2016-04-29T23:22:57.647708Z qemu-system-x86_64: -chardev socket,id=charnet2,path=/var/run/openvswitch/vhost-user1: Failed to connect socket: Permission denied

---

ll /var/run/openvswitch/vhost-user*
srwxr-xr-x 1 root root 0 Apr 29 23:20 /var/run/openvswitch/vhost-user1=
srwxr-xr-x 1 root root 0 Apr 29 23:20 /var/run/openvswitch/vhost-user2=

---

Thanks!
Thiago

Thiago Martins (martinx) wrote :

Never mind... I'm reading the "ubuntu-fix-vhost-user-socket-permission.patch" file:

---
+* ``--vhost-owner``
+
+ When creating vhost_user sockets change owner and group to the specified value.
+ This can be given as ``user:group``, but also only ``user`` or ``:group`` are supported.
+
+ Examples::
+
+ --vhost-owner 'libvirt-qemu:kvm'
+ --vhost-owner 'libvirt-qemu'
+ --vhost-owner ':kvm'
+
+* ``--vhost-perm``
+
+ When creating vhost_user sockets set them up with these permissions.
+
+ For example::
+
+ --vhost-perm '0664'
---

I'll try it now...

Cheers!
Thiago

Chris J Arges (arges) on 2016-05-04
Changed in openvswitch-dpdk (Ubuntu):
status: Triaged → Invalid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package dpdk - 2.2.0-0ubuntu8

---------------
dpdk (2.2.0-0ubuntu8) xenial; urgency=medium

  * d/p/ubuntu-backport-[36-37] fix virtio issues (LP: #1570195):
    - don't let DPDK initialize virtio devices still in use by the kernel
    - this avoids conflicts between kernel and dpdk usage of those devices
    - an admin now has to unbind/bind devices as on physical hardware
    - this is in the dpdk 16.04 release and delta can then be dropped
    - d/dpdk-doc.README.Debian update for changes in virtio-pci handling
    - d/dpdk.interfaces update for changes in virtio-pci handling
  * d/p/ubuntu-backport-38... fix for memory leak (LP: #1570466):
    - call vhost_destroy_device on removing vhost user ports to fix memory leak
    - this likely is in the dpdk 16.07 release and delta can then be dropped
  * d/p/ubuntu-fix-vhost-user-socket-permission.patch fox (LP: #1546565):
    - when vhost_user sockets are created they are owner:group of the process
    - the DPDK api to create those has no way to specify owner:group
    - to fix that without breaking the API and potential workaround code in
      consumers of the library like openvswitch 2.6 for example. This patch
      adds an EAL commandline option to specify user:group created vhost_user
      sockets should have.

 -- Christian Ehrhardt <email address hidden> Mon, 25 Apr 2016 11:42:40 +0200

Changed in dpdk (Ubuntu Xenial):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for dpdk has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Chris J Arges (arges) on 2016-05-04
no longer affects: openvswitch-dpdk (Ubuntu)
no longer affects: openvswitch-dpdk (Ubuntu Xenial)
James Page (james-page) on 2016-05-10
Changed in neutron-openvswitch (Juju Charms Collection):
milestone: none → 16.07
status: New → In Progress
importance: Undecided → High
assignee: nobody → James Page (james-page)

Reviewed: https://review.openstack.org/314472
Committed: https://git.openstack.org/cgit/openstack/charm-neutron-openvswitch/commit/?id=4f6e2ca2512e298faf17b1db532625132623a628
Submitter: Jenkins
Branch: master

commit 4f6e2ca2512e298faf17b1db532625132623a628
Author: James Page <email address hidden>
Date: Tue May 10 10:16:06 2016 +0100

    Set correct permissions for vhostuser sockets

    The latest updates to DPDK in 16.04 and above introduce two new
    parameters for DPDK initialization which avoid the need to run
    qemu processes with vhostuser sockets as root.

    Use these options to ensure that sockets are created with the
    correct ownership and permissions for OpenStack/KVM.

    Change-Id: I04bbd514d1bdb9b3249ed69e8d64eb66d9839944
    Closes-Bug: 1546565

Changed in neutron-openvswitch (Juju Charms Collection):
status: In Progress → Fix Committed
James Page (james-page) on 2016-06-15
Changed in openvswitch (Ubuntu Xenial):
status: New → Triaged
Changed in openvswitch (Ubuntu):
status: New → Triaged
Changed in neutron-openvswitch (Juju Charms Collection):
status: Fix Committed → Fix Released
Changed in openvswitch (Ubuntu Xenial):
importance: Undecided → High
Changed in openvswitch (Ubuntu):
importance: Undecided → High
Changed in neutron-openvswitch (Juju Charms Collection):
status: Fix Released → Fix Committed
Martin Pitt (pitti) wrote :

Please send this patch to Debian as well, diverging config files in /etc sound bad in the long run.

Hi,
trying to clarify the complex situation regarding submitting this to Debian.

1. DPDK is not yet available in Debian. The to be packaged version is based on Ubuntu packaging and will have this fix.
2. Openvswitch as in Debian atm is not yet DPDK compatible (at 2.3 and would need to be >=2.5). Thereby the patch makes no sense for Debian (yet)

Once these are available in Debian it makes sense to submit to reduce delta.
Well actually we don't derive openvswitch from debian so it is no debian delta - still it would be nice to stay in sync.

I'm patch pilot today and a little confused. Can you check my summary of the situation?

The reason this is on the sponsorship report is to get the patch in comment #6 uploaded to yakkety? Martin asked for it to be sent to Debian, but as there is no dpdk in debian yet, there is no support for it in openvswitch in debian either, so there is nowhere to send the patch. As such the patch seems reasonable to me. Hopefully when Debian does get support from dpdk they won't do anything gratuitously different to what Ubuntu did...

Changed in openvswitch (Ubuntu):
status: Triaged → Incomplete

Hi Michael,
this seems to confuse people regularly, so instead of just answering questions as last time I went over the whole bug again to summarize. Thanks for being persistent pitti and mwhudson!

TL;DR
please unsubscribe sponsors and SRU Verification

Summary:

Upstream DPDK & Openvswitch (and libvirt):
This is a back and forth discussion who should fix it and who not. It is essentially looping around with no project accepting a fix yet and leaving the overall solution (OVS+DPDK on KVM Guests) unusable for end users.
That is why we carry the delta for now - DPDK to provide the option, Openvswitch to show an example how to exploit the option.

DPDK (fix to make option available):
- fixed in Yakkety 2.2.0-0ubuntu9
- fixed in Xenial 2.2.0-0ubuntu8
- Debian (I'm co-leading the Debian effort, the fix is in there already)

neutron-openvswitch Juju Charms Collection (exploit this from charms):
- already fixed in 4f6e2ca2 by James Page

Openvswitch (just add example to default config):
- the debdiff of #6 should have gone to Xenial
- since we now have yakkety I'll add another debdiff for yakkety
- Debian needs to get a recent Openvswitch first and the DPDK with the patch before we can push
- given that:
  - upstream hasn't decided on a solution yet
  - it is only the example that we wanted to be extended
  - it is covered in https://help.ubuntu.com/16.04/serverguide/DPDK.html#dpdk-openvswitch
  => I think we can drop this for now.

Changed in openvswitch (Ubuntu):
status: Incomplete → Won't Fix
Changed in openvswitch (Ubuntu Xenial):
status: Triaged → Won't Fix
Liam Young (gnuoy) on 2016-07-29
Changed in neutron-openvswitch (Juju Charms Collection):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers