os_vif.plug() at cni fails with privsep issue when using ovs-firewall

Bug #1673832 reported by Luis Tomas Bolivar on 2017-03-17
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
kuryr-kubernetes
High
Antoni Segura Puimedon

Bug Description

When deploying kuryr-kubernetes with ovs-firewall enabled I got some errors at the os_vif.plug() command triggered by the cni, which prevent the container to go to running state due to some privsep issues.

This is the error at the kubelet.log:

E0317 13:24:36.315148 29430 cni.go:209] Error while adding to cni network: Failed to plug VIF VIFOpenVSwitch(active=True,address=fa:16:3e:c4:e7:e3,bridge_name='br-int',has_traffic_filtering=True,id=f2f07b6c-7513-47a4-952b-ea8cdb412de3,network=Network(9f60eacf-564a-4b57-9205-410ffbe7ec1f),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tapf2f07b6c-75'). Got error: privsep helper command exited non-zero (1); Traceback (most recent call last):
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/cni/api.py", line 84, in run
    vif = self._plugin.add(params)
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/cni/main.py", line 40, in add
    self._watcher.start()
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/watcher.py", line 111, in start
    self._start_watch(path)
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/watcher.py", line 129, in _start_watch
    self._watch(path)
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/watcher.py", line 141, in _watch
    self._handler(event)
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/handlers/dispatch.py", line 110, in __call__
    self._handler(event)
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/handlers/dispatch.py", line 64, in __call__
    handler(event)
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/handlers/k8s_base.py", line 63, in __call__
    self.on_present(obj)
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/cni/handlers.py", line 44, in on_present
    self.on_vif(pod, vif)
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/cni/handlers.py", line 78, in on_vif
    self._cni.CNI_IFNAME, self._cni.CNI_NETNS)
  File "/opt/stack/kuryr-kubernetes/kuryr_kubernetes/cni/binding/base.py", line 61, in connect
    os_vif.plug(vif, instance_info)
  File "/usr/lib/python2.7/site-packages/os_vif/__init__.py", line 85, in plug
    raise os_vif.exception.PlugException(vif=vif, err=err)
PlugException: Failed to plug VIF VIFOpenVSwitch(active=True,address=fa:16:3e:c4:e7:e3,bridge_name='br-int',has_traffic_filtering=True,id=f2f07b6c-7513-47a4-952b-ea8cdb412de3,network=Network(9f60eacf-564a-4b57-9205-410ffbe7ec1f),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tapf2f07b6c-75'). Got error: privsep helper command exited non-zero (1)

The current workaround is:
- As os-vif.plug() is not really needed for ovs-firewall, just commenting out that os_vif.plug fix the problem:

#os_vif.plug(vif, instance_info) in /opt/stack/kuryr-kubernetes/kuryr_kubernetes/cni/binding/base.py line 61

Changed in kuryr-kubernetes:
importance: Undecided → High
status: New → Triaged
Irena Berezovsky (irenab) wrote :
Download full text (3.2 KiB)

The following was logged while running on Xenial:
2017-03-20 09:03:39.063 31121 DEBUG os_vif [-] Plugging vif VIFBridge(active=True,address=fa:16:3e:09:e0:2d,bridge_name='qbr3e519fd3-c6',has_traffic_filtering=True,id=3e519fd3-c671-4627-ba39-f9fcebf3d2f3,network=Network(483f2685-846c-4a46-83fc-9e5cc4243451),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap3e519fd3-c6') plug /usr/local/lib/python2.7/dist-packages/os_vif/__init__.py:79
2017-03-20 09:03:39.065 31121 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'privsep-helper', '--config-file', u'/etc/kuryr/kuryr.conf', '--privsep_context', 'vif_plug_ovs.privsep.vif_plug', '--privsep_sock_path', '/tmp/tmpw_hQ_9/privsep.sock']
2017-03-20 09:03:39.480 31121 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap
2017-03-20 09:03:39.481 31121 DEBUG oslo.privsep.daemon [-] Accepted privsep connection to /tmp/tmpw_hQ_9/privsep.sock __init__ /usr/local/lib/python2.7/dist-packages/oslo_privsep/daemon.py:332
Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/__init__.py", line 861, in emit
    msg = self.format(record)
  File "/usr/local/lib/python2.7/dist-packages/oslo_log/handlers.py", line 100, in format
    return logging.StreamHandler.format(self, record)
  File "/usr/lib/python2.7/logging/__init__.py", line 734, in format
    return fmt.format(record)
  File "/usr/local/lib/python2.7/dist-packages/oslo_log/formatters.py", line 273, in format
    for arg in record.args:
TypeError: 'NoneType' object is not iterable
Logged from file daemon.py, line 506
Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/__init__.py", line 861, in emit
    msg = self.format(record)
  File "/usr/local/lib/python2.7/dist-packages/oslo_log/handlers.py", line 100, in format
    return logging.StreamHandler.format(self, record)
  File "/usr/lib/python2.7/logging/__init__.py", line 734, in format
    return fmt.format(record)
  File "/usr/local/lib/python2.7/dist-packages/oslo_log/formatters.py", line 273, in format
    for arg in record.args:
TypeError: 'NoneType' object is not iterable
Logged from file daemon.py, line 395
Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/__init__.py", line 861, in emit
    msg = self.format(record)
  File "/usr/local/lib/python2.7/dist-packages/oslo_log/handlers.py", line 100, in format
    return logging.StreamHandler.format(self, record)
  File "/usr/lib/python2.7/logging/__init__.py", line 734, in format
    return fmt.format(record)
  File "/usr/local/lib/python2.7/dist-packages/oslo_log/formatters.py", line 273, in format
    for arg in record.args:
TypeError: 'NoneType' object is not iterable
Logged from file daemon.py, line 414
Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/__init__.py", line 861, in emit
    msg = self.format(record)
  File "/usr/local/lib/python2.7/dist-packages/oslo_log/handlers.py", line 100, in format
    return logging.StreamHandler.format(self, record)
  File "/usr/lib/python2.7/logging/__init__.py", line 734, in format
    return fmt.format(record)
  File "/usr/local/lib/python2.7/dist-packages/oslo_log/format...

Read more...

Kirill Zaitsev (kzaitsev) wrote :

Here's another bit of logs, that look interesting:

2017-03-24 12:52:14.386 23097 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'privsep-helper', '--config-file', u'/etc/kuryr/kuryr.conf', '--privsep_context', 'vif_plug_ovs.privsep.vif_plug
', '--privsep_sock_path', '/tmp/tmpRuY61M/privsep.sock']
2017-03-24 12:52:14.392 23097 WARNING oslo.privsep.daemon [-] privsep log: root is not in the sudoers file. This incident will be reported.
2017-03-24 12:52:14.393 23097 CRITICAL oslo.privsep.daemon [-] privsep helper command exited non-zero (1)
2017-03-24 12:52:14.393 23097 ERROR os_vif [-] Failed to plug vif VIFOpenVSwitch(active=True,address=fa:16:3e:81:81:a5,bridge_name='br-int',has_traffic_filtering=True,id=718f89b7-3882-4ff5-91b8-d280c8fe28
aa,network=Network(b6cf0aff-0f35-4cb3-9d19-66a8960fa471),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap718f89b7-38')
2017-03-24 12:52:14.393 23097 ERROR os_vif Traceback (most recent call last):
2017-03-24 12:52:14.393 23097 ERROR os_vif File "/usr/local/lib/python2.7/dist-packages/os_vif/__init__.py", line 80, in plug
2017-03-24 12:52:14.393 23097 ERROR os_vif plugin.plug(vif, instance_info)
2017-03-24 12:52:14.393 23097 ERROR os_vif File "/usr/local/lib/python2.7/dist-packages/vif_plug_ovs/ovs.py", line 165, in plug
2017-03-24 12:52:14.393 23097 ERROR os_vif constants.OVS_DATAPATH_SYSTEM)
2017-03-24 12:52:14.393 23097 ERROR os_vif File "/usr/local/lib/python2.7/dist-packages/oslo_privsep/priv_context.py", line 204, in _wrap
2017-03-24 12:52:14.393 23097 ERROR os_vif self.start()
2017-03-24 12:52:14.393 23097 ERROR os_vif File "/usr/local/lib/python2.7/dist-packages/oslo_privsep/priv_context.py", line 215, in start
2017-03-24 12:52:14.393 23097 ERROR os_vif channel = daemon.RootwrapClientChannel(context=self)
2017-03-24 12:52:14.393 23097 ERROR os_vif File "/usr/local/lib/python2.7/dist-packages/oslo_privsep/daemon.py", line 328, in __init__
2017-03-24 12:52:14.393 23097 ERROR os_vif raise FailedToDropPrivileges(msg)
2017-03-24 12:52:14.393 23097 ERROR os_vif FailedToDropPrivileges: privsep helper command exited non-zero (1)
2017-03-24 12:52:14.393 23097 ERROR os_vif

the most interesting bit here is "root is not in the sudoers file. This incident will be reported". After adding root to sudoers file the issue is half-gone for me. (I still get stacktraces Irena mentions).

The issue is that privsep, as it is used from os-vif, tries to use sudo to run things as root. However, CNI is already run as root. So this makes little sense. The privsep mode that should be used is Method.FORK.

The following snippet forces it:

diff --git a/kuryr_kubernetes/cni/main.py b/kuryr_kubernetes/cni/main.py
index 3188600..2bb121e 100644
--- a/kuryr_kubernetes/cni/main.py
+++ b/kuryr_kubernetes/cni/main.py
@@ -61,6 +61,10 @@ class K8sCNIPlugin(cni_api.CNIPlugin):
         config.init(args)
         config.setup_logging()
         os_vif.initialize()
+ ovs = os_vif._EXT_MANAGER['ovs'].obj
+ ovs_mod = sys.modules[ovs.__module__]
+ ovs_mod.linux_net.privsep.vif_plug.start(
+ ovs_mod.linux_net.privsep.priv_context.Method.FORK)
         clients.setup_kubernetes_client()
         self._pipeline = h_cni.CNIPipeline()
         self._watcher = k_watcher.Watcher(self._pipeline)

However, the right solution would be to make os-vif configurable to choose the privsep method.

Changed in kuryr-kubernetes:
milestone: none → pike-3
assignee: nobody → Antoni Segura Puimedon (celebdor)
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers