/run/netns/* gets umounted on the host when a container starts

Bug #1307829 reported by Kaloyan Ganchev
This bug affects 5 people
Affects            Status         Importance   Assigned to   Milestone
iproute (Ubuntu)   Invalid        Medium       Unassigned
lxc (Ubuntu)       Fix Released   Medium       Unassigned

Bug Description

Hello,
I am using the latest stable lxc build from http://ppa.launchpad.net/ubuntu-lxc/stable/ubuntu on an OpenStack controller to run containers. OpenStack also uses network namespaces. When I boot the server without auto-starting the containers, the OpenStack network namespaces look fine. If I start an lxc container with lxc-start, the container starts and its networking works, but network namespaces created before that (by the OpenStack installation, for example) become unusable with the following error:

 root@osctrl3dc02:~# ip netns exec vips ip a
 seting the network namespace failed: Invalid argument

Here is the strace:

 open("/var/run/netns/vips", O_RDONLY) = 4
 syscall_308(0x4, 0x40000000, 0x7fffc4d54e83, 0x7fffc4d54bf0, 0x430af0,
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0) = -1 (errno 22)

As far as I know, syscall_308 should set the namespace, but it seems to fail when accessing /var/run/netns/vips.
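
For readers decoding the strace above: on x86_64, syscall number 308 is setns(2), and its second argument is a CLONE_NEW* flag naming the namespace type the caller expects. The helper below is only a small sketch for translating that flag; the function name nstype_name is made up for illustration, and the values are the CLONE_NEW* constants from <linux/sched.h>.

```shell
#!/bin/sh
# Translate the nstype argument of setns(2) into its CLONE_NEW* name.
# nstype_name is a hypothetical helper, not part of iproute or lxc.
nstype_name() {
    # printf normalises decimal or hex input to zero-padded lowercase hex
    case $(printf '0x%08x' "$1") in
        0x00000000) echo "any namespace type" ;;
        0x00020000) echo CLONE_NEWNS ;;
        0x02000000) echo CLONE_NEWCGROUP ;;
        0x04000000) echo CLONE_NEWUTS ;;
        0x08000000) echo CLONE_NEWIPC ;;
        0x10000000) echo CLONE_NEWUSER ;;
        0x20000000) echo CLONE_NEWPID ;;
        0x40000000) echo CLONE_NEWNET ;;
        *) echo unknown; return 1 ;;
    esac
}

# Second argument of the failing call in the strace
nstype_name 0x40000000
```

The flag decodes to CLONE_NEWNET, so the call asked to join a network namespace. An EINVAL result would be consistent with the opened file no longer referring to a network namespace at all.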

It is strange that the permissions are altered:
root@osctrl3dc02:~# ls -alh /var/run/netns/vips
 ---------- 1 root root 0 Apr 14 08:48 /var/run/netns/vips

This file has the following permissions before I start the container:
 -r--r--r-- 1 root root 0 Apr 12 14:01 /var/run/netns/vips

If I destroy the vips namespace and create it again while keeping the lxc containers running, everything is back to normal: both the containers and OpenStack networking work.

Best regards,

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: iproute 20111117-1ubuntu2.1
ProcVersionSignature: Ubuntu 3.11.0-19.33~precise1-generic 3.11.10.5
Uname: Linux 3.11.0-19-generic x86_64
ApportVersion: 2.0.1-0ubuntu17.6
Architecture: amd64
Date: Tue Apr 15 00:30:18 2014
InstallationMedia: Ubuntu-Server 12.04.4 LTS "Precise Pangolin" - Release amd64 (20140204)
MarkForUpload: True
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: iproute
UpgradeStatus: No upgrade log present (probably fresh install)

Kaloyan Ganchev (kaloqn-ganchev) wrote :

Serge Hallyn (serge-hallyn) wrote :

Thanks for reporting this bug. You say this is only with namespaces pre-created by openstack. I'm confused on that - why is openstack creating new network namespaces inside the container?

I've just tested under precise, and 'ip netns add' does the right thing there, so contrary to what I said before it looks like there is no bug in iproute. I'm going to mark this against nova, but really have no idea what part of nova is involved here.

Serge Hallyn (serge-hallyn) wrote :

When you say

> root@osctrl3dc02:~# ip netns exec vips ip a

Is osctrl3dc02 the host or a container? Are you saying that you start a container on the host, and then /var/run/netns/ contents change on the host?

Serge Hallyn (serge-hallyn) wrote :

D'oh, never mind, I see it now.

no longer affects: nova (Ubuntu)
Changed in lxc (Ubuntu):
importance: Undecided → Medium
status: New → Confirmed
Serge Hallyn (serge-hallyn) wrote :

So the particular files /var/run/netns/whatzit are bind-mounted /proc/self/ns/net files from a task which no longer exists, which are pinning the netns.

Interestingly, if I reproduce this by hand by doing

term 1: lxc-unshare -s NETWORK -- /bin/bash
term 2: touch /var/run/netns/z; mount --bind /proc/$pid/ns/net /var/run/netns/z
lxc-start -n t1 -d; sleep 3; lxc-stop -n t1 -k

then /var/run/netns/z permissions are not changed.

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1307829

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Serge Hallyn (serge-hallyn) wrote : Re: network namespace error

Ok I see the problem but am not sure what to do about it.

iproute makes /var/run/netns MS_SHARED. When a container starts up, it umounts everything. So the netns bind mounts are being umounted on the host.

Ideally it would be as simple as marking /var/run/netns MS_SLAVE before spawning the container. However, 'mount --make-rslave /var/run/netns' fails because /var/run/netns doesn't appear to be in my mounts table. Rather, /netns is.
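
One way to inspect the propagation state discussed here is to read /proc/self/mountinfo, where a shared mount carries a "shared:N" tag and a slave mount a "master:N" tag in the optional fields. The following is a minimal sketch; the function name netns_propagation and its optional second argument (an alternative mountinfo file, handy for testing) are assumptions for illustration.

```shell
#!/bin/sh
# List mounts at or below a prefix together with their propagation tags.
# Usage: netns_propagation [prefix] [mountinfo-file]
# netns_propagation is a hypothetical helper name.
netns_propagation() {
    prefix=${1:-/run/netns}
    file=${2:-/proc/self/mountinfo}
    awk -v p="$prefix" '
        # field 5 is the mount point
        $5 == p || index($5, p "/") == 1 {
            prop = ""
            # optional fields run from field 7 up to the "-" separator
            for (i = 7; i <= NF && $i != "-"; i++)
                prop = prop (prop == "" ? "" : ",") $i
            # no optional fields means the mount is private
            print $5, (prop == "" ? "private" : prop)
        }
    ' "$file"
}

# Example: netns_propagation /run/netns
```

A name that exists under /run/netns but is missing from this output is no longer bind-mounted, which matches the state of /var/run/netns/vips after the container start.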

no longer affects: linux (Ubuntu)
Serge Hallyn (serge-hallyn) wrote :

(Please disregard the notice about required logs)

Changed in iproute (Ubuntu):
importance: Undecided → Medium
status: New → Confirmed
Serge Hallyn (serge-hallyn) wrote :

The reason iproute is doing this is:

        /* Make it possible for network namespace mounts to propogate between
         * mount namespaces. This makes it likely that a unmounting a network
         * namespace file in one namespace will unmount the network namespace
         * file in all namespaces allowing the network namespace to be freed
         * sooner.
         */

The command 'ip netns delete x1' simply unmounts /run/netns/x1. If you have 300 'ip netns exec x$i' commands running, then having /run/netns MS_SHARED will propagate the unmount to all 300 namespaces, causing the network namespace to be freed earlier.

Unfortunately that makes it so that any task which unmounts /run/netns/x1, which all can do, unmounts it everywhere.

Serge Hallyn (serge-hallyn) wrote :

One way iproute could be helpful here is by creating a /run/netns/mnt, onto which one 'iproute' mounts namespace was bind-mounted. Then 'ip netns exec' could setns into that mount namespace, *then* unshare mntns. The /run/netns could be a slave to the host but a peer with all its child namespaces. (I guess it would have to be /run/netns_mnt for that to be sane)

Serge Hallyn (serge-hallyn) wrote :

Ah the problem was that /etc/mtab was a file, and /run/netns did not show up in it so mount refused to act on it. Changing /etc/mtab to a symlink to /proc/mounts allows me to make those rslave.

So it should suffice for lxc to always turn all of / into MS_SLAVE. It currently does so only when / is MS_SHARED.

Changed in iproute (Ubuntu):
status: Confirmed → Invalid
summary: - network namespace error
+ /run/netns/* gets umounted on the host when a container starts
Kaloyan Ganchev (kaloqn-ganchev) wrote :

Do you think there is a better way to work around this issue other than recreating the non-lxc network namespaces after all lxc containers start? Until a fix is released, of course.

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1307829] Re: /run/netns/* gets umounted on the host when a container starts

Quoting Kaloyan Ganchev (<email address hidden>):
> Do you think there is a better way to work around this issue other than
> recreating the non lxc network namespaces after all lxc containers
> start? Until a fix is released, of course.

Yes, if you're starting the containers by hand, then you should be able
to do so from a private mounts namespace, like so:

sudo lxc-unshare -s MOUNT -- /bin/bash
mount --bind /proc/mounts /etc/mtab
mount --make-rslave /
lxc-start -n mycontainer
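
The interactive steps above could be wrapped for non-interactive use along these lines. This is a sketch only: the function name start_isolated and the DRY_RUN switch are made up for illustration, the -d flag is borrowed from the reproduction earlier in this thread, and it assumes lxc-unshare passes the trailing arguments to the command unchanged.

```shell
#!/bin/sh
# Start a container from a private mounts namespace so that its umounts
# do not propagate back to the host. Must be run as root.
# start_isolated and DRY_RUN are hypothetical names.
start_isolated() {
    [ -n "${1:-}" ] || { echo "usage: start_isolated CONTAINER" >&2; return 2; }
    name=$1
    # Same steps and order as the manual workaround; "$1" is expanded by
    # the inner shell, which receives the container name as its argument.
    inner='mount --bind /proc/mounts /etc/mtab && mount --make-rslave / && lxc-start -n "$1" -d'
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of running it
        printf 'lxc-unshare -s MOUNT -- /bin/sh -c %s sh %s\n' "'$inner'" "$name"
        return 0
    fi
    lxc-unshare -s MOUNT -- /bin/sh -c "$inner" sh "$name"
}

# Example: DRY_RUN=1 start_isolated mycontainer
```

Running it with DRY_RUN=1 first shows the exact command without touching any mounts.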

Serge Hallyn (serge-hallyn) wrote :

I've sent a patch upstream to fix this. It should be pulled into trusty's lxc soon-ish as part of the next stable release.

Changed in lxc (Ubuntu):
status: Confirmed → Fix Released
Panagiotis Moustafellos (pmoust) wrote :

What was the patch exactly, Serge? Is there a link to an upstream sha1?
