lxc-start symlink vulnerabilities may allow guest to read host filesystem, interfere with apparmor

Bug #1476662 reported by Roman Fiedler
264
This bug affects 1 person
Affects Status Importance Assigned to Milestone
lxc (Ubuntu)
Fix Released
High
Unassigned

Bug Description

lxc-start shuffles around mounts using helper directory /usr/lib/x86_64-linux-gnu/lxc (guest root fs is mounted here)

It then modifies mounts operating in guest root directory before invoking init. As it does not check if all mount points are directories, a malicious guest may modify its internal structure before shutdown (or was created using manipulated image) and then when started again, guest may

* Access the whole host root filesystem

* Block switching from lxc-start apparmor profile to lxc-container-default

# Real putold before pivot-root (root fs will end here)
mkdir -p /x/lxc_putold

# Faked putold
ln -s /usr/lib/x86_64-linux-gnu/lxc/x/lxc_putold lxc_putold
mkdir -p /usr/lib/x86_64-linux-gnu/lxc/x/lxc_putold/proc
touch /usr/lib/x86_64-linux-gnu/lxc/x/lxc_putold/proc/mounts

# proc fake
mkdir -p /x/proc
umount /proc
rmdir /proc
ln -s /usr/lib/x86_64-linux-gnu/lxc/x/proc proc

mkdir -p /usr/lib/x86_64-linux-gnu/lxc/x/proc/1/attr /usr/lib/x86_64-linux-gnu/lxc/x/proc/self
touch /usr/lib/x86_64-linux-gnu/lxc/x/proc/1/attr/current
touch /usr/lib/x86_64-linux-gnu/lxc/x/proc/self/status

The issue was also found during https://service.ait.ac.at/security/2015/LxcSecurityAnalysis.html

CVE References

Revision history for this message
Stéphane Graber (stgraber) wrote :

I'm assuming this was using LXC 1.0, any chance you can check LXC 1.1?

We've deprecated pivotdir in LXC 1.1 for other reasons but that may very well prevent this entirely, if that's the case, then we can backport that change to LXC 1.0 minus the deprecation warning.

Revision history for this message
Roman Fiedler (roman-fiedler-deactivatedaccount) wrote :

No, only tested on Ubuntu Trusty. I do not have other test machine at hand at the moment. Could you run

strace -s256 -f /bin/sh -c "/usr/bin/lxc-start --name [guestname]" 2>&1 | tee /tmp/test.log

with a container where init is replaced with /bin/true (or just empty file) and send me the log?

Revision history for this message
Stéphane Graber (stgraber) wrote :

This is running:
strace -o out -s256 -f lxc-start -n test -F -- /bin/true

The -F flag is needed in 1.1 and higher as -d is now the default.

Revision history for this message
Roman Fiedler (roman-fiedler-deactivatedaccount) wrote :

The pivot_root part looks more solid - the "." -> "." should make any symlink trickery with host / impossible here, the MS_SLAVE should do the rest. But perhaps someone else may take a look ...

The apparmor modification may still work, I do not see major differences preventing it. Do you know of critical features allowed in lxc-start that would be disabled by standard container profile with that?

25741 open("/proc/1/attr/current", O_WRONLY) = 11
25741 write(11, "changeprofile lxc-container-default", 35) = 35

Revision history for this message
Stéphane Graber (stgraber) wrote :

Indeed, this is still an issue. Sounds like the best way to fix this is:
 - Backport the pivot_root change to the 1.0 branch (but without the deprecation warning, just silently ignore it)
 - Refuse to mount anything to a destination path which happens to be a symlink

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

(working on a patch for (2) today)

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Tyler Hicks (tyhicks) wrote :

Hi Serge - this patch looks good for the most part but I'm curious if it is possible for a container admin to modify the target path during or after the ensure_not_symlink() checks and before the mount? It feels like there's a TOCTOU issue in there but maybe the admin can't possibly make changes while the check is happening?

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Yup, that race is there in theory. This appears to be yet another reason to push for a 'fdmount/mountat' function. But lacking that I'm not sure how we can prevent this.

Do you have any suggestions?

If we have the separate fix in apparmor for writing to /proc/self/attr/current, and the pivot_root update backported, what other attacks remain meaningful here?

Revision history for this message
Tyler Hicks (tyhicks) wrote : Re: [Bug 1476662] Re: lxc-start symlink vulnerabilities may allow guest to read host filesystem, interfere with apparmor

On 2015-07-30 18:46:50, Serge Hallyn wrote:
> Yup, that race is there in theory. This appears to be yet another
> reason to push for a 'fdmount/mountat' function. But lacking that I'm
> not sure how we can prevent this.
>
> Do you have any suggestions?

You could split all of the path components up into an array (or however
you want to iterate over them) and then walk each component of the path
like so:

for (component = components[0]; component; component++) {
  fd = open(dirfd, component,
            O_CLOEXEC | O_NOFOLLOW | O_DIRECTORY | O_RDONLY);
  fchdir(fd);
  close(dirfd);
  dirfd = fd;
}
close(fd);
mount(source, ".", ...);

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Quoting Tyler Hicks (<email address hidden>):
> On 2015-07-30 18:46:50, Serge Hallyn wrote:
> > Yup, that race is there in theory. This appears to be yet another
> > reason to push for a 'fdmount/mountat' function. But lacking that I'm
> > not sure how we can prevent this.
> >
> > Do you have any suggestions?
>
> You could split all of the path components up into an array (or however
> you want to iterate over them) and then walk each component of the path
> like so:
>
> for (component = components[0]; component; component++) {
> fd = open(dirfd, component,
> O_CLOEXEC | O_NOFOLLOW | O_DIRECTORY | O_RDONLY);
> fchdir(fd);
> close(dirfd);
> dirfd = fd;
> }
> close(fd);
> mount(source, ".", ...);

Haha. I guess that works :) mount with '.' was escaping me - thanks!

Revision history for this message
Marc Deslauriers (mdeslaur) wrote :

This is CVE-2015-1335

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Ah, right, the mount(".") doesn't work in the (very common) case of file mounts. We could special-case the file mount case to there chdir to the parent and then mount onto the filename, but that still allows for a TOCTTOU.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

This is the v2 of the patch - which breaks lxc by not handling mounting onto files.

Revision history for this message
Tyler Hicks (tyhicks) wrote :

Ugh, I didn't think about bind mounting files. I can't think of any solutions for that scenario that would not require kernel changes.

Thanks for trying out the idea.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Ah, but there is something we can do: we can verify that the target path in /proc/self/mountinfo is what we expect.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

An alternative proposed patch which verifies that mounts were not on symlinks after the fact using /proc/self/mountinfo.

Since mounts are made in a private namespace and lxc has made every effort to first force all mounts to MS_SLAVE, this should be safe. Upon failure, lxc will report an error and fail/stop the container startup.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Patch combining two approaches.

While it has the allure of trying all the harder to prevent symlinks, it also does result in more complicated code. I'm not sure which we should go with.

(Note that while the two original approaches were independently tested, this patch has not been tested. In fact the support for file mounts has not been tested even individually)

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Hm, more problems with the chdir+mount approach - the susequent attempt to fix up readonly bind mounts results in

[pid 30540] mount(".", ".", 0x7fffca438669, MS_RDONLY|MS_REMOUNT|MS_BIND, NULL) = -1 EINVAL (Invalid argument)

(with a fix to the posted patch to mount mountentry onto itself rather than fsname in the REMOUNT case)

So I'm really tempted to suggest going with the simpler patch in comment #17

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

New idea from Eric Biederman for using /proc/self/fd/N as mount target for the file bind mounts. I'll work on a patch, hopefully this week.

Changed in lxc (Ubuntu):
status: New → In Progress
importance: Undecided → High
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

I've ported the patches to 1.0 and 1.1. Also made a minor change (reflected in the changelog) to the patch against current git HEAD.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Tyler Hicks (tyhicks) wrote :

Hi Serge - I've reviewed the patch against git HEAD and have a couple questions.

1) In mount_entry(), the call to mount(2) is changed to use target as both the source and the destination. I can't see why this would be a valid change.

2) In is_file_mount(), the usage of lstat(2) seems odd for a couple reasons:
   - If path is a symlink to a directory, the mount will be incorrectly treated as a file mount in safe_mount(). I think stat(2) should be used.
   - Error checking of lstat(2) is ignored. I think this should be treated as fatal for most (all?) cases.

3) In open_without_symlink(), why is curlen decremented after assigning it the return value of strlen(prefix_skip)? Is it because prefix_skip is somehow guaranteed to have a trailing '/' character?

4) open_without_symlink() is documented to return -1 upon error but it returns -EINVAL when target doesn't start with prefix_skip. (minor)

5) Is there a need to protect against ".." components in open_without_symlink()? I'm thinking of a target of "/foo/../" and a prefix_skip of "/foo".

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Quoting Tyler Hicks (<email address hidden>):
> Hi Serge - I've reviewed the patch against git HEAD and have a couple
> questions.

Thanks, Tyler.

> 1) In mount_entry(), the call to mount(2) is changed to use target as
> both the source and the destination. I can't see why this would be a
> valid change.

It appears to me to be more correct. Since we are doing a remount,
the old source isn't really valid any more. I'm thinking of it in
terms of

 mount --bind /ab /cd
 mount -o bind.remount,ro /cd

But I'll test with that part undone, and change it back, as I may just
not be thinking right.

> 2) In is_file_mount(), the usage of lstat(2) seems odd for a couple reasons:
> - If path is a symlink to a directory, the mount will be incorrectly treated as a file mount in safe_mount(). I think stat(2) should be used.

Hm, yeah. I'll change that.

> - Error checking of lstat(2) is ignored. I think this should be treated as fatal for most (all?) cases.

My tree has

 ret = lstat(path, &sb);
 if (ret)
  return false;

Oh, you mean that if the lstat (or stat) fails, the whole safe_mount()
should be aborted? That's probably a good thing to fix.

> 3) In open_without_symlink(), why is curlen decremented after assigning
> it the return value of strlen(prefix_skip)? Is it because prefix_skip is
> somehow guaranteed to have a trailing '/' character?

Hm, I don't think that's guaranteed.

> 4) open_without_symlink() is documented to return -1 upon error but it
> returns -EINVAL when target doesn't start with prefix_skip. (minor)
>
> 5) Is there a need to protect against ".." components in
> open_without_symlink()? I'm thinking of a target of "/foo/../" and a
> prefix_skip of "/foo".

The container admin cannot change the path being passed to
open_without_symlink() without having access to the container config
file, in which case he owns the host (or parent container) end of
story. But I think it's probably worth simply refusing any paths
with '..' in them, yes.

Revision history for this message
Tyler Hicks (tyhicks) wrote :

On 2015-09-16 21:15:13, Serge Hallyn wrote:
> Quoting Tyler Hicks (<email address hidden>):
> > 1) In mount_entry(), the call to mount(2) is changed to use target as
> > both the source and the destination. I can't see why this would be a
> > valid change.
>
> It appears to me to be more correct. Since we are doing a remount,
> the old source isn't really valid any more. I'm thinking of it in
> terms of
>
> mount --bind /ab /cd
> mount -o bind.remount,ro /cd
>
> But I'll test with that part undone, and change it back, as I may just
> not be thinking right.

Aha! It didn't click with me that you are doing a remount there. If we
look at the mount(2) man page, under the MS_REMOUNT section:

"target should be the same value specified in the initial mount() call;
source and filesystemtype are ignored."

I guess you could even pass in NULL for source and filesystemtype.

> > 2) In is_file_mount(), the usage of lstat(2) seems odd for a couple reasons:
> > - If path is a symlink to a directory, the mount will be incorrectly treated as a file mount in safe_mount(). I think stat(2) should be used.
>
> Hm, yeah. I'll change that.
>
> > - Error checking of lstat(2) is ignored. I think this should be
> treated as fatal for most (all?) cases.
>
> My tree has
>
> ret = lstat(path, &sb);
> if (ret)
> return false;
>
> Oh, you mean that if the lstat (or stat) fails, the whole safe_mount()
> should be aborted? That's probably a good thing to fix.

Yes, that's what I meant. At the moment, a return value of false could
mean that it isn't a file mount or that lstat(2) failed. Those feel like
two different conditions to me.

>
> > 3) In open_without_symlink(), why is curlen decremented after assigning
> > it the return value of strlen(prefix_skip)? Is it because prefix_skip is
> > somehow guaranteed to have a trailing '/' character?
>
> Hm, I don't think that's guaranteed.
>
> > 4) open_without_symlink() is documented to return -1 upon error but it
> > returns -EINVAL when target doesn't start with prefix_skip. (minor)
> >
> > 5) Is there a need to protect against ".." components in
> > open_without_symlink()? I'm thinking of a target of "/foo/../" and a
> > prefix_skip of "/foo".
>
> The container admin cannot change the path being passed to
> open_without_symlink() without having access to the container config
> file, in which case he owns the host (or parent container) end of
> story. But I think it's probably worth simply refusing any paths
> with '..' in them, yes.

I understand that the right to modify the container config essentially
equals host root privs so I'm fine either way and will leave that
decision up to you.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

The patch above still causes some regressions. The fix may be simple, but it's getting late.

The issue, I hope, is that we demand that we be able to inspect the mount target, to detect whether it is a symlink. This is more restrictive than 'mount' is. I can, from a non-init userns, mount a directory over /root even though i can't look at /root.

One of the failures I'm seeing is with $containerroot/proc/sysrq-trigger, which is
--w------- 1 root root 0 Sep 16 23:47 /proc/sysrq-trigger

Since a symbolic cannot have its permissions changed, I think we can assume that if we get EPERM on a final part of a link, we can allow the mount.

However, since we cannot then open the file/dir (to fchdir to or to mount onto /proc/self/fd), dealing with this complicates the whole thing.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

We can detect the case where the last path element is unreadable, but
we can't then handle it using this technique. If we just detect that
case and then fall back to mount(src, dest, ...), then we are open to a
race where

  a. first /home/$user was bind-mounted into the container at /foo
  b. then a mount is directed onto $containerroot/foo/a/b/c
  c. /home/$user/c is unreadable
  d. $user can redirect /home/$user/a/b to become an absolute symlink
     outside of the container (let's say /) after we've checked for
     absolute symlinks on /a/b but before we do the simple mount()

We'd need a way to use an fd to /a/b to prevent that from happening.
Perhaps we can chdir(/a/b) and then mount onto relative path 'c' ?
However c itself is still racy.

And so I'm leaning back toward doing the approach where we just blindly
do the mount, then check /proc/self/mountinfo or realpath() output to
verify that the mount was placed inside the container.

Revision history for this message
Roman Fiedler (roman-fiedler-deactivatedaccount) wrote : AW: [Bug 1476662] Re: lxc-start symlink vulnerabilities may allow guest to read host filesystem, interfere with apparmor

I think that the implementation of fusermount (file system in user space) had
similar problem, some code in this domain uses scheme, were process opens the
dir, chdirs in the other one and uses the /proc/self/fd elements in the mount
command.

This is even resistent to moving/deleting and recreating the directory in
between, but I'm not sure, if it could be applied here also.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

@roma-fiedler,

I think what you describe is what my last patch was doing. The problem is we need to be able to mount files onto paths which we can' topen (i.e. /proc/sysrq-trigger). In that case I can't see a way to make sure that the path is not changed between our last check of it and the mount onto it.

Failing any good solution to that, I intend to switch out all that code in favor of code which just does the mount, then checks realpth() results to make sure the target ended up in the container directory. Since we are in a MS_SLAVE remounted rootfs in a private mntns, there shouldn't be any downsides to that approach. Feedback very much appreciated.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Eric suggested using O_PATH in that case, I'll try that.

Revision history for this message
Tyler Hicks (tyhicks) wrote :

A few minor comments on v2.

1) Typo in the commit message: "The host's mount path may container ..."

2) The function documentation for is_subdir() needs updating to reflect the new params.

3) The fstatat() in the while loop of open_without_symlink() is not needed. open_if_safe() passes the O_NOFOLLOW flag to openat(2) so symlinks are already being properly detected.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks - updated to address those, and also removed the '..' check as discussed since we verify that the target doesn't escape the container rootfs. Uploading new tarball.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Tyler Hicks (tyhicks) wrote :

Looks good to me!

Revision history for this message
Roman Fiedler (roman-fiedler-deactivatedaccount) wrote :

Also to me. If I understand correctly, code inside e.g. validate_symlink is only secure when applied to a non-running container (where any process completely under guest control is launched). Otherwise there is TOCTOU between readlink and openat.

As the code is inside src/lxc/utils.c should we place comments before each of those functions, so that no one else will find them convenient, does not know about this implicit requirements and use it for something else.

"CAVEAT: This function must not be used for other purposes than container setup before executing the container's init?"

Apart from that, "validate_symlink" using destbuf would work around that but could make the code more prone to endless recursions. (Was the patch already checked against symlink loops?)

Could the description of open_without_symlink be out of sync with code?
"Open a path intending for mounting, ensuring there are no symbolic links in the path anywhere after 'prefix_skip' (if set)."
After checking the prefix, it calls open_if_safe which will validate_symlink and use it, when symlink encountered?

I haven't checked the whole patch yet, would take hours, but perhaps someone could do two straces of container startup on patched machine (one normal and one with a problematic symlink) and I will try to do the same analysis as for the initial report according to methods from https://service.ait.ac.at/security/2015/LxcSecurityAnalysis.html

Revision history for this message
Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1476662] Re: lxc-start symlink vulnerabilities may allow guest to read host filesystem, interfere with apparmor

Quoting Roman Fiedler (<email address hidden>):
> Also to me. If I understand correctly, code inside e.g. validate_symlink
> is only secure when applied to a non-running container (where any

The code is only run against non-running containers, however the symlinks
can be changed if the container configuration (under the host admin's
control) has two mount entries, the first bind-mounting the attacker's
homedir into the container, and the second mounting to someplace under
the bind-mounted home.

So the TOCTTOU between readlink and openat is a problem. Sigh.

We could re-check the readlink after the openat, but the attacker could
presumably try to very quickly move the link back...

So using destbuf may be better. If the target is also a link, simply
returning EPERM at that point should be ok. We're not trying to support
every possible configuration. The real cases we need to support are
things like /proc/net.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

I'm REALLY feeling like just doing mount followed by a check of /proc/self/mountinfo would've been more robust.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Updated patch to GIT HEAD

(updated comments)

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

After irc discussion we realized the patch was insufficient.

This new patch makes some changes. It forbids all symlinks in the mount target. The case for which this broke lxc before has been taken care of by not mounting onto proc/net (which is a symlink to proc/self/net). It also forbids symlinks in relative bind mount sources. If needed we can later add a mount option saying to allow symlinks, but since we allow symlinks for absolute source paths this may not be needed.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxc - 1.1.2-0ubuntu3.2

---------------
lxc (1.1.2-0ubuntu3.2) vivid-security; urgency=medium

  * SECURITY UPDATE: Arbitrary host file access and AppArmor
    confinement breakout via lxc-start following symlinks while
    setting up mounts within a malicious container (LP: #1476662).
    - debian/patches/0010-CVE-2015-1335.patch: block mounts to paths
      containing symlinks and block bind mounts from relative paths
      containing symlinks. Patch from upstream.
    - CVE-2015-1335

 -- Steve Beattie <email address hidden> Tue, 22 Sep 2015 16:04:18 -0700

Changed in lxc (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxc - 1.0.7-0ubuntu0.5

---------------
lxc (1.0.7-0ubuntu0.5) trusty-security; urgency=medium

  * SECURITY UPDATE: Arbitrary host file access and AppArmor
    confinement breakout via lxc-start following symlinks while
    setting up mounts within a malicious container (LP: #1476662).
    - debian/patches/0003-CVE-2015-1335.patch: block mounts to paths
      containing symlinks and block bind mounts from relative paths
      containing symlinks. Patch from upstream.
    - CVE-2015-1335

 -- Steve Beattie <email address hidden> Tue, 22 Sep 2015 15:07:00 -0700

Changed in lxc (Ubuntu):
status: In Progress → Fix Released
Tyler Hicks (tyhicks)
information type: Private Security → Public Security
Revision history for this message
Andrey Kislyuk (weaver) wrote :

Is 12.04 Precise affected by this vulnerability?

Revision history for this message
Stéphane Graber (stgraber) wrote :

Chances are it is, but lxc in precise is in universe and on an unsupported upstream release, so we're not doing security updates there.
You can however use the upstream LXC PPA which will get you trusty's LXC on precise, including this security fix.

Revision history for this message
Daniel Kraft (daniel-kraft) wrote :

We're getting

lxc-start: utils.c: safe_mount: 1409 Mount of 'proc' onto '/usr/lib/x86_64-linux-gnu/lxc//proc' was onto a symlink!

on all containers since we upgraded to 1.0.7-0ubuntu0.5 and they don't start. No container has /proc as a symlink. Plese tell me what information is required from me.

We downgraded to 1.0.6-0ubuntu0.1 which works.

Revision history for this message
Daniel Kraft (daniel-kraft) wrote :

The problem lies in the ubuntu patch

http://archive.ubuntu.com/ubuntu/pool/main/l/lxc/lxc_1.0.7-0ubuntu0.5.debian.tar.gz

where this code

+ size_t start = croot ? strlen(croot) : 0;
+ if (strcmp(ws + start, target + start) != 0) {
+ ERROR("Mount onto %s resulted in %s\n", target, ws);
+ goto out;
+ }

in file 0003-CVE-2015-1335.patch checks if ws and start are the same.
According to the given error (which I forgot to paste above), ws and target ARE different:

lxc-start: utils.c: ensure_not_symlink: 1384 Mount onto /usr/lib/x86_64-linux-gnu/lxc//proc resulted in /usr/lib/x86_64-linux-gnu/lxc/proc

So target is

  /usr/lib/x86_64-linux-gnu/lxc//proc

and ws is

   /usr/lib/x86_64-linux-gnu/lxc/proc

Any hints how we could prevent the double slashing? Or would you just "clean up" the path somehow?

Revision history for this message
Roman Fiedler (roman-fiedler-deactivatedaccount) wrote :
Revision history for this message
Stephen Gaito (3-stephen) wrote :

I have a similar problem (but not with /proc).

***Roman Fielder's link (above) suggests the correct work around.***

My lxc configuration file has the following line:

> lxc.mount.entry = /data/references /var/lib/lxc/noteServer/rootfs/data/references none ro,bind 0 0

(Note that the mount directory is an **absolute** path)

My resulting error message (in /var/log/lxc/noteServer.log) is:

> lxc-start 1443599663.225 ERROR lxc_utils - utils.c:ensure_not_symlink:1384 - Mount onto /usr/lib/x86_64-linux-gnu/lxc//data/references resulted in /usr/lib/x86_64-linux-gnu/lxc/data/references

Tracing through the apt-get source lxc code I think the offending code (in the mount_entry_on_absolute_rootfs function in the lxc-1.0.7/src/lxc/conf.c file) is:

> aux = strstr(mntent->mnt_dir, path);
> if (aux) {
> offset = strlen(path);
> goto skipabs;
> }
>
>skipvarlib:
> aux = strstr(mntent->mnt_dir, rootfs->path);
> if (!aux) {
> WARN("ignoring mount point '%s'", mntent->mnt_dir);
> goto out;
> }
> offset = strlen(rootfs->path);
>
>skipabs:
>
> r = snprintf(path, MAXPATHLEN, "%s/%s", rootfs->mount,
> aux + offset);

Note that the last line should (probably -- I have not compiled any code to test this) be:

> r = snprintf(path, MAXPATHLEN, "%s/%s", rootfs->mount,
> aux + offset + 1);

The "+1" then skips over the "/" in the mntent->mnt_dir so there will only be *one* "/" in the resulting path.

Note that the work around in Roman Fiedler's link ensures that the mount entry uses the mount_entry_on_relative_rootfs function (which works) rather than the (currently broken?) mount_entry_on_absolute_rootfs function.

I can confirm that the following configuration line:

> lxc.mount.entry = /data/references data/references none ro,bind 0 0

now in fact works, since it specifies a **relative** mount directory and so invokes the mount_entry_on_relative_rootfs function.

Many thanks for excellent **open source** tools!

Revision history for this message
Daniel Kraft (daniel-kraft) wrote :

@roman-fiedler
We're using absolute mount targets here, so that might help. Will try this out.

Revision history for this message
Stephen Gaito (3-stephen) wrote :

Looking through the top Google results on how to bind-mount a directory from the host-server into the lxc-server I notice that:

* Stéphane Graber's "LXC 1.0: Advanced container usage [3/10]" post ( https://www.stgraber.org/2013/12/21/lxc-1-0-advanced-container-usage/ ) makes use of the **relative** mount point (in the lxc-server's fstab config file on the host-server)

* Unfortunately the **official**(?) Debian LXC wiki page on "LXC" has the topic "Bind mounts inside the container" ( https://wiki.debian.org/LXC#Bind_mounts_inside_the_container ) which uses the lxc.mount.entry line in the config file **and** makes use of an **absolute** mount point.

So those following the official Debian LXC documentation will be caught by this security patch. ;-(

Just to be definite: changing all lxc.mount.entry mount points to **relative** paths is a current workaround.

Revision history for this message
Matthias Lüscher (m-luescher) wrote :

@Stephen Gaito and @Roman Fiedler:

Thanks for your hints! Using **relative** paths definitely helps also within the container fstab file.
However it would be very helpful if lxc would accept the absolute paths again:

fstab that fails with 1.0.7-0ubuntu0.5:

/home/MYUSER/somemountpoint /var/lib/lxc/CONTAINERNAME/rootfs/home/MYUSER none defaults,bind 0 0

fstab that works with 1.0.7-0ubuntu0.5:

/home/MYUSER/somemountpoint home/MYUSER none defaults,bind 0 0

Revision history for this message
Roman Fiedler (roman-fiedler-deactivatedaccount) wrote :

I'm not so deep in LXC to know how the design/specification is done for that. Discussion of "features" might therefore suite the lxc-users mailing list better.

The other thing is, if Ubuntu would treat that part of new behaviour of LXC affecting some users as "regression" and hence might act on that. I do not know, who would be up to decide that. Perhaps someone from Ubuntu could comment on that.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Relative paths are definately recommended over absolute paths, but the breaking of absolute paths will be fixed.

Can anyone who's having trouble who is not on trusty please comment? I'd like to make sure that is the only problem.

Revision history for this message
Tyler Hicks (tyhicks) wrote :

Hello - Is anyone seeing this regression on a release other than 14.04 LTS (Trusty)?

Revision history for this message
Tyler Hicks (tyhicks) wrote :

The regression should be fixed with lxc 1.0.7-0ubuntu0.6. See http://www.ubuntu.com/usn/usn-2753-2/ for more details.

Revision history for this message
Daniel Kraft (daniel-kraft) wrote :

Regression fix fixes it on 14.04 LTS. Confirmed. Thanks!

Revision history for this message
Mike Gabriel (sunweaver) wrote :

Hi all,

today I worked on backporting available fixes for CVE-2015-1335 to LXC 0.7.x (as found in Debian squeeze-lts).

The patch is attached, I am still in the testing-for-regressions phase. Can any of the LXC devs take a look at the patch and maybe see if it is suitable for Ubuntu 12.04, as well?

Greets,
Mike (aka sunweaver at debian.org)

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Quoting Mike Gabriel (<email address hidden>):
> Hi all,
>
> today I worked on backporting available fixes for CVE-2015-1335 to LXC
> 0.7.x (as found in Debian squeeze-lts).
>
> The patch is attached, I am still in the testing-for-regressions phase.
> Can any of the LXC devs take a look at the patch and maybe see if it is
> suitable for Ubuntu 12.04, as well?

Hi,

So the thing to look for is any unconverted "mount" calls. It
looks like the lxc_setup_fs() calls to mount_fs() are not being
protected. So the contianer admin could attack through a /proc
symlink.

> Greets,
> Mike (aka sunweaver at debian.org)
>
> ** Patch added: "Backport fix for CVE-2015-1335 to LXC 0.7.x (Ubuntu 12.04 / Debian squeeze-lts)"
> https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1476662/+attachment/4529631/+files/CVE-2015-1335.patch
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> Matching subscriptions: lxc
> https://bugs.launchpad.net/bugs/1476662
>
> Title:
> lxc-start symlink vulnerabilities may allow guest to read host
> filesystem, interfere with apparmor
>
> Status in lxc package in Ubuntu:
> Fix Released
>
> Bug description:
> lxc-start shuffles around mounts using helper directory
> /usr/lib/x86_64-linux-gnu/lxc (guest root fs is mounted here)
>
> It then modifies mounts operating in guest root directory before
> invoking init. As it does not check if all mount points are
> directories, a malicious guest may modify its internal structure
> before shutdown (or was created using manipulated image) and then when
> started again, guest may
>
> * Access the whole host root filesystem
>
> * Block switching from lxc-start apparmor profile to lxc-container-
> default
>
>
> # Real putold before pivot-root (root fs will end here)
> mkdir -p /x/lxc_putold
>
> # Faked putold
> ln -s /usr/lib/x86_64-linux-gnu/lxc/x/lxc_putold lxc_putold
> mkdir -p /usr/lib/x86_64-linux-gnu/lxc/x/lxc_putold/proc
> touch /usr/lib/x86_64-linux-gnu/lxc/x/lxc_putold/proc/mounts
>
>
> # proc fake
> mkdir -p /x/proc
> umount /proc
> rmdir /proc
> ln -s /usr/lib/x86_64-linux-gnu/lxc/x/proc proc
>
> mkdir -p /usr/lib/x86_64-linux-gnu/lxc/x/proc/1/attr /usr/lib/x86_64-linux-gnu/lxc/x/proc/self
> touch /usr/lib/x86_64-linux-gnu/lxc/x/proc/1/attr/current
> touch /usr/lib/x86_64-linux-gnu/lxc/x/proc/self/status
>
>
> The issue was also found during
> https://service.ait.ac.at/security/2015/LxcSecurityAnalysis.html
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1476662/+subscriptions

Revision history for this message
Mike Gabriel (sunweaver) wrote :

Hi Serge,

sorry for getting back to this so late.

On Di 08 Dez 2015 17:08:58 CET, Serge Hallyn wrote:

> Quoting Mike Gabriel (<email address hidden>):

>> today I worked on backporting available fixes for CVE-2015-1335 to LXC
>> 0.7.x (as found in Debian squeeze-lts).
>>
>> The patch is attached, I am still in the testing-for-regressions phase.
>> Can any of the LXC devs take a look at the patch and maybe see if it is
>> suitable for Ubuntu 12.04, as well?
>
> Hi,
>
> So the thing to look for is any unconverted "mount" calls. It
> looks like the lxc_setup_fs() calls to mount_fs() are not being
> protected. So the contianer admin could attack through a /proc
> symlink.

Hmmm... ok...

I just checked upstream Git and the location you refer to is not using
safe_mount either there [1]

Furthermore, it seems non-trivial to inform safe_mount about the root
path from within lxc_init.c.

Do you have any input on the following questions?:

   o Why mount_fs() in latest HEAD still using the mount() call
instead of safe_mount()?
   o How could one pipe the rootfs path into lxc_setup_fs() -> mount_fs()?

Thanks for any input.

Mike

[1] https://github.com/lxc/lxc/blob/master/src/lxc/initutils.c#L35
--

DAS-NETZWERKTEAM
mike gabriel, herweg 7, 24357 fleckeby
fon: +49 (1520) 1976 148

GnuPG Key ID 0x25771B31
mail: <email address hidden>, http://das-netzwerkteam.de

freeBusy:
https://mail.das-netzwerkteam.de/mailxchange/kronolith/fb.php?u=m.gabriel%40das-netzwerkteam.de

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Download full text (3.6 KiB)

Quoting Mike Gabriel (<email address hidden>):
> Hi Serge,
>
> sorry for getting back to this so late.
>
> On Di 08 Dez 2015 17:08:58 CET, Serge Hallyn wrote:
>
> > Quoting Mike Gabriel (<email address hidden>):
>
> >> today I worked on backporting available fixes for CVE-2015-1335 to LXC
> >> 0.7.x (as found in Debian squeeze-lts).
> >>
> >> The patch is attached, I am still in the testing-for-regressions phase.
> >> Can any of the LXC devs take a look at the patch and maybe see if it is
> >> suitable for Ubuntu 12.04, as well?
> >
> > Hi,
> >
> > So the thing to look for is any unconverted "mount" calls. It
> > looks like the lxc_setup_fs() calls to mount_fs() are not being
> > protected. So the contianer admin could attack through a /proc
> > symlink.
>
> Hmmm... ok...
>
> I just checked upstream Git and the location you refer to is not using
> safe_mount either there [1]

Huh, that's odd. Yes those should be protected, since /proc etc in
the container could be symlinks. Do you mind sending a patch?

> Furthermore, it seems non-trivial to inform safe_mount about the root
> path from within lxc_init.c.
>
> Do you have any input on the following questions?:
>
> o Why mount_fs() in latest HEAD still using the mount() call
> instead of safe_mount()?
> o How could one pipe the rootfs path into lxc_setup_fs() -> mount_fs()?

You shouldn't need to - it's just '/' because you're already chrooted
there.

> Thanks for any input.
>
> Mike
>
> [1] https://github.com/lxc/lxc/blob/master/src/lxc/initutils.c#L35
> --
>
> DAS-NETZWERKTEAM
> mike gabriel, herweg 7, 24357 fleckeby
> fon: +49 (1520) 1976 148
>
> GnuPG Key ID 0x25771B31
> mail: <email address hidden>, http://das-netzwerkteam.de
>
> freeBusy:
> https://mail.das-netzwerkteam.de/mailxchange/kronolith/fb.php?u=m.gabriel%40das-netzwerkteam.de
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> Matching subscriptions: lxc
> https://bugs.launchpad.net/bugs/1476662
>
> Title:
> lxc-start symlink vulnerabilities may allow guest to read host
> filesystem, interfere with apparmor
>
> Status in lxc package in Ubuntu:
> Fix Released
>
> Bug description:
> lxc-start shuffles around mounts using helper directory
> /usr/lib/x86_64-linux-gnu/lxc (guest root fs is mounted here)
>
> It then modifies mounts operating in guest root directory before
> invoking init. As it does not check if all mount points are
> directories, a malicious guest may modify its internal structure
> before shutdown (or was created using manipulated image) and then when
> started again, guest may
>
> * Access the whole host root filesystem
>
> * Block switching from lxc-start apparmor profile to lxc-container-
> default
>
>
> # Real putold before pivot-root (root fs will end here)
> mkdir -p /x/lxc_putold
>
> # Faked putold
> ln -s /usr/lib/x86_64-linux-gnu/lxc/x/lxc_putold lxc_putold
> mkdir -p /usr/lib/x86_64-linux-gnu/lxc/x/lxc_putold/proc
> touch /usr/lib/x86_64-linux-gnu/lxc/x/lxc_putold/proc/mounts
>
>
> # proc fake
> mkdir -p /x/proc
> umount /proc
> rmdir /proc
> ...

Read more...

Revision history for this message
Mike Gabriel (sunweaver) wrote :

Hi Serge,

On Mo 04 Jan 2016 21:26:05 CET, Serge Hallyn wrote:

> Quoting Mike Gabriel (<email address hidden>):
>> Hi Serge,
>>
>> sorry for getting back to this so late.
>>
>> On Di 08 Dez 2015 17:08:58 CET, Serge Hallyn wrote:
>>
>> > Quoting Mike Gabriel (<email address hidden>):
>>
>> >> today I worked on backporting available fixes for CVE-2015-1335 to LXC
>> >> 0.7.x (as found in Debian squeeze-lts).
>> >>
>> >> The patch is attached, I am still in the testing-for-regressions phase.
>> >> Can any of the LXC devs take a look at the patch and maybe see if it is
>> >> suitable for Ubuntu 12.04, as well?
>> >
>> > Hi,
>> >
>> > So the thing to look for is any unconverted "mount" calls. It
>> > looks like the lxc_setup_fs() calls to mount_fs() are not being
>> > protected. So the contianer admin could attack through a /proc
>> > symlink.
>>
>> Hmmm... ok...
>>
>> I just checked upstream Git and the location you refer to is not using
>> safe_mount either there [1]
>
> Huh, that's odd. Yes those should be protected, since /proc etc in
> the container could be symlinks. Do you mind sending a patch?

I will work on the squeeze-lts / precise patch first and test that. If
that works well, I will forward-port the change to current HEAD.

>> Furthermore, it seems non-trivial to inform safe_mount about the root
>> path from within lxc_init.c.
>>
>> Do you have any input on the following questions?:
>>
>> o Why mount_fs() in latest HEAD still using the mount() call
>> instead of safe_mount()?
>> o How could one pipe the rootfs path into lxc_setup_fs() -> mount_fs()?
>
> You shouldn't need to - it's just '/' because you're already chrooted
> there.
>

Ok. That will make it very easy.

I get back to you with results within the month.

Mike
--

DAS-NETZWERKTEAM
mike gabriel, herweg 7, 24357 fleckeby
fon: +49 (1520) 1976 148

GnuPG Key ID 0x25771B31
mail: <email address hidden>, http://das-netzwerkteam.de

freeBusy:
https://mail.das-netzwerkteam.de/mailxchange/kronolith/fb.php?u=m.gabriel%40das-netzwerkteam.de

To post a comment you must log in.