Comment 23 for bug 1639345

Christian Brauner (cbrauner) wrote :

I was too optimistic about the userspace fix. On its own it might minimize the attack surface, but unfortunately we still seem to need the kernel fix. The child we attach in lxc-attach wants to change its LSM label right before exec(), and for that it needs an fd to /proc/self/attr/current, so we always seem to have such an fd around. What we can do, instead of passing around an fd to /proc itself, is open an fd to /proc/self/attr/current in the parent and send it to the child. This might minimize the attack surface, but we still need the kernel fix. I'm posting an updated version of the patch I sent before, and I'll keep thinking about how we could avoid passing any procfd around at all, but I doubt it is possible. The more complex solution I outlined above, involving a second lxc_clone() that serves as a simple chrooting process to place us into an isolated set of namespaces, would be an additional attack surface minimizer. @Stéphane, do you think it would be worth adding another process that chroots/minimally namespaces us before attaching to the child's namespaces?

Here's the outline of the current patch:
    So far, we opened a file descriptor referring to proc on the host inside the
    host namespace and handed that fd to the attached process in
    attach_child_main(). This was done to ensure that LSM labels were correctly
    set up. However, by exploiting a potential kernel bug, ptrace could be used
    to prevent the file descriptor from being closed, which in turn could be used
    by an unprivileged container to gain access to the host namespace. Aside from
    this needing an upstream kernel fix, we should make sure that we don't pass
    the fd for proc itself to the attached process. However, we cannot avoid
    handing over a procfs fd entirely, as the attached process needs to be able
    to change its AppArmor profile by writing to /proc/self/attr/exec or
    /proc/self/attr/current. To minimize the attack surface, we only send the fd
    for /proc/self/attr/exec or /proc/self/attr/current to the attached process.
    To do this we introduce a little more IPC between the child and parent:

             * IPC mechanism: (X is receiver)
             *   initial process        intermediate          attached
             *        X           <---  send pid of
             *                          attached proc,
             *                          then exit
             *    send 0 ------------------------------------>    X
             *                                              [do initialization]
             *        X  <------------------------------------  send 1
             *   [add to cgroup, ...]
             *    send 2 ------------------------------------>    X
             *                                              [set LXC_ATTACH_NO_NEW_PRIVS]
             *        X  <------------------------------------  send 3
             *   [open LSM label fd]
             *    send 4 ------------------------------------>    X
             *                                              [set LSM label]
             *   close socket                                 close socket
             *                                                run program

    The attached child tells the parent when it is ready to have its LSM labels
    set up. The parent then opens an fd to /proc/<pid>/attr/exec or
    /proc/<pid>/attr/current for the child's PID and sends it via SCM_RIGHTS to
    the child. The child can then set its LSM label. Both sides then close the
    socket fds and the child execs the requested process.
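
For readers unfamiliar with SCM_RIGHTS, here is a minimal sketch of the last two steps of the exchange (child sends 3, parent answers with 4 plus an fd). It is not the code from the patch: send_fd(), recv_fd(), union fd_cmsg and demo_label_fd_handoff() are hypothetical names made up for illustration, and the demo opens an arbitrary path read-only, where the real parent would open /proc/<pid>/attr/exec or /proc/<pid>/attr/current with write access.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Control-message buffer, aligned the way the cmsg(3) man page suggests. */
union fd_cmsg {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send a status int together with `fd` (SCM_RIGHTS). */
int send_fd(int sock, int fd, int status)
{
    union fd_cmsg u;
    struct iovec iov = { .iov_base = &status, .iov_len = sizeof(status) };
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == (ssize_t)sizeof(status) ? 0 : -1;
}

/* Hypothetical helper: receive a status int and an fd; returns fd or -1. */
int recv_fd(int sock, int *status)
{
    union fd_cmsg u;
    struct iovec iov = { .iov_base = status, .iov_len = sizeof(*status) };
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(*status))
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS || cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}

/* Demo: child reports readiness (3), parent opens `path` and replies with
 * status 4 plus the fd. Returns 0 if the child ended up with a valid fd. */
int demo_label_fd_handoff(const char *path)
{
    int sv[2], wstatus, got = -1, fd, ret = -1;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) { /* stands in for the attached process */
        int ready = 3, status = -1;

        close(sv[0]);
        if (write(sv[1], &ready, sizeof(ready)) != (ssize_t)sizeof(ready))
            _exit(1);
        fd = recv_fd(sv[1], &status);
        /* fcntl(F_GETFD) only succeeds on an open descriptor. */
        _exit(fd >= 0 && status == 4 && fcntl(fd, F_GETFD) != -1 ? 0 : 1);
    }

    close(sv[1]); /* stands in for the initial process */
    if (read(sv[0], &got, sizeof(got)) == (ssize_t)sizeof(got) && got == 3) {
        fd = open(path, O_RDONLY);
        if (fd >= 0) {
            ret = send_fd(sv[0], fd, 4);
            close(fd); /* the child holds its own copy now */
        }
    }
    close(sv[0]);
    if (waitpid(pid, &wstatus, 0) != pid)
        return -1;
    return ret == 0 && WIFEXITED(wstatus) && WEXITSTATUS(wstatus) == 0 ? 0 : -1;
}
```

Calling demo_label_fd_handoff("/dev/null") exercises the whole round trip without privileges. The point of the design is that the receiver only ever holds a descriptor for the one file the sender chose to open, rather than a descriptor for a directory such as /proc from which other paths could be reached.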