make the "--close-all-fds" option in lxc-start on by default

Bug #1003583 reported by Andrey Kislyuk
This bug affects 1 person
Affects Status Importance Assigned to Milestone
lxc (Ubuntu)
Fix Released
Fix Released
Stéphane Graber

Bug Description

SRU Justification
Impact: callers of lxc-start can have confusing failures to start
Development fix: always close all fds
Stable fix: same as development fix
Test case:
  1. lxc-create -t ubuntu -n p1
  2. cat > execme.c << EOF
#include <stdio.h>
#include <unistd.h>

int main()
        FILE *f = fopen("/tmp/ab", "w");
        //int ret = execl("/usr/bin/lxc-start", "/usr/bin/lxc-start", "-d", "-n", "p1", NULL);
        int ret = execl("/usr/bin/lxc-start", "/usr/bin/lxc-start", "-n", "p1", NULL);
        printf("should not be here (ret %d)\n", ret);
   3. make execme
   4. sudo ./execme
   Without this patch, the container will fail to start with an error message
   saying an fd was inherited. With the patch, the container will start (and
   the fd will have been closed)
Regression potential: if anyone was counting on lxc-start to fail when started
with open fds, that will no longer happen.
Since the lxc-start command quits with an error if any inherited FDs are not disconnected, why is it necessary to specify --close-all-fds at all?

(I just spent an hour debugging a situation where some FDs were falling through from parent processes only in the remote deployment configuration...)

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

In quantal the default is now to not fail on inherited fds.

In precise, the default is still to fail. Since it sends a warning debug
message, I'm not sure that changing this behavior in an SRU is warranted.
Not that I can think of any case where changing the behavior would cause
a regression.

Changed in lxc (Ubuntu):
importance: Undecided → Low
status: New → Fix Released
Changed in lxc (Ubuntu Precise):
importance: Undecided → Low
status: New → Confirmed
Revision history for this message
Andrey Kislyuk (weaver) wrote :

My issue is that debugging is made harder - errors can be highly non-obvious because the default shell behavior is to pass the FDs through, and they could have been inserted many processes upstream depending on where you're calling the program from.

If you decide to release a fix for this in Precise, now would be the time while it's fresh and adoption is low.

Revision history for this message
Andrey Kislyuk (weaver) wrote :

Also, "inherited fd from parent" followed by a crash was a bit hard to follow for me. Maybe add on something like "Inheritance of FDs is not allowed. Use --close-all-fds to close them."

Revision history for this message
Andrey Kislyuk (weaver) wrote :

(By the way, thanks for your great work on the LXC infrastructure in Ubuntu! I'm especially impressed by the App Armor stuff.)

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Good point, the error message needs to be clearer about the fact that it caused the container to fail to start.

And, so long as we're doing that, maybe we may just as well default to closing the fds.

description: updated
Revision history for this message
Stéphane Graber (stgraber) wrote :

Just thought it'd be worth mentioning that this doesn't affect Quantal as upstream changed behaviour in recent versions, the new upstream behaviour is still wrong but won't cause this bug (proposed improvement was sent to the mailing list).

Changed in lxc (Ubuntu Precise):
status: Confirmed → Fix Committed
assignee: nobody → Stéphane Graber (stgraber)
Revision history for this message
Clint Byrum (clint-fewbar) wrote : Please test proposed package

Hello Andrey, or anyone else affected,

Accepted lxc into precise-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See for documentation how to enable and use -proposed. Thank you in advance!

tags: added: verification-needed
Revision history for this message
Stéphane Graber (stgraber) wrote :

fix confirmed

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxc - 0.7.5-3ubuntu58

lxc (0.7.5-3ubuntu58) precise-proposed; urgency=low

  * Fix broken logic in lxc-ubuntu template where lxc.devttydir would be
    set to 'lxc' only for releases that don't support it. (LP: #1007493)

lxc (0.7.5-3ubuntu57) precise-proposed; urgency=low

  [ Serge Hallyn ]
  * 0083-always-close-all-fds.patch: Have lxc-start always run with
    --close-all-fds. There is no advantage to having lxc-start fail with
    inherited fds. (LP: #1003583)
  * debian/lxc-net.upstart: don't put '()' after call to cleanup.
    (LP: #1000174)

  [ Stéphane Graber ]
  * Sync lxc-ubuntu with the one in Quantal:
    - Bugfixes:
      + Update list of extra packages for debootstrap to only include vim
        and ssh. The others were only relevant when we were still using the
        minbase variant. (LP: #996839)
      + Update default /etc/hosts to match that of a regular Ubuntu system.
        (adds missing ipv6 aliases) (LP: #1004108)
      + Make sure /etc/resolv.conf is valid before running any apt command.
        Fixes a potential race condition (no report of it at this time).

    - Improvements we get by pulling the whole patch from Quantal.
      These don't contain any user behaviour change but will make
      cherry-picking any further change much easier.
      + Drop any hardcoded Ubuntu version check and replace by feature
        checks instead. This removes the need for SRUs whenever we release
        a new Ubuntu.
      + Format lxc-ubuntu to consistently use 4-spaces indent instead
        of mixed spaces/tabs.
      + Update default /etc/network/interfaces to include the header.
      + Drop support for never supported releases (gutsy on sparc).
      + Update template help message for release and arch parameters.
        Old string was only listing i386 and amd64, which is no longer
        accurate (as of 12.04).
        (This string isn't translated)
      + Switch default Ubuntu version from lucid to precise for systems
        that don't have lsb_release (won't affect Ubuntu)

  * Sync lxc-start-ephemeral with the one in Quantal:
    - Switch lxc-start-ephemeral from unreliable parsing of DHCP lease files
      to using "ip netns" to retrieve the IP from the container's network
      namespace. (LP: #994752)
    - Fix a race in lxc-start-ephemeral where the container isn't yet
      running when trying to get its IPs.
    - Update a few calls so that lxc-start-ephemeral can be called as a
      user (ensure consistent usage of sudo across the script). (LP: #1004069)
 -- Stephane Graber <email address hidden> Fri, 01 Jun 2012 11:46:50 -0400

Changed in lxc (Ubuntu Precise):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.