Unable to start lxc container after update to 2.6.32-32

Bug #790863 reported by HeinMueck on 2011-05-31
This bug affects 28 people
Affects            Status  Importance  Assigned to       Milestone
linux (Ubuntu)             High        Unassigned
  Lucid                    High        Tim Gardner
  Precise                  High        Unassigned
vsftpd (Ubuntu)            High        Unassigned
  Lucid                    High        Jamie Strandboge
  Precise                  High        Unassigned

Bug Description

Starting a container with 2.6.32-32 leads to the errors below, container will not start:

lxc-start 1306854843.605 ERROR lxc_namespace - failed to clone(0x6c020000): Invalid argument
lxc-start 1306854843.605 ERROR lxc_start - Bad file descriptor - failed to fork into a new namespace

When booting 2.6.32-31 again the container works fine.
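
The flags word in the first error line can be decoded to see which namespaces lxc-start asked for. A minimal sketch, assuming the namespace flag values from <linux/sched.h>; decode_clone_flags is a hypothetical helper name, not part of lxc:

```shell
# Decode the namespace bits of a clone(2) flags word.
# Flag values are the ones defined in <linux/sched.h>.
# decode_clone_flags is a hypothetical helper, not an lxc tool.
decode_clone_flags() {
    flags=$(($1))
    names=""
    for entry in \
        CLONE_NEWNS:0x00020000 \
        CLONE_NEWUTS:0x04000000 \
        CLONE_NEWIPC:0x08000000 \
        CLONE_NEWPID:0x20000000 \
        CLONE_NEWNET:0x40000000
    do
        name=${entry%%:*}
        bit=${entry##*:}
        if [ $((flags & $bit)) -ne 0 ]; then
            names="${names:+$names|}$name"
            flags=$((flags & ~$bit))   # clear the bit we just reported
        fi
    done
    # Report any bits that are not in the table above as raw hex.
    if [ "$flags" -ne 0 ]; then
        names="${names:+$names|}$(printf '0x%x' "$flags")"
    fi
    printf '%s\n' "$names"
}

decode_clone_flags 0x6c020000
# prints: CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID|CLONE_NEWNET
```

The request includes CLONE_NEWNET, which is consistent with clone() failing with "Invalid argument" on a kernel built without CONFIG_NET_NS.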

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-2.6.32-31-generic-pae 2.6.32-31.61
Regression: Yes
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-31.61-generic-pae 2.6.32.32+drm33.14
Uname: Linux 2.6.32-31-generic-pae i686
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386

dmi.sys.vendor: LENOVO

HeinMueck (cperz) wrote :
Serge Hallyn (serge-hallyn) wrote :

Thanks very much for reporting this bug.

This has unfortunately been caused by the resolution to bug 788602. I'm not sure how feasible it is to do a more complete solution which fixes bug 788602 with CONFIG_NET_NS enabled, but it seems important to me to find one.

If no one else wants to tackle that, please assign this bug to me and I'll take a look.
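
Whether a given kernel was built with network namespace support can be checked from its config file. A minimal sketch, assuming the Ubuntu convention of installing the build config as /boot/config-<release>; net_ns_enabled is a hypothetical helper name:

```shell
# Report whether a kernel config file has network namespaces enabled.
# Example: net_ns_enabled /boot/config-2.6.32-32-generic-pae
# net_ns_enabled is a hypothetical helper name.
net_ns_enabled() {
    config_file=$1
    if grep -q '^CONFIG_NET_NS=y$' "$config_file"; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# Check the running kernel, if its config file is present and readable.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    net_ns_enabled "$cfg"
fi
```

A result of "disabled" on the affected kernel would match the clone() failure reported above.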

Changed in linux (Ubuntu):
status: New → Confirmed
importance: Undecided → Critical
Ulli Horlacher (framstag) wrote :

I have the same problem. A workaround for me was:

apt-get install linux-image-server-lts-backport-maverick

Nevertheless, the default Linux kernel should not break other packages on an LTS system!

My server is a virtualization host for many LXC VMs, and after an "aptitude upgrade" I was no longer able to boot them. This is a VERY bad situation in a production environment!

On LTS the maintainers should not change the kernel ABI! It's ok for a testing, unstable, experimental, etc version, but NOT FOR LTS!

And yes, I am angry!

Jeremy Yoder (jyoder) wrote :

Seriously guys... What the heck were you thinking? The whole reason people choose to run an LTS is to avoid this kind of thing.

I can downgrade for now, but please fix soon!

Rush Tonop Online (rush-online) wrote :

Holy shit! How can you (Canonical) do this with LTS?!?!? What's a crap?

@Ulli Horlacher thank you very much!

Jeremy Yoder (jyoder) wrote :

Reading through the discussion on the kernel list, I think the flawed logic in this decision was: "Let's avoid a POTENTIAL regression due to lots of lines of code changed by introducing a DEFINITE regression". I think we've all been there. Still, not a great call.

Brian Parma (bj0) wrote :

I still can't lxc-start in 2.6.32-33

Alex Bligh (ubuntu-alex-org) wrote :

Using 2.6.38 backport is NOT a workaround. See #843892.

The only way to fix this is reverting #720095 and fixing that in userspace (i.e. in vsftpd).

Kai Blin (kai-blin-biotech) wrote :

Also, the 2.6.35 backport is broken at the moment, see bug #847828.

HeinMueck (cperz) on 2011-10-01
description: updated
HeinMueck (cperz) wrote :

Is there a way to find out if the existence of this bug report leads to anything? Has it been noticed, considered, discussed, rejected? Anything? If it's not being done, someone should close this bug and explain it.

Alex Bligh (ubuntu-alex-org) wrote :

Whilst I find the substitution of a possible regression for a definite regression a bit of a bizarre choice on Canonical's part (especially as it's entirely possible to disable the possible regression with a compile flag to the package in question), a workaround that should work is to install the new Oneiric kernel now that bug #843892 is fixed. Of course that doesn't give you LTS support for your kernel, and I don't know of a way that you can even get security updates automatically. However, it will let you start lxc.

I realise that this workaround is effectively "so don't use Ubuntu 10.04 LTS kernel then", but it might help.

Jeremy Yoder (jyoder) wrote :

The Maverick and Natty kernels also "work" for some people. However, depending on the rest of your configuration, installing backported kernels can break other things. Definitely not the ideal answer. The fact that the Maverick backport was broken for a month is an example.

I'd complain about how long this fix has taken, but you can pay a lot for an operating system and get worse support, so I won't :) Instead I'll say thanks to everyone at Canonical for all the stuff that DOES work, and please try to fix this properly soon, it's been 4 months.

We were looking into using Ubuntu 10.04 + LXC for our SAAS services, but because of this bug are now looking at Debian Squeeze. Come on Canonical, some statement from a Ubuntu dev would go a long way here.

tags: added: kernel-key
Serge Hallyn (serge-hallyn) wrote :

@Joseph,

re-enabling CONFIG_NET_NS has been rejected. As I understand it, because backport kernels are supported on LTS releases, that is seen as the right path for those requiring network namespaces.

Serge Hallyn wrote:
> @Joseph,
>
> re-enabling CONFIG_NET_NS has been rejected. As I understand it,
> because backport kernels are supported on LTS releases, that is seen as
> the right path for those requiring network namespaces.

At least one other person on this ticket asserted that the backport
kernels are too unreliable for their taste. Bringing in a whole new
kernel to reenable one option sounds like lunacy to me.

I am not enthusiastic about this solution -- I would rather reroll the
LTS 2.6.32 kernels (as they appear on lucid-security) with a trivial
in-house patch that reenables CONFIG_NET_NS, than to run a completely new
kernel from backports.

And indeed, as soon as I see a kernel USN that worries me
sufficiently, that is exactly what I will do. Until then I have
simply pinned the kernel at 2.6.32-31.

Jeremy Yoder (jyoder) wrote :

I've been pinned at 2.6.32-31 since this began. I experimented with the various backports but each caused more problems than it solved.

Bad answer Canonical. Sigh. At least 12.04 is only 6 months away, plus 3 months to stabilize.

By the way, where exactly was this decision documented? I'd like to read the reasoning.

Stefan Bader (smb) wrote :

The reasoning behind that can be found in bug #720095. Basically, vsftpd was found to be one user of NET_NS, and the way network namespaces work in 2.6.32, they can be created quickly but take an awfully long time to tear down. So a rapid sequence of cloning a process with a new network namespace and then ending it can be used to make a system run out of memory.

The behaviour is much better in 2.6.35, but the code changed massively in between, so bringing that back to .32 would mean porting most of the network changes. We cannot do this in a stable release without risking regressions. So the choice is either to leave a potential OOM vector open or to disable the support. The LTS-backports kernels are supposed to close the gap between needing new kernel functionality and staying on the LTS release. What exactly were the "more problems" that were encountered, and in which kernel?

Henrik Holmboe (holmboe) wrote :

Stefan,

I'd like to use your argumentation, but in reverse.

Disabling NET_NS is what is _causing_ a regression. Quick setup with slow tear-down is the _current_ behaviour of the LTS release. If users of this LTS release want different behaviour with vsftpd, then _they_ should seek out the backported kernels for new kernel functionality.

Thanks,
Henrik

Kai Blin (kai-blin-biotech) wrote :

Earlier this year the LTS-backport kernels didn't have a valid header package, thus breaking dkms-based kernel modules; see bug #824080. It took over a month to fix that, which does not give much of an impression that LTS-backport kernels are really supported.

Jeremy Yoder (jyoder) wrote :

Stefan,

I have to agree with Henrik regarding the logic seeming backwards, but I suppose that's because the "solution" hurt me.

Did Serge Hallyn ever take a look at it? I got the impression someone was going to try to backport part of the fix, but it sounds like that path was rejected.

Regarding the Maverick and Natty backport kernels, I had the problem Kai had (the Maverick backport was broken and it took a MONTH to fix, which is not what I consider "fully supported") along with issues with drivers for my video card and LCD display. I haven't looked for or tried an Oneiric backport kernel yet, but at this point I'm definitely leery.

The main source of frustration comes from the sense that Canonical deliberately introduced a regression (disabling a kernel feature that had previously been enabled) to avoid a potential regression (backporting the fix) and the workaround of using a backport kernel would have been fine except that it wasn't.

Does vsftpd retry without CLONE_NEWNET if the attempt with CLONE_NEWNET failed?
If yes, reenabling CONFIG_NET_NS with this (untested) patch might help.
Even if not, reenabling CONFIG_NET_NS with this patch and an adequate
initial value (e.g. 64) assigned to max_netns_count might help.

tags: added: patch

I posted this topic to netdev ML.

http://www.spinics.net/lists/netdev/msg180263.html

According to Eric W. Biederman, Debian has tweaked
vsftpd to not use network namespaces on 2.6.32.

By combining tweaking vsftpd and this patch, I think
we might be able to reenable CONFIG_NET_NS.
What do you think?

Jamie Strandboge (jdstrand) wrote :

vsftpd needs this patch from Debian's 10-remote-dos.patch 2.3.4-1. vsftpd in precise already has this.

Changed in linux (Ubuntu Lucid):
status: New → Confirmed
importance: Undecided → Critical
Changed in linux (Ubuntu Precise):
status: Confirmed → Fix Released
Changed in linux (Ubuntu Lucid):
importance: Critical → High
Changed in vsftpd (Ubuntu Lucid):
status: New → Triaged
importance: Undecided → High
Changed in linux (Ubuntu Precise):
importance: Critical → High
Changed in vsftpd (Ubuntu Precise):
status: New → Fix Released
importance: Undecided → High
Changed in vsftpd (Ubuntu Lucid):
assignee: nobody → Jamie Strandboge (jdstrand)
Changed in vsftpd (Ubuntu Lucid):
status: Triaged → Fix Committed
Tim Gardner (timg-tpi) wrote :

Tetsuo - I'll forward your patch for review on the Ubuntu kernel team mailing list. If accepted, then I'm OK with restoring CONFIG_NET_NS. The most common OOM cause (vsftpd) will have been ameliorated with the application of the patch for CVE-2011-2189, and your patch should prohibit fatal OOM conditions.

Tim Gardner (timg-tpi) on 2011-12-01
Changed in linux (Ubuntu Lucid):
assignee: nobody → Tim Gardner (timg-tpi)
status: Confirmed → In Progress
Tim Gardner (timg-tpi) wrote :

Test kernels at http://kernel.ubuntu.com/~rtg/790863.1

Corresponding sources at git://kernel.ubuntu.com/rtg/ubuntu-lucid.git config-ns-790863, tag 790863.1

Tim Gardner (timg-tpi) wrote :

Don't forget to do something like:

echo 4096 | sudo tee /proc/sys/net/core/netns_max

before attempting to fork with CLONE_NEWNET.

Vladimir Osintsev (osintsev) wrote :

Must this parameter be set before the LXC containers start? Can it be set via sysctl.conf?
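
Entries under /proc/sys normally map to sysctl keys by dropping the /proc/sys/ prefix and replacing slashes with dots. A minimal sketch of that mapping, untested against the test kernel and assuming netns_max is registered as an ordinary sysctl, as its /proc/sys path suggests; both helper names are hypothetical:

```shell
# Convert a /proc/sys path into the dotted key used by sysctl(8)
# and /etc/sysctl.conf. Helper names here are hypothetical.
proc_path_to_sysctl_key() {
    printf '%s\n' "${1#/proc/sys/}" | tr '/' '.'
}

# Build a sysctl.conf line for a /proc/sys path and a value.
sysctl_conf_line() {
    printf '%s = %s\n' "$(proc_path_to_sysctl_key "$1")" "$2"
}

sysctl_conf_line /proc/sys/net/core/netns_max 4096
# prints: net.core.netns_max = 4096
```

Under that assumption, adding the printed line to /etc/sysctl.conf would apply the value at boot, provided the containers are started after sysctl settings are loaded.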

Jamie Strandboge (jdstrand) wrote :

vsftpd (2.2.2-3ubuntu6.3) lucid-security; urgency=low

  * SECURITY UPDATE: remote DoS via network namespaces
    - debian/patches/12-CVE-2011-2189.patch: only use network namespaces
      on 2.6.36 and higher kernels
    - patch based on Debian's patch
    - CVE-2011-2189
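
The version gate described in this changelog entry can be illustrated in shell. This is only a sketch of the idea, not the vsftpd patch itself (which is C code inside vsftpd); version_code and kernel_at_least are hypothetical helper names:

```shell
# Turn a kernel release string into a comparable integer.
# Only the leading major.minor.patch numbers are used, so
# "2.6.32-38-generic" is treated as 2.6.32.
version_code() {
    v=${1%%[!0-9.]*}          # strip suffixes such as "-38-generic"
    old_ifs=$IFS
    IFS=.
    set -- $v                 # split on dots into $1 $2 $3
    IFS=$old_ifs
    echo $(( ${1:-0} * 1000000 + ${2:-0} * 1000 + ${3:-0} ))
}

# Succeed (exit status 0) if release $1 is at least version $2.
kernel_at_least() {
    [ "$(version_code "$1")" -ge "$(version_code "$2")" ]
}

if kernel_at_least "$(uname -r)" 2.6.36; then
    echo "network namespaces would be used"
else
    echo "network namespaces would be skipped"
fi
```

On a 2.6.32 Lucid kernel this takes the "skipped" branch, which is the behaviour the changelog describes for the patched vsftpd.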

Changed in vsftpd (Ubuntu Lucid):
status: Fix Committed → Fix Released
Tim Gardner (timg-tpi) wrote :

Where have all my testers gone? This patch isn't going in without some feedback.

Kai Blin (kai-blin-biotech) wrote :

Ok, my containers do start up again and seem to behave ok. I'm happy to test this for a more extended period of time.

SaveTheRbtz (savetherbtz) wrote :

>> Where have all my testers gone? This patch isn't going in without some feedback.
Hmm.. Half a year... To Debian I guess....

Jeremy Yoder (jyoder) wrote :

Tim,

Thanks for getting a test version out! I've been swamped the last few weeks but I can test it out this weekend and post my results. I also admit I thought this was a vsftpd patch (from Jamie's note) when I skimmed it the first time, not a new kernel. Oops :)

Jeremy Yoder (jyoder) wrote :

So far so good. My container is running fine along with everything else. I'll keep running this build and report if I see any issues, but I think it's good. Let me know if there's anything else I need to test.

Tim Gardner (timg-tpi) wrote :

The final patch has a non-zero initial value for max_netns_count, plus it prints a warning once when the number of allocated network namespaces exceeds max_netns_count.

Changed in linux (Ubuntu Lucid):
status: In Progress → Fix Committed
damiens (damiens-robert) on 2012-01-05
Changed in linux (Ubuntu Lucid):
status: Fix Committed → Fix Released
damiens (damiens-robert) wrote :

I changed the status from Fix Committed to Fix Released by mistake... Sorry!

Herton R. Krzesinski (herton) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem (2.6.32-38.83). Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-lucid' to 'verification-done-lucid'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

Changed in linux (Ubuntu Lucid):
status: Fix Released → Fix Committed
tags: added: verification-needed-lucid
tags: removed: kernel-key needs-upstream-testing
Brad Figg (brad-figg) wrote :

We need to get this kernel tested so that it can be included on the next Lucid point release DVD. Please test if you can.

Jeremy Yoder (jyoder) wrote :

Verified. Tag updated to verification-done-lucid

tags: added: verification-done-lucid
removed: verification-needed-lucid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.32-38.83

---------------
linux (2.6.32-38.83) lucid-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #911405

  [ Upstream Kernel Changes ]

  * Revert "clockevents: Set noop handler in clockevents_exchange_device()"
    - LP: #911392
  * Linux 2.6.32.52
    - LP: #911392

linux (2.6.32-38.82) lucid-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #910906

  [ Tetsuo Handa ]

  * SAUCE: netns: Add quota for number of NET_NS instances.

  [ Tim Gardner ]

  * [Config] CONFIG_NET_NS=y
    - LP: #790863

  [ Upstream Kernel Changes ]

  * Revert "core: Fix memory leak/corruption on VLAN GRO_DROP,
    CVE-2011-1576"
  * hfs: fix hfs_find_init() sb->ext_tree NULL ptr oops, CVE-2011-2203
    - LP: #899466
    - CVE-2011-2203
  * net: ipv4: relax AF_INET check in bind()
    - LP: #900396
  * KEYS: Fix a NULL pointer deref in the user-defined key type,
    CVE-2011-4110
    - LP: #894369
    - CVE-2011-4110
  * i2c-algo-bit: Generate correct i2c address sequence for 10-bit target
    - LP: #902317
  * eCryptfs: Extend array bounds for all filename chars
    - LP: #902317
  * PCI hotplug: shpchp: don't blindly claim non-AMD 0x7450 device IDs
    - LP: #902317
  * ARM: 7161/1: errata: no automatic store buffer drain
    - LP: #902317
  * ALSA: lx6464es - fix device communication via command bus
    - LP: #902317
  * SUNRPC: Ensure we return EAGAIN in xs_nospace if congestion is cleared
    - LP: #902317
  * timekeeping: add arch_offset hook to ktime_get functions
    - LP: #902317
  * p54spi: Add missing spin_lock_init
    - LP: #902317
  * p54spi: Fix workqueue deadlock
    - LP: #902317
  * nl80211: fix MAC address validation
    - LP: #902317
  * gro: reset vlan_tci on reuse
    - LP: #902317
  * staging: usbip: bugfix for deadlock
    - LP: #902317
  * staging: comedi: fix oops for USB DAQ devices.
    - LP: #902317
  * Staging: comedi: fix signal handling in read and write
    - LP: #902317
  * USB: whci-hcd: fix endian conversion in qset_clear()
    - LP: #902317
  * usb: ftdi_sio: add PID for Propox ISPcable III
    - LP: #902317
  * usb: option: add SIMCom SIM5218
    - LP: #902317
  * USB: usb-storage: unusual_devs entry for Kingston DT 101 G2
    - LP: #902317
  * SCSI: scsi_lib: fix potential NULL dereference
    - LP: #902317
  * SCSI: Silencing 'killing requests for dead queue'
    - LP: #902317
  * cifs: fix cifs stable patch cifs-fix-oplock-break-handling-try-2.patch
    - LP: #902317
  * sched, x86: Avoid unnecessary overflow in sched_clock
    - LP: #902317
  * x86/mpparse: Account for bus types other than ISA and PCI
    - LP: #902317
  * oprofile, x86: Fix crash when unloading module (nmi timer mode)
    - LP: #902317
  * genirq: Fix race condition when stopping the irq thread
    - LP: #902317
  * tick-broadcast: Stop active broadcast device when replacing it
    - LP: #902317
  * clockevents: Set noop handler in clockevents_exchange_device()
    - LP: #902317
  * Linux 2.6.32.50
    - LP: #902317
  * nfsd4: permit read opens of executable-only files
    - LP: #833300
  * ipv6: Allow inet6_dump_addr() to handle more t...


Changed in linux (Ubuntu Lucid):
status: Fix Committed → Fix Released