Regression in overlayfs

Bug #1490267 reported by Thomas Müller
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Andy Whitcroft

Bug Description

After installing the linux-lts-vivid kernel on Ubuntu 14.04, overlayfs regresses when used with unprivileged containers; the kernel logs 'failed to whiteout' errors.

System information:

- Kernel error: http://pastebin.com/iajXHbHZ
- Kernel version: 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
- Executed LXC command: lxc-clone -s -B overlayfs orig test (both unprivileged containers)
- LXC version: 1.0.7

We discussed this on the LXC user mailing list, and this is likely a bug in one of the patches that add support for unprivileged use of overlayfs.
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Aug 30 13:56 seq
 crw-rw---- 1 root audio 116, 33 Aug 30 13:56 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.12
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
Ec2AMI: ami-fceee9e1
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: eu-central-1b
Ec2InstanceType: t2.micro
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
MachineType: Xen HVM domU
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:
 0 cirrusdrmfb
 1 xen
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.19.0-26-generic root=UUID=dde0e30e-be54-43dc-9a44-5769353c2e27 ro console=tty1 console=ttyS0
ProcVersionSignature: Ubuntu 3.19.0-26.28~14.04.1-generic 3.19.8-ckt4
RelatedPackageVersions:
 linux-restricted-modules-3.19.0-26-generic N/A
 linux-backports-modules-3.19.0-26-generic N/A
 linux-firmware 1.127.15
RfKill: Error: [Errno 2] No such file or directory
StagingDrivers: visorutil
Tags: trusty ec2-images staging
Uname: Linux 3.19.0-26-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 05/06/2015
dmi.bios.vendor: Xen
dmi.bios.version: 4.2.amazon
dmi.chassis.type: 1
dmi.chassis.vendor: Xen
dmi.modalias: dmi:bvnXen:bvr4.2.amazon:bd05/06/2015:svnXen:pnHVMdomU:pvr4.2.amazon:cvnXen:ct1:cvr:
dmi.product.name: HVM domU
dmi.product.version: 4.2.amazon
dmi.sys.vendor: Xen

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1490267

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Thomas Müller (muellerthomas977) wrote : BootDmesg.txt

apport information

tags: added: apport-collected ec2-images staging trusty
description: updated
Revision history for this message
Thomas Müller (muellerthomas977) wrote : CRDA.txt

apport information

Revision history for this message
Thomas Müller (muellerthomas977) wrote : CurrentDmesg.txt

apport information

Revision history for this message
Thomas Müller (muellerthomas977) wrote : Lspci.txt

apport information

Revision history for this message
Thomas Müller (muellerthomas977) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
Thomas Müller (muellerthomas977) wrote : ProcInterrupts.txt

apport information

Revision history for this message
Thomas Müller (muellerthomas977) wrote : ProcModules.txt

apport information

Revision history for this message
Thomas Müller (muellerthomas977) wrote : UdevDb.txt

apport information

Revision history for this message
Thomas Müller (muellerthomas977) wrote : UdevLog.txt

apport information

Revision history for this message
Thomas Müller (muellerthomas977) wrote : WifiSyslog.txt

apport information

Revision history for this message
Thomas Müller (muellerthomas977) wrote :

Added logs as requested.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Andy Whitcroft (apw) wrote :

Aug 14 10:17:52 dev kernel: [16055.793800] overlayfs: ERROR - failed to whiteout '#ffff88008f3c9240'
Aug 14 10:17:52 dev kernel: [16055.796903] overlayfs: ERROR - failed to whiteout '#ffff88008f397480'
Aug 14 10:17:52 dev kernel: [16055.798370] BUG: unable to handle kernel NULL pointer dereference at 0000000000000057
Aug 14 10:17:52 dev kernel: [16055.800024] IP: [<ffffffff813b4d1c>] lockref_put_or_lock+0xc/0x90
Aug 14 10:17:52 dev kernel: [16055.800428] PGD 215092067 PUD 1b00bb067 PMD 0
Aug 14 10:17:52 dev kernel: [16055.800785] Oops: 0000 [#1] SMP
Aug 14 10:17:52 dev kernel: [16055.800973] Modules linked in: veth overlay vboxsf pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables x_tables dm_crypt nfsd auth_rpcgss nfs_acl nfs lockd grace sunrpc fscache joydev ppdev serio_raw parport_pc parport 8250_fintek mac_hid i2c_piix4 vboxvideo drm vboxguest nls_utf8 isofs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper psmouse cryptd ahci e1000 libahci video
Aug 14 10:17:52 dev kernel: [16055.806264] CPU: 5 PID: 11051 Comm: mv Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu
Aug 14 10:17:52 dev kernel: [16055.807276] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Aug 14 10:17:52 dev kernel: [16055.808139] task: ffff8802149f1d70 ti: ffff8801b0004000 task.ti: ffff8801b0004000
Aug 14 10:17:52 dev kernel: [16055.808949] RIP: 0010:[<ffffffff813b4d1c>] [<ffffffff813b4d1c>] lockref_put_or_lock+0xc/0x90
Aug 14 10:17:52 dev kernel: [16055.809790] RSP: 0018:ffff8801b0007c08 EFLAGS: 00010292
Aug 14 10:17:52 dev kernel: [16055.810246] RAX: 0000000000000001 RBX: 0000000000000057 RCX: 0000000100060006
Aug 14 10:17:52 dev kernel: [16055.810750] RDX: 000000000006bfa0 RSI: ffff88008f23d5c0 RDI: 0000000000000057
Aug 14 10:17:52 dev kernel: [16055.811178] RBP: ffff8801b0007c18 R08: 0000000000000000 R09: ffff8800000bee00
Aug 14 10:17:52 dev kernel: [16055.811810] R10: 0000000000000000 R11: ffff8801b00078be R12: 0000000000000057
Aug 14 10:17:52 dev kernel: [16055.812466] R13: ffff8800afbca600 R14: ffffffffffffffff R15: 0000000000000000
Aug 14 10:17:52 dev kernel: [16055.813886] FS: 00007f99165f0840(0000) GS:ffff88021fd40000(0000) knlGS:0000000000000000
Aug 14 10:17:52 dev kernel: [16055.814653] CS: 0010 DS: 0000 ES: 0000 CR0: 00000000800500...
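
For anyone comparing their own logs against this trace, here is a minimal sketch that pulls the two key facts out of a kernel log: how many 'failed to whiteout' errors overlayfs logged, and which symbol the oops faulted in. The function name summarize_oops and the output format are made up for this note, and the log path is whatever file your system writes kernel messages to.

```shell
#!/bin/sh
# Summarise overlayfs failures in a kernel log file.
# summarize_oops is a hypothetical helper name, not part of any tool.
# Usage: summarize_oops /var/log/syslog   (path is an assumption; adjust as needed)
summarize_oops() {
    log="$1"
    # Count the "failed to whiteout" errors logged by overlayfs.
    whiteouts=$(grep -c 'overlayfs: ERROR - failed to whiteout' "$log")
    # Take the symbol name from the first "IP: [<addr>] symbol+off/len" line.
    # The "RIP: 0010:[<...>]" line does not match this pattern.
    symbol=$(sed -n 's/.*IP: \[<[0-9a-f]*>] \([A-Za-z0-9_.]*\)+.*/\1/p' "$log" | head -n 1)
    echo "whiteout_errors=$whiteouts faulting_symbol=${symbol:-none}"
}
```

Run against the excerpt above, this should report two whiteout errors and lockref_put_or_lock as the faulting symbol. A log with no oops reports faulting_symbol=none.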

Changed in linux (Ubuntu):
importance: Undecided → Medium
assignee: nobody → Andy Whitcroft (apw)
milestone: none → ubuntu-15.09
Revision history for this message
Andy Whitcroft (apw) wrote :

@Thomas -- would you be able to give me a reproducer that starts from a "no containers at all" situation and goes up to the point where this triggers? That would make it easier for me to try and find the cause.

Revision history for this message
Thomas Müller (muellerthomas977) wrote :

Sure, assuming that you are a regular user (non-root), run the following commands:

- lxc-create -t download -n base -- --dist ubuntu --release trusty --arch amd64
- lxc-clone -s -B overlayfs base test
- lxc-start -n test -d
- lxc-attach -n test -- apt-get update

You'll see the "apt-get update" call start to fetch packages, but it will be killed right away. When you then view the syslog on the host system, you'll see the stack trace above. I hope this is what you were looking for. I also have an AWS image with everything set up that I can give you access to, if that helps.
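
The same four steps can be wrapped in a small script for repeated runs. This is only a sketch: the function names, the BASE and CLONE variables, and the DRY_RUN switch are conveniences invented here, while the lxc commands themselves are exactly the ones listed above.

```shell
#!/bin/sh
# Reproduction steps from this comment, wrapped in a function.
# BASE and CLONE default to the container names used above.
# DRY_RUN=1 (added for this sketch) prints the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repro_overlay_whiteout() {
    base="${BASE:-base}"
    clone="${CLONE:-test}"
    # Create an unprivileged base container from the download template.
    run lxc-create -t download -n "$base" -- --dist ubuntu --release trusty --arch amd64 || return 1
    # Snapshot-clone it onto an overlayfs backing store.
    run lxc-clone -s -B overlayfs "$base" "$clone" || return 1
    run lxc-start -n "$clone" -d || return 1
    # On affected kernels this is the step that gets killed.
    run lxc-attach -n "$clone" -- apt-get update
}
```

Call it as a regular (non-root) user with `repro_overlay_whiteout`, or preview the commands first with `DRY_RUN=1 repro_overlay_whiteout`. After a real run, check the host syslog for the overlayfs errors.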

You seem to have tagged this for ubuntu-15.09. Is there a chance this will also be fixed in the 14.04 LTS release?

Revision history for this message
oleg (overlayfs) wrote :

Some related information:

A thread on the lxc mailing-list:
https://lists.linuxcontainers.org/pipermail/lxc-users/2015-August/009854.html

A related bug filed against lxc:
https://bugs.launchpad.net/lxc/+bug/1486073

Andy Whitcroft (apw)
Changed in linux (Ubuntu):
milestone: ubuntu-15.09 → ubuntu-15.10
Andy Whitcroft (apw)
Changed in linux (Ubuntu):
milestone: ubuntu-15.10 → ubuntu-15.11
Andy Whitcroft (apw)
Changed in linux (Ubuntu):
milestone: ubuntu-15.11 → ubuntu-15.12
Andy Whitcroft (apw)
Changed in linux (Ubuntu):
milestone: ubuntu-15.12 → ubuntu-16.01
Revision history for this message
Alex T. (m-u1tst-20-alext) wrote :

I observe a repeatable crash on Wily; when the system is rebooted, the callstack in the crash report is similar to the one in comment #13 ("BUG: unable to handle kernel NULL pointer dereference", dput - ovl_rename2 - vfs_rename - ...).

The crash is triggered by the usage of an unprivileged LXC container with overlayfs storage.

It rarely happens on kernel 4.2.0-22, much more frequently with kernel 4.2.0-23, and very reliably with 4.2.0-25. It also seems that replacing the system RAM with faster modules increased the incidence of the crash.

Unlike Thomas's case in #15, my system locks up completely (only Ctrl-Alt-Shift-SysRq-B works).

I hope I'm not off topic here; my kernel is different from the one in the original report but the stack trace is similar so I think it might be the same root cause. I'll be glad to provide any additional info or perform experiments.

Andy Whitcroft (apw)
Changed in linux (Ubuntu):
milestone: ubuntu-16.01 → ubuntu-16.02
Revision history for this message
oleg (overlayfs) wrote :

I can no longer reproduce this bug, following the release of fixes for bugs #1531747, #1534961 and #1535150.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released