installing or upgrading libc6 in Trusty removes all content from /tmp directory

Bug #1464442 reported by Larry Michel
This bug affects 4 people
Affects            Status    Importance  Assigned to  Milestone
linux (Ubuntu)     Invalid   Undecided   Unassigned
upstart (Ubuntu)   Triaged   Low         Unassigned

Bug Description

We are seeing an issue with installation of the dkms package during a curtin installation, which ends with the /tmp directory being wiped clean. This is very bad for curtin, as it saves critical installation files in /tmp.

It turns out that it's the act of upgrading libc6, which is triggered as a result of installing dependencies, that removes the content of /tmp. For example, installing gcc has the same effect, since it ends up upgrading libc6. The only way that this won't be recreated is if the latest libc6 is already installed.

This problem does not exist in precise. It can also be recreated by installing the .deb file for any version in trusty including 2.17.

========================================================================
ubuntu@host:~$ ls /tmp
tmpHHbRkP
ubuntu@sirrush:~$ sudo apt-get install libc6
sudo: unable to resolve host sirrush
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  libc-dev-bin libc6-dev
Suggested packages:
  glibc-doc
Recommended packages:
  manpages-dev
The following packages will be upgraded:
  libc-dev-bin libc6 libc6-dev
3 upgraded, 0 newly installed, 0 to remove and 148 not upgraded.
Need to get 6,714 kB of archives.
After this operation, 6,144 B disk space will be freed.
Do you want to continue? [Y/n] y
Get:1 http://archive.ubuntu.com/ubuntu/ trusty-updates/main libc6-dev amd64 2.19-0ubuntu6.6 [1,910 kB]
Get:2 http://archive.ubuntu.com/ubuntu/ trusty-updates/main libc-dev-bin amd64 2.19-0ubuntu6.6 [68.9 kB]
Get:3 http://archive.ubuntu.com/ubuntu/ trusty-updates/main libc6 amd64 2.19-0ubuntu6.6 [4,735 kB]
Fetched 6,714 kB in 0s (18.5 MB/s)
Preconfiguring packages ...
(Reading database ... 57798 files and directories currently installed.)
Preparing to unpack .../libc6-dev_2.19-0ubuntu6.6_amd64.deb ...
Unpacking libc6-dev:amd64 (2.19-0ubuntu6.6) over (2.19-0ubuntu6.3) ...
Preparing to unpack .../libc-dev-bin_2.19-0ubuntu6.6_amd64.deb ...
Unpacking libc-dev-bin (2.19-0ubuntu6.6) over (2.19-0ubuntu6.3) ...
Preparing to unpack .../libc6_2.19-0ubuntu6.6_amd64.deb ...
Unpacking libc6:amd64 (2.19-0ubuntu6.6) over (2.19-0ubuntu6.3) ...
Processing triggers for man-db (2.6.7.1-1) ...
Setting up libc6:amd64 (2.19-0ubuntu6.6) ...
Setting up libc-dev-bin (2.19-0ubuntu6.6) ...
Setting up libc6-dev:amd64 (2.19-0ubuntu6.6) ...
Processing triggers for libc-bin (2.19-0ubuntu6.3) ...
ubuntu@host:~$ ls /tmp
ubuntu@host:~$
========================================================================

This is easily reproducible.

Tags: oil trusty
Larry Michel (lmic)
summary: - installing or upgrading libc6 removes all content from /tmp directory
+ installing or upgrading libc6 in Trusty removes all content from /tmp
+ directory
description: updated
Revision history for this message
Larry Michel (lmic) wrote :

I was only able to recreate this in a curtin environment, which has an overlay root:

$ mount
overlayroot on / type overlayfs (rw,lowerdir=/media/root-ro/,upperdir=/media/root-rw/overlay)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/cgroup type tmpfs (rw)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
none on /sys/fs/pstore type pstore (rw)
/dev/sdf on /media/root-ro type ext4 (ro)
tmpfs-root on /media/root-rw type tmpfs (rw,relatime)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,relatime,name=systemd)

Attached are process_dump files for strace -o process_dump -ff apt-get install libc6

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in eglibc (Ubuntu):
status: New → Confirmed
Revision history for this message
Andrew Cloke (andrew-cloke) wrote :

This bug is also affecting the MAAS 1.7.1 deployment of Ubuntu 14.04 onto a diskless server with access to an iSCSI LUN, as described in https://bugs.launchpad.net/curtin/+bug/1425264/.

I've marked LP#1425264 as a dup of this bug.

Revision history for this message
Samantha Jian-Pielak (samantha-jian) wrote :

I'd assume this will affect any provision with trusty + hpvsa 3rd party driver as well.

Revision history for this message
Steve Langasek (vorlon) wrote :

This bug is not reproducible at all on an ordinary trusty chroot. There is nothing in any of the libc6 maintainer scripts which touches the /tmp directory directly, and nothing in the strace output shows inappropriate handling of /tmp. I think you're looking at a bug in your overlayfs implementation.

Changed in eglibc (Ubuntu):
status: Confirmed → Invalid
Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1464442

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: trusty
Revision history for this message
Seth Forshee (sforshee) wrote :

In addition to the requested logs could someone provide steps to reproduce, preferably in a VM without MAAS?

Revision history for this message
Larry Michel (lmic) wrote :

Brad, I need some clarification on the apport-collect script. The curtin environment is a very limited installation environment, and when I ran apport-collect it attempted to open a socket but never returned. AFAICT, it needs to open a browser so the user can authenticate, but I don't see a way to do that since there's no browser. Is it possible for the tool to collect this data offline and then use apport-collect to post the data from another system on which I can authenticate? If so, can you please share the steps?

Seth, we haven't recreated this without MAAS, but we think that this should be re-creatable in any environment with an overlay root... You could try with a LiveCD, for example. Once inside the environment, verify the content of /tmp, then run "sudo apt-get update && sudo apt-get install libc6", assuming that the libc6 version in the archive is newer. If /tmp is empty afterwards, then the issue was successfully recreated.

Revision history for this message
Larry Michel (lmic) wrote :

Brad, one additional bit of info: it is through ssh that I am able to access the environment.

Revision history for this message
Stefan Bader (smb) wrote :

Ok, Seth, I found that I can see this in a VM when using the original Trusty release desktop 64bit ISO:

- boot into "Try Ubuntu"
- open a terminal (uxterm works best for me due to these stupid gfx bugs with Cirrus)
- ls /tmp (contains some files already)
- sudo apt-get update
- sudo apt-get install libc6
- ls /tmp (files are gone)
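The before/after check in the steps above can be scripted. The following is only a minimal sketch: the function names snapshot_dir and check_wiped are made up for illustration and are not part of any Ubuntu tool.

```shell
#!/bin/sh
# Sketch: detect whether running a command empties a directory.
# snapshot_dir and check_wiped are hypothetical helper names.

# Print a sorted listing of everything in directory $1 (including dotfiles).
snapshot_dir() {
    ls -A "$1" | sort
}

# Run a command and report whether directory $1, which must be non-empty
# beforehand, is empty afterwards.
# Usage: check_wiped DIR COMMAND [ARGS...]
# Returns 0 if wiped, 1 if entries remain, 2 if DIR was empty to begin with.
check_wiped() {
    dir=$1
    shift
    before=$(snapshot_dir "$dir")
    if [ -z "$before" ]; then
        echo "inconclusive: $dir was already empty"
        return 2
    fi
    # The exit status of the command itself is deliberately ignored.
    "$@" || true
    after=$(snapshot_dir "$dir")
    if [ -z "$after" ]; then
        echo "wiped: $dir lost all of its entries"
        return 0
    fi
    echo "intact: $dir still has entries"
    return 1
}
```

On a live session one would presumably run "sudo apt-get update" first and then "check_wiped /tmp sudo apt-get install -y libc6", again assuming the archive carries a newer libc6 than the one installed.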

This does *not* happen using the 14.04.2 ISO with the 3.16 kernel. Interesting is that on the live-cd /tmp is a tmpfs mount, so that would rather not make overlayfs a suspect.

@Larry, can you post here which kernel version is used in the curtin environment you are looking at? I did not see that info yet; maybe I missed it. Right now it's hard to say which other package versions might be interesting.

For libc it seems to be 2.19-0ubuntu6.3 -> 2.19-0ubuntu6.6 for curtin. On the 14.04.2 ISO it's from 2.19-0ubuntu6.5 to the same version (cannot say yet whether that is of importance).

Revision history for this message
Seth Forshee (sforshee) wrote :

I've also reproduced it in a VM using Stefan's method. Both Stefan and I have verified that stubbing out or removing /etc/init/mounted-tmp.conf eliminates the problem, so it seems that this job is getting triggered somehow.
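One less invasive way to stub the job out might be an Upstart override file rather than editing or deleting the .conf. This is a sketch only: nobody in this thread tested an override, the helper name disable_upstart_job is hypothetical, and with the job set to manual /tmp would presumably no longer be cleaned at boot either.

```shell
#!/bin/sh
# Sketch: stop an Upstart job from starting automatically by writing a
# "manual" stanza into JOB.override next to the job's .conf file.
# disable_upstart_job is a hypothetical helper name.
# $1 = job name, $2 = job directory (defaults to /etc/init).
disable_upstart_job() {
    job=$1
    initdir=${2:-/etc/init}
    printf 'manual\n' > "$initdir/$job.override"
}
```

Run as root, "disable_upstart_job mounted-tmp" would create /etc/init/mounted-tmp.override; deleting that file should restore the job's normal behaviour.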

Revision history for this message
Larry Michel (lmic) wrote :

Stefan, it's ubuntu@sirrush:~$ uname -a
Linux sirrush 3.13.0-35-generic #62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Attached is output of dpkg -l

Revision history for this message
Seth Forshee (sforshee) wrote :

http://paste.ubuntu.com/11768305/

That link shows upstart-monitor output during the libc6 upgrade. I see the mountall job starting, then later I see a mounted event for /tmp which triggers the mounted-tmp job, which is what ends up clearing out /tmp. In fact, generally it looks to me like many of the startup jobs are running, so I suspect upstart must be emitting a startup event.

Just glancing briefly at the upstart code, it seems to support a restart flag to be passed on reexec which would prevent the startup event from being emitted. The logic is a bit weird though - on re-exec upstart only seems to pass --restart if the current upstart isn't a re-exec ... so if you re-execed twice you don't want the flag?

Anyway, I think at this point I'll throw it back over to Steve, as I don't see any evidence at all that this is a kernel issue.

Revision history for this message
Seth Forshee (sforshee) wrote :

Also, I forgot to mention that Chris Arges pointed out that there's an eglibc postinst script which does a re-exec of upstart, which is an important piece of the puzzle.

Revision history for this message
Chris J Arges (arges) wrote :

The problem is the following:

In debian/debhelper.in/libc.postinst, telinit u 2>/dev/null is called, which restarts all the services and eventually calls mounted-tmp. This should be skipped for livecd or curtin environments.

Changed in eglibc (Ubuntu):
status: Invalid → Triaged
Changed in linux (Ubuntu):
status: Incomplete → Invalid
Revision history for this message
Chris J Arges (arges) wrote :

To follow up, I've also tested this with a hacked libc6: with that line commented out, /tmp does not get wiped; with the line in place, /tmp gets wiped as usual. In addition, it looks like there is already a clause for not restarting init if we're in a chroot.
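For readers unfamiliar with that kind of clause, a chroot guard around the re-exec could look roughly like the sketch below. It is not the actual postinst code, which is not quoted in this thread. The function name and the ISCHROOT_CMD and TELINIT_CMD variables are hypothetical and exist only so the logic can be dry-run; ischroot is the detection tool shipped in debianutils.

```shell
#!/bin/sh
# Sketch of a chroot guard around "telinit u"; NOT the real libc postinst.
# ISCHROOT_CMD and TELINIT_CMD are hypothetical overrides for dry runs.
maybe_reexec_init() {
    # ischroot exits 0 when it detects a chroot; any other status
    # (not a chroot, or unable to tell) falls through to the re-exec.
    if "${ISCHROOT_CMD:-ischroot}"; then
        echo "chroot detected, skipping init re-exec"
        return 0
    fi
    "${TELINIT_CMD:-telinit}" u
}
```

A guard like this only covers chroots. It would not by itself skip the re-exec on a live CD or in a curtin overlay root.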

Revision history for this message
Steve Langasek (vorlon) wrote :

In fact, the re-exec should not be skipped for live environments. The problem is that on re-exec, the arguments that are supposed to be passed to the new process to tell it to not emit the startup event are not making it through. I know this worked at one point, so I'm not sure how this regressed, but we'll follow up on it.

affects: eglibc (Ubuntu) → upstart (Ubuntu)
Revision history for this message
Steve Langasek (vorlon) wrote :

'strace -ff -p 1 -e execve' and 'telinit u' shows that init is being execed with the expected args:

execve("/sbin/init", ["/sbin/init", "splash", "--restart", "--state-fd", "19", "--verbose"], [...])

But after re-exec, the process's args show as just '/sbin/init', and the log output shows that jobs are being triggered just as if the 'startup' event has been emitted.
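One way to look at a process's arguments after the fact is /proc/PID/cmdline, sketched below. The helper name proc_args is hypothetical, and the thread does not say which tool was actually used to inspect init here. A process can also overwrite its own argv, so the output is a hint rather than proof of what was passed to execve.

```shell
#!/bin/sh
# Sketch: print the argument vector of process $1, one argument per line.
# /proc/PID/cmdline stores the arguments separated by NUL bytes.
# proc_args is a hypothetical helper name.
proc_args() {
    tr '\0' '\n' < "/proc/$1/cmdline"
}
```

"proc_args 1" after a "telinit u" would then show whether --restart and --state-fd are still visible on pid 1.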

Revision history for this message
Steve Langasek (vorlon) wrote :

It seems I misread the logs when I said I was seeing the jobs being triggered as if the 'startup' event has been emitted. I cannot reproduce this behavior on a stock trusty VM with a normal root. There is no evidence that 'startup' is being emitted again, and no mountall process being spawned.

> The logic is a bit weird though - on re-exec upstart only seems
> to pass --restart if the current upstart isn't a re-exec ... so if
> you re-execed twice you don't want the flag?

The point of this is that if restart is set, --restart is already present in args_copy so shouldn't be added (again).

So, since I couldn't reproduce this issue on a stock trusty VM, I'm trying now with a liveCD booted under VM since that's how it's reported to be reproducible - so I can see what the state of upstart is before re-exec.

Revision history for this message
Steve Langasek (vorlon) wrote :

Still don't know what's going on here, but it's consistently reproducible when running telinit u on a live system, and consistently *not* reproducible when running telinit u on a live system booted with --write-state-file. This should make debugging fun.

Revision history for this message
Stefan Bader (smb) wrote :

Not sure this helps or adds just more confusion, but I find it odd that with the broken environment there remain two /sbin/init processes running at least for a while... Ok, could be because the 2nd one which likely is the supposed restart is getting blocked because of the vanished files in /tmp.

Revision history for this message
Stefan Bader (smb) wrote :

Oh right and the trace command that Steve used will show a call to init early on which passes in a state fd which then, a bit later, seems to be used as command stream to a shell. A construct like that was the parent of mounted-tmp runs I observed. So I assume this is the method to execute upstart scripts... but anyway.

And oh #2 @Steve: to make things even more fun, it seems that the "does not reproduce" does not depend on the use of --write-state-file but on the fact that you have to interrupt the boot and then directly boot into the live mode instead of making that selection graphically... Oddly, in those cases the execve for init seems to use a different fd number than later calls of sh use for input. But again, that might be just a fluke...

Revision history for this message
Stefan Bader (smb) wrote :

Oh f..., I hate user-space. Now it does not even reproduce with the steps that worked before...

Revision history for this message
Stefan Bader (smb) wrote :

Maybe it's actually the strace attached to pid 1...

Revision history for this message
Stefan Bader (smb) wrote :

So it's confirmed: having a "strace -ff -p1 -e execve" running in a different uxterm while doing the install will cause the reproduction to fail. Damn Heisenbugs.

Revision history for this message
Stefan Bader (smb) wrote :

I took this log while installing libc with forkstat. /tmp was cleaned when doing this. Not sure it shows more about what is going on.

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

Steve's suggested workaround:

> # dpkg-divert --rename --add /sbin/telinit
> # cat > /sbin/telinit
> #!/bin/sh
> exit 0
> ^D
> # apt-get install [...]
> # dpkg-divert --rename --remove /sbin/telinit

Revision history for this message
Steve Langasek (vorlon) wrote : Re: [Bug 1464442] Re: installing or upgrading libc6 in Trusty removes all content from /tmp directory

On Fri, Jun 26, 2015 at 10:47:54PM -0000, Jason Hobbs wrote:
> Steve's suggested work around:
>
> > # dpkg-divert --rename --add /sbin/telinit
> > # cat > /sbin/telinit
> > #!/bin/sh
> > exit 0
> > ^D

Just noticed I forgot to put in a 'chmod a+x /sbin/telinit' in here

> > # apt-get install [...]
> > # dpkg-divert --rename --remove /sbin/telinit

Daniel Manrique (roadmr)
Changed in upstart (Ubuntu):
importance: Undecided → Low
Revision history for this message
Andres Rodriguez (andreserl) wrote :

I think the priority of this should be raised as the importance of this being fixed is quite high!

Revision history for this message
Steve Langasek (vorlon) wrote :

Andres, I understood that a workaround was in place now for curtin. Why is it of high importance to fix this bug?
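For reference, the telinit workaround quoted earlier in this thread, with the chmod correction folded in, could be wrapped up roughly as below. This is a sketch and not what curtin actually ships. The function names are hypothetical, and the "rm -f" before removing the diversion is an addition that rests on an assumption: dpkg-divert typically refuses to rename the diverted file back over a different existing file.

```shell
#!/bin/sh
# Sketch of the dpkg-divert workaround from this thread, chmod included.
# write_noop_stub and with_telinit_stubbed are hypothetical names.

# Write an executable stub at path $1 that does nothing and exits 0.
write_noop_stub() {
    printf '#!/bin/sh\nexit 0\n' > "$1"
    chmod a+x "$1"
}

# Divert /sbin/telinit, run the given command, then restore the real binary.
# Must run as root. Usage: with_telinit_stubbed apt-get install -y PACKAGE
with_telinit_stubbed() {
    dpkg-divert --rename --add /sbin/telinit || return 1
    write_noop_stub /sbin/telinit
    # Remember the command's status so the diversion is removed either way.
    "$@" && status=0 || status=$?
    # Assumption: the stub must go before the original can be renamed back.
    rm -f /sbin/telinit
    dpkg-divert --rename --remove /sbin/telinit
    return "$status"
}
```

While the stub is in place, the "telinit u" call in the libc6 postinst becomes a no-op, so init is not re-executed during the upgrade.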
