usage of /tmp during boot is not safe due to systemd-tmpfiles-clean

Bug #1707222 reported by Scott Moser on 2017-07-28
This bug affects 4 people
Affects              Importance  Assigned to
cloud-init           High        Unassigned
cloud-init (Ubuntu)  High        Unassigned
systemd (Ubuntu)     Undecided   Unassigned

Bug Description

Earlier this week on Zesty on Azure I saw a cloud-init failure in its 'mount_cb' function.
That function essentially does:
 a.) make a tmp directory for a mount point
 b.) mount some filesystem to that mount point
 c.) call a function
 d.) unmount the directory

What I recall was that access to a file inside the mount point failed during 'c'.
This seems possible as systemd-tmpfiles-clean may be running at the same time as cloud-init (cloud-init.service in this example).
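
The four steps above can be sketched roughly as follows. This is a minimal illustration and not cloud-init's actual 'mount_cb'; the name mount_cb_sketch and the injectable mount/umount parameters are assumptions, added so the flow can be followed without root. If something removes the directory after step a, the mount in step b or the callback in step c fails.

```python
import shutil
import subprocess
import tempfile


def _mount(device, mountpoint):
    # Real read-only mount; requires root.
    subprocess.check_call(["mount", "-o", "ro", device, mountpoint])


def _umount(mountpoint):
    subprocess.check_call(["umount", mountpoint])


def mount_cb_sketch(device, callback, mount=_mount, umount=_umount,
                    tmp_parent=None):
    # a.) make a tmp directory for a mount point
    #     (tmp_parent=None means the default location, usually /tmp)
    mountpoint = tempfile.mkdtemp(dir=tmp_parent)
    try:
        # b.) mount the filesystem on that mount point
        mount(device, mountpoint)
        try:
            # c.) call the function, passing it the mount point
            return callback(mountpoint)
        finally:
            # d.) unmount the directory
            umount(mountpoint)
    finally:
        # Remove the mount point directory whether or not the steps succeeded.
        shutil.rmtree(mountpoint, ignore_errors=True)
```

Passing tmp_parent explicitly is what would let a caller move the mount point out of /tmp.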

It seems that this service basically inhibits *any* other service from using tmp files.
Its ordering statements are only:

  After=local-fs.target time-sync.target
  Before=shutdown.target

So while in most cases only services that run early in the boot process, like cloud-init, will be affected, any service could have its tmp files removed. systemd-tmpfiles-clean could also take quite a long time to run if /tmp/ had been filled with lots of files in the previous boot.

Dimitri John Ledkov (xnox) wrote :

systemd-tmpfiles-clean is racy, but it only cleans things as specified by the tmpfiles.d/ configs in /run, /etc and /usr/lib, and only for entries that explicitly ask for cleanup of files older than some age.

For /tmp the affected paths are older than 10 days only:
d /tmp/.X11-unix 1777 root root 10d
d /tmp/.ICE-unix 1777 root root 10d
d /tmp/.XIM-unix 1777 root root 10d
d /tmp/.font-unix 1777 root root 10d
d /tmp/.Test-unix 1777 root root 10d
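
To make the fields in those lines explicit, here is a small sketch that splits a tmpfiles.d line into its columns (type, path, mode, user, group, age, argument). The helper name parse_tmpfiles_line is hypothetical, and the parsing is simplified: it ignores comments, quoting and other details of the real format.

```python
from collections import namedtuple

TmpfilesEntry = namedtuple(
    "TmpfilesEntry", "type path mode user group age argument"
)


def parse_tmpfiles_line(line):
    # Split into at most 7 fields; the last one (argument) may contain spaces.
    fields = line.split(None, 6)
    # Pad omitted trailing fields with "-", the placeholder for "unset".
    fields += ["-"] * (7 - len(fields))
    return TmpfilesEntry(*fields)


entry = parse_tmpfiles_line("d /tmp/.X11-unix 1777 root root 10d")
print(entry.path, entry.age)  # prints: /tmp/.X11-unix 10d
```

The sixth column is the age, so each of the lines above carries a 10 day age for its path.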

To figure out what actually happened, we need a reproducer or detailed logs, including the journal and the contents of /run/tmpfiles.d, /etc/tmpfiles.d and /usr/lib/tmpfiles.d.

I do not recommend using /tmp, on security grounds, but I do recommend setting PrivateTmp=true in the systemd units to get a secure /tmp and /var/tmp for your service.

Changed in systemd (Ubuntu):
status: New → Incomplete
Scott Moser (smoser) wrote :

So something is definitely cleaning on boot.
Maybe I just misunderstood your statement, but above it seems like you were saying that only old files named like those would be removed.

Try this:
 $ lxc launch ubuntu-daily:artful a1
 $ lxc exec a1 -- touch /tmp/foo
 $ lxc restart a1
 $ lxc exec a1 -- ls -l /tmp/foo
 ls: cannot access '/tmp/foo': No such file or directory

Changed in systemd (Ubuntu):
status: Incomplete → Confirmed
Dimitri John Ledkov (xnox) wrote :

/tmp is currently only guaranteed to be in a usable state after sysinit.target. Or, more generally, only after the system is fully up.

During boot (initramfs, post-initramfs, early boot, post-boot) system services should use /run, a private runtime dir, or a private tmp dir.

This is not a bug in the Ubuntu boot sequence. Please fix cloud-init to use /run.

Changed in systemd (Ubuntu):
status: Confirmed → Won't Fix
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in cloud-init (Ubuntu):
status: New → Confirmed
Scott Moser (smoser) wrote :

From another bug with the same root cause, a failure might look like this:
Apr 12 07:56:33 ubuntu [CLOUDINIT] util.py[DEBUG]: Running command ['mount', '-o', 'ro,sync', '-t', 'auto', '/dev/sr0', '/tmp/tmpzq70nqyi'] with allowed return codes [0] (shell=False, capture=True)
Apr 12 07:56:33 ubuntu [CLOUDINIT] util.py[DEBUG]: Failed mount of '/dev/sr0' as 'auto': Unexpected error while running command.#012Command: ['mount', '-o', 'ro,sync', '-t', 'auto', '/dev/sr0', '/tmp/tmpzq70nqyi']#012Exit code: 32#012Reason: -#012Stdout: ''#012Stderr: 'mount: mount point /tmp/tmpzq70nqyi does not exist\n'
Apr 12 07:56:33 ubuntu [CLOUDINIT] util.py[DEBUG]: Recursively deleting /tmp/tmpzq70nqyi

Changed in cloud-init:
status: New → Confirmed
importance: Undecided → Medium
Changed in cloud-init (Ubuntu):
importance: Undecided → High
Changed in cloud-init:
importance: Medium → High
Scott Moser (smoser) wrote :

Note there is some discussion of this bug, and of other options, on IRC at https://irclogs.ubuntu.com/2017/07/28/%23ubuntu-devel.html#t16:24

Scott Moser (smoser) wrote :

Some solutions:
 a.) Use systemd PrivateTmp, which mounts a filesystem over /tmp for the process. The issue here is that this would only solve the problem for systemd boot.
 b.) Set up our own tmp and use it.
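
Option b could look something like the sketch below. It is only an illustration under assumptions: the helper name mkdtemp_private and its parameters are hypothetical, and the base directory is left to the caller (for example a directory under /run, as suggested earlier in this report).

```python
import os
import tempfile


def mkdtemp_private(base_dir, prefix="tmp"):
    # Create the base directory if needed, accessible only by its owner.
    os.makedirs(base_dir, mode=0o700, exist_ok=True)
    # Create a uniquely named directory under base_dir rather than /tmp.
    return tempfile.mkdtemp(prefix=prefix, dir=base_dir)
```

A service running as root might call mkdtemp_private("/run/myservice/tmp"), where the path is a made-up example, and remove the directory itself when done.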

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.9-267-g922c3c5c-0ubuntu1

---------------
cloud-init (0.7.9-267-g922c3c5c-0ubuntu1) artful; urgency=medium

  * New upstream snapshot.
    - Ec2: only attempt to operate at local mode on known platforms.
      (LP: #1715128)
    - Use /run/cloud-init for tempfile operations. (LP: #1707222)
    - ds-identify: Make OpenStack return maybe on arch other than intel.
      (LP: #1715241)
    - tests: mock missed openstack metadata uri network_data.json
      [Chad Smith] (LP: #1714376)
    - relocate tests/unittests/helpers.py to cloudinit/tests
      [Lars Kellogg-Stedman]
    - tox: add nose timer output [Joshua Powers]
    - upstart: do not package upstart jobs, drop ubuntu-init-switch module.
    - tests: Stop leaking calls through unmocked metadata addresses
      [Chad Smith] (LP: #1714117)

 -- Scott Moser <email address hidden> Thu, 07 Sep 2017 16:59:04 -0400

Changed in cloud-init (Ubuntu):
status: Confirmed → Fix Released
Scott Moser (smoser) on 2017-09-21
Changed in cloud-init:
status: Confirmed → Fix Committed

This bug is believed to be fixed in cloud-init 17.1. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in cloud-init:
status: Fix Committed → Fix Released