cloud-init will not run user-data scripts when /var filesystem is mounted with the noexec flag

Bug #1839899 reported by Nému Support
This bug affects 1 person
Affects: cloud-init
Status: Expired
Importance: Medium
Assigned to: Unassigned

Bug Description

Cloud Vendor: Amazon AWS
Platform: RHEL7.6
Cloud-Init: cloud-init-18.5-3.el7.x86_64
Kernel: 3.10.0-1062.el7.x86_64
SELinux: selinux-policy-targeted-3.13.1-252.el7.1.noarch

--

We have identified that having the "noexec" flag set on the /var filesystem causes cloud-init to fail running user-data scripts. This is a security requirement mandated by STIG policies that we're purposefully trying to meet for Federal systems.

The affected code is in:

/usr/lib/python2.7/site-packages/cloudinit/util.py

Under the function:

runparts()

The system checks for access to the executable using the following line:

        if os.path.isfile(exe_path) and os.access(exe_path, os.X_OK):
                                     ## ^^^^^^^^^

While the file is executable, the "noexec" flag on the filesystem causes os.access() to report False, which cancels the execution of the user-data script.
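
The distinction above (mode bits say "executable", os.access() says no) can be inspected with a small helper. This is a minimal sketch assuming Python 3 on Linux; the name exec_diagnosis is hypothetical and is not part of cloud-init.

```python
import os
import stat


def exec_diagnosis(path):
    """Return (mode_has_x, access_ok, on_noexec) for path.

    Hypothetical helper for illustration; not part of cloud-init.
    """
    st = os.stat(path)
    # True if any execute bit is set in the file mode (e.g. 0755 or 0700).
    mode_has_x = bool(st.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
    # The same check that runparts() performs.
    access_ok = os.access(path, os.X_OK)
    # ST_NOEXEC is set in f_flag when the containing filesystem is mounted noexec.
    on_noexec = bool(os.statvfs(path).f_flag & os.ST_NOEXEC)
    return mode_has_x, access_ok, on_noexec
```

For a script under /var/lib/cloud on a noexec /var, the behaviour described in this report would correspond to mode_has_x being True, access_ok being False and on_noexec being True.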

To reproduce the problem:

- Create new filesystem
- Move /var files to new filesystem
- Add /var to fstab with the "noexec" option
- Mount new /var filesystem
- Run cloud-init init
- Run cloud-init modules -m final
- Observe that the cloud-init scripts do not run

Note that the files in /var/lib/cloud/instances/*/scripts/ are executable (mode 0755 or 0700), and that trying to execute one of them directly fails with errno 13 (EACCES): Permission denied.
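
To see how that error surfaces to a Python caller, here is a minimal sketch, assuming Python 3 on Linux. Both function names are hypothetical. Because a noexec mount is not always at hand, the stand-in script is created without execute bits; on a real noexec mount the mode bits would be set, but the kernel refuses the exec with the same EACCES.

```python
import errno
import os
import subprocess
import tempfile


def try_exec(path):
    """Try to launch path directly; return 0 if it started, else the errno."""
    try:
        subprocess.run([path], check=False)
        return 0
    except OSError as exc:
        return exc.errno


def make_stand_in_script():
    """Create a script WITHOUT execute bits (stand-in for a noexec mount)."""
    fd, path = tempfile.mkstemp(suffix=".sh")
    with os.fdopen(fd, "w") as f:
        f.write("#!/bin/sh\nexit 0\n")
    os.chmod(path, 0o644)
    return path
```

Calling try_exec() on the stand-in script returns errno.EACCES, the numeric value 13 quoted above.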

--

Possible fixes:

- Search for marker on the first line of the file (#!) and add the requested shell as exe_prefix (as stated above)
- Move /var/lib/cloud (or a portion thereof) to a different filesystem path and symlink it to original path
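
The first option could look roughly like the sketch below. It assumes Python 3, and shebang_prefix is a hypothetical name, not an existing cloud-init function. It mirrors the Linux convention that everything after the interpreter path is passed as a single argument, and it returns None for files with no "#!" line (for example compiled binaries), which this approach cannot handle.

```python
def shebang_prefix(path):
    """Return [interpreter] or [interpreter, arg] from a '#!' first line.

    Returns None when the file does not start with '#!'.
    Hypothetical helper for illustration; not part of cloud-init.
    """
    with open(path, "rb") as f:
        first = f.readline(256)
    if not first.startswith(b"#!"):
        return None
    # Split once: interpreter path, then at most one combined argument.
    parts = first[2:].decode("utf-8", "replace").strip().split(None, 1)
    return parts or None
```

A caller would then run prefix + [path], so the interpreter (which lives on a filesystem that allows exec) reads the script as data instead of the kernel executing it from the noexec mount.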

We have tested the second workaround and it seems to help:

# cloud-init clean
# rm -Rf /var/lib/cloud
# mkdir -p /etc/cloud/runtime
# ln -s /etc/cloud/runtime /var/lib/cloud
# restorecon -rv /var/lib/cloud

After this, user-data scripts appear to execute.

Tags: rhel aws selinux
tags: added: selinux
tags: added: aws rhel
Nému Support (nemusupport) wrote :

Update:

Further testing leads us to believe that this problem may actually occur when having the "noexec" flag set on the /var filesystem. This is a security requirement that we're purposefully trying to meet for Federal systems.

Possible fixes:

- Search for marker (#!) and add the requested shell as exe_prefix (as stated above)
- Move /var/lib/cloud to a different filesystem path and symlink it to original path

We have tested the second item and it seems to work:

# cloud-init clean
# rm -Rf /var/lib/cloud
# mkdir -p /etc/cloud-init-runtime
# ln -s /etc/cloud-init-runtime /var/lib/cloud
# restorecon -rv /var/lib/cloud

After this, user-data scripts appear to execute.

Nému Support (nemusupport) wrote :

Confirmed that moving the directory to /etc works.

Not sure if there's a clean way to fix this in cloud-init's code - Should the software detect that the /var/lib/cloud directory is on a noexec filesystem and change the storage path for executable scripts to /etc/cloud/runtime-scripts in such cases?
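
For reference, detecting the condition is straightforward. The sketch below is an assumption-laden illustration, not something cloud-init does: it takes text in /proc/mounts format, finds the mount that contains a path and checks its options. Both function names are hypothetical, the path is expected to be already resolved (for example with os.path.realpath), and mount points containing escaped characters such as \040 are not handled.

```python
import os


def mount_options(path, mounts_text):
    """Return the option list of the mount that contains path.

    mounts_text uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    path = os.path.normpath(path)
    best_point, best_opts = "", []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        point, opts = fields[1], fields[3].split(",")
        prefix = point.rstrip("/") + "/"
        # Keep the longest matching mount point; later entries win ties.
        if (path == point or path.startswith(prefix)) and len(point) >= len(best_point):
            best_point, best_opts = point, opts
    return best_opts


def is_noexec(path, mounts_text):
    """True if the mount containing path carries the noexec option."""
    return "noexec" in mount_options(path, mounts_text)
```

On a live system the second argument would be the contents of /proc/mounts.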

Dan Watkins (oddbloke) wrote :

Hi Nému!

Thanks for using cloud-init, and for filing this detailed bug. It's great!

Regarding your first possible fix, my feeling is that we can't assume that the files that runparts is executing are scripts with shebangs. For example, I just did `ln /bin/ls /var/lib/cloud/scripts/per-boot` (in an Ubuntu lxd container) and cloud-init happily runs it, outputting the contents of / to /var/log/cloud-init-output.log. I don't think we should break this binary-in-scripts-directory use case.

Given that I think you've discovered that this issue is slightly different to your initial report, could you update the description to reflect your latest understanding of it, and then move this report back to New? That will make it easier for me to take this bug to the rest of the development team for a conversation.

Thanks!

Dan

Changed in cloud-init:
status: New → Incomplete
Nému Support (nemusupport) wrote :

Thanks for the reply, Dan! To confirm, if you remount your /var filesystem as noexec under the lxc container, your binary no longer gets executed?

summary: - When running on an SELinux enforcing system, cloud-init will not run
- user-data scripts
+ cloud-init will not run user-data scripts when /var filesystem is
+ mounted with the noexec flag
Changed in cloud-init:
status: Incomplete → New
Dan Watkins (oddbloke) wrote :

Thanks for the title update! I'd appreciate it if we could also update the longer-form text to match the bug as we now understand it, so that people don't have to read through comments to work out where we're at.

> To confirm, if you remount your /var filesystem as noexec under the lxc container, your binary no longer gets executed?

/var is part of the root partition in the Ubuntu lxd images, so I don't really have an easy way to test that, unfortunately.

Thanks!

Dan

Changed in cloud-init:
status: New → Incomplete
description: updated
Changed in cloud-init:
status: Incomplete → New
Dan Watkins (oddbloke) wrote :

Hi Nému, thanks for the update, it looks good! It looks like we understand the problem pretty well, so I've moved it to Triaged. Am I right in thinking that this is something that you're looking to work on?

Changed in cloud-init:
status: New → Triaged
importance: Undecided → Medium
C de-Avillez (hggdh2) wrote :

This is actually a problem whenever the system is installed with /var on its own filesystem, mounted 'noexec'. Although the *default* deployment on Ubuntu has /var under the root filesystem, this may not be the case on STIG-hardened installs, across distributions.

In our case, we see Azure being hit by this as well. All that is needed is:
* have /var on its own filesystem (and, for the paranoid, myself included, /var/tmp as well)

Please note that this also affects the running of /var/tmp/dhclient on startup.

Chad Smith (chad.smith) wrote :

Here is the corresponding dhclient bug related to this noexec issue in /var/tmp. https://bugs.launchpad.net/cloud-init/+bug/1962343

Chad Smith (chad.smith) wrote :

Per the suggestion/request in comment #2

> Should the software detect that the /var/lib/cloud directory is on a noexec filesystem and change the storage path for executable scripts to /etc/cloud/runtime-scripts in such cases?

I suggest that cloud-init shouldn't attempt to automatically put executable binaries somewhere under /etc, as the Filesystem Hierarchy Standard tells us binaries don't belong there: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s07.html#requirements3.

Instead, I think images which need a noexec /var filesystem out of the box should probably update the base cloud configuration in /etc/cloud/cloud.cfg by providing a config snippet in something like /etc/cloud/cloud.cfg.d/95-custom_cloud_dir.cfg:

system_info:
   paths:
      cloud_dir: /some/dir/on/a/filesystem/without/noexec

Some thoughts that come to mind could be /usr/lib/cloud-init/cloud or /usr/libexec/cloud-init/cloud depending on your Linux distribution.

We may be taking this /usr/lib*/ approach for the issues affecting /var/tmp/cloud-init/dhclient runs for LP: #1962343

Chad Smith (chad.smith) wrote :

Discovered today (LP: #1976564) that cloud_dir doesn't seem to be honored everywhere, so there will be a couple of corner cases where setting cloud_dir won't work at the moment, but we can resolve that bug shortly.

Alberto Contreras (aciba) wrote :

Fix committed solving #1976564

James Falcon (falcojr) wrote :
Changed in cloud-init:
status: Triaged → Expired