systemd=229-4ubuntu21.8 use of fchownat fails on some systems (OpenVZ)

Bug #1804847 reported by Chris E on 2018-11-23
This bug affects 19 people
Affects: systemd (Ubuntu)
Importance: Undecided
Assigned to: Chris Coulson

Bug Description

The following description is taken from:

https://answers.launchpad.net/ubuntu/+source/systemd/+question/676237

Hello everyone,
I'm running 16.04 LTS on a virtual server which, I think, uses OpenVZ. After a recent reboot I found most of my services in a failed state. The reason, I guess, is shown by these log entries:

Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/elasticsearch failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/kopano failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/kopano failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/php failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/postgresql failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/redis failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/screen failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/utmp failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/systemd/netif failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/systemd/netif/links failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/systemd/netif/leases failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/log/journal failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/log/journal/bbad3a438f4b4fb49e5d0700bd5981e8 failed: Invalid argument
Nov 17 04:47:42 h2118376 systemd-tmpfiles[165]: fchownat() of /run/log/journal/bbad3a438f4b4fb49e5d0700bd5981e8/system.journal failed: Invalid argument

To verify I tried this:

/usr/lib/tmpfiles.d# SYSTEMD_LOG_LEVEL=debug systemd-tmpfiles --create elasticsearch.conf
Reading config file "elasticsearch.conf".
Running create action for entry d /var/run/elasticsearch
Found existing directory "/var/run/elasticsearch".
"/run/elasticsearch" has right mode 40755
chown "/run/elasticsearch" to 120.128
fchownat() of /run/elasticsearch failed: Invalid argument

I can manually chown the directories, e.g. "chown elasticsearch:elasticsearch /var/run/elasticsearch" and restart the service successfully. My suspicion is, this is related to an upgrade of systemd to 229-4ubuntu21.8.

At this point I don't know what to do.

I'm also confused about the version I have installed, which I thought was systemd 229. However, I looked at https://github.com/systemd/systemd/blob/v229/src/tmpfiles/tmpfiles.c and found that fchownat() is only used from version 238 onwards:
Tag v237 (and earlier, including 229):

/.../
                if (chown(fn,
                          i->uid_set ? i->uid : UID_INVALID,
                          i->gid_set ? i->gid : GID_INVALID) < 0)
                        return log_error_errno(errno, "chown(%s) failed: %m", path);
        }
/.../

Tag v238:

/.../
        if (fchownat(fd,
                     "",
                     i->uid_set ? i->uid : UID_INVALID,
                     i->gid_set ? i->gid : GID_INVALID,
                     AT_EMPTY_PATH) < 0)
                return log_error_errno(errno, "fchownat() of %s failed: %m", path);
/.../

Any help fixing this problem would be highly appreciated.
Many thanks,
Rafael

=== Notes ===
fchownat() was added to Linux in kernel 2.6.16;
library support was added to glibc in version 2.4.
Checking whether it is blocked/filtered/sandboxed, rather than unavailable.
glibc in bionic requires a minimum of Linux 3.2.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu):
status: New → Confirmed

SV (smvl) wrote :

A temporary solution is:

wget http://launchpadlibrarian.net/395488411/systemd_229-4ubuntu21.6_amd64.deb
wget http://launchpadlibrarian.net/395488401/libpam-systemd_229-4ubuntu21.6_amd64.deb
wget http://launchpadlibrarian.net/395488403/libsystemd0_229-4ubuntu21.6_amd64.deb

dpkg -i libpam-systemd_229-4ubuntu21.6_amd64.deb systemd_229-4ubuntu21.6_amd64.deb libsystemd0_229-4ubuntu21.6_amd64.deb
apt-mark hold systemd libsystemd0 libpam-systemd

Miha Ravšelj (mravselj) wrote :

I have the same issue.

My VPS runs 16.04.5 LTS (OpenVZ).

This bug has quite an impact, however. After a reboot the sshd server fails to start because the /var/run/sshd directory is not created, which in turn happens because the systemd-tmpfiles-setup service failed to start.

I have applied the temporary solution above and can confirm it works ok.

Peter Enns (nelway) wrote :

I had the same problem with SSH: after a reboot I was not able to SSH to the server.

My solution to the SSH problem was the following (through an emergency console session from the VPS provider's control panel):

sudo nano /etc/ssh/sshd_config

Find:
UsePrivilegeSeparation yes
and change to:
UsePrivilegeSeparation no

sudo service ssh restart

This worked for me, but I am not sure about the security implications of this change.

SV, thank you for the temporary solution, I tried it on my test server and it worked fine.

Stefan Andres (s-andres) wrote :

In its current version (systemd 229-4ubuntu21.9) this seems to have been made worse.

Files in /var/run/ now get `Too many levels of symbolic links`, while files in /run/ still get `fchownat() of /run/... failed: Invalid argument`.

root@sandres-test-systemd:/usr/lib/tmpfiles.d# SYSTEMD_LOG_LEVEL=debug systemd-tmpfiles --create zabbix-agent.conf
Reading config file "zabbix-agent.conf".
Running create action for entry d /var/run/zabbix
Failed to validate path /var/run/zabbix: Too many levels of symbolic links
root@sandres-test-systemd:/usr/lib/tmpfiles.d# cat zabbix-agent.conf
d /var/run/zabbix 0755 zabbix zabbix - -
root@sandres-test-systemd:/usr/lib/tmpfiles.d# SYSTEMD_LOG_LEVEL=debug systemd-tmpfiles --create zabbix-agent2.conf
Reading config file "zabbix-agent2.conf".
Running create action for entry d /run/zabbix
Found existing directory "/run/zabbix".
"/run/zabbix" has right mode 40755
chown "/run/zabbix" to 108.115
fchownat() of /run/zabbix failed: Invalid argument
root@sandres-test-systemd:/usr/lib/tmpfiles.d# cat zabbix-agent2.conf
d /run/zabbix 0755 zabbix zabbix - -
root@sandres-test-systemd:/usr/lib/tmpfiles.d#

Raimundas (iv123) wrote :

I have the same problem with my Ubuntu 16 VPS servers. I had systemd version 229-4ubuntu21.5 on my templates, and when it is updated to version 229-4ubuntu21.9 this erroneous behavior occurs. On other clients' VPSes I have seen systemd updated from version 229-4ubuntu21.8 to 229-4ubuntu21.9. Thus it seems to be a problem with the November 19 update for systemd - https://launchpad.net/ubuntu/+source/systemd/229-4ubuntu21.9

Here is an alternative "fix" you can use until this problem is properly resolved (just replace "21.5" with whichever version was working for you before the update):

cd /home/
wget launchpadlibrarian.net/392626140/libsystemd0_229-4ubuntu21.5_amd64.deb
wget launchpadlibrarian.net/392626138/libpam-systemd_229-4ubuntu21.5_amd64.deb
wget launchpadlibrarian.net/392626148/systemd_229-4ubuntu21.5_amd64.deb
dpkg -i libpam-systemd_229-4ubuntu21.5_amd64.deb systemd_229-4ubuntu21.5_amd64.deb libsystemd0_229-4ubuntu21.5_amd64.deb
apt-mark hold systemd libsystemd0 libpam-systemd
reboot

Franz Seidl (franz-s) wrote :

I have the same problem with my VPS at Strato; they use Virtuozzo.

Dimitri John Ledkov (xnox) wrote :

Could you please post output of:

   uname -a

information type: Public → Public Security
tags: added: regression-update
description: updated
description: updated

Seth Arnold (seth-arnold) wrote :

I'm also curious what filesystems are showing this issue. If you're affected can you please run this command and include the results here?

mount | grep run

Thanks

Peter Enns (nelway) wrote :

Output of uname -a:

Linux newpnc 2.6.32-042stab132.1 #1 SMP Wed Jul 11 13:51:30 MSK 2018 x86_64 x86_64 x86_64 GNU/Linux

Output of mount | grep run:

tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
none on /run/shm type tmpfs (rw,relatime)

Steve Langasek (vorlon) wrote :

glibc in Ubuntu 16.04 does have MIN_KERNEL_SUPPORTED := 2.6.32 in debian/sysdeps/amd64.mk precisely to support running Ubuntu userspace on RHEL6 kernels. This is not a supported configuration for a fully booted Ubuntu 16.04 install, because the minimum kernel version for udev itself is higher than this (iirc, Linux 3.2) - but since this is running as a container, udev isn't a concern.

So it seems reasonable to expect this not to regress as part of a security update of systemd on Ubuntu 16.04.

Peter Enns (nelway) wrote :

Ah ok, thank you for the information.

Since it is an OpenVZ container, I assume my only option is to apply the temporary solution above, then start to look for a VPS provider that supports a more current kernel version (or one that would let me upgrade the kernel).

Thanks again.

Peter

Jens Zahner (jens-it-zahner) wrote :

I also ran into the problem but solved it by downgrading systemd:

> apt install systemd=229-4ubuntu4 libsystemd0=229-4ubuntu4
> apt-mark hold systemd

Dimitri John Ledkov (xnox) wrote :

@vorlon

fchownat() was added to Linux in kernel 2.6.16, which predates 2.6.32, the kernel version one of the affected users is reporting. Thus it seems that fchownat() is broken in some other way.
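
The comparison made here can be sketched in C. This is only an illustration: `kernel_at_least` is a hypothetical helper, not part of systemd or glibc, and it assumes a release string in the form printed by `uname -r`. Vendor kernels may backport or omit features, so a version check like this is at best a first hint.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if `release` (as printed by `uname -r`)
 * is at least major.minor.patch, 0 if it is older, and -1 if the string
 * cannot be parsed.
 */
int kernel_at_least(const char *release, int major, int minor, int patch)
{
    int a = 0, b = 0, c = 0;

    assert(release != NULL);

    /* Parsing stops at the first non-numeric part, so a vendor suffix
     * such as "-042stab132.1" is ignored. At least "X.Y" is required. */
    if (sscanf(release, "%d.%d.%d", &a, &b, &c) < 2)
        return -1;

    if (a != major)
        return a > major;
    if (b != minor)
        return b > minor;
    return c >= patch;
}
```

For the release string reported in this thread, `kernel_at_least("2.6.32-042stab132.1", 2, 6, 16)` returns 1, which is why a missing fchownat() syscall is an unlikely explanation.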

Changed in systemd (Ubuntu):
assignee: nobody → Chris Coulson (chrisccoulson)

Chris Coulson (chrisccoulson) wrote :

We're just going to publish a revert of the CVE-2018-6954 fixes for 16.04 before investigating this further. As far as I can tell, this shouldn't be an issue in bionic where MIN_KERNEL_SUPPORTED is 3.2 in glibc.

Chris Coulson (chrisccoulson) wrote :

The issue is that O_PATH doesn't work from these containers:

2025 11:00:08 openat(4, "run", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = -1 ELOOP (Too many levels of symbolic links)

Apparently, O_PATH was added in 2.6.39, so this makes sense now.
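
A rough way to check a given container for this behaviour is to repeat the two calls that the patched systemd-tmpfiles relies on. The sketch below is a probe written for illustration, not code from systemd: `probe_empty_path_chown` is a hypothetical name, it passes -1 for both IDs so that ownership is left unchanged, and the fallback flag values are assumed to be the ones used on x86 and ARM Linux.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_PATH and AT_EMPTY_PATH in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks in case the headers were already included without _GNU_SOURCE.
 * Assumption: these are the x86/ARM Linux values. */
#ifndef O_PATH
#define O_PATH 010000000
#endif
#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000
#endif

/*
 * Hypothetical probe: opens `path` the way the strace line above shows,
 * then calls fchownat() on the resulting descriptor with an empty path.
 * Returns 0 if both calls succeed, otherwise the errno of the call
 * that failed.
 */
int probe_empty_path_chown(const char *path)
{
    int fd, err = 0;

    assert(path != NULL);

    fd = open(path, O_RDONLY | O_NOFOLLOW | O_CLOEXEC | O_PATH);
    if (fd < 0)
        return errno;

    /* (uid_t) -1 and (gid_t) -1 mean "do not change owner or group". */
    if (fchownat(fd, "", (uid_t) -1, (gid_t) -1, AT_EMPTY_PATH) < 0)
        err = errno;

    close(fd);
    return err;
}
```

A return value of 0 for a path such as /run suggests that both O_PATH and AT_EMPTY_PATH are usable there. On the kernels described in this report one would expect EINVAL or ELOOP instead, matching the "Invalid argument" and "Too many levels of symbolic links" messages above.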

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 229-4ubuntu21.10

---------------
systemd (229-4ubuntu21.10) xenial-security; urgency=medium

  [ Chris Coulson ]
  * Revert the fixes for CVE-2018-6954 for causing a regression when running
    in a container on old kernels (LP: #1804847)
    - update debian/patches/series

  [ Balint Reczey ]
  * Fix LP: #1803391 - Don't always trigger systemctl stop of udev service
    and sockets
    - update debian/udev.postinst

 -- Chris Coulson <email address hidden> Tue, 27 Nov 2018 11:10:48 +0000

Changed in systemd (Ubuntu):
status: Confirmed → Fix Released

Vasily Averin (vvs) wrote :

Dear All,
my name is Vasily Averin, I'm the maintainer of the RHEL6-based OpenVZ kernel.
We have backported the required patches and are going to release an updated kernel with support for openat(O_PATH|O_NOFOLLOW) on symlinks (internal bug ID PSBM-90060) and for fchownat() with an empty path (PSBM-89993).
The fixed kernel 2.6.32-042stab134.7 is under testing now and we're going to publish it in a few days.

Thank you,
   Vasily Averin

Dimitri John Ledkov (xnox) wrote :

@vvs Spasibo!

Vasily Averin (vvs) wrote :

The updated OpenVZ 6 kernel has been released:
https://wiki.openvz.org/Download/kernel/rhel6/042stab134.7

We are very grateful to the Ubuntu team for reverting the patches specially for OpenVZ.

For affected hosters: OpenVZ 6 is great but it is really old,
and similar incidents can happen again and again.
Please think about switching to the RHEL7-based OpenVZ 7.

Thank you,
   Vasily Averin

Rene Meier (meier.rene) wrote :

Could someone please have a look at https://launchpad.net/bugs/1804603, because the changes made to fix CVE-2018-6954 also break systemd-tmpfiles in a different way.

Carlos Garcia (lwrcase) wrote :

Just updated to 229-4ubuntu21.15 today and this issue is back.

Seth Arnold (seth-arnold) wrote :

Hello Carlos, the OpenVZ team was kind enough to backport the necessary kernel feature in November last year. After a month and a half we decided that enough time had elapsed for OpenVZ-based service providers to install new kernels. If your provider has not yet rebooted into a new OpenVZ kernel I suggest you ask them to do so.

We can't reasonably hold off providing this security update to our 16.04 LTS users any longer.

Thanks

Andreas Kar (thexmanxyz) wrote :

Hello Seth, I'm facing a similar issue that is related to this bug, so I'm posting to this report because it has the highest heat among the issues concerning systemd and systemd-tmpfiles (services do not start correctly). I don't think this is exclusively related to OpenVZ, because I'm affected and I don't use OpenVZ.

I have collected a few reports which also relate to the systemd changes which result in services not starting on boot up:

https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1811580
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1804603
https://forum.armbian.com/topic/8852-ssh-doesnt-work-on-orange-pi-zero/

I'm on Armbian on an OrangePI One and I'm also affected by services that won't start anymore. The same was the case with systemd 229-4ubuntu21.9 and now with 229-4ubuntu21.15. All other systemd versions worked without any issues. For the sake of completeness I will also attach my system information and the terminal output that relates to the issue.

Distribution / Kernel
Linux xxx 3.4.113-sun8i #2 SMP PREEMPT Sat Jan 12 15:54:26 CET 2019 armv7l armv7l armv7l GNU/Linux
Distributor ID: Ubuntu
Description: Ubuntu 16.04.5 LTS
Release: 16.04
Codename: xenial

Output of journalctl -b 0 -u systemd-tmpfiles-setup.service

Jän 14 11:01:51 xxx systemd[1]: Starting Create Volatile Files and Directories...
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: [/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /var: Bad file descriptor
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /var/log: Bad file descriptor
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /var/lib: Bad file descriptor
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /run/sendsigs.omit.d: Bad file descriptor
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /home: Bad file descriptor
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /srv: Bad file descriptor
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /run/lock/subsys: Bad file descriptor
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /var/run/lighttpd: Bad file descriptor
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /var/cache: Bad file descriptor
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /var/cache/man: Bad file descriptor
Jän 14 11:01:51 xxx systemd-tmpfiles[581]: Failed to validate path /run/openvpn: Bad file descriptor
Jän 14 11:01:51 xxx systemd[1]: systemd-tmpfiles-setup.service: Main process exited, code=exited, status=1/FAILURE
Jän 14 11:01:51 xxx systemd[1]: Failed to start Create Volatile Files and Directories.
Jän 14 11:01:51 xxx systemd[1]: systemd-tmpfiles-setup.service: Unit entered failed state.
Jän 14 11:01:51 xxx systemd[1]: systemd-tmpfiles-setup.service: Failed with result 'exit-code'.

Affected services:

# dnsmasq.service loaded failed failed dnsmasq - A lightweight DHCP and caching DNS server
# lighttpd.service loaded failed failed Lighttpd Daemon
# <email address hidden> loaded failed failed OpenVPN connection to...

Andreas Kar (thexmanxyz) wrote :

Please forgive me if I don't get the overall picture right, because I'm not a professional in kernel development. Moreover, I have no insight into what is going on with the systemd changes, who is actually affected, and why I'm affected. The Armbian forums state that the problem originates in Ubuntu, so I decided to share my information here.

As I see it right now, I'm limited to either upgrading the Ubuntu release (reinstalling my machine) or holding systemd at a revision that does not have the issue. I would really appreciate more information on whether this will ever get fixed again, or what options we have in the long run to get past the issue, because it doesn't sound like the problem will ever be fixed again. Sorry again for my stupid question, but there might be more people affected by the same problem who want to know a resolution.

Andreas Kar (thexmanxyz) wrote :

Upgrading to Linux pan 4.19.13-sunxi #5.70 SMP Sat Jan 12 15:43:21 CET 2019 armv7l armv7l armv7l GNU/Linux solved the issue for me.
