2021-01-06 20:54:40 |
Seth Forshee |
description |
We use unprivileged user namespaces with overlay mounts for containers. After recently upgrading our Focal kernels to 5.4.0-51.56 this breaks: files can no longer be accessed through the overlay mount in the container. This is very likely caused by some of the patches that were added in relation to CVE-2020-16120.
The following commands reproduce the problem when executed as an arbitrary non-root user:
mkdir /tmp/test /tmp/test/upper /tmp/test/work /tmp/test/usr
unshare -m -U -r /bin/sh -c "mount -t overlay none /tmp/test/usr -o lowerdir=/usr,upperdir=/tmp/test/upper,workdir=/tmp/test/work; ls -l /tmp/test/usr/bin/id; file /tmp/test/usr/bin/id; /tmp/test/usr/bin/id"
The output when broken is this:
-rwxr-xr-x 1 nobody nogroup 47480 Sep 5 2019 /tmp/test/usr/bin/id
/tmp/test/usr/bin/id: executable, regular file, no read permission
/bin/sh: 1: /tmp/test/usr/bin/id: Operation not permitted
The expected output is this:
-rwxr-xr-x 1 nobody nogroup 43224 Jan 18 2018 /tmp/test/usr/bin/id
/tmp/test/usr/bin/id: ELF 64-bit LSB shared object, ...
uid=0(root) gid=0(root) groups=0(root),65534(nogroup)
These commands create a user namespace and within it mount an overlay of /usr to /tmp/test/usr and then try to access something in it.
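The same steps can be wrapped in a small script that also labels the result. This is only a sketch built from the commands and outputs quoted in this report; the names TEST_DIR, run_overlay_check and classify_result are hypothetical and do not come from any attached script. Note that a mount failure reporting "Operation not permitted" (for example on a kernel that does not allow overlay mounts in user namespaces at all) would also be labelled "broken".

```shell
#!/bin/sh
# Sketch of a reproducer based on the commands quoted in this report.
# TEST_DIR, run_overlay_check and classify_result are hypothetical names.

TEST_DIR="${TEST_DIR:-/tmp/test}"

# Mount the overlay inside a new user and mount namespace, access one file
# through it, and print the combined stdout/stderr of all steps.
run_overlay_check() {
    mkdir -p "$TEST_DIR/upper" "$TEST_DIR/work" "$TEST_DIR/usr"
    unshare -m -U -r /bin/sh -c "
        mount -t overlay none '$TEST_DIR/usr' -o 'lowerdir=/usr,upperdir=$TEST_DIR/upper,workdir=$TEST_DIR/work'
        ls -l '$TEST_DIR/usr/bin/id'
        file '$TEST_DIR/usr/bin/id'
        '$TEST_DIR/usr/bin/id'
    " 2>&1
}

# Label captured output using the strings shown in this report:
# "broken" for the failing output, "ok" for the expected output.
classify_result() {
    case "$1" in
        *"Operation not permitted"*|*"no read permission"*) echo broken ;;
        *"uid=0(root)"*) echo ok ;;
        *) echo unknown ;;
    esac
}

# Only touch the system when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
    classify_result "$(run_overlay_check)"
fi
```

Run as a non-root user with the --run argument, it prints a single word; without the argument it only defines the two functions.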
This works on Ubuntu Bionic with kernel 4.15.0-121.123 (note that this already includes a fix for CVE-2020-16120) and on kernel 5.4.0-48.52 but is broken on kernel 5.4.0-51.56, no matter whether on Bionic or Focal.
So I strongly suspect that the cause is not the actual security fixes for CVE-2020-16120 but one of the following two patches, which according to the changelogs were applied in the same revision but only to 5.4, not to 4.15:
ovl: call secutiry hook in ovl_real_ioctl()
ovl: check permission to open real file
The mail with the announcement (https://www.openwall.com/lists/oss-security/2020/10/13/6) lists these two commits as separate from the actual security fixes ("may be desired or necessary").
Is it possible to revert these two changes or fix them such that our unprivileged containers work again on Ubuntu kernel 5.4? Or is there a workaround that I can add to my container solution such that this use case works again?
ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-image-5.4.0-51-generic 5.4.0-51.56
ProcVersionSignature: Ubuntu 5.4.0-51.56-generic 5.4.65
Uname: Linux 5.4.0-51-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Oct 14 04:48 seq
crw-rw---- 1 root audio 116, 33 Oct 14 04:48 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu27.9
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: [Errno 2] No such file or directory: 'fuser'
CasperMD5CheckResult: skip
CurrentDmesg: Error: command ['dmesg'] failed with exit code 1: dmesg: read kernel buffer failed: Operation not permitted
Date: Fri Oct 16 13:02:32 2020
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd QEMU USB Tablet
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Lsusb-t:
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
|__ Port 1: Dev 2, If 0, Class=Human Interface Device, Driver=usbhid, 12M
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
PciMultimedia:
ProcEnviron:
TERM=screen-256color
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=C.UTF-8
SHELL=/bin/bash
ProcFB: 0 bochs-drmdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-51-generic root=PARTUUID=59ea2f51-599c-49f2-b9b3-77197e333865 ro console=tty1 console=ttyS0
RelatedPackageVersions:
linux-restricted-modules-5.4.0-51-generic N/A
linux-backports-modules-5.4.0-51-generic N/A
linux-firmware 1.187.3
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-5.0
dmi.modalias: dmi:bvnSeaBIOS:bvrrel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-5.0:cvnQEMU:ct1:cvrpc-i440fx-5.0:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.version: pc-i440fx-5.0
dmi.sys.vendor: QEMU |
SRU Justification
[Impact]
The backports to fix CVE-2020-16120 introduced a regression for overlay mounts within user namespaces. Files with ownership outside of the user namespace can no longer be accessed, even if allowed by both DAC and MAC.
This issue is fixed by the following upstream commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b6650dab404c701d7fe08a108b746542a934da84
This commit relaxes the check: when the overlay filesystem mounter is not privileged with respect to the underlying inode, O_NOATIME is removed from the open flags for the file in the lower filesystem, rather than the open failing as happens now.
[Test Case]
The attached lp1900141.sh script reproduces the issue.
[Where problems could occur]
For the most part this patch restores previous behavior of allowing access to these files while keeping the enhanced permission checks towards the lower filesystem to help prevent unauthorized access to file data in the lower filesystem. The one difference in behavior is that files in the lower filesystem may no longer be opened with the O_NOATIME flag, potentially causing atime updates for these files which were not happening before. If any software expects O_NOATIME behavior in this situation then it could cause problems for that software. However, the correct behavior is that only the inode owner or a process with CAP_FOWNER towards the inode owner is allowed to open with O_NOATIME (as documented in open(2)).
|
|