Please backport support for "close_range" syscall

Bug #1944436 reported by Steve Dodd
This bug affects 1 person
Affects: libseccomp (Ubuntu)
Status: New
Importance: Wishlist
Assigned to: Unassigned

Bug Description

Please backport support for the "close_range" syscall. It may be as simple as cherry-picking

https://github.com/seccomp/libseccomp/commit/01e5750e7c84bb14e5a5410c924bed519209db06

from upstream. I've hit problems running buildah in a systemd-nspawn container, but this will probably affect people trying to run modern code in other container systems as well, e.g. docker.

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: libseccomp2 2.5.1-1ubuntu1~20.04.1
ProcVersionSignature: Ubuntu 5.4.0-84.94-generic 5.4.133
Uname: Linux 5.4.0-84-generic x86_64
ApportVersion: 2.20.11-0ubuntu27.20
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: Xpra
Date: Tue Sep 21 15:10:54 2021
InstallationDate: Installed on 2017-01-08 (1717 days ago)
InstallationMedia: Xubuntu 16.04 LTS "Xenial Xerus" - Release amd64 (20160420.1)
SourcePackage: libseccomp
UpgradeStatus: Upgraded to focal on 2021-09-02 (19 days ago)

Steve Dodd (anarchetic) wrote :

https://github.com/seccomp/libseccomp/pull/322/ (or at least parts of it) probably required too.

Steve Dodd (anarchetic) wrote :

Can confirm rebuilding seccomp in focal with the relevant bits of the above two commits allows me to whitelist close_range in systemd-nspawn, solving my problem.

Alex Murray (alexmurray) wrote :

Can you please post a simple reproducer?

Steve Dodd (anarchetic) wrote :

It's not going to be simple I'm afraid, at least for the original problem! "scmp_sys_resolver close_range" will quickly test whether current seccomp has support for close_range (prints "-1" if not supported, "436" otherwise - at least on x86_64.) Ubuntu seccomp maintainers have been pretty happy SRUing this sort of thing before - it's a running problem, and the changes are trivial.

Outline of a reproducer for my original problem would be something like:

1. Download and unpack https://cloud-images.ubuntu.com/releases/focal/release/ubuntu-18.04-server-cloudimg-amd64-root.tar.xz
2. cd to the rootfs directory and start a container:
    rm etc/resolv.conf && cat /run/systemd/resolve/resolv.conf >etc/resolv.conf
    systemd-nspawn --system-call-filter=@keyring\ close_range
3. Add the podman/buildah PPA:
    echo "deb https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/xUbuntu_18.04/ /" | sudo tee /etc/apt/sources.list.d/devel:kubic:libcontainers:stable.list
    curl -L "https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/xUbuntu_18.04/Release.key" | sudo apt-key add -
    sudo apt-get update
4. Install buildah:
    apt-get -y install buildah
5. Add a new user and switch to it:
    adduser test --gecos "" --disabled-password
    sudo -u test -Hs
    cd ~
6. Create a scratch container and copy in busybox:
    ctr=$(buildah from scratch)
    buildah copy $ctr /bin/busybox
7. Check EOF handling:
    echo foo | buildah run $ctr /busybox cat

Without the patch, this should fail to return to the prompt, as the missing syscall seems to interfere with buildah's ability to process EOF; with the patch, it should return to the prompt.
In the event of failure, there should also be messages logged about "close_range" being unsupported.

Steve Dodd (anarchetic) wrote :

Still working out the kinks in the above, but here's a simpler one. It needs to run in an nspawn container again (steps 1-2 above). It should either succeed (no output) or print "Function not implemented"; without seccomp support, nspawn blocks the call and it prints "Operation not permitted" instead.

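#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
        /* 436 is close_range on x86_64; the arguments are first fd, last fd, flags */
        if(syscall(436, 0, 0, 0)) {
                perror("close_range");
                exit(1);
        }

        exit(0);
}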
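
For anyone who wants the three outcomes spelled out, here is a rough sketch of the same probe that maps the result to a verdict. The names cr_verdict, classify_close_range and probe_close_range are made up for illustration, and the fallback number 436 is just the x86_64 value quoted earlier in this report. The probe passes UINT_MAX as both ends of the range, which should not match any open descriptor, so nothing gets closed.

```c
#include <errno.h>
#include <limits.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Assumption: 436 is the close_range number quoted in this report for x86_64. */
#ifndef SYS_close_range
#define SYS_close_range 436
#endif

/* Hypothetical verdicts for the three cases discussed in this bug. */
enum cr_verdict {
        CR_SUPPORTED,    /* call succeeded */
        CR_NO_KERNEL,    /* ENOSYS: kernel has no close_range */
        CR_BLOCKED,      /* EPERM: most likely denied by a seccomp filter */
        CR_UNEXPECTED    /* anything else */
};

/* Map a raw syscall return value and errno to a verdict. */
enum cr_verdict classify_close_range(long rc, int err)
{
        if (rc == 0)
                return CR_SUPPORTED;
        if (err == ENOSYS)
                return CR_NO_KERNEL;
        if (err == EPERM)
                return CR_BLOCKED;
        return CR_UNEXPECTED;
}

/* Probe with first == last == UINT_MAX so no open descriptor is touched. */
enum cr_verdict probe_close_range(void)
{
        long rc = syscall(SYS_close_range, UINT_MAX, UINT_MAX, 0U);
        return classify_close_range(rc, errno);
}
```

Calling probe_close_range() from a small main() inside the nspawn container should give CR_BLOCKED with the unpatched libseccomp, and CR_NO_KERNEL or CR_SUPPORTED once the filter knows about the syscall, depending on the kernel.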

Steve Dodd (anarchetic) wrote :

I think the long test case in #5 now works. Note that later versions of crun have worked around the problem: https://github.com/containers/crun/pull/672

Still worth fixing, though, I think, as it is likely to cause further problems as more code starts to use close_range.

Paride Legovini (paride) wrote :

Reminds me of LP: #1943049. I mentioned this bug there, as we should make sure that close_range doesn't bring us back to that same issue.

Mathew Hodson (mhodson)
Changed in libseccomp (Ubuntu):
importance: Undecided → Wishlist
Dan Nicholson (danbnicholson) wrote :

This causes an issue when using glib's gspawn APIs under libseccomp on impish. glib uses close_range to set CLOEXEC on some open file descriptors and rightfully checks for ENOSYS. However, since seccomp doesn't know about the syscall, the error becomes EPERM, and glib skips setting CLOEXEC, assuming close_range hit a legitimate error. Eventually this means that the process run by gspawn hangs, because nothing closes the file descriptor as expected.
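
To make the failure mode concrete, here is a minimal sketch of a caller that falls back to per-descriptor fcntl when close_range is unavailable. This is not glib's code; set_cloexec_from is a hypothetical helper, the 436 fallback is the x86_64 number quoted earlier in this report, and the 65536 cap on the loop is a simplification for the sketch. It treats EPERM the same as ENOSYS, which is the case an outdated seccomp filter produces.

```c
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Assumption: 436 is the close_range number quoted in this report for x86_64. */
#ifndef SYS_close_range
#define SYS_close_range 436
#endif
#ifndef CLOSE_RANGE_CLOEXEC
#define CLOSE_RANGE_CLOEXEC (1U << 2)
#endif

/* Hypothetical helper: mark every open fd >= lowfd close-on-exec.
 * Returns 0 on success, -1 on an error that has no fallback. */
int set_cloexec_from(int lowfd)
{
        if (syscall(SYS_close_range, (unsigned int)lowfd, UINT_MAX,
                    CLOSE_RANGE_CLOEXEC) == 0)
                return 0;

        /* ENOSYS: no such syscall. EPERM: a filter that does not know the
         * syscall denied it. EINVAL: the flag is not accepted. In all three
         * cases fall back to the slow path instead of giving up. */
        if (errno != ENOSYS && errno != EPERM && errno != EINVAL)
                return -1;

        long max = sysconf(_SC_OPEN_MAX);
        if (max < 0 || max > 65536)
                max = 65536;    /* cap for the sketch only */

        for (int fd = lowfd; fd < max; fd++) {
                int flags = fcntl(fd, F_GETFD);
                if (flags >= 0)         /* skip descriptors that are not open */
                        (void)fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
        }
        return 0;
}
```

With a fallback like this, the child would still get its descriptors marked CLOEXEC under an old filter, though fixing libseccomp remains the cleaner solution since callers that only check for ENOSYS are behaving reasonably.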

Debian has been shipping this backported to bullseye for a while: https://salsa.debian.org/debian/libseccomp/-/blob/debian/bullseye/debian/patches/syscalls_add_close_range_syscall.patch
