REGRESSION: shiftfs lets sendfile fail with EINVAL
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Undecided | Unassigned |
linux (Ubuntu Focal) | Invalid | Undecided | Unassigned |
linux (Ubuntu Hirsute) | Fix Released | High | Unassigned |
linux (Ubuntu Impish) | Fix Released | Undecided | Unassigned |
linux-hwe-5.11 (Ubuntu) | Invalid | Undecided | Unassigned |
linux-hwe-5.11 (Ubuntu Focal) | Fix Released | Undecided | Unassigned |
linux-hwe-5.11 (Ubuntu Hirsute) | Invalid | Undecided | Unassigned |
linux-hwe-5.11 (Ubuntu Impish) | Invalid | Undecided | Unassigned |
linux-meta-hwe-5.11 (Ubuntu) | | | |
linux-meta-hwe-5.11 (Ubuntu Focal) | Invalid | Undecided | Unassigned |
linux-meta-hwe-5.11 (Ubuntu Hirsute) | Invalid | Undecided | Unassigned |
linux-meta-hwe-5.11 (Ubuntu Impish) | Invalid | Undecided | Unassigned |
Bug Description
With the 5.11 HWE kernel landing for Ubuntu 20.04, we noticed that the LXC tools we use in bionic containers as part of Anbox Cloud started to fail when executed on the 5.11 kernel.
A simple reproducer looks like this:
1. Run Ubuntu 20.04 with HWE kernel (linux-
2. Install LXD and enable shiftfs
$ snap install lxd
$ snap set lxd shiftfs.enable true
$ snap restart --reload lxd
3. Launch bionic container and run `lxc-info`
$ lxc launch ubuntu:b c0
$ lxc shell c0
c0$ apt update
c0$ apt install -y lxc-utils
root@c1:~# apt show lxc-utils | grep Version
Version: 3.0.3-0ubuntu1~
c0$ mkdir -p containers/test
c0$ touch containers/
c0$ lxc-info -P containers -n test
Failed to load config for test
Failure to retrieve information on containers:test
Looking into the failing `lxc-info` call with strace reveals:
...
memfd_create(
openat(AT_FDCWD, "containers/
sendfile(4, 5, NULL, 2147479552) = -1 EINVAL (Invalid argument)
...
LXC >= 4.0.0 no longer uses sendfile and is therefore not affected. Any other tool that uses sendfile is affected and will fail. Bionic is affected because the LXC version it ships, 3.0.3, still uses sendfile.
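For tools that cannot drop sendfile entirely, one possible userspace mitigation is to fall back to a plain read/write loop when sendfile is rejected. The following is a sketch under assumptions, not the change LXC 4.0.0 actually made: the function name `copy_fd` and the chunk size are hypothetical.

```python
import errno
import os


def copy_fd(out_fd: int, in_fd: int, chunk: int = 65536) -> int:
    """Copy everything from in_fd to out_fd; return the number of bytes copied.

    Hypothetical helper: tries sendfile() first and switches to read()/write()
    if the kernel rejects it with EINVAL or ENOSYS.
    """
    total = 0
    use_sendfile = True
    while True:
        if use_sendfile:
            try:
                # offset=None advances in_fd's own file position (Linux only).
                sent = os.sendfile(out_fd, in_fd, None, chunk)
            except OSError as exc:
                if exc.errno not in (errno.EINVAL, errno.ENOSYS):
                    raise
                # Nothing was consumed by the failed call, so the fallback
                # continues from the current position of in_fd.
                use_sendfile = False
                continue
        else:
            data = os.read(in_fd, chunk)
            sent = len(data)
            view = memoryview(data)
            while view:
                # os.write may write fewer bytes than requested.
                written = os.write(out_fd, view)
                view = view[written:]
        if sent == 0:
            return total
        total += sent
```

The fallback costs an extra copy through userspace, but it keeps the tool working on filesystems where the in-kernel path is unavailable.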
Disabling shiftfs makes things work again and can serve as a workaround to a certain degree, but it may not be applicable in all cases.
Further analysis with Christian (cbrauner) from the LXD team this morning showed that shiftfs is missing an implementation of the now-required splice_read handler in the file_operations structure. So whenever shiftfs is in use, every call to sendfile fails because of the missing implementation. The generic fallback handler was removed in the following upstream change: https://<email address hidden>/
Christian implemented a quick fix: https:/
As of today I don't know of any Anbox Cloud customer who is affected by this, as most of them run one of our cloud kernels. However, as soon as 5.11 rolls out to the cloud kernels, this will hit production systems and cause them to fail.
Changed in linux-meta-hwe-5.11 (Ubuntu):
status: New → Confirmed
Changed in linux-meta-hwe-5.11 (Ubuntu Focal):
status: New → Confirmed
Changed in linux (Ubuntu Hirsute):
status: New → Confirmed
Changed in linux-meta-hwe-5.11 (Ubuntu Hirsute):
status: New → Invalid
Changed in linux-meta-hwe-5.11 (Ubuntu Impish):
status: Confirmed → Invalid
Changed in linux (Ubuntu Focal):
status: New → Invalid
Changed in linux (Ubuntu Hirsute):
importance: Undecided → High
status: Confirmed → In Progress
Changed in linux (Ubuntu Impish):
status: New → In Progress
Changed in linux (Ubuntu Hirsute):
status: In Progress → Fix Committed
Changed in linux-hwe-5.11 (Ubuntu Hirsute):
status: New → Invalid
Changed in linux-hwe-5.11 (Ubuntu Impish):
status: New → Invalid
Changed in linux-hwe-5.11 (Ubuntu Focal):
status: New → In Progress
no longer affects: linux-meta-hwe-5.11 (Ubuntu)
Changed in linux-meta-hwe-5.11 (Ubuntu Focal):
status: Confirmed → Invalid
Changed in linux-hwe-5.11 (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-done-hirsute; removed: verification-needed-hirsute
tags: added: verification-done-focal; removed: verification-needed-focal
I've verified on my end that with the patch from https://paste.ubuntu.com/p/TPsjfCpnD5/ the sendfile syscall on top of shiftfs no longer fails.