LXD containers using shiftfs on ZFS or TMPFS broken on 5.15.0-48.54

Bug #1990849 reported by Thomas Parrott
This bug affects 7 people
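| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| linux (Ubuntu) | Confirmed | Undecided | Unassigned | |
| linux (Ubuntu Jammy) | Confirmed | Undecided | Unassigned | |
| linux (Ubuntu Kinetic) | Confirmed | Undecided | Unassigned | |
| zfs-linux (Ubuntu) | Confirmed | Undecided | Unassigned | |
| zfs-linux (Ubuntu Jammy) | Confirmed | Undecided | Unassigned | |
| zfs-linux (Ubuntu Kinetic) | Confirmed | Undecided | Unassigned | |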

Bug Description

Since 5.15.0-48.54, LXD containers using shiftfs on top of ZFS or TMPFS are broken.

Reproducer steps:

```
sudo snap install lxd
sudo snap set lxd shiftfs.enable=true
sudo lxd init --auto
lxc storage create zfs zfs
lxc launch images:ubuntu/jammy c1 -s zfs
lxc exec c1 -- touch /root/foo
touch: cannot touch '/root/foo': Value too large for defined data type
```

The expected result can be achieved by disabling shiftfs:

```
sudo snap set lxd shiftfs.enable=false
sudo systemctl reload snap.lxd.daemon
lxc launch images:ubuntu/jammy c2 -s zfs
lxc exec c2 -- touch /root/foo
lxc exec c2 -- ls -la /root/foo
-rw-r--r-- 1 root root 0 Sep 26 14:00 /root/foo
```

Kernel 5.15.0-47-generic does not exhibit this issue.
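
When checking several hosts, the comparison above can be scripted. The following is a minimal sketch, assuming the release string has the usual Ubuntu `uname -r` shape such as `5.15.0-48-generic`. The names `parse_release`, `reported_broken` and `REPORTED` are hypothetical, and the table encodes only the two kernels named in this report, so any other version yields `None` (unknown) rather than a guess.

```python
import platform
import re
from typing import Optional, Tuple

# Only the two kernels named in this report; nothing is assumed about others.
REPORTED = {
    (5, 15, 0, 47): False,  # reported as working
    (5, 15, 0, 48): True,   # reported as broken (5.15.0-48.54)
}


def parse_release(release: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, patch, abi) from e.g. '5.15.0-48-generic'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def reported_broken(release: str) -> Optional[bool]:
    """Return True/False for kernels named in the report, None otherwise."""
    key = parse_release(release)
    if key is None:
        return None
    return REPORTED.get(key)


if __name__ == "__main__":
    current = platform.release()
    print(current, reported_broken(current))
```

Returning `None` for unlisted kernels keeps the helper from implying anything about releases this report does not mention.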

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-5.15.0-48-generic 5.15.0-48.54
ProcVersionSignature: Ubuntu 5.15.0-48.54-generic 5.15.53
Uname: Linux 5.15.0-48-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu82.1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: user 2240 F.... pulseaudio
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Mon Sep 26 14:55:52 2022
InstallationDate: Installed on 2022-03-04 (205 days ago)
InstallationMedia: Ubuntu 22.04 LTS "Jammy Jellyfish" - Alpha amd64 (20220228)
MachineType: LENOVO 20R1000RUS
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.15.0-48-generic root=/dev/mapper/vgubuntu-root ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-5.15.0-48-generic N/A
 linux-backports-modules-5.15.0-48-generic N/A
 linux-firmware 20220329.git681281e4-0ubuntu3.5
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/15/2021
dmi.bios.release: 1.34
dmi.bios.vendor: LENOVO
dmi.bios.version: N2QET40W(1.34 )
dmi.board.asset.tag: Not Available
dmi.board.name: 20R1000RUS
dmi.board.vendor: LENOVO
dmi.board.version: SDK0J40697 WIN
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: None
dmi.ec.firmware.release: 1.15
dmi.modalias: dmi:bvnLENOVO:bvrN2QET40W(1.34):bd04/15/2021:br1.34:efr1.15:svnLENOVO:pn20R1000RUS:pvrThinkPadX1Carbon7th:rvnLENOVO:rn20R1000RUS:rvrSDK0J40697WIN:cvnLENOVO:ct10:cvrNone:skuLENOVO_MT_20R1_BU_Think_FM_ThinkPadX1Carbon7th:
dmi.product.family: ThinkPad X1 Carbon 7th
dmi.product.name: 20R1000RUS
dmi.product.sku: LENOVO_MT_20R1_BU_Think_FM_ThinkPad X1 Carbon 7th
dmi.product.version: ThinkPad X1 Carbon 7th
dmi.sys.vendor: LENOVO

Thomas Parrott (tomparrott) wrote :

Added `sudo systemctl reload snap.lxd.daemon` after disabling shiftfs.

description: updated
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Dimitri John Ledkov (xnox) wrote :

Suspect is:

  * refactoring of overlayfs fix to properly support shiftfs (LP: #1983640)
    - SAUCE: overlayfs: remove CONFIG_AUFS_FS dependency

tags: added: regression-update
Dimitri John Ledkov (xnox) wrote :

I like the reproducer. I wonder whether I should add it to zfs-linux as a test case, or maybe to the kernel.

Dimitri John Ledkov (xnox) wrote :

(The zfs-linux portion is to autopkgtest the reproducer, because this has been a pain multiple times now.)

Thomas Parrott (tomparrott) wrote :

This bug also breaks `lxc file push` functionality.

Simon Fels (morphis) wrote :

Anbox Cloud is affected by this too. We used shiftfs by default until our 1.14 release in May 2022. According to the metrics we have there are still a few users around with the 1.13 release which enables shiftfs on ZFS.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in zfs-linux (Ubuntu):
status: New → Confirmed
Tamas Papp (tomposmiko) wrote :

Is there any plan or estimate for the fix?

Aleksandr Mikhalitsyn (mihalicyn) wrote :

Dear friends,

I'm currently working to understand what is happening here.

# strace touch b
execve("/usr/bin/touch", ["touch", "b"], 0x7ffd29f848a8 /* 7 vars */) = 0
brk(NULL) = 0x56007dba6000
arch_prctl(0x3001 /* ARCH_??? */, 0x7fff436afcb0) = -1 EINVAL (Invalid argument)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6bb5d2d000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
<...>
openat(AT_FDCWD, "b", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = -1 EOVERFLOW (Value too large for defined data type)
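
The failing call in the trace can be probed without strace. The following is a minimal sketch that issues the same `open` flags as `touch` does above and reports the errno name. The function `probe_create` and the default probe file name are hypothetical and not part of any LXD or kernel tooling.

```python
import errno
import os
import tempfile


def probe_create(directory: str, name: str = ".shiftfs-probe") -> str:
    """Try to create a file the way touch(1) does and report the outcome.

    Returns "ok" on success, otherwise the symbolic errno name
    (for example "EOVERFLOW" on an affected mount).
    """
    path = os.path.join(directory, name)
    existed = os.path.exists(path)
    # Same flags as in the strace output above.
    flags = os.O_WRONLY | os.O_CREAT | os.O_NOCTTY | os.O_NONBLOCK
    try:
        fd = os.open(path, flags, 0o666)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    if not existed:
        # Remove the probe file only if this call created it.
        os.unlink(path)
    return "ok"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(probe_create(tmp))
```

Run inside the container against a directory such as `/root`, it would be expected to return `EOVERFLOW` on an affected kernel, matching the trace.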

From ftrace/perf trace:
  877.582556 | 0) | /* do_sys_openat2__return: (__x64_sys_openat+0x55/0x90 <- do_sys_openat2) arg1=0xffffffffffffffb5 */

0xffffffffffffffb5 is the two's complement representation of -75, that is, -EOVERFLOW.
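
The conversion can be checked with a few lines of arithmetic. This is a small sketch, assuming a 64-bit register value as shown in the trace. `decode_return` is a hypothetical helper name, and the errno lookup depends on the platform's errno table (on Linux x86-64, 75 is EOVERFLOW).

```python
import errno


def decode_return(raw: int, bits: int = 64) -> int:
    """Reinterpret an unsigned register value as a signed integer."""
    if raw >= 1 << (bits - 1):
        raw -= 1 << bits
    return raw


if __name__ == "__main__":
    ret = decode_return(0xFFFFFFFFFFFFFFB5)
    print(ret)  # -75
    # Symbolic name from the local errno table; EOVERFLOW on Linux x86-64.
    print(errno.errorcode.get(-ret))
```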

I've finally traced it to:
static inline int may_create(struct user_namespace *mnt_userns,
                             struct inode *dir, struct dentry *child)
{
        audit_inode_child(dir, child, AUDIT_TYPE_CHILD_CREATE);
        if (child->d_inode)
                return -EEXIST;
        if (IS_DEADDIR(dir))
                return -ENOENT;
        if (!fsuidgid_has_mapping(dir->i_sb, mnt_userns))
                return -EOVERFLOW; /* <<< the error appears to come from here */
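
To illustrate the kind of check that fails here, the following is a toy user-space model, not kernel code. It assumes an ID map is a list of extents and that an ID "has a mapping" when it falls inside the kernel-side range of some extent. `Extent`, `has_mapping` and `may_create_model` are hypothetical names, and the model makes no claim about which mapping shiftfs actually passes to the kernel function.

```python
from typing import NamedTuple, Sequence

EOVERFLOW = 75  # numeric value on Linux x86-64


class Extent(NamedTuple):
    inside: int   # first ID as seen inside the namespace
    outside: int  # first kernel-side ID the extent maps to
    count: int    # number of consecutive IDs covered


def has_mapping(extents: Sequence[Extent], kernel_id: int) -> bool:
    """True if kernel_id lies in the kernel-side range of any extent."""
    return any(e.outside <= kernel_id < e.outside + e.count for e in extents)


def may_create_model(uid_map: Sequence[Extent], gid_map: Sequence[Extent],
                     fsuid: int, fsgid: int) -> int:
    """Return 0 if both IDs are representable, else -EOVERFLOW."""
    if not (has_mapping(uid_map, fsuid) and has_mapping(gid_map, fsgid)):
        return -EOVERFLOW
    return 0


if __name__ == "__main__":
    # Illustrative map: namespace IDs 0..65535 backed by 1000000..1065535.
    id_map = [Extent(inside=0, outside=1000000, count=65536)]
    print(may_create_model(id_map, id_map, 1000000, 1000000))  # 0
    print(may_create_model(id_map, id_map, 0, 0))              # -75
```

In this model, a caller whose filesystem UID or GID cannot be represented in the filesystem's namespace gets -EOVERFLOW on every create, which is consistent with the permanent failure seen in the reproducer.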

My suspicion is that the problem is caused by these two changes, which are potentially incompatible with shiftfs:
- fs: tweak fsuidgid_has_mapping()
- fs: support mapped mounts of mapped filesystems
(from the changelog at https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy/commit/?h=Ubuntu-5.15.0-48.54&id=941bdeb5ab2258758fce5f4d06296da98bfa7e82)

Will continue investigation.

Aleksandr Mikhalitsyn (mihalicyn) wrote :

This regression is not related to the "SAUCE: overlayfs: remove CONFIG_AUFS_FS dependency" change.

Aleksandr Mikhalitsyn (mihalicyn) wrote :
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "0001-UBUNTU-SAUCE-shiftfs-fix-permanent-EOVERFLOW-inside-.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Stefan Bader (smb)
no longer affects: linux (Ubuntu Kinetic)
no longer affects: zfs-linux (Ubuntu Kinetic)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Jammy):
status: New → Confirmed
Changed in linux (Ubuntu Kinetic):
status: New → Confirmed
Changed in zfs-linux (Ubuntu Jammy):
status: New → Confirmed
Changed in zfs-linux (Ubuntu Kinetic):
status: New → Confirmed