'shifted' (shiftfs) FS mount became inconsistent with host FS; resolved by dropping caches

Bug #1879196 reported by James Troup
This bug affects 1 person
Affects               Status       Importance  Assigned to        Milestone
linux (Ubuntu)        In Progress  Medium      Christian Brauner
linux (Ubuntu Eoan)   Won't Fix    Medium      Unassigned
linux (Ubuntu Focal)  Triaged      Medium      Unassigned

Bug Description

On Ubuntu 20.04 with linux-image-5.4.0-30-generic from the Canonical
Kernel Team's proposed PPA, I ran into the following problem with
using a shiftfs 'shifted' ext4 FS mount inside a LXD container.

On the host, I created a file (in emacs) that was in no way special
(single line text file):

🙂 james@malefic:~/projects/ethq/deb$ stat ethq-0.6.1~git2020517/debian/ethq.install
  File: ethq-0.6.1~git2020517/debian/ethq.install
  Size: 15 Blocks: 8 IO Block: 4096 regular file
Device: fd01h/64769d Inode: 7085913 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 1000/ james) Gid: ( 1000/ james)
Access: 2020-05-17 22:48:36.274528130 +0100
Modify: 2020-05-17 22:37:17.676232019 +0100
Change: 2020-05-17 22:37:17.676232019 +0100
 Birth: -
🙂 james@malefic:~/projects/ethq/deb$ stat ethq-0.6.1~git2020517/debian/rules

But in the container, I saw this:

ubuntu@ethq-build:~/ethq/deb/ethq-0.6.1~git2020517$ ls -l debian/
ls: cannot access 'debian/ethq.install': No such file or directory
total 20
-rw-rw-r-- 1 ubuntu ubuntu 150 May 17 21:25 changelog
-rw-r--r-- 1 ubuntu ubuntu 2 May 17 21:36 compat
-rw-rw-r-- 1 ubuntu ubuntu 514 May 17 21:14 control
-rw-rw-r-- 1 ubuntu ubuntu 720 May 17 21:20 copyright
-????????? ? ? ? ? ? ethq.install
-rwxr-xr-x 1 ubuntu ubuntu 30 May 17 21:35 rules
ubuntu@ethq-build:~/ethq/deb/ethq-0.6.1~git2020517$ stat debian/ethq.install
stat: cannot stat 'debian/ethq.install': No such file or directory
ubuntu@ethq-build:~/ethq/deb/ethq-0.6.1~git2020517$
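
The symptom above (a name that ls lists but stat cannot reach) can be checked for mechanically. A minimal sketch, assuming GNU coreutils inside the container; the function name find_unstatable is hypothetical and not part of any shiftfs or LXD tooling:

```shell
#!/bin/sh
# Hypothetical helper: print every entry of a directory whose name is
# returned by the directory listing but which cannot be stat'ed.
# On a healthy filesystem this prints nothing.
find_unstatable() {
    dir="${1:-.}"
    # Plain `ls -A` only reads names; it is `ls -l` that needs stat().
    # Caveat: names containing newlines are not handled by this loop.
    ls -A "$dir" | while IFS= read -r name; do
        if ! stat -- "$dir/$name" >/dev/null 2>&1; then
            printf '%s\n' "$dir/$name"
        fi
    done
}
```

Running `find_unstatable debian` in the container at the point shown above would be expected to print debian/ethq.install, though that has not been verified against the affected kernel.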

On a suggestion from Stephane Graber, I tried running:

  echo 3 > /proc/sys/vm/drop_caches

Which seemed to resolve the problem:

ubuntu@ethq-build:~/ethq/deb/ethq-0.6.1~git2020517$ stat debian/ethq.install
  File: 'debian/ethq.install'
  Size: 15 Blocks: 8 IO Block: 4096 regular file
Device: fd01h/64769d Inode: 7085913 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 1000/ ubuntu) Gid: ( 1000/ ubuntu)
Access: 2020-05-17 21:48:36.274528130 +0000
Modify: 2020-05-17 21:37:17.676232019 +0000
Change: 2020-05-17 21:37:17.676232019 +0000
 Birth: -
ubuntu@ethq-build:~/ethq/deb/ethq-0.6.1~git2020517$
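
For reference, a slightly more defensive form of the workaround above. This is only a sketch: drop_caches_sketch is a hypothetical helper name, and the second argument exists purely so the function can be dry-run against an ordinary file. Per the kernel's sysctl documentation, writing 1 frees the page cache, 2 frees dentries and inodes, and 3 frees both. The write needs root, and the report does not say whether the command was run on the host or in the container.

```shell
#!/bin/sh
# Hypothetical helper around /proc/sys/vm/drop_caches.
#   $1: 1 = page cache, 2 = dentries and inodes, 3 = both (default 3)
#   $2: control file to write to (default is the real one; override to dry-run)
drop_caches_sketch() {
    mode="${1:-3}"
    target="${2:-/proc/sys/vm/drop_caches}"
    case "$mode" in
        1|2|3) ;;
        *) echo "drop_caches_sketch: mode must be 1, 2 or 3" >&2; return 2 ;;
    esac
    # Dirty objects are not freeable, so flush them to disk first.
    sync
    printf '%s\n' "$mode" > "$target"
}
```

Called as root with no arguments, `drop_caches_sketch` does the same as the echo command in the report, preceded by a sync. It is a workaround only; it does not address the underlying shiftfs cache inconsistency.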

Tags: focal
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1879196

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: focal
James Troup (elmo) wrote :

The good news (?) is that this seems easy to reproduce; editing any file in emacs seems to do the trick:

  https://pastebin.ubuntu.com/p/XydrSqfkfX/

Changed in linux (Ubuntu):
status: Incomplete → In Progress
assignee: nobody → Christian Brauner (cbrauner)
Christian Brauner (cbrauner) wrote :

I have a fix for this. Note that this is a regression we introduced with another fix. I also want to put this cautionary note here so people better understand why shiftfs has such bugs and why they are not simple idiot regressions but rather intricate to fix:

    Note, in general it's not advisable to directly modify the underlay
    while a shiftfs mount is on top. In some way this means we need to keep
    two caches in sync and it's hard enough to keep a single cache happy.
    But shiftfs' use-case is inherently prone to be used for exactly that.
    So this is something we have to navigate carefully and honestly we have
    no full model upstream that does the same. Overlayfs has the copy-up
    behavior which lets it get around most of the issues but we don't have
    it and ecryptfs is broken in such scenarios which we verified quite a
    while back.
    In any case, I built a kernel with this patch and re-ran all regressions
    that are related to this that we have so far (cf. [1], [2], and [3]).
    None of them were reproducible with this patch here. So we still fix the
    ESTALE issue but also keep underlay and overlay in sync.

Stefan Bader (smb)
Changed in linux (Ubuntu Eoan):
status: New → Triaged
importance: Undecided → Medium
Changed in linux (Ubuntu Focal):
importance: Undecided → Medium
status: New → Triaged
Changed in linux (Ubuntu):
importance: Undecided → Medium
Christian Brauner (cbrauner) wrote :
Brian Murray (brian-murray) wrote :

Eoan Ermine has reached end of life, so this bug will not be fixed for that release.

Changed in linux (Ubuntu Eoan):
status: Triaged → Won't Fix