Comment 3 for bug 1873074

Mauricio Faria de Oliveira (mfo) wrote :

Root Cause:
----------

Note: this is completely independent of Kubernetes.

The aufs filesystem calls i_readcount_inc() when opening a
file in read-only mode, but never calls a matching
i_readcount_dec().

@ fs/aufs/vfsub.c

struct file *vfsub_dentry_open(struct path *path, int flags)
{
    struct file *file;

    file = dentry_open(path, flags /* | __FMODE_NONOTIFY */,
                       current_cred());
    if (!IS_ERR_OR_NULL(file)
        && (file->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
            i_readcount_inc(d_inode(path->dentry));

    return file;
}

That is _incorrect_, as only the VFS layer should maintain
the 'struct inode.i_readcount' value.

Neither i_readcount_inc() nor i_readcount_dec() should be
called there. They have no callers outside the VFS in the
Linux tree.

So, each read-only open through aufs leaves behind an extra
count that is never dropped.

If the same file is opened in read-only mode enough times,
its backing inode.i_readcount value overflows back to zero.

Once that happens, when the file is closed, __fput() calls
i_readcount_dec(), which triggers its BUG_ON() because the
count is already zero.
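
The sequence above can be modelled in userspace. This is a
minimal sketch, not kernel code: 'struct model_inode',
model_open_rdonly(), model_close() and cycles_until_bug()
are hypothetical names, and the counter is deliberately
8 bits wide (the kernel's i_readcount is an atomic_t) so
the wrap shows up after a few hundred cycles. It assumes
the VFS does its own increment on open, and it models the
BUG_ON() as an error return.

```c
#include <stdint.h>

/* Hypothetical userspace model of the counter; not kernel code. */
struct model_inode {
    uint8_t i_readcount;    /* ASSUMPTION: 8 bits, to wrap quickly */
};

/* Read-only open through aufs: two increments for one open. */
void model_open_rdonly(struct model_inode *inode)
{
    inode->i_readcount++;   /* the VFS's own increment on open */
    inode->i_readcount++;   /* the extra one in vfsub_dentry_open() */
}

/*
 * Close: one decrement, as done from __fput().
 * Returns -1 where the kernel would hit the BUG_ON(), else 0.
 */
int model_close(struct model_inode *inode)
{
    if (inode->i_readcount == 0)
        return -1;
    inode->i_readcount--;
    return 0;
}

/* Count the open/close cycles that complete before a close fails. */
unsigned long cycles_until_bug(void)
{
    struct model_inode inode = { 0 };
    unsigned long cycles = 0;

    for (;;) {
        model_open_rdonly(&inode);
        if (model_close(&inode) != 0)
            return cycles;
        cycles++;
    }
}
```

In this model every open/close cycle leaves the counter one
higher, so cycles_until_bug() returns 254: the 255th open
wraps the 8-bit counter to zero and its close fails. The
same arithmetic on a 32-bit counter gives 2^32 - 2 clean
cycles before the failing one.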

That causes a kernel panic/crash if panic_on_oops is set;
otherwise, it only produces kernel messages.

It is not set by default, but 'enterprise'/larger users
usually set it in order to save kernel crashdumps on such
errors.
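
To check which of the two outcomes a given machine would
get, the setting can be read from procfs. A minimal sketch,
assuming procfs is mounted at /proc; parse_sysctl_int() and
read_panic_on_oops() are hypothetical helper names, not an
existing API.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the text of an integer sysctl file (e.g. "1\n").
 * Returns 0 and stores the value in *out, or -1 if the
 * text does not start with a number.
 */
int parse_sysctl_int(const char *text, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text)
        return -1;
    *out = value;
    return 0;
}

/*
 * Read kernel.panic_on_oops from procfs.
 * Returns 0 on success, -1 if the file cannot be read or parsed.
 */
int read_panic_on_oops(long *out)
{
    char buf[32];
    FILE *fp = fopen("/proc/sys/kernel/panic_on_oops", "r");

    if (fp == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return parse_sysctl_int(buf, out);
}
```

A caller would pass a 'long' to read_panic_on_oops() and
treat a non-zero value as "the BUG_ON() becomes a panic".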

See the 'Problem Demonstration / Instrumentation' section
to watch the counter overflow and hit the BUG_ON/panic.