[Impact]
Oops during heavy NFS + FSCache + Cachefiles use:
kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
[Cause]
1) Two threads operate concurrently on one cookie and two objects.
2a) One thread unmounts the filesystem and, in the process, walks a
huge list of objects, marking them dead and deleting them.
cookie->usage is also decremented in nfs_fscache_release_super_cookie -> __fscache_relinquish_cookie -> __fscache_cookie_put -> BUG_ON(atomic_read(&cookie->usage) <= 0);
2b) A second thread tries to look up an object for reading data in fscache_alloc_object:
1) cachefiles_alloc_object -> fscache_object_init -> assigns the cookie, but usage is not bumped.
2) fscache_attach_object -> fails in cant_attach_object because the cookie's backing object or the cookie->parent object is going away.
3) fscache_put_object
-> cachefiles_put_object -> fscache_object_destroy -> fscache_cookie_put -> BUG_ON(atomic_read(&cookie->usage) <= 0);
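The imbalance described in 2b can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct cookie, struct object, object_init_unbalanced, cookie_put and object_destroy are hypothetical stand-ins, and C11 atomics replace the kernel's atomic_t. cookie_put returns false at the point where the kernel would hit the BUG_ON.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct cookie { atomic_int usage; };
struct object { struct cookie *cookie; };

/* Models the pre-fix init: the cookie pointer is stored in the
 * object, but the cookie's usage count is NOT bumped. */
static void object_init_unbalanced(struct object *obj, struct cookie *c)
{
    assert(obj != NULL && c != NULL);
    obj->cookie = c;
}

/* Models the put path. Returns false where the kernel would trip
 * BUG_ON(atomic_read(&cookie->usage) <= 0). */
static bool cookie_put(struct cookie *c)
{
    if (atomic_load(&c->usage) <= 0)
        return false;
    atomic_fetch_sub(&c->usage, 1);
    return true;
}

/* Models object destruction: drops a reference on the cookie,
 * even though init never took one. */
static bool object_destroy(struct object *obj)
{
    bool ok = cookie_put(obj->cookie);
    obj->cookie = NULL;
    return ok;
}
```

In this model, a cookie holding a single reference loses it when an object that failed to attach is destroyed, so the later put from the relinquish path finds usage at 0. Depending on which thread gets there first, either of the two put paths can be the one that trips.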
[Fix]
Bump the cookie usage in fscache_object_init, when the object is first
assigned a cookie. Do this atomically, so that the cookie is assigned
and its usage bumped only if its refcount is not zero.
Remove the cookie assignment from fscache_attach_object.
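The "bump only if the refcount is not zero" step can be sketched in userspace C as below. This is an illustration under assumptions, not the actual patch: struct cookie, struct object, cookie_get_unless_zero and object_init_balanced are hypothetical names, and a C11 compare-and-swap loop stands in for a kernel primitive with similar semantics such as atomic_inc_not_zero. Which primitive the real fix uses is not stated in this report.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct cookie { atomic_int usage; };
struct object { struct cookie *cookie; };

/* Take a reference only if the count is currently above zero.
 * The check and the increment happen as one atomic step, so a
 * cookie that has already dropped to 0 is never revived. */
static bool cookie_get_unless_zero(struct cookie *c)
{
    int old = atomic_load(&c->usage);

    while (old > 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&c->usage, &old, old + 1))
            return true;
    }
    return false;
}

/* Models the fixed init: the object records the cookie only if it
 * also managed to take its own reference on it. */
static bool object_init_balanced(struct object *obj, struct cookie *c)
{
    assert(obj != NULL && c != NULL);
    obj->cookie = NULL;
    if (!cookie_get_unless_zero(c))
        return false;
    obj->cookie = c;
    return true;
}
```

With the reference taken at init time, the put performed when the object is destroyed releases a reference the object actually owns, so the get and put stay balanced even when attaching fails.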
[Testcase]
A user has run ~100 hours of NFS stress tests and not seen this bug recur.
[Regression Potential]
- Limited to fscache/cachefiles.