Comment 0 for bug 1776277

Kiran Kumar Modukuri (kmodukuri) wrote :

== SRU Justification ==

[Impact]
Oops during heavy NFS + FSCache + Cachefiles use:

 kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
 kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!

[Cause]
 1) Two threads operate on one cookie and two objects at the same time.
 2a) The first thread unmounts the filesystem and, in the process, walks
    a huge list of objects, marking them dead and deleting them.
    cookie->usage is also decremented in
                              nfs_fscache_release_super_cookie
                                   -> __fscache_relinquish_cookie
                                      -> __fscache_cookie_put
                                         -> BUG_ON(atomic_read(&cookie->usage) <= 0);

 2b) The second thread looks up an object in order to read data, in fscache_alloc_object:
     1) cachefiles_alloc_object -> fscache_object_init assigns the cookie to the object,
        but does not bump cookie->usage.
     2) fscache_attach_object fails at cant_attach_object because the cookie's backing
        object or the cookie->parent object is going away.
     3) fscache_put_object
           -> cachefiles_put_object
           -> fscache_object_destroy
           -> fscache_cookie_put
           -> BUG_ON(atomic_read(&cookie->usage) <= 0);
        The object never took a reference on the cookie, so this put is unbalanced.

[Fix]
 Take a reference on the cookie in fscache_object_init, when the object is
 first assigned the cookie. The assignment and the increment are done
 atomically, and only if the cookie's refcount is not zero.
 Remove the cookie assignment from fscache_attach_object.
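
 The "take a reference only if the refcount is not zero" step can be sketched
 in userspace C11 as below. This is an illustration of the pattern, not the
 actual kernel patch: struct cookie, struct object, cookie_get_not_zero,
 cookie_put, object_init and object_destroy are hypothetical stand-ins for the
 fscache structures and helpers, and assert() stands in for BUG_ON().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's cookie and object structures. */
struct cookie {
    atomic_int usage;
};

struct object {
    struct cookie *cookie;   /* NULL when the object holds no reference */
};

/* Take a reference only while the count is still positive, in the
 * spirit of the kernel's atomic_inc_not_zero(). */
bool cookie_get_not_zero(struct cookie *c)
{
    int old = atomic_load(&c->usage);

    while (old > 0) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&c->usage, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; the assert models BUG_ON(usage <= 0). */
void cookie_put(struct cookie *c)
{
    int prev = atomic_fetch_sub(&c->usage, 1);

    assert(prev > 0);
}

/* Assign the cookie and take the reference in one step, so that every
 * later put from this object is balanced by a get. */
void object_init(struct object *obj, struct cookie *c)
{
    obj->cookie = cookie_get_not_zero(c) ? c : NULL;
}

/* Only drop a reference that object_init actually took. */
void object_destroy(struct object *obj)
{
    if (obj->cookie) {
        cookie_put(obj->cookie);
        obj->cookie = NULL;
    }
}
```

 With this shape, an object that fails to attach and is destroyed only puts
 the reference it took in object_init, and an object initialised against a
 cookie whose count has already reached zero never touches the count at all.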

[Testcase]
A user has run ~100 hours of NFS stress tests and not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.