Activity log for bug #1983992

Date Who What changed Old value New value Message
2022-08-08 16:58:04 Ponnuvel Palaniyappan bug added bug
2022-08-08 16:58:14 Ponnuvel Palaniyappan tags sts
2022-08-08 17:02:16 Ponnuvel Palaniyappan description

Old value:

nfs-ganesha server crashes regularly. It doesn't happen all the time and isn't easily reproducible. But when it does crash, the backtrace looks like:

(gdb) bt
#0 atomic_postclear_uint16_t_bits (bits=<optimized out>, var=<optimized out>) at ./ntirpc/misc/abstract_atomic.h:1812
#1 svc_rqst_epoll_event (sr_rec=sr_rec@entry=0x564dc852efd8, ev=0x7f6134002900) at ./src/svc_rqst.c:1416
#2 0x00007f620fa76565 in svc_rqst_epoll_events (n_events=2, sr_rec=0x564dc852efd8) at ./src/svc_rqst.c:1466
#3 svc_rqst_epoll_loop (wpe=0x564dc852efd8) at ./src/svc_rqst.c:1566
#4 0x00007f620fa816d6 in work_pool_thread (arg=0x7f6064002280) at ./src/work_pool.c:184
#5 0x00007f621029a6db in start_thread (arg=0x7f6097cfa700) at pthread_create.c:463
#6 0x00007f620fdbb61f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

This was using 3.0.3 on Bionic (available via Ubuntu Cloud Archive packages).

Upstream nfs-ganesha developers suggested that a "number of fixes" related to libntirpc fixed what looks like a race condition. libntirpc is a submodule used in nfs-ganesha and it's where the problem comes from: https://github.com/nfs-ganesha/ntirpc

A number of commits have gone in since 3.0 [0]. Given that the crash isn't easily reproducible, it's not straightforward to find the commits that fixed the issue between 3.0.3 and 3.5 for a potential SRU.

[0] https://github.com/nfs-ganesha/ntirpc
[1] https://github.com/nfs-ganesha/ntirpc/commit/1da6533431a23af7406b5961d4b16ef61045b6af

New value:

nfs-ganesha server crashes regularly. It doesn't happen all the time and isn't easily reproducible. But when it does crash, the backtrace looks like:

(gdb) bt
#0 atomic_postclear_uint16_t_bits (bits=<optimized out>, var=<optimized out>) at ./ntirpc/misc/abstract_atomic.h:1812
#1 svc_rqst_epoll_event (sr_rec=sr_rec@entry=0x564dc852efd8, ev=0x7f6134002900) at ./src/svc_rqst.c:1416
#2 0x00007f620fa76565 in svc_rqst_epoll_events (n_events=2, sr_rec=0x564dc852efd8) at ./src/svc_rqst.c:1466
#3 svc_rqst_epoll_loop (wpe=0x564dc852efd8) at ./src/svc_rqst.c:1566
#4 0x00007f620fa816d6 in work_pool_thread (arg=0x7f6064002280) at ./src/work_pool.c:184
#5 0x00007f621029a6db in start_thread (arg=0x7f6097cfa700) at pthread_create.c:463
#6 0x00007f620fdbb61f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

This was using 3.0.3 on Bionic (available via Ubuntu Cloud Archive packages).

Upstream nfs-ganesha developers suggested that a "number of fixes" related to libntirpc fixed what looks like a race condition. libntirpc is a submodule used in nfs-ganesha and it's where the problem comes from: https://github.com/nfs-ganesha/ntirpc

A number of commits have gone in since 3.0 [0]. Given that the crash isn't easily reproducible, it's not straightforward to find the commits that fixed the issue between 3.0.3 and 3.5 for a potential SRU.

In a user environment where the problem occurred, they were able to test nfs-ganesha 3.5 and confirmed that it didn't crash during a load test lasting several days, whereas 3.0.3 crashed at least once a day under a similar load/test environment.

[0] https://github.com/nfs-ganesha/ntirpc
[1] https://github.com/nfs-ganesha/ntirpc/commit/1da6533431a23af7406b5961d4b16ef61045b6af
2022-10-31 10:16:47 Ponnuvel Palaniyappan nominated for series Ubuntu Focal
2022-10-31 10:16:47 Ponnuvel Palaniyappan bug task added nfs-ganesha (Ubuntu Focal)
2022-10-31 12:51:29 Chris MacNaughton bug task added ntirpc (Ubuntu)
2023-09-14 09:59:37 Jorge Rodríguez bug added subscriber Jorge Rodríguez