The crash doesn't happen all the time and isn't easily reproducible. But when it does crash, the backtrace looks like:
(gdb) bt
#0 atomic_postclear_uint16_t_bits (bits=<optimized out>, var=<optimized out>) at ./ntirpc/misc/abstract_atomic.h:1812
#1 svc_rqst_epoll_event (sr_rec=sr_rec@entry=0x564dc852efd8, ev=0x7f6134002900) at ./src/svc_rqst.c:1416
#2 0x00007f620fa76565 in svc_rqst_epoll_events (n_events=2, sr_rec=0x564dc852efd8) at ./src/svc_rqst.c:1466
#3 svc_rqst_epoll_loop (wpe=0x564dc852efd8) at ./src/svc_rqst.c:1566
#4 0x00007f620fa816d6 in work_pool_thread (arg=0x7f6064002280) at ./src/work_pool.c:184
#5 0x00007f621029a6db in start_thread (arg=0x7f6097cfa700) at pthread_create.c:463
#6 0x00007f620fdbb61f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
This was using 3.0.3 on Bionic (available via Ubuntu Cloud Archive packages).
Upstream nfs-ganesha developers suggested that a "number of fixes" related to libntirpc fixed what looks like a race condition.
A number of commits have gone in since 3.0 [0]. Because the crash isn't easily reproducible, it's not straightforward to identify which commits between 3.0.3 and 3.5 fixed the issue for a potential SRU.
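One way to narrow the search is to list only the commits that touch the files named in the backtrace. The following is a minimal sketch, not a definitive procedure: the helper names are hypothetical, and the refs passed in are placeholders, since the exact tag or commit of libntirpc shipped with 3.0.3 and 3.5 would need to be confirmed against the packages.

```python
import subprocess

# Files that appear in the backtrace; paths are relative to an ntirpc checkout.
BACKTRACE_FILES = (
    "src/svc_rqst.c",
    "src/work_pool.c",
    "ntirpc/misc/abstract_atomic.h",
)


def candidate_commits_cmd(old_ref, new_ref, paths=BACKTRACE_FILES):
    """Build a `git log` command listing non-merge commits that are
    reachable from new_ref but not from old_ref and that touch `paths`."""
    return ["git", "log", "--oneline", "--no-merges",
            f"{old_ref}..{new_ref}", "--", *paths]


def candidate_commits(repo_dir, old_ref, new_ref):
    """Run the command inside a local ntirpc clone (repo_dir is an
    assumption) and return one summary line per commit."""
    result = subprocess.run(
        candidate_commits_cmd(old_ref, new_ref),
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()
```

For example, `candidate_commits("/path/to/ntirpc", OLD_REF, NEW_REF)` would return the shortlist to review by hand, where `OLD_REF` and `NEW_REF` stand for whichever refs correspond to the two releases. The filter only reduces the number of candidates; a fix for a race could also live in a file that does not show up in the backtrace.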
nfs-ganesha server crashes regularly.
libntirpc is a submodule used in nfs-ganesha and it's where the problem comes from:
https://github.com/nfs-ganesha/ntirpc
[0] https://github.com/nfs-ganesha/ntirpc
[1] https://github.com/nfs-ganesha/ntirpc/commit/1da6533431a23af7406b5961d4b16ef61045b6af