This kind of issue appeared for me with Ubuntu 23.10 on a server that mostly uses an HDD for bulk storage and has a not exactly powerful CPU, which is also busy running WireGuard to secure the NFS connection.
I'm mentioning the performance details because I have a feeling they matter. A similarly low-performance client connecting over 1 Gb/s caused this problem only very occasionally; with a 10 Gb/s connection, the issue appeared significantly more often. A higher-performance setup using a 2.5 Gb/s connection triggered this bug within a couple of days of setup.
The lockup always seems to occur under heavy NFS usage, suspiciously mostly when both reading and writing are going on. I don't recall it happening with a read-only load, but I can't confidently say it never happened with a write-only load.
I found this bug report by searching for the client error message; the server-side message differs because of the different version:
```
[300146.046666] INFO: task nfsd:1426 blocked for more than 241 seconds.
[300146.046732] Not tainted 6.5.0-27-generic #28~22.04.1-Ubuntu
[300146.046770] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[300146.046813] task:nfsd state:D stack:0 pid:1426 ppid:2 flags:0x00004000
[300146.046827] Call Trace:
[300146.046832] <TASK>
[300146.046839] __schedule+0x2cb/0x750
[300146.046860] schedule+0x63/0x110
[300146.046870] schedule_timeout+0x157/0x170
[300146.046881] wait_for_completion+0x88/0x150
[300146.046894] __flush_workqueue+0x140/0x3e0
[300146.046908] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
[300146.047074] nfsd4_destroy_session+0x193/0x260 [nfsd]
[300146.047219] nfsd4_proc_compound+0x3b7/0x770 [nfsd]
[300146.047365] nfsd_dispatch+0xbf/0x1d0 [nfsd]
[300146.047497] svc_process_common+0x420/0x6e0 [sunrpc]
[300146.047695] ? __pfx_read_tsc+0x10/0x10
[300146.047706] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[300146.047848] ? __pfx_nfsd+0x10/0x10 [nfsd]
[300146.047977] svc_process+0x132/0x1b0 [sunrpc]
[300146.048157] nfsd+0xdc/0x1c0 [nfsd]
[300146.048287] kthread+0xf2/0x120
[300146.048299] ? __pfx_kthread+0x10/0x10
[300146.048310] ret_from_fork+0x47/0x70
[300146.048321] ? __pfx_kthread+0x10/0x10
[300146.048331] ret_from_fork_asm+0x1b/0x30
[300146.048341] </TASK>
```
This seems to match, but the lockups I experienced previously may have been somewhat different.
I mostly remember the client complaining about the server not responding rather than showing the message presented here. The server call trace also used to have btrfs in it, which made me suspect the problem might be exclusive to btrfs. Still, the issue was always with NFS: nothing else locked up, despite other sources of heavy I/O.
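To compare traces across lockups (for example, to check whether btrfs shows up in a given one), something like the following rough Python sketch could summarize hung-task reports from a saved kernel log. The function name `summarize_hung_tasks` and the regular expressions are my own assumptions, based only on the log format shown above; other kernel versions may format these lines differently.

```python
import re

# Header of a hung-task report, e.g.
# "[300146.046666] INFO: task nfsd:1426 blocked for more than 241 seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")

# A call-trace frame, e.g. "nfsd4_destroy_session+0x193/0x260 [nfsd]".
# Group 1 is set for frames marked "?" (the kernel flags those as unreliable).
FRAME_RE = re.compile(
    r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def summarize_hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text (hypothetical helper)."""
    reports = []
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            current = {
                "task": header.group(1),
                "pid": int(header.group(2)),
                "seconds": int(header.group(3)),
                "frames": [],      # function names, in trace order
                "modules": set(),  # kernel modules named in brackets
            }
            reports.append(current)
            continue
        if current is None:
            continue
        if "</TASK>" in line:
            # End of this report's call trace.
            current = None
            continue
        frame = FRAME_RE.search(line)
        if frame and not frame.group(1):  # skip "?" frames
            current["frames"].append(frame.group(2))
            if frame.group(3):
                current["modules"].add(frame.group(3))
    return reports
```

Feeding it the saved output of `dmesg` and checking `"btrfs" in report["modules"]` for each report would show which lockups involved btrfs code at all. A trace like the one above would list only `nfsd` and `sunrpc`.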