Activity log for bug #1957986

Date Who What changed Old value New value Message
2022-01-14 20:22:17 dann frazier bug added bug
2022-01-14 20:22:26 dann frazier nominated for series Ubuntu Jammy
2022-01-14 20:22:26 dann frazier bug task added linux (Ubuntu Jammy)
2022-01-14 20:22:26 dann frazier nominated for series Ubuntu Hirsute
2022-01-14 20:22:26 dann frazier bug task added linux (Ubuntu Hirsute)
2022-01-14 20:22:26 dann frazier nominated for series Ubuntu Impish
2022-01-14 20:22:26 dann frazier bug task added linux (Ubuntu Impish)
2022-01-14 20:22:26 dann frazier nominated for series Ubuntu Focal
2022-01-14 20:22:26 dann frazier bug task added linux (Ubuntu Focal)
2022-01-14 20:22:34 dann frazier linux (Ubuntu Jammy): status New Fix Released
2022-01-14 20:22:36 dann frazier linux (Ubuntu Impish): status New Fix Released
2022-01-14 20:22:38 dann frazier linux (Ubuntu Hirsute): status New Fix Released
2022-01-14 20:22:41 dann frazier linux (Ubuntu Focal): status New In Progress
2022-01-14 20:22:44 dann frazier linux (Ubuntu Focal): assignee dann frazier (dannf)
2022-01-18 23:51:06 dann frazier description

Old value:

[Impact]
An NFSv4 client that does a lot of opens/closes can overwhelm an NFSv4 server, causing a significant drop in performance. In my testing, I've seen performance drop from ~700MiB/s down to < 10MiB/s. The same workload using NFSv3 does not have this problem.

[Test Case]
This can be demonstrated using the elbencho benchmark from https://github.com/breuner/elbencho:

  $ elbencho -t 40 -r -n 10 -N 5000 -s 128k -b 128k /mnt/nfs/ubuntu

You'll notice the nfsd threads (I stuck w/ the default of 4) start to consume 100% CPU, and the performance of the elbencho benchmark will slow to a trickle.

[Fix]
The following fix solves the problem, but there are a number of patch dependencies required before it will apply to focal:

commit 10717f45639f6c1bc27b56405252c3a027406d92 (refs/bisect/bad)
Author: Trond Myklebust <trondmy@gmail.com>
Date: Mon Jan 27 09:58:19 2020 -0500

    NFSv4: Limit the total number of cached delegations

    Delegations can be expensive to return, and can cause scalability issues
    for the server. Let's therefore try to limit the number of inactive
    delegations we hold.
    Once the number of delegations is above a certain threshold, start
    to return them on close.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

[What could go wrong]
The fixes are restricted to NFS code, so problems should be limited to NFS users. They could include performance issues, crashes, etc.

New value:

[Impact]
An NFSv4 client that does a lot of opens/closes can overwhelm an NFSv4 server, causing a significant drop in performance. In my testing, I've seen performance drop from ~700MiB/s down to < 10MiB/s. The same workload using NFSv3 does not have this problem.

[Test Case]
This can be demonstrated using the elbencho benchmark from https://github.com/breuner/elbencho:

  $ elbencho -t 40 -r -n 10 -N 5000 -s 128k -b 128k /mnt/nfs/ubuntu

You'll notice the nfsd threads (I stuck w/ the default of 4) start to consume 100% CPU, and the performance of the elbencho benchmark will slow to a trickle.

[Fix]
The following fix solves the problem, but there are a number of patch dependencies required before it will apply to focal:

commit 10717f45639f6c1bc27b56405252c3a027406d92 (refs/bisect/bad)
Author: Trond Myklebust <trondmy@gmail.com>
Date: Mon Jan 27 09:58:19 2020 -0500

    NFSv4: Limit the total number of cached delegations

    Delegations can be expensive to return, and can cause scalability issues
    for the server. Let's therefore try to limit the number of inactive
    delegations we hold.
    Once the number of delegations is above a certain threshold, start
    to return them on close.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

[What could go wrong]
The fixes are restricted to NFS code, so problems should be limited to NFS users. They could include performance issues, crashes, etc.

Because these changes are mostly related to NFS delegations, I use the `nfstest_delegation` test suite from nfstest[*] to try to identify any regressions:

  ./nfstest_delegation --client 192.168.42.1 --server 192.168.42.2 -e /srv/nfstest --trcdelay 4

Both before and after applying the fixes, I see the same 146 tests pass and the same 23 tests fail. The 23 failures are expected because I was using a Linux-based NFSv4 server, which does not support all of the delegation modes that the test checks for.

[*] git://git.linux-nfs.org/projects/mora/nfstest.git
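The commit message quoted above describes the idea only in words: once the number of cached delegations is above a threshold, start returning them on close. A toy model of that policy might look like the following. This is a sketch for illustration, not the kernel code: the `ClientModel` class, the `simulate` helper, the watermark value, and the assumption that every open is granted a delegation are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClientModel:
    """Toy NFSv4 client that caps how many delegations it keeps cached."""

    watermark: int = 4  # hypothetical threshold, not the kernel's value
    cached: set = field(default_factory=set)         # paths with a held delegation
    open_counts: dict = field(default_factory=dict)  # path -> number of open handles
    returned: list = field(default_factory=list)     # delegations handed back, in order

    def open(self, path):
        # Assumption: the server grants a delegation on every open.
        self.open_counts[path] = self.open_counts.get(path, 0) + 1
        self.cached.add(path)

    def close(self, path):
        # Raises KeyError if the path was never opened.
        self.open_counts[path] -= 1
        if self.open_counts[path] > 0:
            return  # still in use, so the delegation stays active
        del self.open_counts[path]
        # The delegation is now inactive. Return it only when the client
        # holds more delegations than the watermark allows.
        if len(self.cached) > self.watermark:
            self.cached.discard(path)
            self.returned.append(path)


def simulate(paths, watermark):
    """Open and immediately close each path once, as an open/close-heavy workload would."""
    client = ClientModel(watermark=watermark)
    for path in paths:
        client.open(path)
        client.close(path)
    return client
```

In this model, a client with a watermark of 2 that touches four files in turn keeps the first two delegations cached and returns the later ones at close time, so the number of delegations the server has to track stays bounded.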
2022-01-18 23:53:13 dann frazier attachment added nfstest_delegation-5.4.0-94-generic.log https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1957986/+attachment/5555253/+files/nfstest_delegation-5.4.0-94-generic.log
2022-01-18 23:53:26 dann frazier attachment added nfstest_delegation-5.4.0-95-generic+fix.log https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1957986/+attachment/5555254/+files/nfstest_delegation-5.4.0-95-generic+fix.log
2022-02-11 15:24:05 Stefan Bader linux (Ubuntu Focal): importance Undecided Medium
2022-02-11 15:24:39 Stefan Bader linux (Ubuntu Focal): status In Progress Fix Committed
2022-02-24 15:47:25 Ubuntu Kernel Bot tags verification-needed-focal
2022-02-25 22:36:16 dann frazier tags verification-needed-focal verification-done-focal
2022-03-21 15:50:22 Launchpad Janitor linux (Ubuntu Focal): status Fix Committed Fix Released
2022-03-21 15:50:22 Launchpad Janitor cve linked 2022-0435
2022-03-21 15:50:22 Launchpad Janitor cve linked 2022-0492
2022-03-21 15:50:22 Launchpad Janitor cve linked 2022-0516
2022-03-21 15:50:22 Launchpad Janitor cve linked 2022-0847