After most recent upgrade to 3.2.0-87-generic, nfs server process has extremely high I/O to /var/lib/nfs/v4recovery

Bug #1473948 reported by IanBall
This bug affects 1 person
Affects              Status   Importance   Assigned to   Milestone
nfs-utils (Ubuntu)   New      Undecided    Unassigned

Bug Description

We upgraded our 12.04 LTS system on 9 July at around 18:30. Immediately after the reboot, the I/O on the / partition (sda) was extremely high. This made the NFS server sluggish, even though its exported directory is on a different file system (sdb) and LUN (disk).

I investigated the problem and found, using iotop, that the process "jbd2/sda2-8" was responsible for an extremely high number of I/O operations:
Total DISK READ: 0.00 B/s | Total DISK WRITE: 13.33 M/
  TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
  293 be/3 root 0.00 B/s 0.00 B/s 0.00 % 39.16 % [jbd2/sda2-8]

I turned on file system tracing using:
   echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable
   echo 1 >/sys/kernel/debug/tracing/events/jbd2/jbd2_run_stats/enable
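
The two tracepoints can be switched on and off together so that tracing does not stay enabled after the measurement. This is only a sketch: the function name and the TRACING_DIR variable are made up for illustration, and the default path is the debugfs location used in the commands above.

```shell
#!/bin/sh
# Assumption: debugfs is mounted where the report's commands expect it.
TRACING_DIR=${TRACING_DIR:-/sys/kernel/debug/tracing}

# set_nfs_trace_events VALUE
#   VALUE 1 enables, VALUE 0 disables both tracepoints used in this report.
#   Needs root when pointed at the real tracing directory.
set_nfs_trace_events() {
    for event in ext4/ext4_sync_file_enter jbd2/jbd2_run_stats; do
        echo "$1" > "$TRACING_DIR/events/$event/enable" || return 1
    done
}

# Example (as root; the output file name is arbitrary):
#   set_nfs_trace_events 1
#   head -n 200 "$TRACING_DIR/trace_pipe" > /tmp/nfs-trace.txt
#   set_nfs_trace_events 0
```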

and found that one inode was being synced extremely frequently:

     jbd2/sda2-8-293 [000] 260587.952474: jbd2_run_stats: dev 8,2 tid 42050430 wait 0 running 0 locked 0 flushing 0 logging 0 handle_count 2 blocks 7 blocks_logged 8
            nfsd-1332 [000] 260587.953142: ext4_sync_file_enter: dev 8,2 ino 150 parent 16987 datasync 0
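
To see which inode dominates, the ext4_sync_file_enter lines can be tallied per inode number. The helper below is a sketch (the function name is made up); it assumes the trace text has the same layout as the lines above, with the inode number following the word "ino".

```shell
#!/bin/sh
# count_syncs_by_inode: read trace text on stdin and print "COUNT INODE",
# one line per inode, busiest inode first.
count_syncs_by_inode() {
    awk '/ext4_sync_file_enter:/ {
             # the inode number is the field right after the literal "ino"
             for (i = 1; i < NF; i++)
                 if ($i == "ino") count[$(i + 1)]++
         }
         END { for (ino in count) print count[ino], ino }' | sort -rn
}

# Example: count_syncs_by_inode < /sys/kernel/debug/tracing/trace
```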

This inode (150) belongs to /var/lib/nfs/v4recovery.
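
One way to map an inode number from the trace back to a path is find with -inum, restricted to a single file system with -xdev. The wrapper name below is made up for illustration; on the affected machine the search root would be the / file system (device 8,2 in the trace).

```shell
#!/bin/sh
# path_for_inode ROOT INODE
#   Print every path under ROOT (without crossing into other file systems)
#   whose inode number is INODE.
path_for_inode() {
    find "$1" -xdev -inum "$2" 2>/dev/null
}

# Example (as root): path_for_inode / 150
```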

Further investigation of dmesg showed that, shortly after booting, NFSD could not write to the directory /var/lib/nfs/v4recovery:
[ 99.020861] NFSD: failed to write recovery record (err -17); please check that /var/lib/nfs/v4recovery exists and is writeable
[ 99.089156] NFSD: failed to write recovery record (err -17); please check that /var/lib/nfs/v4recovery exists and is writeable
[ 99.189010] NFSD: failed to write recovery record (err -17); please check that /var/lib/nfs/v4recovery exists and is writeable
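
On Linux, errno 17 is EEXIST ("File exists"), so err -17 reads more like a name collision inside the directory than a permissions problem. That interpretation is an assumption and is not confirmed anywhere in this report. The sketch below counts these messages per error code; both function names are made up, and the errno table covers only a handful of values.

```shell
#!/bin/sh
# errno_name N: symbolic name for a few Linux errno values (not a complete table).
errno_name() {
    case "${1#-}" in
        2)  echo ENOENT ;;
        13) echo EACCES ;;
        17) echo EEXIST ;;
        28) echo ENOSPC ;;
        30) echo EROFS ;;
        *)  echo unknown ;;
    esac
}

# nfsd_recovery_errors: read dmesg text on stdin and print "COUNT CODE NAME"
# for each distinct error code in the NFSD recovery-record messages.
nfsd_recovery_errors() {
    sed -n 's/.*NFSD: failed to write recovery record (err \(-\{0,1\}[0-9]*\)).*/\1/p' |
        sort | uniq -c |
        while read -r count code; do
            echo "$count $code $(errno_name "$code")"
        done
}

# Example: dmesg | nfsd_recovery_errors
```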

I have tried deleting and recreating the directory /var/lib/nfs/v4recovery with 777 permissions; however, this did not solve the problem. The messages were still produced, even after a reboot.

The I/O rate is sometimes in excess of 1000 operations per second and is typically around 500-750. This is completely different from the behaviour of the previous kernel, where the rate was 5-20 I/O operations per second with peaks around 30. I will attach a graph from our central EMC storage system showing the LUN before and after the update.

There were no changes to the parameters or shares of the NFS server, and I have not been able to find any documentation about parameters that we should change to address such a problem, so I can only conclude that this behaviour is a bug of some description.

Additional information about our system:
$ uname -a
Linux wsps428 3.2.0-87-generic #125-Ubuntu SMP Fri Jun 19 08:25:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

$ cat /proc/version_signature
Ubuntu 3.2.0-87.125-generic 3.2.69

$ lsb_release -rd
Description: Ubuntu 12.04.5 LTS
Release: 12.04

$ apt-cache policy nfs-server
nfs-server:
  Installed: (none)
  Candidate: (none)
  Version table:

I will attach a dmesg output.

Tags: bot-comment
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1473948/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
IanBall (ian-onlineloop)
affects: ubuntu → nfs-utils (Ubuntu)
IanBall (ian-onlineloop) wrote :

As we continued to have severe problems with our applications, I downgraded the nfs-kernel-server and nfs-utils packages to version 1:1.2.5-3ubuntu3 using "apt-get install nfs-common=1:1.2.5-3ubuntu3" and "apt-get install nfs-kernel-server=1:1.2.5-3ubuntu3".
The I/O problems are gone with this older version.
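
For anyone repeating the downgrade, both packages can be installed in one apt-get call and then held so that a later upgrade does not pull the newer version back in. This is a sketch only: the helper name and variables are made up, and the hold step assumes that apt-mark on the system supports the "hold" subcommand.

```shell
#!/bin/sh
# Version and package names are taken from the comment above.
NFS_VERSION='1:1.2.5-3ubuntu3'
NFS_PACKAGES='nfs-common nfs-kernel-server'

# versioned_args VERSION PKG...
#   Print "PKG=VERSION" for every package, separated by spaces,
#   in the form apt-get install expects.
versioned_args() {
    version=$1
    shift
    out=
    for pkg in "$@"; do
        out="${out:+$out }$pkg=$version"
    done
    echo "$out"
}

# Example (as root):
#   apt-get install $(versioned_args "$NFS_VERSION" $NFS_PACKAGES)
#   apt-mark hold $NFS_PACKAGES
#   apt-cache policy $NFS_PACKAGES    # confirm the installed versions
```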
