Ubuntu
linux package

NFSv4 CLOSE/LOCK-operation needs timing improvement for AIX compat

Bug #1167420 reported by Bryan Quigley on 2013-04-10

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	linux (Ubuntu)	Invalid	Medium	Dave Chiluk

Bug Description

We're mounting our users' home directories via NFSv4 from a server
running AIX 7.1. When a process on an Ubuntu 12.04 client locks a file
accessed via NFSv4 using flock() and afterwards closes it, the close()
sometimes blocks for 15 seconds.

The cause of the problem turned out to be a race condition between the
NFSv4 operations CLOSE and RELEASE_LOCKOWNER. The client sends CLOSE
immediately after RELEASE_LOCKOWNER, without waiting for the reply to
RELEASE_LOCKOWNER (which is completely fine per the NFS RFC, but not
what AIX expects). Sometimes the server tries to process the CLOSE
before RELEASE_LOCKOWNER has finished. In that case it replies to the
CLOSE with NFS4ERR_DELAY, which causes the client to retry the CLOSE
after 15 seconds.

Instead of freezing for 15 seconds, Ubuntu should retry sooner and then increase the timeout exponentially, up to 15 seconds.
This should make the client more responsive in other cases as well, while working around the issues in AIX 7.1's implementation.
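
To illustrate the retry schedule requested above, here is a minimal user-space sketch in C. It is not the kernel code and not the patch attached to this bug. The names RETRY_MIN_MS, RETRY_MAX_MS, next_retry_delay_ms and print_retry_schedule are hypothetical, and the 100 ms starting value is an assumption chosen for illustration; only the 15-second ceiling comes from the report.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical bounds in milliseconds (assumptions for this sketch):
 * first retry after 0.1 s, never wait longer than 15 s. */
#define RETRY_MIN_MS 100L
#define RETRY_MAX_MS 15000L

/* Return the delay to use for the current retry, then double the stored
 * timeout for the next retry, clamped to RETRY_MAX_MS.
 * A stored timeout <= 0 means "no retry has happened yet". */
long next_retry_delay_ms(long *timeout_ms)
{
    assert(timeout_ms != NULL);

    if (*timeout_ms <= 0)
        *timeout_ms = RETRY_MIN_MS;
    if (*timeout_ms > RETRY_MAX_MS)
        *timeout_ms = RETRY_MAX_MS;

    long delay = *timeout_ms;

    *timeout_ms *= 2;
    if (*timeout_ms > RETRY_MAX_MS)
        *timeout_ms = RETRY_MAX_MS;

    return delay;
}

/* Print the delays for the first `attempts` retries. */
void print_retry_schedule(int attempts)
{
    long timeout = 0;

    for (int i = 1; i <= attempts; i++)
        printf("retry %d after %ld ms\n", i, next_retry_delay_ms(&timeout));
}
```

With these assumed bounds, the delays double from 100 ms (100, 200, 400, ...) and are capped at 15000 ms from the ninth retry onward, so a server that answers NFS4ERR_DELAY only briefly would cost the client a fraction of a second rather than the full 15 seconds.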

OS: Ubuntu 12.04 / AIX 7.1. Not fixed in upstream Linux releases.

Steps to Reproduce:
  gcc -o open-close open-close.c
  touch testfile # create file to be used by open-close
  for i in `seq 100`; do ./open-close; done

The blocked close() happens in 10% of runs.
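
The attached open-close.c is not reproduced on this page. As a rough idea of what such a test does, the sketch below opens a file, takes a flock() on it, and measures how long the following close() takes. This is an assumption about the general shape of the reproducer, not the attachment itself; the function name timed_lock_close is hypothetical, and the attachment may use different flags or lock types.

```c
#define _DEFAULT_SOURCE /* for flock() and clock_gettime() under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <time.h>
#include <unistd.h>

/* Open `path`, take an exclusive flock() on it, and return the number of
 * seconds the subsequent close() took. Returns -1.0 on any error. */
double timed_lock_close(const char *path)
{
    struct timespec start, end;

    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1.0;

    if (flock(fd, LOCK_EX) != 0) {
        close(fd);
        return -1.0;
    }

    /* Closing the descriptor drops the lock; on the affected setup this
     * is the call that sometimes blocks. */
    clock_gettime(CLOCK_MONOTONIC, &start);
    int rc = close(fd);
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (rc != 0)
        return -1.0;

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling timed_lock_close("testfile") from a small main() in a loop, with testfile on the NFSv4 mount, should make the slow runs stand out: on an affected client they would be expected to take about 15 seconds, while the rest return almost immediately.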

Relevant NFS mailing list discussion: https://www.ietf.org/mail-archive/web/nfsv4/current/msg11720.html

Bryan Quigley (bryanquigley) wrote on 2013-04-10:

open-close.c (410 bytes, text/plain)

Brad Figg (brad-figg) wrote on 2013-04-10: Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1167420

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status:	New → Incomplete

Joseph Salisbury (jsalisbury) on 2013-04-10

Changed in linux (Ubuntu):
importance:	Undecided → Medium

Dave Chiluk (chiluk) on 2013-04-22

Changed in linux (Ubuntu):
assignee:	nobody → Dave Chiluk (chiluk)

Dave Chiluk (chiluk) wrote on 2013-04-22:

Anyone affected by this issue, please try to reproduce it using the kernel available below and report back here.
http://people.canonical.com/~chiluk/lp1167420/

The patch included with this kernel should reduce the delay before the first retry from 15 seconds to 0.1 seconds.

tags:

added: precise raring

Bryan Quigley (bryanquigley) wrote on 2014-07-29:

A tunable can be set in AIX 7.1 that forces better synchronization, which avoids this issue.

Changed in linux (Ubuntu):
status:	Incomplete → Invalid
