NFS Client Ignores TCP Resets

Bug #1542826 reported by Richard Laager
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Triaged  Medium      Unassigned

Bug Description

Steps to reproduce:
1) Mount NFS share from HA cluster with TCP.
2) Fail over the HA cluster. (The NFS server's IP address moves from one machine to the other.)
3) Access the mounted NFS share from the client (an `ls` is sufficient).

Expected results:
Accessing the NFS mount works fine immediately.

Actual results:
Accessing the NFS mount hangs for 5 minutes. Then the TCP connection times out, a new connection is established, and it works fine again.
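
To put a number on the hang described above, the access can be timed. The following is a minimal Python sketch, not part of the original report; the function name `time_listing` and the command-line handling are illustrative assumptions. It does the equivalent of a timed `ls` on whatever path it is given.

```python
import os
import sys
import time


def time_listing(path):
    """Return (elapsed_seconds, entry_count) for one directory listing."""
    start = time.monotonic()
    # On a hung NFS mount this call blocks until the client recovers.
    entries = os.listdir(path)
    return time.monotonic() - start, len(entries)


if __name__ == "__main__":
    # The mount point is passed as the first argument; "." is only a fallback.
    mount_point = sys.argv[1] if len(sys.argv) > 1 else "."
    elapsed, count = time_listing(mount_point)
    print(f"listed {count} entries in {elapsed:.1f} s")
```

Run immediately after the failover against the NFS mount point, the elapsed time should roughly match the length of the hang.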

After the IP moves, the new server responds to the client with TCP RST packets, just as I would expect. I would expect the client to tear down its TCP connection immediately and re-establish a new one. But it doesn't. Am I confused, or is this a bug?
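
For comparison, this is how an ordinary userspace TCP client reacts to a reset. The sketch below is an illustration on loopback only, not the NFS client code path and not taken from the packet capture: the peer aborts the connection (SO_LINGER with a zero timeout makes `close()` send RST), and the client's next `recv()` fails right away with ECONNRESET. The function name is hypothetical. In the failover case the RST comes from a different machine that has no state for the connection, which this sketch does not reproduce.

```python
import socket
import struct
import threading


def reset_is_seen_by_client():
    """Return True if a plain TCP client gets ECONNRESET after the peer sends RST."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # loopback, ephemeral port
    listener.listen(1)
    port = listener.getsockname()[1]

    def abort_first_connection():
        conn, _ = listener.accept()
        # l_onoff=1, l_linger=0: close() aborts the connection with RST, not FIN.
        conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack("ii", 1, 0))
        conn.close()

    server = threading.Thread(target=abort_first_connection)
    server.start()

    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    try:
        client.recv(1)  # raises once the RST has been processed
        return False    # data or an orderly EOF arrived instead of a reset
    except ConnectionResetError:
        return True
    finally:
        client.close()
        server.join()
        listener.close()


if __name__ == "__main__":
    print("client saw reset:", reset_is_seen_by_client())
```

The behavior expected from the NFS client is analogous: treat the reset as the end of the old connection and reconnect, instead of waiting for a timeout.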

For the duration of this test, all iptables firewalling was disabled on the client machine. I have a packet capture of a minimized test (just a simple ls).

Note that this is a "single failover" scenario. It is NOT a case of failing over and then failing back before the TCP connection times out on the first NFS server.

We've been using UDP instead of TCP as a work-around.
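
The report does not give the exact mount options, so the following Python sketch only shows where the transport is selected. `proto=` is the transport option documented in nfs(5); the server name, export path, and mount point are hypothetical placeholders, and the helper function is an assumption made for illustration. It also assumes an NFS version for which the client supports UDP.

```python
def nfs_mount_command(server, export, mount_point, proto="tcp"):
    """Build the argv for mounting an NFS export over the given transport."""
    if proto not in ("tcp", "udp"):
        raise ValueError(f"unsupported transport: {proto!r}")
    return ["mount", "-t", "nfs", "-o", f"proto={proto}",
            f"{server}:{export}", mount_point]


if __name__ == "__main__":
    # Hypothetical names; substitute the cluster's floating address and export.
    for proto in ("tcp", "udp"):
        cmd = nfs_mount_command("nfs-ha.example.com", "/export/data",
                                "/mnt/data", proto)
        print(" ".join(cmd))
```

The function only builds the command line; running the result requires root and a reachable server.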

The system is running Ubuntu 14.04 with Linux 3.13.0-76-generic.
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Feb 3 18:41 seq
 crw-rw---- 1 root audio 116, 33 Feb 3 18:41 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.19
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 14.04
HibernationDevice: RESUME=/dev/mapper/data-swap
InstallationDate: Installed on 2015-07-11 (210 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
MachineType: Silicon Mechanics Storform iServ R515.v2.1
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-76-generic root=/dev/mapper/data-os ro elevator=noop console=ttyS1,115200n8 console=tty1 transparent_hugepage=always nomdmonddf nomdmonisw
ProcVersionSignature: Ubuntu 3.13.0-76.120-generic 3.13.11-ckt30
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-76-generic N/A
 linux-backports-modules-3.13.0-76-generic N/A
 linux-firmware 1.127.20
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty
Uname: Linux 3.13.0-76-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm dialout libvirtd lpadmin sudo users
WifiSyslog:

_MarkForUpload: True
dmi.bios.date: 05/15/2012
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2.0c
dmi.board.asset.tag: 1234567890
dmi.board.name: X8DT6
dmi.board.vendor: Supermicro
dmi.board.version: 1234567890
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 17
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 1234567890
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr2.0c:bd05/15/2012:svnSiliconMechanics:pnStorformiServR515.v2.1:pvr1234567890:rvnSupermicro:rnX8DT6:rvr1234567890:cvnSupermicro:ct17:cvr1234567890:
dmi.product.name: Storform iServ R515.v2.1
dmi.product.version: 1234567890
dmi.sys.vendor: Silicon Mechanics

Richard Laager (rlaager)
description: updated
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1542826

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Richard Laager (rlaager) wrote : BootDmesg.txt

apport information

tags: added: apport-collected trusty
description: updated
Richard Laager (rlaager) wrote : CurrentDmesg.txt

apport information

Richard Laager (rlaager) wrote : IwConfig.txt

apport information

Richard Laager (rlaager) wrote : Lspci.txt

apport information

Richard Laager (rlaager) wrote : Lsusb.txt

apport information

Richard Laager (rlaager) wrote : ProcCpuinfo.txt

apport information

Richard Laager (rlaager) wrote : ProcInterrupts.txt

apport information

Richard Laager (rlaager) wrote : ProcModules.txt

apport information

Richard Laager (rlaager) wrote : UdevDb.txt

apport information

Richard Laager (rlaager) wrote : UdevLog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.5 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.5-rc3-wily/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: kernel-da-key
penalvch (penalvch)
tags: added: latest-bios-2.0c
Richard Laager (rlaager)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-bug-exists-upstream
Richard Laager (rlaager) wrote :

The problem exists with the upstream kernel as well. It times out right at 5 minutes instead of right at 15 minutes. So clearly some timer is different. But the problem is still there.
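
The thread does not identify which timer is involved. One possible, unconfirmed explanation for a 15-minute figure is the TCP retransmission limit: the kernel's ip-sysctl documentation notes that the default `tcp_retries2` of 15 corresponds to a hypothetical timeout of 924.6 seconds. The Python sketch below reproduces that arithmetic, assuming a 200 ms minimum and 120 s maximum retransmission timeout with exponential back-off. The function and parameter names are illustrative, and nothing here explains the 5-minute figure.

```python
def retransmission_timeout(retries=15, rto_min=0.2, rto_max=120.0):
    """Sum the back-off waits for the first send plus `retries` retransmissions."""
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)  # exponential back-off, capped at rto_max
    return total


if __name__ == "__main__":
    seconds = retransmission_timeout()
    print(f"{seconds:.1f} s, about {seconds / 60:.1f} minutes")
```

With the defaults this comes to roughly 15.4 minutes, which is only suggestive; confirming it would require comparing the packet capture's retransmission pattern against the sysctl values on the client.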

penalvch (penalvch) wrote :

Richard Laager, to clarify, did this problem not occur with a kernel prior to 3.13.0-76?

tags: added: kernel-bug-exists-upstream-4.5-rc3
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Richard Laager (rlaager) wrote :

I have never seen this work, but I can't say which was the earliest kernel we tried.

penalvch (penalvch) wrote :

Richard Laager, the issue you are reporting is an upstream one. Could you please report this problem to the appropriate mailing list, following the instructions at https://wiki.ubuntu.com/Bugs/Upstream/kernel verbatim (To: Trond Myklebust and Anna Schumaker; CC: linux-nfs)?

Please provide a direct URL to your post to the mailing list when it becomes available so that it may be tracked.

Thank you for your understanding.

Changed in linux (Ubuntu):
status: Incomplete → Triaged
Richard Laager (rlaager)
description: updated
Richard Laager (rlaager) wrote :
Richard Laager (rlaager) wrote :