NFS client fails to mount with timeout on kernel 5.4.0-1057-aws

Bug #1946032 reported by José M.G. Moreira
This bug affects 2 people
Affects: linux-aws-5.4 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

On Ubuntu 18.04.1 servers running as EC2 instances, I mount AWS EFS filesystems over NFSv4.1 successfully.

Mounts work on kernel version 5.4.0-1056-aws, but if I upgrade the kernel to 5.4.0-1057-aws the mounts stop working and fail with a network timeout, due to port 111 being blocked, as it's not needed for NFSv4.

The issue started when the apt daily job automatically upgraded the kernel. I reproduced it on the same instance: after rolling back to the older kernel the mounts succeeded, and after reverting to the newer kernel (plus a reboot) the network timeout returned.
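When repeating the rollback test, it helps to confirm which kernel is actually running after each reboot. The following is a minimal sketch, not part of the original report: `classify_kernel` is a hypothetical helper that only knows the two 5.4 versions compared above.

```shell
#!/bin/sh
# classify_kernel (hypothetical helper): map a kernel release string to the
# behaviour observed for it in this report. Only the two 5.4 kernels that
# were compared are recognised; anything else is reported as unknown.
classify_kernel() {
    case "$1" in
        5.4.0-1056-aws) echo "mounts worked in this report" ;;
        5.4.0-1057-aws) echo "mounts timed out in this report" ;;
        *)              echo "not one of the two kernels compared in this report" ;;
    esac
}

# Report on the kernel that is running right now.
echo "$(uname -r): $(classify_kernel "$(uname -r)")"
```

Running it once after each reboot makes it harder to mistake an installed kernel for the booted one.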

The mount command is identical in all cases and follows the AWS documentation:

```
sudo mount -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport mount-target-DNS:/ ~/efs-mount-point
```

Could this be a kernel bug or a misconfiguration? What can I do?

José M.G. Moreira (josemoreira) wrote :

The crux of the issue seems to be the use of privileged ports, which are (and should be) blocked by a network ACL in our environment.

The functional server instance is using a non-privileged port 56242:
tcp 0 0 10.99.19.46:56242 10.99.16.151:2049 ESTABLISHED

The non-functional server is trying to use a privileged port 978:
116 27.617188 10.99.19.43 10.99.16.151 TCP 76 978 → 2049 [SYN] Seq=0 Win=62727 Len=0 MSS=8961 SACK_PERM=1 TSval=906311050 TSecr=0 WS=128

As a result, the non-working server is unable to establish a connection with EFS to mount.
So far, this behaviour happens only with this kernel version (https://launchpad.net/ubuntu/+source/linux-aws/5.4.0-1057.60).

It appears that this kernel ignores the "noresvport" mount option.
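To check which source port the NFS client picked on a given instance, the comparison above can be scripted. This is a rough sketch under stated assumptions: `is_privileged_port` and `nfs_source_ports` are hypothetical helper names, the input is assumed to be in the `netstat -tn` column layout shown for the working instance, and only IPv4 addresses are handled.

```shell
#!/bin/sh
# is_privileged_port (hypothetical helper): succeed when the port number
# falls in the reserved range 1-1023.
is_privileged_port() {
    [ "$1" -ge 1 ] && [ "$1" -lt 1024 ]
}

# nfs_source_ports (hypothetical helper): read netstat-style lines on stdin
# and print the local port of every connection whose remote port is 2049.
# Assumed columns: proto recv-q send-q local-addr:port remote-addr:port state
nfs_source_ports() {
    awk '$5 ~ /:2049$/ { n = split($4, a, ":"); print a[n] }'
}

# Classify the two source ports quoted in this report.
for port in 56242 978; do
    if is_privileged_port "$port"; then
        echo "$port: privileged"
    else
        echo "$port: non-privileged"
    fi
done
```

On a live instance, `netstat -tn | nfs_source_ports` would list the source ports of current NFS connections. With `noresvport` honoured, none of them should be below 1024.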

tags: added: kernel-bug
tags: added: bionic
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-aws-5.4 (Ubuntu):
status: New → Confirmed
José M.G. Moreira (josemoreira) wrote :

The same behavior has been reproduced in Ubuntu 20, kernel 5.11.0-10-19-aws.
