Comment 0 for bug 1887607

Matthew Ruffell (mruffell) wrote : Cutting and Pasting files from NFS sec=sys to NFS sec=krb5p causes NFS to hang

BugLink: https://bugs.launchpad.net/bugs/

[Impact]

Consider a desktop system with two NFS mounts:
- one that uses the baseline IP-based security, i.e. sec=sys,
- one that uses Kerberos-based security, sec=krb5p.

If you cut a file from the sec=sys mount and paste it into a directory on the sec=krb5p mount using Nautilus, the NFS subsystem locks up, Nautilus hangs, and the file is not moved.

The problem only reproduces with cut and paste. Copy and paste does not trigger it, and neither does using mv in a terminal; you need to use Nautilus.

The issue was introduced in 4.15.0-60-generic by the commit:

commit 594d1644cd59447f4fceb592448d5cd09eb09b5e
Author: Chris Perl <email address hidden>
Date: Mon Dec 17 10:56:38 2018 -0500
Subject: NFS: nfs_compare_mount_options always compare auth flavors.
Link: https://github.com/torvalds/linux/commit/594d1644cd59447f4fceb592448d5cd09eb09b5e

It was backported to 4.15.0-60-generic from upstream -stable, where it landed in 4.4.175, 4.14.99 and 4.19.21. The commit itself does not cause the problem; it removes a check for NFS server security settings, which exposes a broken codepath that the testcase triggers.

Xenial 4.4.0-185-generic is not affected; only Bionic 4.15.0-60-generic onward is.

[Fix]

The fix landed in 5.1-rc1, in the following commit:

commit 02ef04e432babf8fc703104212314e54112ecd2d
Author: Chuck Lever <email address hidden>
Date: Mon Feb 11 11:25:25 2019 -0500
Subject: NFS: Account for XDR pad of buf->pages
Link: https://github.com/torvalds/linux/commit/02ef04e432babf8fc703104212314e54112ecd2d

The above commit largely depends on the commit below, which is also included in the SRU:

commit cf500bac8fd48b57f38ece890235923d4ed5ee91
Author: Chuck Lever <email address hidden>
Date: Mon Feb 11 11:25:20 2019 -0500
Subject: SUNRPC: Introduce rpc_prepare_reply_pages()
Link: https://github.com/torvalds/linux/commit/cf500bac8fd48b57f38ece890235923d4ed5ee91

It appears that some NFS calls return a payload whose length is not a multiple of 4 bytes, but any payload sent over the network must be padded to an exact multiple of 4 bytes. Cutting and pasting from Nautilus seems to trigger one such payload, which is missing a byte, and this causes the NFS subsystem to hang during packet transmission. The fix ensures that all payloads are padded correctly.
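To illustrate the 4-byte rule described above, here is a minimal Python sketch of how XDR (RFC 4506) pads variable-length opaque data. It is not the kernel code and not the fix itself; the function names xdr_pad_len and xdr_encode_opaque are made up for this example.

```python
import struct

XDR_UNIT = 4  # XDR encodes all items in multiples of 4 bytes (RFC 4506)


def xdr_pad_len(length: int) -> int:
    """Return how many zero bytes round `length` up to a multiple of 4.

    Hypothetical helper for illustration only.
    """
    return (XDR_UNIT - length % XDR_UNIT) % XDR_UNIT


def xdr_encode_opaque(data: bytes) -> bytes:
    """Encode variable-length opaque data the way XDR lays it out:
    a 4-byte big-endian length, the data itself, then zero padding.

    Hypothetical helper for illustration only.
    """
    pad = b"\x00" * xdr_pad_len(len(data))
    return struct.pack(">I", len(data)) + data + pad
```

A 7-byte payload, for example, needs one pad byte to reach 8. If the sender and receiver disagree about whether that byte is present, the lengths on each side no longer match, which is the general kind of mismatch the fix accounts for.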

[Testcase]

You will need four machines. The first is a Kerberos KDC. Set up Kerberos and create service principals for the NFS server and for the client.

The second machine is an NFS server with the krb5p share. Add the NFS server's Kerberos keys to the system's keytab, and set up an NFS server that exports a directory with sec=krb5p.

The third machine is a regular NFS server. Export a directory with normal sec=sys security.

The fourth is a desktop machine. Add the client's Kerberos keys to the system's keytab. Mount both NFS shares, and generate some files full of random data. I found that 20MB from /dev/random works well.
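One possible way to generate the random data files is sketched below in Python. This is an assumption-laden example, not the method used in the original testing: it uses os.urandom rather than reading /dev/random directly, and the function name make_random_file and the file names in the usage note are hypothetical.

```python
import os

CHUNK = 1024 * 1024  # write 1 MiB at a time to keep memory use small


def make_random_file(path: str, size_bytes: int) -> None:
    """Create `path` containing `size_bytes` bytes of random data.

    Hypothetical helper; os.urandom is an assumed stand-in for /dev/random.
    """
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(os.urandom(n))
            remaining -= n
```

For example, calling make_random_file("random_1.bin", 20 * 1024 * 1024) in a local directory produces one 20MB file; repeat with different names to get several files to cut and paste.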

Open each NFS share in its own tab in Nautilus. Copy the random data files to the sec=sys NFS share. When the copies finish, cut and paste the files one at a time into the sec=krb5p NFS share. The bug triggers either on the first try or on a subsequent one; usually fewer than 10 tries are needed.

There is a test kernel available in the following PPA:
https://launchpad.net/~mruffell/+archive/ubuntu/sf285439-test

If you install the test kernel, files will cut and paste correctly, and NFS will work as expected.

[Regression Potential]

If a regression were to occur, it would impact users of the NFS subsystem: the changes modify how padding is applied to all NFS packets, so a regression would affect all versions of NFS.

If a regression were to occur, users would need to downgrade their kernel while awaiting a fix.