[regression] NFS client: access problems after updating to kernel 4.4.0-31-generic

Bug #1603719 reported by Jurgen Schellaert on 2016-07-17
This bug affects 12 people
Affects          Status         Importance   Assigned to    Milestone
linux (Ubuntu)   Fix Released   High         Seth Forshee
Xenial           Fix Released   High         Seth Forshee

Bug Description

SRU Justification

Impact: A regression in xenial causes automatic mounting of exported submounts of an nfs mount to fail.

Fix: Change nfs to use sget_userns instead of sget, bypassing a capability check that is not necessary for nfs.

Regression Potential: The behavior after the fix is functionally equivalent to upstream, so this is unlikely to cause regressions.

---

I am denied access to the subfilesystems exported by my nfs server (the top level filesystem itself is unaffected).

The client is reporting that I do not have the necessary permissions. However, all was fine until the day before yesterday.

When I revert my client to 4.4.0-28, everything is in working order again. I assume the permission problem is really a bug in kernel 4.4.0-31.
---
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: jurgen 1743 F.... pulseaudio
CurrentDesktop: Unity
DistroRelease: Ubuntu 16.04
HibernationDevice: RESUME=UUID=569da237-3a37-4fe3-b885-83213aae8b52
InstallationDate: Installed on 2016-04-07 (101 days ago)
InstallationMedia: Ubuntu 16.04 LTS "Xenial Xerus" - Beta amd64 (20160405)
IwConfig:
 lo no wireless extensions.

 enp4s0 no wireless extensions.
MachineType: MICRO-STAR INTERNATIONAL CO.,LTD MS-7514
NonfreeKernelModules: nvidia
Package: linux (not installed)
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-31-generic root=UUID=f6f6bd51-4d4f-4547-b615-90319625d909 ro quiet splash
ProcVersionSignature: Ubuntu 4.4.0-31.50-generic 4.4.13
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-31-generic N/A
 linux-backports-modules-4.4.0-31-generic N/A
 linux-firmware 1.157.2
RfKill:

Tags: xenial
Uname: Linux 4.4.0-31-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 05/28/2008
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: V1.1
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: MS-7514
dmi.board.vendor: MICRO-STAR INTERNATIONAL CO.,LTD
dmi.board.version: 1.0
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: MICRO-STAR INTERNATIONAL CO.,LTD
dmi.chassis.version: 1.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrV1.1:bd05/28/2008:svnMICRO-STARINTERNATIONALCO.,LTD:pnMS-7514:pvr1.0:rvnMICRO-STARINTERNATIONALCO.,LTD:rnMS-7514:rvr1.0:cvnMICRO-STARINTERNATIONALCO.,LTD:ct3:cvr1.0:
dmi.product.name: MS-7514
dmi.product.version: 1.0
dmi.sys.vendor: MICRO-STAR INTERNATIONAL CO.,LTD

description: updated

Jurgen Schellaert, thank you for reporting this and helping to make Ubuntu better. Please run the following command in a terminal, once only; it will automatically gather debugging information:
apport-collect 1603719

When reporting bugs in the future, please use apport by running 'ubuntu-bug' followed by the name of the affected package. You can learn more about this functionality at https://wiki.ubuntu.com/ReportingBugs.

affects: libreoffice (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
importance: Undecided → Low
status: New → Incomplete
Andreas Roth (aroth) wrote :

I experience the same issue:

Kernel on the nfs client:
#> uname -a
Linux ar 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Here's the fstab line for the nfs4 mount:
server:/ /network/srv nfs4 noauto,user,x-systemd.automount,x-systemd.device-timeout=10,x-systemd.idle-timeout=0,rw,sec=sys,relatime,vers=4.2,hard,intr,proto=tcp,timeo=30,retrans=2,port=2049,x-systemd.requires=nfs-client.target 0 0

#> ls /network/srv
music videos

#> ls /network/srv/music
/bin/ls: cannot open directory '/network/srv/music': Operation not permitted

Kernel on the NFS server:
#> uname -a
Linux ossrv 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Here's the /etc/exports file on the NFS server:
/export 192.168.0.0/16(fsid=0,rw,crossmnt,insecure,no_subtree_check,async,anonuid=65534,anongid=65534)
/export/music 192.168.0.0/16(rw,crossmnt,insecure,no_subtree_check,async)
/export/videos 192.168.0.0/16(rw,crossmnt,insecure,no_subtree_check,async)

When I boot kernel version 4.4.0-28-generic (or earlier) on the NFS client, everything works fine.
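The check above can be scripted. The following is a minimal sketch, not part of the original report: check_submounts is a hypothetical helper that tries to list each immediate subdirectory of a mount point as the current user and flags the ones that cannot be opened.

```shell
#!/bin/sh
# check_submounts (hypothetical helper): try to list every immediate
# subdirectory of the given mount point and report which ones fail.
# Returns 0 if all subdirectories could be listed, 1 otherwise.
check_submounts() {
    top=$1
    failed=0
    for dir in "$top"/*/; do
        # Skip the unexpanded glob when there are no subdirectories.
        [ -d "$dir" ] || continue
        if ls "$dir" >/dev/null 2>&1; then
            echo "ok      $dir"
        else
            echo "DENIED  $dir"
            failed=1
        fi
    done
    return "$failed"
}
```

Run it as the same unprivileged user that sees the error, for example `check_submounts /network/srv` for the mount point in the fstab line above. Comparing the output under 4.4.0-28 and 4.4.0-31 should show whether a client is hit by the same problem.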

apport information

tags: added: apport-collected
description: updated


Same problem. Reverting to 4.4.0-28 fixed it.

Sean (seanshivak) wrote :

Same issue on this kernel. Can't browse subdirectories that are mounted HDDs after the kernel update. Downgrading fixes the issue.

E.g. I can browse /mnt but can't browse /mnt/hdd1, /mnt/hdd2, etc.

If I'm root I can browse the subdirectories, just not as a standard user. I verified the permissions, and it works fine after a kernel downgrade. After the root user has browsed an NFS subdirectory, a standard user can browse it too.

I'm also able to browse the subdirectories if I mount each HDD individually, e.g. if I mount /mnt/hdd1 on the client instead of mounting /mnt.

Jurgen Schellaert, in order to allow additional upstream developers to examine the issue, at your earliest convenience, could you please test the latest upstream kernel available from http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D ? Please keep in mind the following:
1) The kernel to test is on the very top line of the page (not the daily folder).
2) The release names are irrelevant.
3) The folder time stamps aren't indicative of when the kernel actually was released upstream.
4) Install instructions are available at https://wiki.ubuntu.com/Kernel/MainlineBuilds .

If testing on your main install would be inconvenient, one may:
1) Install Ubuntu to a different partition and then test this there.
2) Backup, or clone the primary install.

If the latest kernel did not allow you to test for the issue (e.g. you couldn't boot into the OS), please say so in a comment on your report, and continue with the next most recent kernel version until you can test for the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this issue is fixed in the mainline kernel, please add the following tags by clicking on the yellow circle with a black pencil icon, next to the word Tags, located at the bottom of the report description:
kernel-fixed-upstream
kernel-fixed-upstream-X.Y-rcZ

Where X and Y are the first two numbers of the kernel version, and Z is the release candidate number, if there is one.

If the mainline kernel does not fix the issue, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-X.Y-rcZ

Please note, a failure to install the kernel does not fit the criteria of kernel-bug-exists-upstream.
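
The tag naming rule above can be sketched in shell. This is an illustration only: upstream_tag is a hypothetical helper, not a Launchpad or Ubuntu tool, and it assumes the release string looks like the output of uname -r on a mainline build (for example 4.7.0-040700rc7-generic).

```shell
#!/bin/sh
# upstream_tag (hypothetical helper): build a tag of the form
# PREFIX-X.Y or PREFIX-X.Y-rcZ from a kernel release string.
upstream_tag() {
    prefix=$1      # kernel-fixed-upstream or kernel-bug-exists-upstream
    release=$2     # e.g. the output of `uname -r`
    # X.Y: the first two dot-separated numbers of the version.
    xy=$(printf '%s\n' "$release" | sed -n 's/^\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p')
    # Z: the release candidate number, if the string contains "rcN".
    rc=$(printf '%s\n' "$release" | sed -n 's/.*rc\([0-9][0-9]*\).*/\1/p')
    if [ -z "$xy" ]; then
        echo "unrecognised kernel release: $release" >&2
        return 1
    fi
    if [ -n "$rc" ]; then
        printf '%s-%s-rc%s\n' "$prefix" "$xy" "$rc"
    else
        printf '%s-%s\n' "$prefix" "$xy"
    fi
}
```

For example, `upstream_tag kernel-fixed-upstream 4.7.0-040700rc7-generic` prints kernel-fixed-upstream-4.7-rc7, with the hyphen before "rc" that the tag format requires.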

Also, you don't need to apport-collect further unless specifically requested to do so.

Once testing of the latest upstream kernel is complete, please mark this report Status Confirmed. Please let us know your results.

Thank you for your understanding.

Changed in linux (Ubuntu):
status: Incomplete → Opinion
status: Opinion → Incomplete

Thanks for your helpful instructions.

The most recent mainline kernel (= 4.7.0-040700rc7-generic) appears to be unaffected.

I can now enter the exported subfilesystems again with the expected permissions.

tags: added: kernel-fixed-upstream kernel-fixed-upstream-4.7rc7
tags: added: kernel-fixed-upstream-4.7-rc7
removed: kernel-fixed-upstream-4.7rc7
Changed in linux (Ubuntu):
status: Incomplete → Confirmed

Same problem. Reverted to 4.4.0-28. Didn't try mainline as -28 was sufficient.

Allan (aileanr) wrote :

I am having the same problem here as well since 4.4.0-31. Have not tried mainline.

Bryan Quigley (bryanquigley) wrote :

I also tried Upstream v4.4.15 (35467dc7630af60abacc330f64029d081f160530) which does not have the issue. Going to bisect shortly.

summary: - NFS client: access problems after updating to kernel 4.4.0-31-generic
+ [regression] NFS client: access problems after updating to kernel
+ 4.4.0-31-generic
Andy Whitcroft (apw) on 2016-07-21
tags: added: regression-updates
Changed in linux (Ubuntu):
importance: Low → High
tags: added: kernel-key
Seth Forshee (sforshee) wrote :

Can you please test the kernel at the link below and see if it fixes your issue? Thanks!

http://people.canonical.com/~sforshee/lp1603719/

Changed in linux (Ubuntu):
assignee: nobody → Seth Forshee (sforshee)
status: Confirmed → Incomplete
Bryan Quigley (bryanquigley) wrote :

Thanks Seth. That kernel fixes it for me.

Seth Forshee (sforshee) on 2016-07-22
Changed in linux (Ubuntu):
status: Incomplete → In Progress
Changed in linux (Ubuntu Xenial):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → High
status: New → In Progress
Seth Forshee (sforshee) wrote :

Marking invalid for the development task as that kernel is not affected.

Changed in linux (Ubuntu):
status: In Progress → Invalid
Seth Forshee (sforshee) on 2016-07-22
description: updated
Seth Forshee (sforshee) on 2016-07-22
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-33.52

---------------
linux (4.4.0-33.52) xenial; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1605709

  * [regression] NFS client: access problems after updating to kernel
    4.4.0-31-generic (LP: #1603719)
    - SAUCE: (namespace) Bypass sget() capability check for nfs

linux (4.4.0-32.51) xenial; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1604443

  * thinkpad yoga 260 wacom touchscreen not working (LP: #1603975)
    - HID: wacom: break out parsing of device and registering of input
    - HID: wacom: Initialize hid_data.inputmode to -1
    - HID: wacom: Support switching from vendor-defined device mode on G9 and G11

  * changelog: add CVEs as first class citizens (LP: #1604344)
    - use CVE numbers in changelog

  * [Xenial] Include Huawei PCIe SSD hio kernel driver (LP: #1603483)
    - SAUCE: import Huawei ES3000_V2 (2.1.0.23)
    - SAUCE: hio: bio_endio() no longer takes errors arg
    - SAUCE: hio: blk_queue make_request_fn now returns a blk_qc_t
    - SAUCE: hio: use alloc_cpumask_var to avoid -Wframe-larger-than
    - SAUCE: hio: fix mask maybe-uninitialized warning
    - [config] enable CONFIG_HIO (Huawei ES3000_V2 PCIe SSD driver)
    - SAUCE: hio: Makefile and Kconfig

  * CVE-2016-5243 (LP: #1589036)
    - tipc: fix an infoleak in tipc_nl_compat_link_dump
    - tipc: fix nl compat regression for link statistics

  * CVE-2016-4470
    - KEYS: potential uninitialized variable

  * integer overflow in xt_alloc_table_info (LP: #1555353)
    - netfilter: x_tables: check for size overflow

  * CVE-2016-3135:
    - Revert "UBUNTU: SAUCE: (noup) netfilter: x_tables: check for size overflow"

  * CVE-2016-4440 (LP: #1584192)
    - kvm:vmx: more complete state update on APICv on/off

  * the system hangs in the dma driver when reboot or shutdown on a baytrail-m
    laptop (LP: #1602579)
    - dmaengine: dw: platform: power on device on shutdown
    - ACPI / LPSS: override power state for LPSS DMA device

  * Add proper palm detection support for MS Precision Touchpad (LP: #1593124)
    - Revert "HID: multitouch: enable palm rejection if device implements
      confidence usage"
    - HID: multitouch: enable palm rejection for Windows Precision Touchpad

  * Add support for Intel 8265 Bluetooth ([8087:0A2B]) (LP: #1599068)
    - Bluetooth: Add support for Intel Bluetooth device 8265 [8087:0a2b]

  * CVE-2016-4794 (LP: #1581871)
    - percpu: fix synchronization between chunk->map_extend_work and chunk
      destruction
    - percpu: fix synchronization between synchronous map extension and chunk
      destruction

  * Xenial update to v4.4.15 stable release (LP: #1601952)
    - net_sched: fix pfifo_head_drop behavior vs backlog
    - net: Don't forget pr_fmt on net_dbg_ratelimited for CONFIG_DYNAMIC_DEBUG
    - sit: correct IP protocol used in ipip6_err
    - esp: Fix ESN generation under UDP encapsulation
    - netem: fix a use after free
    - ipmr/ip6mr: Initialize the last assert time of mfc entries.
    - Bridge: Fix ipv6 mc snooping if bridge has no ipv6 address
    - sock_diag: do not broadcast raw socket destruction
    - bpf, perf...

Changed in linux (Ubuntu):
status: Invalid → Fix Released
Seth Forshee (sforshee) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Bryan Quigley (bryanquigley) wrote :

Verified working with both -33 and -34 kernels

tags: added: verification-done-xenial
removed: verification-needed-xenial
Allan (aileanr) wrote :

This fixed it for me (although I have no idea how to run verification nor change tags).

I confirm that the issue is solved by the kernel in proposed.

Also confirm -34 solves issue.
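
To check whether a given client is running a kernel in the affected window, a sketch along these lines may help. It assumes, based only on this report, that the regression first appeared in 4.4.0-31 and that 4.4.0-33 is the first Xenial kernel carrying the fix; is_affected_kernel is a hypothetical helper name.

```shell
#!/bin/sh
# is_affected_kernel (hypothetical helper): succeed if the release string
# is a 4.4.0 Ubuntu kernel with ABI number 31 or 32, i.e. after the
# regression was reported and before the fix in 4.4.0-33 (assumption
# taken from this bug report, not from an official advisory).
is_affected_kernel() {
    release=$1
    case $release in
        4.4.0-*) ;;
        *) return 1 ;;
    esac
    abi=${release#4.4.0-}   # drop the "4.4.0-" prefix
    abi=${abi%%-*}          # drop the flavour suffix, e.g. "-generic"
    case $abi in
        ''|*[!0-9]*) return 1 ;;   # not a plain number
    esac
    [ "$abi" -ge 31 ] && [ "$abi" -lt 33 ]
}
```

A typical call would be `is_affected_kernel "$(uname -r)" && echo "kernel predates the fix in 4.4.0-33"`. The check only looks at the version string; it does not test the NFS mount itself.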

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-34.53

---------------
linux (4.4.0-34.53) xenial; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1606960

  * [APL][SAUCE] Slow system response time due to a monitor bug (LP: #1606147)
    - x86/cpu/intel: Introduce macros for Intel family numbers
    - SAUCE: x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont
      based CPUs


Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Thomas Fili (tfili69) wrote :

Very strange...
For us the problem is not solved: all kernel versions for Trusty and Xenial since 4.4.0-31-generic are affected, including 4.4.0-54.
Following this report and Bug #1604396, I tested some mainline kernels.
As of today's post, the latest 4.4 mainline kernel (4.4.37-040437-generic) works for us without problems,
but the 4.8 and 4.9 kernels still give the access denied error for NFS subshares.

Seth Forshee (sforshee) wrote :

@Thomas - Please file a new bug containing as much detail as possible about your NFS setup. It's best if you do this by running 'ubuntu-bug linux' after getting the access denied error so that any relevant information from the logs will be included. You can assign the new bug to me. Thanks!
