automount fails on resume after suspend if file open for writing

Bug #1544462 reported by Seved Torstendahl on 2016-02-11
This bug affects 1 person
Affects: linux (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

For a couple of years I have usually not shut down my PC after using it; I just leave it and let it suspend.
(Of course I have also shut down the system completely every now and then.)
This has worked well, but recently, with the new 4.2.0-* kernels, I get problems.

Further investigation shows that automount fails when:
1. a file is opened for writing on an NFS-mounted volume that autofs mounts,
2. autofs unmounts the NFS directory after five minutes of inactivity, AND
3. the system is automatically suspended after ten minutes of inactivity.

After resuming, it is impossible to access the volume on the server.

To reproduce:
    cat > /path/to/file/on/automounted/nfs-dir/file.txt
    (type some text, leaving cat running so the file stays open)
    (wait for the system to suspend)

    Then wake the system and try to access the automounted volume.
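
The manual steps above could also be scripted. The following is only a sketch: the function name write_across_pause and the default of sleep 900 (chosen to outlast the ten-minute idle suspend) are assumptions for illustration, not something from the report. It holds the file open on file descriptor 3, writes once, runs a command during which the machine is expected to suspend and be woken again, and then writes a second time.

```shell
# Hypothetical helper: hold a file open for writing while another command runs,
# writing one line before it and one line after it.
# $1 = file to write (a file on the automounted NFS directory when reproducing)
# remaining arguments = command to run in between
#   (default: sleep 900, an assumed value longer than the ten-minute suspend timeout)
write_across_pause() {
    target=$1
    shift
    [ "$#" -gt 0 ] || set -- sleep 900
    {
        echo "written before suspend" >&3
        "$@"                              # the machine is expected to suspend and be woken here
        echo "written after resume" >&3   # per the report, the hang shows up on a write like this
    } 3>"$target"
}
```

For example: write_across_pause /path/to/file/on/automounted/nfs-dir/file.txt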

ProblemType: Bug
DistroRelease: Ubuntu 15.10
Package: linux-image-4.2.0-27-generic 4.2.0-27.32
ProcVersionSignature: Ubuntu 4.2.0-27.32-generic 4.2.8-ckt1
Uname: Linux 4.2.0-27-generic i686
ApportVersion: 2.19.1-0ubuntu5
Architecture: i386
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: seved 2472 F.... pulseaudio
CurrentDesktop: Unity
Date: Thu Feb 11 09:35:17 2016
HibernationDevice: RESUME=UUID=38682e03-657e-46a8-b728-71bb6b19ffd3
InstallationDate: Installed on 2010-11-12 (1916 days ago)
InstallationMedia: Ubuntu 10.04.1 LTS "Lucid Lynx" - Release i386 (20100816.1)
MachineType: Gigabyte Technology Co., Ltd. G41M-ES2L
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.2.0-27-generic root=UUID=67935e67-5ba7-4442-8401-b0134d6ca21f ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-4.2.0-27-generic N/A
 linux-backports-modules-4.2.0-27-generic N/A
 linux-firmware 1.149.3
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/04/2009
dmi.bios.vendor: Award Software International, Inc.
dmi.bios.version: F6
dmi.board.name: G41M-ES2L
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.type: 3
dmi.chassis.vendor: Gigabyte Technology Co., Ltd.
dmi.modalias: dmi:bvnAwardSoftwareInternational,Inc.:bvrF6:bd11/04/2009:svnGigabyteTechnologyCo.,Ltd.:pnG41M-ES2L:pvr:rvnGigabyteTechnologyCo.,Ltd.:rnG41M-ES2L:rvrx.x:cvnGigabyteTechnologyCo.,Ltd.:ct3:cvr:
dmi.product.name: G41M-ES2L
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Seved Torstendahl (sevedt) wrote :
description: updated

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Seved Torstendahl (sevedt) wrote :

Actually the system works OK until the program that is writing to the file reaches a point where data must actually be written out. For example, in the reproduction above, continue typing after resume: once the buffer fills and has to be flushed to the file, bang! The terminal hangs, and when I try to examine the NFS volume with the file manager, that hangs too.

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.5 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.5-rc3-wily/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Seved Torstendahl (sevedt) wrote :

I used the "resume-trace" debugging procedure for finding buggy drivers, from the wiki page DebuggingKernelSuspend,
and made the system suspend with
    sudo sh -c "sleep 720 && sync && echo 1 > /sys/power/pm_trace && pm-suspend"
i.e. wait 12 minutes before suspending. So dmesg.log4 may contain trace info from the suspend/resume.
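
To pick any pm_trace output out of a saved log such as dmesg.log4, a small filter may help. This is a sketch under assumptions: the function name pm_trace_lines is made up, and the patterns "Magic number:" and "hash matches" are the strings the kernel's pm_trace code is understood to print on the boot following a failed resume. If the resume succeeded, there may be nothing to find.

```shell
# Hypothetical helper: keep only pm_trace-related lines from a dmesg log.
# With file arguments it reads those files; without arguments it reads stdin.
pm_trace_lines() {
    grep -E 'Magic number:|hash matches' "$@"
}
```

For example: pm_trace_lines dmesg.log4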

My earlier assumption that autofs unmounts the volume (point 2 in the prerequisites) seems to be incorrect.

There are four conditions:
1. a file is opened for writing on an NFS-mounted volume that autofs mounts, AND
2. the system is suspended, AND
3. the system is resumed, AND
4. a write operation to the still-open file is performed
==> it is impossible to access the volume on the server
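
Once the four conditions are met, probing the volume directly can hang the terminal as well, so a probe with a time limit is safer. The sketch below assumes GNU coreutils timeout is installed; the function name mount_responds and the 10-second default are illustrative choices. Whether a process stuck on a hung NFS mount can always be killed this way is not guaranteed.

```shell
# Hypothetical helper: succeed only if the path can be listed within a time limit.
# $1 = path to probe, $2 = time limit in seconds (default 10, an assumed value)
mount_responds() {
    # -k 2: send SIGKILL two seconds after the initial SIGTERM if ls is still running
    timeout -k 2 "${2:-10}" ls -- "$1" >/dev/null 2>&1
}
```

For example: mount_responds /path/to/automounted/nfs-dir || echo "NFS volume is not responding"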

I have just tested the new kernel and it works ok so far!

I'll update the bug reports now

On 2016-02-11 at 18:34, Joseph Salisbury wrote:
> Would it be possible for you to test the latest upstream kernel? Refer
> to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest
> v4.5 kernel[0].
>
> If this bug is fixed in the mainline kernel, please add the following
> tag 'kernel-fixed-upstream'.
>
> If the mainline kernel does not fix this bug, please add the tag:
> 'kernel-bug-exists-upstream'.
>
> Once testing of the upstream kernel is complete, please mark this bug as
> "Confirmed".
>
>
> Thanks in advance.
>
> [0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.5-rc3-wily/
>
>
> ** Changed in: linux (Ubuntu)
> Importance: Undecided => Medium
>
> ** Changed in: linux (Ubuntu)
> Status: Confirmed => Incomplete
>

Seved Torstendahl (sevedt) wrote :

I have just tested the new kernel and it works ok so far!

But now I don't understand how to add a tag. On this page showing the bug activity I can't find the tags or a way to add one.

I want to add the tag 'kernel-fixed-upstream' and mark the bug as "Confirmed".

Seved Torstendahl (sevedt) wrote :

I finally found the place for tags, so the bug is now marked as it should be.

tags: added: kernel-fixed-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-fixed-upstream-4.5-rc3 needs-reverse-bisect

Christopher M. Penalver wrote :

Seved Torstendahl, the next step is to fully reverse commit bisect from kernel 4.2 to 4.5-rc3 in order to identify the last bad commit, followed immediately by the first good one. Once this good commit has been identified, it may be reviewed for backporting. Could you please do this following https://wiki.ubuntu.com/Kernel/KernelBisection#How_do_I_reverse_bisect_the_upstream_kernel.3F ?

Please note, finding adjacent kernel versions is not fully commit bisecting.

After the fix commit (not kernel version) has been identified, then please mark this report Status Confirmed.

Thank you for your understanding.

Helpful bug reporting tips:
https://wiki.ubuntu.com/ReportingBugs
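
As an illustration of what a reverse bisect does, here is a self-contained sketch on a throwaway repository, not on the kernel tree. It assumes Git 2.7 or later, which lets the terms be renamed with --term-old and --term-new; with an older Git the meanings of good and bad have to be swapped by hand, and the result is then reported as the "first bad commit". The function name reverse_bisect_demo, the toy commits, and the file called state are all made up for the example.

```shell
# Hypothetical demo: find the first commit where a problem is FIXED.
# Builds a toy repository of 8 commits in which commits 1-5 are "broken"
# and commits 6-8 are "fixed", then bisects with renamed terms.
reverse_bisect_demo() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        git config user.email demo@example.invalid
        git config user.name demo
        for i in 1 2 3 4 5 6 7 8; do
            if [ "$i" -ge 6 ]; then echo fixed > state; else echo broken > state; fi
            echo "$i" > rev
            git add state rev
            git commit -q -m "commit $i"
        done
        # old = broken (the problem is present), new = fixed (the problem is gone)
        git bisect start --term-old=broken --term-new=fixed >/dev/null 2>&1
        git bisect fixed HEAD >/dev/null 2>&1
        git bisect broken "$(git rev-list --max-parents=0 HEAD)" >/dev/null 2>&1
        # For bisect run, exit 0 marks the old term (broken), exit 1 the new term (fixed)
        git bisect run sh -c 'grep -q broken state' >/dev/null 2>&1
        # The ref named after the new term points at the first fixed commit
        git log -1 --format=%s refs/bisect/fixed
    )
    status=$?
    rm -rf "$repo"
    return "$status"
}
```

Running reverse_bisect_demo prints "commit 6", the first commit in the toy history where the state is fixed. On the real kernel tree each test step is a build, boot, and suspend/resume cycle rather than a grep.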

tags: added: bios-outdated-f9
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Seved Torstendahl (sevedt) wrote :

Happy to be able to help! It will take me a couple of days, I guess, but I'll come back when I have found the breakpoint.

On 2016-02-12 at 22:26, Christopher M. Penalver wrote:
> Seved Torstendahl, the next step is to fully reverse commit bisect from
> kernel 4.2 to 4.5-rc3 in order to identify the last bad commit, followed
> immediately by the first good one. Once this good commit has been
> identified, it may be reviewed for backporting. Could you please do this
> following
> https://wiki.ubuntu.com/Kernel/KernelBisection#How_do_I_reverse_bisect_the_upstream_kernel.3F
> ?
>
> Please note, finding adjacent kernel versions is not fully commit
> bisecting.
>
> After the fix commit (not kernel version) has been identified, then
> please mark this report Status Confirmed.
>
> Thank you for your understanding.
>
> Helpful bug reporting tips:
> https://wiki.ubuntu.com/ReportingBugs
>
> ** Tags added: bios-outdated-f9
>
> ** Changed in: linux (Ubuntu)
> Status: Confirmed => Incomplete
>

Seved Torstendahl (sevedt) wrote :

Now I think I have found the commit that fixes the issue, by reverse bisecting first the kernel versions and then the commits.
Since it is a reverse bisection, it says "first bad commit" instead of "first good commit":

# first bad commit: [e92c1e0d40c50472f80820bd829645ce9fefd6c1] NFSv4: Fix a nograce recovery hang

See also the attached file containing the output from 'git bisect good' and 'git bisect log'.
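
To check whether a given kernel tree already contains the identified commit, something like the following could be used. It is a sketch: the function name contains_commit is an assumption, and the check only works for trees that include the original commit in their history. A cherry-picked backport gets a new commit id and would not be detected this way.

```shell
# Hypothetical helper: does the history of a ref contain a given commit?
# $1 = path to a git checkout, $2 = commit id, $3 = branch or tag (default HEAD)
contains_commit() {
    git -C "$1" merge-base --is-ancestor "$2" "${3:-HEAD}"
}
```

For example, with a kernel checkout in a directory of your choice (the path here is a placeholder): contains_commit /path/to/linux e92c1e0d40c50472f80820bd829645ce9fefd6c1 v4.5-rc3 && echo "fix present"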

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: cherry-pick reverse-bisect-done
removed: needs-reverse-bisect
Changed in linux (Ubuntu):
status: Confirmed → Triaged