ubuntu 20.04 nfs page fault since Kernel 5.3 (?)

Bug #1876567 reported by Michael
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
linux-oem-5.6 (Ubuntu) | Expired | Undecided | Unassigned |

Bug Description

This is a follow-up to and extension of https://bugs.launchpad.net/ubuntu/disco/+source/linux/+bug/1858832

== Original Bug Report (for the most part, still holds true) ==

RELEASE=19.3
CODENAME=tricia
EDITION="Cinnamon"
DESCRIPTION="Linux Mint 19.3 Tricia"
DESKTOP=Gnome
TOOLKIT=GTK
NEW_FEATURES_URL=https://www.linuxmint.com/rel_tricia_cinnamon_whatsnew.php
RELEASE_NOTES_URL=https://www.linuxmint.com/rel_tricia_cinnamon.php
USER_GUIDE_URL=https://www.linuxmint.com/documentation.php
GRUB_TITLE=Linux Mint 19.3 Cinnamon

My home directory is mounted over NFSv4 with sec=krb5i from a server on the local network.
When I stress the mounted directory or its subdirectories (sometimes starting Firefox or Thunderbird is enough, compiling triggers it almost every time, and sometimes the login itself does), the kernel eventually prints a stack trace. The affected process then hangs, and
accessing the mounted directory from another process (for example with ls) easily produces further, similar stack traces and leaves that process stuck as well.
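
For anyone who wants to reproduce this without a full compile job, here is a minimal sketch of a file-based stress test. It is not the test used in this report. The function name, file count and file size are made up for illustration, and a read directly after a write may be served from the client's page cache rather than from the server, so a clean run proves little on its own.

```python
import hashlib
import os
import tempfile


def stress_directory(root, files=200, size=64 * 1024):
    """Write `files` random files under `root`, read them back, and
    return the names whose content no longer matches what was written."""
    expected = {}
    for i in range(files):
        name = f"stress-{i:05d}.bin"
        data = os.urandom(size)
        with open(os.path.join(root, name), "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # ask the kernel to push the data out
        expected[name] = hashlib.sha256(data).hexdigest()

    mismatches = []
    for name, digest in expected.items():
        path = os.path.join(root, name)
        with open(path, "rb") as fh:
            if hashlib.sha256(fh.read()).hexdigest() != digest:
                mismatches.append(name)
        os.remove(path)  # clean up so repeated runs start empty
    return mismatches


if __name__ == "__main__":
    # Replace the temporary directory with a scratch directory on the
    # NFS mount (for example somewhere below /share) for a real run.
    with tempfile.TemporaryDirectory() as scratch:
        print(stress_directory(scratch, files=20, size=4096))
```

On an affected kernel the interesting outcome is probably not a checksum mismatch but the script hanging or failing with an I/O error, together with new messages in dmesg.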

Currently I am running an AMD 3950X on an ASUS Crosshair VII Hero Wifi (X470 chipset).

I installed Ubuntu 20.04 LTS Desktop to check if the newer kernels (5.4 + 5.6) work without issues, but sadly, they **do not**.

Using a small compile stress test, I found that the following kernels seem to run fine:
 * 4.15.0-69
 * 4.15.0-70
 * 4.15.0-72
 * 5.0.0-32
 * 5.0.0-47 (current daily driver, runs without a hassle, max test length >7d - I am writing this bug report on it)
 * 5.4.0-32-generic #36-Ubuntu from the proposal https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/19306320
 * 5.6.0-1010-oem #10-Ubuntu SMP
 * 5.7.0-050700rc6-generic #202005172030

The following kernels, however, are not stable:
 * 5.3.0-51 (Linux Mint 19.3 HWE)
 * 5.4.0-28 (Ubuntu 20.04)
 * 5.4.0-31-generic #35-Ubuntu
 * 5.6.0-1008 (Ubuntu 20.04 OEM)
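
To narrow down where the fix landed, the results from the two lists can be grouped by kernel series. The following is only a sketch: the version strings are shortened to series and ABI number, and the helper names are made up for this example.

```python
import re
from collections import defaultdict

# Results copied from the two lists; True means the stress test passed.
RESULTS = {
    "4.15.0-69": True,
    "4.15.0-70": True,
    "4.15.0-72": True,
    "5.0.0-32": True,
    "5.0.0-47": True,
    "5.3.0-51": False,
    "5.4.0-28": False,
    "5.4.0-31": False,
    "5.4.0-32": True,
    "5.6.0-1008": False,
    "5.6.0-1010": True,
    "5.7.0-050700rc6": True,
}


def parse(version):
    """Split '5.4.0-31' into (series, abi) = ((5, 4, 0), 31)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", version)
    major, minor, patch, abi = (int(g) for g in match.groups())
    return (major, minor, patch), abi


def transitions(results):
    """Return (bad, good) pairs of neighbouring builds within one series."""
    by_series = defaultdict(list)
    for version, ok in results.items():
        series, abi = parse(version)
        by_series[series].append((abi, version, ok))
    pairs = []
    for series in sorted(by_series):
        builds = sorted(by_series[series])
        for (_, older, older_ok), (_, newer, newer_ok) in zip(builds, builds[1:]):
            if not older_ok and newer_ok:
                pairs.append((older, newer))
    return pairs


print(transitions(RESULTS))
# [('5.4.0-31', '5.4.0-32'), ('5.6.0-1008', '5.6.0-1010')]
```

Going by these results alone, whatever fixes the problem would sit between those two pairs of builds. 5.6.0-1009 was not tested, so the 5.6 window is two builds wide.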

$ lspci | grep -i ether
06:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)

$ mount | grep filer
filer:/ on /share type nfs4 (rw,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=krb5i,clientaddr=192.168.3.55,local_lock=none,addr=192.168.2.33)
filer:/home/michael on /share/home/michael type nfs4 (rw,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=krb5i,clientaddr=192.168.3.55,local_lock=none,addr=192.168.2.33)

$ cat /etc/fstab | grep -i filer
filer:/ /share/ nfs4 nfsvers=4,sec=krb5i,rw,x-systemd.automount,soft,intr,tcp,noatime 0 0
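
When comparing setups between reporters, it helps to look at the options the kernel actually negotiated rather than the ones requested in fstab. This is a small sketch, with a made-up helper name, that splits the option string from the mount output into a dictionary:

```python
def parse_mount_options(options):
    """Turn 'rw,vers=4.0,sec=krb5i' into {'rw': True, 'vers': '4.0', ...}."""
    parsed = {}
    for item in options.split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True  # flags such as 'soft' have no value
    return parsed


# Option string copied from the `mount | grep filer` output.
ACTIVE = ("rw,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,"
          "proto=tcp,timeo=600,retrans=2,sec=krb5i,clientaddr=192.168.3.55,"
          "local_lock=none,addr=192.168.2.33")

opts = parse_mount_options(ACTIVE)
for key in ("vers", "sec", "proto", "soft", "hard", "rsize", "wsize"):
    print(f"{key:6} {opts.get(key, 'not set')}")
```

For this mount the output shows vers=4.0, sec=krb5i and a soft mount. The intr option requested in fstab does not appear among the active options.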
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: rootmw 2299 F.... pulseaudio
 /dev/snd/controlC2: rootmw 2299 F.... pulseaudio
 /dev/snd/controlC1: rootmw 2299 F.... pulseaudio
 /dev/snd/controlC3: rootmw 2299 F.... pulseaudio
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2020-05-02 (0 days ago)
InstallationMedia: Ubuntu 20.04 LTS "Focal Fossa" - Release amd64 (20200423)
IwConfig:
 lo no wireless extensions.

 enp6s0 no wireless extensions.
MachineType: System manufacturer System Product Name
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.6.0-1008-oem root=/dev/mapper/vgnvme-ubuntu2004 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.6.0-1008.8-oem 5.6.4
RelatedPackageVersions:
 linux-restricted-modules-5.6.0-1008-oem N/A
 linux-backports-modules-5.6.0-1008-oem N/A
 linux-firmware 1.187
RfKill:

Tags: focal
Uname: Linux 5.6.0-1008-oem x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin lxd plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 12/16/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 3004
dmi.board.asset.tag: Default string
dmi.board.name: ROG CROSSHAIR VII HERO (WI-FI)
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr3004:bd12/16/2019:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnROGCROSSHAIRVIIHERO(WI-FI):rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: rootmw 2314 F.... pulseaudio
 /dev/snd/controlC1: rootmw 2314 F.... pulseaudio
 /dev/snd/controlC3: rootmw 2314 F.... pulseaudio
 /dev/snd/controlC2: rootmw 2314 F.... pulseaudio
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2020-05-02 (0 days ago)
InstallationMedia: Ubuntu 20.04 LTS "Focal Fossa" - Release amd64 (20200423)
IwConfig:
 lo no wireless extensions.

 enp6s0 no wireless extensions.
MachineType: System manufacturer System Product Name
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.6.0-1008-oem root=/dev/mapper/vgnvme-ubuntu2004 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.6.0-1008.8-oem 5.6.4
RelatedPackageVersions:
 linux-restricted-modules-5.6.0-1008-oem N/A
 linux-backports-modules-5.6.0-1008-oem N/A
 linux-firmware 1.187
RfKill:

Tags: focal
Uname: Linux 5.6.0-1008-oem x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin lxd plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 12/16/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 3004
dmi.board.asset.tag: Default string
dmi.board.name: ROG CROSSHAIR VII HERO (WI-FI)
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr3004:bd12/16/2019:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnROGCROSSHAIRVIIHERO(WI-FI):rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: rootmw 2062 F.... pulseaudio
 /dev/snd/controlC2: rootmw 2062 F.... pulseaudio
 /dev/snd/controlC1: rootmw 2062 F.... pulseaudio
 /dev/snd/controlC3: rootmw 2062 F.... pulseaudio
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2020-05-02 (0 days ago)
InstallationMedia: Ubuntu 20.04 LTS "Focal Fossa" - Release amd64 (20200423)
IwConfig:
 enp6s0 no wireless extensions.

 lo no wireless extensions.
MachineType: System manufacturer System Product Name
NonfreeKernelModules: nvidia_modeset nvidia
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-28-generic root=/dev/mapper/vgnvme-ubuntu2004 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.4.0-28.32-generic 5.4.30
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-28-generic N/A
 linux-backports-modules-5.4.0-28-generic N/A
 linux-firmware 1.187
RfKill:

Tags: focal
Uname: Linux 5.4.0-28-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin lxd plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 12/16/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 3004
dmi.board.asset.tag: Default string
dmi.board.name: ROG CROSSHAIR VII HERO (WI-FI)
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr3004:bd12/16/2019:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnROGCROSSHAIRVIIHERO(WI-FI):rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Michael (miwait00) wrote :
Michael (miwait00) wrote :
description: updated
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1876567

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: focal
Michael (miwait00) wrote :

Same with kernel 5.6.0-1008 OEM on Ubuntu 20.04 LTS

tags: added: apport-collected
description: updated
Michael (miwait00) wrote : apport information (14 attachments, one comment each)

AlsaInfo.txt, CurrentDmesg.txt, Lspci.txt, Lspci-vt.txt, Lsusb.txt, Lsusb-t.txt, Lsusb-v.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcInterrupts.txt, ProcModules.txt, PulseList.txt, UdevDb.txt, WifiSyslog.txt

Michael (miwait00) wrote :

this program to access Launchpad on your behalf.
Waiting to hear from Launchpad about your decision...
dpkg-query: Kein Paket gefunden, das auf linux passt

(In English: "no packages found matching linux")

Michael (miwait00) wrote :

Well, maybe a kernel instance that has already printed the stack traces has some... issues:

E: Sub-process /usr/bin/dpkg received a segmentation fault.

description: updated
Michael (miwait00) wrote : apport information (15 attachments, one comment each)

AlsaInfo.txt, CRDA.txt, CurrentDmesg.txt, Lspci.txt, Lspci-vt.txt, Lsusb.txt, Lsusb-t.txt, Lsusb-v.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcInterrupts.txt, ProcModules.txt, PulseList.txt, UdevDb.txt, WifiSyslog.txt

description: updated
Michael (miwait00) wrote : apport information (15 attachments, one comment each)

AlsaInfo.txt, CRDA.txt, CurrentDmesg.txt, Lspci.txt, Lspci-vt.txt, Lsusb.txt, Lsusb-t.txt, Lsusb-v.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcInterrupts.txt, ProcModules.txt, PulseList.txt, UdevDb.txt, WifiSyslog.txt

Michael (miwait00) wrote :

I ran apport once on 5.4.0-28 and once on 5.6.0-1008, each time after a fresh boot (which then ran without printing errors).

description: updated
Michael (miwait00)
description: updated
Michael (miwait00)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Michael (miwait00)
affects: linux (Ubuntu) → linux-hwe (Ubuntu)
affects: linux-hwe (Ubuntu) → linux-oem-5.6 (Ubuntu)
Kai-Heng Feng (kaihengfeng) wrote :
Kai-Heng Feng (kaihengfeng) wrote :

If the latest mainline kernel doesn't work either, we need to do a kernel bisection...
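
For readers unfamiliar with the term: bisection is a binary search over an ordered series of builds (or commits, which is what git bisect automates). The sketch below only illustrates the idea. The build labels and the is_bad check are hypothetical, and in practice each check would mean booting the build and running the NFS stress test.

```python
def first_bad(candidates, is_bad):
    """Binary search for the first bad entry.

    Assumes `candidates` is ordered oldest to newest, the first entry is
    known good, the last entry is known bad, and every entry after the
    first bad one is bad as well.
    """
    good, bad = 0, len(candidates) - 1
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_bad(candidates[middle]):
            bad = middle
        else:
            good = middle
    return candidates[bad]


# Hypothetical build labels between a known-good and a known-bad kernel.
builds = [f"build-{n:02d}" for n in range(16)]
print(first_bad(builds, lambda b: int(b.split("-")[1]) >= 11))
# build-11
```

With 16 candidates this needs four test boots instead of up to fourteen. The same search works in reverse for finding the first fixed build by swapping the meaning of good and bad.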

Michael (miwait00) wrote :

The kernel you linked (5.7.0-050700rc6-generic #202005172030) seems to work. While stressing the mounted NFS share I see no segfaults or other NFS-related messages in dmesg, and my tests produce no I/O errors.

I also noticed that there are updates available for the 5.4 and 5.6 kernel, so I'll test those next

description: updated
Michael (miwait00) wrote :

I have now also tested 5.6.0-1010-oem #10-Ubuntu SMP as well as 5.4.0-31-generic #35-Ubuntu SMP, both of which were installed automatically after I ran 'apt update' and 'apt dist-upgrade'.
Kernel 5.6.0-1010-oem seems fine; 5.4.0-31-generic still has memory errors (see attached log).

The apport/'Problem detected' dialog automatically opened on my next reboot (mentioned some linux image 5.4 package) and I allowed the submission. I couldn't reference this ticket, but maybe you are able to grab/find the report in your database.

description: updated
description: updated
description: updated
Michael (miwait00) wrote :

Kernel 5.4.0-32-generic #36-Ubuntu from the proposal* also seems to work, with one oddity: without the linux-modules-extra package my mouse does not work. Everything is fine once it is installed :)

*proposal https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/19306320

description: updated
Christoph Roeder (brightdroid) wrote :

This seems to be the same problem with 5.4.0-45, except that my system (the client) freezes completely.

The system freezes while accessing an NFS share mounted with the following options:

rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=x.x.x.x,local_lock=none,addr=y.y.y.y

Kerberos is provided by FreeIPA, with sssd running on the client (which crashes).

Christoph Roeder (brightdroid) wrote :

The last stable kernel, which I'm currently using, is 5.4.0-39-generic.

Christoph Roeder (brightdroid) wrote :

This still happens with kernel 5.4.0-54; today my (NFS) server crashed too.

How can we help to fix this issue?

Thanks

Po-Hsu Lin (cypressyew) wrote :

Hi Michael,
can you confirm whether the latest 5.4 kernel in the update works for you?

Christoph Roeder, please give the latest 5.4 kernel a try. If it's still not working, please open a new bug with the "ubuntu-bug linux" command and attach the error log if possible.
Thanks

Changed in linux-oem-5.6 (Ubuntu):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux-oem-5.6 (Ubuntu) because there has been no activity for 60 days.]

Changed in linux-oem-5.6 (Ubuntu):
status: Incomplete → Expired