Kernel BUG with multiple NFS4 kerberos mounts on boot
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Expired | Medium | Unassigned |
Bug Description
The latest Ubuntu Linux kernel image has a bug, probably a race condition, which is triggered when /etc/fstab contains multiple Kerberos NFS4 mounts. It does not happen on every boot, so reproducing it will probably take a few retries. This happens with the current Ubuntu Linux kernel in 14.04:
# cat /proc/version_
Ubuntu 3.13.0-
Apparently you need to have _multiple_ kerberos NFS4 mounts in /etc/fstab to trigger this:
xxxxxx.
xxxxxx.
xxxxxx.
xxxxxx.
xxxxxx.
xxxxxx.
When this happens we get a kernel stack trace (complete trace included), which starts like this:
[ 19.999751] gss_pipe_downcall: bad return from gss_fill_context: -4
[ 19.999779] ------------[ cut here ]------------
[ 19.999791] kernel BUG at /build/
[ 19.999796] invalid opcode: 0000 [#1] SMP
[ 19.999802] Modules linked in: arc4(+) des_generic cmac xcbc nfsv4 rmd160 crypto_null af_key xfrm_algo dm_crypt snd_hda_
When this has happened, rpc.gssd gets stuck in D state:
# ps aux|grep gssd
root 452 0.0 0.0 0 0 ? Ds 13:18 0:00 [rpc.gssd]
NFS4 mounts also fail, with an error message that does not reveal what is actually going on:
root@do0-
mount.nfs4: access denied by server while mounting xxxx.helsinki.
mount.nfs4: access denied by server while mounting xxxx.helsinki.
mount.nfs4: access denied by server while mounting xxxx.helsinki.
mount.nfs4: access denied by server while mounting xxxx.helsinki.
mount.nfs4: access denied by server while mounting xxxx..helsinki.
mount.nfs4: access denied by server while mounting xxxx.helsinki.
This also happened on Ubuntu 12.04, so the bug is probably old. There is an existing bug report, which is (IMHO) incorrectly filed against nfs-utils: https:/
We will work around this by removing the NFS mounts from fstab and performing them sequentially in startup scripts, but it would be nice if the kernel race were fixed too.
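For anyone wanting to try a similar workaround, here is a rough sketch of what sequential mounting could look like. It is not the script used at our site. It assumes a variant of the approach where the NFS4 entries stay in /etc/fstab but carry the `noauto` option, so that boot-time mounting skips them and `mount <mountpoint>` can still look them up. The function names `list_nfs4_mountpoints` and `mount_sequentially` and the `MOUNT_CMD` variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical workaround sketch: mount nfs4 entries one at a time.
# ASSUMPTION: the nfs4 entries in the fstab file are marked "noauto",
# so nothing else tries to mount them in parallel at boot.

# Print the mount point of every nfs4 entry in an fstab-format file.
list_nfs4_mountpoints() {
    # Field 2 is the mount point, field 3 the filesystem type.
    # Comment lines are skipped; blank lines never match.
    awk '$1 !~ /^#/ && $3 == "nfs4" { print $2 }' "$1"
}

# Mount the entries strictly one after another.
# MOUNT_CMD (hypothetical) can be set to "echo" for a dry run.
mount_sequentially() {
    fstab=${1:-/etc/fstab}
    for mp in $(list_nfs4_mountpoints "$fstab"); do
        # Each mount finishes (or fails) before the next one starts.
        ${MOUNT_CMD:-mount} "$mp" || echo "failed to mount $mp" >&2
    done
}
```

A startup script would source this and call `mount_sequentially` once networking and rpc.gssd are up; `MOUNT_CMD=echo mount_sequentially /etc/fstab` only prints the mount points it would mount. Whether serializing the mounts fully avoids the race is untested here.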
---
ApportVersion: 2.14.1-0ubuntu3.3
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/
/dev/snd/seq: timidity 2607 F.... timidity
CRDA: Error: [Errno 2] No such file or directory
CurrentDmesg:
[ 23.980615] init: gdm main process (1881) killed by TERM signal
[ 25.395448] init: plymouth-
[ 25.402287] init: plymouth-
[ 25.402298] init: plymouth-
[ 27.805918] init: plymouth-stop pre-start process (2612) terminated with status 1
DistroRelease: Ubuntu 14.04
HibernationDevice: RESUME=
IwConfig:
eth0 no wireless extensions.
lo no wireless extensions.
MachineType: Hewlett-Packard HP Compaq 8000 Elite CMT PC
Package: linux (not installed)
ProcFB: 0 nouveaufb
ProcKernelCmdLine: BOOT_IMAGE=
ProcVersionSign
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageV
linux-
linux-
linux-firmware 1.127.5
RfKill:
Tags: trusty
Uname: Linux 3.13.0-35-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 10/22/2009
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: 786G7 v01.02
dmi.board.
dmi.board.name: 3647h
dmi.board.vendor: Hewlett-Packard
dmi.chassis.
dmi.chassis.type: 6
dmi.chassis.vendor: Hewlett-Packard
dmi.modalias: dmi:bvnHewlett-
dmi.product.name: HP Compaq 8000 Elite CMT PC
dmi.sys.vendor: Hewlett-Packard
I'll just chime in (from the same site as Jani): lacking a proper fix, it would be nice if boot-time automounting worked around this by default by forgoing parallel mounts, or at least offered an option to do so.