kernel 5.4.0-40 hangs system when using nfs home directories

Bug #1886775 reported by Andrew Conway
This bug affects 4 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

We use nfs-mounted (via autofs), kerberos-authenticated home directories for most users.

Booting with kernel 5.4.0-40, users with nfs-mounted home directories find that the system freezes shortly after they start using it, somewhat randomly. Powering off is then the only option. Some specific things that triggered a freeze: opening a second tab in Firefox; opening a terminal and running "cat" on log files; and running "ubuntu-bug linux" to try to generate this report :-(

Sometimes before the crash just one window freezes, and the rest of the GUI is responsive. A full freeze usually occurs within several seconds.

No such crashes were observed using an account without nfs mounted home directories (and the output from "ubuntu-bug linux" for one of these working users is at the end of this report).

Reverting to 5.4.0-39, everything is good.

Exactly the same behaviour is observed on a modern AMD Zen2 processor with a graphics card, and on a several-year-old Intel processor with integrated graphics.

Looking at /var/log/syslog, there are several suspicious messages like the one below. The general protection fault always occurs just before the freeze, and occasionally some time earlier as well.

Jul 4 16:23:37 emu kernel: [ 350.263903] ------------[ cut here ]------------
Jul 4 16:23:37 emu kernel: [ 350.263904] virt_to_cache: Object is not a Slab page!
Jul 4 16:23:37 emu kernel: [ 350.263917] WARNING: CPU: 13 PID: 4009 at mm/slab.h:473 kmem_cache_free+0x237/0x2b0
Jul 4 16:23:37 emu kernel: [ 350.263917] Modules linked in: rfcomm rpcsec_gss_krb5 nfsv4 nfs fscache vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) edac_mce_amd kvm_amd xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bpfilter cmac algif_hash algif_skcipher af_alg bnep snd_hda_codec_hdmi binfmt_misc nvidia_uvm(OE) kvm nvidia_drm(POE) nvidia_modeset(POE) iwlmvm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec nls_iso8859_1 snd_hda_core snd_hwdep snd_pcm btusb btrtl btbcm btintel snd_seq_midi mac80211 bluetooth snd_seq_midi_event crct10dif_pclmul snd_rawmidi bridge ecdh_generic stp ghash_clmulni_intel llc libarc4 input_leds joydev ecc nvidia(POE) snd_seq iwlwifi aesni_intel crypto_simd cryptd glue_helper drm_kms_helper snd_seq_device cfg80211 snd_timer ipmi_devintf
Jul 4 16:23:37 emu kernel: [ 350.263952] wmi_bmof ipmi_msghandler snd fb_sys_fops syscopyarea sysfillrect sysimgblt soundcore k10temp ccp mac_hid sch_fq_codel parport_pc ppdev lp parport drm nfsd nfs_acl auth_rpcgss lockd grace sunrpc ip_tables x_tables autofs4 hid_generic usbhid hid crc32_pclmul igb i2c_piix4 ahci i2c_algo_bit nvme libahci dca nvme_core wmi
Jul 4 16:23:37 emu kernel: [ 350.263971] CPU: 13 PID: 4009 Comm: kworker/u64:4 Tainted: P OE 5.4.0-40-generic #44-Ubuntu
Jul 4 16:23:37 emu kernel: [ 350.263972] Hardware name: Gigabyte Technology Co., Ltd. X570 I AORUS PRO WIFI/X570 I AORUS PRO WIFI, BIOS F4h 07/17/2019
Jul 4 16:23:37 emu kernel: [ 350.263986] Workqueue: rpciod rpc_async_schedule [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.263989] RIP: 0010:kmem_cache_free+0x237/0x2b0
Jul 4 16:23:37 emu kernel: [ 350.263990] Code: ff ff ff 80 3d 16 4f 56 01 00 0f 85 39 ff ff ff 48 c7 c6 20 44 67 86 48 c7 c7 08 25 98 86 c6 05 fb 4e 56 01 01 e8 64 8a df ff <0f> 0b e9 18 ff ff ff 48 8b 57 58 49 8b 4f 58 48 c7 c6 30 44 67 86
Jul 4 16:23:37 emu kernel: [ 350.263991] RSP: 0018:ffffc1ebc3077d20 EFLAGS: 00010282
Jul 4 16:23:37 emu kernel: [ 350.263993] RAX: 0000000000000000 RBX: ffffa040c01358e2 RCX: 0000000000000006
Jul 4 16:23:37 emu kernel: [ 350.263993] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffffa040beb578c0
Jul 4 16:23:37 emu kernel: [ 350.263994] RBP: ffffc1ebc3077d48 R08: 0000000000000506 R09: 0000000000000004
Jul 4 16:23:37 emu kernel: [ 350.263995] R10: 0000000000000000 R11: 0000000000000001 R12: ffffa041401358e2
Jul 4 16:23:37 emu kernel: [ 350.263995] R13: 0000000000000000 R14: ffffa040a7e47600 R15: ffffa04065a99cb0
Jul 4 16:23:37 emu kernel: [ 350.263997] FS: 0000000000000000(0000) GS:ffffa040beb40000(0000) knlGS:0000000000000000
Jul 4 16:23:37 emu kernel: [ 350.263997] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 4 16:23:37 emu kernel: [ 350.263998] CR2: 00007fe66802dfe0 CR3: 0000000717722000 CR4: 0000000000340ee0
Jul 4 16:23:37 emu kernel: [ 350.263999] Call Trace:
Jul 4 16:23:37 emu kernel: [ 350.264005] mempool_free_slab+0x17/0x20
Jul 4 16:23:37 emu kernel: [ 350.264007] mempool_free+0x2f/0x80
Jul 4 16:23:37 emu kernel: [ 350.264018] rpc_free+0x47/0x60 [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.264028] xprt_release+0x91/0x1a0 [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.264037] rpc_release_resources_task+0x13/0x50 [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.264046] __rpc_execute+0x182/0x3a0 [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.264055] rpc_async_schedule+0x30/0x50 [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.264058] process_one_work+0x1eb/0x3b0
Jul 4 16:23:37 emu kernel: [ 350.264060] worker_thread+0x4d/0x400
Jul 4 16:23:37 emu kernel: [ 350.264062] kthread+0x104/0x140
Jul 4 16:23:37 emu kernel: [ 350.264064] ? process_one_work+0x3b0/0x3b0
Jul 4 16:23:37 emu kernel: [ 350.264065] ? kthread_park+0x90/0x90
Jul 4 16:23:37 emu kernel: [ 350.264068] ret_from_fork+0x22/0x40
Jul 4 16:23:37 emu kernel: [ 350.264069] ---[ end trace 89c40274a06595b4 ]---
Jul 4 16:23:37 emu kernel: [ 350.360090] general protection fault: 0000 [#1] SMP NOPTI
Jul 4 16:23:37 emu kernel: [ 350.360096] CPU: 8 PID: 346 Comm: kworker/u64:3 Tainted: P W OE 5.4.0-40-generic #44-Ubuntu
Jul 4 16:23:37 emu kernel: [ 350.360098] Hardware name: Gigabyte Technology Co., Ltd. X570 I AORUS PRO WIFI/X570 I AORUS PRO WIFI, BIOS F4h 07/17/2019
Jul 4 16:23:37 emu kernel: [ 350.360115] Workqueue: rpciod rpc_async_schedule [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.360120] RIP: 0010:kmem_cache_alloc+0x7e/0x230
Jul 4 16:23:37 emu kernel: [ 350.360122] Code: 99 01 00 00 4d 8b 07 65 49 8b 50 08 65 4c 03 05 40 9d 76 7a 4d 8b 20 4d 85 e4 0f 84 85 01 00 00 41 8b 47 20 49 8b 3f 4c 01 e0 <48> 8b 18 48 89 c1 49 33 9f 70 01 00 00 4c 89 e0 48 0f c9 48 31 cb
Jul 4 16:23:37 emu kernel: [ 350.360124] RSP: 0018:ffffc1ebc0b37cc8 EFLAGS: 00010286
Jul 4 16:23:37 emu kernel: [ 350.360125] RAX: c72b0346c01358e2 RBX: 0000000000000000 RCX: 0000000000000002
Jul 4 16:23:37 emu kernel: [ 350.360126] RDX: 000000000000009a RSI: 0000000000092800 RDI: 0000000000031fb0
Jul 4 16:23:37 emu kernel: [ 350.360128] RBP: ffffc1ebc0b37cf8 R08: ffffa040bea31fb0 R09: ffffffffc2511a94
Jul 4 16:23:37 emu kernel: [ 350.360129] R10: ffffa040b530962c R11: 0000000000000018 R12: c72b0346c01358e2
Jul 4 16:23:37 emu kernel: [ 350.360130] R13: 0000000000092800 R14: ffffa040ba5ff640 R15: ffffa040ba5ff640
Jul 4 16:23:37 emu kernel: [ 350.360131] FS: 0000000000000000(0000) GS:ffffa040bea00000(0000) knlGS:0000000000000000
Jul 4 16:23:37 emu kernel: [ 350.360132] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 4 16:23:37 emu kernel: [ 350.360134] CR2: 00007fe67c030298 CR3: 00000007e9c8e000 CR4: 0000000000340ee0
Jul 4 16:23:37 emu kernel: [ 350.360135] Call Trace:
Jul 4 16:23:37 emu kernel: [ 350.360140] ? mempool_alloc_slab+0x17/0x20
Jul 4 16:23:37 emu kernel: [ 350.360143] mempool_alloc_slab+0x17/0x20
Jul 4 16:23:37 emu kernel: [ 350.360145] mempool_alloc+0x64/0x180
Jul 4 16:23:37 emu kernel: [ 350.360156] rpc_malloc+0xa1/0xb0 [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.360169] call_allocate+0xd1/0x1b0 [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.360178] ? call_refreshresult+0x100/0x100 [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.360187] __rpc_execute+0x8c/0x3a0 [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.360197] rpc_async_schedule+0x30/0x50 [sunrpc]
Jul 4 16:23:37 emu kernel: [ 350.360201] process_one_work+0x1eb/0x3b0
Jul 4 16:23:37 emu kernel: [ 350.360203] worker_thread+0x4d/0x400
Jul 4 16:23:37 emu kernel: [ 350.360206] kthread+0x104/0x140
Jul 4 16:23:37 emu kernel: [ 350.360207] ? process_one_work+0x3b0/0x3b0
Jul 4 16:23:37 emu kernel: [ 350.360209] ? kthread_park+0x90/0x90
Jul 4 16:23:37 emu kernel: [ 350.360212] ret_from_fork+0x22/0x40
Jul 4 16:23:37 emu kernel: [ 350.360214] Modules linked in: rfcomm rpcsec_gss_krb5 nfsv4 nfs fscache vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) edac_mce_amd kvm_amd xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bpfilter cmac algif_hash algif_skcipher af_alg bnep snd_hda_codec_hdmi binfmt_misc nvidia_uvm(OE) kvm nvidia_drm(POE) nvidia_modeset(POE) iwlmvm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec nls_iso8859_1 snd_hda_core snd_hwdep snd_pcm btusb btrtl btbcm btintel snd_seq_midi mac80211 bluetooth snd_seq_midi_event crct10dif_pclmul snd_rawmidi bridge ecdh_generic stp ghash_clmulni_intel llc libarc4 input_leds joydev ecc nvidia(POE) snd_seq iwlwifi aesni_intel crypto_simd cryptd glue_helper drm_kms_helper snd_seq_device cfg80211 snd_timer ipmi_devintf
Jul 4 16:23:37 emu kernel: [ 350.360249] wmi_bmof ipmi_msghandler snd fb_sys_fops syscopyarea sysfillrect sysimgblt soundcore k10temp ccp mac_hid sch_fq_codel parport_pc ppdev lp parport drm nfsd nfs_acl auth_rpcgss lockd grace sunrpc ip_tables x_tables autofs4 hid_generic usbhid hid crc32_pclmul igb i2c_piix4 ahci i2c_algo_bit nvme libahci dca nvme_core wmi
Jul 4 16:23:37 emu kernel: [ 350.360269] ---[ end trace 89c40274a06595b5 ]---
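For anyone checking whether their own machine shows the same symptoms, the two markers in the excerpt above can be searched for in a syslog file. The following is only a minimal sketch: the function names (find_markers, scan_file) and the marker list are illustrative assumptions, and the regular expression assumes the "kernel: [ uptime ] message" layout seen in this log.

```python
import re

# Marker strings copied from the log excerpt in this report.
# This list is an assumption about what is worth flagging; extend as needed.
MARKERS = (
    "virt_to_cache: Object is not a Slab page!",
    "general protection fault",
)

# Matches the "kernel: [ 350.263904] message" part of a syslog line.
LINE_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]\s*(.*)$")


def find_markers(lines):
    """Return (uptime_seconds, message) for each kernel line containing a marker."""
    hits = []
    for line in lines:
        m = LINE_RE.search(line.rstrip("\n"))
        if m and any(marker in m.group(2) for marker in MARKERS):
            hits.append((float(m.group(1)), m.group(2)))
    return hits


def scan_file(path="/var/log/syslog"):
    """Scan a syslog file (hypothetical helper; reading it may need root)."""
    with open(path, errors="replace") as fh:
        return find_markers(fh)
```

Calling scan_file() on an affected machine should list the uptime and text of each matching kernel message; an empty list only means these two strings were not found, not that the machine is unaffected.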

I have attached the output from "ubuntu-bug linux" (run as a non-nfs user). I was not able to run ubuntu-bug as an nfs user because it froze the system.

Tags: focal
Andrew Conway (acubuntuone) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1886775

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: focal
Kai-Heng Feng (kaihengfeng) wrote :

Possible duplicate of lp: #1886277?

Andrew Conway (acubuntuone) wrote :

#3: Agreed that it is a duplicate of lp: #1886277. Sorry, I looked for similar bugs but apparently did a lousy job. I just made a comment to this effect in #1886277.

#2: I believe the apport files are attached in comment #1, though this is the first time I have used it, so I may be confused.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Juan Andrés Ghigliazza (tizone) wrote :

Same problem here on kernel 5.4.0-42. My logs look more similar to the logs in this bug than to those in bug #1886277.

We are using NFS too, but not for home directories, just for shared mounts. We are using autofs and kerberos authentication too (FreeIPA).
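One rough way to judge whether two kernel traces are "similar" is to compare the function names in their call traces. The sketch below is an illustration only, not a tool used in this bug: trace_frames and shared_frames are hypothetical names, the regular expression assumes the "symbol+0xoff/0xsize [module]" frame format seen in the log in the bug description, and frames prefixed with "?" are skipped because the kernel marks them as unreliable.

```python
import re

# Matches one call-trace frame after the "[ uptime ]" stamp, for example
# "] rpc_free+0x47/0x60 [sunrpc]" or "] ? kthread_park+0x90/0x90".
FRAME_RE = re.compile(
    r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[\w+\])?\s*$"
)


def trace_frames(lines):
    """Return the function names of reliable frames from every call trace."""
    frames = []
    in_trace = False
    for line in lines:
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        m = FRAME_RE.search(line.rstrip("\n"))
        if not m:
            in_trace = False  # first non-frame line ends the trace
            continue
        if m.group(1):
            continue  # "?" frames are guesses by the unwinder; skip them
        frames.append(m.group(2))
    return frames


def shared_frames(lines_a, lines_b):
    """Return the function names that appear in both logs' call traces."""
    return set(trace_frames(lines_a)) & set(trace_frames(lines_b))
```

A large overlap in sunrpc and mempool functions between two logs would suggest the same code path is involved, but it is a heuristic and does not prove that two reports share a root cause.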
