Regression on NFS: unable to handle page fault in mempool_alloc_slab

Bug #1886277 reported by Marian Rainer-Harbach
This bug affects 12 people
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   Undecided    Unassigned
Focal            Fix Released   Undecided    Unassigned

Bug Description

On kernel 5.4.0-40-generic in focal I'm getting errors like this on several machines with different hardware in the first hour after boot:

Jul 04 16:58:32 hostname kernel: BUG: unable to handle page fault for address: ffff9083e222e632
Jul 04 16:58:32 hostname kernel: #PF: supervisor read access in kernel mode
Jul 04 16:58:32 hostname kernel: #PF: error_code(0x0000) - not-present page
Jul 04 16:58:32 hostname kernel: PGD 3ac205067 P4D 3ac205067 PUD 0
Jul 04 16:58:32 hostname kernel: Oops: 0000 [#1] SMP NOPTI
Jul 04 16:58:32 hostname kernel: CPU: 4 PID: 289 Comm: kworker/u16:4 Tainted: G OE 5.4.0-40-generic #44-Ubuntu
Jul 04 16:58:32 hostname kernel: Hardware name: LENOVO 20N2CTO1WW/20N2CTO1WW, BIOS N2IET88W (1.66 ) 04/22/2020
Jul 04 16:58:32 hostname kernel: Workqueue: rpciod rpc_async_schedule [sunrpc]
Jul 04 16:58:32 hostname kernel: RIP: 0010:kmem_cache_alloc+0x7e/0x230
Jul 04 16:58:32 hostname kernel: Code: 99 01 00 00 4d 8b 07 65 49 8b 50 08 65 4c 03 05 40 9d 56 44 4d 8b 20 4d 85 e4 0f 84 85 01 00 00 41 8b 47 20 49 8b 3f 4c 01 e0 <48> 8b 18 48 89 c1 49 33 9f 70 01 00 00 4c 89 e0 48 0f c9 48 31 cb
Jul 04 16:58:32 hostname kernel: RSP: 0018:ffffbc38c046fcc8 EFLAGS: 00010282
Jul 04 16:58:32 hostname kernel: RAX: ffff9083e222e632 RBX: 0000000000000000 RCX: 0000000000000002
Jul 04 16:58:32 hostname kernel: RDX: 0000000000000009 RSI: 0000000000092800 RDI: 0000000000031fb0
Jul 04 16:58:32 hostname kernel: RBP: ffffbc38c046fcf8 R08: ffff90836c331fb0 R09: ffffffffc1436a94
Jul 04 16:58:32 hostname kernel: R10: ffff908368178d2c R11: 0000000000000018 R12: ffff9083e222e632
Jul 04 16:58:32 hostname kernel: R13: 0000000000092800 R14: ffff908367ca6140 R15: ffff908367ca6140
Jul 04 16:58:32 hostname kernel: FS: 0000000000000000(0000) GS:ffff90836c300000(0000) knlGS:0000000000000000
Jul 04 16:58:32 hostname kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 04 16:58:32 hostname kernel: CR2: ffff9083e222e632 CR3: 00000003ab80a003 CR4: 00000000003606e0
Jul 04 16:58:32 hostname kernel: Call Trace:
Jul 04 16:58:32 hostname kernel: ? mempool_alloc_slab+0x17/0x20
Jul 04 16:58:32 hostname kernel: mempool_alloc_slab+0x17/0x20
Jul 04 16:58:32 hostname kernel: mempool_alloc+0x64/0x180
Jul 04 16:58:32 hostname kernel: rpc_malloc+0xa1/0xb0 [sunrpc]
Jul 04 16:58:32 hostname kernel: call_allocate+0xd1/0x1b0 [sunrpc]
Jul 04 16:58:32 hostname kernel: ? call_refreshresult+0x100/0x100 [sunrpc]
Jul 04 16:58:32 hostname kernel: __rpc_execute+0x8c/0x3a0 [sunrpc]
Jul 04 16:58:32 hostname kernel: rpc_async_schedule+0x30/0x50 [sunrpc]
Jul 04 16:58:32 hostname kernel: process_one_work+0x1eb/0x3b0
Jul 04 16:58:32 hostname kernel: worker_thread+0x4d/0x400
Jul 04 16:58:32 hostname kernel: kthread+0x104/0x140
Jul 04 16:58:32 hostname kernel: ? process_one_work+0x3b0/0x3b0
Jul 04 16:58:32 hostname kernel: ? kthread_park+0x90/0x90
Jul 04 16:58:32 hostname kernel: ret_from_fork+0x35/0x40
Jul 04 16:58:32 hostname kernel: Modules linked in: rfcomm rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) msr ccm cmac algif_hash algif_skcipher af_alg aufs bnep overlay nls_iso8859_1 mei_hdcp intel_rapl_msr snd_s>
Jul 04 16:58:32 hostname kernel: nvram ledtrig_audio mei_me cfg80211 mei processor_thermal_device snd_seq ucsi_acpi typec_ucsi intel_rapl_common intel_soc_dts_iosf snd_seq_device typec intel_pch_thermal snd_timer snd int3403_thermal soundcore int340x_thermal_zone i>
Jul 04 16:58:32 hostname kernel: pinctrl_cannonlake video pinctrl_intel
Jul 04 16:58:32 hostname kernel: CR2: ffff9083e222e632
Jul 04 16:58:32 hostname kernel: ---[ end trace cbbaed921eb439ce ]---
Jul 04 16:58:32 hostname kernel: RIP: 0010:kmem_cache_alloc+0x7e/0x230
Jul 04 16:58:32 hostname kernel: Code: 99 01 00 00 4d 8b 07 65 49 8b 50 08 65 4c 03 05 40 9d 56 44 4d 8b 20 4d 85 e4 0f 84 85 01 00 00 41 8b 47 20 49 8b 3f 4c 01 e0 <48> 8b 18 48 89 c1 49 33 9f 70 01 00 00 4c 89 e0 48 0f c9 48 31 cb
Jul 04 16:58:32 hostname kernel: RSP: 0018:ffffbc38c046fcc8 EFLAGS: 00010282
Jul 04 16:58:32 hostname kernel: RAX: ffff9083e222e632 RBX: 0000000000000000 RCX: 0000000000000002
Jul 04 16:58:32 hostname kernel: RDX: 0000000000000009 RSI: 0000000000092800 RDI: 0000000000031fb0
Jul 04 16:58:32 hostname kernel: RBP: ffffbc38c046fcf8 R08: ffff90836c331fb0 R09: ffffffffc1436a94
Jul 04 16:58:32 hostname kernel: R10: ffff908368178d2c R11: 0000000000000018 R12: ffff9083e222e632
Jul 04 16:58:32 hostname kernel: R13: 0000000000092800 R14: ffff908367ca6140 R15: ffff908367ca6140
Jul 04 16:58:32 hostname kernel: FS: 0000000000000000(0000) GS:ffff90836c300000(0000) knlGS:0000000000000000
Jul 04 16:58:32 hostname kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 04 16:58:32 hostname kernel: CR2: ffff9083e222e632 CR3: 00000003ab80a003 CR4: 00000000003606e0

When booting 5.4.0-39-generic the problem does not occur.
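
As an aid for comparing reports like the one above across machines, here is a minimal sketch that reduces a saved kernel log to one line per oops: the faulting function and the faulting address. The function name `oops_summary` is hypothetical, and the patterns assume the log format shown in this report.

```shell
#!/usr/bin/env bash
# Summarise kernel oopses from journal/dmesg text read on stdin.
# Prints one "function address" pair per page-fault oops.
oops_summary() {
    awk '
        # Remember the most recent faulting address.
        /BUG: unable to handle page fault for address:/ { addr = $NF }
        # The first RIP line after it names the faulting function.
        / RIP: [0-9a-f]+:/ && addr != "" {
            fn = $NF                    # e.g. 0010:kmem_cache_alloc+0x7e/0x230
            sub(/^[0-9a-f]+:/, "", fn)  # drop the code segment selector
            sub(/[+].*/, "", fn)        # drop offset and function size
            print fn, addr
            addr = ""                   # skip the repeated register dump
        }
    '
}
```

A possible use is `journalctl -k -b -1 | oops_summary` to summarise the previous boot. If every machine reports the same function with a different address, as in this bug, that points at a shared software cause rather than at one machine's hardware.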
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.3
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: lsysadmin 2042 F.... pulseaudio
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
HibernationDevice: RESUME=UUID=9d3714bb-8799-42f9-a51d-790f87b0a7fc
MachineType: LENOVO 20N2CTO1WW
Package: linux (not installed)
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-40-generic root=/dev/mapper/vgmagiko-root ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 5.4.0-40.44-generic 5.4.44
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-40-generic N/A
 linux-backports-modules-5.4.0-40-generic N/A
 linux-firmware 1.187.1
Tags: focal
Uname: Linux 5.4.0-40-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 04/22/2020
dmi.bios.vendor: LENOVO
dmi.bios.version: N2IET88W (1.66 )
dmi.board.asset.tag: Not Available
dmi.board.name: 20N2CTO1WW
dmi.board.vendor: LENOVO
dmi.board.version: SDK0J40709 WIN
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: None
dmi.modalias: dmi:bvnLENOVO:bvrN2IET88W(1.66):bd04/22/2020:svnLENOVO:pn20N2CTO1WW:pvrThinkPadT490:rvnLENOVO:rn20N2CTO1WW:rvrSDK0J40709WIN:cvnLENOVO:ct10:cvrNone:
dmi.product.family: ThinkPad T490
dmi.product.name: 20N2CTO1WW
dmi.product.sku: LENOVO_MT_20N2_BU_Think_FM_ThinkPad T490
dmi.product.version: ThinkPad T490
dmi.sys.vendor: LENOVO

tags: added: focal
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1886277

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
description: updated
Marian Rainer-Harbach (marianrh) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected
description: updated
Marian Rainer-Harbach (marianrh) wrote : CRDA.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : CurrentDmesg.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : IwConfig.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : Lspci.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : Lspci-vt.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : Lsusb.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : Lsusb-t.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : Lsusb-v.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : ProcCpuinfo.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : ProcCpuinfoMinimal.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : ProcEnviron.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : ProcInterrupts.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : ProcModules.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : RfKill.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : UdevDb.txt

apport information

Marian Rainer-Harbach (marianrh) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Marian Rainer-Harbach (marianrh) wrote : Re: unable to handle page fault in mempool_alloc_slab

Log from another machine:
[ 1388.217900] BUG: unable to handle page fault for address: ffff8cf4f6fc5d86
[ 1388.217907] #PF: supervisor read access in kernel mode
[ 1388.217908] #PF: error_code(0x0000) - not-present page
[ 1388.217910] PGD 75201067 P4D 75201067 PUD 0
[ 1388.217913] Oops: 0000 [#4] SMP PTI
[ 1388.217916] CPU: 3 PID: 471 Comm: dmcrypt_write/2 Tainted: G D 5.4.0-40-generic #44-Ubuntu
[ 1388.217918] Hardware name: Hewlett-Packard HP EliteBook 8540w/1521, BIOS 68CVD Ver. F.60 11/11/2015
[ 1388.217923] RIP: 0010:kmem_cache_alloc+0x7e/0x230
[ 1388.217925] Code: 99 01 00 00 4d 8b 07 65 49 8b 50 08 65 4c 03 05 40 9d 76 4f 4d 8b 20 4d 85 e4 0f 84 85 01 00 00 41 8b 47 20 49 8b 3f 4c 01 e0 <48> 8b 18 48 89 c1 49 33 9f 70 01 00 00 4c 89 e0 48 0f c9 48 31 cb
[ 1388.217927] RSP: 0018:ffff9b3200977948 EFLAGS: 00010282
[ 1388.217928] RAX: ffff8cf4f6fc5d86 RBX: 0000000000000000 RCX: 0000000000000001
[ 1388.217929] RDX: 000000000000001f RSI: 0000000000092a20 RDI: 0000000000031ca0
[ 1388.217930] RBP: ffff9b3200977978 R08: ffff8cf4b3af1ca0 R09: ffff8cf4a08138b8
[ 1388.217931] R10: ffff8cf4b3ad7848 R11: 000000000000002d R12: ffff8cf4f6fc5d86
[ 1388.217932] R13: 0000000000092a20 R14: ffff8cf4b227c380 R15: ffff8cf4b227c380
[ 1388.217934] FS: 0000000000000000(0000) GS:ffff8cf4b3ac0000(0000) knlGS:0000000000000000
[ 1388.217935] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1388.217936] CR2: ffff8cf4f6fc5d86 CR3: 000000007480a000 CR4: 00000000000006e0
[ 1388.217937] Call Trace:
[ 1388.217944] ? mempool_alloc_slab+0x17/0x20
[ 1388.217948] mempool_alloc_slab+0x17/0x20
[ 1388.217950] mempool_alloc+0x64/0x180
[ 1388.217956] ? try_to_wake_up+0x224/0x6a0
[ 1388.217960] ? __blk_bios_map_sg+0xe5/0x4c0
[ 1388.217964] sg_pool_alloc+0x4f/0x60
[ 1388.217968] __sg_alloc_table+0x10b/0x170
[ 1388.217971] sg_alloc_table_chained+0x47/0xa0
[ 1388.217972] ? mac_pton+0xb0/0xb0
[ 1388.217976] scsi_init_io+0x52/0x180
[ 1388.217981] sd_setup_read_write_cmnd+0x67/0x710
[ 1388.217984] ? __blk_mq_get_tag+0x28/0x80
[ 1388.217986] sd_init_command+0x11a/0x472
[ 1388.217988] scsi_queue_rq+0x32e/0xa00
[ 1388.217990] blk_mq_dispatch_rq_list+0x96/0x5a0
[ 1388.217993] ? deadline_remove_request+0x4e/0xb0
[ 1388.217995] ? dd_dispatch_request+0x1/0x1f0
[ 1388.217998] blk_mq_do_dispatch_sched+0x67/0x100
[ 1388.218000] blk_mq_sched_dispatch_requests+0x12d/0x180
[ 1388.218003] __blk_mq_run_hw_queue+0x5a/0x110
[ 1388.218005] __blk_mq_delay_run_hw_queue+0x15b/0x160
[ 1388.218008] blk_mq_run_hw_queue+0x92/0x120
[ 1388.218010] blk_mq_sched_insert_requests+0x74/0x100
[ 1388.218012] blk_mq_flush_plug_list+0x1e8/0x290
[ 1388.218013] ? blk_mq_make_request+0x35f/0x5b0
[ 1388.218016] blk_flush_plug_list+0xe3/0x110
[ 1388.218018] blk_finish_plug+0x26/0x34
[ 1388.218024] dmcrypt_write+0x155/0x170 [dm_crypt]
[ 1388.218028] kthread+0x104/0x140
[ 1388.218030] ? crypt_iv_lmk_ctr+0xd0/0xd0 [dm_crypt]
[ 1388.218032] ? kthread_park+0x90/0x90
[ 1388.218037] ret_from_fork+0x35/0x40
[ 1388.218039] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache bnep bluetooth ecdh_generic ecc msr nls_iso8859_1 pata_pcmcia snd_hda_codec_hdmi snd_hd...

Kai-Heng Feng (kaihengfeng) wrote :

Would it be possible to bisect the kernel?

https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/focal

The range to bisect is between the tags Ubuntu-5.4.0-39.43 and Ubuntu-5.4.0-40.44.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Marian Rainer-Harbach (marianrh) wrote :

Is there a tutorial available on how to bisect the kernel and install and test the bisected version?

To me it seems the problem only occurs in connection with NFS. I could not reproduce the problem when NFS is not used. However, after reading data from NFS mounts the problem reliably occurs within a few minutes.
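
The trigger described above (reading data from an NFS mount) can be sketched as a small helper. This is only an illustration under assumptions: the function name `read_tree` is hypothetical, and whether a plain sequential read is enough I/O to hit the bug on a given setup is not confirmed by this report.

```shell
#!/usr/bin/env bash
# Read every regular file below a directory once and print the total
# number of bytes read. Pointed at an NFS mount on an affected kernel,
# this is assumed to generate the kind of read traffic that precedes
# the oops.
read_tree() {
    local dir=$1
    find "$dir" -type f -exec cat {} + 2>/dev/null | wc -c | tr -d '[:space:]'
    echo
}
```

For example, `read_tree /nfs/share` followed by `sudo dmesg | grep -c 'BUG: unable to handle page fault'` would show whether the read produced any oops lines.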

Here's a log from a third machine:
[ 359.319126] BUG: unable to handle page fault for address: ffff9d95f6147697
[ 359.319132] #PF: supervisor read access in kernel mode
[ 359.319134] #PF: error_code(0x0000) - not-present page
[ 359.319136] PGD 378401067 P4D 378401067 PUD 0
[ 359.319141] Oops: 0000 [#1] SMP PTI
[ 359.319146] CPU: 2 PID: 321 Comm: jbd2/sda6-8 Tainted: G OE 5.4.0-40-generic #44-Ubuntu
[ 359.319148] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B85M Pro4, BIOS P2.50 12/11/2015
[ 359.319154] RIP: 0010:kmem_cache_alloc+0x7e/0x230
[ 359.319158] Code: 99 01 00 00 4d 8b 07 65 49 8b 50 08 65 4c 03 05 40 9d 56 4a 4d 8b 20 4d 85 e4 0f 84 85 01 00 00 41 8b 47 20 49 8b 3f 4c 01 e0 <48> 8b 18 48 89 c1 49 33 9f 70 01 00 00 4c 89 e0 48 0f c9 48 31 cb
[ 359.319161] RSP: 0018:ffffa8984060b7c8 EFLAGS: 00010282
[ 359.319163] RAX: ffff9d95f6147697 RBX: 0000000000000000 RCX: 0000000000000001
[ 359.319165] RDX: 0000000000000019 RSI: 0000000000092a20 RDI: 0000000000031f90
[ 359.319167] RBP: ffffa8984060b7f8 R08: ffff9d954dab1f90 R09: ffff9d953f8132b8
[ 359.319169] R10: ffff9d953f8276c8 R11: 0000000000000029 R12: ffff9d95f6147697
[ 359.319171] R13: 0000000000092a20 R14: ffff9d954ada4380 R15: ffff9d954ada4380
[ 359.319174] FS: 0000000000000000(0000) GS:ffff9d954da80000(0000) knlGS:0000000000000000
[ 359.319176] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 359.319178] CR2: ffff9d95f6147697 CR3: 0000000377a0a002 CR4: 00000000001606e0
[ 359.319180] Call Trace:
[ 359.319187] ? mempool_alloc_slab+0x17/0x20
[ 359.319192] mempool_alloc_slab+0x17/0x20
[ 359.319196] mempool_alloc+0x64/0x180
[ 359.319201] ? __enqueue_entity+0x96/0xa0
[ 359.319206] sg_pool_alloc+0x4f/0x60
[ 359.319211] __sg_alloc_table+0x10b/0x170
[ 359.319214] sg_alloc_table_chained+0x47/0xa0
[ 359.319217] ? mac_pton+0xb0/0xb0
[ 359.319222] scsi_init_io+0x52/0x180
[ 359.319227] sd_setup_read_write_cmnd+0x67/0x710
[ 359.319231] sd_init_command+0x11a/0x472
[ 359.319235] scsi_queue_rq+0x32e/0xa00
[ 359.319239] blk_mq_dispatch_rq_list+0x96/0x5a0
[ 359.319243] ? deadline_remove_request+0x4e/0xb0
[ 359.319246] ? dd_dispatch_request+0x1/0x1f0
[ 359.319250] blk_mq_do_dispatch_sched+0x67/0x100
[ 359.319254] blk_mq_sched_dispatch_requests+0x12d/0x180
[ 359.319259] __blk_mq_run_hw_queue+0x5a/0x110
[ 359.319263] __blk_mq_delay_run_hw_queue+0x15b/0x160
[ 359.319267] blk_mq_run_hw_queue+0x92/0x120
[ 359.319270] blk_mq_sched_insert_requests+0x74/0x100
[ 359.319273] blk_mq_flush_plug_list+0x1e8/0x290
[ 359.319277] blk_flush_plug_list+0xe3/0x110
[ 359.319280] blk_finish_plug+0x26/0x34
[ 359.319287] jbd2_journal_commit_transaction+0xda5/0x17e8
[ 359.319293] kjournald2+0xb6/0x280
[ 359.319297] ? wait_woken+0x80/0x80
[ 359.319301] kthread+0x104/0x140
[ 359.319304] ? commit_timeout+0x20/0x20
[...

Kai-Heng Feng (kaihengfeng) wrote :

$ sudo apt build-dep linux
$ git clone git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic
$ cd bionic
$ git bisect start
$ git bisect good Ubuntu-5.4.0-39.43
$ git bisect bad Ubuntu-5.4.0-40.44
$ fakeroot debian/rules do_dkms_nvidia=false do_dkms_vbox=false do_dkms_wireguard=false do_tools=false do_zfs=false no_dumpfile=1 skipabi=true skipconfig=true skipdbg=true skipmodule=true skipretpoline=true clean prepare-generic binary-headers binary-generic
Install the newly built kernel, then reboot with it.
If it still has the same issue,
$ git bisect bad
Otherwise,
$ git bisect good
Repeat to "make -j`nproc` deb-pkg" until you find the offending commit.

Andrew Conway (acubuntuone) wrote :

I also have this problem. I reported it as a new bug, 1886775, which is probably just a duplicate of this one. Same issue: -40 dies with NFS with a similar stack trace and similar timing, -39 is fine, and several different machines show the identical issue.

Marian Rainer-Harbach (marianrh) wrote :

I tried bisecting the kernel as described in #22. However, I used the URL git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/focal instead of git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic; I hope that's correct.

The fakeroot command works and builds the kernel. However, doing make later fails with:
***
*** Configuration file ".config" not found!
***
*** Please run some configurator (e.g. "make oldconfig" or
*** "make menuconfig" or "make xconfig").
***
make: *** [Makefile:695: .config] Error 1

I don't know how to proceed from here.

Kai-Heng Feng (kaihengfeng) wrote :

Please ignore "Repeat to "make -j`nproc` deb-pkg" until you find the offending commit."; that is another way to build a kernel.

Please install the newly built kernel package, reboot and test. Then continue the process with `git bisect good` or `git bisect bad`.
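
The test step of each bisect round can be sketched as a helper that turns kernel log text into a verdict. This is an illustration, not part of the Ubuntu tooling: the name `bisect_verdict` is hypothetical, and a "good" result is only meaningful after NFS has been exercised for long enough on the freshly booted kernel.

```shell
#!/usr/bin/env bash
# Decide a bisect verdict from kernel log text on stdin:
# prints "bad" if a page-fault BUG line is present, "good" otherwise.
bisect_verdict() {
    if grep -q 'BUG: unable to handle page fault'; then
        echo bad
    else
        echo good
    fi
}
```

After rebooting into the test kernel and using NFS for a while, running `git bisect "$(sudo dmesg | bisect_verdict)"` in the source tree would record the result and check out the next commit to build.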

summary: - unable to handle page fault in mempool_alloc_slab
+ Regression on NFS: unable to handle page fault in mempool_alloc_slab
Marian Rainer-Harbach (marianrh) wrote :

Thanks for the advice for bisecting! I found the commits causing the problem:

After bb4fb62a863ea6131bdac77b21faa9444a605c58 (SUNRPC: Add "@len" parameter to gss_unwrap()) NFS with Kerberos is completely broken:
$ mount -v /nfs/share
mount.nfs: trying text-based options 'sec=krb5p,soft,intr,bg,timeo=100,retrans=2,vers=4.2,addr=xxxx:xxxx:xxxx:1::1,clientaddr=xxxx:xxxx:xxxx:1::dbfb'
mount.nfs: mount(2): Input/output error
mount.nfs: mount system call failed

After c711ec1aa0c8ff7d979d7f12d1d09df46f349d61 (SUNRPC: Fix GSS privacy computation of auth->au_ralign) mounts work again but the crashes shown in the previous comments occur.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Kai-Heng Feng (kaihengfeng) wrote :

Please test latest mainline kernel:
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.8-rc4/

If the mainline kernel works, we need to backport some commits to fill the gap between v5.4 and the offending commit.

If the mainline kernel doesn't work, we need to raise the issue upstream.

Marian Rainer-Harbach (marianrh) wrote :

I downloaded and installed the packages from https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.8-rc4/amd64/ and tested 5.8.0-050800-generic #202007052030. I could not reproduce the problem in this version.

Kai-Heng Feng (kaihengfeng) wrote :

Can you please also test upstream 5.4:
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.51/

Hopefully we don't need to backport anything; there are a hundred commits...

Pierre Sauter (pierre-sauter-z) wrote :

I can confirm that 5.8.0-050800-generic does fix the bug. I also tested https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.51/ and the problem is still present in that version.

Pierre Sauter (pierre-sauter-z) wrote :

https://github.com/torvalds/linux/commit/89a3c9f5b9f0bcaa9aea3e8b2a616fcaea9aad78
SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment()

When I apply that patch to 5.4.0-40-generic the original bug disappears, however I sometimes still get:

[Mo Jul 13 20:22:53 2020] BUG: unable to handle page fault for address: ffff98fd15cd0000
[Mo Jul 13 20:22:53 2020] #PF: supervisor write access in kernel mode
[Mo Jul 13 20:22:53 2020] #PF: error_code(0x0003) - permissions violation
[Mo Jul 13 20:22:53 2020] PGD 214c01067 P4D 214c01067 PUD 214c05067 PMD 455d94063 PTE 8000000455cd0061
[Mo Jul 13 20:22:53 2020] Oops: 0003 [#1] SMP PTI
[Mo Jul 13 20:22:53 2020] CPU: 0 PID: 1428 Comm: update-desktop- Tainted: G OE 5.4.0-40-generic #44
[Mo Jul 13 20:22:53 2020] Hardware name: XXXXXXXXXXX
[Mo Jul 13 20:22:53 2020] RIP: 0010:memcpy_erms+0x6/0x10
[Mo Jul 13 20:22:53 2020] Code: ff 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
[Mo Jul 13 20:22:53 2020] RSP: 0018:ffffb4f780bdb610 EFLAGS: 00010286
[Mo Jul 13 20:22:53 2020] RAX: ffff98fd15ccffc4 RBX: ffffb4f780bdba08 RCX: 0000000000000004
[Mo Jul 13 20:22:53 2020] RDX: 0000000000000040 RSI: ffff98fd132eb064 RDI: ffff98fd15cd0000
[Mo Jul 13 20:22:53 2020] RBP: ffffb4f780bdb640 R08: 0000000000000000 R09: 000000000000015b
[Mo Jul 13 20:22:53 2020] R10: ffffb4f780bdb5e0 R11: ffff98fd10f14850 R12: 0000000000000028
[Mo Jul 13 20:22:53 2020] R13: 0000000000000040 R14: ffff98fd188be280 R15: 0000000000000040
[Mo Jul 13 20:22:53 2020] FS: 00007fea854dcb80(0000) GS:ffff98fd1da00000(0000) knlGS:0000000000000000
[Mo Jul 13 20:22:53 2020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mo Jul 13 20:22:53 2020] CR2: ffff98fd15cd0000 CR3: 00000004532e0003 CR4: 00000000003606f0
[Mo Jul 13 20:22:53 2020] Call Trace:
[Mo Jul 13 20:22:53 2020] ? _copy_from_pages+0x6f/0xa0 [sunrpc]
[Mo Jul 13 20:22:53 2020] xdr_shrink_pagelen+0x83/0xb0 [sunrpc]
[Mo Jul 13 20:22:53 2020] xdr_align_pages+0x8e/0x1c0 [sunrpc]
[Mo Jul 13 20:22:53 2020] xdr_read_pages+0x18/0x80 [sunrpc]
[Mo Jul 13 20:22:53 2020] nfs4_xdr_dec_readlink+0xea/0x140 [nfsv4]
[Mo Jul 13 20:22:53 2020] rpcauth_unwrap_resp_decode+0x27/0x30 [sunrpc]
[Mo Jul 13 20:22:53 2020] gss_unwrap_resp+0x358/0x5a0 [auth_rpcgss]
[Mo Jul 13 20:22:53 2020] ? call_bind_status+0x290/0x290 [sunrpc]
[Mo Jul 13 20:22:53 2020] rpcauth_unwrap_resp+0x24/0x30 [sunrpc]
[Mo Jul 13 20:22:53 2020] call_decode+0x158/0x1d0 [sunrpc]
[Mo Jul 13 20:22:53 2020] __rpc_execute+0x8c/0x3a0 [sunrpc]
[Mo Jul 13 20:22:53 2020] rpc_execute+0xa0/0xb0 [sunrpc]
[Mo Jul 13 20:22:53 2020] rpc_run_task+0x120/0x150 [sunrpc]
[Mo Jul 13 20:22:53 2020] nfs4_call_sync_custom+0x10/0x30 [nfsv4]
[Mo Jul 13 20:22:53 2020] nfs4_call_sync_sequence+0x65/0x80 [nfsv4]
[Mo Jul 13 20:22:53 2020] _nfs4_proc_readlink+0xa3/0xc0 [nfsv4]
[Mo Jul 13 20:22:53 2020] nfs4_proc_readlink+0x6e/0x100 [nfsv4]
[Mo Jul 13 20:22:53 2020] nfs_symlink_filler+0x33/0x70 [nfs]
[Mo Jul 13 20:22:53 2020] do_read_cache_page+0x2f6/0x830
[Mo Jul 13 20:22:53 2020] ? nfs_get_link+0x12...

Kai-Heng Feng (kaihengfeng) wrote :

The patch author wants to know if upstream stable 5.5 has the same issue.
The kernel can be found here:
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.5.19/

Pierre Sauter (pierre-sauter-z) wrote :

5.5.19 looks good to me. My home directory is on NFSv4 with krb5p, so the original bug triggers immediately after login and the desktop environment locks up.
The problem mentioned in comment #31 is not really reproducible. I only encountered it on the first few boots on Monday; the patched 5.4.0-40 has been stable yesterday and today for my use case.

Kai-Heng Feng (kaihengfeng) wrote :

From the patch author:

"Please ask what encryption type is in use. The
kerberos_v1 enctypes might exercise a code path I wasn't able to
test."

"Have the testers enable memory debugging : KASAN or SLUB debugging
might provide more information. I might have some time later this week
to try reproducing on upstream stable, but no guarantees."

Follow the discussion here: https://<email address hidden>/
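
For testers who want to follow the request above, this sketch reports whether a kernel was built with KASAN and whether SLUB debugging is switched on at boot. The function name `memdebug_status` is hypothetical. `CONFIG_KASAN` and the `slub_debug` boot parameter are the upstream kernel names; whether a given distribution kernel has KASAN built in is an assumption to check, not something this report states.

```shell
#!/usr/bin/env bash
# Report which memory debugging aids a kernel offers.
# $1: path to a kernel config file, $2: kernel command line string.
memdebug_status() {
    local config=$1 cmdline=$2
    if grep -q '^CONFIG_KASAN=y' "$config"; then
        echo "kasan: built in"
    else
        echo "kasan: not built in"
    fi
    # Match slub_debug as a separate command-line word, with or without flags.
    case " $cmdline " in
        *" slub_debug"*) echo "slub_debug: enabled on command line" ;;
        *)               echo "slub_debug: not enabled" ;;
    esac
}
```

On a running system this could be called as `memdebug_status "/boot/config-$(uname -r)" "$(cat /proc/cmdline)"`. If KASAN is not built in, adding something like `slub_debug=FZPU` to the kernel command line and rebooting is the lighter option, at some cost in speed and memory.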

Jani Jaakkola (jj-lousa) wrote :

This also affects Ubuntu Bionic if linux-generic-hwe-18.04 is installed. The 4.15.0-112-generic kernel works fine.

Jani Jaakkola (jj-lousa) wrote :

The enctypes according to klist are: Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96. The Kerberos tickets come from an Active Directory server and the NFS servers are NetApps, if that matters.

Andrew Conway (acubuntuone) wrote :

For what it is worth, I also have the same encryption type, aes256-cts-hmac-sha1-96 (and the same problem). The tickets come from MIT Kerberos on Ubuntu 18.04; the NFS servers are Ubuntu 18.04 using the krb5p security option.
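
To answer the patch author's question in a uniform way, the encryption types can be pulled out of `klist -e` output. The sketch below assumes the "Etype (skey, tkt):" line format quoted in the comment above; the function name `enctypes_from_klist` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the distinct encryption types found in `klist -e` output (stdin).
enctypes_from_klist() {
    sed -n 's/.*Etype (skey, tkt): *//p' \
        | tr ',' '\n' \
        | sed 's/^ *//; s/ *$//' \
        | sort -u
}
```

Running `klist -e | enctypes_from_klist` on an affected client lists each enctype once, which makes it easy to spot whether any reporter uses something other than aes256-cts-hmac-sha1-96.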

Pierre Sauter (pierre-sauter-z) wrote :
Marian Rainer-Harbach (marianrh) wrote :

I can confirm that 5.4.59-050459-generic #202008190333 seems to work; I could not reproduce the problem there.

Marc Kolly (makuser) wrote :

That is great to hear @marianrh!

After the bug was introduced, I switched the entire workforce to the 5.8 kernel on those machines that are still on 18.04 and have HWE installed. I have now installed two machines with Ubuntu 20.04 for testing purposes and the problem reappeared quite quickly, so I put them on the 5.8 kernel as well.

I will try 5.4.60-050460-generic #202008210836 now and see if it works as well.

Kleber Sacilotto de Souza (kleber-souza) wrote :

A fix mentioning this bug report has been applied upstream and to mainline stable linux-5.4.y and backported to focal/linux as part of our regular stable backports (bug 1892417):

SUNRPC: Fix ("SUNRPC: Add "@len" parameter to gss_unwrap()")
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/focal/commit/?h=master-next&id=0ed4abf7a4dceb99ed2dd2a3d593bb17409c8df9

This patch is committed for the next kernel SRU.

Changed in linux (Ubuntu Focal):
status: New → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Marian Rainer-Harbach (marianrh) wrote :

I tested kernel 5.4.0-46-generic #50 from focal-proposed. This version seems to fix the problem; I could not reproduce it any more.
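
For administrators rolling this out, here is a minimal sketch for checking whether a machine runs at least the -proposed version tested above. The function name `kernel_at_least` is hypothetical, and the comparison relies on GNU `sort -V`, which is assumed to be available.

```shell
#!/usr/bin/env bash
# Succeed if kernel release $1 (as printed by `uname -r`) is at least
# version $2. Relies on GNU `sort -V` for version ordering.
kernel_at_least() {
    local cur=${1%-[a-z]*}   # drop the flavour suffix, e.g. "-generic"
    local min=$2
    # The minimum sorts first (or ties) exactly when cur >= min.
    [ "$(printf '%s\n%s\n' "$min" "$cur" | sort -V | head -n1)" = "$min" ]
}
```

A possible use is `kernel_at_least "$(uname -r)" 5.4.0-46 || echo "kernel predates the tested fix"`. This only compares version strings; it does not prove that the fix is effective on a particular setup.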

tags: added: verification-done-focal
removed: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.8.0-18.19

---------------
linux (5.8.0-18.19) groovy; urgency=medium

  * groovy/linux: 5.8.0-18.19 -proposed tracker (LP: #1893047)

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * Groovy update: v5.8.4 upstream stable release (LP: #1893048)
    - drm/vgem: Replace opencoded version of drm_gem_dumb_map_offset()
    - drm/panel-simple: Fix inverted V/H SYNC for Frida FRD350H54004 panel
    - drm/ast: Remove unused code paths for AST 1180
    - drm/ast: Initialize DRAM type before posting GPU
    - khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
    - ALSA: hda: avoid reset of sdo_limit
    - ALSA: hda/realtek: Add quirk for Samsung Galaxy Flex Book
    - ALSA: hda/realtek: Add quirk for Samsung Galaxy Book Ion
    - can: j1939: transport: j1939_session_tx_dat(): fix use-after-free read in
      j1939_tp_txtimer()
    - can: j1939: socket: j1939_sk_bind(): make sure ml_priv is allocated
    - spi: Prevent adding devices below an unregistering controller
    - io_uring: find and cancel head link async work on files exit
    - mm/vunmap: add cond_resched() in vunmap_pmd_range
    - romfs: fix uninitialized memory leak in romfs_dev_read()
    - kernel/relay.c: fix memleak on destroy relay channel
    - uprobes: __replace_page() avoid BUG in munlock_vma_page()
    - squashfs: avoid bio_alloc() failure with 1Mbyte blocks
    - mm: include CMA pages in lowmem_reserve at boot
    - mm, page_alloc: fix core hung in free_pcppages_bulk()
    - ASoC: amd: renoir: restore two more registers during resume
    - RDMA/hfi1: Correct an interlock issue for TID RDMA WRITE request
    - opp: Enable resources again if they were disabled earlier
    - opp: Put opp table in dev_pm_opp_set_rate() for empty tables
    - opp: Put opp table in dev_pm_opp_set_rate() if _set_opp_bw() fails
    - ext4: do not block RWF_NOWAIT dio write on unallocated space
    - ext4: fix checking of directory entry validity for inline directories
    - jbd2: add the missing unlock_buffer() in the error path of
      jbd2_write_superblock()
    - scsi: zfcp: Fix use-after-free in request timeout handlers
    - selftests: kvm: Use a shorter encoding to clear RAX
    - s390/pci: fix zpci_bus_link_virtfn()
    - s390/pci: re-introduce zpci_remove_device()
    - s390/pci: fix PF/VF linking on hot plug
    - s390/pci: ignore stale configuration request event
    - mm/memory.c: skip spurious TLB flush for retried page fault
    - drm: amdgpu: Use the correct size when allocating memory
    - drm/amdgpu/display: use GFP_ATOMIC in dcn20_validate_bandwidth_internal
    - drm/amd/display: Fix incorrect backlight register offset for DCN
    - drm/amd/display: Fix EDID parsing after resume from suspend
    - drm/amd/display: Blank stream before destroying HDCP session
    - drm/amd/display: Fix DFPstate hang due to view port changed
    - drm/amd/display: fix pow() crashing when given base 0
    - drm/i915/pmu: Prefer drm_WARN_ON over WARN_ON
    - drm/i915: Provide the perf pmu.module
    - scsi: ufs: Add DELAY_BEFORE_LPM quirk for Micron devices
    - scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM
  ...

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Martin (emw-launchpad) wrote :

Hi all,

After installing 5.4.0-47 I also had the impression that the bug was gone, and I was happy.
Until now: I'm getting this type of data corruption with the current main focal kernel:

    Linux 5.4.0-47-generic #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

(relatively fresh Ubuntu 20.04 installation on ZFS after this bug hopelessly corrupted the old ext4 installation)

Setup:
  * remote NFS4 + krb5 (over Wi-Fi)
  * local ZFS
Trigger:
  * rsync'ing a large amount of data from ZFS (local) to NFS4 (remote)

Workqueue: rpciod rpc_async_schedule [sunrpc]
RIP:
   #1: 0010:kmem_cache_free+0x237/0x2b0
   #2: 0010:kmem_cache_alloc+0x7e/0x230

Any idea?

BR, Martin

[198007.326710] ------------[ cut here ]------------
[198007.326711] virt_to_cache: Object is not a Slab page!
[198007.326721] WARNING: CPU: 2 PID: 1317011 at mm/slab.h:473 kmem_cache_free+0x237/0x2b0
[198007.326722] Modules linked in: cx23885 altera_ci tda18271 altera_stapl m88ds3103 tveeprom cx2341x videobuf2_dvb dvb_core rc_core videobuf2_dma_sg videobuf2_memops videobuf2_v4l2 videobuf2_common btrfs xor zstd_compress raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache cmac algif_hash algif_skcipher af_alg bnep nls_iso8859_1 si2157 si2168 cx25840 i2c_mux snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi videodev snd_hda_intel snd_intel_dspcfg mc snd_hda_codec snd_hda_core snd_hwdep mei_hdcp intel_rapl_msr snd_pcm snd_seq_midi snd_seq_midi_event intel_rapl_common x86_pkg_temp_thermal intel_powerclamp snd_rawmidi kvm_intel btusb btrtl snd_seq kvm btbcm btintel crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd eeepc_wmi cryptd glue_helper snd_seq_device snd_timer rapl intel_cstate bluetooth snd asus_wmi sparse_keymap ecdh_generic ecc wmi_bmof cdc_acm mei_me soundcore mei mac_hid
[198007.326749] acpi_pad sch_fq_codel nct6775 hwmon_vid coretemp parport_pc ppdev lp parport sunrpc ip_tables x_tables autofs4 zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlua(POE) hid_generic usbhid hid i915 i2c_algo_bit mxm_wmi crc32_pclmul drm_kms_helper ahci libahci syscopyarea r8169 lpc_ich i2c_i801 sysfillrect realtek sysimgblt fb_sys_fops drm wmi video [last unloaded: dvb_core]
[198007.326765] CPU: 2 PID: 1317011 Comm: kworker/u8:3 Tainted: P OE 5.4.0-47-generic #51-Ubuntu
[198007.326766] Hardware name: ASUS All Series/H97M-E, BIOS 2702 03/28/2016
[198007.326804] Workqueue: rpciod rpc_async_schedule [sunrpc]
[198007.326809] RIP: 0010:kmem_cache_free+0x237/0x2b0
[198007.326810] Code: ff ff ff 80 3d a6 45 56 01 00 0f 85 39 ff ff ff 48 c7 c6 60 44 87 a5 48 c7 c7 00 2e b8 a5 c6 05 8b 45 56 01 01 e8 14 7f df ff <0f> 0b e9 18 ff ff ff 48 8b 57 58 49 8b 4f 58 48 c7 c6 70 44 87 a5
[198007.326811] RSP: 0018:ffffae38c34e3d20 EFLAGS: 00010282
[198007.326812] RAX: 0000000000000000 RBX: ffff927771c5355f RCX: 0000000000000006
[198007.326812] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff9277d79178c0
[198007.326813] RBP: ffffae38c34e3d48 R08: 0000000000000b72 R09: 0000000000000004
[198007.326813] R10: 0000000000000000 R11...
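
For anyone comparing traces across comments, here is a minimal Python sketch that pulls the kernel-mode RIP lines out of a saved dmesg or journal dump. It assumes the x86_64 log format shown in this report (kernel code segment 0010); the name `extract_rips` and the regular expression are made up for illustration and are not part of any kernel tooling.

```python
import re

# Matches kernel-mode RIP lines such as
# "RIP: 0010:kmem_cache_free+0x237/0x2b0" and captures symbol, offset, size.
RIP_RE = re.compile(
    r"RIP: 0010:(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def extract_rips(log_text):
    """Return a list of (symbol, offset, size) for each kernel-mode RIP line."""
    results = []
    for line in log_text.splitlines():
        match = RIP_RE.search(line)
        if match:
            results.append(
                (match["symbol"], int(match["offset"], 16), int(match["size"], 16))
            )
    return results


# Three lines copied from the trace above; only the last one is a RIP line.
sample = """\
[198007.326721] WARNING: CPU: 2 PID: 1317011 at mm/slab.h:473 kmem_cache_free+0x237/0x2b0
[198007.326804] Workqueue: rpciod rpc_async_schedule [sunrpc]
[198007.326809] RIP: 0010:kmem_cache_free+0x237/0x2b0
"""

for symbol, offset, size in extract_rips(sample):
    print(f"{symbol} at +{offset:#x} of {size:#x}")
```

User-mode RIP lines (segment 0033, raw address without a symbol) are skipped on purpose, since they say nothing about where in the kernel the fault happened.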

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-48.52

---------------
linux (5.4.0-48.52) focal; urgency=medium

  * focal/linux: 5.4.0-48.52 -proposed tracker (LP: #1894654)

  * mm/slub kernel oops on focal kernel 5.4.0-45 (LP: #1895109)
    - SAUCE: Revert "mm/slub: fix a memory leak in sysfs_slab_add()"

  * Packaging resync (LP: #1786013)
    - update dkms package versions
    - update dkms package versions

  * Introduce the new NVIDIA 450-server and the 450 UDA series (LP: #1887674)
    - [packaging] add signed modules for nvidia 450 and 450-server

  * [UBUNTU 20.04] zPCI attach/detach issues with PF/VF linking support
    (LP: #1892849)
    - s390/pci: fix zpci_bus_link_virtfn()
    - s390/pci: re-introduce zpci_remove_device()
    - s390/pci: fix PF/VF linking on hot plug

  * [UBUNTU 20.04] kernel: s390/cpum_cf,perf: changeDFLT_CCERROR counter name
    (LP: #1891454)
    - s390/cpum_cf, perf: change DFLT_CCERROR counter name

  * [UBUNTU 20.04] zPCI: Enabling of a reserved PCI function regression
    introduced by multi-function support (LP: #1891437)
    - s390/pci: fix enabling a reserved PCI function

  * CVE-2020-12888
    - vfio/type1: Support faulting PFNMAP vmas
    - vfio-pci: Fault mmaps to enable vma tracking
    - vfio-pci: Invalidate mmaps and block MMIO access on disabled memory

  * [Hyper-V] VSS and File Copy daemons intermittently fails to start
    (LP: #1891224)
    - [Packaging] Bind hv_vss_daemon startup to hv_vss device
    - [Packaging] bind hv_fcopy_daemon startup to hv_fcopy device

  * alsa/hdmi: support nvidia mst hdmi/dp audio (LP: #1867704)
    - ALSA: hda - Rename snd_hda_pin_sense to snd_hda_jack_pin_sense
    - ALSA: hda - Add DP-MST jack support
    - ALSA: hda - Add DP-MST support for non-acomp codecs
    - ALSA: hda - Add DP-MST support for NVIDIA codecs
    - ALSA: hda: hdmi - fix regression in connect list handling
    - ALSA: hda: hdmi - fix kernel oops caused by invalid PCM idx
    - ALSA: hda: hdmi - preserve non-MST PCM routing for Intel platforms
    - ALSA: hda: hdmi - Keep old slot assignment behavior for Intel platforms
    - ALSA: hda - Fix DP-MST support for NVIDIA codecs

  * Focal update: v5.4.60 upstream stable release (LP: #1892899)
    - smb3: warn on confusing error scenario with sec=krb5
    - genirq/affinity: Make affinity setting if activated opt-in
    - genirq/PM: Always unlock IRQ descriptor in rearm_wake_irq()
    - PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context()
    - PCI: Add device even if driver attach failed
    - PCI: qcom: Define some PARF params needed for ipq8064 SoC
    - PCI: qcom: Add support for tx term offset for rev 2.1.0
    - btrfs: allow use of global block reserve for balance item deletion
    - btrfs: free anon block device right after subvolume deletion
    - btrfs: don't allocate anonymous block device for user invisible roots
    - btrfs: ref-verify: fix memory leak in add_block_entry
    - btrfs: stop incremening log_batch for the log root tree when syncing log
    - btrfs: remove no longer needed use of log_writers for the log root tree
    - btrfs: don't traverse into the seed devices in show_devname
    - btrfs: open device...

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released
Marc Kolly (makuser) wrote :

Mainline 5.4.60-050460-generic #202008210836 did indeed fix the setup for me; I just forgot to report back here.

Hello Martin,
It looks like the fix had not been included in the 5.4.0-47 release, but only in the following 5.4.0-48 release.

Have you had time to update to the newer kernel included with Ubuntu 20.04 and check whether it fixes the problem for you?
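
A quick way to check whether a machine is already on a kernel at or past 5.4.0-48 is sketched below in Python. It only understands Ubuntu's focal naming scheme (5.4.0-N-flavour), so mainline builds such as 5.4.60-050460-generic and other kernel series are simply reported as False; the helper names `ubuntu_abi` and `has_focal_fix` are hypothetical.

```python
import platform
import re


def ubuntu_abi(release):
    """Split '5.4.0-47-generic' into ((5, 4, 0), 47); return None if it does not parse."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        return None
    major, minor, patch, abi = (int(part) for part in match.groups())
    return (major, minor, patch), abi


def has_focal_fix(release, fixed_abi=48):
    """True only for Ubuntu 5.4.0 kernels whose ABI number is >= fixed_abi."""
    parsed = ubuntu_abi(release)
    if parsed is None:
        return False
    base, abi = parsed
    return base == (5, 4, 0) and abi >= fixed_abi


# Check the kernel this script is running on.
running = platform.release()
print(running, has_focal_fix(running))
```

The threshold of 48 comes from the 5.4.0-48.52 changelog in this report; it is an assumption that later focal 5.4.0 kernels keep the fix.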

On a possibly connected note, may I ask you how you are mounting the nfs share on the client?
My server: Linux storage 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux with nfs-kernel-server 1:1.3.4-2.5+deb10u1.

I switched from mounting the drives via fstab to a systemd mount after 14.04 with the following options: _netdev,sec=krb5p,vers=4.1,auto
I have to use NFS 4.1 by setting vers=4.1; whenever I omit it, the client still freezes when a file is being copied or cut (Ctrl+C/X).
Do you run into the same issue and has this been fixed for you on 5.4.0-48 as well?
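
To make mount setups easier to compare, the options actually in effect can be read from /proc/mounts rather than from fstab or the unit file. A minimal Python sketch under that assumption follows; the function name `nfs_mounts` and the sample lines (server name, export path, mount point) are made up for illustration.

```python
def nfs_mounts(mounts_text):
    """Yield (mountpoint, fstype, options) for NFS entries in /proc/mounts format."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mountpoint, fstype, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for item in fields[3].split(","):
            key, _, value = item.partition("=")
            options[key] = value  # flags without "=" get an empty string
        yield fields[1], fields[2], options


# Hypothetical sample in /proc/mounts format, not taken from a real machine.
sample = (
    "storage:/export /mnt/storage nfs4 rw,relatime,vers=4.1,sec=krb5p 0 0\n"
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
)

for mountpoint, fstype, opts in nfs_mounts(sample):
    print(mountpoint, fstype, "vers=" + opts.get("vers", "?"), "sec=" + opts.get("sec", "?"))
```

On a live client, pass the contents of /proc/mounts instead of the sample, for example `nfs_mounts(open("/proc/mounts").read())`.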

Kind regards,
Marc

Andrew Conway (acubuntuone) wrote :

5.4.0-48-generic seems to have fixed this problem for me, thanks!

Alex (superdoublefire) wrote :

This seems to be the same bug on Ubuntu 20.04 with kernel 5.4.0-52. It reproduces for me only under high CPU load (for example, during a Java compilation).
---------------------
[10798.058030] bpfilter: Loaded bpfilter_umh pid 17193
[10798.058119] Started bpfilter
[14269.588116] BUG: unable to handle page fault for address: ffff919d84135500
[14269.588119] #PF: supervisor read access in kernel mode
[14269.588119] #PF: error_code(0x0000) - not-present page
[14269.588120] PGD 1f4c01067 P4D 1f4c01067 PUD 0
[14269.588122] Oops: 0000 [#1] SMP NOPTI
[14269.588123] CPU: 5 PID: 21991 Comm: Chrome_ChildIOT Kdump: loaded Not tainted 5.4.0-52-generic #57-Ubuntu
[14269.588123] Hardware name: Gigabyte Technology Co., Ltd. B365M D3H/B365M D3H-CF, BIOS F3c 11/28/2019
[14269.588126] RIP: 0010:lock_page_memcg+0x1f/0x90
[14269.588127] Code: 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 0f 1f 44 00 00 48 89 fb 4c 8b 67 38 4d 85 e4 74 36 <41> 8b 84 24 00 05 00 00 85 c0 7e 2d 4d 8d ac 24 b8 04 00 00 4c 89
[14269.588128] RSP: 0018:ffffb6ca43047bc8 EFLAGS: 00010286
[14269.588129] RAX: ffffeaf69e86afc0 RBX: ffffeaf69e86afc0 RCX: 8000000000000005
[14269.588129] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffeaf69e86afc0
[14269.588130] RBP: ffffb6ca43047be0 R08: ffffeaf69e86afc0 R09: 0000000000000000
[14269.588130] R10: 0000000000000001 R11: ffff91bdaf7d5000 R12: ffff919d84135000
[14269.588131] R13: 000055986f674000 R14: 80000007a1abf005 R15: 000055986f675000
[14269.588132] FS: 00007f7768299700(0000) GS:ffff91bd8f340000(0000) knlGS:0000000000000000
[14269.588132] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14269.588133] CR2: ffff919d84135500 CR3: 00000001f420a002 CR4: 00000000003606e0
[14269.588134] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[14269.588134] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[14269.588134] Call Trace:
[14269.588137] page_remove_rmap+0xa5/0x310
[14269.588139] ? tlb_flush_mmu+0x3a/0x140
[14269.588140] zap_pte_range.isra.0+0x273/0x7f0
[14269.588141] unmap_page_range+0x2da/0x4a0
[14269.588142] unmap_single_vma+0x7f/0xf0
[14269.588143] unmap_vmas+0x79/0xf0
[14269.588144] exit_mmap+0xb4/0x1b0
[14269.588146] mmput+0x5d/0x130
[14269.588147] do_exit+0x306/0xac0
[14269.588148] __x64_sys_exit+0x1b/0x20
[14269.588150] do_syscall_64+0x57/0x190
[14269.588152] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[14269.588153] RIP: 0033:0x559870c0bee5
[14269.588155] Code: Bad RIP value.
[14269.588155] RSP: 002b:00007f77682979a0 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
[14269.588156] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000559870c0bee5
[14269.588156] RDX: 0000000000000001 RSI: 0000000000000009 RDI: 0000000000000000
[14269.588157] RBP: 00007f7768297d20 R08: 000000000000008c R09: 000000000000000e
[14269.588157] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f7768297d30
[14269.588158] R13: 0000000000000000 R14: 0000000000000003 R15: 0000000000000000
[14269.588159] Modules linked in: xt_tcpudp xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip6table_filter ip6_tables iptable_filter bpfilter cpuid snd_hda_codec_hdmi intel_rapl_msr intel_rapl...
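
As a reading aid for the "#PF:" lines in traces like this one, the sketch below decodes an x86 page-fault error code into the conditions it encodes, and checks how the faulting address (CR2) relates to R12 in this particular oops. The bit meanings are the architectural ones; the helper name `decode_pf_error` is made up, and reading the fault as a load through R12 is my own interpretation of the register dump, not something confirmed in this report.

```python
# x86 page-fault error code bits: (mask, meaning if set, meaning if clear).
PF_BITS = [
    (0x01, "protection violation", "not-present page"),
    (0x02, "write access", "read access"),
    (0x04, "user access", "supervisor access"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error(code):
    """Return the list of conditions encoded in a page-fault error code."""
    parts = []
    for mask, if_set, if_clear in PF_BITS:
        text = if_set if code & mask else if_clear
        if text is not None:
            parts.append(text)
    return parts


# error_code(0x0000) from the log above.
print(decode_pf_error(0x0000))

# Values copied from the register dump above.
r12 = 0xFFFF919D84135000
cr2 = 0xFFFF919D84135500
print(hex(cr2 - r12))
```

An error code of 0 therefore matches the two log lines "supervisor read access in kernel mode" and "not-present page", and the fault address sits 0x500 bytes past the value in R12.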

Po-Hsu Lin (cypressyew) wrote :

Hi Alex,
could you open a new bug to track this issue, using the command "ubuntu-bug linux"?
Thank you!
