stress-ng on gcov enabled focal kernel triggers OOPS

Bug #1879470 reported by Colin Ian King
Affects          Status         Importance  Assigned to  Milestone
linux (Ubuntu)   Fix Released   Undecided   Unassigned
  Focal          Fix Committed  Undecided   Unassigned

Bug Description

== SRU Justification Focal RISC-V ==

Running the stress-ng coverage tests on an ARM64 VM with a gcov-enabled 5.4.0-26-generic focal kernel trips an oops because the kernel stack is too small.

== Fix ==

Clean Upstream commit:

commit 0cac21b02ba5f3095fd2dcc77c26a25a0b2432ed
Author: Andreas Schwab <email address hidden>
Date: Mon Jul 6 14:32:26 2020 +0200

    riscv: use 16KB kernel stack on 64-bit

== Test ==

Without the fix, the kernel hits stack overflows: running the stress-ng kernel-coverage.sh script on a gcov-enabled kernel triggers the issue. With the fix, the issue does not occur.
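
As a rough aid for the test step, the kernel log can be scanned for an oops after the coverage run. This is only a sketch: the helper name log_has_oops is hypothetical, and matching on the string "Oops" is an assumption based on the log excerpt in this report; other architectures or kernel versions may word the message differently.

```shell
# log_has_oops: read kernel log text on stdin and return 0 (success)
# if any line contains the string "Oops". The helper name and the
# matched string are illustrative assumptions, not part of stress-ng.
log_has_oops() {
    grep -q 'Oops'
}

# Example with a line taken from the oops in this report:
printf '%s\n' '[ 894.217586] Internal error: Oops: 9600004f [#1] SMP' \
    | log_has_oops && echo "oops found"

# Hypothetical use on the test machine after ./kernel-coverage.sh:
#   if dmesg | log_has_oops; then echo "FAIL"; else echo "PASS"; fi
```

On a fixed kernel the same check would be expected to find nothing after the coverage run completes.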

== Regression Potential ==

This fix affects only 64-bit RISC-V and doubles the kernel stack size. Systems that run thousands of processes will consume more memory, but this is unlikely to affect anyone because RISC-V systems are generally not used in this manner.
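
To put the memory cost in perspective, the extra consumption can be estimated with simple arithmetic. The sketch below assumes one kernel stack per thread and a change from 8 KiB to 16 KiB (inferred from "16KB" and "doubles" above); the function name extra_stack_kib and the thread count are hypothetical.

```shell
# extra_stack_kib THREADS: print the extra kernel stack memory in KiB
# when every thread's stack grows from 8 KiB to 16 KiB.
# The sizes are assumptions derived from the commit title and the
# "doubles the stack size" statement; adjust them if they differ.
extra_stack_kib() {
    threads=$1
    old_kib=8
    new_kib=16
    echo $(( threads * (new_kib - old_kib) ))
}

extra_stack_kib 10000   # prints 80000 (KiB, roughly 78 MiB)
```

Under these assumptions, even ten thousand threads add well under 100 MiB, which is consistent with the low regression risk stated above.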

----------------------------------------

Running the stress-ng coverage tests on an ARM64 VM with a gcov-enabled 5.4.0-26-generic focal kernel trips the following oops:

cd stress-ng
./kernel-coverage.sh

[ 894.205097] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000bf608000
[ 894.209645] [ffffabd0c7eba050] pgd=00000001b6fff003, pud=00000001b6ffe003, pmd=000000019b19d003, pte=006000011f517f93
[ 894.217586] Internal error: Oops: 9600004f [#1] SMP
[ 894.220753] Modules linked in: userio(+) aes_arm64 algif_rng aegis128 algif_aead anubis fcrypt khazad seed sm4_generic tea crc32_generic md4 michael_mic nhpoly1305_neon nhpoly1305 poly1305_generic rmd128 rmd160 rmd256 rmd320 sha3_generic sm3_generic streebog_generic tgr192 wp512 xxhash_generic algif_hash blowfish_generic blowfish_common cast5_generic des_generic libdes salsa20_generic chacha_neon chacha_generic unix_diag camellia_generic cast6_generic cast_common sctp serpent_generic twofish_generic twofish_common algif_skcipher atm af_alg dccp_ipv4 dccp xfs binfmt_misc zfs(PO) zunicode(PO) zavl(PO) icp(PO) zlua(PO) nls_iso8859_1 zcommon(PO) znvpair(PO) dm_multipath scsi_dh_rdac scsi_dh_emc spl(O) scsi_dh_alua qemu_fw_cfg nfsd auth_rpcgss sch_fq_codel nfs_acl lockd grace drm sunrpc virtio_rng ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_ce
[ 894.227087] ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_net virtio_blk net_failover virtio_scsi failover aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher
[ 894.287863] CPU: 5 PID: 24647 Comm: modprobe Tainted: P O 5.4.0-26-generic #30
[ 894.292103] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 894.295900] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 894.298999] pc : jump_label_swap+0x2c/0x80
[ 894.301451] lr : jump_label_swap+0x1c/0x80
[ 894.304012] sp : ffff8000168cba30
[ 894.305734] x29: ffff8000168cba30 x28: ffffabd0c7eba0b0
[ 894.308535] x27: ffffabd0c7eba050 x26: 0000000000000010
[ 894.311432] x25: 0000000000000010 x24: ffffabd0da14c3c0
[ 894.315165] x23: 0000000000000050 x22: 0000000000000000
[ 894.319411] x21: ffffabd0c7eba000 x20: ffffabd0c7eba050
[ 894.323865] x19: ffffabd0c7eba0b0 x18: 0000000000000000
[ 894.327192] x17: da02346b4a843289 x16: 785f32823a7a414c
[ 894.330232] x15: 000055552ed60f0f x14: 0000000000000004
[ 894.333187] x13: ffffabd0d8a918f0 x12: 0000000000000030
[ 894.336289] x11: ffff0000cc4c3420 x10: ffff00015cc23828
[ 894.339448] x9 : 0000000000000000 x8 : ffffabd0dd503000
[ 894.343242] x7 : 0000000000000000 x6 : 0000000000000000
[ 894.348272] x5 : 0000000000000001 x4 : 00000000ffffd82c
[ 894.351581] x3 : 00000000ffffd9bc x2 : 00000000ffffe410
[ 894.355001] x1 : ffffabd0d9d3e76c x0 : ffffffffffffffa0
[ 894.358183] Call trace:
[ 894.359991] jump_label_swap+0x2c/0x80
[ 894.362067] do_swap+0x40/0x170
[ 894.363886] sort_r+0x200/0x2f0
[ 894.365400] sort+0x30/0x48
[ 894.367488] jump_label_add_module+0x7c/0x458
[ 894.371041] jump_label_module_notify+0xc8/0x190
[ 894.375615] notifier_call_chain+0xa4/0x108
[ 894.378428] __blocking_notifier_call_chain+0xa4/0xe0
[ 894.381374] blocking_notifier_call_chain+0x54/0x70
[ 894.384374] load_module+0x1674/0x1d50
[ 894.386229] __do_sys_finit_module+0x184/0x1d8
[ 894.388569] __arm64_sys_finit_module+0x40/0x58
[ 894.390935] el0_svc_common.constprop.0+0x150/0x5e8
[ 894.394000] el0_svc_handler+0xbc/0x260
[ 894.396581] el0_svc+0x10/0x14
[ 894.398445] Code: b9400262 cb130280 29400e84 4b000042 (b9000282)
[ 894.402792] ---[ end trace 05455454484e1508 ]---
[ 897.757436] loop_set_block_size: loop3 () has still dirty pages (nrpages=1)

Tags: focal
description: updated
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1879470

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: focal
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Changed in linux (Ubuntu):
status: Expired → In Progress
description: updated
Changed in linux (Ubuntu Focal):
status: New → Fix Committed
Changed in linux (Ubuntu):
status: In Progress → Fix Released