Comment 0 for bug 2068024

Po-Hsu Lin (cypressyew) wrote : race_sched in ubuntu_stress_smoke_test will cause kernel panic on Azure 6.8

This issue can be found on:
  * N-Azure-6.8.0-1008.8
  * N-generic-6.8.0-35.35
  * J-Azure-6.8.0-1008.8~22.04.1

It reproduces 100% of the time on a Standard_A2_v2 instance. It can also be reproduced on Standard_D2pds_v5, but at a lower rate.

[ 1167.163045] I/O error, dev loop0, sector 256 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[ 1435.517597] BUG: kernel NULL pointer dereference, address: 00000000000000a0
[ 1435.522651] #PF: supervisor read access in kernel mode
[ 1435.525407] #PF: error_code(0x0000) - not-present page
[ 1435.528122] PGD 0 P4D 0
[ 1435.529813] Oops: 0000 [#1] SMP PTI
[ 1435.531744] CPU: 0 PID: 121253 Comm: stress-ng-race- Tainted: P O 6.8.0-1008-azure #8-Ubuntu
[ 1435.536481] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
[ 1435.543274] RIP: 0010:pick_next_task_fair+0x91/0x620
[ 1435.545480] Code: 91 00 00 00 49 81 bd b0 02 00 00 80 a8 89 92 75 60 4d 89 fe eb 27 4c 89 f7 e8 0b b7 ff ff 84 c0 75 3f 4c 89 f7 e8 5f 04 ff ff <4c> 8b b0 a0 00 00 00 48 89 c3 4d 85 f6 0f 84 f4 00 00 00 49 8b 46
[ 1435.554629] RSP: 0018:ffffb2b202e73cf8 EFLAGS: 00010096
[ 1435.558030] RAX: 0000000000000000 RBX: ffffb2b202e73dc8 RCX: fd78d84d198c4000
[ 1435.562226] RDX: 0000000000000c00 RSI: e411d03fda1d7382 RDI: 0000000000000c02
[ 1435.566496] RBP: ffffb2b202e73d38 R08: 0000000000000002 R09: 0000000000000002
[ 1435.570327] R10: 0000000000000000 R11: 0000000000000000 R12: ffff920dbbc33580
[ 1435.574620] R13: ffff920d05570000 R14: ffff920dbbc33680 R15: ffff920dbbc33680
[ 1435.579115] FS: 00007fb92ad12d00(0000) GS:ffff920dbbc00000(0000) knlGS:0000000000000000
[ 1435.583308] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1435.586094] CR2: 00000000000000a0 CR3: 0000000102364001 CR4: 00000000003706f0
[ 1435.590178] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1435.594054] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1435.597740] Call Trace:
[ 1435.599469] <TASK>
[ 1435.600605] ? show_regs+0x65/0x70
[ 1435.602396] ? __die+0x24/0x70
[ 1435.603999] ? page_fault_oops+0x99/0x1a0
[ 1435.605856] ? do_user_addr_fault+0x2ae/0x670
[ 1435.607915] ? exc_page_fault+0x7b/0x170
[ 1435.609976] ? asm_exc_page_fault+0x27/0x30
[ 1435.611989] ? pick_next_task_fair+0x91/0x620
[ 1435.614311] ? pick_next_task_fair+0x91/0x620
[ 1435.616811] ? wp_page_copy+0x2f7/0x690
[ 1435.618799] pick_next_task+0x5f/0xcd0
[ 1435.621060] ? do_wp_page+0x1d0/0x430
[ 1435.623596] __schedule+0x169/0x760
[ 1435.625947] ? __cgroup_account_cputime+0x28/0x30
[ 1435.628329] ? update_curr+0x15e/0x1e0
[ 1435.630179] schedule+0x2c/0xf0
[ 1435.633476] do_sched_yield+0x85/0xb0
[ 1435.635452] __do_sys_sched_yield+0xe/0x20
[ 1435.637356] x64_sys_call+0x3d9/0x2030
[ 1435.639400] do_syscall_64+0x7b/0x160
[ 1435.641857] ? handle_mm_fault+0xac/0x3a0
[ 1435.644956] ? irqentry_exit_to_user_mode+0x7b/0x220
[ 1435.647799] ? irqentry_exit+0x1d/0x30
[ 1435.650587] ? exc_page_fault+0x87/0x170
[ 1435.653213] entry_SYSCALL_64_after_hwframe+0x78/0x80
[ 1435.656728] RIP: 0033:0x7fb92ab0e7db
[ 1435.659593] Code: 73 01 c3 48 8b 0d 3d 46 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 18 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 46 0f 00 f7 d8 64 89 01 48
[ 1435.675388] RSP: 002b:00007fff7ca243d8 EFLAGS: 00000282 ORIG_RAX: 0000000000000018
[ 1435.680830] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb92ab0e7db
[ 1435.686046] RDX: 000055c47ee77db0 RSI: 0000000000000000 RDI: 0000000000000002
[ 1435.690268] RBP: 0000000000000791 R08: 0000000000000002 R09: 011d99605fac8414
[ 1435.694941] R10: 00007fb92ad12fd0 R11: 0000000000000282 R12: 00007fb92acfde18
[ 1435.698607] R13: 0000000000000002 R14: 000000000001d9a5 R15: 0000000000000008
[ 1435.703633] </TASK>
[ 1435.705016] Modules linked in: vhost_vsock vmw_vsock_virtio_transport_common vsock vhost vhost_iotlb zfs(PO) spl(O) dccp_ipv4 dccp atm sm3_generic sm3_avx_x86_64 sm3 poly1305_generic poly1305_x86_64 nhpoly1305_avx2 nhpoly1305_sse2 nhpoly1305 libpoly1305 michael_mic md4 streebog_generic rmd160 cmac algif_rng twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic fcrypt cast6_avx_x86_64 cast6_generic cast5_avx_x86_64 cast5_generic cast_common camellia_generic camellia_aesni_avx2 camellia_aesni_avx_x86_64 camellia_x86_64 blowfish_generic blowfish_x86_64 blowfish_common algif_skcipher algif_hash aria_aesni_avx2_x86_64 aria_aesni_avx_x86_64 aria_generic sm4_generic sm4_aesni_avx2_x86_64 sm4_aesni_avx_x86_64 sm4 ccm des3_ede_x86_64 des_generic libdes authenc aegis128 aegis128_aesni algif_aead af_alg tls 8021q garp mrp stp llc binfmt_misc nls_iso8859_1 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_owner xt_tcpudp
[ 1435.705128] nft_compat nf_tables serio_raw joydev dm_multipath msr nvme_fabrics efi_pstore nfnetlink ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 hid_generic hid_hyperv crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 hid pata_acpi hyperv_keyboard hyperv_drm hv_netvsc aesni_intel crypto_simd cryptd
[ 1435.776455] CR2: 00000000000000a0
[ 1435.778976] ---[ end trace 0000000000000000 ]---
[ 1435.782217] RIP: 0010:pick_next_task_fair+0x91/0x620
[ 1435.785040] Code: 91 00 00 00 49 81 bd b0 02 00 00 80 a8 89 92 75 60 4d 89 fe eb 27 4c 89 f7 e8 0b b7 ff ff 84 c0 75 3f 4c 89 f7 e8 5f 04 ff ff <4c> 8b b0 a0 00 00 00 48 89 c3 4d 85 f6 0f 84 f4 00 00 00 49 8b 46
[ 1435.794724] RSP: 0018:ffffb2b202e73cf8 EFLAGS: 00010096
[ 1435.798116] RAX: 0000000000000000 RBX: ffffb2b202e73dc8 RCX: fd78d84d198c4000
[ 1435.802543] RDX: 0000000000000c00 RSI: e411d03fda1d7382 RDI: 0000000000000c02
[ 1435.807466] RBP: ffffb2b202e73d38 R08: 0000000000000002 R09: 0000000000000002
[ 1435.811823] R10: 0000000000000000 R11: 0000000000000000 R12: ffff920dbbc33580
[ 1435.815818] R13: ffff920d05570000 R14: ffff920dbbc33680 R15: ffff920dbbc33680
[ 1435.820778] FS: 00007fb92ad12d00(0000) GS:ffff920dbbc00000(0000) knlGS:0000000000000000
[ 1435.825269] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1435.828468] CR2: 00000000000000a0 CR3: 0000000102364001 CR4: 00000000003706f0
[ 1435.832087] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1435.837461] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1435.841312] note: stress-ng-race-[121253] exited with irqs disabled

I can reproduce this with 6.8.0-1001-azure as well.
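
For triage, here is a minimal Python sketch that pulls the key fields out of the oops above and checks that the fault address is consistent with a NULL base pointer. The helper name summarize_oops and the shortened log excerpt are my own for illustration, not part of any existing tool. The decoding of the marked bytes 4c 8b b0 a0 00 00 00 as "mov r14, [rax + 0xa0]" was done by hand and is worth confirming with a disassembler.

```python
import re

# Excerpt copied from the trace above. Timestamps are dropped and the
# Code: line is shortened to the bytes around the <..> marker.
OOPS = """\
BUG: kernel NULL pointer dereference, address: 00000000000000a0
RIP: 0010:pick_next_task_fair+0x91/0x620
Code: 4c 89 f7 e8 5f 04 ff ff <4c> 8b b0 a0 00 00 00 48 89 c3
RAX: 0000000000000000 RBX: ffffb2b202e73dc8 RCX: fd78d84d198c4000
"""


def summarize_oops(text):
    """Return (fault address, faulting symbol, RAX, first 7 faulting bytes)."""
    fault = int(re.search(r"address: ([0-9a-f]+)", text).group(1), 16)
    symbol = re.search(r"RIP: [0-9a-f]+:(\S+)", text).group(1)
    rax = int(re.search(r"RAX: ([0-9a-f]+)", text).group(1), 16)
    # The kernel marks the first byte of the faulting instruction with <..>.
    code = re.search(r"Code: (.*)", text).group(1).split()
    start = next(i for i, b in enumerate(code) if b.startswith("<"))
    insn = [int(b.strip("<>"), 16) for b in code[start:start + 7]]
    return fault, symbol, rax, insn


fault, symbol, rax, insn = summarize_oops(OOPS)

# Assumption: 4c 8b b0 followed by a 32-bit little-endian displacement
# encodes `mov r14, [rax + disp32]` (hand-decoded).
disp = int.from_bytes(bytes(insn[3:7]), "little")

print(symbol, hex(fault), hex(rax + disp))
```

With RAX at 0 and a displacement of 0xa0, the computed address matches CR2 (0xa0). That suggests the call made just before this instruction in pick_next_task_fair returned NULL and the result was dereferenced at offset 0xa0. Which call that is cannot be told from the log alone.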