Activity log for bug #2029917

Date Who What changed Old value New value Message
2023-08-04 08:53:16 Po-Hsu Lin bug added bug
2023-08-04 08:53:22 Po-Hsu Lin tags aws bionic focal ubuntu-ltp-controllers aws bionic focal sru-20230710 ubuntu-ltp-controllers
2023-08-04 09:20:47 Po-Hsu Lin summary cpuset_hotplug in ubuntu_ltp_controllers triggers kernel bug (arch/x86/xen/spinlock.c:62) on AWS cloud c3.xlarge cpuset_hotplug in ubuntu_ltp_controllers triggers kernel bug (arch/x86/xen/spinlock.c:62) and kernel panic on AWS cloud c3.xlarge
2023-08-04 14:06:51 Po-Hsu Lin summary cpuset_hotplug in ubuntu_ltp_controllers triggers kernel bug (arch/x86/xen/spinlock.c:62) and kernel panic on AWS cloud c3.xlarge [Potential Regression] cpuset_hotplug in ubuntu_ltp_controllers triggers kernel bug (arch/x86/xen/spinlock.c:62) and kernel panic on AWS cloud c3.xlarge
2023-08-04 14:07:03 Po-Hsu Lin bug task added linux-aws (Ubuntu)
2023-08-04 14:07:12 Po-Hsu Lin nominated for series Ubuntu Focal
2023-08-04 14:07:12 Po-Hsu Lin bug task added linux-aws (Ubuntu Focal)
2023-09-01 06:58:16 Po-Hsu Lin linux-aws (Ubuntu): status New Invalid
2023-09-01 06:58:34 Po-Hsu Lin nominated for series Ubuntu Bionic
2023-09-01 06:58:34 Po-Hsu Lin bug task added linux-aws (Ubuntu Bionic)
2023-09-01 13:00:35 Launchpad Janitor linux-aws (Ubuntu Bionic): status New Confirmed
2023-09-01 13:00:35 Launchpad Janitor linux-aws (Ubuntu Focal): status New Confirmed
2023-09-04 09:45:26 Po-Hsu Lin summary [Potential Regression] cpuset_hotplug in ubuntu_ltp_controllers triggers kernel bug (arch/x86/xen/spinlock.c:62) and kernel panic on AWS cloud c3.xlarge [Potential Regression] ubuntu_ltp_controllers/cpuset_hotplug and ubuntu_ltp/cpuhotplug:cpuhotplug02 triggers kernel bug (arch/x86/xen/spinlock.c:62) and kernel panic on AWS cloud c3.xlarge
2023-09-04 10:01:54 Po-Hsu Lin summary [Potential Regression] ubuntu_ltp_controllers/cpuset_hotplug and ubuntu_ltp/cpuhotplug:cpuhotplug02 triggers kernel bug (arch/x86/xen/spinlock.c:62) and kernel panic on AWS cloud c3.xlarge [Potential Regression] cpuhotplug related tests triggers kernel bug (arch/x86/xen/spinlock.c:62) and kernel panic on AWS cloud c3.xlarge
2023-09-04 10:12:25 Po-Hsu Lin description

Old value:

Issue found with 5.4.0-1107.115~18.04.1 Bionic AWS and 5.4.0-1107.115 Focal AWS kernel, on c3.xlarge instance only.

There is no output from the test itself (looks like it has crashed):
 START ubuntu_ltp_controllers.cpuset_hotplug ubuntu_ltp_controllers.cpuset_hotplug timestamp=1689920544 timeout=4500 localtime=Jul 21 06:22:24
 Persistent state client._record_indent now set to 2
 Persistent state client.unexpected_reboot now set to ('ubuntu_ltp_controllers.cpuset_hotplug', 'ubuntu_ltp_controllers.cpuset_hotplug')
 Waiting for pid 925631 for 4500 seconds
 System python is too old, crash handling disabled
 (nothing after this point)

But from the console log you will see a kernel BUG and kernel panic:
[ 3451.829941] kernel BUG at /build/linux-aws-5.4-I38rpz/linux-aws-5.4-5.4.0/arch/x86/xen/spinlock.c:62!
[ 3451.833383] invalid opcode: 0000 [#1] SMP PTI
[ 3451.835146] CPU: 1 PID: 14 Comm: cpuhp/1 Tainted: G C 5.4.0-1107-aws #115~18.04.1-Ubuntu
[ 3451.838679] Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006
[ 3451.840965] RIP: 0010:dummy_handler+0x4/0x10
[ 3451.842675] Code: 8b 75 e4 74 d6 44 89 e7 e8 39 89 61 00 eb d6 44 89 e7 e8 af ab 61 00 eb cc 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 80 3d 69 d0 9f 01 00 75 02 f3
[ 3451.849042] RSP: 0000:ffffb54b0000ee38 EFLAGS: 00010046
[ 3451.851021] RAX: ffffffff92c2e3d0 RBX: 000000000000003b RCX: 0000000000000000
[ 3451.853509] RDX: 0000000000400e00 RSI: 0000000000000000 RDI: 000000000000003b
[ 3451.855996] RBP: ffffb54b0000ee38 R08: ffff8a9de6c01240 R09: ffff8a9de6c01440
[ 3451.858435] R10: 0000000000000000 R11: ffffffff94664da8 R12: 0000000000000000
[ 3451.860896] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8a9de6583200
[ 3451.863313] FS: 0000000000000000(0000) GS:ffff8a9de8040000(0000) knlGS:0000000000000000
[ 3451.899246] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3451.901338] CR2: 0000000000000000 CR3: 000000002040a001 CR4: 00000000001606e0
[ 3451.903757] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3451.906184] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3451.908623] Call Trace:
[ 3451.909869] <IRQ>
[ 3451.911014] __handle_irq_event_percpu+0x44/0x1a0
[ 3451.912818] handle_irq_event_percpu+0x32/0x80
[ 3451.914578] handle_percpu_irq+0x3d/0x60
[ 3451.916198] generic_handle_irq+0x28/0x40
[ 3451.917834] handle_irq_for_port+0x8f/0xe0
[ 3451.919493] evtchn_2l_handle_events+0x157/0x270
[ 3451.921298] __xen_evtchn_do_upcall+0x76/0xe0
[ 3451.923046] xen_evtchn_do_upcall+0x2b/0x40
[ 3451.924742] xen_hvm_callback_vector+0xf/0x20
[ 3451.926484] </IRQ>
[ 3451.927632] RIP: 0010:_raw_spin_unlock_irqrestore+0x15/0x20
[ 3451.929674] Code: e8 a0 3d 64 ff 4c 29 e0 4c 39 f0 76 cf 80 0b 08 eb 8a 90 90 90 0f 1f 44 00 00 55 48 89 e5 e8 d6 ad 66 ff 66 90 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 c6 07
[ 3451.935996] RSP: 0000:ffffb54b000fbcf8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff0c
[ 3451.939023] RAX: 0000000000000001 RBX: ffff8a9de6583200 RCX: 000000000002cc00
[ 3451.941475] RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
[ 3451.943948] RBP: ffffb54b000fbcf8 R08: ffff8a9de6c01240 R09: ffff8a9de6c01440
[ 3451.946382] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000003b
[ 3451.948849] R13: 0000000000000000 R14: ffff8a9d8e75c600 R15: ffff8a9d8e75c6a4
[ 3451.951297] __setup_irq+0x456/0x760
[ 3451.952850] ? kmem_cache_alloc_trace+0x170/0x230
[ 3451.954661] request_threaded_irq+0xfb/0x160
[ 3451.956376] bind_ipi_to_irqhandler+0xba/0x1c0
[ 3451.958113] ? xen_qlock_wait+0x90/0x90
[ 3451.959723] ? snr_uncore_mmio_init+0x20/0x20
[ 3451.961445] xen_init_lock_cpu+0x78/0xd0
[ 3451.963057] ? snr_uncore_mmio_init+0x20/0x20
[ 3451.964810] xen_cpu_up_online+0xe/0x20
[ 3451.966415] cpuhp_invoke_callback+0x8a/0x580
[ 3451.968144] cpuhp_thread_fun+0xb8/0x120
[ 3451.969760] smpboot_thread_fn+0xfc/0x170
[ 3451.971400] kthread+0x121/0x140
[ 3451.972855] ? sort_range+0x30/0x30
[ 3451.974378] ? kthread_park+0x90/0x90
[ 3451.975929] ret_from_fork+0x35/0x40
[ 3451.977454] Modules linked in: exfat(C) ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs nfsd auth_rpcgss nfs_acl lockd grace sunrpc nls_iso8859_1 binfmt_misc serio_raw sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper ixgbevf
[ 3451.992926] ---[ end trace 4433bc23c8979a4c ]---
[ 3451.994720] RIP: 0010:dummy_handler+0x4/0x10
[ 3451.996427] Code: 8b 75 e4 74 d6 44 89 e7 e8 39 89 61 00 eb d6 44 89 e7 e8 af ab 61 00 eb cc 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 80 3d 69 d0 9f 01 00 75 02 f3
[ 3452.002753] RSP: 0000:ffffb54b0000ee38 EFLAGS: 00010046
[ 3452.004708] RAX: ffffffff92c2e3d0 RBX: 000000000000003b RCX: 0000000000000000
[ 3452.007130] RDX: 0000000000400e00 RSI: 0000000000000000 RDI: 000000000000003b
[ 3452.009569] RBP: ffffb54b0000ee38 R08: ffff8a9de6c01240 R09: ffff8a9de6c01440
[ 3452.011998] R10: 0000000000000000 R11: ffffffff94664da8 R12: 0000000000000000
[ 3452.014449] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8a9de6583200
[ 3452.016893] FS: 0000000000000000(0000) GS:ffff8a9de8040000(0000) knlGS:0000000000000000
[ 3452.020028] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3452.022109] CR2: 0000000000000000 CR3: 000000002040a001 CR4: 00000000001606e0
[ 3452.024568] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3452.027003] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3452.029446] Kernel panic - not syncing: Fatal exception in interrupt
[ 3452.031753] Kernel Offset: 0x11c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

New value:

Issue found with 5.4.0-1107.115~18.04.1 Bionic AWS and 5.4.0-1107.115 Focal AWS kernel, on c3.xlarge instance only.

cpu-hotplug related tests will crash the instance, they are:
* cpuset_hotplug in ubuntu_ltp_controllers
* cpuhotplug:cpuhotplug02 in ubuntu_ltp (comment #7 in this bug)
* cpu-hotplug:cpu-on-off-test.sh in ubuntu_kernel_selftests (comment #8 in this bug)

Take cpuset_hotplug in ubuntu_ltp_controllers for example. There is no output from the test itself (looks like it has crashed):
 START ubuntu_ltp_controllers.cpuset_hotplug ubuntu_ltp_controllers.cpuset_hotplug timestamp=1689920544 timeout=4500 localtime=Jul 21 06:22:24
 Persistent state client._record_indent now set to 2
 Persistent state client.unexpected_reboot now set to ('ubuntu_ltp_controllers.cpuset_hotplug', 'ubuntu_ltp_controllers.cpuset_hotplug')
 Waiting for pid 925631 for 4500 seconds
 System python is too old, crash handling disabled
 (nothing after this point)

But from the console log you will see a kernel BUG and kernel panic:
[ 3451.829941] kernel BUG at /build/linux-aws-5.4-I38rpz/linux-aws-5.4-5.4.0/arch/x86/xen/spinlock.c:62!
[ 3451.833383] invalid opcode: 0000 [#1] SMP PTI
[ 3451.835146] CPU: 1 PID: 14 Comm: cpuhp/1 Tainted: G C 5.4.0-1107-aws #115~18.04.1-Ubuntu
[ 3451.838679] Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006
[ 3451.840965] RIP: 0010:dummy_handler+0x4/0x10
[ 3451.842675] Code: 8b 75 e4 74 d6 44 89 e7 e8 39 89 61 00 eb d6 44 89 e7 e8 af ab 61 00 eb cc 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 80 3d 69 d0 9f 01 00 75 02 f3
[ 3451.849042] RSP: 0000:ffffb54b0000ee38 EFLAGS: 00010046
[ 3451.851021] RAX: ffffffff92c2e3d0 RBX: 000000000000003b RCX: 0000000000000000
[ 3451.853509] RDX: 0000000000400e00 RSI: 0000000000000000 RDI: 000000000000003b
[ 3451.855996] RBP: ffffb54b0000ee38 R08: ffff8a9de6c01240 R09: ffff8a9de6c01440
[ 3451.858435] R10: 0000000000000000 R11: ffffffff94664da8 R12: 0000000000000000
[ 3451.860896] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8a9de6583200
[ 3451.863313] FS: 0000000000000000(0000) GS:ffff8a9de8040000(0000) knlGS:0000000000000000
[ 3451.899246] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3451.901338] CR2: 0000000000000000 CR3: 000000002040a001 CR4: 00000000001606e0
[ 3451.903757] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3451.906184] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3451.908623] Call Trace:
[ 3451.909869] <IRQ>
[ 3451.911014] __handle_irq_event_percpu+0x44/0x1a0
[ 3451.912818] handle_irq_event_percpu+0x32/0x80
[ 3451.914578] handle_percpu_irq+0x3d/0x60
[ 3451.916198] generic_handle_irq+0x28/0x40
[ 3451.917834] handle_irq_for_port+0x8f/0xe0
[ 3451.919493] evtchn_2l_handle_events+0x157/0x270
[ 3451.921298] __xen_evtchn_do_upcall+0x76/0xe0
[ 3451.923046] xen_evtchn_do_upcall+0x2b/0x40
[ 3451.924742] xen_hvm_callback_vector+0xf/0x20
[ 3451.926484] </IRQ>
[ 3451.927632] RIP: 0010:_raw_spin_unlock_irqrestore+0x15/0x20
[ 3451.929674] Code: e8 a0 3d 64 ff 4c 29 e0 4c 39 f0 76 cf 80 0b 08 eb 8a 90 90 90 0f 1f 44 00 00 55 48 89 e5 e8 d6 ad 66 ff 66 90 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 c6 07
[ 3451.935996] RSP: 0000:ffffb54b000fbcf8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff0c
[ 3451.939023] RAX: 0000000000000001 RBX: ffff8a9de6583200 RCX: 000000000002cc00
[ 3451.941475] RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
[ 3451.943948] RBP: ffffb54b000fbcf8 R08: ffff8a9de6c01240 R09: ffff8a9de6c01440
[ 3451.946382] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000003b
[ 3451.948849] R13: 0000000000000000 R14: ffff8a9d8e75c600 R15: ffff8a9d8e75c6a4
[ 3451.951297] __setup_irq+0x456/0x760
[ 3451.952850] ? kmem_cache_alloc_trace+0x170/0x230
[ 3451.954661] request_threaded_irq+0xfb/0x160
[ 3451.956376] bind_ipi_to_irqhandler+0xba/0x1c0
[ 3451.958113] ? xen_qlock_wait+0x90/0x90
[ 3451.959723] ? snr_uncore_mmio_init+0x20/0x20
[ 3451.961445] xen_init_lock_cpu+0x78/0xd0
[ 3451.963057] ? snr_uncore_mmio_init+0x20/0x20
[ 3451.964810] xen_cpu_up_online+0xe/0x20
[ 3451.966415] cpuhp_invoke_callback+0x8a/0x580
[ 3451.968144] cpuhp_thread_fun+0xb8/0x120
[ 3451.969760] smpboot_thread_fn+0xfc/0x170
[ 3451.971400] kthread+0x121/0x140
[ 3451.972855] ? sort_range+0x30/0x30
[ 3451.974378] ? kthread_park+0x90/0x90
[ 3451.975929] ret_from_fork+0x35/0x40
[ 3451.977454] Modules linked in: exfat(C) ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs nfsd auth_rpcgss nfs_acl lockd grace sunrpc nls_iso8859_1 binfmt_misc serio_raw sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper ixgbevf
[ 3451.992926] ---[ end trace 4433bc23c8979a4c ]---
[ 3451.994720] RIP: 0010:dummy_handler+0x4/0x10
[ 3451.996427] Code: 8b 75 e4 74 d6 44 89 e7 e8 39 89 61 00 eb d6 44 89 e7 e8 af ab 61 00 eb cc 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 80 3d 69 d0 9f 01 00 75 02 f3
[ 3452.002753] RSP: 0000:ffffb54b0000ee38 EFLAGS: 00010046
[ 3452.004708] RAX: ffffffff92c2e3d0 RBX: 000000000000003b RCX: 0000000000000000
[ 3452.007130] RDX: 0000000000400e00 RSI: 0000000000000000 RDI: 000000000000003b
[ 3452.009569] RBP: ffffb54b0000ee38 R08: ffff8a9de6c01240 R09: ffff8a9de6c01440
[ 3452.011998] R10: 0000000000000000 R11: ffffffff94664da8 R12: 0000000000000000
[ 3452.014449] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8a9de6583200
[ 3452.016893] FS: 0000000000000000(0000) GS:ffff8a9de8040000(0000) knlGS:0000000000000000
[ 3452.020028] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3452.022109] CR2: 0000000000000000 CR3: 000000002040a001 CR4: 00000000001606e0
[ 3452.024568] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3452.027003] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3452.029446] Kernel panic - not syncing: Fatal exception in interrupt
[ 3452.031753] Kernel Offset: 0x11c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
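
Since the test harness itself prints nothing after the crash, the only evidence is in the console log. A minimal sketch of how one might scan a saved console log for the two markers quoted above ("kernel BUG at" and "Kernel panic - not syncing"). The function name `scan_console_log` and the regular expressions are illustrative assumptions, not part of any test suite mentioned in this bug, and the sample input is an excerpt of the trace from the description:

```python
import re

# Matches lines such as "[ 3451.829941] kernel BUG at <path>:<line>!"
BUG_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+kernel BUG at (\S+):(\d+)!")
# Matches lines such as "[ 3452.029446] Kernel panic - not syncing: <reason>"
PANIC_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+Kernel panic - not syncing: (.*)")


def scan_console_log(text):
    """Return (kind, timestamp, detail) tuples for BUG and panic lines.

    Hypothetical helper for illustration; it assumes one kernel
    message per line, as in a raw console log.
    """
    findings = []
    for line in text.splitlines():
        m = BUG_RE.search(line)
        if m:
            location = f"{m.group(2)}:{m.group(3)}"
            findings.append(("BUG", float(m.group(1)), location))
            continue
        m = PANIC_RE.search(line)
        if m:
            findings.append(("panic", float(m.group(1)), m.group(2).strip()))
    return findings


# Excerpt of the console log quoted in the bug description.
sample = """\
[ 3451.829941] kernel BUG at /build/linux-aws-5.4-I38rpz/linux-aws-5.4-5.4.0/arch/x86/xen/spinlock.c:62!
[ 3451.833383] invalid opcode: 0000 [#1] SMP PTI
[ 3452.029446] Kernel panic - not syncing: Fatal exception in interrupt
"""

for kind, timestamp, detail in scan_console_log(sample):
    print(kind, timestamp, detail)
```

The sketch only locates the markers; it makes no attempt to decode the call trace or register dump.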