kernel panic at smpboot.c:134 when rebooting qemu with multiple cores
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
QEMU | Expired | Undecided | Unassigned | | |
Bug Description
Hi all,
I can reproduce this with kernels 3.14 and 3.17-rc1. I suspect it is a QEMU issue, but I'm not sure. The test case is the following script:
qemu-system-x86_64 -machine accel=kvm -pidfile /tmp/pid$$ -m 512M -smp 8,sockets=8 -kernel vmlinuz -append "init=/sbin/reboot -f console=
Note that we pass /sbin/reboot as the init program so it just reboots forever. After a dozen or so iterations, I hit this:
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.17.0-
[ 0.000000] Command line: init=/sbin/reboot -f console=
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000
[ 0.000000] BIOS-e820: [mem 0x000000000009f
[ 0.000000] BIOS-e820: [mem 0x00000000000f0
[ 0.000000] BIOS-e820: [mem 0x0000000000100
[ 0.000000] BIOS-e820: [mem 0x000000001fffd
[ 0.000000] BIOS-e820: [mem 0x00000000feffc
[ 0.000000] BIOS-e820: [mem 0x00000000fffc0
[ 0.000000] process: using polling idle threads
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.4 present.
[ 0.000000] Hypervisor detected: KVM
[ 0.000000] e820: last_pfn = 0x1fffd max_arch_pfn = 0x400000000
[ 0.000000] PAT not supported by CPU.
[ 0.000000] init_memory_
[ 0.000000] init_memory_
[ 0.000000] init_memory_
[ 0.000000] init_memory_
[ 0.000000] init_memory_
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x00000000000F0A90 000014 (v00 BOCHS )
[ 0.000000] ACPI: RSDT 0x000000001FFFFC21 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001)
[ 0.000000] ACPI: FACP 0x000000001FFFEF40 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001)
[ 0.000000] ACPI: DSDT 0x000000001FFFDDC0 001180 (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001)
[ 0.000000] ACPI: FACS 0x000000001FFFDD80 000040
[ 0.000000] ACPI: SSDT 0x000000001FFFEFB4 000B85 (v01 BOCHS BXPCSSDT 00000001 BXPC 00000001)
[ 0.000000] ACPI: APIC 0x000000001FFFFB39 0000B0 (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001)
[ 0.000000] ACPI: HPET 0x000000001FFFFBE9 000038 (v01 BOCHS BXPCHPET 00000001 BXPC 00000001)
[ 0.000000] No NUMA configuration found
[ 0.000000] Faking a node at [mem 0x0000000000000
[ 0.000000] Initmem setup node 0 [mem 0x00000000-
[ 0.000000] NODE_DATA [mem 0x1fffa000-
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: cpu 0, msr 0:1fff9001, primary cpu clock
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x00001000-
[ 0.000000] DMA32 [mem 0x01000000-
[ 0.000000] Normal empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x00001000-
[ 0.000000] node 0: [mem 0x00100000-
[ 0.000000] ACPI: PM-Timer IO Port: 0xb008
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] enabled)
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[ 0.000000] ACPI: IOAPIC (id[0x00] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[ 0.000000] smpboot: Allowing 8 CPUs, 0 hotplug CPUs
[ 0.000000] e820: [mem 0x20000000-
[ 0.000000] Booting paravirtualized kernel on KVM
[ 0.000000] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:8 nr_node_ids:1
[ 0.000000] PERCPU: Embedded 27 pages/cpu @ffff88001fc00000 s80064 r8192 d22336 u262144
[ 0.000000] KVM setup async PF for cpu 0
[ 0.000000] kvm-stealtime: cpu 0, msr 1fc0d000
[ 0.000000] Built 1 zonelists in Node order, mobility grouping on. Total pages: 128902
[ 0.000000] Policy zone: DMA32
[ 0.000000] Kernel command line: mlx4_core.
[ 0.000000] PID hash table entries: 2048 (order: 2, 16384 bytes)
[ 0.000000] Memory: 497836K/523884K available (6197K kernel code, 845K rwdata, 2312K rodata, 968K init, 2676K bss, 26048K reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=8, Nodes=1
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=8.
[ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
[ 0.000000] NR_IRQS:4352 nr_irqs:488 0
[ 0.000000] Console: colour VGA+ 80x25
[ 0.000000] console [ttyS0] enabled
[ 0.000000] tsc: Detected 3491.912 MHz processor
[ 0.008000] Calibrating delay loop (skipped) preset value.. 6983.82 BogoMIPS (lpj=13967648)
[ 0.008000] pid_max: default: 32768 minimum: 301
[ 0.008000] ACPI: Core revision 20140724
[ 0.008000] ACPI: All ACPI Tables successfully acquired
[ 0.008000] Security Framework initialized
[ 0.008000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 0.008000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 0.008000] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes)
[ 0.008000] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[ 0.008106] Initializing cgroup subsys devices
[ 0.008379] Initializing cgroup subsys freezer
[ 0.008647] Initializing cgroup subsys net_cls
[ 0.008913] Initializing cgroup subsys blkio
[ 0.009169] Initializing cgroup subsys perf_event
[ 0.009486] mce: CPU supports 10 MCE banks
[ 0.009759] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.009759] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
[ 0.010597] Freeing SMP alternatives memory: 28K (ffffffff81dc7000 - ffffffff81dce000)
[ 0.013902] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.014366] smpboot: CPU0: Intel QEMU Virtual CPU version 2.0.0 (fam: 06, model: 06, stepping: 03)
[ 0.016000] Performance Events: Broken PMU hardware detected, using software events only.
[ 0.016000] Failed to access perfctr msr (MSR c1 is 0)
[ 0.016000] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 0.016000] x86: Booting SMP configuration:
[ 0.016000] .... node #0, CPUs: #1
[ 0.008000] kvm-clock: cpu 1, msr 0:1fff9041, secondary cpu clock
[ 0.028010] KVM setup async PF for cpu 1
[ 0.028358] #2
[ 0.028358] kvm-stealtime: cpu 1, msr 1fc4d000
[ 0.008000] kvm-clock: cpu 2, msr 0:1fff9081, secondary cpu clock
[ 0.044008] KVM setup async PF for cpu 2
[ 0.044506] #3
[ 0.044507] kvm-stealtime: cpu 2, msr 1fc8d000
[ 0.008000] kvm-clock: cpu 3, msr 0:1fff90c1, secondary cpu clock
[ 0.060011] KVM setup async PF for cpu 3
[ 0.060416] #4
[ 0.060416] kvm-stealtime: cpu 3, msr 1fccd000
[ 0.008000] kvm-clock: cpu 4, msr 0:1fff9101, secondary cpu clock
[ 0.072010] KVM setup async PF for cpu 4
[ 0.072461] #5
[ 0.072461] kvm-stealtime: cpu 4, msr 1fd0d000
[ 0.008000] kvm-clock: cpu 5, msr 0:1fff9141, secondary cpu clock
[ 0.088001] KVM setup async PF for cpu 5
[ 0.088001] #6
[ 0.088001] kvm-stealtime: cpu 5, msr 1fd4d000
[ 0.008000] kvm-clock: cpu 6, msr 0:1fff9181, secondary cpu clock
[ 0.108008] ------------[ cut here ]------------
[ 0.108366] WARNING: CPU: 0 PID: 1 at /src/linux-
[ 0.109172] Modules linked in:
[ 0.109419] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0-
[ 0.112001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 0.112606] 0000000000000009 ffff88001e927db8 ffffffff81601466 0000000000000000
[ 0.113208] ffff88001e927df0 ffffffff810b4bb8 ffff88001fd92400 ffff88001fd92730
[ 0.113813] ffff88001fd92708 0000000000000006 ffff88001ea92540 ffff88001e927e00
[ 0.114422] Call Trace:
[ 0.114616] [<ffffffff81601
[ 0.115011] [<ffffffff810b4
[ 0.115474] [<ffffffff810b4
[ 0.116002] [<ffffffff810cc
[ 0.116507] [<ffffffff810d0
[ 0.116962] [<ffffffff810d1
[ 0.117458] [<ffffffff810b4
[ 0.117857] [<ffffffff810b5
[ 0.118249] [<ffffffff81d06
[ 0.118633] [<ffffffff81cea
[ 0.119114] [<ffffffff815f9
[ 0.119524] [<ffffffff815f9
[ 0.120002] [<ffffffff81607
[ 0.120443] [<ffffffff815f9
[ 0.120867] ---[ end trace bac34f2af212d79e ]---
[ 0.121255] ------------[ cut here ]------------
[ 0.121243] KVM setup async PF for cpu 6
[ 0.121243] kvm-stealtime: cpu 6, msr 1fd8d000
[ 0.122309] kernel BUG at /src/linux-
[ 0.122799] invalid opcode: 0000 [#1] SMP
[ 0.123150] Modules linked in:
[ 0.123406] CPU: 0 PID: 36 Comm: watchdog/6 Tainted: G W 3.17.0-
[ 0.124000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 0.124000] task: ffff88001eb00000 ti: ffff88001eb08000 task.ti: ffff88001eb08000
[ 0.124000] RIP: 0010:[<
[ 0.124000] RSP: 0000:ffff88001e
[ 0.124000] RAX: 0000000000000000 RBX: ffff88001eb00000 RCX: 0000000000000000
[ 0.124000] RDX: ffff88001eb0bfd8 RSI: ffff88001eb00000 RDI: 0000000000000006
[ 0.124000] RBP: ffff88001eb0bec8 R08: ffff88001eb08000 R09: ffff88001eb01a89
[ 0.124000] R10: 0000000000000010 R11: 0000000000000001 R12: ffff88001e801930
[ 0.124000] R13: ffffffff81c4b720 R14: ffff88001eb00000 R15: ffff88001eb00000
[ 0.124000] FS: 000000000000000
[ 0.124000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 0.124000] CR2: 00000000ffffffff CR3: 0000000001c14000 CR4: 00000000000006f0
[ 0.124000] Stack:
[ 0.124000] 0000000000000000 ffff88001eb0bea0 ffffffff81603714 ffff88001e90bb00
[ 0.124000] ffff88001e801930 ffffffff810d3770 0000000000000000 0000000000000000
[ 0.124000] ffff88001eb0bf48 ffffffff810d00cd 0000000000000001 0000000000000006
[ 0.124000] Call Trace:
[ 0.124000] [<ffffffff81603
[ 0.124000] [<ffffffff810d3
[ 0.124000] [<ffffffff810d0
[ 0.124000] [<ffffffff810d0
[ 0.124000] [<ffffffff81607
[ 0.124000] [<ffffffff810d0
[ 0.124000] Code: 89 fa 48 0f a3 11 19 d2 31 f6 85 d2 40 0f 95 c6 ff d0 4c 89 e7 e8 82 16 0f 00 48 83 c4 18 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 0f 0b 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 89 d0 55 48
[ 0.124000] RIP [<ffffffff810d3
[ 0.124000] RSP <ffff88001eb0be88>
[ 0.124002] ---[ end trace bac34f2af212d79f ]---
[ 0.124456] Kernel panic - not syncing: Fatal exception
[ 0.128000] Shutting down cpus with NMI
[ 0.128000] ---[ end Kernel panic - not syncing: Fatal exception
Note there's an SMP-related warning coming out of workqueue.c right before the panic.
I have attached the .config I'm using with the kernel.
The bug is probably fixed by https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=dd9d3843755da95f63dd3a376f62b3e45c011210
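For anyone retrying this, the reboot loop described in the report could be sketched roughly as below. This is not the original script, which is cut off above. The options up to `-append` are copied from the report. Everything else is an assumption made for illustration: `console=ttyS0` (guessed from the `console [ttyS0] enabled` line in the log), `-display none`, `-serial file:...`, the log path, and the helper names `log_has_panic` and `run_repro`.

```shell
#!/bin/sh
# Hypothetical reproduction sketch; not the reporter's original script.
# Assumed (not in the report): console=ttyS0, -display none, -serial file:...,
# the LOG path, and the one-second polling interval.

LOG="${LOG:-/tmp/qemu-console.$$.log}"
PIDFILE="/tmp/pid$$"

# Succeeds if the console log contains the panic line seen in the report.
log_has_panic() {
    grep -q 'Kernel panic - not syncing' "$1" 2>/dev/null
}

# Start one QEMU instance; the guest reboots by itself because init is
# /sbin/reboot, so we only watch the serial log until the panic shows up.
run_repro() {
    qemu-system-x86_64 -machine accel=kvm -pidfile "$PIDFILE" \
        -m 512M -smp 8,sockets=8 -kernel vmlinuz \
        -append "init=/sbin/reboot -f console=ttyS0" \
        -display none -serial "file:$LOG" &
    qemu_pid=$!

    while kill -0 "$qemu_pid" 2>/dev/null; do
        if log_has_panic "$LOG"; then
            echo "panic captured in $LOG"
            break
        fi
        sleep 1
    done
    kill "$qemu_pid" 2>/dev/null
}

# Only start QEMU when asked to, so the helpers can be sourced on their own.
if [ "${1:-}" = "--run" ]; then
    run_repro
fi
```

Run it with `--run` from a directory containing `vmlinuz`. Since the panic in the report only appeared after a dozen or so reboots, the loop may need to run for a while before the log matches.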