4.12.0-11-generic - crashing in infrastructure on i386 openvswitch tests
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Triaged | High | Unassigned |
Artful | Triaged | High | Unassigned |
Bug Description
Hi,
this seems to me to be a kernel crash of some sort.
It is somewhat in the spirit of these older bugs:
- bug 1630940
- bug 1630578
Xnox asked me to look into a hang in the openvswitch dep8 tests.
Initially, all I found in the log was
"ERROR: Removing temporary files on testbed timed out"
That message led me to the two bugs above.
But there I read that it was the infra running dep8 that was crashing.
So for a better bug report I tried to reproduce locally, and that actually seems to work very reliably.
To reproduce do:
$ autopkgtest-
$ pull-lp-source openvswitch
$ autopkgtest --apt-upgrade --shell --no-built-binaries openvswitch_
# This guest currently will crash after a while of testing
While that is running, you can attach to the console and the monitor of that guest.
For example:
$ sudo nc -U /tmp/autopkgtes
At the hang, that showed a crash which roughly matches the older bugs. Here is the console output:
[ 54.256253] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 54.257156] IP: add_grec+0x28/0x440
[ 54.257553] *pdpt = 000000001a869001 *pde = 0000000000000000
[ 54.257555]
[ 54.258338] Oops: 0000 [#1] SMP
[ 54.258638] Modules linked in: veth openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack libcrc32c 9p fscache ppdev kvm_intel kvm irqbypass joydev input_leds serio_raw 9pnet_virtio parport_pc 9pnet parport qemu_fw_cfg i2c_piix4 mac_hid ip_tables x_tables autofs4 btrfs xor raid6_pq psmouse virtio_blk virtio_net floppy pata_acpi
[ 54.261891] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.12.0-11-generic #12-Ubuntu
[ 54.262715] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-
[ 54.263610] task: c7b622c0 task.stack: c7b5a000
[ 54.264039] EIP: add_grec+0x28/0x440
[ 54.264378] EFLAGS: 00010202 CPU: 0
[ 54.264711] EAX: 00000000 EBX: dd062540 ECX: 00000006 EDX: dd062540
[ 54.265308] ESI: dd1e6e00 EDI: dd1e6e00 EBP: dbcc5f30 ESP: dbcc5ef0
[ 54.265793] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 54.266297] CR0: 80050033 CR2: 00000000 CR3: 1e930d80 CR4: 000006f0
[ 54.266885] Call Trace:
[ 54.267120] <SOFTIRQ>
[ 54.267349] mld_ifc_
[ 54.267754] ? mld_dad_
[ 54.268173] call_timer_
[ 54.268524] ? mld_dad_
[ 54.268942] ? mld_dad_
[ 54.269364] run_timer_
[ 54.269760] ? __softirqentry_
[ 54.270198] __do_softirq+
[ 54.270539] ? __softirqentry_
[ 54.270976] do_softirq_
[ 54.271373] </SOFTIRQ>
[ 54.271611] irq_exit+0xad/0xb0
[ 54.271913] smp_apic_
[ 54.272344] apic_timer_
[ 54.272745] EIP: native_
[ 54.273139] EFLAGS: 00000246 CPU: 0
[ 54.273624] EAX: 00000000 EBX: 00000000 ECX: 00000001 EDX: 00000000
[ 54.274000] ESI: 00000000 EDI: c7b622c0 EBP: c7b5bf10 ESP: c7b5bf10
[ 54.274361] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 54.274687] default_
[ 54.274896] arch_cpu_
[ 54.275286] default_
[ 54.275787] do_idle+0x145/0x1c0
[ 54.276092] cpu_startup_
[ 54.276441] rest_init+0x62/0x70
[ 54.276718] start_kernel+
[ 54.276974] i386_start_
[ 54.277260] startup_
[ 54.277509] Code: 00 00 00 3e 8d 74 26 00 55 89 e5 57 56 53 89 c6 83 ec 34 89 4d e8 65 a1 14 00 00 00 89 45 f0 31 c0 8b 42 10 f6 42 48 08 89 45 cc <8b> 00 c7 45 ec 00 00 00 00 89 45 c8 89 f0 0f 85 b4 02 00 00 8b
[ 54.279314] EIP: add_grec+0x28/0x440 SS:ESP: 0068:dbcc5ef0
[ 54.279829] CR2: 0000000000000000
[ 54.280143] ---[ end trace 3164b1c0dd7745bc ]---
[ 54.280550] Kernel panic - not syncing: Fatal exception in interrupt
[ 54.281078] Kernel Offset: 0x6000000 from 0xc1000000 (relocation range: 0xc0000000-
[ 54.281797] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
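For comparing this oops with the ones in the older bugs, it can help to reduce the console output to a few fields. The following is only a minimal sketch, assuming the log looks like the one above (bracketed timestamps, an `IP:`/`EIP:` line of the form `symbol+offset/size`, a `CR2:` value, and a `Kernel panic` line). The function name `summarize_oops` and the `SAMPLE` constant are hypothetical, and other kernel versions may format these lines differently.

```python
import re

# Leading "[ 54.257156] " style timestamp, as seen in the console output.
TS_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# "IP: add_grec+0x28/0x440" or "EIP: add_grec+0x28/0x440 ..."
IP_RE = re.compile(
    r"\b(?:EIP|RIP|IP):\s*(?P<sym>[A-Za-z_][\w.]*)"
    r"\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)
# "CR2: 00000000" holds the faulting address on x86.
CR2_RE = re.compile(r"\bCR2:\s*(?P<addr>[0-9a-f]+)")


def summarize_oops(log_text):
    """Pull the first BUG line, faulting symbol, CR2 and panic reason from a log."""
    summary = {"bug": None, "symbol": None, "offset": None,
               "size": None, "fault_addr": None, "panic": None}
    for raw in log_text.splitlines():
        line = TS_RE.sub("", raw.strip())
        if summary["bug"] is None and line.startswith("BUG:"):
            summary["bug"] = line[len("BUG:"):].strip()
        match = IP_RE.search(line)
        if match and summary["symbol"] is None:
            summary["symbol"] = match.group("sym")
            summary["offset"] = int(match.group("off"), 16)
            summary["size"] = int(match.group("size"), 16)
        match = CR2_RE.search(line)
        if match and summary["fault_addr"] is None:
            summary["fault_addr"] = int(match.group("addr"), 16)
        if summary["panic"] is None and line.startswith("Kernel panic - not syncing:"):
            summary["panic"] = line.split(":", 1)[1].strip()
    return summary


# A few lines copied from the console output in this report.
SAMPLE = """\
[ 54.256253] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 54.257156] IP: add_grec+0x28/0x440
[ 54.266297] CR0: 80050033 CR2: 00000000 CR3: 1e930d80 CR4: 000006f0
[ 54.280550] Kernel panic - not syncing: Fatal exception in interrupt
"""

if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

Feeding the console logs from the older bugs through the same function would show quickly whether they fault in the same symbol at the same offset.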
Since this is qemu, the machine isn't as dead as real HW would be at this point, so via the monitor
$ sudo nc -U /tmp/autopkgtes
I took a dump.
I fetched the i386 debug kernel, and it looks like I can load that in crash.
But I'd leave the authoritative look at what happened to the kernel team, so I shared the dump via fileshare.
Please load with debug-kernel of linux-image-
I have issues properly loading the dump in crash, as this is an i386 Artful guest on 64-bit KVM, so the kdump generated by qemu is in x86_64 format and something seems to disagree.
But since I have simple repro steps above, I hope you can trigger the crash and analyze it the way you want/need.
In case you still want my dump fetch it from https:/
I also have raw (non-kdump-style) formats if needed.
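To see which format a given dump file actually carries before pointing crash at it, one can inspect its first bytes. The sketch below is an assumption-laden helper, not part of the original analysis: it recognizes an ELF header (class and machine fields as defined by the ELF specification) and treats a leading `KDUMP` signature as the kdump-compressed format. The names `describe_dump_header` and `describe_dump_file` are hypothetical, and the report does not say what exactly disagreed when loading the dump.

```python
import struct

# Values from the ELF specification.
ELF_CLASSES = {1: "ELF32", 2: "ELF64"}
ELF_MACHINES = {3: "i386", 62: "x86_64"}  # EM_386, EM_X86_64
ET_CORE = 4


def describe_dump_header(header):
    """Classify a dump by its first bytes (at least 20 are needed for ELF)."""
    # Assumption: kdump-compressed files start with a "KDUMP" signature.
    if header.startswith(b"KDUMP"):
        return {"format": "kdump-compressed"}
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return {"format": "unknown"}
    # e_ident[5] is EI_DATA: 1 = little endian, 2 = big endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type and e_machine are 16-bit fields at offsets 16 and 18.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "format": "elf",
        "class": ELF_CLASSES.get(header[4], "unknown"),
        "machine": ELF_MACHINES.get(e_machine, "unknown(%d)" % e_machine),
        "is_core": e_type == ET_CORE,
    }


def describe_dump_file(path):
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as handle:
        return describe_dump_header(handle.read(20))
```

If an ELF dump of the i386 guest reports `ELF64` and `x86_64`, that would be consistent with the mismatch described above, although it would not by itself explain why crash refuses it.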
tags: added: kernel-da-key; removed: kernel-key
Changed in linux (Ubuntu): status: Confirmed → Triaged
The last two known-good runs on LP infra were also on the same kernel, 4.12.0.11.12, so I have no idea yet what has changed.
Maybe it was the openvswitch upload itself a while ago, but that worked on LP infra as well ... ?