kernel panic on boot with hyper-v

Bug #994870 reported by Ben Howard on 2012-05-04
This bug affects 4 people
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | | Critical | Andy Whitcroft |
Precise | | Critical | Andy Whitcroft |
Quantal | | Critical | Andy Whitcroft |

Bug Description

Kernel panic on boot on Hyper-V Servers with large memory/cpu footprints. This happens about 1 out of every 5 boots, regardless of host.

[ 0.000000] console [ttyS1] enabled, bootconsole disabled
[ 0.000000] allocated 118489088 bytes of page_cgroup
[ 0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[ 0.000000] Fast TSC calibration failed
[ 0.000000] TSC: Unable to calibrate against PIT
[ 0.000000] TSC: using PMTIMER reference calibration
[ 0.000000] Detected 2100.001 MHz processor.
[ 0.000000] Marking TSC unstable due to TSCs unsynchronized
[ 0.022957] Calibrating delay loop (skipped), value calculated using timer frequency.. 4200.00 BogoMIPS (lpj=8400004)
[ 0.028036] pid_max: default: 65536 minimum: 512
[ 0.032174] Security Framework initialized
[ 0.034871] AppArmor: AppArmor initialized
[ 0.036042] Yama: becoming mindful.
[ 0.045795] Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes)
[ 0.065063] Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes)
[ 0.075454] Mount-cache hash table entries: 256
[ 0.076750] Initializing cgroup subsys cpuacct
[ 0.080073] Initializing cgroup subsys memory
[ 0.084118] Initializing cgroup subsys devices
[ 0.088043] Initializing cgroup subsys freezer
[ 0.092031] Initializing cgroup subsys blkio
[ 0.096061] Initializing cgroup subsys perf_event
[ 0.101646] CPU: Physical Processor ID: 0
[ 0.104033] CPU: Processor Core ID: 0
[ 0.108043] mce: CPU supports 0 MCE banks
[ 0.110110] using AMD E400 aware idle routine
[ 0.122203] ACPI: Core revision 20110623
[ 0.129677] ftrace: allocating 27043 entries in 107 pages
[ 0.136151] Switched APIC routing to physical flat.
[ 0.167447] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.205745] CPU0: Quad-Core AMD Opteron(tm) Processor 2373 EE stepping 02
[ 0.208012] APIC calibration not consistent with PM-Timer: 94ms instead of 100ms
[ 0.208012] APIC delta adjusted to PM-Timer: 1249988 (1175283)
[ 0.208012] Performance Events: Broken PMU hardware detected, using software events only.
[ 0.212305] NMI watchdog disabled (cpu0): hardware events not enabled
[ 0.216327] Booting Node 0, Processors #1
[ 0.024001] calibrate_delay_direct() dropping min bogoMips estimate 4 = 4861299
[ 0.024001] mce: CPU supports 0 MCE banks
[ 0.324476] NMI watchdog disabled (cpu1): hardware events not enabled
[ 0.328405] #2
[ 0.024001] mce: CPU supports 0 MCE banks
[ 0.424050] NMI watchdog disabled (cpu2): hardware events not enabled
[ 0.426814] #3
[ 0.024001] mce: CPU supports 0 MCE banks
[ 0.524044] NMI watchdog disabled (cpu3): hardware events not enabled
[ 0.526614] Ok.
[ 0.527587] Booting Node 1, Processors #4
[ 0.024001] mce: CPU supports 0 MCE banks
[ 0.624239] NMI watchdog disabled (cpu4): hardware events not enabled
[ 0.626614] #5
[ 0.024001] mce: CPU supports 0 MCE banks
[ 0.724425] NMI watchdog disabled (cpu5): hardware events not enabled
[ 0.727528] #6
[ 0.024001] mce: CPU supports 0 MCE banks
[ 0.824111] NMI watchdog disabled (cpu6): hardware events not enabled
[ 0.828339] #7
[ 0.024001] mce: CPU supports 0 MCE banks
[ 0.924430] NMI watchdog disabled (cpu7): hardware events not enabled
[ 0.928107] Brought up 8 CPUs
[ 0.930635] Total of 8 processors activated (33606.81 BogoMIPS).
[ 0.941254] devtmpfs: initialized
[ 0.946143] EVM: security.selinux
[ 0.948066] EVM: security.SMACK64
[ 0.952161] EVM: security.capability
[ 0.954972] PM: Registering ACPI NVS region at 3fff000 (4096 bytes)
[ 0.958308] print_constraints: dummy:
[ 0.983296] RTC time: 18:29:10, date: 05/04/12
[ 0.984262] NET: Registered protocol family 16
[ 0.988286] Trying to unpack rootfs image as initramfs...
[ 0.996484] Extended Config Space enabled on 0 nodes
[ 0.999676] ACPI: bus type pci registered
[ 1.000978] PCI: Using configuration type 1 for base access
[ 1.004092] PCI: Using configuration type 1 for extended access
[ 1.056863] bio: create slab <bio-0> at 0
[ 1.060717] ACPI: Added _OSI(Module Device)
[ 1.064087] ACPI: Added _OSI(Processor Device)
[ 1.068071] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 1.072083] ACPI: Added _OSI(Processor Aggregator Device)
[ 1.092661] ACPI: Interpreter enabled
[ 1.095335] ACPI: (supports S0 S5)
[ 1.097261] ACPI: Using IOAPIC for interrupt routing
[ 1.107291] ACPI: No dock devices found.
[ 1.108072] HEST: Table not found.
[ 1.109182] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 1.112149] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[ 1.116207] pci_root PNP0A03:00: host bridge window [io 0x0000-0x0cf7]
[ 1.120075] pci_root PNP0A03:00: host bridge window [io 0x0d00-0xffff]
[ 1.124074] pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff]
[ 1.128073] pci_root PNP0A03:00: host bridge window [mem 0xf8000000-0xfffbffff]
[ 1.137827] * Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
[ 1.137829] * this clock source is slow. Consider trying other clock sources
[ 1.144770] pci 0000:00:07.3: quirk: [io 0x0400-0x043f] claimed by PIIX4 ACPI
[ 1.148171] pci 0000:00:07.3: quirk: [io 0x0440-0x044f] claimed by PIIX4 SMB
[ 1.158806] pci0000:00: Requesting ACPI _OSC control (0x1d)
[ 1.160080] pci0000:00: ACPI _OSC request failed (AE_NOT_FOUND), returned control mask: 0x1d
[ 1.164075] ACPI _OSC control for PCIe not granted, disabling ASPM
[ 1.168265] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 1.178820] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 1.190007] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 1.200822] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 1.211231] vgaarb: device added: PCI:0000:00:08.0,decodes=io+mem,owns=io+mem,locks=none
[ 1.212082] vgaarb: loaded
[ 1.216080] vgaarb: bridge control possible 0000:00:08.0
[ 1.220226] i2c-core: driver [aat2870] using legacy suspend method
[ 1.224081] i2c-core: driver [aat2870] using legacy resume method
[ 1.227472] SCSI subsystem initialized
[ 1.228259] usbcore: registered new interface driver usbfs
[ 1.232135] usbcore: registered new interface driver hub
[ 1.234652] usbcore: registered new device driver usb
[ 1.236262] PCI: Using ACPI for IRQ routing
[ 1.237891] NetLabel: Initializing
[ 1.240081] NetLabel: domain hash size = 128
[ 1.241298] NetLabel: protocols = UNLABELED CIPSOv4
[ 1.243319] NetLabel: unlabeled traffic allowed by default
[ 1.244213] Switching to clocksource hyperv_clocksource
[ 1.265779] AppArmor: AppArmor Filesystem Enabled
[ 1.269369] pnp: PnP ACPI init
[ 1.272419] ACPI: bus type pnp registered
[ 1.279224] Freeing initrd memory: 4304k freed
[ 1.285794] system 00:0a: [io 0x01e0-0x01ef] has been reserved
[ 1.289102] system 00:0a: [io 0x0160-0x016f] has been reserved
[ 1.292532] system 00:0a: [io 0x0278-0x027f] has been reserved
[ 1.295424] system 00:0a: [io 0x0378-0x037f] has been reserved
[ 1.298827] system 00:0a: [io 0x0678-0x067f] has been reserved
[ 1.301622] system 00:0a: [io 0x0778-0x077f] has been reserved
[ 1.303637] system 00:0a: [io 0x04d0-0x04d1] has been reserved
[ 1.308961] system 00:0b: [io 0x0400-0x043f] has been reserved
[ 1.312746] system 00:0b: [io 0x0370-0x0371] has been reserved
[ 1.316766] system 00:0b: [io 0x0440-0x044f] has been reserved
[ 1.320599] system 00:0b: [mem 0xfec00000-0xfec00fff] could not be reserved
[ 1.324531] system 00:0b: [mem 0xfee00000-0xfee00fff] has been reserved
[ 1.327992] system 00:0c: [mem 0x00000000-0x0009ffff] could not be reserved
[ 1.332340] system 00:0c: [mem 0x000c0000-0x000dffff] could not be reserved
[ 1.337151] system 00:0c: [mem 0x000e0000-0x000fffff] could not be reserved
[ 1.341746] system 00:0c: [mem 0x00100000-0xf7ffffff] could not be reserved
[ 1.345047] system 00:0c: [mem 0xfffc0000-0xffffffff] has been reserved
[ 1.350514] pnp: PnP ACPI: found 13 devices
[ 1.352633] ACPI: ACPI bus type pnp unregistered
[ 1.372435] NET: Registered protocol family 2
[ 1.377270] IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
[ 1.385805] TCP established hash table entries: 524288 (order: 11, 8388608 bytes)
[ 1.396625] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
[ 1.400171] TCP: Hash tables configured (established 524288 bind 65536)
[ 1.403637] TCP reno registered
[ 1.405789] UDP hash table entries: 8192 (order: 6, 262144 bytes)
[ 1.410093] UDP-Lite hash table entries: 8192 (order: 6, 262144 bytes)
[ 1.418359] NET: Registered protocol family 1
[ 1.420355] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[ 1.423646] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 1.428564] Placing 64MB software IO TLB between ffff8800bc000000 - ffff8800c0000000
[ 1.435903] software IO TLB at phys 0xbc000000 - 0xc0000000
[ 1.441980] audit: initializing netlink socket (disabled)
[ 1.444742] type=2000 audit(1336156150.440:1): initialized
[ 1.578905] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[ 1.720468] VFS: Disk quotas dquot_6.5.2
[ 1.723709] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 1.728023] fuse init (API version 7.17)
[ 1.730144] msgmni has been set to 28172
[ 1.734336] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
[ 1.737387] io scheduler noop registered
[ 1.739960] io scheduler deadline registered (default)
[ 1.743105] io scheduler cfq registered
[ 1.745694] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[ 1.748775] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
[ 1.752529] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[ 1.757730] ACPI: Power Button [PWRF]
[ 1.765623] ERST: Table is not found!
[ 1.768116] GHES: HEST is not enabled!
[ 1.770843] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[ 1.798748] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 1.838301] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 2.274969] 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 2.314617] 00:08: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 2.330695] Linux agpgart interface v0.103
[ 2.339765] brd: module loaded
[ 2.345849] loop: module loaded
[ 2.352362] scsi0 : ata_piix
[ 2.355920] scsi1 : ata_piix
[ 2.360287] ata1: PATA max UDMA/33 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
[ 2.364879] ata2: PATA max UDMA/33 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
[ 2.370406] Fixed MDIO Bus: probed
[ 2.373130] tun: Universal TUN/TAP device driver, 1.6
[ 2.376418] tun: (C) 1999-2004 Max Krasnyansky <email address hidden>
[ 2.379942] PPP generic driver version 2.4.2
[ 2.384148] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 2.387884] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 2.392136] uhci_hcd: USB Universal Host Controller Interface driver
[ 2.396747] usbcore: registered new interface driver libusual
[ 2.400317] i8042: PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
[ 2.413790] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 2.416792] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 2.421357] mousedev: PS/2 mouse device common for all mice
[ 2.425490] rtc_cmos 00:02: RTC can wake from S4
[ 2.450909] rtc_cmos 00:02: rtc core: registered rtc_cmos as rtc0
[ 2.456122] rtc0: alarms up to one month, 114 bytes nvram
[ 2.459727] device-mapper: uevent: version 1.0.3
[ 2.463082] device-mapper: ioctl: 4.22.0-ioctl (2011-10-19) initialised: <email address hidden>
[ 2.467164] cpuidle: using governor ladder
[ 2.469848] cpuidle: using governor menu
[ 2.473575] EFI Variables Facility v0.08 2004-May-17
[ 2.477610] TCP cubic registered
[ 2.480122] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input1
[ 2.485196] NET: Registered protocol family 10
[ 2.491503] NET: Registered protocol family 17
[ 2.494343] Registering the dns_resolver key type
[ 2.497913] registered taskstats version 1
[ 2.545196] ata2.00: ATA disk ignored deferring to Hyper-V paravirt driver
[ 2.571438] ata1.01: ATA disk ignored deferring to Hyper-V paravirt driver
[ 2.578919] ata1.00: ATA disk ignored deferring to Hyper-V paravirt driver
[ 2.599068] Magic number: 12:129:493
[ 2.602208] rtc_cmos 00:02: setting system clock to 2012-05-04 18:29:12 UTC (1336156152)
[ 2.607529] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
[ 2.610627] EDD information not available.
[ 2.624253] Freeing unused kernel memory: 920k freed
[ 2.627415] Write protecting the kernel read-only data: 12288k
[ 2.638284] Freeing unused kernel memory: 1636k freed
[ 2.646804] Freeing unused kernel memory: 1200k freed
Loading, please wait...
[ 2.689774] general protection fault: 0000 [#1] SMP
[ 2.693146] CPU 7
[ 2.693146] Modules linked in:
[ 2.693146]
[ 2.693146] Pid: 1, comm: init Not tainted 3.2.0-24-virtual #37-Ubuntu Microsoft Corporation Virtual Machine/Virtual Machine
[ 2.693146] RIP: 0010:[<ffffffff8116136a>] [<ffffffff8116136a>] kmem_cache_alloc+0x4a/0x120
[ 2.693146] RSP: 0018:ffff880233d91d60 EFLAGS: 00010206
[ 2.693146] RAX: 0000000000000000 RBX: ffff8803f32216e0 RCX: 0000000000000014
[ 2.693146] RDX: 0000000000000013 RSI: 0000000000016500 RDI: ffffffff810853c4
[ 2.693146] RBP: ffff880233d91db0 R08: ffff8803ff676500 R09: 0000000000000040
[ 2.693146] R10: ffff8803ffffbe00 R11: ffff880232c4a800 R12: ffff880238402800
[ 2.693146] R13: 48446c2075617274 R14: 0000000000000000 R15: ffff880233d91f58
[ 2.693146] FS: 00007f8dff70c700(0000) GS:ffff8803ff660000(0000) knlGS:0000000000000000
[ 2.693146] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2.693146] CR2: 000000000064eb38 CR3: 0000000232c4a000 CR4: 00000000000006e0
[ 2.693146] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2.693146] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2.693146] Process init (pid: 1, threadinfo ffff880233d90000, task ffff880233d88000)
[ 2.693146] Stack:
[ 2.693146] ffff8802325b0000 0000000000000000 ffff880233d91da0 000000d0f3261880
[ 2.693146] ffff8802325b0000 ffff8803f32216e0 ffffffff81c28020 00007f8dff70c9d0
[ 2.693146] 0000000000000000 ffff880233d91f58 ffff880233d91de0 ffffffff810853c4
[ 2.693146] Call Trace:
[ 2.693146] [<ffffffff810853c4>] alloc_pid+0x24/0x200
[ 2.693146] [<ffffffff81064848>] copy_process.part.18+0x958/0xe70
[ 2.693146] [<ffffffff8116032c>] ? kmem_cache_alloc_trace+0xfc/0x120
[ 2.693146] [<ffffffff81064dd7>] copy_process+0x77/0x80
[ 2.693146] [<ffffffff81064f2a>] do_fork+0xfa/0x2d0
[ 2.693146] [<ffffffff81191947>] ? alloc_fd+0xf7/0x150
[ 2.693146] [<ffffffff816552de>] ? _raw_spin_lock+0xe/0x20
[ 2.693146] [<ffffffff81173011>] ? fd_install+0x61/0x80
[ 2.693146] [<ffffffff8117efb3>] ? do_pipe_flags+0xc3/0x120
[ 2.693146] [<ffffffff8101c568>] sys_clone+0x28/0x30
[ 2.693146] [<ffffffff8165dbe3>] stub_clone+0x13/0x20
[ 2.693146] [<ffffffff8165d8c2>] ? system_call_fastpath+0x16/0x1b
[ 2.693146] Code: cc 4d 8b 04 24 65 4c 03 04 25 08 da 00 00 49 8b 50 08 4d 8b 28 4d 85 ed 0f 84 c4 00 00 00 49 63 44 24 20 49 8b 34 24 48 8d 4a 01 <49> 8b 5c 05 00 4c 89 e8 65 48 0f c7 0e 0f 94 c0 84 c0 74 c2 4d
[ 2.693146] RIP [<ffffffff8116136a>] kmem_cache_alloc+0x4a/0x120
[ 2.693146] RSP <ffff880233d91d60>
[ 2.859315] ---[ end trace a9e489e317ba84bc ]---
[ 2.861909] Kernel panic - not syncing: Attempted to kill init!
[ 2.864640] Pid: 1, comm: init Tainted: G D 3.2.0-24-virtual #37-Ubuntu
[ 2.867570] Call Trace:
[ 2.868730] [<ffffffff8163d3ee>] panic+0x91/0x1a4
[ 2.870210] [<ffffffff81069be5>] forget_original_parent+0x245/0x250
[ 2.873573] [<ffffffff81069c07>] exit_notify+0x17/0x160
[ 2.876318] [<ffffffff8106a4e3>] do_exit+0x1f3/0x420
[ 2.877960] [<ffffffff81656460>] oops_end+0xb0/0xf0
[ 2.880247] [<ffffffff81016808>] die+0x58/0x90
[ 2.881955] [<ffffffff81655fe2>] do_general_protection+0x162/0x170
[ 2.884968] [<ffffffff81655a05>] general_protection+0x25/0x30
[ 2.888279] [<ffffffff810853c4>] ? alloc_pid+0x24/0x200
[ 2.890215] [<ffffffff8116136a>] ? kmem_cache_alloc+0x4a/0x120
[ 2.893536] [<ffffffff810853c4>] alloc_pid+0x24/0x200
[ 2.895646] [<ffffffff81064848>] copy_process.part.18+0x958/0xe70
[ 2.898210] [<ffffffff8116032c>] ? kmem_cache_alloc_trace+0xfc/0x120
[ 2.901313] [<ffffffff81064dd7>] copy_process+0x77/0x80
[ 2.903183] [<ffffffff81064f2a>] do_fork+0xfa/0x2d0
[ 2.904693] [<ffffffff81191947>] ? alloc_fd+0xf7/0x150
[ 2.906371] [<ffffffff816552de>] ? _raw_spin_lock+0xe/0x20
[ 2.908421] [<ffffffff81173011>] ? fd_install+0x61/0x80
[ 2.910160] [<ffffffff8117efb3>] ? do_pipe_flags+0xc3/0x120
[ 2.912520] [<ffffffff8101c568>] sys_clone+0x28/0x30
[ 2.915129] [<ffffffff8165dbe3>] stub_clone+0x13/0x20
[ 2.918025] [<ffffffff8165d8c2>] ? system_call_fastpath+0x16/0x1b
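
To compare this trace with panics from other boots, it can help to reduce each trace to its list of symbols. The following is a minimal Python sketch, not part of the original report: `parse_frames` and `SAMPLE` are hypothetical names, and the regular expression assumes the frame format seen in the trace above (`[<address>] symbol+0xoffset/0xsize`). Frames marked with `?` are addresses the kernel found on the stack but could not confirm as part of the call chain.

```python
import re

# One call-trace entry, e.g. "[<ffffffff810853c4>] ? alloc_pid+0x24/0x200".
# The optional "?" marks a frame the kernel considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(log_text):
    """Return one dict per frame found in log_text, in order of appearance."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        frames.append({
            "symbol": match.group("symbol"),
            "offset": int(match.group("offset"), 16),
            "reliable": match.group("unreliable") is None,
        })
    return frames


# A few lines copied from the trace in this report.
SAMPLE = """\
[ 2.693146] RIP [<ffffffff8116136a>] kmem_cache_alloc+0x4a/0x120
[ 2.693146] Call Trace:
[ 2.693146] [<ffffffff810853c4>] alloc_pid+0x24/0x200
[ 2.693146] [<ffffffff81064848>] copy_process.part.18+0x958/0xe70
[ 2.693146] [<ffffffff8116032c>] ? kmem_cache_alloc_trace+0xfc/0x120
"""

frames = parse_frames(SAMPLE)
reliable_symbols = [f["symbol"] for f in frames if f["reliable"]]
```

Two traces whose reliable symbols match are more likely to be the same failure; later comments note that the traces are not consistent between reproductions, so this only helps with grouping.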

Ben Howard (darkmuggle) on 2012-05-04
description: updated

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/994870/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Andy Whitcroft (apw) wrote :

@Ben -- is this a 32bit or 64bit instance, and exactly how much memory is being allocated in this instance size?

affects: ubuntu → linux (Ubuntu)
Long Li (longli) wrote :

This is a 64bit instance. This VM is running on 8 VCPU and 14GB RAM. The issue also reproduces with other VCPU and RAM settings, but much less often.

However, the kernel panic traces are not consistent between reproductions. The trace in the bug report happens at an early stage of boot, at which point we have yet to load the Linux Hyper-V drivers.

I am seeing a kernel panic during the x64 Ubuntu Server 12.04 installer, on a small Hyper-V guest (2 VCPU, 1GB RAM). Much later than boot, though. Last time it was while it was partitioning the disk. The installer UI locks up and my caps lock blinks. Is there a way to get the stack so I can see whether it's the same as this bug?

Ben Howard (darkmuggle) wrote :

Raising to critical since there is no known workaround.

Changed in linux (Ubuntu):
importance: High → Critical
Long Li (longli) wrote :

Update: We only see this issue when using >1 CPU for a Hyper-V guest.

Changed in linux (Ubuntu):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Andy Whitcroft (apw) wrote :

I see noted elsewhere that this does not occur on 3.2.0-16 but does on 3.2.0-24. A logical test would be to try the intervening kernels and confirm where this was introduced.
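
Trying the intervening kernels amounts to a bisection between a known-good and a known-bad build. The sketch below is only an illustration in Python, under stated assumptions: `first_bad` and `fake_is_bad` are hypothetical names, the version list is generated for the example (not every ABI number between -16 and -24 necessarily corresponds to a published build), and the regression point used in the demo is made up. In practice `is_bad` would mean "boot this kernel many times and report whether any boot panicked", because the failure is intermittent.

```python
def first_bad(versions, is_bad):
    """Return the oldest version for which is_bad(version) is true.

    Assumes versions are ordered oldest to newest, versions[0] is known
    good, versions[-1] is known bad, and the bug persists once introduced.
    """
    lo, hi = 0, len(versions) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


# Illustrative list only; real build numbers may differ.
versions = [f"3.2.0-{n}" for n in range(16, 25)]


def fake_is_bad(version):
    # Stand-in for a real boot test. The threshold 20 is invented for the demo.
    return int(version.split("-")[1]) >= 20


culprit = first_bad(versions, fake_is_bad)
```

With nine candidates the loop needs at most four test rounds, which matters when each round is a long series of reboots.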

Andy Whitcroft (apw) wrote :

I have been trying to reproduce this for several hours using a 4 vcpu/8GB memory instance through 100 reboots (and counting) and have not had a single failure.
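
For a sense of scale, here is a rough calculation in Python. It assumes each boot fails independently with a fixed probability, which the report does not establish, and the function names are hypothetical. It shows how unlikely 100 clean boots would be if the 1-in-5 rate from the bug description applied to this 4 vcpu/8GB instance, and how many boots are needed before a clean run is convincing.

```python
import math


def prob_all_clean(failure_rate, boots):
    """Probability that `boots` independent boots all succeed."""
    return (1.0 - failure_rate) ** boots


def boots_needed(failure_rate, confidence):
    """Smallest number of boots that shows at least one failure with the
    given confidence, if the true per-boot failure rate is failure_rate."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


p_clean_100 = prob_all_clean(0.2, 100)   # about 2e-10
boots_for_99 = boots_needed(0.2, 0.99)   # 21
```

Under these assumptions, 100 clean boots would be practically impossible at a 1-in-5 failure rate, which suggests the failure rate on this configuration is far lower than on the reporter's larger instances.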

Andy Whitcroft (apw) wrote :

The ideal situation here would be to find the latest version we had which worked reliably and pull in the missing parts; this would give us a quick fix for the immediate problems.

tags: added: kernel-da-key kernel-key precise
Long Li (longli) wrote :

After removing ata_piix from the kernel, the panic problem goes away.

It seems the regression comes from this commit:

commit a207c1ea485cc9cd7d546eadeb0877515c952f2a
Author: Andy Whitcroft <email address hidden>
Date: Thu Mar 8 11:32:35 2012 +0000

    UBUNTU: SAUCE: ata_piix: defer disks to the Hyper-V drivers by default

    When we are hosted on a Microsoft Hyper-V hypervisor the guest disks
    are exposed both via the Hyper-V paravirtualised drivers and via an
    emulated SATA disk drive. In this case we want to use the paravirtualised
    drivers if we can as they are much more efficient. Note that the Hyper-V
    paravirtualised drivers only expose the virtual hard disk devices, the
    CDROM/DVD devices must still be enumerated.

    Check the disk type when picking up its ID and if it appears to be a
    disk just report it disconnected.

    BugLink: http://bugs.launchpad.net/bugs/929545
    BugLink: http://bugs.launchpad.net/bugs/942316
    Signed-off-by: Andy Whitcroft <email address hidden>
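
The behaviour described in this commit message can be modelled in a few lines. This is a toy Python model for illustration only, not the actual patch (which is C code in the ata_piix driver); the function name and its two boolean parameters are hypothetical.

```python
def ata_piix_keeps_device(on_hyperv, is_atapi):
    """Toy model: return True if ata_piix should enumerate the device."""
    if on_hyperv and not is_atapi:
        # Hard disk under Hyper-V: report it as disconnected so that the
        # paravirtualised storage driver handles it instead.
        return False
    # CD-ROM/DVD (ATAPI) devices, and everything on non-Hyper-V hosts,
    # are still enumerated by ata_piix.
    return True
```

This matches the boot log in the bug description, where the ATA disks are reported as "ignored deferring to Hyper-V paravirt driver".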

Andy Whitcroft (apw) wrote :

@Long Li -- that commit unfortunately changes the behaviour, so without it we will never use the drives with the paravirt driver. So the cause may be the change itself, or just the resulting change in behaviour. The fix as applied upstream works slightly differently; I will get some test kernels with that form applied for comparison.

Andy Whitcroft (apw) on 2012-05-17
Changed in linux (Ubuntu Precise):
status: New → In Progress
assignee: nobody → Andy Whitcroft (apw)
Changed in linux (Ubuntu):
assignee: Canonical Kernel Team (canonical-kernel-team) → Andy Whitcroft (apw)
status: Confirmed → In Progress
Changed in linux (Ubuntu Precise):
importance: Undecided → Critical
milestone: none → ubuntu-12.04.1
Andy Whitcroft (apw) on 2012-05-17
Changed in linux (Ubuntu):
milestone: ubuntu-12.04.1 → quantal-alpha-1
Andy Whitcroft (apw) on 2012-05-21
Changed in linux (Ubuntu Precise):
status: In Progress → Fix Committed
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel for Precise in -proposed solves the problem (3.2.0-24.39). Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-precise' to 'verification-done-precise'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-precise
Andy Whitcroft (apw) wrote :

Installed this kernel into an instance, and tested both with and without the new option:

ubuntu@ubuntu:~$ dmesg | egrep 'Linux |ata'
[ 0.000000] Linux version 3.2.0-24-virtual (buildd@crested) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #39-Ubuntu SMP Mon May 21 18:44:18 UTC 2012 (Ubuntu 3.2.0-24.39-virtual 3.2.16)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-24-virtual root=UUID=0a14fe51-b4cf-4bd5-a11b-75ef1ef9a0d0 ro earlyprintk ata_piix.disable_driver
[...]
[ 1.259036] ata_piix: driver disabled completely

ubuntu@ubuntu:~$ dmesg | egrep 'Linux |ata'
[ 0.000000] Linux version 3.2.0-24-virtual (buildd@crested) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #39-Ubuntu SMP Mon May 21 18:44:18 UTC 2012 (Ubuntu 3.2.0-24.39-virtual 3.2.16)
[...]
[ 1.585875] ata1.00: ATA disk ignored deferring to Hyper-V paravirt driver
[ 1.586466] ata1.00: NODEV after polling detection
[ 1.587614] ata2.00: ATAPI: Virtual CD, , max MWDMA2
[ 1.588964] ata2.00: configured for MWDMA2
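
The same check can be scripted when verifying many instances. Below is a minimal Python sketch, assuming the two message strings exactly as they appear in the dmesg output above; `ata_piix_state` is a hypothetical helper name, and the sample text is copied from this comment.

```python
def ata_piix_state(dmesg_text):
    """Classify ata_piix behaviour from dmesg output.

    Returns "disabled" if the driver was switched off by the
    ata_piix.disable_driver option, "deferring" if it left the disks to
    the Hyper-V paravirt driver, and "unknown" otherwise.
    """
    if "ata_piix: driver disabled completely" in dmesg_text:
        return "disabled"
    if "ATA disk ignored deferring to Hyper-V paravirt driver" in dmesg_text:
        return "deferring"
    return "unknown"


with_option = "[ 1.259036] ata_piix: driver disabled completely\n"
without_option = (
    "[ 1.585875] ata1.00: ATA disk ignored deferring to Hyper-V paravirt driver\n"
    "[ 1.586466] ata1.00: NODEV after polling detection\n"
)

state_with = ata_piix_state(with_option)
state_without = ata_piix_state(without_option)
```

On a live system the input would be the text printed by `dmesg`, for example captured to a file and read back in.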

tags: added: verification-done-precise
removed: verification-needed-precise
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.2.0-24.39

---------------
linux (3.2.0-24.39) precise-proposed; urgency=low

  [Luis Henriques]

  * Release Tracking Bug
    - LP: #1002329

  [ Andy Whitcroft ]

  * SAUCE: ata_piix: add a disable_driver option
    - LP: #994870
 -- Luis Henriques <email address hidden> Mon, 21 May 2012 15:51:25 +0100

Changed in linux (Ubuntu Precise):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
milestone: quantal-alpha-1 → quantal-alpha-2
Changed in linux (Ubuntu Quantal):
milestone: quantal-alpha-2 → quantal-alpha-3
Bill Maxwell (wamaxwell) wrote :

Alpha 1: no problems, boots okay
Alpha 2 & 3: has problems, will not boot up
    [32.741233] Kernel panic - not syncing: fatal exception in interrupt
    [32.741275] Panic occurred, switching back to text console

Dell GX260 Intel 2.4 GHz, 1GB RAM, nVidia 32MB SSE, 40GB HDD

tags: removed: kernel-key
Changed in linux (Ubuntu Quantal):
milestone: quantal-alpha-3 → ubuntu-12.10
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.5.0-16.24

---------------
linux (3.5.0-16.24) quantal-proposed; urgency=low

  [ Andy Whitcroft ]

  * SAUCE: ata_piix: add a disable_driver option
    - LP: #994870

  [ Christian König ]

  * (pre-stable) drm/radeon: make 64bit fences more robust v3 (3.5 stable)
    - LP: #1029582

  [ David Henningsson ]

  * SAUCE: ALSA: hda - use both input paths on Conexant auto parser
    - LP: #1037642
  * SAUCE: ALSA: hda - fix control names for multiple speaker out on
    IDT/STAC
    - LP: #1046734

  [ Herton Ronaldo Krzesinski ]

  * SAUCE: ALSA: hda/via - don't report presence on HPs with no presence
    support
    - LP: #1052499
  * SAUCE: ext4: fix crash when accessing /proc/mounts concurrently
    - LP: #1053019
  * SAUCE: ALSA: hda/realtek - Fix detection of ALC271X codec
    - LP: #1006690

  [ Kyle Fazzari ]

  * SAUCE: input: Cypress PS/2 Trackpad fix disabling tap-to-click
    - LP: #1048816

  [ Leann Ogasawara ]

  * [Config] Disable CONFIG_DRM_AST
    - LP: #1053290

  [ Stefan Bader ]

  * [Config] Disable the Cirrus QEMU drm driver
    - LP: #1038055

  [ Upstream Kernel Changes ]

  * Revert "KVM: VMX: Fix KVM_SET_SREGS with big real mode segments"
    - LP: #1045027
  * x86, efi: Handover Protocol
  * drm/i915: HDMI - Clear Audio Enable bit for Hot Plug
    - LP: #1056729
  * UBUNTU SAUCE: apparmor: fix IRQ stack overflow
    - LP: #1056078
  * drm/nouveau: fix booting with plymouth + dumb support
    - LP: #1043518
  * ALSA: hda - Add DeviceID for Haswell HDA
    - LP: #1057698
  * ALSA: hda - add Haswell HDMI codec id
    - LP: #1057698
  * ALSA: hda - Fix driver type of Haswell controller to AZX_DRIVER_SCH
    - LP: #1057698
  * ALSA: hda_intel: Add Device IDs for Intel Lynx Point-LP PCH
    - LP: #1011438, #1057698

  [ Wang Xingchao ]

  * SAUCE: ALSA: hda - Add another pci id for Haswell board
    - LP: #1057698

  [ Wen-chien Jesse Sung ]

  * SAUCE: drm/i915: Explicitly disable RC6 for certain models
    - LP: #1002170, #1008867
 -- Leann Ogasawara <email address hidden> Thu, 27 Sep 2012 13:55:52 -0700

Changed in linux (Ubuntu Quantal):
status: In Progress → Fix Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
