[Hyper-V] Kernel panic not functional on Vivid

Bug #1440103 reported by Chris Valean
This bug affects 3 people
Affects          Status        Importance  Assigned to       Milestone
linux (Ubuntu)   Fix Released  Medium      Joseph Salisbury
Vivid            Fix Released  Medium      Joseph Salisbury
Wily             Fix Released  Medium      Joseph Salisbury
Xenial           Fix Released  Medium      Joseph Salisbury

Bug Description

Issue description:
Triggering a kernel panic will not produce a crash dump file.

Steps to Reproduce:
1. Install the kdump-related packages.
2. Verify the crashmem value; it defaults to 384M-:128M.
3. Verify that the system is ready for a crash:
        # cat /sys/kernel/kexec_crash_loaded
        1
4. Trigger a kernel panic with # echo c | sudo tee /proc/sysrq-trigger
5. The system freezes and no kernel dump is generated; the system is not rebooted automatically as configured in the kdump-tools config.
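
The readiness check in step 3 can also be scripted. The following is a minimal Python sketch, not part of the original report: the function name kdump_ready and its sysfs_root parameter are hypothetical, and it assumes the kernel exposes kexec_crash_size (the reserved crash-kernel memory in bytes) alongside kexec_crash_loaded.

```python
from pathlib import Path


def kdump_ready(sysfs_root="/sys/kernel"):
    """Return (loaded, reserved_bytes) as reported by the kexec sysfs files.

    sysfs_root is a hypothetical parameter so the check can be pointed at a
    different directory (for example, a copy of the files) when testing.
    """
    root = Path(sysfs_root)
    # kexec_crash_loaded contains "1" once a crash kernel image is loaded.
    loaded = (root / "kexec_crash_loaded").read_text().strip() == "1"
    # kexec_crash_size contains the reserved crash-kernel memory in bytes.
    reserved = int((root / "kexec_crash_size").read_text().strip())
    return loaded, reserved
```

Calling kdump_ready() on the affected VM should report the crash kernel as loaded, which is the point of this bug: the setup looks correct, yet no dump is produced after the panic.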

Versions details:
Windows Server Host Edition: Microsoft Windows Server 2012 R2 Datacenter build 9600
Distribution name and release: Ubuntu Vivid Vervet
Kernel version: Linux ubuntu31 3.19.0-11-generic #11lp14233432v201504021617 SMP

Repro output with VM settings as follows:
2GB RAM, 1vCPU, crashmem: 384M:128M OR 384M:128M

[ 149.328829] SysRq : Trigger a crash
[ 149.342418] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 149.346369] IP: [<ffffffff814ae776>] sysrq_handle_crash+0x16/0x20
[ 149.346369] PGD 367a3067 PUD 7bdf0067 PMD 0
[ 149.346369] Oops: 0002 [#1] SMP
[ 149.346369] Modules linked in: joydev hid_generic hid_hyperv serio_raw hid 8250_fintek hyperv_keyboard hv_balloon hyperv_fb i2c_piix4 mac_hid autofs4 hv_netvsc hv_utils hv_storvsc psmouse floppy pata_acpi hv_vmbus
[ 149.346369] CPU: 0 PID: 1379 Comm: tee Not tainted 3.19.0-11-generic #11lp14233432v201504021617
[ 149.346369] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
[ 149.346369] task: ffff88007a7d3ae0 ti: ffff88007be7c000 task.ti: ffff88007be7c000
[ 149.346369] RIP: 0010:[<ffffffff814ae776>] [<ffffffff814ae776>] sysrq_handle_crash+0x16/0x20
[ 149.346369] RSP: 0018:ffff88007be7fe68 EFLAGS: 00010292
[ 149.346369] RAX: 000000000000000f RBX: 0000000000000063 RCX: 000000000000000f
[ 149.346369] RDX: ffff88007ce0fd78 RSI: ffff88007ce0e498 RDI: 0000000000000063
[ 149.346369] RBP: ffff88007be7fe68 R08: 0000000000000002 R09: 0000000000000275
[ 149.346369] R10: 0000000000000092 R11: 0000000000000275 R12: 0000000000000004
[ 149.346369] R13: 0000000000000000 R14: ffffffff81cb38e0 R15: 0000000000000008
[ 149.346369] FS: 00007f0a2e22a700(0000) GS:ffff88007ce00000(0000) knlGS:0000000000000000
[ 149.346369] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 149.346369] CR2: 0000000000000000 CR3: 0000000036047000 CR4: 00000000000006f0
[ 149.346369] Stack:
[ 149.346369] ffff88007be7fe98 ffffffff814aef96 0000000000000002 fffffffffffffffb
[ 149.346369] 00007ffc64e934b0 0000000000000002 ffff88007be7feb8 ffffffff814af443
[ 149.346369] 00007ffc64e934b0 ffff8800364f15c0 ffff88007be7fed8 ffffffff8125ad98
[ 149.346369] Call Trace:
[ 149.346369] [<ffffffff814aef96>] __handle_sysrq+0x106/0x170
[ 149.346369] [<ffffffff814af443>] write_sysrq_trigger+0x33/0x40
[ 149.346369] [<ffffffff8125ad98>] proc_reg_write+0x48/0x70
[ 149.346369] [<ffffffff811f33d7>] vfs_write+0xb7/0x1f0
[ 149.346369] [<ffffffff811f3ece>] ? vfs_read+0x11e/0x140
[ 149.346369] [<ffffffff811f3fe6>] SyS_write+0x46/0xb0
[ 149.346369] [<ffffffff817c924d>] system_call_fastpath+0x16/0x1b
[ 149.346369] Code: ef e8 df f7 ff ff eb d8 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 66 66 66 66 90 55 c7 05 34 42 a3 00 01 00 00 00 48 89 e5 0f ae f8 <c6> 04 25 00 00 00 00 01 5d c3 66 66 66 66 90 55 31 c0 48 89 e5
[ 149.346369] RIP [<ffffffff814ae776>] sysrq_handle_crash+0x16/0x20
[ 149.346369] RSP <ffff88007be7fe68>
[ 149.346369] CR2: 0000000000000000
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.19.0-11-generic (apw@gloin) (gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu12) ) #11lp14233432v201504021617 SMP Thu Apr 2 15:15:51 UTC 2015 (Ubuntu 3.19.0-11.11lp14233432v201504021617-generic 3.19.3)
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.19.0-11-generic root=/dev/mapper/ubuntu31--vg-root ro console=tty0 console=ttyS1 irqpoll maxcpus=1 nousb systemd.unit=kdump-tools.service elfcorehdr=867700K

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1440103

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: vivid
Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key kernel-hyper-v
Changed in linux (Ubuntu):
importance: High → Medium
Chris Valean (cvalean)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-key
nmeier (nmeier) wrote :

It was requested that I test Ballooning on 14.04.02.
Ballooning fails on 14.04.02.

Ballooning test on 14.04.02 after running apt-get upgrade
  Startup Memory : 2048
  Minimum Memory : 1024
  Maximum Memory : 8192
The utility stressapptest was used to generate memory demand.
Let memory settle to 2048MB assigned, 644MB demand.
Added memory demand of 1664MB
Assigned memory grew to 2624MB assigned, 2125MB demand.
Let system settle to 1044MB assigned, 428MB demand
Added memory demand of 1664MB
Assigned memory grew to 2732MB assigned, 2376MB demand.
After all instances of stressapptest terminated demand dropped to 464MB
Assigned memory did not balloon down after 15 minutes.
Configured second VM to have a startup memory value to consume all of the hosts free memory.
Test VM assigned memory did not balloon down.
The dmesg log has stack traces for hung_task_timeout_secs.
The call traces are trying to add pages.
See attached file.

tags: removed: kernel-key
Joseph Salisbury (jsalisbury) wrote :

Can you test the latest 4.1 upstream kernel and see if this bug still exists? It can be downloaded from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.1-unstable/

tags: added: kernel-key
Chris Valean (cvalean) wrote :

The patch has not been submitted upstream yet; we will update this thread as soon as it is ready.

Joseph Salisbury (jsalisbury) wrote :

Thanks for the update, Chris!

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Joshua R. Poulson (jrp) wrote :

I'm attaching the patches that went upstream as part of a larger bundle to fix this kexec issue.

From "K. Y. Srinivasan" <>
Subject [PATCH V2 00/10] Drivers: hv: vmbus: Enable kexec and other misc cleanup
Date Thu, 4 Jun 2015 16:25:39 -0700

In addition to enabling kexec, this patch-set has a bunch of miscellaneous
fixes.

In this version of the patch-set; I have fixed up the subject line for couple
of patches (Greg)

Alex Ng (1):
  Drivers: hv: balloon: Enable dynamic memory protocol negotiation with
    Windows 10 hosts

K. Y. Srinivasan (1):
  Drivers: hv: vmbus: Permit sending of packets without payload

Vitaly Kuznetsov (8):
  cpu-hotplug: export cpu_hotplug_enable/cpu_hotplug_disable
  Drivers: hv: vmbus: use cpu_hotplug_enable/disable
  Drivers: hv: vmbus: remove hv_synic_free_cpu() call from
    hv_synic_cleanup()
  Drivers: hv: vmbus: add special kexec handler
  Drivers: hv: don't do hypercalls when hypercall_page is NULL
  Drivers: hv: vmbus: use 'die' notification chain instead of 'panic'
  Drivers: hv: kvp: check kzalloc return value
  Drivers: hv: fcopy: dynamically allocate smsg_out in
    fcopy_send_data()

 Documentation/power/suspend-and-cpuhotplug.txt | 6 +-
 arch/x86/include/asm/mshyperv.h | 2 +
 arch/x86/kernel/cpu/mshyperv.c | 30 +++++++++
 drivers/hv/channel.c | 4 +-
 drivers/hv/hv.c | 15 +++--
 drivers/hv/hv_balloon.c | 26 ++++++--
 drivers/hv/hv_fcopy.c | 21 ++++---
 drivers/hv/hv_kvp.c | 3 +
 drivers/hv/vmbus_drv.c | 76 +++++++++---------------
 kernel/cpu.c | 13 +++--
 10 files changed, 119 insertions(+), 77 deletions(-)
--
1.7.4.1

Joshua R. Poulson (jrp) wrote :
Joshua R. Poulson (jrp) wrote :
Joshua R. Poulson (jrp) wrote :
Joshua R. Poulson (jrp) wrote :

FYI, re: upstream status, the series was resubmitted on 6/28 but needs an ack from the CPU hotplug folks before gregkh will accept it into his tree. We have tested it internally, as well as testing it with our partners.

tags: added: patch
tags: removed: kernel-key
Changed in linux (Ubuntu Vivid):
importance: Undecided → Medium
Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Vivid):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu):
status: Triaged → In Progress
Changed in linux (Ubuntu Vivid):
status: New → In Progress
Joseph Salisbury (jsalisbury) wrote :

Upstream thread:

https://lkml.org/lkml/2015/6/28/70

We can cherry-pick these commits once they are accepted in mainline.

Joseph Salisbury (jsalisbury) wrote :

It looks like the commits requested in this bug have now landed in upstream v4.3-rc1

tags: added: wily
Joseph Salisbury (jsalisbury) wrote :

I performed a cherry pick of the following three commits into Wily:

2b94ed2 kexec: define kexec_in_progress in !CONFIG_KEXEC case
2517281 Drivers: hv: vmbus: add special kexec handler
b4370df Drivers: hv: vmbus: add special crash handler

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1440103/wily/

Can you test this kernel and see if it resolves this bug?

Zsolt Dudás (v-zsduda) wrote :

It resolved the bug. Tested successfully on Wily with 4.2.0-10-generic #12~lp1440103 test kernel.

These are the traces in dmesg in the crash logs:
[ 187.204774] CPU: 3 PID: 946 Comm: bash Not tainted 4.2.0-10-generic #12~lp1440103
[ 187.204774] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
[ 187.204774] task: ffff8800f077cb00 ti: ffff8800f23ac000 task.ti: ffff8800f23ac000
[ 187.204774] RIP: 0010:[<ffffffff814a1786>]
[ 187.204774] [<ffffffff814a1786>] sysrq_handle_crash+0x16/0x20
[ 187.204774] RSP: 0018:ffff8800f23afe28 EFLAGS: 00010296
[ 187.204774] RAX: 000000000000000f RBX: 0000000000000063 RCX: 000000000000000f
[ 187.204774] RDX: ffff8801026d12f8 RSI: ffff8801026cea58 RDI: 0000000000000063
[ 187.204774] RBP: ffff8800f23afe28 R08: 00000000000000c2 R09: 0000000000000002
[ 187.204774] R10: 0000000000000260 R11: 0000000000000002 R12: 0000000000000004
[ 187.204774] R13: 0000000000000000 R14: ffffffff81cc2560 R15: 0000000000000000
[ 187.204774] FS: 00007f385dfc0700(0000) GS:ffff8801026c0000(0000) knlGS:0000000000000000
[ 187.204774] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 187.204774] CR2: 0000000000000000 CR3: 00000000f1111000 CR4: 00000000000006e0
[ 187.204774] Stack:
[ 187.204774] ffff8800f23afe58 ffffffff814a1f86 0000000000000002 fffffffffffffffb
[ 187.204774] 0000000000980408 0000000000000002 ffff8800f23afe78 ffffffff814a23e3
[ 187.204774] 0000000000000002 ffff8800edeec180 ffff8800f23afe98 ffffffff81252278
[ 187.204774] Call Trace:
[ 187.204774] [<ffffffff814a1f86>] __handle_sysrq+0xf6/0x150
[ 187.204774] [<ffffffff814a23e3>] write_sysrq_trigger+0x33/0x40
[ 187.204774] [<ffffffff81252278>] proc_reg_write+0x48/0x70
[ 187.204774] [<ffffffff811eb598>] __vfs_write+0x18/0x40
[ 187.204774] [<ffffffff811ebbd9>] vfs_write+0xa9/0x190
[ 187.204774] [<ffffffff811ec946>] SyS_write+0x46/0xa0
[ 187.204774] [<ffffffff81208f0f>] ? __close_fd+0x8f/0xb0
[ 187.204774] [<ffffffff817b8132>] entry_SYSCALL_64_fastpath+0x16/0x75
[ 187.204774] Code: 45 39 7d 34 75 e5 4c 89 ef e8 17 f8 ff ff eb db 0f 1f 44 00 00 0f 1f 44 00 00 55 c7 05 28 3c a6 00 01 00 00 00 48 89 e5 0f ae f8 <c6> 04 25 00 00 00 00 01 5d c3 0f 1f 44 00 00 55 31 c0 48 89 e5
[ 187.204774] RIP [<ffffffff814a1786>] sysrq_handle_crash+0x16/0x20
[ 187.204774] RSP <ffff8800f23afe28>
[ 187.204774] CR2: 0000000000000000

Chris Valean (cvalean) wrote :

Hi Joseph,
We received confirmation that all of the patches below must be included to fix the kdump issue.
Please provide us with a test kernel that includes them so we can re-test.

Drivers: hv: vmbus: prefer 'die' notification chain to 'panic'
http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=510f7aef65bb7ed22cf9c7f94f955727f963ede4

Drivers: hv: vmbus: add special crash handler
http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=b4370df2b1f5158de028e167974263c5757b34a6

Drivers: hv: don't do hypercalls when hypercall_page is NULL
http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=d7646eaa7678fe5adc42247b4bdfbe9d9db8c253

Drivers: hv: vmbus: add special kexec handler
http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=2517281d63a2b09d94aedfb522943617048f337e

kexec: define kexec_in_progress in !CONFIG_KEXEC case
http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=2b94ed245861a7d378dcde6eef7fa7717e06e349

Tim Gardner (timg-tpi) wrote :

Applied to Wily:

Drivers: hv: kvp: check kzalloc return value
Drivers: hv: vmbus: prefer 'die' notification chain to 'panic'
Drivers: hv: vmbus: add special crash handler
Drivers: hv: don't do hypercalls when hypercall_page is NULL
Drivers: hv: vmbus: add special kexec handler
kexec: define kexec_in_progress in !CONFIG_KEXEC case

Changed in linux (Ubuntu Wily):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.2.0-14.16

---------------
linux (4.2.0-14.16) wily; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1501818
  * rebase to v4.2.2
  * [Config] CONFIG_RTC_DRV_XGENE=y
    - LP: #1499869

  [ Upstream Kernel Changes ]

  * mei: do not access freed cb in blocking write
    - LP: #1494076
  * mei: bus: fix drivers and devices names confusion
    - LP: #1494076
  * mei: bus: rename nfc.c to bus-fixup.c
    - LP: #1494076
  * mei: bus: move driver api functions at the start of the file
    - LP: #1494076
  * mei: bus: rename uevent handler to mei_cl_device_uevent
    - LP: #1494076
  * mei: bus: don't enable events implicitly in device enable
    - LP: #1494076
  * mei: bus: report if event registration failed
    - LP: #1494076
  * mei: bus: revamp device matching
    - LP: #1494076
  * mei: bus: revamp probe and remove functions
    - LP: #1494076
  * mei: bus: add reference to bus device in struct mei_cl_client
    - LP: #1494076
  * mei: bus: add me client device list infrastructure
    - LP: #1494076
  * mei: bus: enable running fixup routines before device registration
    - LP: #1494076
  * mei: bus: blacklist the nfc info client
    - LP: #1494076
  * mei: bus: blacklist clients by number of connections
    - LP: #1494076
  * mei: bus: simplify how we build nfc bus name
    - LP: #1494076
  * mei: bus: link client devices instead of host clients
    - LP: #1494076
  * mei: support for dynamic clients
    - LP: #1494076
  * mei: disconnect on connection request timeout
    - LP: #1494076
  * mei: define async notification hbm commands
    - LP: #1494076
  * mei: implement async notification hbm messages
    - LP: #1494076
  * mei: enable async event notifications only from hbm version 2.0
    - LP: #1494076
  * mei: add mei_cl_notify_request command
    - LP: #1494076
  * mei: add a handler that waits for notification on event
    - LP: #1494076
  * mei: add async event notification ioctls
    - LP: #1494076
  * mei: support polling for event notification
    - LP: #1494076
  * mei: implement fasync for event notification
    - LP: #1494076
  * mei: bus: add and call callback on notify event
    - LP: #1494076
  * mei: hbm: add new error code MEI_CL_CONN_NOT_ALLOWED
    - LP: #1494076
  * mei: me: d0i3: add the control registers
    - LP: #1494076
  * mei: me: d0i3: add flag to indicate D0i3 support
    - LP: #1494076
  * mei: me: d0i3: enable d0i3 interrupts
    - LP: #1494076
  * mei: hbm: reorganize the power gating responses
    - LP: #1494076
  * mei: me: d0i3: add d0i3 enter/exit state machine
    - LP: #1494076
  * mei: me: d0i3: move mei_me_hw_reset down in the file
    - LP: #1494076
  * mei: me: d0i3: exit d0i3 on driver start and enter it on stop
    - LP: #1494076
  * mei: me: add sunrise point device ids
    - LP: #1494076
  * mei: hbm: bump supported HBM version to 2.0
    - LP: #1494076
  * mei: remove check on pm_runtime_active in __mei_cl_disconnect
    - LP: #1494076
  * mei: fix debugfs files leak on error path
    - LP: #1494076

  [ Upstream Kernel Changes ]

  * rebase to v4.2.2
    - LP: #1492132

 -- Tim Gardner <email address hidden> Tue, 29 Sep 20...


Changed in linux (Ubuntu Wily):
status: Fix Committed → Fix Released
Joseph Salisbury (jsalisbury) wrote :

I built a Vivid test kernel with the six patches posted in comment #16.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1440103/vivid

Can you test this kernel and see if it resolves this bug?

Chris Valean (cvalean) wrote :

Hi Joe,
I'm running the test kernel 3.19.0-32_3.19.0-32.37~lp1440103_amd64 on Vivid, both on a default install and on a fully updated one; however, the crash dump is not happening as expected.

The usual crash log output is shown, but the system hangs and never reboots, and the crash dump file is not generated.

I'm attaching the full crash serial log output.
One thing to note is that on the console itself, after several minutes, a message shows up along the lines of CPU x, command bash, not tainted.
This shows up *only* on the console; it is not logged on the serial output.

For validation, I'm running this on WS 2012R2.
For crashkernel memory sizes, I tried the default (384M-:128M), 384M, and 512M, each in both SMP and single-core configurations.
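
For reference, a value such as 384M-:128M uses the kernel's range syntax (start-[end]:size): memory is reserved only when the system RAM falls inside the range, while a plain value such as 384M or 512M is reserved unconditionally. The sketch below only illustrates that rule in Python; the helper names are hypothetical, and it ignores details such as alignment and placement at an optional @offset.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '384M' to bytes; a bare number is taken as bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def crashkernel_reservation(value, ram_bytes):
    """Return the bytes a crashkernel= value would reserve for the given RAM size."""
    value = value.split("@", 1)[0]  # drop an optional @offset
    if ":" not in value:
        return parse_size(value)  # plain form, e.g. "384M"
    for entry in value.split(","):
        rng, size = entry.split(":", 1)
        start, _, end = rng.partition("-")
        low = parse_size(start)
        high = parse_size(end) if end else None  # "384M-" is open-ended
        if ram_bytes >= low and (high is None or ram_bytes < high):
            return parse_size(size)
    return 0  # RAM is outside every range, so nothing is reserved
```

With the 2GB VM from the bug description, crashkernel_reservation("384M-:128M", 2 << 30) gives 128 MiB, whereas the plain 384M and 512M settings reserve exactly that amount regardless of RAM size.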

Joseph Salisbury (jsalisbury) wrote :

It was found in bug 1400319 that the issue may be due to kdump-tools. Can you see if this is also the case with this bug and try the version of kdump-tools mentioned in comment #33 and #20?

Joseph Salisbury (jsalisbury) wrote :

It may be that we need a combination of the six patches in the test kernel and a new kdump-tools.

Chris Valean (cvalean) wrote :

The different kdump-tools package version in bug #1400319 was used because of a different error, which in any case occurred only on 32-bit.

Nevertheless, I still tried this on the Vivid VM and verified the crash functionality with kdump-tools from Trusty and then the one from Wily.
With both attempts, kdump still fails with the same behavior as in comment #19.

Validation for kdump setup:
# cat /sys/kernel/kexec_*
1
402653184

# service kdump-tools status
● kdump-tools.service - Kernel crash dump capture service
   Loaded: loaded (/lib/systemd/system/kdump-tools.service; enabled; vendor preset: enabled)
   Active: active (exited) since Thu 2015-11-05 03:01:05 PST; 1min 51s ago
  Process: 661 ExecStart=/etc/init.d/kdump-tools start (code=exited, status=0/SUCCESS)
 Main PID: 661 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/kdump-tools.service

Nov 05 03:01:03 ubuntu1504srv systemd[1]: Starting Kernel crash dump capture service...
Nov 05 03:01:05 ubuntu1504srv kdump-tools[661]: Starting kdump-tools: * loaded kdump kernel
Nov 05 03:01:05 ubuntu1504srv systemd[1]: Started Kernel crash dump capture service.

kdump-tools packages tested:
- vivid default
- kdump-tools_1.5.5-2ubuntu1_all - Trusty
- kdump-tools_1.5.8-4_all - Wily

Joseph Salisbury (jsalisbury) wrote :

I built a V2 Vivid test kernel with the six patches posted in comment #16 and some prereq commits I applied instead of backporting.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1440103/vivid

Can you test this kernel and see if it resolves this bug?

Also, can you confirm that Wily with the latest updates does not exhibit this bug anymore?

Ovidiu Rusu (orusu) wrote :

I've installed the test kernel but the VM doesn't boot.

You can check the boot log which I attached.

Changed in linux (Ubuntu):
status: Fix Released → In Progress
Chris Valean (cvalean) wrote :

Joe, can you please re-check on the patches included, based on the no-boot issue from comment #24?
Thank you!

Joseph Salisbury (jsalisbury) wrote :

I made some changes to the backports and have a new V3 test kernel to try. The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1440103/vivid

Can you test this kernel and see if it resolves this bug?

Also, can you confirm that Wily with the latest updates does not exhibit this bug anymore?

Paula Crismaru (pcrismaru) wrote :

Hello!

I tested Wily with the latest kernel version (4.2.0-23-generic), and the VM reboots itself after a kernel panic.
I also tested Vivid with the following kernel: http://kernel.ubuntu.com/~jsalisbury/lp1440103/vivid/. The VM doesn't boot after installing the kernel. I attached the log.

Joseph Salisbury (jsalisbury) wrote :

I created one more test kernel, which is v5. I confirmed this test kernel now boots. One of the prereq backports needed to be done differently.

The new test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1440103/vivid/

Can you test this kernel and see if it resolves this bug?

Paula Crismaru (pcrismaru) wrote :

Hello!

I tested the following kernel http://kernel.ubuntu.com/~jsalisbury/lp1440103/vivid/ with the following crashkernel values: 384M and 512M. Everything worked as expected.

Joseph Salisbury (jsalisbury) wrote :

That is good news, Paula. If those commits now fix this bug, I'll SRU them to Vivid.

Thanks for the help testing!

Chris Valean (cvalean) wrote :

Hi Joe,
Can you provide a list of commits and dependencies you have to include for the working test kernel?
The same or a similar issue might be occurring on Ubuntu with linux-next so we want to compare them.

Joseph Salisbury (jsalisbury) wrote :

Hi Chris,

Here is the list of commits used in the test kernel:

0f22bce staging: lustre: check kzalloc return value
1d67237 Drivers: hv: vmbus: prefer 'die' notification chain to 'panic'
a28d64c Drivers: hv: vmbus: add special crash handler
5990a10 Drivers: hv: don't do hypercalls when hypercall_page is NULL
7d8e182 Drivers: hv: vmbus: add special kexec handler
c3e9e25 kexec: define kexec_in_progress in !CONFIG_KEXEC case
e051d15 Drivers: hv: vmbus: Implement the protocol for tearing down vmbus state
7f0f4f3 Drivers: hv: vmbus: unregister panic notifier on module unload
2f940da hv: run non-blocking message handlers in the dispatch tasklet
ce30250 Drivers: hv: vmbus: Add support for VMBus panic notifier handler
9ed3a0e Drivers: hv: vmbus: Teardown clockevent devices on module unload
25a696e clockevents: export clockevents_unbind_device instead of clockevents_unbind
891b295 drivers: hv: vmbus: Teardown synthetic interrupt controllers on module unload
3efe049 Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown
45a0fc8 Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors

The following commits did not cherry-pick cleanly and needed backporting:
e513229, 96c1d05, 652594c, 2db84ef, 2517281 and 510f7aef

The subjects listed above are the same as in mainline, but the SHA1s are different, since they are from my local repo. Just let me know if you need me to post the original SHA1s and I will.

Brad Figg (brad-figg)
Changed in linux (Ubuntu Vivid):
status: In Progress → Fix Committed
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-vivid' to 'verification-done-vivid'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-vivid
Joseph Salisbury (jsalisbury) wrote :

Hi Chris,

Can you verify the kernel in -proposed as requested in #33? That way the fix can move into -updates.

Thanks in advance!

Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Joshua R. Poulson (jrp) wrote :

We're looking at it.

Paula Crismaru (pcrismaru) wrote :

Hello!

I installed the kernel in -proposed (kernel version 3.19.0-49-generic), triggered a kernel panic and it worked as expected. The crashkernel memory sizes I used are 512M and 384M.

tags: added: verification-done-vivid
removed: verification-needed-vivid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.19.0-49.55

---------------
linux (3.19.0-49.55) vivid; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1536775

  [ Colin Ian King ]

  * SAUCE: (no-up) ACPI / tables: Add acpi_force_32bit_fadt_addr option to
    force 32 bit FADT addresses
    - LP: #1529381

  [ Tim Gardner ]

  * [Config] Add DRM ast driver to udeb installer image
    - LP: #1514711
  * SAUCE: (no-up) Revert "[SCSI] libiscsi: Reduce locking contention in
    fast path"
    - LP: #1517142

  [ Upstream Kernel Changes ]

  * powerpc/eeh: Fix recursive fenced PHB on Broadcom shiner adapter
    - LP: #1532942
  * Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors
    - LP: #1440103
  * Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and
    vmbus_connection pages on shutdown
    - LP: #1440103
  * drivers: hv: vmbus: Teardown synthetic interrupt controllers on module
    unload
    - LP: #1440103
  * clockevents: export clockevents_unbind_device instead of
    clockevents_unbind
    - LP: #1440103
  * Drivers: hv: vmbus: Teardown clockevent devices on module unload
    - LP: #1440103
  * Drivers: hv: vmbus: Add support for VMBus panic notifier handler
    - LP: #1440103
  * hv: run non-blocking message handlers in the dispatch tasklet
    - LP: #1440103
  * Drivers: hv: vmbus: unregister panic notifier on module unload
    - LP: #1440103
  * Drivers: hv: vmbus: Implement the protocol for tearing down vmbus state
    - LP: #1440103
  * kexec: define kexec_in_progress in !CONFIG_KEXEC case
    - LP: #1440103
  * Drivers: hv: vmbus: add special kexec handler
    - LP: #1440103
  * Drivers: hv: don't do hypercalls when hypercall_page is NULL
    - LP: #1440103
  * Drivers: hv: vmbus: add special crash handler
    - LP: #1440103
  * Drivers: hv: vmbus: prefer 'die' notification chain to 'panic'
    - LP: #1440103
  * hyperv: Implement netvsc_get_channels() ethool op
    - LP: #1494423
  * hv_netvsc: Properly size the vrss queues
    - LP: #1494423
  * hv_netvsc: Allocate the sendbuf in a NUMA aware way
    - LP: #1494423
  * hv_netvsc: Allocate the receive buffer from the correct NUMA node
    - LP: #1494423
  * Drivers: hv: vmbus: Implement NUMA aware CPU affinity for channels
    - LP: #1494423
  * Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion
    - LP: #1494423
  * Drivers: hv: vmbus: Improve the CPU affiliation for channels
    - LP: #1494423
  * Drivers: hv: vmbus: Further improve CPU affiliation logic
    - LP: #1494423

linux (3.19.0-48.54) vivid; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1536124
  * Merged back Ubuntu-3.19.0-46.52

 -- Brad Figg <email address hidden> Thu, 21 Jan 2016 12:29:48 -0800

Changed in linux (Ubuntu Vivid):
status: Fix Committed → Fix Released
status: Fix Committed → Fix Released
Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released