kernel linux-image-4.15.0-44 not booting on Hyperv Server 2008R2

Bug #1814069 reported by Rene Koeldorfer on 2019-01-31
This bug affects 19 people
Affects: linux (Ubuntu) | Importance: Undecided | Assigned to: Unassigned
Affects: Bionic | Importance: Undecided | Assigned to: Kai-Heng Feng

Bug Description

=== SRU Justification ===

[Impact]
NULL pointer dereference in netvsc_probe(). The hv_netvsc module is included
in the initramfs, so this blocks the boot process.

When Hyper-V supports only a single channel, rndis_filter_device_add()
bails out early and jumps to the label "out". The code that follows calls
rndis_filter_device_remove() and returns ERR_PTR(ret), where ret is
0 (success). As a result, the return value passes the IS_ERR(nvdev) check in
netvsc_probe() and causes a NULL pointer dereference, because nvdev is now NULL:

...
        if (nvdev->num_chn > 1)
                schedule_work(&nvdev->subchan_work);

[Fix]
Correctly return net_device at the end of rndis_filter_device_add().
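
The failure mode described above can be reproduced in userspace. The following is a minimal sketch, not the upstream patch: ERR_PTR(), PTR_ERR() and IS_ERR() are stand-ins modelled on the kernel helpers, while the stripped-down struct and the functions device_add_buggy() and device_add_fixed() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-ins modelled on the kernel's include/linux/err.h. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Only the top MAX_ERRNO addresses count as encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, stripped-down device state; not the real kernel struct. */
struct netvsc_device {
	unsigned int num_chn;
};

/* Buggy shape: the single-channel early exit shares the error path. */
static inline struct netvsc_device *
device_add_buggy(struct netvsc_device *dev, int ret)
{
	if (ret)
		goto out;	/* genuine failure, ret is negative */
	if (dev->num_chn == 1)
		goto out;	/* not a failure, but ret is still 0 */
	return dev;
out:
	return ERR_PTR(ret);	/* ERR_PTR(0) is NULL */
}

/* Fixed shape: every successful path returns the device itself. */
static inline struct netvsc_device *
device_add_fixed(struct netvsc_device *dev, int ret)
{
	if (ret)
		return ERR_PTR(ret);
	return dev;
}
```

With num_chn set to 1 and ret equal to 0, device_add_buggy() returns NULL, IS_ERR() reports no error, and a later read of nvdev->num_chn dereferences NULL. device_add_fixed() returns the device in that case and still returns an encoded error for a negative ret.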

[Test]
Users report positive results.

[Regression Potential]
Low. The change is trivial, and the patches have been upstream for some time.

=== Original Bug Report ===

Ubuntu gets stuck while booting on Hyper-V Server 2008 R2.
From the kernel messages, it seems the RAM image is loaded and then the boot hangs.
It seems to be a problem with the Hyper-V drivers, probably the hard drive.
I reverted to the previous kernel.

Description: Ubuntu 18.04.1 LTS
Release: 18.04
---
ProblemType: Bug
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jan 31 08:52 seq
 crw-rw---- 1 root audio 116, 33 Jan 31 08:52 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.5
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 18.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: Microsoft Corporation Virtual Machine
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 hyperv_fb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-43-generic root=UUID=86036ccb-bc11-11e8-93c9-00155dfd7535 ro maybe-ubiquity
ProcVersionSignature: Ubuntu 4.15.0-43.46-generic 4.15.18
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-43-generic N/A
 linux-backports-modules-4.15.0-43-generic N/A
 linux-firmware 1.173.3
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
Tags: bionic uec-images
Uname: Linux 4.15.0-43-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 03/19/2009
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 090004
dmi.board.name: Virtual Machine
dmi.board.vendor: Microsoft Corporation
dmi.board.version: 7.0
dmi.chassis.asset.tag: 8531-7125-9206-2460-7819-2663-90
dmi.chassis.type: 3
dmi.chassis.vendor: Microsoft Corporation
dmi.chassis.version: 7.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr090004:bd03/19/2009:svnMicrosoftCorporation:pnVirtualMachine:pvr7.0:rvnMicrosoftCorporation:rnVirtualMachine:rvr7.0:cvnMicrosoftCorporation:ct3:cvr7.0:
dmi.product.name: Virtual Machine
dmi.product.version: 7.0
dmi.sys.vendor: Microsoft Corporation

Rene Koeldorfer (koeli75) wrote :

Kernel Version linux-image-4.15.0-44

affects: linux-meta (Ubuntu) → linux (Ubuntu)

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1814069

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic

apport information

tags: added: apport-collected uec-images
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Graham Bloice (graham-bloice) wrote :

Kernel 4.15.0-45, guest on Server 2012 R2 Hyper-V Host also affected.
Last working kernel 4.15.0-43.

Chunk (oubliette) wrote :

Confirming for both 4.15.0-44 and 4.15.0-45 on Windows Server 2012 Hyper-V. For now I've forced GRUB to boot 4.15.0-43.

The guest is Ubuntu Server 18.04.1. FWIW, the VM has 2 virtual cores and dynamically-allocated memory of up to 6144 MB.

I'm only able to view the boot process in Hyper-V's "Virtual Machine Connection" console, which has no scroll buffer. When boot failure occurs, I can see the last few lines of an error trace in the console, but I've been unable to capture the full trace. No information regarding the boot ends up in the system logs--i.e., journalctl doesn't recognize the failed boot attempts at all.

Changed in linux (Ubuntu):
status: Confirmed → New
status: New → Confirmed
Graham Bloice (graham-bloice) wrote :

I'm unable to reboot the faulty server to try to obtain any information on the crash, but creating a brand new instance of 18.04 LTS on Hyper-V 2012 R2 boots quite happily on 4.0.15-45.

I've also noticed that I haven't installed the azure tuned kernel package linux-azure.

Graham Bloice (graham-bloice) wrote :

Comment #14 refers to an incorrect kernel version, it should be 4.15.0-45, sorry for the confusion.

Mark Kovach (squidly70) wrote :

Can confirm

Confirming for both 4.15.0-44 and 4.15.0-45 on Windows Server 2012 Hyper-V. For now I've forced GRUB to boot 4.15.0-43.

Marcelo Cerri (mhcerri) wrote :

Does the problem happen in a gen1 or gen2 VM?

Chunk (oubliette) wrote :

The original report would have to be gen 1 since Rene is using Win Server 2008.

My failure is also with gen 1. (I'm using 2012, not 2012 R2, so gen 2 isn't an option for me.)

Since Graham is on 2012 R2, perhaps he can confirm whether his VMs (both the failed one and the newly-created working one) are gen 1 or 2.

Ravi (codesimian) wrote :

I can also confirm that kernels 4.15.0-44-generic and 4.15.0-45-generic don't boot on MS Hyper-V 2012 R2. Older kernels 4.15.0-29-generic and 4.15.0-43-generic are OK.

Hypervisor: MS Hyper-V 2012 R2
Virtual machine details:
* Generation: 1
* Number of logical processors : 4
* Memory: Min: 2014 MB and Max 2048 MB (I have also tested with static memory)

As commented by others it is not possible to scroll back on the Hyper-V’s virtual console nor capture text from the virtual console.

Graham Bloice (graham-bloice) wrote :

Poor reporting on my part, the original failure (#12) was on 2012, so is a Gen 1 VM. The subsequent test (#14) was on 2012 R2 and was using a Gen2 VM.

Sorry for the incorrect reports, I'll try to test on the original host with a new instance.

Graham Bloice (graham-bloice) wrote :

I installed a new instance on the original Hyper-V server, so 2012 with Gen 1, 2 CPU, 1GB RAM. I managed a partial screen image of the stack dump on boot.

Marcelo Cerri (mhcerri) wrote :

I don't have a Win Server 2012 machine available right now, but I have a Win10 Pro machine with Hyper-V and I was able to boot 4.15.0-45 on a gen1 VM.

Do you think there's any other specific VM configuration that you are using? I will try later with gen2.

Graham Bloice (graham-bloice) wrote :

No special config, used the Hyper-V wizard to create the VM, 2 CPU, 1024 MB RAM, 127GB HD, booting off 18.04.1 ISO.

Mark Kovach (squidly70) wrote :

Gen 1 for me also, I run on Server 2012.

Henrik Sozzi (energywave) wrote :

Windows Hyper-v Server 2012, Gen1 Hyper-v, same problem. Cannot boot 4.15.0-44 and 4.15.0-45 kernels. The last that works for me too is 4.15.0-43.
Just rebooted after updating Ubuntu server 18.04 and discovered that problem.

Alexander Sagen (kit-alex) wrote :

Having the same problem on Windows Server 2012 (build 9200) Hyper-V with linux kernel versions 4.15.0-44 and 4.15.0-45. Last known working kernel version 4.15.0-43. Attached is a screenshot of kernel panic stack trace.

ToXinE (toxine) wrote :

Having the same problem on Windows Server 2008 R2 Standard Hyper-V with linux kernel versions 4.15.0-44 and 4.15.0-45 and 4.15.0-46

Mathias (frozen1900) wrote :

The problem still exists after updating to 4.15.0-46.

latest working kernel for me is 4.15.0-43

hyper-v server 2008r2

Ravi (codesimian) wrote :

I can confirm that the HWE kernel based on Linux kernel 4.18.x works fine on Hyper-V 2012 R2 (Gen 1) virtual machines.

I discovered this when I tried to install the 18.04.2 ISO and it had the same problem, but when I tried to install 18.04.2 with HWE kernel it worked.

I have now installed the kernel provided by the package 'linux-virtual-hwe-18.04' on our virtual machines. This has allowed us to receive security updates on the virtual machines affected by this bug.

Igor (gorjan19) wrote :

Having the same problem on Windows Server 2012 Standard , Hyper-V (6.2.9200) with linux kernel versions 4.15.0-44 and 4.15.0-45 and 4.15.0-46. Working kernel version 4.15.0-43

Mark Kovach (squidly70) wrote :

I have the same problem on Windows Server 2012 Standard Hyper-V with linux kernel versions 4.15.0-44 and 4.15.0-45 and 4.15.0-46

Kai-Heng Feng (kaihengfeng) wrote :

Probably caused by LP: #1807757.

Per Bengtsson (hofster) wrote :

I tried that kernel but it didn't seem to make any difference.
Version 4.15.0-46 from the repository was already installed so I had to uninstall that one first.

kaktus rav (x.kaktus) wrote :

I have tested all versions from 4.15.0-44 to 4.15.0-46; the problem persists on Hyper-V on Windows Server 2012 through 2016.

kaktus rav (x.kaktus) wrote :

I tested this kernel, 4.15.0-46.49 (https://people.canonical.com/~khfeng/lp1814069/).

I have the same problem on Hyper-V on Windows Server 2012.

The only working kernel version is 4.15.0-43.

Kai-Heng Feng (kaihengfeng) wrote :

Is it possible to collect dmesg?
Boot the latest Ubuntu kernel with the kernel parameters "blacklist=hv_netvsc modprobe.blacklist=hv_netvsc", then run `sudo modprobe hv_netvsc`.

Attach dmesg afterward.

Per Bengtsson (hofster) wrote :

Here's a log from a boot using your kernel at https://people.canonical.com/~khfeng/lp1814069/
The kernel parameters are as you requested, and I ran modprobe hv_netvsc after boot.

I'm not sure exactly how to retrieve dmesg properly nowadays but I used journalctl -k -b -2 to retrieve the kernel messages for the boot we're interested in. If you need more detailed info give me a hint on how to retrieve it.

Kai-Heng Feng (kaihengfeng) wrote :

Per Bengtsson,

Is it possible to remove the kernel I built, and use the one in ubuntu archive?
Then please use dmesg instead of journalctl, thanks!

Per Bengtsson (hofster) wrote :

Attached is the dmesg output from a modprobe using the ordinary kernel 4.15.0-46

Kai-Heng Feng (kaihengfeng) wrote :

Please test this kernel:
https://people.canonical.com/~khfeng/lp1814069-2/

Cherry-picked two additional commits:
916c5e1413be058d1c1f6e502db350df890730ce
b19b46346f483ae055fa027cb2d5c2ca91484b91

Per Bengtsson (hofster) wrote :

https://people.canonical.com/~khfeng/lp1814069-2/ is working for me. I haven't done any extensive testing but the server can boot and has network access.
Attaching dmesg output from successful boot if that's relevant

BR Per

kaktus rav (x.kaktus) wrote :

https://people.canonical.com/~khfeng/lp1814069-2/ is working for me.
My server boots and has network access.

description: updated
tags: added: kernel-hyper-v
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Bionic):
status: New → Confirmed
Mark Kovach (squidly70) wrote :

When can we expect an update from the repository for this updated kernel? Just curious. Thanks

Hi,

For the current schedule of planned kernel stable updates please check:
https://kernel.ubuntu.com/

Thanks

Changed in linux (Ubuntu Bionic):
status: Confirmed → In Progress
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Mark Kovach (squidly70) wrote :

So if this was committed 3-28 we can expect an update on May 13th? Is that correct?

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Mark Kovach (squidly70) wrote :

Installed linux-image-4.15.0-48-generic/bionic-proposed

linux-image-4.15.0-48-generic/bionic-proposed,now 4.15.0-48.51 amd64 [installed]

uname -a
Linux istest-d01 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

System booted ok, I see in the changelog

* kernel linux-image-4.15.0-44 not booting on Hyperv Server 2008R2
    (LP: #1814069)
    - hv/netvsc: fix handling of fallback to single queue mode
    - hv/netvsc: Fix NULL dereference at single queue mode fallback

I'm satisfied, anyone else test this?

Mark Kovach (squidly70) on 2019-04-05
tags: added: verification-done-bionic
removed: verification-needed-bionic
Per Bengtsson (hofster) wrote :

I have also installed linux-image-4.15.0-48-generic/bionic-proposed and it's working fine for me as well.

rgt (robertgt4) wrote :

Works for me too

uname -a
Linux LUbuntu2 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Po-Hsu Lin (cypressyew) wrote :

Patchset has already landed in Cosmic / Disco, marking the linux package as fix released.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
assignee: nobody → Kai-Heng Feng (kaihengfeng)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Kai-Heng Feng (kaihengfeng)
Changed in linux (Ubuntu):
assignee: Kai-Heng Feng (kaihengfeng) → nobody