Azure kernels fail to boot on some large Azure instance types

Bug #1940564 reported by Tim Gardner
Affects                      Status        Importance  Assigned to  Milestone
linux (Ubuntu)               In Progress   Undecided   Unassigned
  Bionic                     Fix Released  Medium      Tim Gardner
  Focal                      In Progress   Undecided   Unassigned
  Hirsute                    Won't Fix     Undecided   Unassigned
  Impish                     Won't Fix     Undecided   Unassigned
linux-azure (Ubuntu)         Fix Released  Undecided   Unassigned
  Bionic                     Invalid       Undecided   Unassigned
  Focal                      Fix Released  Undecided   Tim Gardner
  Hirsute                    Fix Released  Undecided   Tim Gardner
  Impish                     Fix Released  Undecided   Unassigned
linux-azure-5.11 (Ubuntu)    Invalid       Undecided   Unassigned
  Bionic                     Invalid       Undecided   Unassigned
  Focal                      Fix Released  Undecided   Tim Gardner
  Hirsute                    Invalid       Undecided   Unassigned
  Impish                     Invalid       Undecided   Unassigned
linux-azure-5.4 (Ubuntu)     Invalid       Undecided   Unassigned
  Bionic                     Fix Released  Undecided   Tim Gardner
  Focal                      Invalid       Undecided   Unassigned
  Hirsute                    Invalid       Undecided   Unassigned
  Impish                     Invalid       Undecided   Unassigned

Bug Description

[Impact]

Azure kernels fail to boot on large Azure instance types.

[Fix]

Revert commit 10b9a1068ba301b6458a308c749eb0b93951010c ("scsi: core: Cap scsi_host cmd_per_lun at can_queue"), which came in with "UBUNTU: upstream stable to v4.14.240, v4.19.198".

This commit originates from an unreleased kernel (5.14-rc1). According
to its commit log, it is intended to head off possible future problems
and does not appear to address a specific bug. It has been backported
to 4.14.y and 4.19.y.
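
As a rough illustration of what the reverted commit's title describes, capping cmd_per_lun at can_queue amounts to taking the smaller of the two values. The sketch below is plain Python, not kernel code; the function name and the sample values are hypothetical and chosen only to show the behaviour.

```python
def cap_cmd_per_lun(cmd_per_lun, can_queue):
    """Return cmd_per_lun limited so that it never exceeds can_queue.

    Illustration only: the name and arguments mirror the commit title,
    not the actual kernel implementation.
    """
    return min(cmd_per_lun, can_queue)


# Hypothetical values, chosen only to show the two cases.
print(cap_cmd_per_lun(64, 32))  # capped: prints 32
print(cap_cmd_per_lun(16, 32))  # already within the limit: prints 16
```

With the commit reverted, no such cap is applied and cmd_per_lun stays at whatever value the driver set.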

[Test Plan]

Install and boot on an Azure Standard_D48_v3 instance.
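
Once the instance boots, one way to confirm it is running a fixed kernel is to compare the kernel release string against the first fixed version. The sketch below assumes the bionic generic kernel, where this report lists 4.15.0-156.163 as the fixed package; the helper names `kernel_abi` and `has_fix` are hypothetical, and the comparison is only meaningful within a single kernel series.

```python
import re


def kernel_abi(release):
    """Parse a release such as '4.15.0-156-generic' into (4, 15, 0, 156)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(release, fixed="4.15.0-156"):
    """Return True if `release` is at or after the first fixed ABI.

    The default `fixed` value is taken from the bionic changelog in this
    report; pass a different value for other kernel series.
    """
    return kernel_abi(release) >= kernel_abi(fixed)


print(has_fix("4.15.0-156-generic"))  # True
print(has_fix("4.15.0-155-generic"))  # False
```

On a live system the release string can be obtained with `uname -r`, or from Python with `platform.release()`.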

[Where problems could occur]

iSCSI requests could hang.

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Bionic):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Tim Gardner (timg-tpi)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1940564

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Tim Gardner (timg-tpi)
description: updated
Terry Rudd (terrykrudd)
summary: - linux
+ bionic linux Ubuntu-4.15.0-155.162 fails to boot on Azure
+ Standard_D48_v3
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Re: bionic linux Ubuntu-4.15.0-155.162 fails to boot on Azure Standard_D48_v3

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Tim Gardner (timg-tpi) wrote :

Booted in a Standard_D48_v3 instance:

uname -a
Linux selfprovisioned-rtg-bionic 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Tagging verification-done-bionic

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-156.163

---------------
linux (4.15.0-156.163) bionic; urgency=medium

  * bionic/linux: 4.15.0-156.163 -proposed tracker (LP: #1940162)

  * linux (LP: #1940564)
    - SAUCE: Revert "scsi: core: Cap scsi_host cmd_per_lun at can_queue"

  * fails to launch linux L2 guests on AMD (LP: #1940134) // CVE-2021-3653
    - KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl
      (CVE-2021-3653)

  * fails to launch linux L2 guests on AMD (LP: #1940134)
    - SAUCE: Revert "UBUNTU: SAUCE: KVM: nSVM: avoid picking up unsupported bits
      from L2 in int_ctl"

linux (4.15.0-155.162) bionic; urgency=medium

  * bionic/linux: 4.15.0-155.162 -proposed tracker (LP: #1939833)

  * Packaging resync (LP: #1786013)
    - debian/dkms-versions -- update from kernel-versions (main/2021.08.16)

  * CVE-2021-3656
    - SAUCE: KVM: nSVM: always intercept VMLOAD/VMSAVE when nested

  * CVE-2021-3653
    - SAUCE: KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl

  * dev_forward_skb: do not scrub skb mark within the same name space
    (LP: #1935040)
    - dev_forward_skb: do not scrub skb mark within the same name space

  * 'ptrace trace' needed to readlink() /proc/*/ns/* files on older kernels
    (LP: #1890848)
    - apparmor: fix ptrace read check

  * Bionic update: upstream stable patchset 2021-08-03 (LP: #1938824)
    - ALSA: usb-audio: fix rate on Ozone Z90 USB headset
    - media: dvb-usb: fix wrong definition
    - Input: usbtouchscreen - fix control-request directions
    - net: can: ems_usb: fix use-after-free in ems_usb_disconnect()
    - usb: gadget: eem: fix echo command packet response issue
    - USB: cdc-acm: blacklist Heimann USB Appset device
    - ntfs: fix validity check for file name attribute
    - iov_iter_fault_in_readable() should do nothing in xarray case
    - Input: joydev - prevent use of not validated data in JSIOCSBTNMAP ioctl
    - ARM: dts: at91: sama5d4: fix pinctrl muxing
    - btrfs: send: fix invalid path for unlink operations after parent
      orphanization
    - btrfs: clear defrag status of a root if starting transaction fails
    - ext4: cleanup in-core orphan list if ext4_truncate() failed to get a
      transaction handle
    - ext4: fix kernel infoleak via ext4_extent_header
    - ext4: correct the cache_nr in tracepoint ext4_es_shrink_exit
    - ext4: remove check for zero nr_to_scan in ext4_es_scan()
    - ext4: fix avefreec in find_group_orlov
    - ext4: use ext4_grp_locked_error in mb_find_extent
    - can: gw: synchronize rcu operations before removing gw job entry
    - can: peak_pciefd: pucan_handle_status(): fix a potential starvation issue in
      TX path
    - SUNRPC: Fix the batch tasks count wraparound.
    - SUNRPC: Should wake up the privileged task firstly.
    - s390/cio: dont call css_wait_for_slow_path() inside a lock
    - rtc: stm32: Fix unbalanced clk_disable_unprepare() on probe error path
    - iio: ltr501: mark register holding upper 8 bits of ALS_DATA{0,1} and PS_DATA
      as volatile, too
    - iio: ltr501: ltr559: fix initialization of LTR501_ALS_CONTR
    - iio: ltr501: ltr501_read_ps(): add missing endianness con...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Tim Gardner (timg-tpi) wrote :

This appears to be an issue with all instance types with large numbers of CPUs. focal/linux-azure fails on Standard_M416s_v2.

Changed in linux-azure (Ubuntu Bionic):
status: New → Invalid
Changed in linux (Ubuntu Focal):
status: New → In Progress
Changed in linux (Ubuntu Hirsute):
status: New → In Progress
Changed in linux (Ubuntu Impish):
status: Incomplete → In Progress
Changed in linux-azure (Ubuntu Focal):
status: New → In Progress
Changed in linux-azure (Ubuntu Hirsute):
status: New → In Progress
Changed in linux-azure (Ubuntu Impish):
status: New → In Progress
Tim Gardner (timg-tpi)
summary: - bionic linux Ubuntu-4.15.0-155.162 fails to boot on Azure
- Standard_D48_v3
+ bionic linux Ubuntu-4.15.0-155.162 fails to boot on some large Azure
+ instance types
summary: - bionic linux Ubuntu-4.15.0-155.162 fails to boot on some large Azure
- instance types
+ linux-azure fails to boot on some large Azure instance types
description: updated
summary: - linux-azure fails to boot on some large Azure instance types
+ Azure kernels fail to boot on some large Azure instance types
Tim Gardner (timg-tpi)
Changed in linux-azure (Ubuntu Focal):
assignee: nobody → Tim Gardner (timg-tpi)
status: In Progress → Fix Committed
Tim Gardner (timg-tpi)
Changed in linux-azure-5.11 (Ubuntu Bionic):
status: New → Invalid
Changed in linux-azure-5.11 (Ubuntu Focal):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux-azure-5.11 (Ubuntu Hirsute):
status: New → Invalid
Changed in linux-azure-5.11 (Ubuntu Impish):
status: New → Invalid
Changed in linux-azure-5.4 (Ubuntu Bionic):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux-azure-5.4 (Ubuntu Focal):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux-azure-5.4 (Ubuntu Hirsute):
status: New → Invalid
Changed in linux-azure-5.4 (Ubuntu Impish):
status: New → Invalid
Tim Gardner (timg-tpi)
Changed in linux-azure-5.4 (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux-azure-5.4 (Ubuntu Focal):
assignee: Tim Gardner (timg-tpi) → nobody
status: In Progress → Invalid
Tim Gardner (timg-tpi)
Changed in linux-azure-5.11 (Ubuntu Focal):
status: In Progress → Fix Committed
Changed in linux-azure (Ubuntu Hirsute):
status: In Progress → Fix Committed
assignee: nobody → Tim Gardner (timg-tpi)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure - 5.13.0-1006.7

---------------
linux-azure (5.13.0-1006.7) impish; urgency=medium

  * impish/linux-azure: 5.13.0-1006.7 -proposed tracker (LP: #1946329)

  * Azure kernels fail to boot on some large Azure instance types (LP: #1940564)
    - Revert "scsi: core: Cap scsi_host cmd_per_lun at can_queue"

  * impish:linux 5.13 panic during systemd autotest (LP: #1946001)
    - azure: [Config] disable KFENCE

  [ Ubuntu: 5.13.0-19.19 ]

  * impish/linux: 5.13.0-19.19 -proposed tracker (LP: #1946337)
  * impish:linux-aws 5.13 panic during systemd autotest (LP: #1946001)
    - [Config] disable KFENCE

  [ Ubuntu: 5.13.0-18.18 ]

  * impish/linux: 5.13.0-18.18 -proposed tracker (LP: #1945995)
  * [21.10 FEAT] KVM: Use interpretation of specification exceptions
    (LP: #1932157)
    - KVM: s390: Enable specification exception interpretation

 -- Andrea Righi <email address hidden> Fri, 08 Oct 2021 16:13:17 +0200

Changed in linux-azure (Ubuntu Impish):
status: In Progress → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure-5.4 - 5.4.0-1061.64~18.04.1

---------------
linux-azure-5.4 (5.4.0-1061.64~18.04.1) bionic; urgency=medium

  * bionic/linux-azure-5.4: 5.4.0-1061.64~18.04.1 -proposed tracker
    (LP: #1946380)

  [ Ubuntu: 5.4.0-1061.64 ]

  * focal/linux-azure: 5.4.0-1061.64 -proposed tracker (LP: #1946377)
  * Azure kernels fail to boot on some large Azure instance types (LP: #1940564)
    - Revert "scsi: core: Cap scsi_host cmd_per_lun at can_queue"

 -- Marcelo Henrique Cerri <email address hidden> Thu, 07 Oct 2021 17:30:59 -0300

Changed in linux-azure-5.4 (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure - 5.4.0-1061.64

---------------
linux-azure (5.4.0-1061.64) focal; urgency=medium

  * focal/linux-azure: 5.4.0-1061.64 -proposed tracker (LP: #1946377)

  * Azure kernels fail to boot on some large Azure instance types (LP: #1940564)
    - Revert "scsi: core: Cap scsi_host cmd_per_lun at can_queue"

 -- Tim Gardner <email address hidden> Thu, 07 Oct 2021 09:11:51 -0600

Changed in linux-azure (Ubuntu Focal):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-azure/5.4.0-1062.65 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Tim Gardner (timg-tpi) wrote :

Microsoft tested focal:linux-azure. Tagging verification-done-focal

tags: added: verification-done-focal
removed: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure-5.11 - 5.11.0-1019.20~20.04.1

---------------
linux-azure-5.11 (5.11.0-1019.20~20.04.1) focal; urgency=medium

  * focal/linux-azure-5.11: 5.11.0-1019.20~20.04.1 -proposed tracker
    (LP: #1946512)

  [ Ubuntu: 5.11.0-1019.20 ]

  * hirsute/linux-azure: 5.11.0-1019.20 -proposed tracker (LP: #1946505)
  * Azure kernels fail to boot on some large Azure instance types (LP: #1940564)
    - Revert "scsi: core: Cap scsi_host cmd_per_lun at can_queue"

 -- Marcelo Henrique Cerri <email address hidden> Fri, 08 Oct 2021 19:08:35 -0300

Changed in linux-azure-5.11 (Ubuntu Focal):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure - 5.11.0-1019.20

---------------
linux-azure (5.11.0-1019.20) hirsute; urgency=medium

  * hirsute/linux-azure: 5.11.0-1019.20 -proposed tracker (LP: #1946505)

  * Azure kernels fail to boot on some large Azure instance types (LP: #1940564)
    - Revert "scsi: core: Cap scsi_host cmd_per_lun at can_queue"

 -- Tim Gardner <email address hidden> Fri, 08 Oct 2021 10:35:03 -0600

Changed in linux-azure (Ubuntu Hirsute):
status: Fix Committed → Fix Released
Brian Murray (brian-murray) wrote :

The Hirsute Hippo has reached End of Life, so this bug will not be fixed for that release.

Changed in linux (Ubuntu Hirsute):
status: In Progress → Won't Fix
Brian Murray (brian-murray) wrote :

Ubuntu 21.10 (Impish Indri) has reached end of life, so this bug will not be fixed for that specific release.

Changed in linux (Ubuntu Impish):
status: In Progress → Won't Fix