Freezing on boot since kernel 4.15.0-72-generic release
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | | Undecided | Unassigned |
Bionic | | High | You-Sheng Yang |
linux-oem (Ubuntu) | | Undecided | Unassigned |
Bionic | | Undecided | You-Sheng Yang |
Bug Description
[SRU Justification]
[Impact]
In bug 1840239, HPET was disabled on some systems because it caused the
TSC to be marked unstable when it was not. This caused a regression, bug
1851216, in which some systems may hang early in boot, so a fix was
cherry-picked back from v5.3-rc1. However, that fix introduced yet another
regression: some other systems may hang at boot because the previous fix
skips PIT setup.
[Fix]
Commit 979923871f69 ("x86/timer: Don't skip PIT setup when APIC is
disabled or in legacy mode") from v5.6-rc1, also backported to v5.4.19
and v5.5.3, fixes PIT setup in this case.
[Test Case]
Boot a patched kernel on an affected system; it should not hang.
[Regression Potential]
Low. Stable patch and trivial backport.
[Other Info]
The same fix for bug 1851216 was also backported to Disco and Eoan, but
both were later fixed by this commit 979923871f69, backported in bug
1866858 and bug 1867051, which pull v5.4 stable patches into Disco and
Eoan respectively, leaving B/OEM-B the only victims so far.
========== Original Bug Description ==========
After the update to install kernel 4.15.0-72-generic (a bit over a week ago) my computer will not boot. On boot, all I see is the purple screen with:
Loading Linux 4.15.0-72-generic ...
Loading initial ramdisk ...
and nothing happens. Just sits there. I've waited about 5-10 minutes on occasion but to no avail.
I've checked a number of logs in /var/log but not found anything.
If I go into the advanced options and select kernel 4.15.0-70-generic, the computer boots normally.
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-
ProcVersionSign
Uname: Linux 4.15.0-70-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/
CurrentDesktop: ubuntu:GNOME
Date: Sat Dec 14 21:53:14 2019
HibernationDevice: RESUME=
InstallationDate: Installed on 2018-09-16 (454 days ago)
InstallationMedia: Ubuntu 18.04.1 LTS "Bionic Beaver" - Release amd64 (20180725)
Lsusb:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 04f2:b59e Chicony Electronics Co., Ltd
Bus 001 Device 002: ID 046d:c063 Logitech, Inc. DELL Laser Mouse
Bus 001 Device 004: ID 8087:0aaa Intel Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: GIGABYTE Sabre 17WV8
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=
RelatedPackageV
linux-
linux-
linux-firmware 1.173.13
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 05/22/2018
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F05
dmi.board.
dmi.board.name: Sabre 17WV8
dmi.board.vendor: GIGABYTE
dmi.board.version: Not Applicable
dmi.chassis.
dmi.chassis.type: 10
dmi.chassis.vendor: GIGABYTE
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.family: Sabre
dmi.product.name: Sabre 17WV8
dmi.product.
dmi.sys.vendor: GIGABYTE
Anthony Buckley (tony-buckley) wrote : | #1 |
Changed in linux (Ubuntu): | |
status: | New → Confirmed |
You-Sheng Yang (vicamo) wrote : | #3 |
Hi, two questions.
1. do you have dmesg booting with a -72 kernel? The one attached is -70.
2. Could you try booting with an extra kernel parameter "hpet=disable"?
You-Sheng Yang (vicamo) wrote : | #4 |
For 2), I mean booting -70 with "hpet=disable".
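For anyone who wants a parameter such as "hpet=disable" to persist across reboots instead of typing it at the GRUB menu each time, a minimal sketch follows. It assumes the usual Ubuntu layout, where /etc/default/grub contains a GRUB_CMDLINE_LINUX_DEFAULT="..." line; add_kernel_param is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: append PARAM to the GRUB_CMDLINE_LINUX_DEFAULT="..."
# line of FILE unless it is already there. Assumes GNU sed (for -i).
add_kernel_param() {
    file=$1
    param=$2
    # Nothing to do if the parameter is already on the default command line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"/\1 $param\"/" "$file"
}

# Example on a scratch copy, so the real configuration is left untouched:
#   cp /etc/default/grub /tmp/grub.test
#   add_kernel_param /tmp/grub.test hpet=disable
```

After editing the real file, "sudo update-grub" regenerates the boot menu, and "cat /proc/cmdline" after the next boot shows whether the parameter was actually passed to the kernel.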
Anthony Buckley (tony-buckley) wrote : | #5 |
Hello You-Sheng,
Thanks for responding.
I tried "hpet=disable" on a boot for both -70 and -72. No change. Kernel -72 still just stops, but -70 boots OK. As for the dmesg, I don't seem to be able to get any logging / messaging for -72. The only reference I can find for it is in the dpkg log when it was installed back on 4-Dec-2019. It's as if it does not even start. For what it's worth I've attached the dmesg for the -70 boot with "hpet=disable".
Anthony Buckley (tony-buckley) wrote : | #6 |
Hello,
Just updating to say that I tried the upstream kernel below but the problem is still present.
Upstream kernel:
5.5.0-050500rc1
Regards.
Tony
You-Sheng Yang (vicamo) wrote : | #7 |
Is it possible for you to perform kernel bisecting between -70 and -72?
* https:/
* https:/
See https:/
Anthony Buckley (tony-buckley) wrote : | #8 |
Hello You-Sheng,
Thanks for responding again.
Yes, I was just looking at doing a bisect. I've done it before for another problem. I just have to familiarise myself again with the procedures for it. Will attend to it soon.
Regards
Anthony Buckley (tony-buckley) wrote : | #9 |
Hello,
I have completed the bisect as requested and identified the problem commit. The bisect message is as follows:-
git bisect good
f723dd269d0740e
commit f723dd269d0740e
Author: Thomas Gleixner <email address hidden>
Date: Thu Nov 7 09:05:00 2019 +0100
x86/timer: Skip PIT initialization on modern chipsets
BugLink: https:/
Recent Intel chipsets including Skylake and ApolloLake have a special
ITSSPRC register which allows the 8254 PIT to be gated. When gated, the
8254 registers can still be programmed as normal, but there are no IRQ0
timer interrupts.
Some products such as the Connex L1430 and exone go Rugged E11 use this
register to ship with the PIT gated by default. This causes Linux to fail
to boot:
Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with
apic=debug and send a report.
The panic happens before the framebuffer is initialized, so to the user, it
appears as an early boot hang on a black screen.
Affected products typically have a BIOS option that can be used to enable
the 8254 and make Linux work (Chipset -> South Cluster Configuration ->
Miscellaneous Configuration -> 8254 Clock Gating), however it would be best
to make Linux support the no-8254 case.
Modern systems allow to discover the TSC and local APIC timer frequencies,
so the calibration against the PIT is not required. These systems have
always running timers and the local APIC timer works also in deep power
states.
So the setup of the PIT including the IO-APIC timer interrupt delivery
checks are a pointless exercise.
Skip the PIT setup and the IO-APIC timer interrupt checks on these systems,
which avoids the panic caused by non ticking PITs and also speeds up the
boot process.
Thanks to Daniel for providing the changelog, initial analysis of the
problem and testing against a variety of machines.
Reported-by: Daniel Drake <email address hidden>
Signed-off-by: Thomas Gleixner <email address hidden>
Tested-by: Daniel Drake <email address hidden>
Cc: <email address hidden>
Cc: <email address hidden>
Cc: <email address hidden>
Cc: <email address hidden>
Cc: <email address hidden>
Link: https://<email address hidden>
(backported from commit c8c4076723daca0
Signed-off-by: You-Sheng Yang <email address hidden>
Acked-by: Stefan Bader <email address hidden>
Acked-by: Connor Kuehl <email address hidden>
Signed-off-by: Stefan Bader <email address hidden>
:040000 040000 9c51f067713006f
Regards
Anthony Buckley (tony-buckley) wrote : | #10 |
Hello again, sorry I meant to add the git bisect log (FYI). See below:-
# bad: [48d6312566e04b
# good: [ad85666cf30fb9
git bisect start 'Ubuntu-
# good: [c76da031386f02
git bisect good c76da031386f02b
# bad: [4e160cf6dea3a8
git bisect bad 4e160cf6dea3a85
# good: [7adb99a811f00d
git bisect good 7adb99a811f00df
# good: [b04e95663b835e
git bisect good b04e95663b835ee
# bad: [4d4aa7f60b63e5
git bisect bad 4d4aa7f60b63e5b
# good: [069a7c0f0d8cba
git bisect good 069a7c0f0d8cba1
# bad: [46888ff0dc4368
git bisect bad 46888ff0dc43682
# bad: [f723dd269d0740
git bisect bad f723dd269d0740e
# good: [f25dc28338aa62
git bisect good f25dc28338aa627
# first bad commit: [f723dd269d0740
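For readers following along, the culprit can be pulled out of a bisect log like the one above with a short filter. This is only a sketch: first_bad_commit is a hypothetical helper name, and it relies on the "# first bad commit: [<hash>" line format that "git bisect log" writes, as seen in the log above.

```shell
# Hypothetical helper: print the hash that a `git bisect log` reports as
# the first bad commit. Reads the log on standard input.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]*\).*/\1/p'
}

# Example, run inside the repository that was bisected:
#   git bisect log | first_bad_commit
```

A log saved to a file can also be fed back to "git bisect replay" to reproduce the same bisection state on another checkout.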
You-Sheng Yang (vicamo) wrote : | #11 |
Hi Anthony,
As you may have found, this commit landed in bug 1851216 to address a possible system hang caused by bug 1840239. Since it is a solution backported from v5.3-rc1, and you also stated that the problem can be reproduced on v5.5-rc1, you may want to either 1) try a slightly newer v5.5-rc5 mainline kernel[1], or 2) file an upstream bug in the kernel bugzilla[2].
[1]: https:/
[2]: https:/
Anthony Buckley (tony-buckley) wrote : | #12 |
Hi You-Sheng,
Thanks for responding. Sadly, the problem is not fixed in v5.5-rc5 mainline kernel.
I'll look at lodging a bug in bugzilla.
Also I'll try putting some debug code around in that commit to see if I can identify anything.
Regards.
Tom Ivar Johansen (tijohansen) wrote : | #13 |
Hi,
I seem to have the same problem. I have no experience with linux kernels or bug reports, but I am a computer engineer so with guidance I will be able to contribute to debugging.
I am running "Ubuntu 4.15.0-
Both 4.15.0-72 and 4.15.0-74 failed as described by Anthony Buckley.
Anthony Buckley (tony-buckley) wrote : | #14 |
I have filed a bug in bugzilla. Hopefully it's been done OK as I've not done this before.
It is as follows:-
Bug 206125 - Freezing on boot since kernel 4.15.0-72-generic release
Regards.
You-Sheng Yang (vicamo) wrote : | #15 |
Anthony, you should state that this is still reproducible with v5.5-rc5, not just Ubuntu 4.15.0-72-generic.
Include bugzilla url for further reference: https:/
Anthony Buckley (tony-buckley) wrote : | #16 |
Thanks for your feedback, You-Sheng. I've added a comment to that effect now.
Regards.
Anthony Buckley (tony-buckley) wrote : | #17 |
I have tested a proposed patch by Thomas Gleixner (<email address hidden>) at both the identified commit and also at the latest version and in both cases my computer booted successfully.
Comment also posted in bugzilla.
https:/
Regards
Tom Ivar Johansen (tijohansen) wrote : | #18 |
I can confirm that I have applied the same patch to 4.15.0-
sirkku (sirkusmaisteri) wrote : | #19 |
It seems that I have the same issue with my HP ZBook that Ubuntu doesn't boot since kernel 4.15.0-72-generic release.
You-Sheng Yang (vicamo) wrote : | #20 |
Hi, for those who still suffer from this issue: it is a regression caused by commit f723dd269d07 ("x86/timer: Skip PIT initialization on modern chipsets"), which was backported to 4.15.0-71 in bug 1851216 as a fix for some other platforms. Please try the latest mainline kernel[1] if possible, as there might be yet another fix for this regression; we could then cherry-pick it back to 4.15 and fix the hardware platforms in both groups. As far as we know, this issue was still reproducible on v5.5-rc5, so you may want to try something newer than that directly.
Anthony Buckley (tony-buckley) wrote : | #21 |
Hello all,
Does anyone know if this bug will cause a problem upgrading to Ubuntu 20.04 LTS? I assume it will as we're stuck using an older kernel and 20.04 is based on kernel 5.4. Or, will it simply upgrade and keep us on the older kernel?
Regards.
You-Sheng Yang (vicamo) wrote : | #22 |
@Anthony, you can try focal kernel directly on your Bionic installation first.
$ printf "deb http://
$ sudo apt update
$ sudo apt install linux-modules-
And, it seems an upstream fix commit 979923871f69 ("x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode") has been backported to v5.4.19 and therefore focal kernel included that in bug 1863588 since at least 5.4.0-15. So it should be fine for you to use focal kernel now.
Or you need not bother installing kernels from Focal at all. Just use the 5.3 kernels from Bionic, as they should have the same fix since 5.3.0-46.
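The version thresholds in this comment can be turned into a quick check. The sketch below is illustrative only: kernel_has_pit_fix is a hypothetical helper, it covers just the two series named above (5.3 and 5.4, generic flavour), and it treats "since at least 5.4.0-15" as the minimum for 5.4, which may be conservative.

```shell
# Hypothetical helper: succeed if a generic-flavour kernel version is at or
# above the first version this comment names as carrying the fix.
# Returns 0 = fix expected, 1 = too old, 2 = series/flavour not covered here.
# Assumes GNU sort (for -V version ordering).
kernel_has_pit_fix() {
    case $1 in
        *-generic) ver=${1%-generic} ;;
        *-[a-z]*)  return 2 ;;          # other flavours use other numbering
        *)         ver=$1 ;;
    esac
    case $ver in
        5.3.0-*) min=5.3.0-46 ;;
        5.4.0-*) min=5.4.0-15 ;;
        *)       return 2 ;;
    esac
    # ver >= min exactly when min sorts first (or the two are equal).
    [ "$(printf '%s\n%s\n' "$ver" "$min" | sort -V | head -n 1)" = "$min" ]
}

# Example:
#   kernel_has_pit_fix "$(uname -r)" && echo "fix should be present"
```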
Changed in linux (Ubuntu Bionic): | |
status: | New → In Progress |
assignee: | nobody → You-Sheng Yang (vicamo) |
You-Sheng Yang (vicamo) wrote : | #23 |
Bug 1851216 backports commit c8c4076723da ("x86/timer: Skip PIT initialization on modern chipsets") to Bionic and Disco, which then has a follow-up commit 979923871f69 ("x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode") landed in Eoan and Focal and on, leaving Bionic the only victim suffering from this issue and not yet EOL-ed.
You-Sheng Yang (vicamo) wrote : | #24 |
Disco and OEM-OSP1-B have been fixed as well.
You-Sheng Yang (vicamo) wrote : | #25 |
PPA for testing https:/
Anthony Buckley (tony-buckley) wrote : | #26 |
Hello You-Sheng,
I've only just noticed this. Thanks for responding. I've been a bit busy lately, but I'll either try one of the bionic 5.3 kernels or try your ppa test hopefully soon.
Thanks. Regards.
You-Sheng Yang (vicamo) wrote : | #27 |
Please do try my ppa so that we can verify if it actually works and solve this problem for other Bionic users as well. Thank you.
Anthony Buckley (tony-buckley) wrote : | #28 |
OK. I must confess I'm a bit vague on ppa's. Does it clone the kernel source and then I do a build and test?
Anthony Buckley (tony-buckley) wrote : | #29 |
Hello You-Sheng,
(hope you are well by the way)
I've applied your ppa as described
sudo add-apt-repository ppa:vicamo/
sudo apt-get update
However I'm not sure what to do next. I assume it installed some packages, but how do I test. I tried a reboot, but I couldn't see how it would work as the latest kernel I have available is 4.15.0-99 and I understand the changes are in the 5... kernels.
Do I need to get or build a new kernel?
Anthony Buckley (tony-buckley) wrote : | #30 |
Hello again You-Sheng,
I think I get something. I think I understand what you mean by it would take several hours to publish built binaries. You want me to follow those links on the ppa page and do those git clones?
I'm doing that anyway just see what happens.
Regards.
Tony
Anthony Buckley (tony-buckley) wrote : | #31 |
Hello yet again You-Sheng,
OK, I'm missing something here unfortunately. I've done the first clone:-
git clone -b bug-1856387/fix-PIT-
but obviously the second will be a problem trying to clone into 'ubuntu-kernel'.
What is:-
git clone -b bug-1856387/fix-PIT-
Regards
You-Sheng Yang (vicamo) wrote : | #32 |
Sorry for the late reply. I'm not going to make you compile a kernel, at least for now. I promise.
So it's pretty simple. After you ran the following two commands:
$ sudo add-apt-repository ppa:vicamo/
$ sudo apt-get update
apt has rebuilt its database of installable packages, and all you need to do next is install one or more of the packages listed in the ppa "View package details" link[1]. Since you were on the 4.15 generic kernel, please try the following:
$ sudo apt install linux-modules-
linux-
Note the "=" sign. This will install all the prerequisite packages as well. Then reboot, select the 4.15.0-100-generic kernel from grub's menu, and see whether you now boot into the GUI, as expected long ago.
If you're also interested in trying the -oem kernels, use:
$ sudo apt install linux-image-
linux-
[1]: https:/
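After the reboot it is worth confirming that the test kernel, and not the older one, is actually running. A minimal sketch, where is_running_kernel is a hypothetical helper name and the version string is the one mentioned in this comment:

```shell
# Hypothetical helper: succeed if the running kernel release (uname -r)
# equals the expected string, e.g. the 4.15.0-100-generic test kernel.
is_running_kernel() {
    [ "$(uname -r)" = "$1" ]
}

# Example:
#   is_running_kernel 4.15.0-100-generic && echo "test kernel is running"
```

If the check fails, the machine most likely booted a different menu entry, so the result says nothing about the fix yet.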
Anthony Buckley (tony-buckley) wrote : | #33 |
Well done You-Sheng!
Tried both packages and they worked fine. Don't worry about the delay, I have more than enough to keep me occupied at the moment.
Thanks much for your efforts here. What happens next, just wait for 20.04.1 to release? Do I need to clean anything up or can I just leave as is for the moment?
Regards.
Tony
You-Sheng Yang (vicamo) wrote : | #34 |
@Anthony, thank you. As commented in #23, it should be safe to upgrade your system to Eoan/Focal or later if you like. I'll still do the follow-ups and send the fix to Bionic.
Changed in linux-oem (Ubuntu Bionic): | |
status: | New → In Progress |
You-Sheng Yang (vicamo) wrote : | #35 |
Changed in linux-oem (Ubuntu Bionic): | |
assignee: | nobody → You-Sheng Yang (vicamo) |
Changed in linux (Ubuntu): | |
status: | Confirmed → Invalid |
Changed in linux-oem (Ubuntu): | |
status: | New → Invalid |
Changed in linux (Ubuntu Bionic): | |
importance: | Undecided → High |
Anthony Buckley (tony-buckley) wrote : | #36 |
Thanks You-Sheng, I'll set aside some time shortly to upgrade.
Regards.
description: | updated |
Changed in linux (Ubuntu Bionic): | |
status: | In Progress → Fix Committed |
Changed in linux-oem (Ubuntu Bionic): | |
status: | In Progress → Fix Committed |
Launchpad Janitor (janitor) wrote : | #37 |
This bug was fixed in the package linux-oem - 4.15.0-1093.103
---------------
linux-oem (4.15.0-1093.103) bionic; urgency=medium
* bionic/linux-oem: 4.15.0-1093.103 -proposed tracker (LP: #1887026)
* [SRU] plug headset won't proper reconfig ouput to it on machine with default
output (LP: #1882248)
- SAUCE: ALSA: hda - let hs_mic be picked ahead of hp_mic
* Freezing on boot since kernel 4.15.0-72-generic release (LP: #1856387)
- x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode
[ Ubuntu: 4.15.0-112.113 ]
* bionic/linux: 4.15.0-112.113 -proposed tracker (LP: #1887048)
* Packaging resync (LP: #1786013)
- update dkms package versions
* CVE-2020-11935
- SAUCE: aufs: do not call i_readcount_inc()
- SAUCE: aufs: bugfix, IMA i_readcount
* CVE-2020-10757
- mm: Fix mremap not considering huge pmd devmap
* Update lockdown patches (LP: #1884159)
- efi/efi_test: Lock down /dev/efi_test and require CAP_SYS_ADMIN
- efi: Restrict efivar_ssdt_load when the kernel is locked down
- powerpc/xmon: add read-only mode
- powerpc/xmon: Restrict when kernel is locked down
- [Config] CONFIG_
- SAUCE: acpi: disallow loading configfs acpi tables when locked down
* seccomp_bpf fails on powerpc (LP: #1885757)
- SAUCE: selftests/seccomp: fix ptrace tests on powerpc
* Introduce the new NVIDIA 418-server and 440-server series, and update the
current NVIDIA drivers (LP: #1881137)
- [packaging] add signed modules for the 418-server and the 440-server
flavours
[ Ubuntu: 4.15.0-111.112 ]
* bionic/linux: 4.15.0-111.112 -proposed tracker (LP: #1886999)
* Bionic update: upstream stable patchset 2020-05-07 (LP: #1877461)
- SAUCE: mlxsw: Add missmerged ERR_PTR hunk
* linux 4.15.0-109-generic network DoS regression vs -108 (LP: #1886668)
- SAUCE: Revert "netprio_cgroup: Fix unlimited memory leak of v2 cgroups"
-- Kelsey Skunberg <email address hidden> Tue, 14 Jul 2020 12:21:34 -0600
Changed in linux-oem (Ubuntu Bionic): | |
status: | Fix Committed → Fix Released |
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https:/
tags: | added: verification-needed-bionic |
Anthony Buckley (tony-buckley) wrote : | #39 |
I have tested the proposed kernel as requested and updated tag to verification-
tags: | added: verification-done-bionic |
Tom Ivar Johansen (tijohansen) wrote : | #40 |
Thanks for your great work. I have just verified that the kernel works as it should.
I don't know if I am supposed to add a tag as well or how to do it.
Again, thanks.
Launchpad Janitor (janitor) wrote : | #41 |
This bug was fixed in the package linux - 4.15.0-115.116
---------------
linux (4.15.0-115.116) bionic; urgency=medium
* bionic/linux: 4.15.0-115.116 -proposed tracker (LP: #1893055)
* [Potential Regression] dscr_inherit_
ubuntu_
- powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()
linux (4.15.0-114.115) bionic; urgency=medium
* bionic/linux: 4.15.0-114.115 -proposed tracker (LP: #1891052)
* ipsec: policy priority management is broken (LP: #1890796)
- xfrm: policy: match with both mark and mask on user interfaces
linux (4.15.0-113.114) bionic; urgency=medium
* bionic/linux: 4.15.0-113.114 -proposed tracker (LP: #1890705)
* Packaging resync (LP: #1786013)
- update dkms package versions
* Reapply "usb: handle warm-reset port requests on hub resume" (LP: #1859873)
- usb: handle warm-reset port requests on hub resume
* Bionic update: upstream stable patchset 2020-07-29 (LP: #1889474)
- gpio: arizona: handle pm_runtime_get_sync failure case
- gpio: arizona: put pm_runtime in case of failure
- pinctrl: amd: fix npins for uart0 in kerncz_groups
- mac80211: allow rx of mesh eapol frames with default rx key
- scsi: scsi_transport_spi: Fix function pointer check
- xtensa: fix __sync_
- xtensa: update *pos in cpuinfo_op.next
- drivers/
- net: sky2: initialize return of gm_phy_read
- drm/nouveau/
- irqdomain/treewide: Keep firmware node unconditionally allocated
- SUNRPC reverting d03727b248d0 ("NFSv4 fix CLOSE not waiting for direct IO
compeletion")
- spi: spi-fsl-dspi: Exit the ISR with IRQ_NONE when it's not ours
- IB/umem: fix reference count leak in ib_umem_odp_get()
- uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix
GDB regression
- ALSA: info: Drop WARN_ON() from buffer NULL sanity check
- ASoC: rt5670: Correct RT5670_LDO_SEL_MASK
- btrfs: fix double free on ulist after backref resolution failure
- btrfs: fix mount failure caused by race with umount
- btrfs: fix page leaks after failure to lock page for delalloc
- bnxt_en: Fix race when modifying pause settings.
- hippi: Fix a size used in a 'pci_free_
path
- ax88172a: fix ax88172a_unbind() failures
- net: dp83640: fix SIOCSHWTSTAMP to update the struct with actual
configuration
- drm: sun4i: hdmi: Fix inverted HPD result
- net: smc91x: Fix possible memory leak in smc_drv_probe()
- bonding: check error value of register_
- mlxsw: destroy workqueue when trap_register in mlxsw_emad_init
- ipvs: fix the connection sync failed in some cases
- i2c: rcar: always clear ICSAR to avoid side effects
- bonding: check return value of register_
- serial: exar: Fix GPIO configuration for Sealevel cards based on XR17V35X
- scripts/
- HID: i...
Changed in linux (Ubuntu Bionic): | |
status: | Fix Committed → Fix Released |
This change was made by a bot.