Dell Latitude 7300, i7-8665U, sig=0x806ec/20200609: hangs on Whiskey Lake

Bug #1883002 reported by Andrea C
This bug affects 1 person
Affects                    Status     Importance  Assigned to  Milestone
OEM Priority Project       New        Undecided   Unassigned
intel-microcode (Ubuntu)   Confirmed  Undecided   Unassigned

Bug Description

Basically the same symptoms as #1882809, but with 20.04 and on an i7-8665U (Dell Latitude 7300).

Affected version is 3.20200609.0ubuntu0.20.04.0 (tested with kernels 5.4.0-33 and 5.4.0-37)

CPU:
Vendor ID: GenuineIntel
CPU family: 6
Model: 142
Model name: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz
Stepping: 12

microcode: sig=0x806ec, pf=0x80, revision=0xca
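
The signature in the microcode line and the family/model/stepping values listed above encode the same information. A minimal sketch of the decoding, assuming the usual Intel CPUID leaf 1 EAX bit layout; the function name `decode_signature` is made up for illustration and is not part of any tool mentioned in this report:

```python
def decode_signature(sig: int) -> dict:
    """Split a CPUID signature into display family, model and stepping."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family
    if base_family == 0xF:
        family += ext_family

    # The extended model supplies the high nibble for families 6 and 0xF.
    model = base_model
    if base_family in (0x6, 0xF):
        model |= ext_model << 4

    return {"family": family, "model": model, "stepping": stepping}


print(decode_signature(0x806EC))
# {'family': 6, 'model': 142, 'stepping': 12}
```

This matches the reported CPU family 6, model 142, stepping 12.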

The first sign of a problem was a hard lockup while processing the new microcode package as part of a larger update; after fixing the resulting mess and completing the update (which regenerated the initramfs) I started having random lockups early in the boot process, with just "Loading initramfs" printed on the console.

Downgrading intel-microcode to 3.20191115.1ubuntu3 and regenerating the initramfs fixes the problem.

Revision history for this message
Henrique de Moraes Holschuh (hmh) wrote : Re: [Bug 1883002] [NEW] intel-ucode 20200609: hangs on Whiskey Lake

On Wed, 10 Jun 2020, Andrea C wrote:
> CPU family: 6
> Model: 142
> Model name: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz
> Stepping: 12
>
> microcode: sig=0x806ec, pf=0x80, revision=0xca
>
> The first sign of a problem was a hard lockup while processing the new
> microcode package as part of a larger update; after fixing the resulting

At that moment in time, the microcode *is not changed*. Only a *reboot*
(or shutdown + powerup) would load the new microcode update.

That means the update run hung with whatever older microcode and kernel
you already had at the time -- so, it was not caused by the
intel-microcode update, *unless* you already had intel-microcode
installed and had already rebooted for some reason.

> Downgrading intel-microcode to 3.20191115.1ubuntu3 and regenerating the
> initramfs fixes the problem.

If it is not asking too much, with the computer fully stable and
intel-microcode 3.20191115.1ubuntu3 installed (*and* rebooted so that it
got applied to the processor), could you please do another install
cycle, *just* with the intel-microcode security update that supposedly
caused issues?

It must complete the update safely; it will not attempt to update the
processor. Only when you reboot will it attempt to install the updated
microcode into the processor.

Please ensure you have enough free space in /boot: if it fills up, *bad
things can (and likely will) happen* the next time you reboot.
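
A quick way to check the free space before the update might look like the sketch below. It only uses `shutil.disk_usage` from the Python standard library; the helper names and the 100 MiB threshold are illustrative assumptions, not a documented requirement of the package:

```python
import shutil


def free_mib(path: str = "/boot") -> float:
    """Return the free space on the filesystem holding `path`, in MiB."""
    return shutil.disk_usage(path).free / (1024 * 1024)


def has_room(path: str = "/boot", needed_mib: float = 100.0) -> bool:
    """Report whether at least `needed_mib` MiB are free under `path`.

    The default threshold is an arbitrary example value; how much room a
    regenerated initramfs needs depends on the system.
    """
    return free_mib(path) >= needed_mib
```

Calling `has_room("/boot")` before installing the package gives a rough yes/no answer; `free_mib("/boot")` gives the actual figure.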

If the reboot to activate the microcode update does hang, does it also
hang when powering up the computer with the new microcode package
installed (instead of rebooting)?

--
  Henrique Holschuh

Revision history for this message
Steve Beattie (sbeattie) wrote : Re: intel-ucode 20200609: hangs on Whiskey Lake

Henrique, I think that's a consequence of the change in focal's intel-microcode to add a tmpfiles.d snippet to do late loading of microcode (LP: #1862938): the generated intel-microcode postinst ends up calling 'systemd-tmpfiles --create' on the added microcode conf file, which triggers the late load immediately.

Revision history for this message
Steve Beattie (sbeattie) wrote :

For others hitting this issue, add the 'dis_ucode_ldr' kernel boot option in grub before booting to disable microcode loading.
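
To confirm that the option actually took effect on the running system, one could inspect the kernel command line. A minimal sketch, assuming a Linux system where `/proc/cmdline` is readable; the function names are hypothetical:

```python
def ucode_loader_disabled(cmdline: str) -> bool:
    """True if 'dis_ucode_ldr' appears as a standalone kernel parameter."""
    return "dis_ucode_ldr" in cmdline.split()


def running_kernel_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return handle.read().strip()
```

On the affected machine, `ucode_loader_disabled(running_kernel_cmdline())` should return True after booting with the option added in grub.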

Revision history for this message
Henrique de Moraes Holschuh (hmh) wrote :

I see. I will keep that in mind for the future. I thought you guys were triggering late-loading *only* inside the initramfs, and not during the update...

Well, blacklisting it from late-loading is a no-brainer. I will do that too for Debian, although Debian never late-loads automatically.

May I suggest preparing a test package with the late-loading blacklisted for the reporter to try?

That would have a non-zero chance of working, given the report that it crashes immediately upon late-load (which gives it a zero chance of working with late-loading enabled).

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package intel-microcode - 3.20200609.0ubuntu0.20.04.2

---------------
intel-microcode (3.20200609.0ubuntu0.20.04.2) focal-security; urgency=medium

  * REGRESSION UPDATE: revert the tmpfiles snippet to do late
    loading of microcode, this would also happen during package
    upgrades. Also, in the case of a problematic microcode update,
    this would prevent booting using an earlier kernel as the late
    loading would still load the problematic microcode, forcing the use
    of the 'dis_ucode_ldr' kernel command line option to recover.
    (LP: #1883002)

 -- Steve Beattie <email address hidden> Wed, 10 Jun 2020 13:36:29 -0700

Changed in intel-microcode (Ubuntu):
status: New → Fix Released
Steve Beattie (sbeattie)
Changed in intel-microcode (Ubuntu):
status: Fix Released → Incomplete
Revision history for this message
Steve Beattie (sbeattie) wrote :

Henrique, I didn't realize until today that the systemd tmpfiles.d would also get triggered as part of the intel-microcode postinst in addition to very early in the boot process. I have reverted it for focal (and eventually groovy) because of the increased risk of instability and the greater difficulty that it adds to recovery from a bad microcode update; booting an earlier kernel/initramfs combination with an older microcode embedded would still get the new bad microcode loaded via the tmpfiles.d snippet.

I used this bug report as a reference for the upload, but re-opened the issue as it's not clear whether an early-loaded microcode is a problem for the reporter.

Changed in intel-microcode (Ubuntu):
status: Incomplete → Fix Released
Revision history for this message
Henrique de Moraes Holschuh (hmh) wrote :

Well, now we need more testing to know if it works or not, since it is kinda expected that some microcode updates would object *heavily* to late-load but work just fine from the early-initramfs.

Steve Beattie (sbeattie)
Changed in intel-microcode (Ubuntu):
status: Fix Released → Incomplete
Changed in intel-microcode (Ubuntu):
status: Incomplete → Fix Released
Revision history for this message
Andrea C (alyf80) wrote :

3.20200609.0ubuntu0.20.04.2 installs cleanly, but there is still a problem with the new microcode causing freezes during boot.

I did some very unscientific testing in a couple of scenarios and came up with the following numbers:

* on battery, cold boot: froze 1 out of 10 times
* on battery, warm boot: froze 6 out of 10 times
* on AC, cold boot : froze 0 out of 10 times
* on AC, warm boot : froze 9 out of 10 times

"Cold boot": system powered off, try to boot 5.4.0-37 in recovery mode to a root shell
"Warm boot": "reboot" from a root shell in 5.4.0-37, try to boot 5.4.0-37 in recovery mode to a root shell

So, while battery vs. AC seems to have some influence, the real difference is in cold vs. warm boot.
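
Pooling the four scenarios makes the comparison explicit. The sketch below simply sums the counts listed above; with only 10 runs per scenario the percentages are rough indications, not statistics:

```python
# (power source, boot type) -> (freezes, attempts), copied from the list above
results = {
    ("battery", "cold"): (1, 10),
    ("battery", "warm"): (6, 10),
    ("AC", "cold"): (0, 10),
    ("AC", "warm"): (9, 10),
}


def freeze_rate(results, index, value):
    """Pooled freeze rate over all scenarios whose key has `value` at `index`."""
    freezes = sum(f for key, (f, n) in results.items() if key[index] == value)
    attempts = sum(n for key, (f, n) in results.items() if key[index] == value)
    return freezes / attempts


for label, index, value in [
    ("cold boot", 1, "cold"),
    ("warm boot", 1, "warm"),
    ("battery", 0, "battery"),
    ("AC", 0, "AC"),
]:
    print(f"{label}: {freeze_rate(results, index, value):.0%}")
# cold boot: 5%
# warm boot: 75%
# battery: 35%
# AC: 45%
```

Cold vs. warm differs by 70 percentage points, battery vs. AC by 10.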

When the system manages to boot, in the kernel log I get:

[ 0.000000] microcode: microcode updated early to revision 0xd6, date = 2020-04-23
[ 0.667433] microcode: sig=0x806ec, pf=0x80, revision=0xd6
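
For anyone collecting the same data across many boots, the revision can be pulled out of these log lines mechanically. A small sketch using a regular expression written against the two lines quoted above; other kernel versions may word these messages differently, so treat the pattern as an assumption:

```python
import re
from typing import List

# Matches both "revision 0xd6" and "revision=0xd6" on a "microcode:" line.
REVISION_RE = re.compile(r"microcode: .*?revision[ =](0x[0-9a-fA-F]+)")

SAMPLE = """\
[ 0.000000] microcode: microcode updated early to revision 0xd6, date = 2020-04-23
[ 0.667433] microcode: sig=0x806ec, pf=0x80, revision=0xd6
"""


def microcode_revisions(log_text: str) -> List[int]:
    """Return every microcode revision mentioned in the log, in order."""
    return [int(m.group(1), 16) for m in REVISION_RE.finditer(log_text)]


print([hex(rev) for rev in microcode_revisions(SAMPLE)])
# ['0xd6', '0xd6']
```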

I can run more tests if needed.

Steve Beattie (sbeattie)
Changed in intel-microcode (Ubuntu):
status: Fix Released → Confirmed
Revision history for this message
Steve Beattie (sbeattie) wrote :

Thanks for testing! The issue you are seeing looks very similar to https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/issues/24 except that in that report, version 0xca was also problematic.

Revision history for this message
Andrea C (alyf80) wrote :

Yes, I was also affected by that (i.e. #1862751) and at the time had to revert to a previous microcode version while running bionic.

There is a difference, however, in that 1862751 only happened when running on battery (which is why I tested the two scenarios above)

In early May I upgraded to focal and everything has run fine since; I didn't even realize I was again using 0xca until now.

Looking at the logs I noticed that the processor is already running 0xca by the time Linux boots, so by now it's probably included in the system firmware (there were a couple of new BIOS releases quite recently); I don't know if that can make a difference.

You-Sheng Yang (vicamo)
summary: - intel-ucode 20200609: hangs on Whiskey Lake
+ Dell Latitude 7300, i7-8665U, sig=0x806ec/20200609: hangs on Whiskey
+ Lake
Revision history for this message
Steve Beattie (sbeattie) wrote :

Andrea, thanks again for the report and the testing you've done, and again, sorry you are having this issue. I have filed https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/issues/35 specifically for this issue with sig=0x806ec revision=0xd6.

Revision history for this message
Émile E (kheperkare) wrote :

Hi, after losing a whole day thinking this was a kernel problem (an easy mistake, since rebooting to a different kernel usually solves it), I can report that I am also hit by this with a brand new Dell Latitude 7300 / i5-8265U running Mint 19.3 (Ubuntu 18.04). I observed it on AC (I didn't try on battery). When it hangs I hear fan noise. The failure rate on warm reboots was close to 100% until I used the PPA helpfully provided by vicamo to revert from 20200609 to 20191115; I was then able to reboot several times in a row, so I think that fixed it.

Tell me if I can help by providing extra information. Firmware version is 1.7.4.

Revision history for this message
Andrea C (alyf80) wrote :

Dell released system firmware 1.9.1, which includes microcode revision 0xd6.

After the update, my system is running just fine with 0xd6 -- so once again the problem seems related to the way microcode is loaded more than to the microcode revision itself.

Rex Tsai (chihchun)
tags: added: merion oem-priority somerville