intel-ucode/06-4e-03 from release 20200609 hangs system in early boot

Bug #1882890 reported by Jérôme
This bug affects 9 people
Affects                    Status        Importance  Assigned to    Milestone
intel-microcode (Ubuntu)   Fix Released  Critical    Steve Beattie
Xenial                     Fix Released  Undecided   Unassigned
Bionic                     Fix Released  Undecided   Unassigned
Eoan                       Fix Released  Undecided   Unassigned
Focal                      Fix Released  Undecided   Unassigned

Bug Description

Hi,

This morning I applied the latest updates on my Ubuntu 16.04 (kernel 4.15.0-106-generic + intel-microcode 3.20200609.0ubuntu0.16.04.0) and rebooted as usual to boot on the new kernel but the system won't boot: black screen.

I tried booting the previous kernel 4.15.0-101 and it seems to be the same : in grub I select the 4.15.0-101 kernel, press enter, a message "Loading initrd..." is printed but nothing else happens.

I tried an even older kernel, 4.4.0-184-generic: same problem, "Loading initrd...", and nothing happens.

I managed to boot a 4.4.0-142-generic, but Bluetooth and Wi-Fi are non-functional.

I mounted the /boot partition from a USB rescue drive, and it seems that all the initrd images that got updated this morning are not bootable anymore :
- initrd.img-4.15.0-106-generic
- initrd.img-4.15.0-101-generic
- initrd.img-4.4.0-184-generic

I tried regenerating the images with update-initramfs but it's the same : "Loading initrd" message but nothing happens.

You-Sheng Yang (vicamo) wrote :

Hi, since you mentioned you also upgraded intel-microcode, does it help if you regenerate initramfs after purging intel-microcode?

In addition, could you also paste your microcode info when booted with USB rescue drive?

  $ dmesg |grep microcode:

Changed in linux-hwe (Ubuntu):
status: New → Incomplete
no longer affects: linux-hwe (Ubuntu)
Changed in linux (Ubuntu):
status: New → Incomplete
Jérôme (jerome-auge) wrote :

I forced a downgrade of intel-microcode from "3.20200609.0ubuntu0.16.04.0" to "3.20151106.1", regenerated the initrd images, and now they seem to boot correctly!

So, the problem seems to be with latest intel-microcode?

Jérôme (jerome-auge) wrote :

Here is the output of "dmesg | grep microcode:" obtained from an Ubuntu 20.04 live USB disk:

--8<--
[ 0.000000] microcode: microcode updated early to revision 0xd6, date = 2019-10-03
[ 1.177905] microcode: sig=0x406e3, pf=0x80, revision=0xd6
[ 1.177938] microcode: Microcode Update Driver: v2.2.
-->8--

Andrea Gabellini (andrea-gabellini) wrote :

Hello,

same problem here!!!

Which commands did you use to downgrade intel-microcode?

Thanks,
Andrea

You-Sheng Yang (vicamo)
Changed in intel-microcode (Ubuntu):
status: New → Confirmed
no longer affects: linux (Ubuntu)
Jérôme (jerome-auge) wrote :

To downgrade intel-microcode I used the following command:

  $ sudo apt-get install intel-microcode=3.20151106.1

You-Sheng Yang (vicamo) wrote :

@Jérôme, thank you. Could you also attach your cpu info by running `lscpu`? That would help complete the issue description.

$ git diff --stat debian/3.20151106.1 debian/3.20200609.1 -- intel-ucode/06-4e-03
 intel-ucode/06-4e-03 | Bin 0 -> 104448 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

That newer version merges Intel's new microcode release from 15 hours ago: https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/commit/021c295269a06159b8c3ebefc0fac932e69e259e

You-Sheng Yang (vicamo)
summary: - Cannot boot initrd.img-4.15.0-106-generic on Ubuntu 16.04
+ intel-ucode/06-4e-03 from release 20200609 hangs system in early boot
Cole (coleanderson) wrote :

I am having this issue on Ubuntu 20.04. I have an Intel® Core™ m3-6Y30 processor and I was on the 5.4.0-33-generic kernel.

You-Sheng Yang (vicamo) wrote :

@Cole, this should indeed affect all series, since the intel-microcode package was updated in all of them. We still need you to post the output of "dmesg | grep microcode:" to make sure you are both victims of the same problematic blob.

You-Sheng Yang (vicamo) wrote :

@Jérôme, copying from comment on github[1]:

  ```
  The differences between 0xd6 and 0xdc were quite minimal, to work around an SGX issue.

  Does disabling SGX in the BIOS work around the hang using 0xdc?
  ```

Could you give that a try and reply on GitHub?

[1]: https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/issues/31#issuecomment-641866465

Cole (coleanderson) wrote :

@You-Sheng Yang

"dmesg | grep microcode:"

[ 0.000000] microcode: microcode updated early to revision 0xd6, date = 2019-10-03
[ 0.985877] microcode: sig=0x406e3, pf=0x80, revision=0xd6
[ 0.986019] microcode: Microcode Update Driver: v2.2.

Norbert (nrbrtx)
tags: added: bionic eoan focal groovy xenial
Paulo J. S. Silva (pjssilva) wrote :

Same problem here on Ubuntu 20.04 on a Dell Latitude E7470 with a Core i5-6300U. The machine hangs on boot. I could not even reboot it using alt-prtsc-b; it was completely frozen, and only a long press on the power button worked. My workaround was to boot into recovery mode (you may need an older kernel for that) and run

apt --purge remove intel-microcode

The purging process rebuilds initramfs. After that I could reboot again.

Here is the output asked above.

pjssilva@trinity:~$ dmesg | grep microcode
[ 0.102417] SRBDS: Vulnerable: No microcode
[ 0.578849] microcode: sig=0x406e3, pf=0x80, revision=0xcc
[ 0.578971] microcode: Microcode Update Driver: v2.2.

This looks like a serious bug. Maybe pull out the update from the repositories until a solution is found?

DanglingPointer (ferncasado) wrote :

I have the exact same problem and laptop as @Cole.

I have chrooted into the laptop using an ubuntu-live-usb to figure out what the heck is going on.

The microcode that got installed is 3.20200609.0ubuntu0.20.04.0

I am uninstalling now.

I'll advise here if it worked.

Philipp Classen (philipp-classen) wrote :

Same problem with Ubuntu 20.04 on a ThinkPad X1 Carbon 4th.

"apt-get upgrade" froze while installing the intel-ucode update. After that, the system also refused to start. As a workaround, I followed steps similar to Paulo's:

* Booting into recovery mode
* Step 1: dpkg --configure -a
* Step 2: apt --purge remove intel-microcode (first, I had to restore networking by continuing the boot in normal mode)

I also realized that the /boot partition was out of space. So, I also had to run "apt-get auto-remove" first.

DanglingPointer (ferncasado) wrote :

Ok it worked.

1)
I went here: https://launchpad.net/ubuntu/+source/intel-microcode
to check what version of intel-microcode to downgrade to. Basically I wanted the previous one that had no issues.

2)
I used an ubuntu-live-usb to chroot into my yoga 710-11ISK laptop (intel m3-6Y30)

3)
$ sudo apt-get purge --auto-remove intel-microcode

4)
$ sudo apt install -y intel-microcode=3.20191115.1ubuntu3
That's the version from point 1 above

5)
I was unable to update-initramfs for some reason while in chroot
I noticed that ~1/7 reboots into recovery would work; don't know why. But I found out the hard way trying to figure out what the heck had gone wrong banging my head reboot after reboot!
Anyway once in recovery mode I...
  $ sudo update-initramfs -u

6) I rebooted and logged-in multiple times to ensure problem is gone. Including from powered-off state.
Awesome!

7) Ubuntu's package management will want to install the newest cr@p intel-microcode (I swear I'm getting AMD next time)...
$ sudo apt-mark hold intel-microcode
That holds the package from upgrading until the microcode is fixed. I don't think Ubuntu can fix this, it will have to be Intel.
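
The downgrade-and-hold sequence from the steps above can be sketched as a small dry-run script. This is only a sketch under assumptions, not a tested recovery tool: the names `run` and `recover_microcode` and the `RUN` variable are hypothetical, and the version passed in must be one that is actually available for the installed release. By default it only prints the commands it would run.

```shell
# Sketch only: print each command unless RUN=1 is set in the environment.
# "run", "recover_microcode" and "RUN" are hypothetical names.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# $1: known-good package version, e.g. 3.20191115.1ubuntu3 on 20.04
recover_microcode() {
    version="$1"
    # Install the older microcode package explicitly by version.
    run apt-get install "intel-microcode=$version" &&
    # Rebuild the initramfs of every installed kernel so none keeps the bad blob.
    run update-initramfs -u -k all &&
    # Keep apt from pulling the problematic version back in.
    run apt-mark hold intel-microcode
}
```

Calling `recover_microcode 3.20191115.1ubuntu3` prints the three commands; setting `RUN=1` in a root shell would execute them instead.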

I hear Lenovo are now selling Ubuntu certified AMD laptops!... hmmm

Rodman (rodman-c) wrote :
Steve Beattie (sbeattie) wrote :

Rodman: no, that's an older issue affecting a different processor family.

This issue is specifically affecting processors with id 0x406e3; if the output of dmesg | grep microcode does not contain "sig=0x406e3" then you have a different issue, and should open a new bug report.
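
For anyone unsure how to check this, the test can be sketched as a small shell helper. This is a minimal sketch, assuming a POSIX shell with grep; the function name `has_affected_sig` is hypothetical and it only looks for the signature string mentioned above.

```shell
# Reads dmesg-style text on stdin and succeeds only if it reports the
# CPU signature 0x406e3. "has_affected_sig" is a hypothetical name.
has_affected_sig() {
    # The trailing group keeps longer signatures (e.g. 0x406e3f) from matching.
    grep -Eq 'sig=0x406e3([^0-9a-fA-F]|$)'
}
```

A possible use is `dmesg | has_affected_sig && echo "sig=0x406e3 found"`; if nothing is printed, the signature was not found in the log.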

I am working on reverting the 0xdc version of the microcode for the 0x406e3 family back to the 0xd6 version included in 20191115 microcode update. Test packages should show up soon in https://launchpad.net/~sbeattie/+archive/ubuntu/lp1882890/ ; I would appreciate confirmation that those packages do allow affected systems to boot successfully.

Thanks, and sorry for the problems people are having.

Steve Beattie (sbeattie) wrote :

Philipp Classen: indeed, a full /boot/ will cause problems on upgrade that can result in failure to boot, unrelated to microcode issues. With that corrected, are you still having an issue?

And again, can you confirm that

  dmesg | grep microcode

contains "sig=0x406e3"?

Changed in intel-microcode (Ubuntu):
assignee: nobody → Steve Beattie (sbeattie)
importance: Undecided → Critical
Nicolas Delvaux (malizor) wrote :

I grabbed the updated package from Steve's PPA (#19), installed it, made sure the initramfs images were all properly updated (update-initramfs -u -k all) and rebooted.

I confirm my laptop now boots properly (running Ubuntu 20.04).

Steve Beattie (sbeattie)
Changed in intel-microcode (Ubuntu Bionic):
status: New → Confirmed
Changed in intel-microcode (Ubuntu Xenial):
status: New → Confirmed
Changed in intel-microcode (Ubuntu Eoan):
status: New → Confirmed
Changed in intel-microcode (Ubuntu Focal):
status: New → Confirmed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package intel-microcode - 3.20200609.0ubuntu0.20.04.1

---------------
intel-microcode (3.20200609.0ubuntu0.20.04.1) focal-security; urgency=medium

  * REGRESSION UPDATE: some CPUs in the Skylake family sig=0x406e3
    fail to boot (LP: #1882890).
    - revert 06-4e-03/0x000406e3 microcode from 0x00dc to 0x00d6
      sig 0x000406e3, pf_mask 0xc0, 2019-10-03, rev 0x00d6, size 101376

 -- Steve Beattie <email address hidden> Wed, 10 Jun 2020 08:34:20 -0700

Changed in intel-microcode (Ubuntu Focal):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package intel-microcode - 3.20200609.0ubuntu0.18.04.1

---------------
intel-microcode (3.20200609.0ubuntu0.18.04.1) bionic-security; urgency=medium

  * REGRESSION UPDATE: some CPUs in the Skylake family sig=0x406e3
    fail to boot (LP: #1882890).
    - revert 06-4e-03/0x000406e3 microcode from 0x00dc to 0x00d6
      sig 0x000406e3, pf_mask 0xc0, 2019-10-03, rev 0x00d6, size 101376

 -- Steve Beattie <email address hidden> Wed, 10 Jun 2020 08:48:05 -0700

Changed in intel-microcode (Ubuntu Bionic):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package intel-microcode - 3.20200609.0ubuntu0.16.04.1

---------------
intel-microcode (3.20200609.0ubuntu0.16.04.1) xenial-security; urgency=medium

  * REGRESSION UPDATE: some CPUs in the Skylake family sig=0x406e3
    fail to boot (LP: #1882890).
    - revert 06-4e-03/0x000406e3 microcode from 0x00dc to 0x00d6
      sig 0x000406e3, pf_mask 0xc0, 2019-10-03, rev 0x00d6, size 101376

 -- Steve Beattie <email address hidden> Wed, 10 Jun 2020 08:50:22 -0700

Changed in intel-microcode (Ubuntu Xenial):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package intel-microcode - 3.20200609.0ubuntu0.19.10.2

---------------
intel-microcode (3.20200609.0ubuntu0.19.10.2) eoan-security; urgency=medium

  * REGRESSION UPDATE: some CPUs in the Skylake family sig=0x406e3
    fail to boot (LP: #1882890).
    - revert 06-4e-03/0x000406e3 microcode from 0x00dc to 0x00d6
      sig 0x000406e3, pf_mask 0xc0, 2019-10-03, rev 0x00d6, size 101376

 -- Steve Beattie <email address hidden> Wed, 10 Jun 2020 08:44:12 -0700

Changed in intel-microcode (Ubuntu Eoan):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package intel-microcode - 3.20200609.0ubuntu0.20.04.2

---------------
intel-microcode (3.20200609.0ubuntu0.20.04.2) focal-security; urgency=medium

  * REGRESSION UPDATE: revert the tmpfiles snippet to do late
    loading of microcode, this would also happen during package
    upgrades. Also, in the case of a problematic microcode update,
    this would prevent booting using an earlier kernel as the late
    loading would still load the problematic microcode, forcing the use
    of the 'dis_ucode_ldr' kernel command line option to recover.
    (LP: #1883002)

 -- Steve Beattie <email address hidden> Wed, 10 Jun 2020 13:36:29 -0700

Changed in intel-microcode (Ubuntu):
status: Confirmed → Fix Released
bandini (jphilippe-francois) wrote :

Hi, it looks like this bug is back with the latest microcode update.
My ASUS PC only boots with the dis_ucode_ldr kernel parameter.

4.4.0-194-generic #226-Ubuntu

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 78
model name : Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz
stepping : 3
microcode : 0x74
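
Whether a running system was actually booted with that parameter can be checked against the kernel command line. A minimal sketch, assuming a POSIX shell; the function name `booted_without_ucode_loader` is hypothetical, and the command line is passed in as an argument so the check can be tried on any string.

```shell
# Succeeds if the given kernel command line contains the dis_ucode_ldr
# option as a separate word. The function name is hypothetical.
booted_without_ucode_loader() {
    case " $1 " in
        *" dis_ucode_ldr "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a live system this could be called as `booted_without_ucode_loader "$(cat /proc/cmdline)" && echo "microcode loader disabled"`.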

Henrique de Moraes Holschuh (hmh) wrote :

So, according to Intel, the probable cause for 0x406e3 and 0x506e3 processors to fail loading the microcode update *early* is an incompatibility when you update from a very old microcode release in BIOS/UEFI (BIOS has a revision older/smaller than 0x80) to the newer microcode updates.

This information was provided by Intel in the upstream bug report:
https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/issues/31#issuecomment-761228960

However, it looks like it could have been fixed by the latest microcode update batch, from 20210608, which contains microcode revision 0xea for the 0x406e3 and 0x506e3 processors.

It would be extremely helpful if anyone that had this issue could test the 20210608 microcode update and report, please. It helps if you include the contents of /proc/cpuinfo *without* the microcode update, and then the contents of /proc/cpuinfo *with* the microcode update installed.
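
To relate a report to the 0x80 threshold mentioned above, the revision can be pulled out of /proc/cpuinfo and compared. This is a rough sketch, assuming a POSIX shell and awk; the function names are hypothetical, and the value only reflects the BIOS/UEFI revision when the system was booted *without* the OS microcode update (package removed, or dis_ucode_ldr on the kernel command line).

```shell
# Reads /proc/cpuinfo-style text on stdin and prints the microcode
# revision of the first CPU listed. Function names are hypothetical.
cpuinfo_microcode_rev() {
    awk -F': *' '/^microcode[[:space:]]*:/ { print $2; exit }'
}

# Succeeds if the given revision (e.g. 0x74) is older/smaller than 0x80.
# The argument is expanded as-is into shell arithmetic, so pass only a
# plain hex or decimal number.
older_than_0x80() {
    [ "$(( $1 < 0x80 ))" -eq 1 ]
}
```

For example, `older_than_0x80 "$(cpuinfo_microcode_rev < /proc/cpuinfo)" && echo "BIOS microcode older than 0x80"` flags a machine that falls into the case described by Intel.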

Thank you!

Henrique de Moraes Holschuh (hmh) wrote :

According to Intel, this bug has been fixed in the microcode updates released in the 20210608 package.

HOWEVER one must go from BIOS to the newest microcode update directly, using kernel early updates. This is how it is supposed to happen by default in Debian (and, AFAIK, Ubuntu nowadays) so it should just work.

https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/issues/31#issuecomment-876558374
