Activity log for bug #1796443

Date Who What changed Old value New value Message
2018-10-06 03:35:31 John Clemens bug added bug
2018-10-06 03:44:19 John Clemens description
Old value:
My new Elitebook, with the latest bios 1.03.01, refuses to boot any kernel later than 4.10 unless mce=off is appended to the kernel command line. As in, there are no kernel messages at all after grub (yes, quiet and splash were removed from the command line). Perhaps it crashes before the efifb kicks in? System operates fine if mce=off is added to the kernel command line (and iommu=soft, but that's a separate issue, and fails with kernel output in that case).

I opened upstream bug here : https://bugzilla.kernel.org/show_bug.cgi?id=201291

I bisected the problem down to this commit (and the few before it, which also added extra MCE output, but didn't actually crash.

18807ddb7f88d4ac3797302bafb18143d573e66f is the first bad commit
commit 18807ddb7f88d4ac3797302bafb18143d573e66f
Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
Date: Tue Nov 15 15:13:53 2016 -0600

x86/mce/AMD: Reset Threshold Limit after logging error

The error count field in MCA_MISC does not get reset by hardware when the threshold has been reached. Software is expected to reset it. Currently, the threshold limit only gets reset during init or when a user writes to sysfs.

If the user is not monitoring threshold interrupts and resetting the limit then the user will only see 1 interrupt when the limit is first hit. So if, for example, the limit is set to 10 then only 1 interrupt will be recorded after 10 errors even if 100 errors have occurred. The user may then assume that only 10 errors have occurred.

There are threads online about this being related to the latest bios. The upstream bug has acpidump attached.

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: linux-image-4.18.0-8-generic 4.18.0-8.9
ProcVersionSignature: Ubuntu 4.18.0-8.9-generic 4.18.7
Uname: Linux 4.18.0-8-generic x86_64
ApportVersion: 2.20.10-0ubuntu11
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/controlC1: john 2015 F.... pulseaudio
/dev/snd/pcmC1D0p: john 2015 F...m pulseaudio
/dev/snd/controlC0: john 2015 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
Date: Fri Oct 5 23:24:45 2018
InstallationDate: Installed on 2018-09-30 (5 days ago)
InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Beta amd64 (20180927)
Lsusb:
Bus 005 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: HP HP EliteBook 745 G5
ProcEnviron:
TERM=xterm-256color
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.18.0-8-generic root=UUID=5cf73665-d2a3-4203-80fd-659faf1afea4 ro quiet splash iommu=soft mce=off
RelatedPackageVersions:
linux-restricted-modules-4.18.0-8-generic N/A
linux-backports-modules-4.18.0-8-generic N/A
linux-firmware 1.175
RfKill:
1: phy0: Wireless LAN
Soft blocked: no
Hard blocked: no
SourcePackage: linux
StagingDrivers: r8822be
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/26/2018
dmi.bios.vendor: HP
dmi.bios.version: Q81 Ver. 01.03.01
dmi.board.name: 83D5
dmi.board.vendor: HP
dmi.board.version: KBC Version 08.47.00
dmi.chassis.asset.tag: 5CG838305Y
dmi.chassis.type: 10
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrQ81Ver.01.03.01:bd07/26/2018:svnHP:pnHPEliteBook745G5:pvr:rvnHP:rn83D5:rvrKBCVersion08.47.00:cvnHP:ct10:cvr:
dmi.product.family: 103C_5336AN HP EliteBook
dmi.product.name: HP EliteBook 745 G5
dmi.product.sku: 2MG23AV
dmi.sys.vendor: HP
New value:
My new Elitebook, with the latest bios 1.03.01, refuses to boot any kernel later than 4.10 unless mce=off is appended to the kernel command line. As in, there are no kernel messages at all after grub (yes, quiet and splash were removed from the command line). Perhaps it crashes before the efifb kicks in? System operates fine if mce=off is added to the kernel command line (and iommu=soft, but that's a separate issue, and fails with kernel output in that case).

I opened upstream bug here : https://bugzilla.kernel.org/show_bug.cgi?id=201291

I bisected the problem down to this commit (and the few before it, which also added extra MCE output, but didn't actually crash):

    18807ddb7f88d4ac3797302bafb18143d573e66f is the first bad commit
    commit 18807ddb7f88d4ac3797302bafb18143d573e66f
    Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
    Date: Tue Nov 15 15:13:53 2016 -0600

    x86/mce/AMD: Reset Threshold Limit after logging error

    The error count field in MCA_MISC does not get reset by hardware when the
    threshold has been reached. Software is expected to reset it. Currently,
    the threshold limit only gets reset during init or when a user writes to
    sysfs.

    If the user is not monitoring threshold interrupts and resetting
    the limit then the user will only see 1 interrupt when the limit is first
    hit. So if, for example, the limit is set to 10 then only 1 interrupt will
    be recorded after 10 errors even if 100 errors have occurred. The user may
    then assume that only 10 errors have occurred.

There are threads online about this being related to the latest bios. The upstream bug has acpidump attached.

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: linux-image-4.18.0-8-generic 4.18.0-8.9
ProcVersionSignature: Ubuntu 4.18.0-8.9-generic 4.18.7
Uname: Linux 4.18.0-8-generic x86_64
ApportVersion: 2.20.10-0ubuntu11
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: john 2015 F.... pulseaudio
 /dev/snd/pcmC1D0p: john 2015 F...m pulseaudio
 /dev/snd/controlC0: john 2015 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
Date: Fri Oct 5 23:24:45 2018
InstallationDate: Installed on 2018-09-30 (5 days ago)
InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Beta amd64 (20180927)
Lsusb:
 Bus 005 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: HP HP EliteBook 745 G5
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.18.0-8-generic root=UUID=5cf73665-d2a3-4203-80fd-659faf1afea4 ro quiet splash iommu=soft mce=off
RelatedPackageVersions:
 linux-restricted-modules-4.18.0-8-generic N/A
 linux-backports-modules-4.18.0-8-generic N/A
 linux-firmware 1.175
RfKill:
 1: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
StagingDrivers: r8822be
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/26/2018
dmi.bios.vendor: HP
dmi.bios.version: Q81 Ver. 01.03.01
dmi.board.name: 83D5
dmi.board.vendor: HP
dmi.board.version: KBC Version 08.47.00
dmi.chassis.asset.tag: 5CG838305Y
dmi.chassis.type: 10
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrQ81Ver.01.03.01:bd07/26/2018:svnHP:pnHPEliteBook745G5:pvr:rvnHP:rn83D5:rvrKBCVersion08.47.00:cvnHP:ct10:cvr:
dmi.product.family: 103C_5336AN HP EliteBook
dmi.product.name: HP EliteBook 745 G5
dmi.product.sku: 2MG23AV
dmi.sys.vendor: HP
2018-10-06 04:00:08 Ubuntu Kernel Bot linux (Ubuntu): status New Confirmed
2018-10-08 14:58:30 Cristian Aravena Romero bug watch added https://bugzilla.kernel.org/show_bug.cgi?id=201291
2018-10-08 14:58:30 Cristian Aravena Romero bug task added linux
2018-10-09 16:10:21 Joseph Salisbury linux (Ubuntu): importance Undecided Medium
2018-10-09 16:10:25 Joseph Salisbury linux (Ubuntu): status Confirmed Triaged
2018-10-09 16:11:09 Joseph Salisbury tags amd64 apport-bug cosmic staging amd64 apport-bug cosmic kernel-da-key staging
2018-10-10 04:11:52 John Clemens linux (Ubuntu): status Triaged Confirmed
2018-10-10 04:12:59 John Clemens tags amd64 apport-bug cosmic kernel-da-key staging amd64 apport-bug cosmic kernel-bug-exists-upstream kernel-da-key staging
2018-11-07 13:58:43 Bug Watch Updater linux: status Unknown Confirmed
2018-11-07 13:58:43 Bug Watch Updater linux: importance Unknown Medium
2018-12-01 09:34:43 Bug Watch Updater linux: status Confirmed Fix Released
2018-12-15 22:39:00 Mathias Weyland bug added subscriber Mathias Weyland
2019-06-04 16:20:15 Jon Grimm bug added subscriber Jon Grimm
2019-06-04 17:44:36 Tom Lendacky bug added subscriber Tom Lendacky
2019-06-04 17:55:03 Jon Grimm attachment added kernel log from failing system (AMD) https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796443/+attachment/5268815/+files/dmesg-panic.log
2019-06-11 10:20:04 Anthony Wong bug added subscriber Anthony Wong
2019-07-03 07:24:39 Kai-Heng Feng nominated for series Ubuntu Disco
2019-07-03 07:24:39 Kai-Heng Feng bug task added linux (Ubuntu Disco)
2019-07-03 07:24:39 Kai-Heng Feng nominated for series Ubuntu Bionic
2019-07-03 07:24:39 Kai-Heng Feng bug task added linux (Ubuntu Bionic)
2019-07-03 07:24:39 Kai-Heng Feng nominated for series Ubuntu Cosmic
2019-07-03 07:24:39 Kai-Heng Feng bug task added linux (Ubuntu Cosmic)
2019-07-03 07:24:49 Kai-Heng Feng linux (Ubuntu): status Confirmed Fix Released
2019-07-03 07:25:18 Kai-Heng Feng description
Old value:
My new Elitebook, with the latest bios 1.03.01, refuses to boot any kernel later than 4.10 unless mce=off is appended to the kernel command line. As in, there are no kernel messages at all after grub (yes, quiet and splash were removed from the command line). Perhaps it crashes before the efifb kicks in? System operates fine if mce=off is added to the kernel command line (and iommu=soft, but that's a separate issue, and fails with kernel output in that case).

I opened upstream bug here : https://bugzilla.kernel.org/show_bug.cgi?id=201291

I bisected the problem down to this commit (and the few before it, which also added extra MCE output, but didn't actually crash):

    18807ddb7f88d4ac3797302bafb18143d573e66f is the first bad commit
    commit 18807ddb7f88d4ac3797302bafb18143d573e66f
    Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
    Date: Tue Nov 15 15:13:53 2016 -0600

    x86/mce/AMD: Reset Threshold Limit after logging error

    The error count field in MCA_MISC does not get reset by hardware when the
    threshold has been reached. Software is expected to reset it. Currently,
    the threshold limit only gets reset during init or when a user writes to
    sysfs.

    If the user is not monitoring threshold interrupts and resetting
    the limit then the user will only see 1 interrupt when the limit is first
    hit. So if, for example, the limit is set to 10 then only 1 interrupt will
    be recorded after 10 errors even if 100 errors have occurred. The user may
    then assume that only 10 errors have occurred.

There are threads online about this being related to the latest bios. The upstream bug has acpidump attached.

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: linux-image-4.18.0-8-generic 4.18.0-8.9
ProcVersionSignature: Ubuntu 4.18.0-8.9-generic 4.18.7
Uname: Linux 4.18.0-8-generic x86_64
ApportVersion: 2.20.10-0ubuntu11
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: john 2015 F.... pulseaudio
 /dev/snd/pcmC1D0p: john 2015 F...m pulseaudio
 /dev/snd/controlC0: john 2015 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
Date: Fri Oct 5 23:24:45 2018
InstallationDate: Installed on 2018-09-30 (5 days ago)
InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Beta amd64 (20180927)
Lsusb:
 Bus 005 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: HP HP EliteBook 745 G5
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.18.0-8-generic root=UUID=5cf73665-d2a3-4203-80fd-659faf1afea4 ro quiet splash iommu=soft mce=off
RelatedPackageVersions:
 linux-restricted-modules-4.18.0-8-generic N/A
 linux-backports-modules-4.18.0-8-generic N/A
 linux-firmware 1.175
RfKill:
 1: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
StagingDrivers: r8822be
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/26/2018
dmi.bios.vendor: HP
dmi.bios.version: Q81 Ver. 01.03.01
dmi.board.name: 83D5
dmi.board.vendor: HP
dmi.board.version: KBC Version 08.47.00
dmi.chassis.asset.tag: 5CG838305Y
dmi.chassis.type: 10
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrQ81Ver.01.03.01:bd07/26/2018:svnHP:pnHPEliteBook745G5:pvr:rvnHP:rn83D5:rvrKBCVersion08.47.00:cvnHP:ct10:cvr:
dmi.product.family: 103C_5336AN HP EliteBook
dmi.product.name: HP EliteBook 745 G5
dmi.product.sku: 2MG23AV
dmi.sys.vendor: HP
New value:
=== SRU Justification ===
[Impact]
System doesn't boot without "mce=off".

[Fix]
Quote from the commit log:
"Clear the "Counter Present" bit in the Instruction Fetch bank's MCA_MISC0 register. This will prevent enabling MCA thresholding on this bank which will prevent the high interrupt rate due to this error."

[Test]
The affected user reported these commits fix the issue.

[Regression Potential]
Low. Upstream stable commits. I don't see any regression on my unaffected AMD systems.

=== Original Bug Report ===
My new Elitebook, with the latest bios 1.03.01, refuses to boot any kernel later than 4.10 unless mce=off is appended to the kernel command line. As in, there are no kernel messages at all after grub (yes, quiet and splash were removed from the command line). Perhaps it crashes before the efifb kicks in? System operates fine if mce=off is added to the kernel command line (and iommu=soft, but that's a separate issue, and fails with kernel output in that case).

I opened upstream bug here : https://bugzilla.kernel.org/show_bug.cgi?id=201291

I bisected the problem down to this commit (and the few before it, which also added extra MCE output, but didn't actually crash):

    18807ddb7f88d4ac3797302bafb18143d573e66f is the first bad commit
    commit 18807ddb7f88d4ac3797302bafb18143d573e66f
    Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
    Date: Tue Nov 15 15:13:53 2016 -0600

    x86/mce/AMD: Reset Threshold Limit after logging error

    The error count field in MCA_MISC does not get reset by hardware when the
    threshold has been reached. Software is expected to reset it. Currently,
    the threshold limit only gets reset during init or when a user writes to
    sysfs.

    If the user is not monitoring threshold interrupts and resetting
    the limit then the user will only see 1 interrupt when the limit is first
    hit. So if, for example, the limit is set to 10 then only 1 interrupt will
    be recorded after 10 errors even if 100 errors have occurred. The user may
    then assume that only 10 errors have occurred.

There are threads online about this being related to the latest bios. The upstream bug has acpidump attached.

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: linux-image-4.18.0-8-generic 4.18.0-8.9
ProcVersionSignature: Ubuntu 4.18.0-8.9-generic 4.18.7
Uname: Linux 4.18.0-8-generic x86_64
ApportVersion: 2.20.10-0ubuntu11
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: john 2015 F.... pulseaudio
 /dev/snd/pcmC1D0p: john 2015 F...m pulseaudio
 /dev/snd/controlC0: john 2015 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
Date: Fri Oct 5 23:24:45 2018
InstallationDate: Installed on 2018-09-30 (5 days ago)
InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Beta amd64 (20180927)
Lsusb:
 Bus 005 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: HP HP EliteBook 745 G5
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.18.0-8-generic root=UUID=5cf73665-d2a3-4203-80fd-659faf1afea4 ro quiet splash iommu=soft mce=off
RelatedPackageVersions:
 linux-restricted-modules-4.18.0-8-generic N/A
 linux-backports-modules-4.18.0-8-generic N/A
 linux-firmware 1.175
RfKill:
 1: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
StagingDrivers: r8822be
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/26/2018
dmi.bios.vendor: HP
dmi.bios.version: Q81 Ver. 01.03.01
dmi.board.name: 83D5
dmi.board.vendor: HP
dmi.board.version: KBC Version 08.47.00
dmi.chassis.asset.tag: 5CG838305Y
dmi.chassis.type: 10
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrQ81Ver.01.03.01:bd07/26/2018:svnHP:pnHPEliteBook745G5:pvr:rvnHP:rn83D5:rvrKBCVersion08.47.00:cvnHP:ct10:cvr:
dmi.product.family: 103C_5336AN HP EliteBook
dmi.product.name: HP EliteBook 745 G5
dmi.product.sku: 2MG23AV
dmi.sys.vendor: HP
2019-07-09 11:12:49 Stefan Bader linux (Ubuntu Cosmic): status New Won't Fix
2019-07-09 11:12:59 Stefan Bader linux (Ubuntu Bionic): importance Undecided Medium
2019-07-09 11:13:02 Stefan Bader linux (Ubuntu Disco): importance Undecided Medium
2019-07-16 10:17:43 Kleber Sacilotto de Souza linux (Ubuntu Bionic): status New Fix Committed
2019-07-16 10:36:16 Kleber Sacilotto de Souza linux (Ubuntu Disco): status New Fix Committed
2019-07-25 18:33:00 Ubuntu Kernel Bot tags amd64 apport-bug cosmic kernel-bug-exists-upstream kernel-da-key staging amd64 apport-bug cosmic kernel-bug-exists-upstream kernel-da-key staging verification-needed-bionic
2019-07-25 20:13:16 Tom Lendacky tags amd64 apport-bug cosmic kernel-bug-exists-upstream kernel-da-key staging verification-needed-bionic amd64 apport-bug cosmic kernel-bug-exists-upstream kernel-da-key staging verification-done-bionic
2019-08-07 08:34:46 Ubuntu Kernel Bot tags amd64 apport-bug cosmic kernel-bug-exists-upstream kernel-da-key staging verification-done-bionic amd64 apport-bug cosmic kernel-bug-exists-upstream kernel-da-key staging verification-done-bionic verification-needed-xenial
2019-08-07 15:57:54 Tom Lendacky tags amd64 apport-bug cosmic kernel-bug-exists-upstream kernel-da-key staging verification-done-bionic verification-needed-xenial amd64 apport-bug cosmic kernel-bug-exists-upstream kernel-da-key staging verification-done-bionic verification-done-xenial
2019-08-13 11:27:47 Launchpad Janitor linux (Ubuntu Bionic): status Fix Committed Fix Released
2019-08-13 11:27:47 Launchpad Janitor cve linked 2000-1134
2019-08-13 11:27:47 Launchpad Janitor cve linked 2007-3852
2019-08-13 11:27:47 Launchpad Janitor cve linked 2008-0525
2019-08-13 11:27:47 Launchpad Janitor cve linked 2009-0416
2019-08-13 11:27:47 Launchpad Janitor cve linked 2011-4834
2019-08-13 11:27:47 Launchpad Janitor cve linked 2015-1838
2019-08-13 11:27:47 Launchpad Janitor cve linked 2015-7442
2019-08-13 11:27:47 Launchpad Janitor cve linked 2016-7489
2019-08-13 11:27:47 Launchpad Janitor cve linked 2018-5383
2019-08-13 11:27:47 Launchpad Janitor cve linked 2019-10126
2019-08-13 11:27:47 Launchpad Janitor cve linked 2019-1125
2019-08-13 11:27:47 Launchpad Janitor cve linked 2019-12614
2019-08-13 11:27:47 Launchpad Janitor cve linked 2019-12818
2019-08-13 11:27:47 Launchpad Janitor cve linked 2019-12819
2019-08-13 11:27:47 Launchpad Janitor cve linked 2019-12984
2019-08-13 11:27:47 Launchpad Janitor cve linked 2019-13233
2019-08-13 11:27:47 Launchpad Janitor cve linked 2019-13272
2019-08-13 11:27:47 Launchpad Janitor cve linked 2019-2101
2019-08-13 11:27:47 Launchpad Janitor cve linked 2019-3846
2019-10-29 14:22:28 Gerard Forcada bug added subscriber Gerard Forcada
2020-07-02 19:53:19 Steve Langasek linux (Ubuntu Disco): status Fix Committed Won't Fix