System crashes and reboots every 15min - 2hr

Bug #1958692 reported by Justin Moore
This bug affects 2 people
Affects: linux-signed-oem-5.14 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

I have a NUC11PAHi5 system running Ubuntu 20.04.3, and it crashes every few minutes to hours:

justin :0 :0 Fri Jan 21 19:46 - crash (00:37)
justin :0 :0 Fri Jan 21 19:40 - crash (00:05)
justin :0 :0 Fri Jan 21 17:33 - crash (02:07)
justin :0 :0 Fri Jan 21 17:16 - crash (00:07)
justin :0 :0 Fri Jan 21 17:13 - crash (00:02)
justin :0 :0 Fri Jan 21 16:54 - crash (00:19)
justin :0 :0 Fri Jan 21 16:25 - crash (00:28)
justin :0 :0 Fri Jan 21 16:18 - crash (00:06)
justin :0 :0 Fri Jan 21 16:11 - crash (00:07)
justin :0 :0 Fri Jan 21 15:46 - crash (00:24)
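
To put numbers on the "every 15min - 2hr" estimate, the session lengths in the listing above can be summarized with a short script. This is only a sketch: it assumes the lines are `last` output in exactly the format shown, and it treats the value in parentheses as the time the session ran before the crash (a proxy for uptime, not a measured uptime). The names `crash_uptimes_minutes` and `CRASH_RE` are made up for this example.

```python
import re
from statistics import median

# Lines copied from the listing above.
LAST_OUTPUT = """\
justin :0 :0 Fri Jan 21 19:46 - crash (00:37)
justin :0 :0 Fri Jan 21 19:40 - crash (00:05)
justin :0 :0 Fri Jan 21 17:33 - crash (02:07)
justin :0 :0 Fri Jan 21 17:16 - crash (00:07)
justin :0 :0 Fri Jan 21 17:13 - crash (00:02)
justin :0 :0 Fri Jan 21 16:54 - crash (00:19)
justin :0 :0 Fri Jan 21 16:25 - crash (00:28)
justin :0 :0 Fri Jan 21 16:18 - crash (00:06)
justin :0 :0 Fri Jan 21 16:11 - crash (00:07)
justin :0 :0 Fri Jan 21 15:46 - crash (00:24)
"""

# Matches a trailing "- crash (HH:MM)" field; an optional "D+" day prefix
# is accepted as well, on the assumption that longer sessions print that way.
CRASH_RE = re.compile(r"- crash\s+\((?:(\d+)\+)?(\d+):(\d+)\)\s*$")


def crash_uptimes_minutes(text):
    """Return the length in minutes of every session that ended in a crash."""
    minutes = []
    for line in text.splitlines():
        match = CRASH_RE.search(line)
        if match is None:
            continue  # not a crash entry
        days = int(match.group(1) or 0)
        hours = int(match.group(2))
        mins = int(match.group(3))
        minutes.append(days * 24 * 60 + hours * 60 + mins)
    return minutes


uptimes = crash_uptimes_minutes(LAST_OUTPUT)
print(f"crashes: {len(uptimes)}")
print(f"shortest: {min(uptimes)} min, longest: {max(uptimes)} min, "
      f"median: {median(uptimes)} min")
```

To run it against a live system, the `LAST_OUTPUT` string could be replaced with the captured output of `last`.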

When it comes back up I see the following errors in journalctl:

kernel: BERT: Error records from previous boot:
kernel: [Hardware Error]: event severity: fatal
kernel: [Hardware Error]: Error 0, type: fatal
kernel: [Hardware Error]: section_type: Firmware Error Record Reference
kernel: [Hardware Error]: Firmware Error Record Type: SOC Firmware Error Record Type2
kernel: [Hardware Error]: Revision: 2
kernel: [Hardware Error]: Record Identifier: 8f87f311-c998-4d9e-a0c4-6065518c4f6d

followed by a large hex dump. It could be related to this issue:
https://community.intel.com/t5/Processors/Frequent-crashes-on-i5-11500/td-p/1280709
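
For comparing records across boots or between affected machines, the `[Hardware Error]` lines quoted above can be reduced to key/value pairs. The sketch below is an assumption about how one might do that, not an existing tool: `parse_bert_fields` is a hypothetical helper, it only handles lines shaped like the ones quoted in this report, and it makes no attempt to interpret the hex dump (which is not reproduced here).

```python
HW_MARKER = "[Hardware Error]:"

# Lines copied from the journalctl excerpt above.
SAMPLE = """\
kernel: BERT: Error records from previous boot:
kernel: [Hardware Error]: event severity: fatal
kernel: [Hardware Error]: Error 0, type: fatal
kernel: [Hardware Error]: section_type: Firmware Error Record Reference
kernel: [Hardware Error]: Firmware Error Record Type: SOC Firmware Error Record Type2
kernel: [Hardware Error]: Revision: 2
kernel: [Hardware Error]: Record Identifier: 8f87f311-c998-4d9e-a0c4-6065518c4f6d
"""


def parse_bert_fields(text):
    """Collect 'key: value' pairs from lines carrying the hardware error marker."""
    fields = {}
    for line in text.splitlines():
        _, marker, rest = line.partition(HW_MARKER)
        if not marker:
            continue  # line has no "[Hardware Error]:" marker
        key, sep, value = rest.partition(":")
        if not sep:
            continue  # no "key: value" shape on this line
        fields[key.strip()] = value.strip()
    return fields


fields = parse_bert_fields(SAMPLE)
for key, value in fields.items():
    print(f"{key} = {value}")
```

The record identifier is probably the most useful field to compare, since a matching value on another machine would suggest the same firmware error record type.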

The BIOS is up to date, as is the Intel microcode package. The choice of kernel doesn't seem to matter. I'm using the OEM kernel because the regular kernels for 20.04.3 (a) also have this problem and (b) have broken IR support (https://askubuntu.com/questions/1380500/kernel-rc-rc0-receive-overflow).

System: NUC11PAHi5
SKU: RNUC11PAHi5000
BIOS: PATGL357.0041.2021.0811.1505
Board: NUC11PABi5
CPU: i5-1135G7

I'm not sure if it's a kernel issue, BIOS issue, microcode issue, or hardware issue, so I decided to start here. Let me know how else I can help.

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-image-5.14.0-1020-oem 5.14.0-1020.22
ProcVersionSignature: Ubuntu 5.14.0-1020.22-oem 5.14.20
Uname: Linux 5.14.0-1020-oem x86_64
ApportVersion: 2.20.11-0ubuntu27.21
Architecture: amd64
CasperMD5CheckResult: skip
Date: Fri Jan 21 20:44:43 2022
InstallationDate: Installed on 2022-01-08 (13 days ago)
InstallationMedia: Ubuntu 20.04 LTS "Focal Fossa" - Release amd64 (20200423)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-signed-oem-5.14
UpgradeStatus: No upgrade log present (probably fresh install)

Justin Moore (justinnonwork) wrote :

Updates:

- The hardware passes multiple rounds of memtest86+ over a ~10 hour window and does not reboot during that time

- When left at the BIOS screen for hours on end, it does not reboot

- The system will do the reboot cycle under all kernels I currently have installed, which includes:
5.14.0-1020-oem
5.14.0-1018-oem
5.13.0-27-generic
5.11.0-40-generic
5.11.0-38-generic

By process of elimination this is starting to look more and more like a Linux kernel issue.

jsuejdh65 (jsuejdh65) wrote :

I have exactly the same issue with a NUC11PAKi7. Do you have an update on your issue?

Justin Moore (justinnonwork) wrote :

It was fine for a little while after I replaced the hardware, but about a month ago the crashes started again. These kernels can be added to the list of problem kernels:

5.14.0-1044-oem
5.15.0-50-generic

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-signed-oem-5.14 (Ubuntu):
status: New → Confirmed