Random unrecoverable freezes on Ubuntu 18.10

Bug #1798961 reported by Douglas H. Silva on 2018-10-20
This bug affects 6 people
Affects           Status                   Importance  Assigned to  Milestone
linux (Ubuntu)    Status tracked in Disco
  Bionic                                   High        Unassigned
  Cosmic                                   High        Unassigned
  Disco                                    High        Unassigned

Bug Description

First thing I notice is that the mouse cursor freezes as I'm using it, then I hit the CAPS LOCK key and the LED indicator doesn't respond. Then I try the "REISUB" command, but it doesn't do anything either. Only a hard reset works, pressing down the power button for a few seconds.
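
One thing worth ruling out when "REISUB" appears to do nothing: the Magic SysRq functions are gated by a bitmask in /proc/sys/kernel/sysrq, and any value other than 1 enables only some of them. The sketch below is an illustration and not part of the original report. The helper name sysrq_allows is hypothetical, and the bit values are the ones listed in the kernel's sysrq documentation. It does not explain the freeze itself; it only checks whether a given SysRq function is permitted at all.

```shell
# Hypothetical helper: succeed if sysrq bitmask $1 permits function bit $2.
# Bit values (from the kernel sysrq documentation):
#   4 = keyboard control (R), 16 = sync (S), 32 = remount read-only (U),
#   64 = signalling processes (E, I), 128 = reboot/poweroff (B)
sysrq_allows() {
    mask=$1
    bit=$2
    # A mask of exactly 1 means every SysRq function is enabled.
    if [ "$mask" -eq 1 ]; then
        return 0
    fi
    [ $(( mask & bit )) -ne 0 ]
}

# Example (reads the live setting on a Linux system):
#   sysrq_allows "$(cat /proc/sys/kernel/sysrq)" 128 && echo "B (reboot) is permitted"
```

For example, a mask of 176 (16 + 32 + 128) permits S, U and B but not E or I, so part of the sequence would be ignored even on a responsive system.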

How to reproduce?
I couldn't figure out a consistent method. It is still random to me.

Version: Ubuntu 4.18.0-10.11-generic 4.18.12
System information attached.

Also happens under Arch Linux and Fedora.
I've talked to another user on IRC who seems to be having the same freezes.

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: linux-image-4.18.0-10-generic 4.18.0-10.11
ProcVersionSignature: Ubuntu 4.18.0-10.11-generic 4.18.12
Uname: Linux 4.18.0-10-generic x86_64
ApportVersion: 2.20.10-0ubuntu13
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: dsilva 1213 F.... pulseaudio
 /dev/snd/controlC0: dsilva 1213 F.... pulseaudio
CurrentDesktop: XFCE
Date: Sat Oct 20 09:54:50 2018
InstallationDate: Installed on 2018-10-20 (0 days ago)
InstallationMedia: Xubuntu 18.10 "Cosmic Cuttlefish" - Release amd64 (20181017.2)
MachineType: Dell Inc. Inspiron 5458
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.18.0-10-generic root=/dev/mapper/xubuntu--vg-root ro quiet splash vt.handoff=1
RelatedPackageVersions:
 linux-restricted-modules-4.18.0-10-generic N/A
 linux-backports-modules-4.18.0-10-generic N/A
 linux-firmware 1.175
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 02/02/2018
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A15
dmi.board.name: 09WGNT
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA15:bd02/02/2018:svnDellInc.:pnInspiron5458:pvr01:rvnDellInc.:rn09WGNT:rvrA00:cvnDellInc.:ct9:cvr:
dmi.product.name: Inspiron 5458
dmi.product.sku: Inspiron 5458
dmi.product.version: 01
dmi.sys.vendor: Dell Inc.

Douglas H. Silva (o-alquimista) wrote :

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Did this issue start happening after an update/upgrade? Was there a
prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer
to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest
v4.19 kernel[0].

If this bug is fixed in the mainline kernel, please add the following
tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag:
'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as
"Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.19-rc8

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Douglas H. Silva (o-alquimista) wrote :

This is the bug report opened for Arch Linux [TASK FS#59483]:

https://bugs.archlinux.org/task/59483?string=freeze&project=1&type%5B0%5D=&sev%5B0%5D=&pri%5B0%5D=&due%5B0%5D=&reported%5B0%5D=&cat%5B0%5D=&status%5B0%5D=open&percent%5B0%5D=&opened=&dev=&closed=&duedatefrom=&duedateto=&changedfrom=&changedto=&openedfrom=&openedto=&closedfrom=&closedto=

The OP reports 4.17.10-1 being the problematic kernel version. What I can say for sure is that this problem did not exist in kernel versions 4.16 and older.

Yes, I can try the newest kernel, however these freezes are random and I don't know how to trigger them. I will take some time to experiment with it.

Douglas H. Silva (o-alquimista) wrote :

Affects the latest mainline build 4.19-rc8 as well.

Still cannot identify one way to reproduce it intentionally, although most of the time it happens when I have a video playing and/or multiple images being displayed with the image viewer.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-bug-exists-upstream
description: updated
Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged
Changed in linux (Ubuntu Bionic):
status: New → Triaged
importance: Undecided → High
Micha Preußer (mipronimo) wrote :

Hey, have you found any solution? I have the same issue and changed the default kernel now to 4.15.0-36-generic. This is working, but it would be better with the new kernel.

Kai-Heng Feng (kaihengfeng) wrote :

Would it be possible for you to do a kernel bisection?

First, find the last good -rc kernel and the first bad -rc kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/

Then,
$ sudo apt build-dep linux
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ cd linux
$ git bisect start
$ git bisect good <the good version you found>
$ git bisect bad <the bad version you found>
$ make localmodconfig
$ make -j`nproc` deb-pkg
Install the newly built kernel, then reboot with it.
If the issue still happens,
$ git bisect bad
Otherwise,
$ git bisect good
Repeat from the "make -j`nproc` deb-pkg" step until you find the commit that causes the regression.
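
Since each round may take a long time to confirm on an intermittent freeze, it helps to know roughly how many rounds to expect. A bisection halves the candidate range each round, so the count grows with the base-2 logarithm of the number of commits between the good and bad versions. This is a minimal sketch: bisect_steps is a hypothetical helper, and the tags in the usage comment are only an example based on the versions mentioned earlier in this report.

```shell
# Hypothetical helper: estimate how many bisection rounds are needed
# for a range of $1 commits, i.e. ceil(log2(n)) by repeated halving.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))      # halve the range, rounding up
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# Example, run inside the cloned linux tree (tags are an assumption):
#   bisect_steps "$(git rev-list --count v4.16..v4.17)"
```

git itself prints a similar "roughly N steps" estimate after each `git bisect good` or `git bisect bad`, so the helper is mainly useful for planning before starting.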

Douglas H. Silva (o-alquimista) wrote :

The problem is, it could take days before the system freezes. I don't know how to reproduce, it simply happens. I don't even know where to start. Maybe 4.17-rc1 and so on. But that's a huge task and one needs a lot of patience to do it. I'm not sure I'll be able to.

Kai-Heng Feng (kaihengfeng) wrote :

The bug in #4 is for Ryzen platforms, so it doesn't apply to Inspiron 5458, which seems to be a Broadwell platform.

Please update the BIOS to A16. If you still see this issue, please attach the output of `journalctl -b -1 -k` after the next boot.

I have updated to A16 and so far no freezes, although it's still not uncommon for these to stop happening for a while and then return.

lb design (lbdesign) wrote :

I have the same freeze problem https://bugs.launchpad.net/ubuntu/+bug/1802902 and seemingly fixed it by deleting the folder .cache/thumbnails/fail

Hope it works so we can track this thing down.

It happened again.
See the attachment of the output of journalctl -b -1 -k

This time I was just editing the position of widgets on the xfce4-panel. I think it was around 11:45 on the clock.

And I do not have a .cache/thumbnails/fail folder.

no longer affects: linux (Arch Linux)
Kai-Heng Feng (kaihengfeng) wrote :

There is no noticeable error message in the log. It could be a hardware freeze. Can you try 4.20-rc2 and boot with the kernel parameter `pcie_aspm=off`?
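
One way to make the parameter persistent on Ubuntu is to append it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and regenerate the GRUB configuration. The helper below is a sketch, not something posted in this thread. add_kernel_param is a hypothetical name, it assumes GNU sed and a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line, and it takes the file path as an argument so it can be tried on a copy first.

```shell
# Hypothetical helper: append kernel parameter $1 to the
# GRUB_CMDLINE_LINUX_DEFAULT line in file $2 (default /etc/default/grub).
add_kernel_param() {
    param=$1
    file=${2:-/etc/default/grub}
    # Do nothing if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=\".*${param}" "$file"; then
        return 0
    fi
    # Re-emit the quoted value with the parameter appended (GNU sed -i).
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 ${param}\"|" "$file"
}

# Example (edits the real file, so run as root):
#   add_kernel_param pcie_aspm=off
```

After editing the real file, run `sudo update-grub` and reboot; `cat /proc/cmdline` should then list `pcie_aspm=off`.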

I also have had random system freezes ever since I upgraded from 18.04 to 18.10, Linux 4.18.0-10-generic, solved by always booting into 4.15.0-36-generic. When the freezes happen, the screen simply stops, and no input is accepted, not even the usual REISUB.

teresaejunior@laptop ~> inxi -Fz
System: Host: laptop Kernel: 4.15.0-36-generic x86_64 bits: 64 Desktop: Xfce 4.13.2
           Distro: Ubuntu 18.10 (Cosmic Cuttlefish)
Machine: Type: Laptop System: LENOVO product: 80JE v: Lenovo G40-80 serial: <filter>
           Mobo: LENOVO model: Lancer 4A1 v: SDK0E50515 STD serial: <filter> UEFI: LENOVO v: B0CN79WW
           date: 05/07/2015
Battery: ID-1: BAT0 charge: 26.4 Wh condition: 26.4/28.5 Wh (93%)
CPU: Topology: Dual Core model: Intel Core i5-5200U bits: 64 type: MT MCP L2 cache: 3072 KiB
           Speed: 1397 MHz min/max: 500/2700 MHz Core speeds (MHz): 1: 1146 2: 1000 3: 1079 4: 1039
Graphics: Device-1: Intel HD Graphics 5500 driver: i915 v: kernel
           Display: x11 server: X.Org 1.20.1 driver: modesetting unloaded: fbdev,vesa resolution: 1366x768~60Hz
           OpenGL: renderer: Mesa DRI Intel HD Graphics 5500 (Broadwell GT2) v: 4.5 Mesa 18.2.2
Audio: Device-1: Intel Broadwell-U Audio driver: snd_hda_intel
           Device-2: Intel Wildcat Point-LP High Definition Audio driver: snd_hda_intel
           Sound Server: ALSA v: k4.15.0-36-generic
Network: Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet driver: r8169
           IF: enp2s0 state: down mac: <filter>
           Device-2: Qualcomm Atheros QCA9565 / AR9565 Wireless Network Adapter driver: ath9k
           IF: wlp3s0 state: up mac: <filter>
           Device-3: Atheros AR3012 Bluetooth 4.0 type: USB driver: btusb
           IF-ID-1: docker0 state: down mac: <filter>
Drives: Local Storage: total: 931.51 GiB used: 411.53 GiB (44.2%)
           ID-1: /dev/sda vendor: Seagate model: ST1000LM024 HN-M101MBB size: 931.51 GiB
Partition: ID-1: / size: 915.71 GiB used: 411.52 GiB (44.9%) fs: ext4 dev: /dev/sda2
Sensors: System Temperatures: cpu: 41.0 C mobo: N/A
           Fan Speeds (RPM): N/A
Info: Processes: 267 Uptime: 7h 12m Memory: 11.64 GiB used: 3.45 GiB (29.7%) Shell: fish inxi: 3.0.24

teresaejunior, could you try that:

"Can you try 4.20-rc2 and boot with kernel parameter `pcie_aspm=off`?"

I suggest first trying 4.20-rcX and seeing if it freezes at least once. If it does, try booting for a few days with pcie_aspm=off in the kernel parameters. Post back results.

I can't do that right now, because I'm not currently running Ubuntu 18.10.

After around two days of running with `pcie_aspm=off`, I can say it does not make any difference. I was just forced to do a hard reset of my laptop.

Kai-Heng Feng (kaihengfeng) wrote :

Please perform a kernel bisection to find which commit introduces the regression.
