5.4.0-58-generic: Raspberry Pi 3 arm64 occasionally unresponsive

Bug #1908423 reported by Birgit Edel
This bug affects 4 people
Affects: linux-raspi (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone:

Bug Description

Since upgrading to 5.4.0-56-generic, my RPi 3 devices have started freezing randomly, about once every few hours.

Kernels -56, -58 and -59 are affected; reverting *only* the kernel to -53 got me 3 days without a freeze. Diagnostics so far have not turned up anything useful. I am still working out a cross-compiled bisect and a reasonably fast reproduction.

The devices are also known as (/proc/device-tree/compatible)
raspberrypi,3-model-b brcm,bcm2837
raspberrypi,3-model-b-plus brcm,bcm2837
I only use them in ARM64 mode, and I only run -generic kernels via Pete Batard's EDK II builds.
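
A device-tree "compatible" property is stored as a sequence of NUL-terminated strings, so the entries run together when the file is printed directly. A minimal sketch for reading it as a list, assuming Python 3 is available on the device (the function names are illustrative, not part of any existing tool):

```python
from pathlib import Path
from typing import List


def parse_compatible(raw: bytes) -> List[str]:
    """Split a device-tree 'compatible' property into its individual strings."""
    # Entries are NUL-terminated; drop the empty piece after the last NUL.
    return [part.decode("ascii") for part in raw.split(b"\0") if part]


def read_compatible(path: str = "/proc/device-tree/compatible") -> List[str]:
    """Read and split the property from the running system."""
    return parse_compatible(Path(path).read_bytes())


if __name__ == "__main__":
    # Sample bytes as they would appear on a Raspberry Pi 3 Model B.
    sample = b"raspberrypi,3-model-b\0brcm,bcm2837\0"
    print(parse_compatible(sample))
```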

(unreliable, slow) reproduction:
1. run affected aarch64 kernel
2. set low non-zero /proc/sys/kernel/hung_task_timeout_secs
3. utilize xorg through vc4 kms
4. check dmesg when display freezes
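
Steps 2 and 4 can be scripted. The following is only a sketch: the two patterns are taken from the kernel messages quoted in this report, while the helper names and the tuple format are made up for illustration. Writing the timeout needs root on a real system.

```python
import re
from pathlib import Path

HUNG_TASK_TIMEOUT = Path("/proc/sys/kernel/hung_task_timeout_secs")

# Patterns based on the two symptoms quoted in this report.
MBOX_ERROR = re.compile(r"mbox_send_message returned (-\d+)")
HUNG_TASK = re.compile(r"INFO: task (\S+) blocked for more than (\d+) seconds")


def set_hung_task_timeout(seconds, path=HUNG_TASK_TIMEOUT):
    """Step 2: write a low, non-zero timeout."""
    if seconds <= 0:
        raise ValueError("0 disables the hung-task message; use a positive value")
    Path(path).write_text(f"{seconds}\n")


def find_symptoms(lines):
    """Step 4: return (kind, detail) tuples for log lines matching either symptom."""
    hits = []
    for line in lines:
        match = MBOX_ERROR.search(line)
        if match:
            hits.append(("mbox_error", int(match.group(1))))
            continue
        match = HUNG_TASK.search(line)
        if match:
            hits.append(("hung_task", match.group(1)))
    return hits


if __name__ == "__main__":
    sample = [
        "kernel: raspberrypi-firmware soc:firmware: mbox_send_message returned -62",
        "INFO: task kworker/2:1:36 blocked for more than 120 seconds.",
    ]
    for hit in find_symptoms(sample):
        print(hit)
```

On a device, the lines could come from saved `dmesg` output instead of the sample list.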

symptom 1 (always the same -62):
13:52:01.832230 kernel: raspberrypi-firmware soc:firmware: mbox_send_message returned -62
13:52:01.839310 kernel: raspberrypi-clk raspberrypi-clk: Failed to change pllb frequency: -62

Note: similar message in Bug 1889637
Exactly one Google search result for the exact message: https://forum.openwrt.org/t/rpi-4-failed-to-change-pllb-frequency/81840

symptom 2 (varying hangs):
[Dez15 16:09] INFO: task kworker/2:1:36 blocked for more than 120 seconds.
[ +0,000019] Tainted: G C E 5.4.0-58-generic #64-Ubuntu
[ +0,000004] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ +0,000006] kworker/2:1 D 0 36 2 0x00000028
[ +0,000022] Workqueue: events dbs_work_handler
[ +0,000004] Call trace:
[ +0,000009] __switch_to+0xf8/0x1b0
[ +0,000008] __schedule+0x310/0x7d8
[ +0,000005] schedule+0x40/0xb8
[ +0,000004] schedule_timeout+0xa0/0x198
[ +0,000005] __wait_for_common+0xf0/0x230
[ +0,000005] wait_for_completion_timeout+0x38/0x48
[ +0,000006] mbox_send_message+0xd0/0x170
[ +0,000006] rpi_firmware_property_list+0xec/0x250
[ +0,000004] rpi_firmware_property+0x78/0xb8
[ +0,000010] raspberrypi_fw_pll_set_rate+0x60/0xe0 [clk_raspberrypi]
[ +0,000008] clk_change_rate+0xdc/0x420
[ +0,000004] clk_core_set_rate_nolock+0x1cc/0x1f0
[ +0,000003] clk_set_rate+0x3c/0xc0
[ +0,000006] dev_pm_opp_set_rate+0x3d4/0x520
[ +0,000004] set_target+0x4c/0x90
[ +0,000007] __cpufreq_driver_target+0x2c8/0x678
[ +0,000004] od_dbs_update+0x144/0x1a0
[ +0,000003] dbs_work_handler+0x48/0x80
[ +0,000006] process_one_work+0x1d0/0x468
[ +0,000005] worker_thread+0x154/0x4e0
[ +0,000004] kthread+0xf0/0x118
[ +0,000004] ret_from_fork+0x10/0x18
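
Read from the bottom up, the trace shows the ondemand governor worker (dbs_work_handler, od_dbs_update) requesting a frequency change that goes through clk_set_rate into raspberrypi_fw_pll_set_rate, which then waits on a firmware mailbox message. A rough sketch for extracting that call path from such a trace; the regular expression and function names are illustrative assumptions, not an existing tool:

```python
import re

# Matches frames such as "mbox_send_message+0xd0/0x170", optionally followed by "[module]".
FRAME = re.compile(
    r"(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(lines):
    """Return (function, module-or-None) for every stack frame line, in log order."""
    frames = []
    for line in lines:
        match = FRAME.search(line)
        if match:
            frames.append((match.group("func"), match.group("module")))
    return frames


def call_path(frames):
    """Function names ordered from outermost caller to innermost callee."""
    return [func for func, _module in reversed(frames)]


if __name__ == "__main__":
    sample = [
        "[ +0,000004] Call trace:",
        "[ +0,000006] mbox_send_message+0xd0/0x170",
        "[ +0,000010] raspberrypi_fw_pll_set_rate+0x60/0xe0 [clk_raspberrypi]",
        "[ +0,000003] clk_set_rate+0x3c/0xc0",
    ]
    print(" -> ".join(call_path(parse_frames(sample))))
```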

non-kernel-version explanations ruled out:
network adapter is neither idle nor saturated (~32MiB / hour on enxb827eb..)
4 different power supply models, 2 of them bad but not correlated with frequency of hang
swap size & usage obviously do change system responsiveness, but are not correlated with frequency of hang
hang reported with & without wifi in use on the device itself or on any other device within 1 m
hang reported with & without config.txt dtoverlay=disable-wifi (NOT confirmed that the overlay even works)
hang reported with cpu_thermal-virtual-0 between 59-67°C (3 copper plates, no fan)
hang reported with & without kernel cmdline memtest=2
hang reported with & without kernel cmdline dwc_otg.speed=1
hang reported with firmware-1.20200601 & firmware-1.20201201
hang reported with /proc/device-tree/model "Raspberry Pi 3 Model B Rev 1.2" & "Raspberry Pi 3 Model B Plus Rev 1.3"
hang reported with & without config.txt enable_uart=1,uart_2ndstage=1
hang reported with config.txt boot_delay=2 bootcode_delay=2
hang reported with EFI "rpi3 tianocore 1.27" & "rpi3 tianocore 1.31"

hang reported with 5.4.0-56-generic usually after a few hours of uptime
hang reported with 5.4.0-58-generic usually after a few hours of uptime
hang reported with (hwe-edge) 5.8.0-31-generic usually after a few hours of uptime
hang reported with 5.4.0-59-generic usually after a few hours of uptime
never seen in versions 5.4.0-26-generic through 5.4.0-53-generic, with one notable exception while investigating:
similar hang reported with 5.4.0-53-generic if and only if attempting to unload the i2c_bcm2835 module

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-image-5.4.0-58-generic 5.4.0-58.64
ProcVersionSignature: Ubuntu 5.4.0-58.64-generic 5.4.73
Uname: Linux 5.4.0-58-generic aarch64
AlsaVersion: Advanced Linux Sound Architecture Driver Version k5.4.0-58-generic.
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu27.14
Architecture: arm64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/controlC1', '/dev/snd/pcmC1D1p', '/dev/snd/pcmC1D0p', '/dev/snd/by-path', '/dev/snd/seq', '/dev/snd/controlC0', '/dev/snd/pcmC0D0p', '/dev/snd/timer'] failed with exit code 1:
Card0.Amixer.info: Error: [Errno 2] No such file or directory: 'amixer'
Card0.Amixer.values: Error: [Errno 2] No such file or directory: 'amixer'
Card1.Amixer.info: Error: [Errno 2] No such file or directory: 'amixer'
Card1.Amixer.values: Error: [Errno 2] No such file or directory: 'amixer'
CasperMD5CheckResult: skip
Date: Tue Dec 15 16:46:43 2020
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lspci:

Lspci-vt: -[0000:00]-
Lsusb:
 Bus 001 Device 004: ID 046d:c52b Logitech, Inc. Unifying Receiver
 Bus 001 Device 003: ID 0424:ec00 Microchip Technology, Inc. (formerly SMSC) SMSC9512/9514 Fast Ethernet Adapter
 Bus 001 Device 002: ID 0424:9514 Microchip Technology, Inc. (formerly SMSC) SMC9514 Hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Raspberry Pi Foundation Raspberry Pi 3 Model B
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcFB: 0 vc4drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-58-generic root=UUID=e3bc3b23-fd02-4250-8350-5916df6d6fd1 ro consoleblank=0 usbcore.authorized_default=0 splash fsck.repair=yes cma=128M ip=dhcp printk.devkmsg=on vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-58-generic N/A
 linux-backports-modules-5.4.0-58-generic N/A
 linux-firmware 1.187.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
StagingDrivers: snd_bcm2835 vchiq
UpgradeStatus: No upgrade log present (probably fresh install)
acpidump:

dmi.bios.date: 11/13/2020
dmi.bios.vendor: https://github.com/pftf/RPi3
dmi.bios.version: UEFI Firmware v1.31
dmi.board.name: Raspberry Pi 3 Model B
dmi.board.vendor: Sony UK
dmi.board.version: A02082
dmi.chassis.type: 34
dmi.chassis.vendor: Sony UK
dmi.chassis.version: Raspberry Pi 3 Model B
dmi.modalias: dmi:bvnhttps//github.com/pftf/RPi3:bvrUEFIFirmwarev1.31:bd11/13/2020:svnRaspberryPiFoundation:pnRaspberryPi3ModelB:pvrA02082:rvnSonyUK:rnRaspberryPi3ModelB:rvrA02082:cvnSonyUK:ct34:cvrRaspberryPi3ModelB:
dmi.product.family: Raspberry Pi
dmi.product.name: Raspberry Pi 3 Model B
dmi.product.sku: 0000000000A02082
dmi.product.version: A02082
dmi.sys.vendor: Raspberry Pi Foundation
modified.conffile..etc.default.apport: enabled=0
mtime.conffile..etc.default.apport: 2020-07-06T13:01:13.327228
---
ProblemType: Bug
AlsaVersion: Advanced Linux Sound Architecture Driver Version k5.4.0-59-generic.
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu27.14
Architecture: arm64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/seq', '/dev/snd/controlC0', '/dev/snd/pcmC0D0p', '/dev/snd/timer'] failed with exit code 1:
Card0.Amixer.info: Error: [Errno 2] No such file or directory: 'amixer'
Card0.Amixer.values: Error: [Errno 2] No such file or directory: 'amixer'
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lspci:

Lspci-vt: -[0000:00]-
Lsusb:
 Bus 001 Device 004: ID 046d:c52b Logitech, Inc. Unifying Receiver
 Bus 001 Device 003: ID 0424:ec00 Microchip Technology, Inc. (formerly SMSC) SMSC9512/9514 Fast Ethernet Adapter
 Bus 001 Device 002: ID 0424:9514 Microchip Technology, Inc. (formerly SMSC) SMC9514 Hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Raspberry Pi Foundation Raspberry Pi 3 Model B
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcFB: 0 vc4drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-59-generic root=UUID=e3bc3b23-fd02-4250-8350-5916df6d6fd1 ro consoleblank=0 usbcore.authorized_default=0 splash fsck.repair=yes cma=128M ip=dhcp printk.devkmsg=on vt.handoff=7
ProcVersionSignature: Ubuntu 5.4.0-59.65-generic 5.4.78
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-59-generic N/A
 linux-backports-modules-5.4.0-59-generic N/A
 linux-firmware 1.187.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
StagingDrivers: snd_bcm2835 vchiq
Tags: focal staging
Uname: Linux 5.4.0-59-generic aarch64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
acpidump:

dmi.bios.date: 11/13/2020
dmi.bios.vendor: https://github.com/pftf/RPi3
dmi.bios.version: UEFI Firmware v1.31
dmi.board.name: Raspberry Pi 3 Model B
dmi.board.vendor: Sony UK
dmi.board.version: A02082
dmi.chassis.type: 34
dmi.chassis.vendor: Sony UK
dmi.chassis.version: Raspberry Pi 3 Model B
dmi.modalias: dmi:bvnhttps//github.com/pftf/RPi3:bvrUEFIFirmwarev1.31:bd11/13/2020:svnRaspberryPiFoundation:pnRaspberryPi3ModelB:pvrA02082:rvnSonyUK:rnRaspberryPi3ModelB:rvrA02082:cvnSonyUK:ct34:cvrRaspberryPi3ModelB:
dmi.product.family: Raspberry Pi
dmi.product.name: Raspberry Pi 3 Model B
dmi.product.sku: 0000000000A02082
dmi.product.version: A02082
dmi.sys.vendor: Raspberry Pi Foundation
modified.conffile..etc.default.apport: enabled=0
mtime.conffile..etc.default.apport: 2020-07-06T13:01:13.327228

Birgit Edel (biredel) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1908423

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Birgit Edel (biredel) wrote : AlsaDevices.txt

apport information

tags: added: apport-collected
description: updated
Birgit Edel (biredel) wrote : CRDA.txt

apport information

Birgit Edel (biredel) wrote : CurrentDmesg.txt

apport information

Birgit Edel (biredel) wrote : Lsusb-t.txt

apport information

Birgit Edel (biredel) wrote : Lsusb-v.txt

apport information

Birgit Edel (biredel) wrote : ProcCpuinfo.txt

apport information

Birgit Edel (biredel) wrote : ProcCpuinfoMinimal.txt

apport information

Birgit Edel (biredel) wrote : ProcInterrupts.txt

apport information

Birgit Edel (biredel) wrote : ProcModules.txt

apport information

Birgit Edel (biredel) wrote : UdevDb.txt

apport information

Birgit Edel (biredel) wrote : WifiSyslog.txt

apport information

Birgit Edel (biredel)
description: updated
Birgit Edel (biredel) wrote :

The logs attached to the original bug relate to the issue I care about.

The logs attached to fulfill the ubuntu-kernel-bot demands contain the dmesg output produced by `modprobe -r -v i2c_bcm2835`, which demonstrates that those kernel messages share some similarities with the hang.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
description: updated
Brian (tenethor) wrote :

I have the same or a similar issue on my RPi 4. While playing video with Kodi, the Pi randomly hangs: it stops responding to mouse and keyboard, and the display freezes. I can still SSH in, and the web server and torrent servers keep running fine, but the CPU load slowly rises to somewhere in the 8-9 range. If left alone, it recovers after maybe 30-45 minutes, and when it does, the kernel posts the two messages below to dmesg. It is always the same two messages. I have also seen the "kworker has blocked for more than 120 seconds" message mentioned above, but not lately.

Distro:
Ubuntu MATE for Raspberry Pi 20.04

uname:
Linux raspberry 5.8.0-1011-raspi #14-Ubuntu SMP PREEMPT Tue Dec 15 08:53:29 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

dmesg:
[So Jan 10 04:31:09 2021] raspberrypi-firmware soc:firmware: mbox_send_message returned -62
[So Jan 10 04:31:09 2021] cpu cpu0: dev_pm_opp_set_rate: failed to find current OPP for freq 18446744073709551554 (-34)
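
The frequency printed in the second message equals 2^64 - 62, which is what -62 looks like when printed as an unsigned 64-bit integer. This suggests, though nothing in this report confirms it, that the mailbox error code was passed along as a clock rate. On Linux, errno 62 is ETIME and errno 34 is ERANGE. A small sketch of the arithmetic; the helper names are illustrative, and the two-entry table is hard-coded rather than taken from the running system:

```python
# Linux errno values that appear in this report (hard-coded for illustration).
LINUX_ERRNO_NAMES = {34: "ERANGE", 62: "ETIME"}


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def errno_name(code: int) -> str:
    """Name for a kernel-style negative (or positive) error code."""
    return LINUX_ERRNO_NAMES.get(abs(code), "unknown")


if __name__ == "__main__":
    freq = 18446744073709551554  # value printed by dev_pm_opp_set_rate
    code = as_signed64(freq)
    print(code, errno_name(code))  # -62 ETIME
    print(-34, errno_name(-34))    # -34 ERANGE
```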

Based on the "dev_pm_opp_set_rate" error, I assumed it was related to the cpufreq function, so I changed the cpufreq governor to 'performance' and I haven't had any issues since. I'll continue testing and maybe someday try to debug the kernel, but for now it's fixed by leaving the governor on performance.

If I'm right, then this isn't an Ubuntu bug but really a kernel bug, specific to the Raspberry Pi.

affects: linux (Ubuntu) → linux-raspi (Ubuntu)
Birgit Edel (biredel) wrote :

The workaround of using the "performance" cpufreq governor only appears to be fully effective if applied from system boot.

On 5.4 kernels I can just disable ondemand.service to keep the performance governor active. On 5.8 kernels I would need to switch back from an apparently changed default. The kernel cmdline parameter cpufreq.default_governor is not available until 5.9.
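
For checking or switching the governor at runtime, a minimal sketch using the standard cpufreq sysfs files follows. The function names are illustrative, writing requires root, and as noted above a runtime switch may be less effective than having the governor set from boot:

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")


def governor_files(root=CPU_ROOT):
    """All per-CPU scaling_governor files below root (cpu0, cpu1, ...)."""
    return sorted(Path(root).glob("cpu[0-9]*/cpufreq/scaling_governor"))


def read_governors(root=CPU_ROOT):
    """Map CPU name (e.g. 'cpu0') to its current governor."""
    return {f.parent.parent.name: f.read_text().strip() for f in governor_files(root)}


def set_governor(name, root=CPU_ROOT):
    """Write the given governor name to every CPU."""
    for f in governor_files(root):
        f.write_text(name + "\n")


if __name__ == "__main__":
    # Prints an empty dict on systems without cpufreq support.
    print(read_governors())
```

Calling `set_governor("performance")` as root would apply the workaround described in this thread until the next reboot or until another service changes it back.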
