Load growing continuously

Bug #1912188 reported by Tamas Papp
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
zfs-ubuntu       Fix Released   Unknown
linux (Ubuntu)   Confirmed      Undecided    Unassigned

Bug Description

Recently the load average started growing without end. After a while, random processes become zombies.

 11:32:22 up 26 min, 1 user, load average: 196.16, 186.30, 138.60
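
On Linux, the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, so a load that keeps climbing can come from tasks blocked in the kernel rather than from CPU use. The following is a minimal sketch, assuming a Linux /proc filesystem, that lists processes currently in state D or Z. The function names are illustrative and not part of any tool mentioned in this report, and only each process's main thread is inspected, so the result may undercount.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def blocked_or_zombie(stat_texts):
    """Keep tasks in state D (uninterruptible sleep) or Z (zombie)."""
    return [t for t in map(parse_stat, stat_texts) if t[2] in ("D", "Z")]


def read_all_stats(proc="/proc"):
    """Yield the stat contents of every process currently listed in /proc."""
    for name in os.listdir(proc):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc, name, "stat"), errors="replace") as fh:
                yield fh.read()
        except OSError:
            continue  # the process exited while we were scanning


def main():
    # Print one line per blocked or zombie process: state, pid, command name.
    for pid, comm, state in blocked_or_zombie(read_all_stats()):
        print(state, pid, comm)
```

Calling main() a few times while the load climbs would show whether the same commands stay stuck in state D.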

ProblemType: Bug
DistroRelease: Ubuntu 21.04
Package: linux-generic 5.8.0.36.40+21.04.38
ProcVersionSignature: Ubuntu 5.8.0-36.40+21.04.1-generic 5.8.18
Uname: Linux 5.8.0-36-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu55
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: tompos 2361 F.... pulseaudio
CasperMD5CheckResult: skip
Date: Mon Jan 18 11:30:11 2021
InstallationDate: Installed on 2020-12-31 (17 days ago)
InstallationMedia: Ubuntu 20.04 LTS "Focal Fossa" - Release amd64 (20200423)
MachineType: HP HP EliteBook x360 1030 G3
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/@/boot/vmlinuz-5.8.0-36-generic root=UUID=086c6507-b389-4208-a8bc-d0b99071b5cd ro rootflags=subvol=@
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.8.0-36-generic N/A
 linux-backports-modules-5.8.0-36-generic N/A
 linux-firmware 1.194
SourcePackage: linux
UpgradeStatus: Upgraded to hirsute on 2021-01-02 (15 days ago)
dmi.bios.date: 10/22/2020
dmi.bios.release: 14.1
dmi.bios.vendor: HP
dmi.bios.version: Q90 Ver. 01.14.01
dmi.board.name: 8438
dmi.board.vendor: HP
dmi.board.version: KBC Version 14.4E.00
dmi.chassis.type: 31
dmi.chassis.vendor: HP
dmi.ec.firmware.release: 20.78
dmi.modalias: dmi:bvnHP:bvrQ90Ver.01.14.01:bd10/22/2020:br14.1:efr20.78:svnHP:pnHPEliteBookx3601030G3:pvr:rvnHP:rn8438:rvrKBCVersion14.4E.00:cvnHP:ct31:cvr:
dmi.product.family: 103C_5336AN HP EliteBook
dmi.product.name: HP EliteBook x360 1030 G3
dmi.product.sku: 5DT80EC#BH4
dmi.sys.vendor: HP
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu56
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: tompos 2721 F.... pulseaudio
CasperMD5CheckResult: unknown
DistroRelease: Ubuntu 21.04
InstallationDate: Installed on 2020-12-31 (30 days ago)
InstallationMedia: Ubuntu 20.04 LTS "Focal Fossa" - Release amd64 (20200423)
MachineType: HP HP EliteBook x360 1030 G3
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
Package: linux (not installed)
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/@/boot/vmlinuz-5.8.0-33-generic root=UUID=086c6507-b389-4208-a8bc-d0b99071b5cd ro rootflags=subvol=@
ProcVersionSignature: Ubuntu 5.8.0-33.36-generic 5.8.17
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.8.0-33-generic N/A
 linux-backports-modules-5.8.0-33-generic N/A
 linux-firmware 1.194
Tags: hirsute
Uname: Linux 5.8.0-33-generic x86_64
UpgradeStatus: Upgraded to hirsute on 2021-01-02 (28 days ago)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 10/22/2020
dmi.bios.release: 14.1
dmi.bios.vendor: HP
dmi.bios.version: Q90 Ver. 01.14.01
dmi.board.name: 8438
dmi.board.vendor: HP
dmi.board.version: KBC Version 14.4E.00
dmi.chassis.type: 31
dmi.chassis.vendor: HP
dmi.ec.firmware.release: 20.78
dmi.modalias: dmi:bvnHP:bvrQ90Ver.01.14.01:bd10/22/2020:br14.1:efr20.78:svnHP:pnHPEliteBookx3601030G3:pvr:rvnHP:rn8438:rvrKBCVersion14.4E.00:cvnHP:ct31:cvr:
dmi.product.family: 103C_5336AN HP EliteBook
dmi.product.name: HP EliteBook x360 1030 G3
dmi.product.sku: 5DT80EC#BH4
dmi.sys.vendor: HP

Tamas Papp (tomposmiko) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Tamas Papp (tomposmiko) wrote :

I don't experience the same issue with 5.8.0-33-generic.

Terry Rudd (terrykrudd) wrote :

Are you running the 5.8.0-33 kernel in the same 21.04 user space?

Tamas Papp (tomposmiko) wrote :

Kernel in userspace?
Do you mean whether it's the same machine? Yes.

Terry Rudd (terrykrudd) wrote :

Yes, Tamas. In other words, I believe you are confirming that the only thing changing back and forth is the kernel version you are running. Thank you for that.

Tamas Papp (tomposmiko) wrote :

I can confirm that.

Terry Rudd (terrykrudd) wrote :

Thank you, Tamas. You are likely aware that we don't plan to support 5.8.* in the 21.04 user space, but we're very interested in making sure this issue is not present in 20.10 (groovy). Have you tested with that user space? Also, have you tried the 5.10 kernel in -proposed for 21.04? If you see the same or a similar issue with that kernel, please let us know.

I would be very interested in results from both if you have the time and ability to test both. Thanks in advance.

Tamas Papp (tomposmiko) wrote :

So far it looks good with

Linux RF-PC-BUD-036 5.10.0-12-generic #13-Ubuntu SMP Mon Jan 11 22:44:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

I don't know if there is a Groovy machine around me. I will try to check.

Tamas Papp (tomposmiko) wrote :

I was wrong, I get the same with this kernel too.

Kai-Heng Feng (kaihengfeng) wrote :

I can see several ZFS panics in the dmesg. Can you please try it without ZFS?
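
One way to pull such messages out of a kernel log is a simple pattern filter. The sketch below assumes the panics show up as SPL-style "PANIC at file:line:function()" and "VERIFY...(" lines, or as the kernel's hung-task warning. The attached dmesg is not quoted in this report, so the patterns are guesses that should be adjusted to the actual log.

```python
import re
import subprocess

# Assumed patterns; the dmesg attached to this report is not quoted here,
# so treat these as a starting point rather than a complete list.
PATTERNS = re.compile(r"PANIC at |VERIFY\d*\(|blocked for more than \d+ seconds")


def suspicious_lines(log_text):
    """Return the log lines that match any of the assumed patterns."""
    return [line for line in log_text.splitlines() if PATTERNS.search(line)]


def main():
    # Reading the kernel log may need root, depending on kernel.dmesg_restrict.
    log = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=False
    ).stdout
    for line in suspicious_lines(log):
        print(line)
```

The filter is kept separate from the dmesg call so it can also be run over a saved log file.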

Tamas Papp (tomposmiko) wrote :

Yeah, it works well, with no problems, as long as I don't mount the home directory of the other (my standard) user, which is actually an encrypted ZFS filesystem.

Colin Ian King (colin-king) wrote :
affects: linux → zfs-ubuntu
Changed in zfs-ubuntu:
status: Unknown → New
Colin Ian King (colin-king) wrote :

If possible, can you install zfsutils-linux 0.8.4-1ubuntu12 and see if that helps? I believe there may be an issue with a 5.9/5.10 backport in zfsutils-linux 0.8.4-1ubuntu13.

Colin Ian King (colin-king) wrote :

@all, ignore comment #14

Colin Ian King (colin-king) wrote :

The solution is to use ZFS 2.0.1 with Hirsute; however, this is stuck in the -proposed pocket at the moment because of a build failure related to a dependency on zsys. I suggest using the ZFS package from the following PPA:

sudo add-apt-repository ppa:colin-king/zfs-hirsute
sudo apt-get update
sudo apt-get install zfs-dkms

This is the same as the package that is stuck in -proposed, and should allow you to use ZFS on that kernel w/o issues. Please try this and let me know if this helps.

Tamas Papp (tomposmiko) wrote :

If you don't need testing, I would rather wait for the standard package and avoid installing dkms.
Or do you need testing?

Colin Ian King (colin-king) wrote :

Testing the zfs dkms driver would be really useful since the ZFS 2.0.1 drivers won't be landing in a hirsute kernel that soon.

Tamas Papp (tomposmiko) wrote :

I get the same with ZFS 2.0.1. The dmesg output is attached, if it helps.

Tamas Papp (tomposmiko) wrote :

Today I get the same even with 5.8.0-33-generic.

I can also see the same ZFS errors in dmesg.

tags: added: apport-collected
description: updated
Tamas Papp (tomposmiko) wrote : AlsaInfo.txt

apport information

Tamas Papp (tomposmiko) wrote : CRDA.txt

apport information

Tamas Papp (tomposmiko) wrote : CurrentDmesg.txt

apport information

Tamas Papp (tomposmiko) wrote : IwConfig.txt

apport information

Tamas Papp (tomposmiko) wrote : Lspci.txt

apport information

Tamas Papp (tomposmiko) wrote : Lspci-vt.txt

apport information

Tamas Papp (tomposmiko) wrote : Lsusb.txt

apport information

Tamas Papp (tomposmiko) wrote : Lsusb-t.txt

apport information

Tamas Papp (tomposmiko) wrote : Lsusb-v.txt

apport information

Tamas Papp (tomposmiko) wrote : PaInfo.txt

apport information

Tamas Papp (tomposmiko) wrote : ProcCpuinfo.txt

apport information

Tamas Papp (tomposmiko) wrote : ProcCpuinfoMinimal.txt

apport information

Tamas Papp (tomposmiko) wrote : ProcEnviron.txt

apport information

Tamas Papp (tomposmiko) wrote : ProcInterrupts.txt

apport information

Tamas Papp (tomposmiko) wrote : ProcModules.txt

apport information

Tamas Papp (tomposmiko) wrote : RfKill.txt

apport information

Tamas Papp (tomposmiko) wrote : UdevDb.txt

apport information

Tamas Papp (tomposmiko) wrote : WifiSyslog.txt

apport information

Tamas Papp (tomposmiko) wrote : acpidump.txt

apport information

Tamas Papp (tomposmiko) wrote :

After some debugging, I figured out that there are files on the filesystem that cause the accessing process to hang. Even a simple ls or find can get stuck if it tries to access such a file for some metadata. I still don't know if it's a coincidence or not, but all of the files were some kind of browser cache objects (browser, Slack client, Teams client).
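
The search for files that hang on access could be automated roughly as follows. This is a sketch, not the method used in this report: it runs the external stat command on each file in a child process with a timeout and reports the paths where the command does not return in time. The function names and the 5-second timeout are assumptions chosen for illustration.

```python
import os
import subprocess
import sys


def probe(path, timeout=5.0, command=("stat", "--")):
    """Return True if `command path` exits within `timeout` seconds."""
    child = subprocess.Popen(
        [*command, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        child.wait(timeout=timeout)
        return True  # finished (successfully or not); it did not hang
    except subprocess.TimeoutExpired:
        # Best effort only: a task stuck inside the kernel may not die,
        # so do not wait for it after sending the kill signal.
        child.kill()
        return False


def find_hanging(root, timeout=5.0):
    """Yield files under `root` whose metadata lookup does not return in time."""
    # Caveat: os.walk reads directories in this process, so the walk itself
    # could block if listing a directory is what hangs.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not probe(path, timeout):
                yield path


def main(argv=None):
    # Usage: main(["prog", "/path/to/home"]) prints each suspect path.
    argv = sys.argv if argv is None else argv
    for path in find_hanging(argv[1]):
        print(path)
```

Probing in a child process keeps the scanner itself responsive; each hung child may remain stuck until the underlying problem clears or the machine is rebooted.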

I copied everything to a new ZFS dataset and set it as my home folder. It worked like a charm for 2 weeks. So I think it cannot be a HW failure, or else it must be some very weird bug...

Now I have created a new encrypted ZFS filesystem and migrated my home folder to it again.
If it starts failing again, it would confirm that the problem occurs only when encryption is enabled.
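
To keep a record of which datasets are encrypted while testing this hypothesis, the output of zfs get can be parsed. The sketch below assumes the tab-separated output of `zfs get -H -o name,value -t filesystem,volume encryption`; the helper names are illustrative, and no dataset names from this machine are known, so none are hard-coded.

```python
import subprocess


def parse_encryption(output):
    """Map dataset name -> encryption value from `zfs get -H -o name,value`."""
    result = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # With -H, fields are separated by a single tab and headers are omitted.
        name, value = line.split("\t", 1)
        result[name] = value.strip()
    return result


def encrypted_datasets(output):
    """Return the sorted names of datasets whose encryption is not 'off'."""
    return sorted(n for n, v in parse_encryption(output).items() if v != "off")


def query_zfs():
    """Run zfs get; requires the zfs command-line tool to be installed."""
    cmd = ["zfs", "get", "-H", "-o", "name,value",
           "-t", "filesystem,volume", "encryption"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

Calling encrypted_datasets(query_zfs()) before and after each migration gives a simple list to compare against the datasets where hangs were seen.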

It's also worth mentioning that if I try to copy the ZFS filesystem with zfs send, the machine locks up really soon. Even if it's caused by a HW error, this must be a ZFS bug IMO.
A ZFS scrub does not find anything.

Changed in zfs-ubuntu:
status: New → Fix Released