Very, very, very slow even though it is quite a capable system

Bug #2068804 reported by emreozkapi
This bug affects 1 person
Affects: linux (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

The system is unbelievably slow at boot time, and intermittently slow at runtime.

ProblemType: Bug
DistroRelease: Ubuntu Kylin 24.04
Package: gnome-shell 46.0-0ubuntu5.1 [origin: Ubuntu]
ProcVersionSignature: Ubuntu 6.8.0-35.35.1-lowlatency 6.8.4
Uname: Linux 6.8.0-35-lowlatency x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.28.1-0ubuntu3
Architecture: amd64
CasperMD5CheckResult: pass
CloudArchitecture: x86_64
CloudID: none
CloudName: none
CloudPlatform: none
CloudSubPlatform: config
CurrentDesktop: ubuntu:GNOME
Date: Sat Jun 8 18:56:08 2024
DisplayManager: gdm3
InstallationDate: Installed on 2023-01-05 (520 days ago)
InstallationMedia: Ubuntu 22.10 "Kinetic Kudu" - Release amd64 (20221020)
RelatedPackageVersions: mutter-common 46.0-1ubuntu9
SourcePackage: gnome-shell
UpgradeStatus: No upgrade log present (probably fresh install)

Daniel van Vugt (vanvugt) wrote :

While the problem is happening, please run:

  ps -e -o pid,pcpu,comm --sort -pcpu | head -20 > tops.txt
  journalctl -b0 > journal.txt

and attach the resulting text files here.

affects: gnome-shell (Ubuntu) → ubuntu
Changed in ubuntu:
status: New → Incomplete
emreozkapi (eozkapi) wrote :

I am checking the output as well, and I will update this report if I can identify any positive changes.

emreozkapi (eozkapi) wrote :

I tested the same system with an Ubuntu 24.04 live OS booted over USB: no problem. Boot was as fast as one can expect from a live OS running from an SD card.

emreozkapi (eozkapi) wrote :

I completely removed some of the software that takes a long time to start during boot, but there is no change.

containerd.io - removed

virtualbox - removed

FluidSynth - removed
fprintd - removed

geoclue - removed

emreozkapi (eozkapi) wrote :

libgeoclue-2-0 will be removed with configuration
gir1.2-geoclue-2.0 will be removed
gnome-control-center will be removed
gnome-initial-setup will be removed
gnome-session-flashback will be removed
gnome-settings-daemon will be removed
gnome-shell will be removed
gnome-shell-extension-desktop-icons-ng will be removed
gnome-shell-extension-ubuntu-dock will be removed
gnome-shell-extension-ubuntu-tiling-assistant will be removed
unity-control-center (version 15.04.0+23.04.20230220-0ubuntu8) will be installed
qemu-utils will be removed with configuration
geoclue-2.0 will be removed
qemu-block-extra will be removed
realmd - removed

plasma-discover-notifier will be removed with configuration
update-notifier will be removed with configuration
update-notifier-common will be removed with configuration
plasma-distro-release-notifier will be removed
ubuntu-mate-core will be removed
ubuntu-mate-desktop will be removed
ubuntu-release-upgrader-gtk will be removed
update-manager will be removed

packagekit will be removed with configuration
apturl will be removed
gnome-software will be removed
gnome-software-plugin-flatpak will be removed
gstreamer1.0-packagekit will be removed
muon will be removed
packagekit-tools will be removed
plasma-discover will be removed
plasma-discover-backend-flatpak will be removed
plasma-discover-backend-fwupd will be removed
plasma-discover-backend-snap will be removed
software-properties-common will be removed
software-properties-gtk will be removed
software-properties-qt will be removed
ubuntu-wsl will be removed

emreozkapi (eozkapi) wrote :

After deleting all those packages the system is a bit faster, but still not even close to the speed of the live OS booted from the SD card.

The latest outputs of the commands you requested are attached to this post.

P.S.: I will wait a few more days, since I am not in a hurry, but if the cause cannot be found by then, I will go ahead with a fresh install.

emreozkapi (eozkapi) wrote :

/dev/nvme0n1 /dev/ng0n1 50026B7686280109 KINGSTON SKC3000S1024G 0x1 1.02 TB / 1.02 TB 512 B + 0 B EIFK31.6
root@ee-Z790-Steel-Legend-WiFi:/home/ee# nvme --smart-log /dev/nvme0n1
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning : 0
temperature : 33 °C (306 K)
available_spare : 100%
available_spare_threshold : 10%
percentage_used : 1%
endurance group critical warning summary: 0
Data Units Read : 11570893 (5.92 TB)
Data Units Written : 16116819 (8.25 TB)
host_read_commands : 145283771
host_write_commands : 185858657
controller_busy_time : 1021
power_cycles : 924
power_on_hours : 2173
unsafe_shutdowns : 119
media_errors : 0
num_err_log_entries : 4271
Warning Temperature Time : 0
Critical Composite Temperature Time : 0
Temperature Sensor 2 : 48 °C (321 K)
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 0
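
The TB figures in the SMART log above can be cross-checked by hand. The NVMe specification counts Data Units in thousands of 512-byte blocks, so each unit is 512,000 bytes. A minimal shell sketch of that arithmetic follows; the function name `data_units_to_gb` is made up here for illustration and is not part of nvme-cli.

```shell
# Convert an NVMe "Data Units" counter to whole decimal gigabytes (10^9 bytes).
# Assumption: one data unit = 1000 * 512 bytes, per the NVMe SMART/Health log.
# data_units_to_gb is a hypothetical helper, not an nvme-cli command.
data_units_to_gb() {
    echo $(( $1 * 512000 / 1000000000 ))
}

data_units_to_gb 11570893   # Data Units Read from the log above
data_units_to_gb 16116819   # Data Units Written from the log above
```

This prints 5924 and 8251, which is consistent with the 5.92 TB read and 8.25 TB written that the tool reported.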

emreozkapi (eozkapi) wrote :

fdisk -l /dev/nvme0n1
Disk /dev/nvme0n1: 953,87 GiB, 1024209543168 bytes, 2000409264 sectors
Disk model: KINGSTON SKC3000S1024G
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 2CFAD824-1047-4456-A7B4-52F549941301

Device Start End Sectors Size Type
/dev/nvme0n1p1 2048 1023999 1021952 499M Windows recovery environment
/dev/nvme0n1p2 1024000 1226751 202752 99M EFI System
/dev/nvme0n1p3 1226752 1259519 32768 16M Microsoft reserved
/dev/nvme0n1p4 1259520 510574591 509315072 242,9G Microsoft basic data
/dev/nvme0n1p5 510574592 511997951 1423360 695M Windows recovery environment
/dev/nvme0n1p6 512000000 1740799999 1228800000 585,9G Microsoft basic data
/dev/nvme0n1p7 1740800000 1996799999 256000000 122,1G Linux filesystem
/dev/nvme0n1p8 1996800000 2000408575 3608576 1,7G EFI System

emreozkapi (eozkapi) wrote :

df -h
Filesystem Size Used Avail Use% Mounted on
tmpfs 6,3G 5,9M 6,3G 1% /run
efivarfs 192K 127K 61K 68% /sys/firmware/efi/efivars
/dev/nvme0n1p7 120G 105G 8,9G 93% /
tmpfs 32G 0 32G 0% /dev/shm
tmpfs 5,0M 16K 5,0M 1% /run/lock
/dev/nvme0n1p2 95M 39M 57M 41% /boot/efi
tmpfs 6,3G 404K 6,3G 1% /run/user/1000
/dev/nvme0n1p6 586G 558G 29G 96% /media/ee/NVMeXpress
/dev/sda4 270G 110G 161G 41% /media/ee/667AD2E77AD2B353
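
To pick out the nearly full filesystems from a `df -h` listing like the one above, something along these lines could be used. It is only a sketch: `flag_full_filesystems` and the default threshold of 90% are arbitrary choices made here for illustration, and it assumes mount points without spaces.

```shell
# Print mount point and Use% for every filesystem at or above a threshold.
# Reads `df -h` style output on stdin; the first line is treated as the header.
# flag_full_filesystems and the default of 90 are illustrative assumptions.
flag_full_filesystems() {
    awk -v limit="${1:-90}" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)          # strip the percent sign
            if (use + 0 >= limit + 0) print $6, $5
        }'
}

# Sample rows copied from the df output above.
flag_full_filesystems 90 <<'EOF'
Filesystem Size Used Avail Use% Mounted on
/dev/nvme0n1p7 120G 105G 8,9G 93% /
/dev/nvme0n1p2 95M 39M 57M 41% /boot/efi
/dev/nvme0n1p6 586G 558G 29G 96% /media/ee/NVMeXpress
EOF
```

On a running system the same function could be fed directly with `df -h | flag_full_filesystems`.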

Daniel van Vugt (vanvugt) wrote :

The boot time slowness looks related to hardware errors on the boot drive:

  Jun 12 22:55:56 ee-Z790-Steel-Legend-WiFi smartd[2365]: Device: /dev/nvme0, KINGSTON SKC3000S1024G, S/N:50026B7686280109, FW:EIFK31.6, 1.02 TB
  ...
  Jun 12 22:55:58 ee-Z790-Steel-Legend-WiFi smartd[2365]: Device: /dev/nvme0, NVMe error count increased from 4268 to 4271 (0 new, 2 ignored, 1 unknown)

The run time slowness looks related to rendering load, which is probably due to Firefox:

    PID %CPU COMMAND
   3520 25.8 Xorg
   6987 4.5 firefox-bin
   6287 3.2 cinnamon

but also the failing Kingston NVMe drive can make runtime slow too. This would make the installed system slow, but live sessions would remain fast because they don't touch the failing drive.
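
For anyone wanting to track how the smartd error counter moves over time, the relevant journal lines can be reduced to old value, new value and difference. This is a rough sketch, assuming the message format shown in the excerpt above; `smartd_error_deltas` is a hypothetical helper name.

```shell
# Print "old new delta" for each smartd "NVMe error count increased" line on stdin.
# smartd_error_deltas is a hypothetical helper, not part of smartmontools.
smartd_error_deltas() {
    sed -n 's/.*NVMe error count increased from \([0-9][0-9]*\) to \([0-9][0-9]*\).*/\1 \2/p' |
    while read -r old new; do
        echo "$old $new $(( $new - $old ))"
    done
}

# Sample line copied from the journal excerpt above.
echo 'Jun 12 22:55:58 ee-Z790-Steel-Legend-WiFi smartd[2365]: Device: /dev/nvme0, NVMe error count increased from 4268 to 4271 (0 new, 2 ignored, 1 unknown)' |
    smartd_error_deltas
```

For the sample line this prints `4268 4271 3`. On a live system the input could come from `journalctl -b0` instead of the `echo`.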

emreozkapi (eozkapi) wrote :

But it's weird, because Windows is running fine.

Now I am checking the logs, and after a bit of research it appears that those errors result only from unsupported vendor-specific log pages.

/home/ee# smartctl --all /dev/nvme0n1
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.8.0-35-lowlatency] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number: KINGSTON SKC3000S1024G
Serial Number: 50026B7686280109
Firmware Version: EIFK31.6
PCI Vendor/Subsystem ID: 0x2646
IEEE OUI Identifier: 0x0026b7
Total NVM Capacity: 1.024.209.543.168 [1,02 TB]
Unallocated NVM Capacity: 0
Controller ID: 1
NVMe Version: 1.4
Number of Namespaces: 1
Namespace 1 Size/Capacity: 1.024.209.543.168 [1,02 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 0026b7 6862801095
Local Time is: Thu Jun 13 09:24:30 2024 CEST
Firmware Updates (0x12): 1 Slot, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005d): Comp DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x08): Telmtry_Lg
Maximum Data Transfer Size: 512 Pages
Warning Comp. Temp. Threshold: 84 Celsius
Critical Comp. Temp. Threshold: 89 Celsius

Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
 0 + 8.80W - - 0 0 0 0 0 0
 1 + 7.10W - - 1 1 1 1 0 0
 2 + 5.20W - - 2 2 2 2 0 0
 3 - 0.0620W - - 3 3 3 3 2500 7500
 4 - 0.0620W - - 4 4 4 4 2500 7500

Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
 0 + 512 0 2
 1 - 4096 0 1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 27 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 1%
Data Units Read: 11.592.273 [5,93 TB]
Data Units Written: 16.127.139 [8,25 TB]
Host Read Commands: 145.543.073
Host Write Commands: 185.983.811
Controller Busy Time: 1.022
Power Cycles: 926
Power On Hours: 2.175
Unsafe Shutdowns: 119
Media and Data Integrity Errors: 0
Error Information Log Entries: 4.286
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 2: 48 Celsius

Error Information (NVMe Log 0x01, 16 of 63 entries)
Num ErrCount SQId CmdId Status PELoc LBA NSID VS Message
  0 4286 0 0x0012 0x4004 0x028 0 0 - Invalid Field in Command
  1 4285 0 0x00...


emreozkapi (eozkapi) wrote :

nvme error-log /dev/nvme0n1 -o normal
Error Log Entries for device:nvme0n1 entries:63
.................
 Entry[ 0]
.................
error_count : 4288
sqid : 0
cmdid : 0x4004
status_field : 0x2109(Invalid Log Page: The log page indicated is invalid)
phase_tag : 0
parm_err_loc : 0x28
lba : 0
nsid : 0
vs : 0
trtype : The transport type is not indicated or the error is not transport related.
csi : 0
opcode : 0
cs : 0
trtype_spec_info: 0
log_page_version: 0

The rest of the error entries are the same.
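
The `status_field` value in the entry above can be split into its sub-fields to see where "Invalid Log Page" comes from. The sketch below assumes the printed value is the 15-bit status with the phase tag already removed (the separate `phase_tag` line suggests this), laid out as in the NVMe completion status: SC in bits 7:0, SCT in bits 10:8, CRD in bits 12:11, More in bit 13, DNR in bit 14. `decode_nvme_status` is a hypothetical helper, not an nvme-cli command.

```shell
# Split an NVMe status value into its sub-fields.
# Assumptions: the argument is the 15-bit status with the phase tag already
# removed; SC = bits 7:0, SCT = bits 10:8, CRD = bits 12:11,
# More = bit 13, DNR = bit 14. decode_nvme_status is a hypothetical helper.
decode_nvme_status() {
    val=$(( $1 ))
    printf 'sct=%d sc=0x%02x crd=%d more=%d dnr=%d\n' \
        $(( ($val >> 8) & 7 )) \
        $(( $val & 255 )) \
        $(( ($val >> 11) & 3 )) \
        $(( ($val >> 13) & 1 )) \
        $(( ($val >> 14) & 1 ))
}

decode_nvme_status 0x2109   # status_field from the entry above
```

This prints `sct=1 sc=0x09 crd=0 more=1 dnr=0`. Status code type 1 is the command-specific group, and code 0x09 in that group is Invalid Log Page, which agrees with the text nvme-cli printed next to the value.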

Daniel van Vugt (vanvugt) wrote :

OK well it sounds like an NVMe hardware failure to me. That might be the drive, might be the motherboard, or it might just be a software bug. I'm not totally sure. But the freezes and behaviour you describe fit perfectly with an NVMe failure. I had one on a laptop of my own a few weeks ago and the symptoms were exactly the same. Replacing the drive fixed it for me.

Daniel van Vugt (vanvugt) wrote :

The fact that Windows survives drive errors better than Linux isn't really surprising and doesn't prove anything on its own. But it might be useful to run some testing tools on the drive from within Windows in that case.

emreozkapi (eozkapi) wrote :

Hello Daniel,

Thanks for your perspective, but I could not see what it is based on.

You said replacing the drive fixed the issue, but the fact that changing the drive solved your problem could equally point to a software problem.

My NVMe has plenty of free space. There is no evidence of drive failure. And when I deleted the packages I mentioned above, the system got faster.

I will update this issue once I reinstall the system.

Daniel van Vugt (vanvugt) wrote :

My laptop NVMe was experiencing different errors to yours. Mine was more obviously a hardware failure.

Daniel van Vugt (vanvugt) wrote :

If nothing else, you know that booting Ubuntu from USB or from an SD card is one way to remove the NVMe from the equation when testing.

emreozkapi (eozkapi) wrote :

The problems are specific to this partition and only started after the system update.

I can't assign blame so easily, especially under the conditions I have.

Good that you fixed your own, different problem.

affects: ubuntu → linux (Ubuntu)
Changed in linux (Ubuntu):
status: Incomplete → New
emreozkapi (eozkapi) wrote :

I was finally able to reinstall the system and have been testing it for about an hour. No problems: everything is running as expected. This is Ubuntu 24.04 LTS with the full set of packages installed.

That means there are no visible errors, which is as good as it gets in the ideal case. Since the conditions that produced the errors no longer arise with the current combination of hardware and software, the issue is closed. The cause was the problematic upgrade of the old system from 23.10 to 24.04. All hardware is running fine, and there were no material traces of a hardware problem.

Whatever unknown, unavoidable "errors" are still occurring, they are not causing any visible problems.

If I see any problematic symptoms or visible errors, I will update this report.

Thanks for brainstorming together Daniel.

emreozkapi (eozkapi) wrote :

I was using Ubuntu Studio 24.04 LTS, and the reinstalled system is the same.
