X freezes with kernel 4.15.0-23-generic (AMDGPU)

Bug #1777245 reported by Olaf Seibert
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

This is on my parents' machine; I only have remote access, and not continuously, so providing additional information may be slow.

Since the latest kernel upgrade, from
 Linux version 4.13.0-43-generic (buildd@lgw01-amd64-026) (gcc version 7.2.0 (Ubuntu 7.2.0-8ubuntu3.2)) #48-Ubuntu SMP Wed May 16 12:18:48 UTC 2018 (Ubuntu 4.13.0-43.48-generic 4.13.16)
to
 Linux version 4.15.0-23-generic (buildd@lgw01-amd64-055) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #25-Ubuntu SMP Wed May 23 18:02:16 UTC 2018 (Ubuntu 4.15.0-23.25-generic 4.15.18)
the machine appears to freeze soon after booting. Only X is affected; the machine is still reachable via ssh.

I filed this bug using apport-cli, running the older (working) kernel, not the newer (failing) one.

In /var/log/kernel I can see the following:

Jun 15 23:26:23 xa-xubu kernel: [ 2417.562386] INFO: task Xorg:757 blocked for more than 120 seconds.
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562396] Not tainted 4.15.0-23-generic #25-Ubuntu
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562399] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562403] Xorg D 0 757 724 0x00400004
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562408] Call Trace:
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562424] __schedule+0x297/0x8b0
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562430] ? __kfifo_in+0x37/0x50
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562434] schedule+0x2c/0x80
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562559] amd_sched_entity_push_job+0xad/0xf0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562565] ? wait_woken+0x80/0x80
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562653] amdgpu_job_submit+0x9f/0xc0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562723] amdgpu_vm_bo_update_mapping+0x389/0x3f0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562793] ? amdgpu_vm_it_iter_first+0x40/0x40 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562863] amdgpu_vm_bo_update+0x325/0x5b0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562930] amdgpu_gem_va_ioctl+0x524/0x540 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562962] ? drm_gem_handle_create_tail+0x120/0x190 [drm]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563028] ? amdgpu_gem_create_ioctl+0xc1/0x270 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563096] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563115] drm_ioctl_kernel+0x5f/0xb0 [drm]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563134] ? drm_ioctl_kernel+0x5f/0xb0 [drm]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563154] drm_ioctl+0x31b/0x3d0 [drm]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563220] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563225] ? update_load_avg+0x57f/0x6e0
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563231] ? futex_wake+0x8f/0x180
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563290] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563296] do_vfs_ioctl+0xa8/0x630
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563300] ? __schedule+0x29f/0x8b0
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563304] SyS_ioctl+0x79/0x90
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563309] do_syscall_64+0x73/0x130
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563313] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563317] RIP: 0033:0x7fbd7ddcf5d7
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563319] RSP: 002b:00007fff67e69aa8 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563322] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fbd7ddcf5d7
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563324] RDX: 00007fff67e69af0 RSI: 00000000c0286448 RDI: 000000000000000e
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563326] RBP: 00007fff67e69af0 R08: 0000000101440000 R09: 000000000000000a
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563328] R10: 0000000000000039 R11: 0000000000003246 R12: 00000000c0286448
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563330] R13: 000000000000000e R14: 000055e820965f20 R15: 0

which seems to point to the in-kernel AMD GPU driver (amdgpu).
Since the problem seems to disappear when switching back to the previous kernel, I filed this as a kernel bug.
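
For anyone triaging a similar log, the reasoning above (find the hung task, then see which module the blocked call chain passes through) can be sketched in Python. This is only an illustration: the function names `summarize_hung_tasks` and `first_module` are hypothetical, and the regular expressions assume the syslog line layout shown in this report.

```python
import re

# Assumption: lines look like "<date> <host> kernel: [ <uptime>] <message>".
LINE_RE = re.compile(r"kernel: \[\s*\d+\.\d+\]\s*(?P<msg>.*)$")
HUNG_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# A stack frame: optional "? " (unreliable frame), symbol+offset/size, optional [module].
FRAME_RE = re.compile(
    r"^(?P<guess>\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<mod>\w+)\])?$"
)


def summarize_hung_tasks(lines):
    """Collect one record per hung-task report, keeping only reliable frames."""
    reports = []
    current = None
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        msg = m.group("msg").strip()
        hung = HUNG_RE.search(msg)
        if hung:
            current = {
                "task": hung.group("name"),
                "pid": int(hung.group("pid")),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.match(msg)
        # Skip frames marked "?", which the kernel flags as possibly stale.
        if current is not None and frame and not frame.group("guess"):
            current["frames"].append((frame.group("sym"), frame.group("mod") or "kernel"))
    return reports


def first_module(report):
    """Return the first loadable module on the blocked path, or None."""
    for _sym, mod in report["frames"]:
        if mod != "kernel":
            return mod
    return None


# A few lines copied from the trace in this report.
sample = [
    "Jun 15 23:26:23 xa-xubu kernel: [ 2417.562386] INFO: task Xorg:757 blocked for more than 120 seconds.",
    "Jun 15 23:26:23 xa-xubu kernel: [ 2417.562424] __schedule+0x297/0x8b0",
    "Jun 15 23:26:23 xa-xubu kernel: [ 2417.562430] ? __kfifo_in+0x37/0x50",
    "Jun 15 23:26:23 xa-xubu kernel: [ 2417.562434] schedule+0x2c/0x80",
    "Jun 15 23:26:23 xa-xubu kernel: [ 2417.562559] amd_sched_entity_push_job+0xad/0xf0 [amdgpu]",
]
reports = summarize_hung_tasks(sample)
```

On the sample lines, `first_module(reports[0])` yields `amdgpu` for task `Xorg`, which matches the reading of the trace given above.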

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-23-generic 4.15.0-23.25
ProcVersionSignature: Ubuntu 4.13.0-43.48-generic 4.13.16
Uname: Linux 4.13.0-43-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: rhialto 21222 F.... pulseaudio
 /dev/snd/controlC0: rhialto 21222 F.... pulseaudio
Date: Sat Jun 16 16:03:46 2018
InstallationDate: Installed on 2017-10-29 (230 days ago)
InstallationMedia: Xubuntu 17.10 "Artful Aardvark" - Release amd64 (20171017.1)
IwConfig:
 enp1s0 no wireless extensions.

 lo no wireless extensions.
MachineType: LENOVO 90G9001RNY
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.13.0-43-generic.efi.signed root=UUID=103b739d-56cf-440b-a2b4-fc955e1a0a41 ro quiet splash vt.handoff=1
RelatedPackageVersions:
 linux-restricted-modules-4.13.0-43-generic N/A
 linux-backports-modules-4.13.0-43-generic N/A
 linux-firmware 1.173.1
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: Upgraded to bionic on 2018-06-09 (7 days ago)
dmi.bios.date: 12/29/2016
dmi.bios.vendor: LENOVO
dmi.bios.version: O2HKT24A
dmi.board.name: Jadeite CRB
dmi.board.vendor: LENOVO
dmi.board.version: SDK0J40700 WIN 3258076524150
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnLENOVO:bvrO2HKT24A:bd12/29/2016:svnLENOVO:pn90G9001RNY:pvrideacentre310S-08ASR:rvnLENOVO:rnJadeiteCRB:rvrSDK0J40700WIN3258076524150:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: ideacentre 310S-08ASR
dmi.product.name: 90G9001RNY
dmi.product.version: ideacentre 310S-08ASR
dmi.sys.vendor: LENOVO
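
Because the report was filed while running the older kernel, the Package field names the failing kernel while Uname and ProcVersionSignature describe the working one. A minimal Python sketch for spotting that kind of mismatch follows. The helper names are hypothetical, and the parser only assumes the "Key: value" layout as pasted in this report, not the full apport file format.

```python
import re


def kernel_release(text):
    """Return the first 'X.Y.Z-N' kernel release found in text, or None."""
    m = re.search(r"\d+\.\d+\.\d+-\d+", text)
    return m.group(0) if m else None


def parse_fields(report_text):
    """Parse top-level 'Key: value' lines; indented continuation lines are ignored."""
    fields = {}
    for line in report_text.splitlines():
        if line and not line[0].isspace() and ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def reported_vs_running(report_text):
    """Return (kernel the report is about, kernel that was running when it was filed)."""
    fields = parse_fields(report_text)
    return (
        kernel_release(fields.get("Package", "")),
        kernel_release(fields.get("Uname", "")),
    )


# Field values copied from this report.
report = (
    "Package: linux-image-4.15.0-23-generic 4.15.0-23.25\n"
    "ProcVersionSignature: Ubuntu 4.13.0-43.48-generic 4.13.16\n"
    "Uname: Linux 4.13.0-43-generic x86_64\n"
)
about, running = reported_vs_running(report)
```

When `about` and `running` differ, as they do here, the hardware and log details collected automatically describe the running kernel rather than the one the bug is filed against.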

Revision history for this message
Olaf Seibert (rhialto) wrote :
Olaf Seibert (rhialto)
description: updated
Olaf Seibert (rhialto)
summary: - X freezes
+ X freezes with kernel 4.15.0-23-generic (AMDGPU)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: artful
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.17 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18-rc1

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Olaf Seibert (rhialto) wrote :

Yes, the problem started after an upgrade. From what I can see in the logs it seems that first the machine was upgraded from Ubuntu 17.10 to 18.04, apparently without problem.
Then a few days later, the working kernel Ubuntu 4.13.0-43.48-generic 4.13.16 was replaced by 4.15.0-23.25-generic 4.15.18 which fails.

It will not be so easy to test another kernel, but I'll try. It will definitely be some days before I can report a result, I suspect.

Olaf Seibert (rhialto) wrote :

@jsalisbury: I had the machine tested with this kernel:

Linux xa-xubu 4.18.0-041800rc1-generic #201806162031 SMP Sun Jun 17 00:34:22 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

that I found via your link [0].

The result is that in this kernel, the problem is gone. So that sounds like good news!

I think there was another commenter asking me to try a (probably different) upstream kernel, but it seems gone now. To them I can report that I didn't try building a kernel myself, but hopefully their proposal was close enough to using the Ubuntu mainline kernel that this test gives them the answer they are looking for.

tags: added: kernel-fixed-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed