linux 4.10 and AMD Polaris11 card -> graphics crash
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Won't Fix | Medium | Unassigned |
Zesty | Won't Fix | Medium | Unassigned |
Bug Description
Using 4.10.0-22-generic from Ubuntu and running any of the Unigine benchmarks (Heaven-4.0, Valley-1.0, Superposition-1.0) causes the screen to go black and the graphics system to crash.
The graphics card's fan stops spinning and the sensors command reports 511°C, which is clearly wrong.
I can still log in via SSH and attempt to stop X, but the application (e.g. heaven) remains in a zombie state and the system is unusable: I can't start X again. The graphics card also ends up in a bad state: if I press the reset button, the UEFI BIOS can no longer detect the card, and I have to power the whole system off and on again to make it work.
Upgrading to mainline 4.11.3 avoids this problem: all three benchmarks run fine, with no crashes.
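For anyone trying to reproduce this, the bogus temperature reading is a convenient marker that the card has hung. Below is a minimal Python sketch that scans the standard hwmon sysfs files (temp*_input, in millidegrees Celsius) and flags readings that cannot be real. The 150°C ceiling and the function names are my own assumptions for illustration, not anything taken from the driver or from this report.

```python
from pathlib import Path

# Assumed plausibility ceiling in degrees Celsius (illustrative only,
# not taken from any datasheet or from the amdgpu driver).
MAX_PLAUSIBLE_C = 150.0


def is_plausible(millidegrees: int, ceiling_c: float = MAX_PLAUSIBLE_C) -> bool:
    """Return True if a hwmon reading (millidegrees Celsius) looks sane."""
    celsius = millidegrees / 1000.0
    return -50.0 <= celsius <= ceiling_c


def scan_hwmon(root: str = "/sys/class/hwmon") -> list:
    """Read every temp*_input file under root and flag suspicious values.

    Returns a list of (path, degrees_celsius, plausible) tuples.
    """
    results = []
    for path in sorted(Path(root).glob("hwmon*/temp*_input")):
        try:
            raw = int(path.read_text().strip())
        except (OSError, ValueError):
            continue  # sensor unreadable or not numeric, skip it
        results.append((str(path), raw / 1000.0, is_plausible(raw)))
    return results


if __name__ == "__main__":
    for name, celsius, ok in scan_hwmon():
        status = "ok" if ok else "IMPLAUSIBLE"
        print(f"{name}: {celsius:.1f} C {status}")
```

Running this over SSH after the screen goes black should mark a 511°C reading as implausible, assuming the card still exposes a hwmon node in that state.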
I've attached two dmesg logs. The first is from the default configuration, where the IOMMU is on and lots of AMD-Vi warnings are logged:
[ 439.903842] ------------[ cut here ]------------
[ 439.903848] WARNING: CPU: 5 PID: 0 at /build/
[ 439.903848] Modules linked in: overlay ccm xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_
[ 439.903873] snd_timer snd k10temp mac_hid soundcore tpm_infineon shpchp tcp_bbr sch_fq cuse parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid amdkfd amd_iommu_v2 amdgpu mxm_wmi i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect r8169 sysimgblt fb_sys_fops mii drm ahci libahci fjes wmi
[ 439.903893] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.10.0-22-generic #24-Ubuntu
[ 439.903894] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A99FX PRO R2.0, BIOS 2501 04/07/2014
[ 439.903895] Call Trace:
[ 439.903896] <IRQ>
[ 439.903899] dump_stack+
[ 439.903900] __warn+0xcb/0xf0
[ 439.903901] warn_slowpath_
[ 439.903903] __domain_
[ 439.903904] __queue_
[ 439.903905] ? queue_flush_
[ 439.903907] queue_flush_
[ 439.903908] queue_flush_
[ 439.903910] call_timer_
[ 439.903911] run_timer_
[ 439.903912] ? ktime_get+0x41/0xb0
[ 439.903914] ? lapic_next_
[ 439.903916] ? clockevents_
[ 439.903918] __do_softirq+
[ 439.903919] irq_exit+0xb6/0xc0
[ 439.903921] smp_apic_
[ 439.903922] apic_timer_
[ 439.903924] RIP: 0010:cpuidle_
[ 439.903925] RSP: 0018:ffffb4e181
[ 439.903926] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 000000000000001f
[ 439.903926] RDX: 0000006665f96c97 RSI: ffff9dbcded56a98 RDI: 0000000000000000
[ 439.903927] RBP: ffffb4e181a23e98 R08: cccccccccccccccd R09: 0000000000000018
[ 439.903927] R10: 0000000000000da8 R11: 0000000000003557 R12: ffff9dbcd036b600
[ 439.903928] R13: ffffffffbaeeba38 R14: 0000000000000002 R15: ffffffffbaeeba20
[ 439.903929] </IRQ>
[ 439.903930] ? cpuidle_
[ 439.903931] cpuidle_
[ 439.903933] call_cpuidle+
[ 439.903934] do_idle+0x189/0x200
[ 439.903935] cpu_startup_
[ 439.903937] start_secondary
[ 439.903938] start_cpu+0x14/0x14
[ 439.903939] ---[ end trace 9edd64d3e01a6c8c ]---
The second is from a boot with the iommu=soft option, where nothing interesting shows up in dmesg but the system still crashes.
Note: if I turn the IOMMU off completely, USB devices stop working and I cannot use my keyboard or mouse, so I cannot test that scenario.
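To compare the two attached logs, it may help to count the kernel warning blocks in each rather than read them by eye. The following Python sketch assumes a dmesg dump saved as text and relies only on the "cut here", "WARNING: CPU: ... PID: ..." and "end trace" markers visible in the excerpt above; the function name and the dictionary layout are hypothetical.

```python
import re

# Markers as they appear in the dmesg excerpt in this report.
CUT_HERE = re.compile(r"-+\[ cut here \]-+")
END_TRACE = re.compile(r"---\[ end trace ([0-9a-f]+) \]---")
WARNING = re.compile(r"WARNING: CPU: (\d+) PID: (\d+)")


def summarize_warnings(dmesg_text: str) -> list:
    """Return one dict per 'cut here' ... 'end trace' block in a dmesg dump."""
    blocks = []
    current = None
    for line in dmesg_text.splitlines():
        if CUT_HERE.search(line):
            # Start of a new warning block.
            current = {"cpu": None, "pid": None, "trace_id": None, "lines": 0}
            continue
        if current is None:
            continue  # ordinary log line outside any warning block
        current["lines"] += 1
        match = WARNING.search(line)
        if match:
            current["cpu"] = int(match.group(1))
            current["pid"] = int(match.group(2))
        match = END_TRACE.search(line)
        if match:
            # End of the block: record its trace id and store it.
            current["trace_id"] = match.group(1)
            blocks.append(current)
            current = None
    return blocks


if __name__ == "__main__":
    import sys

    text = sys.stdin.read()
    found = summarize_warnings(text)
    print(f"{len(found)} warning block(s)")
    for block in found:
        print(block)
```

Usage would be something like piping a saved log into the script on standard input. With the default boot I would expect many blocks, and none with iommu=soft, which matches the description above.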
ProblemType: Bug
DistroRelease: Ubuntu 17.04
Package: linux-image-generic 4.10.0.22.24
ProcVersionSign
Uname: Linux 4.10.0-22-generic x86_64
ApportVersion: 2.20.4-0ubuntu4.1
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/
/dev/snd/
/dev/snd/
Date: Tue Jun 6 21:09:45 2017
HibernationDevice: RESUME=
InstallationDate: Installed on 2017-03-25 (72 days ago)
InstallationMedia: Ubuntu-MATE 17.04 "Zesty Zapus" - Beta amd64 (20170321.1)
MachineType: To be filled by O.E.M. To be filled by O.E.M.
ProcEnviron:
LANGUAGE=en_GB:en
TERM=xterm
PATH=(custom, no user)
LANG=en_GB.UTF-8
SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=
PulseList:
Error: command ['pacmd', 'list'] failed with exit code 1: Home directory not accessible: Permission denied
No PulseAudio daemon running, or not running as session daemon.
RelatedPackageV
linux-
linux-
linux-firmware 1.164.1
RfKill:
0: phy0: Wireless LAN
Soft blocked: no
Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/07/2014
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2501
dmi.board.
dmi.board.name: M5A99FX PRO R2.0
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.name: To be filled by O.E.M.
dmi.product.
dmi.sys.vendor: To be filled by O.E.M.
Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
Changed in linux (Ubuntu Zesty):
status: Confirmed → Won't Fix
This change was made by a bot.