Ubuntu 14.10 soft lockup with ATI Radeon R7 250 X

Bug #1386973 reported by Federico Tello Gentile
This bug affects 1 person
Affects         Status        Importance  Assigned to  Milestone
linux (Ubuntu)  Fix Released  Medium      Unassigned
Trusty          Fix Released  Medium      Unassigned
Utopic          Fix Released  Medium      Unassigned
Vivid           Fix Released  Medium      Unassigned

Bug Description

I have a VGA controller Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde XT [Radeon HD 7770/8760 / R7 250X].
When I boot from the Utopic DVD I get to the language menu and can select "Try Ubuntu without installing", but after a few seconds I get "BUG: soft lockup - CPU#0 stuck for 22s!" and the message just repeats every 22 seconds forever.

I tried setting nomodeset using the F6 menu, but I got exactly the same result.

This is 64-bit Ubuntu 14.10, which basically seems unable to boot with that hardware.

The experience was the same with 14.04, but I thought 14.10 would be better. In 14.04, disabling the video card and booting off the integrated graphics let me set nomodeset in /etc/default/grub, and I was then able to boot with the R7 250X using the VESA driver. I installed Catalyst afterwards.

===
break-fix: - 16b036af31e1456cb69243a5a0c9ef801ecd1f17

Tags: patch trusty
Federico Tello Gentile (federicotg) wrote :

I added the kernel parameter softlockup_panic=1 and got a stacktrace.

Federico Tello Gentile (federicotg) wrote :

https://github.com/alterapraxisptyltd/openatom/issues/1

Quote:
[linux] Infinite loop in pci_get_rom_size()

This is one of those issues that you find when putting supposedly stable code through unusual situations. I did expect any function in Linux that is not part of radeon.ko to be rock solid. Turns out that's not really the case.

If we have a PCIR structure with a length of zero, the loop iterating through those structures does not advance. It simply does "image += readw(pds + 16) * 512;", but if that field is zero we're back analyzing the same structure on the next iteration. The only way out of this loop is to set bit 7 of the type field. That's what 'last_image' checks. If that bit is not set, the above becomes an infinite loop.

Luckily, it doesn't crash the kernel, but it hangs any driver that calls the function under said circumstances. No more modprobe -r or unbinding. Reboot is needed. No idea why a firmware blob here is treated as trusted input.
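
To make the failure mode described above concrete, here is a minimal userspace sketch of such a ROM walk. It is not the kernel's pci_get_rom_size(): the names walk_rom, make_image, struct walk_result and ROM_WALK_MAX_ITERS are hypothetical, and the iteration cap exists only so the demo terminates. The offsets used (PCIR pointer at byte 24, image length at pds + 16, indicator at pds + 21) follow the description in the quote. The stop_on_zero_length flag models one plausible guard, namely ending the walk when an image reports a length of zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ROM_WALK_MAX_ITERS 1000u /* demo-only safety cap, not in the kernel */

struct walk_result {
    size_t size;    /* bytes covered by the image chain, capped at the buffer size */
    unsigned iters; /* loop iterations performed */
};

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical helper: write a minimal ROM image header plus PCIR structure. */
static void make_image(uint8_t *img, uint16_t length_units, int is_last)
{
    const size_t pds = 0x1C;              /* where this sketch places the PCIR structure */
    img[0] = 0x55;                        /* ROM signature bytes 55 AA */
    img[1] = 0xAA;
    img[24] = (uint8_t)pds;               /* pointer to the PCIR structure */
    img[25] = 0;
    memcpy(img + pds, "PCIR", 4);
    img[pds + 16] = (uint8_t)(length_units & 0xFF); /* length in 512-byte units */
    img[pds + 17] = (uint8_t)(length_units >> 8);
    img[pds + 21] = is_last ? 0x80 : 0x00;          /* bit 7 marks the last image */
}

/* Walk the chain of images; with stop_on_zero_length == 0 a zero length never advances. */
static struct walk_result walk_rom(const uint8_t *rom, size_t size,
                                   int stop_on_zero_length)
{
    struct walk_result r = {0, 0};
    size_t off = 0;
    int last_image = 0;

    do {
        r.iters++;
        if (off + 26 > size)
            break;
        if (rom[off] != 0x55 || rom[off + 1] != 0xAA)
            break;
        size_t pds = off + rd16(rom + off + 24);
        if (pds + 22 > size || memcmp(rom + pds, "PCIR", 4) != 0)
            break;
        last_image = rom[pds + 21] & 0x80;
        unsigned length = rd16(rom + pds + 16);
        off += (size_t)length * 512;
        if (stop_on_zero_length && length == 0)
            break; /* the guard: a zero-length image cannot advance the walk */
    } while (!last_image && r.iters < ROM_WALK_MAX_ITERS);

    r.size = off < size ? off : size;
    return r;
}
```

Calling walk_rom on a buffer whose first image was built with make_image(rom, 0, 0) shows the difference: without the guard the loop only stops at the demo cap, while with the guard it exits after a single iteration.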

Federico Tello Gentile (federicotg) wrote :

This is a kernel bug.

affects: xorg-server (Ubuntu) → linux-lts-trusty (Ubuntu)
tags: added: patch
Chris J Arges (arges)
affects: linux-lts-trusty (Ubuntu) → linux (Ubuntu)
Chris J Arges (arges) wrote :

@federicotg

You should submit this patch upstream [1] and get some comments and review on it first. Looking upstream, I don't see any obvious fixes for this.
Thanks for debugging!

[1] https://www.kernel.org/doc/Documentation/SubmittingPatches

--chris

Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: trusty
Changed in linux (Ubuntu):
importance: Undecided → Medium
Federico Tello Gentile (federicotg) wrote :

This is the patch being integrated into kernel 3.20.
https://lkml.org/lkml/2015/1/23/737

Andy Whitcroft (apw)
description: updated
tags: added: kernel-bug-break-fix
Changed in linux (Ubuntu Utopic):
status: New → Confirmed
Changed in linux (Ubuntu Trusty):
status: New → Confirmed
Changed in linux (Ubuntu Utopic):
importance: Undecided → Medium
Changed in linux (Ubuntu Trusty):
importance: Undecided → Medium
Andy Whitcroft (apw) wrote :

This has now hit mainline as the commit below:

  commit 16b036af31e1456cb69243a5a0c9ef801ecd1f17
  Author: Michel Dänzer <email address hidden>
  Date: Mon Jan 19 17:53:20 2015 +0900

    PCI: Fix infinite loop with ROM image of size 0

As this is CC: stable, we should expect this fix to hit our trees naturally as a result via upstream stable. Marking this up to that effect.

Andy Whitcroft (apw)
Changed in linux (Ubuntu Vivid):
status: Confirmed → Fix Committed
Andy Whitcroft (apw)
Changed in linux (Ubuntu Precise):
importance: Undecided → High
Changed in linux (Ubuntu Trusty):
importance: Medium → High
Changed in linux (Ubuntu Precise):
importance: High → Medium
Changed in linux (Ubuntu Trusty):
importance: High → Medium
no longer affects: linux (Ubuntu Precise)
Andy Whitcroft (apw)
Changed in linux (Ubuntu Vivid):
status: Fix Committed → Fix Released
Federico Tello Gentile (federicotg) wrote :

I can confirm the bug is fixed on today's Vivid daily image. I was able to boot and play in the live session. I verified the radeon driver with kernel mode setting was in use.

Andy Whitcroft (apw)
Changed in linux (Ubuntu Utopic):
status: Confirmed → Fix Committed
Andy Whitcroft (apw)
Changed in linux (Ubuntu Trusty):
status: Confirmed → Fix Committed
Andy Whitcroft (apw)
Changed in linux (Ubuntu Utopic):
status: Fix Committed → Fix Released
Andy Whitcroft (apw)
Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
tags: removed: kernel-bug-break-fix
Christian Zigotzky (chzigotzky-xenosoft) wrote :

Hi All,

Thank you very much for fixing this bug. Have you applied this fix to the 3.13 LTS kernel yet? If not, could you apply this fix to the LTS kernel 3.13, please?

Kernel 3.13: http://www.ubuntuupdates.org/package/canonical_kernel_team/trusty/main/base/linux-source-3.13.0

Thanks in advance.

Christian

Federico Tello Gentile (federicotg) wrote :

It has been fixed in the 3.13 kernel since 2015-05-07.
I'm using 3.13.0-53-generic #89-Ubuntu SMP Wed May 20 10:34:39 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux.
