Boot issues after running kernel 3.8.x

Bug #1116835 reported by Timothy G. Rundle
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

After running a 3.8.x kernel, the PC fails to load the BIOS on the next boot until the power is completely cut off. This occurs with all 3.8.x versions and with the mainline build as of 2/4. The issue does not occur with the 3.7.x builds.

ProblemType: Bug
DistroRelease: Ubuntu 13.04
Package: linux-image-3.8.0-4-generic 3.8.0-4.8
ProcVersionSignature: Ubuntu 3.8.0-4.8-generic 3.8.0-rc6
Uname: Linux 3.8.0-4-generic x86_64
ApportVersion: 2.8-0ubuntu4
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: timothy 2164 F.... pulseaudio
 /dev/snd/controlC2: timothy 2164 F.... pulseaudio
 /dev/snd/controlC0: timothy 2164 F.... pulseaudio
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
CurrentDmesg:

Date: Tue Feb 5 20:30:26 2013
HibernationDevice: RESUME=UUID=f1e9427d-1ba3-429c-a522-0f1bededfc67
InstallationDate: Installed on 2010-09-25 (864 days ago)
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release amd64 (20100429)
IwConfig:
 eth0 no wireless extensions.

 lo no wireless extensions.
MachineType: System manufacturer System Product Name
MarkForUpload: True
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.8.0-4-generic root=UUID=a1ae5a0a-8eec-4ed7-a592-2138e7969b04 ro radeon.modeset=1 quiet splash
RelatedPackageVersions:
 linux-restricted-modules-3.8.0-4-generic N/A
 linux-backports-modules-3.8.0-4-generic N/A
 linux-firmware 1.100
RfKill:

SourcePackage: linux
UpgradeStatus: Upgraded to raring on 2012-11-17 (80 days ago)
dmi.bios.date: 08/23/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2801
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: M3A78-CM
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr2801:bd08/23/2010:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKComputerINC.:rnM3A78-CM:rvrRevX.0x:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Revision history for this message
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I'd like to perform a bisect to figure out what commit caused this regression. It would be very helpful to know the earliest kernel where the issue started happening as well as the latest kernel that did not have this issue.

Can you test the following kernels and report back? We are looking for the first kernel version that exhibits this bug:

v3.7 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.7-raring/
v3.8-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-rc1-raring/
v3.8-rc3: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-rc3-raring/

You don't have to test every kernel, just up until the kernel that first has this bug.

Thanks in advance!

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: performing-bisect
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Timothy G. Rundle (tgrundle) wrote :

v3.7 final works just fine, as does 3.7.6, which I am currently running.
v3.8-rc1 doesn't work, nor does anything since (I have checked the last two nights' daily builds).

I noticed that sp5100_tco fails to load in 3.7 but does load in 3.8-rc1. I am not sure whether it is related; it just caught my attention.

dmesg from v3.7:

[ 14.431032] sp5100_tco: SP5100 TCO WatchDog Timer Driver v0.01
[ 14.431096] sp5100_tco: mmio address 0xfec000f0 already in use

dmesg from v3.8-rc1:

[ 10.854230] sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver v0.03
[ 10.854294] sp5100_tco: PCI Revision ID: 0x3a
[ 10.854341] sp5100_tco: Using 0xfed45000 for watchdog MMIO address
[ 10.854349] sp5100_tco: Last reboot was triggered by watchdog.
[ 10.854388] sp5100_tco: initialized (0xffffc90000678000). heartbeat=60 sec (n
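
The two log excerpts above can also be compared mechanically. The following is a minimal sketch, assuming the message wording quoted from v3.8-rc1; the function names are hypothetical and other driver versions may word these messages differently.

```shell
# Print the MMIO address that sp5100_tco reports in a kernel log on stdin.
# Assumes the v3.8-rc1 wording: "Using 0x... for watchdog MMIO address".
sp5100_mmio_addr() {
    sed -n 's/.*sp5100_tco: Using \(0x[0-9a-fA-F]*\) for watchdog MMIO address.*/\1/p'
}

# Succeed if the log on stdin says the previous reboot came from the watchdog.
sp5100_watchdog_reboot() {
    grep -q 'sp5100_tco: Last reboot was triggered by watchdog'
}

# Usage on a live system (commented out so the file can be sourced safely):
#   dmesg | sp5100_mmio_addr
#   dmesg | sp5100_watchdog_reboot && echo "watchdog reset reported"
```

Running the same two checks under a working and a failing kernel shows at a glance whether the driver claimed an MMIO region at all.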

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Can you see if blacklisting the sp5100_tco driver in v3.8-rc1 causes the bug to go away? If not, I'll start a kernel bisect between v3.7 and v3.8-rc1.

Revision history for this message
Timothy G. Rundle (tgrundle) wrote :

Yes, blacklisting the sp5100_tco driver has resolved the issue.

Here is the command I ran:

   sudo su -c 'echo "blacklist sp5100_tco" >> /etc/modprobe.d/blacklist-watchdog.conf'
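
Because the command above appends unconditionally, running it twice leaves duplicate entries. A minimal sketch of an idempotent variant follows; the function name and its two arguments are hypothetical, and writing under /etc/modprobe.d requires root.

```shell
# Append "blacklist <module>" to a modprobe.d-style file unless an active
# (uncommented) entry for that module is already present.
#   $1 = module name, $2 = path to the configuration file
blacklist_module() {
    module=$1
    conf=$2
    if ! grep -qsE "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*$" "$conf"; then
        printf 'blacklist %s\n' "$module" >> "$conf"
    fi
}

# Usage, from a root shell (commented out so the file can be sourced safely):
#   blacklist_module sp5100_tco /etc/modprobe.d/blacklist-watchdog.conf
```

If the module is also included in the initramfs, regenerating it with `sudo update-initramfs -u` may be needed before the change takes effect.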

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

This may be related to upstream commit:
740fbddf5c3f9ad8b23c5d917ba1cc7e376a5104

I'll build a test kernel with that commit reverted and post a link to the kernel shortly.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with commit 740fbddf5c3f9ad8b23c5d917ba1cc7e376a5104 reverted.

The test kernel can be downloaded from:
http://people.canonical.com/~jsalisbury/lp1116835

Can you test that kernel and report back whether it has the bug?

Thanks in advance!

Revision history for this message
Timothy G. Rundle (tgrundle) wrote :

The test build does not have the bug. Everything works as expected.

Changed in linux (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → Joseph Salisbury (jsalisbury)
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Thanks for testing, Timothy. I sent a message to the patch author requesting feedback regarding this bug.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

There are a couple of patches in upstream linux-next that may resolve this bug. Commits:
a6e26b1369b34760ca5e1f16039c9018bf68ce7e
adbdcc030be5f7bb54d229c04efdd0c6b7ed3ab7

I built a Raring test kernel with these two patches applied.

The test kernel can be downloaded from:
http://people.canonical.com/~jsalisbury/lp1116835

Can you test that kernel and report back whether it has the bug?

Thanks in advance!

Revision history for this message
Timothy G. Rundle (tgrundle) wrote :

The issue still occurred in the test build with the patches.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Thanks for testing, Timothy. I reported the results of your testing upstream. It would be helpful if you could get a screen capture or digital picture of the panic and post it to the bug.

Thanks again for all your help.

Revision history for this message
Timothy G. Rundle (tgrundle) wrote :

The issue occurs before the BIOS loads. There is no text on the screen, just a blank screen until I cut off all power to the PSU.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Hi Timothy,

I received another patch from upstream. I'm building a new test kernel now.

The patch author is interested in the I/O resources of the PC on which the problem occurred. He requested the output of 'cat /proc/iomem'. The sp5100_tco driver does not have to be loaded when you run 'cat /proc/iomem'.

I will post a link to the new test kernel shortly.
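
To see which /proc/iomem entries cover a particular address, such as one of the MMIO addresses the driver printed earlier in this report, the listing can be filtered by range. This is a minimal sketch with a hypothetical function name; it assumes the usual `start-end : name` line format, and on newer kernels an unprivileged read of /proc/iomem may show zeroed addresses, so it may need to be run as root.

```shell
# Print every "start-end : name" entry on stdin whose range contains the
# address given as $1 (hex with a 0x prefix, e.g. 0xfed45000).
iomem_containing() {
    addr=$(($1))
    while read -r range sep name; do
        case $range in
            *[!0-9a-fA-F-]*) continue ;;   # not a hex range, skip
            ?*-?*) ;;                      # looks like start-end
            *) continue ;;
        esac
        start=${range%%-*}
        end=${range#*-}
        if [ "$addr" -ge "$((0x$start))" ] && [ "$addr" -le "$((0x$end))" ]; then
            printf '%s : %s\n' "$range" "$name"
        fi
    done
}

# Usage (commented out so the file can be sourced safely):
#   sudo cat /proc/iomem | iomem_containing 0xfed45000
```

Nested entries are printed without their original indentation, so a parent region and its children appear as consecutive lines.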

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I built a Raring test kernel with a new patch[0] from upstream applied.

The test kernel can be downloaded from:
http://people.canonical.com/~jsalisbury/lp1116835

Can you test that kernel and report back whether it has the bug?

Thanks in advance!

[0] http://lkml.org/lkml/2013/2/18/353

Revision history for this message
Timothy G. Rundle (tgrundle) wrote :
Revision history for this message
Timothy G. Rundle (tgrundle) wrote :

The latest test build resolves the issue as expected.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

The upstream commit that introduced this regression will be reverted upstream:
https://lkml.org/lkml/2013/2/23/10

Revision history for this message
Julian Wiedmann (jwiedmann) wrote :

Timothy G. Rundle,

could you please test the latest kernel in -updates (3.8.0-19.30) and
see if this bug is still reproducible? Thanks.

Should be fixed in 3.8.5-stable with

commit 18e4321276fcf083b85b788fee7cf15be29ed72a
Author: Takahisa Tanaka <email address hidden>
Date: Sun Mar 3 14:52:07 2013 +0900

    watchdog: sp5100_tco: Remove code that may cause a boot failure

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Changed in linux (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → nobody
Revision history for this message
Timothy G. Rundle (tgrundle) wrote :

Looks good. I rebooted and powered off/on a couple times without issue after performing the following:

timothy@thomas:~$ uname -a
Linux thomas 3.8.0-19-generic #30-Ubuntu SMP Wed May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

timothy@thomas:~$ sudo vi /etc/modprobe.d/blacklist-watchdog.conf

timothy@thomas:~$ grep sp5100 /etc/modprobe.d/blacklist-watchdog.conf
#blacklist sp5100_tco

timothy@thomas:~$ sudo update-initramfs -u
[sudo] password for timothy:
update-initramfs: Generating /boot/initrd.img-3.8.0-19-generic

Here are the dmesg entries:

timothy@thomas:~$ dmesg |grep sp5100
[ 20.505566] sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver v0.05
[ 20.505612] sp5100_tco: PCI Revision ID: 0x3a
[ 20.505629] sp5100_tco: failed to find MMIO address, giving up.
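
Note that the plain `grep sp5100` above also matches the commented-out line. A minimal sketch of a check that counts only active entries, assuming one directive per line; the function name is hypothetical.

```shell
# Succeed if the file $2 contains an active "blacklist <module>" line for
# the module named in $1. A commented-out entry such as
# "#blacklist sp5100_tco" does not count.
module_blacklisted() {
    grep -qsE "^[[:space:]]*blacklist[[:space:]]+$1[[:space:]]*$" "$2"
}

# Usage (commented out so the file can be sourced safely):
#   module_blacklisted sp5100_tco /etc/modprobe.d/blacklist-watchdog.conf \
#       && echo "still blacklisted" || echo "not blacklisted"
```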

Changed in linux (Ubuntu):
status: Incomplete → Fix Released