Kernel oops - BUG: unable to handle kernel NULL pointer dereference at 0000000000000030; IP: [<ffffffff811c07c7>] touch_atime+0x17/0x140

Bug #1249719 reported by Rick Lavoie
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

Since 3.11.0-13-generic, I've been encountering the following kernel oops. I can trigger it quite reliably by watching full-screen videos on YouTube using google-chrome (Version 31.0.1650.48) for roughly 10 minutes. Once the oops happens, the system remains up, but certain programs (like top) start hanging and a reboot is needed to restore full functionality.

I've tried using 3.12.0-999-generic (3.12.0-999.201311080433), but get the same problem there. I don't get the oops when booting into 3.11.0-12-generic, so I suspect the problem was introduced in 3.11.0-13-generic (especially since I see there were some changes made related to the shmat system call).

Attached is the oops from dmesg. The instruction pointer (IP) location and stack trace are identical for all the oopses I've seen so far.
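For readers triaging similar reports, the headline in this bug's title can be split into its parts with a short script. This is a minimal sketch, not part of the original report: the regular expression and the `parse_oops` helper are illustrative names, and the pattern is only assumed to match headlines shaped like the one quoted in the title. A fault address as small as 0x30 is consistent with reading a field at that offset through a NULL structure pointer, although the headline alone does not say which structure is involved.

```python
import re

# Pattern for an oops headline shaped like the one in this report's title.
# Hypothetical helper: other kernel versions may format the line differently.
OOPS_RE = re.compile(
    r"NULL pointer dereference at (?P<addr>[0-9a-f]+); "
    r"IP: \[<(?P<ip>[0-9a-f]+)>\] "
    r"(?P<sym>\w+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_oops(line):
    """Return the fields of an oops headline as a dict, or None if no match."""
    m = OOPS_RE.search(line)
    if m is None:
        return None
    return {
        "fault_addr": int(m["addr"], 16),  # address that was dereferenced
        "ip": int(m["ip"], 16),            # faulting instruction address
        "symbol": m["sym"],                # function containing the IP
        "offset": int(m["off"], 16),       # IP offset from function start
        "size": int(m["size"], 16),        # total size of the function
    }


title = (
    "BUG: unable to handle kernel NULL pointer dereference at "
    "0000000000000030; IP: [<ffffffff811c07c7>] touch_atime+0x17/0x140"
)
info = parse_oops(title)
print(info["symbol"], hex(info["fault_addr"]), hex(info["offset"]))
# Subtracting the offset from the IP gives the start address of the function.
print(hex(info["ip"] - info["offset"]))
```

Comparing the `symbol` and `offset` fields across several oopses is a quick way to confirm that they all fault at the same instruction, as described above.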

ProblemType: Bug
DistroRelease: Ubuntu 13.10
Package: linux-image-3.11.0-13-generic 3.11.0-13.20
ProcVersionSignature: Ubuntu 3.11.0-13.20-generic 3.11.6
Uname: Linux 3.11.0-13-generic x86_64
ApportVersion: 2.12.5-0ubuntu2.1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC2: rick 1421 F.... pulseaudio
 /dev/snd/controlC1: rick 1421 F.... pulseaudio
 /dev/snd/controlC0: rick 1421 F.... pulseaudio
Date: Sat Nov 9 23:24:33 2013
HibernationDevice: RESUME=UUID=0ad92a6e-2969-424f-b469-faaa9f2ec20f
InstallationDate: Installed on 2012-10-19 (387 days ago)
InstallationMedia: Ubuntu 12.10 "Quantal Quetzal" - Release amd64 (20121017.5)
IwConfig:
 eth0 no wireless extensions.

 lo no wireless extensions.
MachineType: Gigabyte Technology Co., Ltd. X58A-UD3R
MarkForUpload: True
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.11.0-13-generic root=UUID=cf7d6665-f4a9-4be8-a12d-c72a985ea5eb ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.11.0-13-generic N/A
 linux-backports-modules-3.11.0-13-generic N/A
 linux-firmware 1.116
RfKill:

SourcePackage: linux
UpgradeStatus: Upgraded to saucy on 2013-10-19 (22 days ago)
dmi.bios.date: 03/11/2010
dmi.bios.vendor: Award Software International, Inc.
dmi.bios.version: F5
dmi.board.name: X58A-UD3R
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.type: 3
dmi.chassis.vendor: Gigabyte Technology Co., Ltd.
dmi.modalias: dmi:bvnAwardSoftwareInternational,Inc.:bvrF5:bd03/11/2010:svnGigabyteTechnologyCo.,Ltd.:pnX58A-UD3R:pvr:rvnGigabyteTechnologyCo.,Ltd.:rnX58A-UD3R:rvrx.x:cvnGigabyteTechnologyCo.,Ltd.:ct3:cvr:
dmi.product.name: X58A-UD3R
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Rick Lavoie (rlavoie83) wrote :
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: performing-bisect regression-release
Joseph Salisbury (jsalisbury) wrote :

I'd like to perform a bisect to figure out what commit caused this regression. We need to identify the earliest kernel where the issue started happening as well as the latest kernel that did not have this issue.

Can you test the following kernels and report back? We are looking for the first kernel version that exhibits this bug:

v3.11.4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.11.4-saucy/
v3.11.5: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.11.5-saucy/
v3.11.6: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.11.6-saucy/

You don't have to test every kernel, just up until the kernel that first has this bug.

Thanks in advance!

Rick Lavoie (rlavoie83) wrote :

Hello,

I tested the three specified kernels, and v3.11.6 was the first one in which I could trigger the oops.

Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v3.11.5 and v3.11.6. The kernel bisect will require testing of about 7-10 test kernels.
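The estimate of 7-10 test kernels follows from bisection halving the candidate range at each step. A minimal sketch of that arithmetic, assuming a linear history; the commit counts below are hypothetical, since the actual number of commits between v3.11.5 and v3.11.6 is not stated in this thread, and `bisect_steps` is an illustrative name:

```python
def bisect_steps(n_commits):
    """Worst-case number of test builds to isolate one bad commit
    among n_commits candidates when each test halves the range."""
    if n_commits < 1:
        raise ValueError("need at least one candidate commit")
    # (n - 1).bit_length() equals ceil(log2(n)) for every n >= 1.
    return (n_commits - 1).bit_length()


# Hypothetical commit counts, chosen only to bracket the 7-10 estimate.
for n in (128, 500, 1024):
    print(n, "commits ->", bisect_steps(n), "test kernels")
```

Under this model, a range of roughly 128 to 1024 candidate commits corresponds to 7 to 10 builds.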

I built the first test kernel, up to the following commit:
b340404549c872099aec7554439a1821c0708878

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1249719

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Thanks in advance

Rick Lavoie (rlavoie83) wrote :

The 3.11.5-031105.201311151139_amd64 (b340404549c872099aec7554439a1821c0708878) kernel is good, no oops.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
dbb563ec7b143323dd31520ab08c08394bc13381

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1249719

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Thanks in advance

Rick Lavoie (rlavoie83) wrote :

The 3.11.5-031105.201311181306_amd64 (dbb563ec7b143323dd31520ab08c08394bc13381) kernel is good, no oops.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
8bacb5ad41da1fde1c05165515c50e2911f16899

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1249719

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Thanks in advance

Rick Lavoie (rlavoie83) wrote :

Will do, though the new specified commit (8bacb5ad41da1fde1c05165515c50e2911f16899) is actually before the previous commit (dbb563ec7b143323dd31520ab08c08394bc13381), which usually happens if "bad" is reported to the bisection, whereas all the results to date have been "good".
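The observation above can be illustrated with a small simulation. This is a sketch under simplifying assumptions: history is modeled as a linear list of commit indices (oldest first), and `next_test` and `bisect_path` are illustrative names rather than anything git provides. Real `git bisect` also handles non-linear history, so the midpoints it picks can differ.

```python
def next_test(good, bad):
    """Midpoint between the newest known-good and oldest known-bad index."""
    return (good + bad) // 2


def bisect_path(n, verdicts):
    """Return the commit indices tested for a sequence of verdicts,
    followed by the index that would be built next."""
    good, bad = 0, n  # index 0 is known good, index n is known bad
    tested = []
    for verdict in verdicts:
        mid = next_test(good, bad)
        tested.append(mid)
        if verdict == "good":
            good = mid  # the regression lies after this commit
        else:
            bad = mid   # the regression lies at or before this commit
    tested.append(next_test(good, bad))
    return tested


# Two "good" results in a row: every pick moves later in history.
print(bisect_path(128, ["good", "good"]))  # [64, 96, 112]
# Second result recorded as "bad": the next pick falls before the last one.
print(bisect_path(128, ["good", "bad"]))   # [64, 96, 80]
```

In this model the next commit can only be older than the previous one if that previous commit was recorded as bad, which is why an earlier commit after a run of "good" reports suggests a mis-recorded result.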

Rick Lavoie (rlavoie83) wrote :
Joseph Salisbury (jsalisbury) wrote :

You are correct. I actually marked commit dbb563ec7b143323dd31520ab08c08394bc13381 as bad, when you reported it as good. I updated the bisect and the next commit will be:
1104f1b038765d0e31262eecdea6010907194226

I'm building that kernel now.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
1104f1b038765d0e31262eecdea6010907194226

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1249719

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Thanks in advance

Rick Lavoie (rlavoie83) wrote :

The 3.11.5-031105.201311211301_amd64 (1104f1b038765d0e31262eecdea6010907194226) kernel seems to be good, no oops.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
6b8fe6d417f93fa628779b1afd9e7717bbcb0572

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1249719

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Thanks in advance

Rick Lavoie (rlavoie83) wrote :

The 3.11.5-031105.201311251400_amd64 (6b8fe6d417f93fa628779b1afd9e7717bbcb0572) kernel is bad, I was able to trigger the oops. However, the oops was only triggered after several days of usage, making me uneasy that the cause of the oops might be present in earlier kernels and I simply failed to trigger it.

Until I get a new kernel to test, I'll go back to some of the previous kernels and test them further to see whether I can trigger the oops.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
5b284be46733010b15406e15e61871cc1c30dd5e

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1249719

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Thanks in advance

Rick Lavoie (rlavoie83) wrote :

I saw that 3.11.0-15 included a couple of backported fixes for shm races, including one (8380e1f50d3640498a3fec272ce32199de788a2d) that described my bug nearly exactly. I've been testing this kernel for the last couple of days, and have been unable to reproduce the kernel oops, so I think we can call this bug closed.

Joseph Salisbury (jsalisbury) wrote :

Great, thanks for the update, Rick!

Changed in linux (Ubuntu):
status: Confirmed → Fix Released