Kernel Oops - unable to handle kernel NULL pointer dereference at (null); Call Trace: [<ffffffff810fb39b>] ? audit_compare_dname_path+0x2b/0xa0
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| linux (Ubuntu) | | Undecided | Unassigned | |
| Trusty | | Critical | Chris J Arges | |
| Utopic | | Critical | Chris J Arges | |
Bug Description
[Impact]
Ubuntu VMware instances running kernel 3.13.0-51 crash with the following backtrace:
[ 12.357276] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 12.357886] IP: [<ffffffff8136c
[ 12.358457] PGD 230fe9067 PUD 230d5c067 PMD 0
[ 12.359034] Oops: 0000 [#1] SMP
[ 12.359590] Modules linked in: tcp_diag inet_diag vmw_vsock_
[ 12.364773] CPU: 2 PID: 1718 Comm: fail2ban-server Not tainted 3.13.0-51-generic #84-Ubuntu
[ 12.365587] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/14/2014
[ 12.367276] task: ffff880230fc3000 ti: ffff8802308c4000 task.ti: ffff8802308c4000
[ 12.368159] RIP: 0010:[<
[ 12.369073] RSP: 0018:ffff880230
[ 12.369963] RAX: 000000000000000d RBX: 000000000000000d RCX: 0000000000002df0
[ 12.370973] RDX: 0000000000000012 RSI: 0000000000000000 RDI: 0000000000000000
[ 12.372005] RBP: ffff8802308c5d90 R08: ffff8800b9218648 R09: ffff8802308c5d60
[ 12.372988] R10: 0000000000000002 R11: ffff88023082e180 R12: 0000000000000012
[ 12.373901] R13: 0000000000000000 R14: ffff880231f1b3f8 R15: ffff8800b9218460
[ 12.374827] FS: 00007f196f84c74
[ 12.375752] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 12.376667] CR2: 0000000000000000 CR3: 0000000230872000 CR4: 00000000000407e0
[ 12.377684] Stack:
[ 12.378612] ffffffff810fb39b 0000000000000000 0000000000000004 ffff88022ff74838
[ 12.379559] ffff8800b9218400 ffff8800b9218460 ffff8802308c5df8 ffffffff810fdb36
[ 12.380516] ffffffff811d56e0 000000042ff74838 ffff880231f1b3c0 ffff88022febecf8
[ 12.381506] Call Trace:
[ 12.382630] [<ffffffff810fb
[ 12.383784] [<ffffffff810fd
[ 12.384912] [<ffffffff811d5
[ 12.386013] [<ffffffff811ca
[ 12.387145] [<ffffffff816bf
[ 12.388207] [<ffffffff810ff
[ 12.389250] [<ffffffff8160d
[ 12.390297] [<ffffffff8172e
[ 12.391303] [<ffffffff8160e
[ 12.392426] [<ffffffff81733
[ 12.393581] Code: 89 f8 48 89 e5 f6 82 40 c7 84 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 40 c7 84 81 20 75 f0 5d c3 66 0f 1f 44 00 00 <80> 3f 00 55 48 89 e5 74 15 48 89 f8 0f 1f 40 00 48 83 c0 01 80
[ 12.396831] RIP [<ffffffff8136c
[ 12.397812] RSP <ffff8802308c5d60>
[ 12.398769] CR2: 0000000000000000
[ 12.399743] ---[ end trace 2c5a33d31a03347e ]---
We've also seen this on our Precise machines that are running the backported Trusty kernel.
After reverting to kernel 3.13.0-49, the crash no longer occurs.
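When the same crash has to be triaged across several machines, the header lines of an oops like the one above can be pulled apart mechanically. What follows is a minimal sketch, assuming the log text is already available as a string (for example, saved dmesg output). The helper name `summarize_oops` and the keys it returns are illustrative and not part of any kernel tooling.

```python
import re

# Header lines of an x86 oops, in the form they take in the trace above.
FAULT_RE = re.compile(
    r"BUG: unable to handle kernel (.+?) at\s+(\(null\)|[0-9a-fA-F]+)"
)
CPU_RE = re.compile(
    r"CPU: (\d+) PID: (\d+) Comm: (\S+) .*? (\d+\.\d+\.\d+-\d+-\S+) #(\d+)-Ubuntu"
)


def summarize_oops(log_text):
    """Extract fault kind, fault address, process and kernel build from an oops.

    Hypothetical helper for illustration; a key is simply absent when the
    corresponding line is not found in the log.
    """
    summary = {}
    for line in log_text.splitlines():
        fault = FAULT_RE.search(line)
        if fault:
            summary["kind"] = fault.group(1)
            addr = fault.group(2)
            # "(null)" in the trace corresponds to a faulting address of 0
            # (CR2 is all zeroes in the same trace).
            summary["address"] = 0 if addr == "(null)" else int(addr, 16)
        cpu = CPU_RE.search(line)
        if cpu:
            summary["pid"] = int(cpu.group(2))
            summary["comm"] = cpu.group(3)
            summary["release"] = cpu.group(4)
            summary["upload"] = int(cpu.group(5))
    return summary
```

Fed the first and sixth lines of the trace above, this reports the process as `fail2ban-server` and the build as upload 84 of `3.13.0-51-generic`.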
[Test Case]
1) Run an Ubuntu VMware instance with the affected kernel.
apt-get install auditd
echo "-w /etc/test" >>/etc/
/etc/init.d/auditd restart
apt-get install linux-headers-
reboot
Attempt to log in or ssh into the host; you'll get a similar stack trace.
[Fix]
commit fcf22d8267ad260
--
uname -a:
Linux search-2 3.13.0-51-generic #84-Ubuntu SMP Wed Apr 15 12:08:34 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
cat /proc/version_
Ubuntu 3.13.0-
| Alex Tomlins (alex-tomlins) wrote : | #1 |
| Alex Tomlins (alex-tomlins) wrote : | #2 |
| Changed in linux (Ubuntu): | |
| status: | New → Confirmed |
| tags: | added: trusty |
| Chris J Arges (arges) wrote : | #4 |
I suspect commit 4a92843601ad0f5
Commit fcf22d8267ad260
| Changed in linux (Ubuntu Trusty): | |
| assignee: | nobody → Chris J Arges (arges) |
| status: | New → Confirmed |
| Changed in linux (Ubuntu): | |
| status: | Confirmed → New |
| Changed in linux (Ubuntu Utopic): | |
| assignee: | nobody → Chris J Arges (arges) |
| Changed in linux (Ubuntu Trusty): | |
| status: | Confirmed → In Progress |
| importance: | Undecided → Medium |
| Changed in linux (Ubuntu Utopic): | |
| status: | New → In Progress |
| importance: | Undecided → Medium |
| Brad Figg (brad-figg) wrote : | #5 |
This change was made by a bot.
| Changed in linux (Ubuntu): | |
| status: | New → Confirmed |
| Chris J Arges (arges) wrote : | #6 |
Alex,
Can you test the following build to see if it fixes your issue?
http://
Thanks
| Changed in linux (Ubuntu Trusty): | |
| importance: | Medium → Critical |
| Changed in linux (Ubuntu Utopic): | |
| importance: | Medium → Critical |
| description: | updated |
| Pete Cheslock (pete-cheslock) wrote : | #7 |
It looks like this might be related? https:/
| Pete Cheslock (pete-cheslock) wrote : | #8 |
I've tested the build from http://
| tags: | added: regression-update |
| description: | updated |
| Chris J Arges (arges) wrote : | #9 |
Sent patches for 3.13/3.16 to kernel team ML for review.
| Alex Tomlins (alex-tomlins) wrote : | #10 |
Hi Chris, thanks for the speedy response to this.
To add another confirmation: I've tested your build on a couple of our servers, and I'm no longer seeing the Oops, so this looks to have addressed the issue.
thanks,
Alex
| Changed in linux (Ubuntu): | |
| status: | Confirmed → Fix Released |
| Philipp Kern (pkern) wrote : | #11 |
What's the ETA for trusty?
| Chris J Arges (arges) wrote : | #12 |
The fix is currently in the -proposed kernel. (3.13.0-52.85)
| Roman Fiedler (roman-fiedler) wrote : | #13 |
https:/
So if both have a common cause (very likely), then 3.13.0-52.85 is only an incomplete fix.
| Alex Tomlins (alex-tomlins) wrote : | #15 |
Scratch that last comment...
I see (from http://
| Janne Snabb (snabb) wrote : | #16 |
I encountered this issue on a Hetzner VPS. It is a KVM-based virtual server, not VMware. After rebooting I was unable to log in through ssh. Accessing the system from the console was possible, though running many commands resulted in "Killed". The kernel stack trace was the same as in the original bug report.
There is no quick and dirty workaround documented yet on this bug report, so I am adding one.
Do the following to get your system quickly back to a usable state while waiting for the patched kernel:
1) disable starting "auditd" at boot (for example "chmod 000 /etc/init.d/auditd" is an easy and ugly way to do it)
2) reboot the system (in my case the "reboot" command did not work, I had to hard-reset the system)
Done.
| Jinn Ko (jinnko) wrote : | #17 |
Janne, good point. There's another possible workaround in certain circumstances. You can also clear the auditd rules which should allow you to continue working on a running system. This would be done by issuing an "auditctl -D", after which you should be able to use the running system, albeit without any auditing.
| Jeroen Pulles (jeroen-pulles) wrote : | #18 |
> Janne, good point. There's another possible workaround in certain circumstances. You can also clear the auditd rules which should allow you to continue working on a running system. This would be done by issuing an "auditctl -D", after which you should be able to use the running system, albeit without any auditing.
I have various systems with auditing that trigger the null reference without audit rules on the specific pieces. I.e., `chmod 0755 /run/foobar` hangs even though the system only has a filesystem write rule for /etc/something, so I am not sure that clearing the rules is enough. Like you said: "possible" and "certain circumstances".
| David Andruczyk (david-andruczyk) wrote : | #19 |
3.13.0-52.85 still has the same panic related to the audit subsystem.
| Adam Conrad (adconrad) wrote : | #20 |
Right, the fix is in 3.13.0-52.86, not 3.13.0-52.85.
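Both uploads share the ABI number 52, so `uname -r` reports `3.13.0-52-generic` for the broken and the fixed build alike. Only the upload number after `#` in `uname -v` or `uname -a` tells them apart. A minimal sketch of that check follows, assuming a Trusty 3.13 kernel. The helper name `has_audit_fix` is hypothetical and the threshold is taken from this thread. The check says only whether a build is at or past the fixed upload, not whether an older build is actually affected.

```python
import re

# First Trusty upload reported in this thread to contain the fix: 3.13.0-52.86.
FIXED_ABI, FIXED_UPLOAD = 52, 86

# Matches e.g. "3.13.0-52-generic #85-Ubuntu" inside `uname -a` output.
UNAME_RE = re.compile(r"3\.13\.0-(\d+)-\S+ #(\d+)-Ubuntu")


def has_audit_fix(uname_text):
    """Return True or False for a Trusty 3.13 kernel, None if the text does not parse."""
    match = UNAME_RE.search(uname_text)
    if match is None:
        return None
    abi, upload = int(match.group(1)), int(match.group(2))
    # Tuple comparison: a later ABI counts as fixed, and within ABI 52
    # the upload number decides.
    return (abi, upload) >= (FIXED_ABI, FIXED_UPLOAD)
```

For a string containing `3.13.0-52-generic #85-Ubuntu` the function returns `False`, and for the same release with `#86-Ubuntu` it returns `True`.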
| Chris J Arges (arges) wrote : | #21 |
This kernel has the patches that fix the issue:
https:/
If you can, please verify this kernel and post the results to this bug.
Thanks,
| David Andruczyk (david-andruczyk) wrote : | #22 |
3.13.0-52.86 DOES work and no longer exhibits the crash/oops when booted.
| Chris J Arges (arges) wrote : | #23 |
David,
Woo hoo. Thanks and sorry about the confusion regarding kernel versions earlier.
| tags: | added: verification-done |
| Launchpad Janitor (janitor) wrote : | #24 |
This bug was fixed in the package linux - 3.13.0-52.86
---------------
linux (3.13.0-52.86) trusty; urgency=low
[ Brad Figg ]
* Release Tracking Bug
- LP: #1451288
[ Upstream Kernel Changes ]
* audit: create private file name copies when auditing inodes
- LP: #1450442
-- Brad Figg <email address hidden> Sun, 03 May 2015 18:36:19 -0700
| Changed in linux (Ubuntu Trusty): | |
| status: | In Progress → Fix Released |
| Launchpad Janitor (janitor) wrote : | #26 |
This bug was fixed in the package linux - 3.16.0-37.51
---------------
linux (3.16.0-37.51) utopic; urgency=low
[ Brad Figg ]
* Release Tracking Bug
- LP: #1451489
[ Upstream Kernel Changes ]
* Fix a broken backport causing boot failure on gen8 Intel
- LP: #1449401
-- Brad Figg <email address hidden> Mon, 04 May 2015 09:42:43 -0700
| Changed in linux (Ubuntu Utopic): | |
| status: | In Progress → Fix Released |
| Pete Cheslock (pete-cheslock) wrote : | #28 |
I'm still able to recreate this issue with kernel version 3.13.0-52-generic #85-Ubuntu SMP Wed Apr 29 16:44:17 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
It looks like a different set of audit rules causes the same issue.
To replicate:
Install 3.13.0-52-generic kernel
apt-get install auditd
in /etc/audit/
---
-D
-b 5000
-f 0
-r 15000
-a exit,always -F arch=b64 -S execve -S exit -S exit_group -S fork -S clone -S vfork -S accept -S accept4 -S connect -S bind -S listen
---
Restart auditd.
The stack trace below follows.
Stacktrace:
[ 186.897309] BUG: unable to handle kernel NULL pointer dereference at 0000000000000690
[ 186.897322] IP: [<ffffffff8136c
[ 186.897331] PGD 0
[ 186.897334] Oops: 0000 [#1] SMP
[ 186.897339] Modules linked in: dm_crypt crct10dif_pclmul crc32_pclmul ghash_clmulni_intel isofs aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd
[ 186.897357] CPU: 0 PID: 2206 Comm: sudo Not tainted 3.13.0-52-generic #85-Ubuntu
[ 186.897363] task: ffff880003286000 ti: ffff880002a04000 task.ti: ffff880002a04000
[ 186.897368] RIP: e030:[<
[ 186.897375] RSP: e02b:ffff880002
[ 186.897379] RAX: ffff880002a05d40 RBX: 0000000000000690 RCX: 0000000000000000
[ 186.897382] RDX: 0000000000000036 RSI: 0000000000000690 RDI: 0000000000000690
[ 186.897385] RBP: ffff880002a05e08 R08: 0000000000000000 R09: 000000000000fffe
[ 186.897389] R10: 0000000000000000 R11: ffff880002a05c06 R12: ffff8801d298f340
[ 186.897393] R13: 0000000000000000 R14: ffff8801d0fa2000 R15: 0000000000000000
[ 186.897401] FS: 00007f4a9437084
[ 186.897408] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 186.897412] CR2: 0000000000000690 CR3: 00000000031f5000 CR4: 0000000000002660
[ 186.897418] Stack:
[ 186.897420] ffffffff810f7fda ffff8801d298f340 ffff8801d0fa2060 ffff880002a05e78
[ 186.897425] ffffffff810f9581 ffffffff8172a480 ffffffff81c55740 ffff880002a05e60
[ 186.897430] ffffffff8172a480 ffff880002a05ef0 ffff880002a05e60 ffffffff810f6b93
[ 186.897435] Call Trace:
[ 186.897441] [<ffffffff810f7
[ 186.897445] [<ffffffff810f9
[ 186.897451] [<ffffffff8172a
[ 186.897455] [<ffffffff8172a
[ 186.897459] [<ffffffff810f6
[ 186.897463] [<ffffffff810fb
[ 186.897467] [<ffffffff810fe
[ 186.897472] [<ffffffff81733
[ 186.897474] Code: 89 f8 48 89 e5 f6 82 40 c7 84 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 40 c7 84 81 20 75 f0 5d c3 66 0f 1f 44 00 00 <80> 3f 00 55 48 89 e5 74 15 48 89 f8 0f 1f 40 00 48 83 c0 01 80
[ 186.897508] RIP [<ffffffff8136c
[ 186.897511] RSP <ffff880002a05df0>
[ 186.897513] CR2: 0000000000000690
[ 186.897516] ---[ end trace 2626030fc35ecb54 ]---
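The two reproducers in this report use different kinds of audit rule. The original test case adds a file watch (`-w /etc/test`), while the rule set above contains only a syscall rule (`-a exit,always ... -S ...`). The sketch below makes that difference explicit, assuming rule lines in the auditctl syntax shown above. `describe_rule` is a hypothetical helper that handles only the options appearing in this report, and the labels it returns are illustrative.

```python
import shlex


def describe_rule(rule_line):
    """Classify one audit rule line as a file watch, a syscall rule or another option.

    Illustrative only: it covers just the options used in this report.
    """
    args = shlex.split(rule_line)
    if not args:
        return {"type": "empty"}
    if args[0] == "-w" and len(args) > 1:
        return {"type": "watch", "path": args[1]}
    if args[0] in ("-a", "-A") and len(args) > 1:
        # Every "-S name" pair adds one syscall to the rule.
        syscalls = [args[i + 1] for i, arg in enumerate(args[:-1]) if arg == "-S"]
        return {"type": "syscall", "list_action": args[1], "syscalls": syscalls}
    # Anything else (-D, -b, -f, -r in the rule set above) is reported as-is.
    return {"type": "other", "option": args[0]}
```

Applied to the rule set above, only the last line is classified as a syscall rule, with eleven syscalls listed. The `-w /etc/test` line from the original test case is classified as a watch.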
| David Andruczyk (david-andruczyk) wrote : | #29 |
The problem was resolved in #86, not #85.
--
David J. Andruczyk
Systems Administrator
University IT - Enterprise Applications
44 Celebration Drive, Suite 3-100
Rochester, NY 14627
E-mail: <email address hidden>
Office: 585-275-9106
| Simon Déziel (sdeziel) wrote : | #30 |
On 05/15/2015 11:55 AM, Pete Cheslock wrote:
> I'm still able to recreate this issue with kernel version
> 3.13.0-52-generic #85-Ubuntu SMP Wed Apr 29 16:44:17 UTC 2015 x86_64
> x86_64 x86_64 GNU/Linux
The fix landed in the kernel (#86) right after the one you are running
(#85).
| Pete Cheslock (pete-cheslock) wrote : | #31 |
Ah - crap - sorry about that. You are right. Thanks!

