Kernel panic: potential mempolicy use-after-free on server running MongoDB
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | In Progress | High | Jay Vosburgh |
Precise | In Progress | High | Jay Vosburgh |
Bug Description
PID: 21767 TASK: ffff8800874bdc00 CPU: 12 COMMAND: "mongod"
#0 [ffff880657cc3820] machine_kexec at ffffffff810393da
#1 [ffff880657cc3890] crash_kexec at ffffffff810b53f8
#2 [ffff880657cc3960] oops_end at ffffffff8165e528
#3 [ffff880657cc3990] die at ffffffff810178d8
#4 [ffff880657cc39c0] do_trap at ffffffff8165de94
#5 [ffff880657cc3a20] do_invalid_op at ffffffff81014f65
#6 [ffff880657cc3ac0] invalid_op at ffffffff8166796b
[exception RIP: slab_node+46]
RIP: ffffffff8115a66e RSP: ffff880657cc3b70 RFLAGS: 00010097
RAX: 0000000000000000 RBX: ffff880657802c00 RCX: 00000000e62f6aef
RDX: 0000000000000000 RSI: 0000000000000020 RDI: ffff880abf18a288
RBP: ffff880657cc3b80 R8: 0000000000000001 R9: 0000000100100010
R10: 0000000000000000 R11: 0000000000000022 R12: 0000000000000002
R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000020
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff880657cc3b88] get_any_partial at ffffffff816496a0
#8 [ffff880657cc3c18] __slab_alloc at ffffffff816498cf
#9 [ffff880657cc3cc8] __kmalloc_node_track_caller at ffffffff81166f07
#10 [ffff880657cc3d38] __alloc_skb at ffffffff815364c8
#11 [ffff880657cc3d88] __netdev_alloc_skb at ffffffff81536b14
#12 [ffff880657cc3da8] enic_rq_alloc_buf at ffffffffa005484c [enic]
#13 [ffff880657cc3e08] enic_poll_msix at ffffffffa00559ff [enic]
#14 [ffff880657cc3e58] net_rx_action at ffffffff81545274
#15 [ffff880657cc3ec8] __do_softirq at ffffffff8106f5f8
#16 [ffff880657cc3f38] call_softirq at ffffffff81667bec
#17 [ffff880657cc3f50] do_softirq at ffffffff81016305
#18 [ffff880657cc3f70] irq_exit at ffffffff8106f9de
#19 [ffff880657cc3f80] do_IRQ at ffffffff816684a3
--- <IRQ stack> ---
#20 [ffff880544d8bd48] ret_from_intr at ffffffff8165d82e
[exception RIP: __slab_free+737]
RIP: ffffffff81649467 RSP: ffff880544d8bdf8 RFLAGS: 00000202
RAX: 0000000000000001 RBX: ffffffffff0a0210 RCX: 0000000180aa00a9
RDX: 0000000180aa00aa RSI: ffffea002afc6201 RDI: ffff880657806200
RBP: ffff880544d8bea8 R8: 0000000000000001 R9: 0000000000000000
R10: ffff8800874be020 R11: ffff8800874be030 R12: ffff880544d8be33
R13: 000000000000000d R14: ffffffff81191895 R15: ffff880544d8bdb8
ORIG_RAX: ffffffffffffff54 CS: 0010 SS: 0018
#21 [ffff880544d8be30] __change_pid at ffffffff81087dca
#22 [ffff880544d8beb0] kmem_cache_free at ffffffff81163634
#23 [ffff880544d8bef0] __mpol_put at ffffffff81159937
#24 [ffff880544d8bf00] do_exit at ffffffff8106c75c
#25 [ffff880544d8bf70] sys_exit at ffffffff8106caf7
#26 [ffff880544d8bf80] system_
RIP: 00007f6f476b8f37 RSP: 00007f68cbcfdbb0 RFLAGS: 00000202
RAX: 000000000000003c RBX: ffffffff81665982 RCX: ffffffffffffffff
RDX: 00007f68cbcfe700 RSI: 00007f6f478c9250 RDI: 0000000000000000
RBP: 0000000000000000 R8: 00007f68cbcfe700 R9: 00007f68e82a0370
R10: 000000007fffffff R11: 0000000000000246 R12: ffffffff8106caf7
R13: ffff880544d8bf78 R14: 0000000000000003 R15: 00007f68f8744a10
ORIG_RAX: 000000000000...
Changed in linux (Ubuntu):
status: New → Triaged
assignee: nobody → Louis Bouchard (louis-bouchard)
importance: Undecided → High
Changed in linux (Ubuntu Precise):
status: New → Triaged
assignee: nobody → Louis Bouchard (louis-bouchard)
importance: Undecided → High
tags: added: kernel-da-key precise
Louis Bouchard (louis) wrote: #1
Changed in linux (Ubuntu Precise):
status: Triaged → In Progress
Changed in linux (Ubuntu):
status: Triaged → Incomplete
lois garcia (lois-garcia-f) wrote: #3
I have one server, previously affected by the bug, that has been stable for 8 days on 3.8.0-30-generic.
We also just provisioned 24 servers with 3.2.0-57-generic (not yet in production).
If I can provide any information that would help, please let me know through this ticket.
Louis Bouchard (louis) wrote: #5
@christopher,
Lois was provided with a custom kernel that includes a patch to be tested. The patch comes from apw, following a kernel dump analysis that I did. From what we could gather, a possible race condition could be responsible for the panics (sketched below).
So far, I have yet to get confirmation from Lois that ONLY the kernel with the custom patch has fixed the problem, or that a newer kernel has stabilized the situation.
Since the 3.8.* kernel is available on Precise, I'm not sure that identifying the specific commit that fixes the issue would be useful. Even if we identify the commit, a backport might not be possible. I would advise moving to the newer kernel.
I'm sorry if this bug appears to be inactive; the long period without comments is simply because the issue does not occur at regular intervals.
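To make the suspected race concrete, here is the interleaving we believe happened, reconstructed from the crash dump analyzed later in this bug; this is a sketch of the 3.2-era code paths, not a verbatim excerpt, and not a confirmed diagnosis:

/*
 * Suspected interleaving (a reconstruction from the backtrace, not a
 * confirmed diagnosis).
 *
 * Task context, CPU 12, "mongod" in do_exit():
 *   mpol_put(tsk->mempolicy);     refcount reaches zero
 *     kmem_cache_free()
 *       __slab_free()             <-- hard IRQ arrives here, before
 *                                     tsk->mempolicy can be set to NULL
 *
 * Interrupt/softirq context on the same CPU ('current' is still mongod):
 *   net_rx_action() -> __netdev_alloc_skb() -> __kmalloc_node_track_caller()
 *     -> slab_node(current->mempolicy)   reads the policy being freed
 *        -> kernel BUG at mm/mempolicy.c:1638 (use-after-free caught)
 */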
Changed in linux (Ubuntu):
status: Incomplete → Triaged
status: Triaged → In Progress
lois garcia (lois-garcia-f) wrote: #6
Gentlemen, so far both the custom-patched kernel and 3.8.0-30-generic have been stable. We will keep one server on this kernel, and a number of servers on 3.2.0-55-generic and on .57, so you'll have data to compare.
Munehisa Kamata (kamatam-amazon) wrote: #7
Hi,
We have also experienced this issue with 3.2.0-57-generic. Thanks to the core dump analysis by Louis Bouchard, I realized that accessing current->mempolicy in interrupt context is a fundamentally bad idea, and then found the following commit.
This fix was introduced in 3.6-rc1, which is why the 3.8 kernel does not hit this issue. Can you backport the fix to 12.04's 3.2 kernel?
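For reference, here is the essence of that 3.6-rc1 change as I read it; this is a paraphrase of the upstream fix for illustration, not the verbatim diff:

/*
 * Paraphrase of the upstream fix (illustrative, not the verbatim diff):
 * slab_node() must not trust current->mempolicy when called from
 * interrupt context, because 'current' is whatever task the interrupt
 * happened to land on, possibly one in do_exit() freeing its policy.
 * Falling back to the local node is always safe.
 */
unsigned slab_node(void)
{
	struct mempolicy *policy;

	if (in_interrupt())
		return numa_node_id();	/* never dereference the policy here */

	policy = current->mempolicy;
	if (!policy)
		return numa_node_id();

	/* ... existing per-policy (interleave/bind/preferred) selection ... */
	return numa_node_id();
}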
Changed in linux (Ubuntu Precise):
assignee: Louis Bouchard (louis-bouchard) → Chris J Arges (arges)
Changed in linux (Ubuntu):
assignee: Louis Bouchard (louis-bouchard) → nobody
Chris J Arges (arges) wrote: #8
Can those affected by this issue test this build with the patch identified by @kamatam?
http://
Thanks!
Munehisa Kamata (kamatam-amazon) wrote: #9
Hi Chris,
Do any of you have a repro case? The patch itself is really straightforward, but unfortunately I don't have a reliable way to reproduce this race.
Chris J Arges (arges) wrote: #10
A similar case is here:
http://<email address hidden>
It seems that NUMA combined with high-load MongoDB workloads is a factor in causing this crash.
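As context for why mongod in particular trips this: MongoDB deployments on NUMA hardware are commonly started under numactl --interleave=all, which gives the process a non-default interleave mempolicy, and it is exactly that per-task policy the interrupted allocation path dereferences. A minimal userspace sketch of setting such a policy follows (a hypothetical illustration, not taken from this report):

/* Minimal sketch: give the calling process an interleave mempolicy, as
 * "numactl --interleave=all mongod" effectively does.  Hypothetical
 * illustration; assumes NUMA nodes 0 and 1 exist.
 * Build: gcc demo.c -o demo -lnuma
 */
#include <numaif.h>	/* set_mempolicy(), MPOL_INTERLEAVE (libnuma) */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	unsigned long nodemask = 0x3;	/* bit i set => include node i */

	if (set_mempolicy(MPOL_INTERLEAVE, &nodemask,
			  8 * sizeof(nodemask)) != 0) {
		perror("set_mempolicy");
		return EXIT_FAILURE;
	}
	/* current->mempolicy is now non-NULL in the kernel; it is dropped
	 * again in do_exit() when this process exits. */
	return EXIT_SUCCESS;
}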
Munehisa Kamata (kamatam-amazon) wrote: #11
Unfortunately, we don't have a repro case yet. Do you really need one to proceed with this?
Chris J Arges (arges) wrote: #12
A way to verify the patch is required for any SRU. A simple reproducer is always best, but if the problem occurs with high probability within a known amount of time, then running for that amount of time can also help validate and verify the fix.
Munehisa Kamata (kamatam-amazon) wrote: #13
I will not be able to provide a reproducer immediately. If you agree, please keep this bug open until I have one or someone else arrives with a reproducer.
Changed in linux (Ubuntu Precise):
assignee: Chris J Arges (arges) → nobody
Changed in linux (Ubuntu):
assignee: nobody → Jay Vosburgh (jvosburgh)
Changed in linux (Ubuntu Precise):
assignee: nobody → Jay Vosburgh (jvosburgh)
Here is an analysis of the kernel core dump captured for this issue:
crash> sys
      KERNEL: vmlinux-3.2.0-38-generic
    DUMPFILE: VmCore
        CPUS: 24
        DATE: Wed Sep 18 22:34:35 2013
      UPTIME: 1 days, 11:33:14
LOAD AVERAGE: 2.04, 2.09, 2.16
       TASKS: 6656
    NODENAME: ddb-mongo41
     RELEASE: 3.2.0-38-generic
     VERSION: #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013
     MACHINE: x86_64 (2533 Mhz)
      MEMORY: 47.9 GB
       PANIC: "[127932.907100] kernel BUG at /build/buildd/linux-3.2.0/mm/mempolicy.c:1638!"
PID: 21767
COMMAND: "mongod"
TASK: ffff8800874bdc00 [THREAD_INFO: ffff880544d8a000]
CPU: 12
STATE: EXIT_DEAD (PANIC)
Analysis
========
This is the backtrace of the panic task:
crash> bt
PID: 21767 TASK: ffff8800874bdc00 CPU: 12 COMMAND: "mongod"
#0 [ffff880657cc3820] machine_kexec at ffffffff810393da
#1 [ffff880657cc3890] crash_kexec at ffffffff810b53f8
#2 [ffff880657cc3960] oops_end at ffffffff8165e528
#3 [ffff880657cc3990] die at ffffffff810178d8
#4 [ffff880657cc39c0] do_trap at ffffffff8165de94
#5 [ffff880657cc3a20] do_invalid_op at ffffffff81014f65
#6 [ffff880657cc3ac0] invalid_op at ffffffff8166796b
[exception RIP: slab_node+0x2e]
RIP: ffffffff8115a66e RSP: ffff880657cc3b70 RFLAGS: 00010097
RAX: 0000000000000000 RBX: ffff880657802c00 RCX: 00000000e62f6aef
RDX: 0000000000000000 RSI: 0000000000000020 RDI: ffff880abf18a288
RBP: ffff880657cc3b80 R8: 0000000000000001 R9: 0000000100100010
R10: 0000000000000000 R11: 0000000000000022 R12: 0000000000000002
R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000020
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff880657cc3b88] get_any_partial at ffffffff816496a0
#8 [ffff880657cc3c18] __slab_alloc at ffffffff816498cf
#9 [ffff880657cc3cc8] __kmalloc_node_track_caller at ffffffff81166f07
#10 [ffff880657cc3d38] __alloc_skb at ffffffff815364c8
#11 [ffff880657cc3d88] __netdev_alloc_skb at ffffffff81536b14
#12 [ffff880657cc3da8] enic_rq_alloc_buf at ffffffffa005484c [enic]
#13 [ffff880657cc3e08] enic_poll_msix at ffffffffa00559ff [enic]
#14 [ffff880657cc3e58] net_rx_action at ffffffff81545274
#15 [ffff880657cc3ec8] __do_softirq at ffffffff8106f5f8
#16 [ffff880657cc3f38] call_softirq at ffffffff81667bec
#17 [ffff880657cc3f50] do_softirq at ffffffff81016305
#18 [ffff880657cc3f70] irq_exit at ffffffff8106f9de
#19 [ffff880657cc3f80] do_IRQ at ffffffff816684a3
--- <IRQ stack> ---
#20 [ffff880544d8bd48] ret_from_intr at ffffffff8165d82e
[exception RIP: __slab_free+0x2e1]
RIP: ffffffff81649467 RSP: ffff880544d8bdf8 RFLAGS: 00000202
RAX: 0000000000000001 RBX: ffffffffff0a0210 RCX: 0000000180aa00a9
RDX: 0000000180aa00aa RSI: ffffea002afc6201 RDI: ffff880657806200
RBP: ffff880544d8bea8 R8: 0000000000000001 R9: 0000000000000000
R10: ffff8800874be020 R11: ffff8800874be030 R12: ffff880544d8be33
R13: 000000000000000d R14: ffffffff81191895 R15: ffff880544d8bdb8
ORIG_RAX: ffffffffffffff54 CS: 0010 SS: 0018
#21 [ffff880544d8be30] __change_pid at ffffffff81087dca
#22 [ffff880544d8beb0] kmem_cache_free at ffffffff81163634
#23 [ffff...