Kernel panic : mempolicy potential use-after-free on server running mongodb

Bug #1233175 reported by Louis Bouchard on 2013-09-30
This bug affects 3 people
Affects          Status  Importance  Assigned to   Milestone
linux (Ubuntu)           High        Jay Vosburgh
Precise                  High        Jay Vosburgh

Bug Description

PID: 21767 TASK: ffff8800874bdc00 CPU: 12 COMMAND: "mongod"
 #0 [ffff880657cc3820] machine_kexec at ffffffff810393da
 #1 [ffff880657cc3890] crash_kexec at ffffffff810b53f8
 #2 [ffff880657cc3960] oops_end at ffffffff8165e528
 #3 [ffff880657cc3990] die at ffffffff810178d8
 #4 [ffff880657cc39c0] do_trap at ffffffff8165de94
 #5 [ffff880657cc3a20] do_invalid_op at ffffffff81014f65
 #6 [ffff880657cc3ac0] invalid_op at ffffffff8166796b
    [exception RIP: slab_node+46]
    RIP: ffffffff8115a66e RSP: ffff880657cc3b70 RFLAGS: 00010097
    RAX: 0000000000000000 RBX: ffff880657802c00 RCX: 00000000e62f6aef
    RDX: 0000000000000000 RSI: 0000000000000020 RDI: ffff880abf18a288
    RBP: ffff880657cc3b80 R8: 0000000000000001 R9: 0000000100100010
    R10: 0000000000000000 R11: 0000000000000022 R12: 0000000000000002
    R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000020
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
 #7 [ffff880657cc3b88] get_any_partial at ffffffff816496a0
 #8 [ffff880657cc3c18] __slab_alloc at ffffffff816498cf
 #9 [ffff880657cc3cc8] __kmalloc_node_track_caller at ffffffff81166f07
#10 [ffff880657cc3d38] __alloc_skb at ffffffff815364c8
#11 [ffff880657cc3d88] __netdev_alloc_skb at ffffffff81536b14
#12 [ffff880657cc3da8] enic_rq_alloc_buf at ffffffffa005484c [enic]
#13 [ffff880657cc3e08] enic_poll_msix at ffffffffa00559ff [enic]
#14 [ffff880657cc3e58] net_rx_action at ffffffff81545274
#15 [ffff880657cc3ec8] __do_softirq at ffffffff8106f5f8
#16 [ffff880657cc3f38] call_softirq at ffffffff81667bec
#17 [ffff880657cc3f50] do_softirq at ffffffff81016305
#18 [ffff880657cc3f70] irq_exit at ffffffff8106f9de
#19 [ffff880657cc3f80] do_IRQ at ffffffff816684a3
--- <IRQ stack> ---
#20 [ffff880544d8bd48] ret_from_intr at ffffffff8165d82e
    [exception RIP: __slab_free+737]
    RIP: ffffffff81649467 RSP: ffff880544d8bdf8 RFLAGS: 00000202
    RAX: 0000000000000001 RBX: ffffffffff0a0210 RCX: 0000000180aa00a9
    RDX: 0000000180aa00aa RSI: ffffea002afc6201 RDI: ffff880657806200
    RBP: ffff880544d8bea8 R8: 0000000000000001 R9: 0000000000000000
    R10: ffff8800874be020 R11: ffff8800874be030 R12: ffff880544d8be33
    R13: 000000000000000d R14: ffffffff81191895 R15: ffff880544d8bdb8
    ORIG_RAX: ffffffffffffff54 CS: 0010 SS: 0018
#21 [ffff880544d8be30] __change_pid at ffffffff81087dca
#22 [ffff880544d8beb0] kmem_cache_free at ffffffff81163634
#23 [ffff880544d8bef0] __mpol_put at ffffffff81159937
#24 [ffff880544d8bf00] do_exit at ffffffff8106c75c
#25 [ffff880544d8bf70] sys_exit at ffffffff8106caf7
#26 [ffff880544d8bf80] system_call_fastpath at ffffffff81665982
    RIP: 00007f6f476b8f37 RSP: 00007f68cbcfdbb0 RFLAGS: 00000202
    RAX: 000000000000003c RBX: ffffffff81665982 RCX: ffffffffffffffff
    RDX: 00007f68cbcfe700 RSI: 00007f6f478c9250 RDI: 0000000000000000
    RBP: 0000000000000000 R8: 00007f68cbcfe700 R9: 00007f68e82a0370
    R10: 000000007fffffff R11: 0000000000000246 R12: ffffffff8106caf7
    R13: ffff880544d8bf78 R14: 0000000000000003 R15: 00007f68f8744a10
    ORIG_RAX: 000000000000...

Louis Bouchard (louis) on 2013-09-30
Changed in linux (Ubuntu):
status: New → Triaged
assignee: nobody → Louis Bouchard (louis-bouchard)
importance: Undecided → High
Louis Bouchard (louis) on 2013-09-30
Changed in linux (Ubuntu Precise):
status: New → Triaged
assignee: nobody → Louis Bouchard (louis-bouchard)
importance: Undecided → High
tags: added: kernel-da-key precise
Louis Bouchard (louis) wrote :

Here is an analysis of the kernel core dump captured for this issue:

crash> sys
     KERNEL: vmlinux-3.2.0-38-generic
    DUMPFILE: VmCore
        CPUS: 24
        DATE: Wed Sep 18 22:34:35 2013
      UPTIME: 1 days, 11:33:14
LOAD AVERAGE: 2.04, 2.09, 2.16
       TASKS: 6656
    NODENAME: ddb-mongo41
     RELEASE: 3.2.0-38-generic
     VERSION: #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013
     MACHINE: x86_64 (2533 Mhz)
      MEMORY: 47.9 GB
       PANIC: "[127932.907100] kernel BUG at /build/buildd/linux-3.2.0/mm/mempolicy.c:1638!"
         PID: 21767
     COMMAND: "mongod"
        TASK: ffff8800874bdc00 [THREAD_INFO: ffff880544d8a000]
         CPU: 12
       STATE: EXIT_DEAD (PANIC)

Analysis
========
This is the backtrace of the panic task:

crash> bt
PID: 21767 TASK: ffff8800874bdc00 CPU: 12 COMMAND: "mongod"
 #0 [ffff880657cc3820] machine_kexec at ffffffff810393da
 #1 [ffff880657cc3890] crash_kexec at ffffffff810b53f8
 #2 [ffff880657cc3960] oops_end at ffffffff8165e528
 #3 [ffff880657cc3990] die at ffffffff810178d8
 #4 [ffff880657cc39c0] do_trap at ffffffff8165de94
 #5 [ffff880657cc3a20] do_invalid_op at ffffffff81014f65
 #6 [ffff880657cc3ac0] invalid_op at ffffffff8166796b
    [exception RIP: slab_node+0x2e]
    RIP: ffffffff8115a66e RSP: ffff880657cc3b70 RFLAGS: 00010097
    RAX: 0000000000000000 RBX: ffff880657802c00 RCX: 00000000e62f6aef
    RDX: 0000000000000000 RSI: 0000000000000020 RDI: ffff880abf18a288
    RBP: ffff880657cc3b80 R8: 0000000000000001 R9: 0000000100100010
    R10: 0000000000000000 R11: 0000000000000022 R12: 0000000000000002
    R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000020
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
 #7 [ffff880657cc3b88] get_any_partial at ffffffff816496a0
 #8 [ffff880657cc3c18] __slab_alloc at ffffffff816498cf
 #9 [ffff880657cc3cc8] __kmalloc_node_track_caller at ffffffff81166f07
#10 [ffff880657cc3d38] __alloc_skb at ffffffff815364c8
#11 [ffff880657cc3d88] __netdev_alloc_skb at ffffffff81536b14
#12 [ffff880657cc3da8] enic_rq_alloc_buf at ffffffffa005484c [enic]
#13 [ffff880657cc3e08] enic_poll_msix at ffffffffa00559ff [enic]
#14 [ffff880657cc3e58] net_rx_action at ffffffff81545274
#15 [ffff880657cc3ec8] __do_softirq at ffffffff8106f5f8
#16 [ffff880657cc3f38] call_softirq at ffffffff81667bec
#17 [ffff880657cc3f50] do_softirq at ffffffff81016305
#18 [ffff880657cc3f70] irq_exit at ffffffff8106f9de
#19 [ffff880657cc3f80] do_IRQ at ffffffff816684a3
--- <IRQ stack> ---
#20 [ffff880544d8bd48] ret_from_intr at ffffffff8165d82e
    [exception RIP: __slab_free+0x2e1]
    RIP: ffffffff81649467 RSP: ffff880544d8bdf8 RFLAGS: 00000202
    RAX: 0000000000000001 RBX: ffffffffff0a0210 RCX: 0000000180aa00a9
    RDX: 0000000180aa00aa RSI: ffffea002afc6201 RDI: ffff880657806200
    RBP: ffff880544d8bea8 R8: 0000000000000001 R9: 0000000000000000
    R10: ffff8800874be020 R11: ffff8800874be030 R12: ffff880544d8be33
    R13: 000000000000000d R14: ffffffff81191895 R15: ffff880544d8bdb8
    ORIG_RAX: ffffffffffffff54 CS: 0010 SS: 0018
#21 [ffff880544d8be30] __change_pid at ffffffff81087dca
#22 [ffff880544d8beb0] kmem_cache_free at ffffffff81163634
#23 [ffff...
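
For anyone reading the backtrace: the frames listed before the "--- <IRQ stack> ---" marker ran in interrupt context, and the frames after it belong to the task that was interrupted (here, mongod in do_exit). Below is a minimal sketch that splits a crash "bt" listing at that marker. It assumes the frame layout shown in this report; the helper name split_at_irq_marker and the SAMPLE text are hypothetical, and SAMPLE only reuses a few frame lines from the bug description.

```python
import re

# Matches crash "bt" frame lines such as:
#   #7 [ffff880657cc3b88] get_any_partial at ffffffff816496a0
FRAME_RE = re.compile(r"^\s*#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+")


def split_at_irq_marker(bt_text):
    """Return (irq_frames, task_frames) as lists of (frame_no, symbol).

    Frames seen before the "<IRQ stack>" marker go into irq_frames, frames
    after it into task_frames. Without a marker, everything lands in the
    first list. Register dumps and truncated lines are skipped.
    """
    irq_frames, task_frames = [], []
    target = irq_frames
    for line in bt_text.splitlines():
        if "<IRQ stack>" in line:
            target = task_frames
            continue
        match = FRAME_RE.match(line)
        if match:
            target.append((int(match.group(1)), match.group(2)))
    return irq_frames, task_frames


# A few frames copied from the report, for illustration only.
SAMPLE = """\
 #6 [ffff880657cc3ac0] invalid_op at ffffffff8166796b
    [exception RIP: slab_node+0x2e]
 #7 [ffff880657cc3b88] get_any_partial at ffffffff816496a0
#19 [ffff880657cc3f80] do_IRQ at ffffffff816684a3
--- <IRQ stack> ---
#20 [ffff880544d8bd48] ret_from_intr at ffffffff8165d82e
#23 [ffff880544d8bef0] __mpol_put at ffffffff81159937
#24 [ffff880544d8bf00] do_exit at ffffffff8106c75c
"""

irq_frames, task_frames = split_at_irq_marker(SAMPLE)
print("interrupt side:  ", [sym for _, sym in irq_frames])
print("interrupted task:", [sym for _, sym in task_frames])
```

Seen this way, the interrupt-side allocation path (get_any_partial, then slab_node) sits on top of a task that was in the middle of __mpol_put.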

Louis Bouchard (louis) on 2013-10-25
Changed in linux (Ubuntu Precise):
status: Triaged → In Progress
Changed in linux (Ubuntu):
status: Triaged → Incomplete
lois garcia (lois-garcia-f) wrote :

I have one server, previously affected by the bug, that has been stable for 8 days on 3.8.0-30-generic.

We also just provisioned 24 servers with 3.2.0-57-generic (not yet in production).

If I can provide any information to you that would help, please let me know through the ticket.

Louis Bouchard (louis) wrote :

@christopher,

Lois was provided with a custom kernel that includes a patch to be tested. The patch comes from apw, following a kernel dump analysis that I did. From what we could gather, a possible race condition could be responsible for the panics.

So far, I have yet to get confirmation from Lois on whether ONLY the kernel with the custom patch has fixed the problem, or whether a newer kernel has also stabilized the situation.

Since the 3.8.* kernel is available on precise, I'm not sure that identifying the specific commit that fixes the issue would be useful. Even if we identify the commit, a backport might not be possible.

I would advise moving to the newer kernel.

I'm sorry if this bug appears to be inactive, but the long period without any comment is due to the fact that the issue does not occur at regular intervals.

Changed in linux (Ubuntu):
status: Incomplete → Triaged
status: Triaged → In Progress
lois garcia (lois-garcia-f) wrote :

Gentlemen, so far, both the custom patched kernel and 3.8.0-30-generic have been stable. We will keep one server on this kernel, and a number of servers on 3.2.0-55-generic and on .57, so you'll have data to compare.

Hi,

We have also experienced this issue with 3.2.0-57-generic. Thanks to the core dump analysis by Louis Bouchard, I noticed that accessing current->mempolicy in interrupt context is a really bad idea, and then found the following commit.

 http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=e7b691b085fda913830e5280ae6f724b2a63c824

This fix was introduced in 3.6-rc1, which is why the 3.8 kernel hasn't experienced this issue. Can you backport the fix to 12.04's 3.2 kernel?
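
To make the window concrete, here is a toy Python model, not kernel code. It assumes the sequence suggested by the backtrace: the exiting task has released its policy object (__mpol_put) but the task's mempolicy pointer has not been cleared yet, and an interrupt on the same CPU then allocates memory and consults that stale policy. My reading of the linked commit is that it makes slab_node() fall back to the local node when called from interrupt context; please treat that as an assumption and check the commit itself. All class and function names below, and the POISON value, are hypothetical.

```python
# Illustrative policy modes and a stand-in for freed-memory contents.
MPOL_PREFERRED, MPOL_BIND, MPOL_INTERLEAVE = 1, 2, 3
KNOWN_MODES = {MPOL_PREFERRED, MPOL_BIND, MPOL_INTERLEAVE}
POISON = 0x6B6B  # arbitrary value; a freed object can contain anything


class Policy:
    def __init__(self, mode, node):
        self.mode = mode
        self.node = node


class Task:
    def __init__(self, mempolicy):
        self.mempolicy = mempolicy


def slab_node_unguarded(task, local_node, in_interrupt):
    """Always trusts task.mempolicy, even when called from an interrupt."""
    policy = task.mempolicy
    if policy is None:
        return local_node
    if policy.mode not in KNOWN_MODES:
        raise RuntimeError("BUG: unknown mempolicy mode")
    return policy.node


def slab_node_guarded(task, local_node, in_interrupt):
    """Ignores the task policy entirely when called from an interrupt."""
    if in_interrupt:
        return local_node
    return slab_node_unguarded(task, local_node, in_interrupt)


def exit_interrupted(slab_node, local_node=0):
    """Model an interrupt landing between 'free policy' and 'clear pointer'."""
    task = Task(Policy(MPOL_PREFERRED, node=1))
    task.mempolicy.mode = POISON  # policy object freed, pointer still set
    chosen = slab_node(task, local_node, in_interrupt=True)  # IRQ allocates
    task.mempolicy = None  # pointer cleared too late for the interrupt
    return chosen


try:
    exit_interrupted(slab_node_unguarded)
    unguarded_outcome = "no crash"
except RuntimeError as err:
    unguarded_outcome = str(err)

print("unguarded:", unguarded_outcome)
print("guarded picks node:", exit_interrupted(slab_node_guarded))
```

In the model, the unguarded variant trips over the stale policy, while the guarded variant never looks at it from interrupt context and simply returns the local node.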

Chris J Arges (arges) on 2014-04-14
Changed in linux (Ubuntu Precise):
assignee: Louis Bouchard (louis-bouchard) → Chris J Arges (arges)
Changed in linux (Ubuntu):
assignee: Louis Bouchard (louis-bouchard) → nobody
Chris J Arges (arges) wrote :

Can those affected by this issue test this build with the patch identified by @kamatam?
http://people.canonical.com/~arges/lp1233175/

Thanks!

Hi Chris,

Does any of you have a repro case? Although the patch itself is really straightforward, unfortunately I don't have a reliable repro case for this race.

Chris J Arges (arges) wrote :

A similar case is here:
http://<email address hidden>/msg351591.html

It seems that NUMA combined with high-load MongoDB workloads is a factor in causing this crash.

Unfortunately, we don't have a repro case yet. Do you really need a repro case to proceed with this?

Chris J Arges (arges) wrote :

A way to verify the patch is required for any SRU. A simple reproducer is always best, but if this problem occurs with high probability within a known amount of time then running for that known amount of time could also assist in validating and verifying the fix.
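
As a rough way to size such a timed run, here is a small sketch. It assumes crashes on an unfixed kernel arrive as a Poisson process with a known mean time between crashes, and that servers fail independently; neither assumption is established in this bug. The function names and the 48-hour figure are hypothetical and are not measurements from this report.

```python
import math


def prob_no_crash(mean_hours_between_crashes, hours, servers=1):
    """Chance that an UNFIXED kernel survives the whole soak by luck alone."""
    return math.exp(-servers * hours / mean_hours_between_crashes)


def hours_needed(mean_hours_between_crashes, confidence, servers=1):
    """Soak length after which a clean run is evidence at the given level.

    'confidence' is 1 minus the chance that an unfixed kernel would have
    survived the same run without crashing.
    """
    return -mean_hours_between_crashes * math.log(1.0 - confidence) / servers


# Hypothetical input: an unfixed server crashes once every 48 hours on average.
mean_hours = 48.0
print("one server, 48 h, survives by luck: %.3f" % prob_no_crash(mean_hours, 48))
print("hours for 95%% on 4 servers: %.1f" % hours_needed(mean_hours, 0.95, servers=4))
```

The longer the mean time between crashes, the longer the soak has to be, which is why irregular, rare panics like this one are hard to verify without a reproducer.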

I will not be able to provide a reproducer for this immediately. If you agree, please keep this open until I have one or someone else comes here with a reproducer.

Chris J Arges (arges) on 2014-06-18
Changed in linux (Ubuntu Precise):
assignee: Chris J Arges (arges) → nobody
Jay Vosburgh (jvosburgh) on 2014-08-05
Changed in linux (Ubuntu):
assignee: nobody → Jay Vosburgh (jvosburgh)
Changed in linux (Ubuntu Precise):
assignee: nobody → Jay Vosburgh (jvosburgh)