Ubuntu
linux package

Comment 22 for bug 279186

Robert Nelson (robertcnelson) wrote on 2008-10-22: Re: kernel oops on boot (dual-core Atom 330 board D945GCLF2)

Okay, this really doesn't make sense, and it's not what I was hoping for (config-2.6.27-7.12 already has it disabled). The git bisect results (Atom 330 D945GCLF2, x86_64) are below. This commit more than likely will not get reverted, since the problem it addresses can damage hardware, but it will have to be modified to work with this CPU.

Any ideas?

Regards,
Robert

67d9b90a1c844bf1c6daaffd2c60561fc8c445f7 is first bad commit
commit 67d9b90a1c844bf1c6daaffd2c60561fc8c445f7
Author: Steven Rostedt <email address hidden>
Date: Wed Oct 15 18:21:44 2008 -0400

    disable CONFIG_DYNAMIC_FTRACE due to possible memory corruption on module unload

    While debugging the e1000e corruption bug with Intel, we discovered
    today that the dynamic ftrace code in mainline is the likely source of
    this bug.

    For the stable kernel we are providing the only viable fix patch: labeling
    CONFIG_DYNAMIC_FTRACE as broken. (see the patch below)

    We will follow up with a backport patch that contains the fixes. But since
    the fixes are not a one liner, the safest approach for now is to
    disable the code in question.

    The cause of the bug is due to the way the current code in mainline
    handles dynamic ftrace. When dynamic ftrace is turned on, it also
    turns on CONFIG_FTRACE which enables the -pg config in gcc that places
    a call to mcount at every function call. With just CONFIG_FTRACE this
    causes a noticeable overhead. CONFIG_DYNAMIC_FTRACE works to ease this
    overhead by dynamically updating the mcount call sites into nops.

    The problem arises when we trace functions and modules are unloaded.
    The first time a function is called, it will call mcount and the mcount
    call will call ftrace_record_ip. This records the calling site and
    stores it in a preallocated hash table. Later on a daemon will
    wake up and call kstop_machine and convert any mcount callers into
    nops.

    The evolution of this code first tried to do this without the kstop_machine
    and used cmpxchg to update the callers as they were called. But I
    was informed that this is dangerous to do on SMP machines if another
    CPU is running that same code. The solution was to do this with
    kstop_machine.

    We still used cmpxchg to test if the code that we are modifying is
    indeed code that we expect to be before updating it - as a final
    line of defense.

    But on 32bit machines, ioremapped memory and modules share the same
    address space. When a module would load its code into memory and execute
    some code, that would register the function.

    On module unload, ftrace incorrectly did not zap these functions from
    its hash (this was the bug). The cmpxchg could have saved us in most
    cases (via luck) - but with ioremap-ed memory that was exactly the wrong
    thing to do - the results of cmpxchg on device memory are undefined.
    (and will likely result in a write)

    The pending .28 ftrace tree does not have this bug anymore, as a general push
    towards more robustness of code patching, this is done differently: we do not
    use cmpxchg and we do a WARN_ON and turn the tracer off if anything deviates
    from its expected state. Furthermore, patch sites are statically identified
    during build time so there's no runtime discovery of dynamic code areas
    anymore, and no room for code unmaps to cause the hash to become out of date.

    We believe the fragility of dynamic patching has been sufficiently
    addressed in the development code via the static patching method, but further
    suggestions to make it more robust are welcome.

    Signed-off-by: Steven Rostedt <email address hidden>
    Acked-by: Ingo Molnar <email address hidden>
    Acked-by: Thomas Gleixner <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>
    Signed-off-by: Tim Gardner <email address hidden>

:040000 040000 22d7188976a687f1a02ca43ee8a55e2202e10397 e7e8cf17d28e09c98efac31848bae727973ca8b8 M kernel
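
To make the failure mode described in the commit message concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: `ToyTracer`, `run_scenario`, the dict standing in for memory, and the address 0x1000 are all made up for illustration and are not kernel code or kernel APIs. It shows how a call site recorded for a module function can outlive the module and later point at ioremapped memory, and how dropping the module's entries on unload avoids that.

```python
# Toy model only: addresses are ints, "memory" is a dict, nothing here is kernel API.
MCOUNT_CALL = "call mcount"
NOP = "nop"


class ToyTracer:
    def __init__(self, purge_on_unload):
        self.purge_on_unload = purge_on_unload
        self.recorded = set()      # stands in for the hash table of call sites
        self.device_touched = []   # addresses where the check hit device memory

    def record_ip(self, addr):
        """First call of a function: remember its mcount call site."""
        self.recorded.add(addr)

    def unload_module(self, memory, addrs):
        """Free the module's code; optionally drop its recorded call sites."""
        for addr in addrs:
            memory.pop(addr, None)
        if self.purge_on_unload:
            self.recorded -= set(addrs)

    def patch_pass(self, memory):
        """Later pass: try to turn every recorded call site into a nop."""
        for addr in sorted(self.recorded):
            kind, value = memory.get(addr, ("unmapped", None))
            if kind == "device":
                # A compare-and-swap here is the hazard: the address is no
                # longer code, it is device memory.
                self.device_touched.append(addr)
            elif kind == "code" and value == MCOUNT_CALL:
                memory[addr] = ("code", NOP)


def run_scenario(purge_on_unload):
    memory = {0x1000: ("code", MCOUNT_CALL)}   # a module function's call site
    tracer = ToyTracer(purge_on_unload)
    tracer.record_ip(0x1000)                   # function runs once
    tracer.unload_module(memory, [0x1000])     # module goes away
    memory[0x1000] = ("device", 0xABCD)        # same address reused by an ioremap
    tracer.patch_pass(memory)                  # patch pass runs afterwards
    return tracer.device_touched


if __name__ == "__main__":
    print("without purge:", run_scenario(purge_on_unload=False))
    print("with purge:   ", run_scenario(purge_on_unload=True))
```

With `purge_on_unload=False` the stale entry survives and the patch pass touches the remapped address; with `purge_on_unload=True` nothing is touched. This only mirrors the description in the commit message and says nothing about why the Atom 330 oopses with the option disabled.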

