Comment 6 for bug 1533349

Brian Murray (brian-murray) wrote : Re: [Bug 1533349] Re: crashes sometimes do not get retraced

On Wed, Feb 17, 2016 at 08:26:20AM -0000, Martin Pitt wrote:
> Ah, so the problem is that gdb is not showing most of the addresses in
> the frames on "bt", so that the address signature cannot be computed. I
> tried to retrace this manually on xenial. With both the current ifupdown
> version (i.e. report vs. current xenial version mismatch) as well as
> the actual 0.8.6-1ubuntu1 version I can reproduce this stack trace with
> missing addresses.
>
> Interestingly, "info f" does show a PC for the ones missing an address
> in "bt":
>
> (gdb) bt
> #0 __GI_strncpy (s1=0xbe86aa6f "", s1@entry=0xbe86aa70 "lo", s2=0x5 <error: Cannot access memory at address 0x5>, n=n@entry=80)
> at strncpy.c:41
> #1 0x00013032 in strncpy (__len=80, __src=<optimized out>, __dest=0xbe86aa70 "lo")
> at /usr/include/arm-linux-gnueabihf/bits/string3.h:126
> #2 do_interface (target_iface=<optimized out>) at main.c:846
> #3 0x00011994 in main (argc=<optimized out>, argv=0xbe86ade8) at main.c:1146
> (gdb) info f 0
> Stack frame at 0xbe86a9b0:
> pc = 0xb6e9a124 in __GI_strncpy (strncpy.c:41); saved pc = 0x13032
> called by frame at 0xbe86ac40
> source language c.
> Arglist at 0xbe86a9a0, args: s1=0xbe86aa6f "", s1@entry=0xbe86aa70 "lo", s2=0x5 <error: Cannot access memory at address 0x5>,
> n=n@entry=80
> Locals at 0xbe86a9a0, Previous frame's sp is 0xbe86a9b0
> Saved registers:
> r4 at 0xbe86a9a0, r5 at 0xbe86a9a4, r6 at 0xbe86a9a8, lr at 0xbe86a9ac
> (gdb) info f 1
> Stack frame at 0xbe86ac40:
> pc = 0x13032 in strncpy (/usr/include/arm-linux-gnueabihf/bits/string3.h:126); saved pc = 0x11994
> inlined into frame 2, caller of frame at 0xbe86a9b0
> source language c.
> Arglist at unknown address.
> Locals at unknown address, Previous frame's sp is 0xbe86a9b0
> Saved registers:
> r4 at 0xbe86a9a0, r5 at 0xbe86a9a4, r6 at 0xbe86a9a8, lr at 0xbe86a9ac
> (gdb) info f 2
> Stack frame at 0xbe86ac40:
> pc = 0x13032 in do_interface (main.c:846); saved pc = 0x11994
> called by frame at 0xbe86ac90, caller of frame at 0xbe86ac40
> source language c.
> Arglist at 0xbe86a9b0, args: target_iface=<optimized out>
> Locals at 0xbe86a9b0, Previous frame's sp is 0xbe86ac40
> Saved registers:
> r4 at 0xbe86ac1c, r5 at 0xbe86ac20, r6 at 0xbe86ac24, r7 at 0xbe86ac28, r8 at 0xbe86ac2c, r9 at 0xbe86ac30, r10 at 0xbe86ac34,
> r11 at 0xbe86ac38, lr at 0xbe86ac3c
> (gdb) info f 3
> Stack frame at 0xbe86ac90:
> pc = 0x11994 in main (main.c:1146); saved pc = 0xb6e59772
> caller of frame at 0xbe86ac40
> source language c.
> Arglist at 0xbe86ac40, args: argc=<optimized out>, argv=0xbe86ade8
> Locals at 0xbe86ac40, Previous frame's sp is 0xbe86ac90
> Saved registers:
> r4 at 0xbe86ac6c, r5 at 0xbe86ac70, r6 at 0xbe86ac74, r7 at 0xbe86ac78, r8 at 0xbe86ac7c, r9 at 0xbe86ac80, r10 at 0xbe86ac84,
> r11 at 0xbe86ac88, lr at 0xbe86ac8c
>
> Reading ftp://ftp.gnu.org/old-gnu/Manuals/gdb/html_chapter/gdb_7.html
> suggests that -fomit-frame-pointer could be responsible for this. It
> also mentions a missing address for the topmost frame. Following the
> suggestion there, I think we should make crash_signature_addresses() get
> some fallbacks. In particular, here:
>
>     addr = line.split()[1]
>     if not addr.startswith('0x'):
>         continue
>
> instead of ignoring that frame, we should check if the frame has a
> file:line reference as in
>
> #2 do_interface (target_iface=<optimized out>) at main.c:846
>
> and use that as a signature of the current frame instead of ignoring it.
> If that also isn't present, it could then fall back to the function
> name; that isn't particularly precise as it could happen anywhere in
> that function, but it's still better than skipping the frame completely.
>
> WDYT?

Given that 'info f' for a frame returns a pc, is there a reason not to
use that?
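
A rough sketch of that idea in Python, assuming the 'info f N' output
always carries a line of the form "pc = 0x... in ..." as in the session
quoted above. The helper name pc_from_info_frame and the regular
expression are hypothetical, not existing apport code:

```python
import re

# Assumption: the frame's own pc is on a line that starts with
# "pc = 0x..." (after optional indentation). The "saved pc" later on
# the same line is not matched because the pattern is anchored at the
# start of the line.
PC_RE = re.compile(r'^\s*pc = (0x[0-9a-fA-F]+)\b')


def pc_from_info_frame(info_frame_output):
    """Return the frame's pc as a hex string, or None if none is found."""
    for line in info_frame_output.splitlines():
        m = PC_RE.match(line)
        if m:
            return m.group(1)
    return None
```

The cost is one extra gdb command per frame that lacks an address in
"bt", so it would only need to run for those frames.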

Otherwise it seems like a fine idea.
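
For illustration, the fallback order proposed above (address, then
file:line, then function name) could look roughly like this. It is only
a sketch: frame_signature is a hypothetical helper, not the actual
crash_signature_addresses() code, and it assumes each frame sits on a
single "bt" line, which is not true when gdb wraps "at file:line" onto
the next line as it does for frames #0 and #1 above.

```python
import re

# Hypothetical patterns; gdb's "bt" format is assumed to be
# "#N [0xADDR in] function (args) [at file:line]".
FRAME_NO_RE = re.compile(r'^#\d+$')
FILE_LINE_RE = re.compile(r'\bat (\S+:\d+)$')


def frame_signature(line):
    """Return a signature for one 'bt' frame line, or None if the
    line is not a frame line."""
    parts = line.split()
    if len(parts) < 2 or not FRAME_NO_RE.match(parts[0]):
        return None
    # Preferred: the address printed by gdb.
    if parts[1].startswith('0x'):
        return parts[1]
    # First fallback: a trailing "at file:line" reference.
    m = FILE_LINE_RE.search(line.strip())
    if m:
        return m.group(1)
    # Last resort: the function name, which is imprecise but still
    # better than dropping the frame.
    return parts[1]
```

With the frame from the example, '#2 do_interface (target_iface=<optimized
out>) at main.c:846' would yield 'main.c:846' rather than being skipped.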

--
Brian Murray