Comment 5 for bug 1533349

Martin Pitt (pitti) wrote : Re: crashes sometimes do not get retraced

Ah, so the problem is that gdb does not show the addresses for most of the frames in "bt", so the address signature cannot be computed. I tried to retrace this manually on xenial. I can reproduce this stack trace with missing addresses both with the current xenial ifupdown version (i.e. with a mismatch between the report's version and the installed one) and with the actual 0.8.6-1ubuntu1 version.

Interestingly, "info f" does show a PC for the ones missing an address in "bt":

(gdb) bt
#0 __GI_strncpy (s1=0xbe86aa6f "", s1@entry=0xbe86aa70 "lo", s2=0x5 <error: Cannot access memory at address 0x5>, n=n@entry=80)
    at strncpy.c:41
#1 0x00013032 in strncpy (__len=80, __src=<optimized out>, __dest=0xbe86aa70 "lo")
    at /usr/include/arm-linux-gnueabihf/bits/string3.h:126
#2 do_interface (target_iface=<optimized out>) at main.c:846
#3 0x00011994 in main (argc=<optimized out>, argv=0xbe86ade8) at main.c:1146
(gdb) info f 0
Stack frame at 0xbe86a9b0:
 pc = 0xb6e9a124 in __GI_strncpy (strncpy.c:41); saved pc = 0x13032
 called by frame at 0xbe86ac40
 source language c.
 Arglist at 0xbe86a9a0, args: s1=0xbe86aa6f "", s1@entry=0xbe86aa70 "lo", s2=0x5 <error: Cannot access memory at address 0x5>,
    n=n@entry=80
 Locals at 0xbe86a9a0, Previous frame's sp is 0xbe86a9b0
 Saved registers:
  r4 at 0xbe86a9a0, r5 at 0xbe86a9a4, r6 at 0xbe86a9a8, lr at 0xbe86a9ac
(gdb) info f 1
Stack frame at 0xbe86ac40:
 pc = 0x13032 in strncpy (/usr/include/arm-linux-gnueabihf/bits/string3.h:126); saved pc = 0x11994
 inlined into frame 2, caller of frame at 0xbe86a9b0
 source language c.
 Arglist at unknown address.
 Locals at unknown address, Previous frame's sp is 0xbe86a9b0
 Saved registers:
  r4 at 0xbe86a9a0, r5 at 0xbe86a9a4, r6 at 0xbe86a9a8, lr at 0xbe86a9ac
(gdb) info f 2
Stack frame at 0xbe86ac40:
 pc = 0x13032 in do_interface (main.c:846); saved pc = 0x11994
 called by frame at 0xbe86ac90, caller of frame at 0xbe86ac40
 source language c.
 Arglist at 0xbe86a9b0, args: target_iface=<optimized out>
 Locals at 0xbe86a9b0, Previous frame's sp is 0xbe86ac40
 Saved registers:
  r4 at 0xbe86ac1c, r5 at 0xbe86ac20, r6 at 0xbe86ac24, r7 at 0xbe86ac28, r8 at 0xbe86ac2c, r9 at 0xbe86ac30, r10 at 0xbe86ac34,
  r11 at 0xbe86ac38, lr at 0xbe86ac3c
(gdb) info f 3
Stack frame at 0xbe86ac90:
 pc = 0x11994 in main (main.c:1146); saved pc = 0xb6e59772
 caller of frame at 0xbe86ac40
 source language c.
 Arglist at 0xbe86ac40, args: argc=<optimized out>, argv=0xbe86ade8
 Locals at 0xbe86ac40, Previous frame's sp is 0xbe86ac90
 Saved registers:
  r4 at 0xbe86ac6c, r5 at 0xbe86ac70, r6 at 0xbe86ac74, r7 at 0xbe86ac78, r8 at 0xbe86ac7c, r9 at 0xbe86ac80, r10 at 0xbe86ac84,
  r11 at 0xbe86ac88, lr at 0xbe86ac8c

Reading ftp://ftp.gnu.org/old-gnu/Manuals/gdb/html_chapter/gdb_7.html suggests that -fomit-frame-pointer could be responsible for this. It also mentions a missing address for the topmost frame. Following the suggestion there, I think we should give crash_signature_addresses() some fallbacks. In particular, here:

                addr = line.split()[1]
                if not addr.startswith('0x'):
                    continue

instead of ignoring that frame, we should check if the frame has a file:line reference as in

   #2 do_interface (target_iface=<optimized out>) at main.c:846

and use that as the signature of the current frame instead of ignoring it. If that isn't present either, it could then fall back to the function name; that isn't particularly precise, as the crash could be anywhere in that function, but it's still better than skipping the frame completely.

WDYT?