crash_signature should(?) be created for stacktraces missing the first symbol

Bug #1507711 reported by Brian Murray
This bug affects 1 person
Affects   Status    Importance   Assigned to   Milestone
Apport    Triaged   Low          Unassigned

Bug Description

I was looking into some failures to retrace (on the Error Tracker) for seb128 and notified him that both of them failed with the following messages from apport:

2015-10-08 15:03:09,234:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:Processing.
2015-10-08 15:03:09,364:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:Decompressing to /tmp/tmpk4lXdk-swift.52b2224c-6dbf-11e5-a87d-fa163e78b027.oopsid.core
2015-10-08 15:03:17,879:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:Retracing 52b2224c-6dbf-11e5-a87d-fa163e78b027:swift
2015-10-08 15:04:15,083:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:Writing back to Cassandra
2015-10-08 15:04:15,097:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:Apport did not return a crash_signature.
2015-10-08 15:04:15,098:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:StacktraceTop:
2015-10-08 15:04:15,098:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:?? ()
2015-10-08 15:04:15,098:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:process_layout (tree=tree@entry=0x2656970, parent=parent@entry=0x7fc16400f1d0, layout=layout@entry=0x260ba40, allocated=allocated@entry=0x2eb1520) at /build/gnome-menus-eVQuVf/gnome-menus-3.13.3/./libmenu/gmenu-tree.c:3684
2015-10-08 15:04:15,098:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:process_layout (tree=tree@entry=0x2656970, parent=parent@entry=0x0, layout=<optimized out>, allocated=allocated@entry=0x2eb1520) at /build/gnome-menus-eVQuVf/gnome-menus-3.13.3/./libmenu/gmenu-tree.c:3460
2015-10-08 15:04:15,098:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:gmenu_tree_build_from_layout (error=0x7ffc10dd8888, tree=0x2656970) at /build/gnome-menus-eVQuVf/gnome-menus-3.13.3/./libmenu/gmenu-tree.c:4761
2015-10-08 15:04:15,098:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:gmenu_tree_load_sync (tree=0x2656970, error=error@entry=0x7ffc10dd88f0) at /build/gnome-menus-eVQuVf/gnome-menus-3.13.3/./libmenu/gmenu-tree.c:743
2015-10-08 15:04:15,157:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:Could not retrace.
2015-10-08 15:04:15,158:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:RetraceOutdatedPackages:
2015-10-08 15:04:15,158:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:no debug symbol package found for libuuid1 (Ubuntu 15.10)
2015-10-08 15:04:23,950:855:140577166866176:INFO:root:52b2224c-6dbf-11e5-a87d-fa163e78b027:swift:Done processing /tmp/tmpk4lXdk-swift.52b2224c-6dbf-11e5-a87d-fa163e78b027.oopsid

2015-10-19 13:10:40,068:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:Decompressing to /tmp/tmpjr4RXf-swift.8bd2a4b0-7662-11e5-81da-fa163e22e467.oopsid.core
2015-10-19 13:10:41,210:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:Retracing 8bd2a4b0-7662-11e5-81da-fa163e22e467:swift
2015-10-19 13:13:16,845:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:Writing back to Cassandra
2015-10-19 13:13:16,855:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:Apport did not return a crash_signature.
2015-10-19 13:13:16,856:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:StacktraceTop:
2015-10-19 13:13:16,856:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:?? ()
2015-10-19 13:13:16,856:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:make_protobuf_object<mir::protobuf::wire::Result> () at /build/mir-uRPMwH/mir-0.17.0+15.10.20151008.2/src/client/rpc/../make_protobuf_object.h:29
2015-10-19 13:13:16,856:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:mir::client::rpc::MirProtobufRpcChannel::on_data_available (this=0x1b5f04) at /build/mir-uRPMwH/mir-0.17.0+15.10.20151008.2/src/client/rpc/mir_protobuf_rpc_channel.cpp:356
2015-10-19 13:13:16,856:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:operator()<std::shared_ptr<mir::client::rpc::StreamTransport::Observer> > (__closure=<optimized out>, observer=...) at /build/mir-uRPMwH/mir-0.17.0+15.10.20151008.2/src/client/rpc/stream_socket_transport.cpp:40
2015-10-19 13:13:16,856:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:std::_Function_handler<void(const std::shared_ptr<mir::client::rpc::StreamTransport::Observer>&), mir::client::rpc::TransportObservers::on_data_available()::<lambda(auto:1)> >::_M_invoke(const std::_Any_data &, const std::shared_ptr<mir::client::rpc::StreamTransport::Observer> &) (__functor=..., __args#0=...) at /usr/include/c++/5/functional:1871
2015-10-19 13:13:16,857:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:Saved OOPS 8bd2a4b0-7662-11e5-81da-fa163e22e467 for manual investigation.
2015-10-19 13:13:18,242:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:Could not retrace.
2015-10-19 13:13:21,681:32568:140019280500480:INFO:root:8bd2a4b0-7662-11e5-81da-fa163e22e467:swift:Done processing /tmp/tmpjr4RXf-swift.8bd2a4b0-7662-11e5-81da-fa163e22e467.oopsid

seb128 indicated that the stacktrace is useful though so maybe we should return a crash_signature in this case.

Martin Pitt (pitti) wrote :

Ignoring the topmost function seems a bit dangerous to me, as it's usually the most important one. Imagine if the second and third etc. frames are only things like g_main_loop_process_event(), g_main_run(), and main(), and the topmost one is the particular callback. So entirely ignoring the first frame could easily lead to unifying crashes which are different.
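A minimal sketch of that concern, in Python. The two callback names and the `join_frames` helper are hypothetical and only serve the illustration; this is not how apport builds its signatures.

```python
def join_frames(frames, drop_top=False):
    """Join frame names with ':'; optionally ignore the topmost frame."""
    if drop_top:
        frames = frames[1:]
    return ":".join(frames)

# Shared main-loop frames, as named in the comment above.
common = ["g_main_loop_process_event", "g_main_run", "main"]

# Two different crashes: only the topmost callback differs (hypothetical names).
crash_a = ["on_save_clicked"] + common
crash_b = ["on_open_clicked"] + common

# With the top frame included, the two crashes stay distinct.
distinct = join_frames(crash_a) != join_frames(crash_b)

# With the top frame ignored, they collapse into one signature.
merged = join_frames(crash_a, drop_top=True) == join_frames(crash_b, drop_top=True)
```

Here `distinct` and `merged` are both true, which is the unification risk described above.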

What is the rationale for this? I. e. as it's coming from you Brian I suppose this is somehow related to whoopsie and errors.u.c. Don't we use the StacktraceAddressSignature for duplication anyway, as that's pretty much always available (albeit an N:1 mapping)? What happens if there is an SAS, but no DuplicateSignature because of a missing/broken first frame?

Depending on what we need this for, we could potentially be more clever here -- e. g. if we have 4 "named" frames, and the topmost one is ??, we could change the topmost one to be in the format of SAS, i. e. library+offset (based on the stack address). So instead of just having ?? we could have a "hybrid" DuplicateSignature like

  /lib/x86_64-linux-gnu/libc-2.19.so+36f79:connect_to_socket:wl_display_connect:_gdk_wayland_display_open:main()

i. e. if your use case does not depend on having function names but you treat the DuplicateSignature as an opaque blob, that'd be fine. I wouldn't overdo this, as the idea of a DuplicateSignature is that it uniquely identifies a crash, as opposed to an SAS, which is often N:1, i. e. a particular crash often has more than one SAS (I've seen up to 6). But having a DupSig with at most one address offset, as opposed to no DupSig at all, indeed seems like a worthwhile improvement.
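A rough sketch of the "hybrid" idea in Python. The function name, its arguments, and the placeholder address entries are assumptions made for illustration; apport's real signature code may be structured quite differently.

```python
def hybrid_duplicate_signature(function_names, address_entries):
    """Build a ':'-joined signature, substituting library+offset for an
    unresolved ('??') topmost frame only.

    function_names:  frame names, topmost first ('??' if unresolved)
    address_entries: 'library+offset' strings for the same frames
    Returns None when no usable signature can be built.
    """
    if not function_names or len(function_names) != len(address_entries):
        return None
    parts = list(function_names)
    if parts[0] == "??":
        # Only the topmost frame may fall back to its address form.
        parts[0] = address_entries[0]
    if "??" in parts:
        # Any other unresolved frame: give up rather than mix in more offsets.
        return None
    return ":".join(parts)


names = ["??", "connect_to_socket", "wl_display_connect",
         "_gdk_wayland_display_open", "main()"]
# Only the first entry is taken from the example above; the rest are placeholders.
addresses = ["/lib/x86_64-linux-gnu/libc-2.19.so+36f79",
             "placeholder+1", "placeholder+2", "placeholder+3", "placeholder+4"]
sig = hybrid_duplicate_signature(names, addresses)
```

With these inputs `sig` is the example string shown above. Limiting the fallback to the topmost frame keeps at most one address offset in the signature, as suggested.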

Changed in apport:
status: New → Confirmed
Sebastien Bacher (seb128) wrote :

to give some more context, I was asking Brian why retracing is failing for
https://errors.ubuntu.com/problem/2ebfd81699a9dba286619318826c3c894a0475a4

the report page has this info
"trust-stored-skeleton (11) [heap]+502c → /usr/lib/arm-linux-gnueabihf/libmirclient.so.9+4153e → /usr/lib/arm-linux-gnueabihf/libmirclient.so.9+39590 → /usr/lib/arm-linux-gnueabihf/libmirclient.so.9+3ba10 → /usr/lib/arm-linux-gnueabihf/libmirclient.so.9+39706"

as you can see, the backtrace in the description is missing one symbol but then has 4 good ones; having that part of the backtrace would be useful and better than nothing

Brian Murray (brian-murray) wrote : Re: [Bug 1507711] Re: crash_signature should(?) be created for stacktraces missing the first symbol

On Tue, Oct 20, 2015 at 07:25:58AM -0000, Martin Pitt wrote:
> Ignoring the topmost function seems a bit dangerous to me, as it's
> usually the most important one. Imagine if the second and third etc.
> frames are only things like g_main_loop_process_event(), g_main_run(),
> and main(), and the topmost one is the particular callback. So entirely
> ignoring the first frame could easily lead to unifying crashes which are
> different.
>
> What is the rationale for this? I. e. as it's coming from you Brian I
> suppose this is somehow related to whoopsie and errors.u.c. Don't we use
> the StacktraceAddressSignature for duplication anyway, as that's pretty
> much always available (albeit an N:1 mapping)? What happens if there is
> an SAS, but no DuplicateSignature because of a missing/broken first
> frame?

Yes, we use the StacktraceAddressSignature for duplication. If there is
a SAS, but no DuplicateSignature then the DuplicateSignature is
failed:SAS. This can be seen in the following error:

https://errors.ubuntu.com/oops/52b2224c-6dbf-11e5-a87d-fa163e78b027

Notice how the Problem is just the SAS with failed: prepended.
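As a rough illustration of that fallback in Python (the function name and the sample values are hypothetical; this is not the actual Error Tracker code):

```python
def problem_key(duplicate_signature, stacktrace_address_signature):
    """Pick the bucketing key: the DuplicateSignature if present,
    otherwise the SAS with 'failed:' prepended."""
    if duplicate_signature:
        return duplicate_signature
    if stacktrace_address_signature:
        return "failed:" + stacktrace_address_signature
    return None  # nothing to bucket on


# Hypothetical SAS value, for illustration only.
key = problem_key(None, "/usr/bin/example:11:lib+1:lib+2")
```

Here `key` is the SAS with `failed:` in front, matching what the Problem shows on the page linked above.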

> Depending on what we need this for, we could potentially be more clever
> here -- e. g. if we have 4 "named" frames, and the topmost one is ??, we
> could change the topmost one to be in the format of SAS, i. e.
> library+offset (based on the stack address). So instead of just having
> ?? we could have a "hybrid" DuplicateSignature like
>
> /lib/x86_64-linux-gnu/libc-2.19.so+36f79:connect_to_socket:wl_display_connect:_gdk_wayland_display_open:main()
>
> i. e. if your use case does not depend on having function names but you
> treat the DuplicateSignature as an opaque blob, that'd be fine.

Yes, the DuplicateSignature is just treated as an opaque blob.

--
Brian Murray

Martin Pitt (pitti) wrote :

Brian: If we do the "hybrid" DuplicateSignature, then it's possible that one and the same bug would have several different buckets as the address signatures are not uniquely identifying a bug. I guess that's not a problem as right now the SAS is even "less" unique, but we should be aware of this. Setting triaged now, as it seems we agree on this.

Changed in apport:
status: Confirmed → Triaged
importance: Undecided → Low
assignee: nobody → Martin Pitt (pitti)
assignee: Martin Pitt (pitti) → nobody