requesting location updates in oxide webview triggers memory corruption

Bug #1559428 reported by costales on 2016-03-19
This bug affects 13 people
Affects                  Importance  Assigned to
Canonical System Image   Critical    David Barth
Oxide                    Undecided   Unassigned
uNav                     Critical    Unassigned
oxide-qt (Ubuntu)        Undecided   Olivier Tilloy

Bug Description

Hi!

The system kills uNav at random times, for no apparent reason, on rc-proposed.

Steps to reproduce (it does not always happen):
- Open uNav.
- Center position.
- uNav is killed (sometimes).

From my testing, I think it happens when the app asks for a geolocation position and geolocation is not available; the system then kills the application.

This is a log of a kill:
http://paste.ubuntu.com/15409376/

Thanks in advance!

costales (costales) on 2016-03-19
description: updated
description: updated
summary: - System kills app if app ask for GPS and GPS is not available
+ System kills app if app asks for GPS and GPS is not available
costales (costales) on 2016-03-20
description: updated
summary: - System kills app if app asks for GPS and GPS is not available
+ System kills app if app asks for GPS with AGPS enabled and not connected
description: updated
description: updated
description: updated
Changed in canonical-devices-system-image:
status: New → Confirmed
costales (costales) on 2016-03-22
description: updated
summary: - System kills app if app asks for GPS with AGPS enabled and not connected
+ System kills uNav when it asks for geolocation

I get this same bug on ota 9 (stable) with the last couple unav updates. If there's no GPS fix, unav crashes.

Pat McGowan (pat-mcgowan) wrote :

Could be https://errors.ubuntu.com/problem/a438a6fd8998af9b1ded9178f043a29b79f42ce3

It crashes often here on E4.5 running v286, other apps show a location fix
but it works fine on mx4 so far

*** Error in `/usr/lib/arm-linux-gnueabihf/qt5/bin/qmlscene': malloc(): smallbin double linked list corrupted: 0x00a0f750 ***

Pat McGowan (pat-mcgowan) wrote :

I note that after I installed the app, I never got the trust prompt to allow location access. I just started it for a fourth time and got the prompt and then it worked immediately

I turned the access permission off via settings and it crashed again. I turned it back on and it also crashed.

I turned off access on the mx4 and it also crashed

Pat McGowan (pat-mcgowan) wrote :

@thomas any thoughts?

Changed in canonical-devices-system-image:
assignee: nobody → Thomas Voß (thomas-voss)

On Tue, Mar 22, 2016 at 9:03 PM, Pat McGowan <email address hidden>
wrote:

> I never got the trust prompt to
> allow location access
>

Hi Pat!
Could be this bug?
https://bugs.launchpad.net/canonical-devices-system-image/+bug/1469007
Thanks!

--
Sent using Dekko from my Ubuntu device

Same problem for me on Meizu MX4, OTA 9.1.

When I ask for geolocation, uNav is killed. I have to open uNav again, and then it works.

Pat McGowan (pat-mcgowan) wrote :

@costales I cannot reproduce that bug with other apps so I assume not. I think it just crashed before the prompt made it up.

Thomas Voß (thomas-voss) wrote :

I just tried rc-proposed across krillin, e5, mx4, browser + mapping web pages work fine. With that, I'm marking uNav as affected and it seems that the behavior is app specific.

@costales: To further debug the issue, we would need a backtrace with symbols.

Changed in canonical-devices-system-image:
importance: Undecided → High

Hi!

I think I found the problem. I would say it's an Oxide issue (?)...

I can't do anything in uNav right now to fix or work around it because of this...

uNav asks for the GPS position with this standard HTML5 geolocation code, and
the system kills uNav (I think because uNav asks for a position, the system
triggers a timeout, and then kills it):

    id_gps = navigator.geolocation.watchPosition(
        function (pos) {
            console.log('Enter: ' + pos.coords.latitude);
        },
        function (error) {
            console.log('WatchPosition error: '+error.code);
        },
        {
            enableHighAccuracy:true
        }
    );

The previous code is correct. How to avoid the crash? By adding a timeout:

    id_gps = navigator.geolocation.watchPosition(
        function (pos) {
            console.log('Enter: ' + pos.coords.latitude);
        },
        function (error) {
            console.log('WatchPosition error: '+error.code);
        },
        {
            enableHighAccuracy:true,
            maximumAge:Infinity,
            timeout:0
        }
    );

But this code enters the error callback with code 3 (timeout), and the
watchPosition callbacks are not called again.
So I don't get a crash, but I never get a position either.
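
Independently of the crash itself, the "timeout, then no more positions" behaviour described above could be worked around in the app by registering a fresh watch whenever a timeout is reported. The following is only a minimal sketch under stated assumptions: `watchWithRetry` and its `geo` parameter are hypothetical names (not part of uNav or Oxide), and `geo` is assumed to be any object shaped like `navigator.geolocation`.

```javascript
// Hypothetical helper, not part of uNav or Oxide.
// `geo` is assumed to expose watchPosition/clearWatch like navigator.geolocation.
function watchWithRetry(geo, onPosition, onError, options) {
    var TIMEOUT = 3;      // error code 3 (timeout), as reported in the thread
    var watchId = null;
    var stopped = false;

    function start() {
        watchId = geo.watchPosition(onPosition, function (error) {
            onError(error);
            if (error.code === TIMEOUT && !stopped) {
                // Drop the timed-out watch and register a fresh one.
                geo.clearWatch(watchId);
                start();
            }
        }, options);
    }

    start();

    return {
        // Stops the current watch and prevents any further re-registration.
        stop: function () {
            stopped = true;
            geo.clearWatch(watchId);
        }
    };
}
```

It would be called as `watchWithRetry(navigator.geolocation, onPos, onErr, {enableHighAccuracy: true, timeout: 10000})`, where the 10000 ms value is an arbitrary assumption. With `timeout: 0` every new watch would time out at once, so this sketch only makes sense with a non-zero timeout.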

What did I do then? I simplified the issue. I created a basic example (QML
embedding an HTML page). It shows a map and centers it on the current
position, nothing more. It crashes too (http://termbin.com/0q5d). I think the
network status or GPS status could be related.

bzr branch lp:~costales/+junk/testing_unav_crash
cd testing_unav_crash

Testing:
qmlscene qml/Main.qml
Build:
click build .

How should it work?
- maximumAge and timeout have to be optional in watchPosition.
- watchPosition should keep running until clearWatch is called, returning
new positions as they become available.
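
To make the first expectation concrete, here is a small sketch of what "optional" means for the options object: any field left out falls back to a default. The defaults below are the ones commonly documented for the HTML5 Geolocation API (some spec revisions express the timeout default as 0xFFFFFFFF rather than Infinity). The helper name `withDefaultPositionOptions` is hypothetical and only illustrates the expected behaviour; it is not how Oxide implements it.

```javascript
// Defaults commonly documented for the HTML5 Geolocation API options.
var POSITION_OPTION_DEFAULTS = {
    enableHighAccuracy: false,
    timeout: Infinity,   // wait indefinitely for a position
    maximumAge: 0        // never accept a cached position
};

// Hypothetical helper: returns a copy of `options` with missing fields
// filled in from the defaults. Explicit values (including 0) are kept.
function withDefaultPositionOptions(options) {
    options = options || {};
    var result = {};
    Object.keys(POSITION_OPTION_DEFAULTS).forEach(function (key) {
        result[key] = (options[key] === undefined)
            ? POSITION_OPTION_DEFAULTS[key]
            : options[key];
    });
    return result;
}
```

Under these assumptions, the first snippet in this comment (only `enableHighAccuracy: true`) is equivalent to asking for a fresh position with no time limit, which is why it should be valid without a timeout.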

Best regards.

Changed in unav:
status: New → Invalid
importance: Undecided → Critical

For reference: The system does not kill the app due to a timeout. I will see if I can get a backtrace out of the stripped down example.

For the record, we happened to stumble on a problem with Here Maps that looks very, very similar (too early to say whether it is a dup):

https://bugs.launchpad.net/webapps-core/+bug/1561119

Pat McGowan (pat-mcgowan) wrote :

Stack trace at http://pastebin.ubuntu.com/15482488/
seems to point to oxide

summary: - System kills uNav when it asks for geolocation
+ requesting location updates in oxide webview triggers memory corruption
Changed in oxide-qt (Ubuntu):
assignee: nobody → Olivier Tilloy (osomon)
Pat McGowan (pat-mcgowan) wrote :

target to next milestone but need this asap

Changed in canonical-devices-system-image:
milestone: none → 11
Chris Coulson (chrisccoulson) wrote :

If it's memory corruption, the stack trace isn't particularly useful as it doesn't identify what causes the corruption. It could be literally anything. Has anyone tried running it in valgrind? (you'll need to do that on the desktop)

@Chris: no, it is not; it is plainly corrupted, and we cannot yet infer from it that this is Oxide related.

Olivier Tilloy (osomon) wrote :

While trying to reproduce the issue with Marcos’ stripped down example, I got a crash (not sure it’s related, as it happened when I was bringing the system settings app to the foreground). Anyway, here is the backtrace:

#0 0xb5dd8e64 in malloc_consolidate (av=av@entry=0xb5e727a8 <main_arena>) at malloc.c:4142
#1 0xb5dda1e0 in _int_malloc (av=av@entry=0xb5e727a8 <main_arena>, bytes=bytes@entry=512) at malloc.c:3417
#2 0xb5ddb95e in __GI___libc_malloc (bytes=512) at malloc.c:2895
#3 0xb5f35090 in operator new(unsigned int) () from /usr/lib/arm-linux-gnueabihf/libstdc++.so.6
#4 0xac93ea18 in allocate (__n=128, this=<optimized out>) at /usr/include/c++/4.9/ext/new_allocator.h:104
#5 _M_allocate_node (this=<optimized out>) at /usr/include/c++/4.9/bits/stl_deque.h:538
#6 _M_create_nodes (this=0xbef4b360, __nfinish=0xa45258, __nstart=0xa45254)
    at /usr/include/c++/4.9/bits/stl_deque.h:632
#7 std::_Deque_base<oxide::FetchTextureResourcesTaskInfo*, std::allocator<oxide::FetchTextureResourcesTaskInfo*> >::_M_initialize_map (this=0xbef4b360, __num_elements=0) at /usr/include/c++/4.9/bits/stl_deque.h:606
#8 0xaca8612e in _Deque_base (this=0xbef4b360) at /usr/include/c++/4.9/bits/stl_deque.h:458
#9 deque (this=0xbef4b360) at /usr/include/c++/4.9/bits/stl_deque.h:788
#10 content::FrameTree::ForEach(base::Callback<bool (content::FrameTreeNode*)> const&, content::FrameTreeNode*) const (
    this=this@entry=0xa39748, on_node=..., skip_this_subtree=skip_this_subtree@entry=0x0)
    at ../../../../third_party/chromium/src/content/browser/frame_host/frame_tree.cc:182
#11 0xaca864c4 in ForEach (on_node=..., this=0xa39748)
    at ../../../../third_party/chromium/src/content/browser/frame_host/frame_tree.cc:176
#12 content::FrameTree::UpdateLoadProgress (this=0xa39748)
    at ../../../../third_party/chromium/src/content/browser/frame_host/frame_tree.cc:431
#13 0xaca86c6a in content::FrameTreeNode::DidChangeLoadProgress (this=<optimized out>, load_progress=<optimized out>)
    at ../../../../third_party/chromium/src/content/browser/frame_host/frame_tree_node.cc:380
#14 0xaca942be in content::RenderFrameHostImpl::OnDidChangeLoadProgress (this=this@entry=0xa416d8,
    load_progress=<optimized out>)
    at ../../../../third_party/chromium/src/content/browser/frame_host/render_frame_host_impl.cc:1719
#15 0xaca9cb70 in DispatchToMethodImpl<content::RenderFrameHostImpl, void (content::RenderFrameHostImpl::*)(double), double, 0u> (arg=..., method=
    (void (content::RenderFrameHostImpl::*)(content::RenderFrameHostImpl * const, double)) 0xaca942b9 <content::RenderFrameHostImpl::OnDidChangeLoadProgress(double)>, obj=0xa416d8) at ../../../../third_party/chromium/src/base/tuple.h:252
#16 DispatchToMethod<content::RenderFrameHostImpl, void (content::RenderFrameHostImpl::*)(double), double> (arg=...,
    method=
    (void (content::RenderFrameHostImpl::*)(content::RenderFrameHostImpl * const, double)) 0xaca942b9 <content::RenderFrameHostImpl::OnDidChangeLoadProgress(double)>, obj=0xa416d8) at ../../../../third_party/chromium/src/base/tuple.h:259
#17 Dispatch<content::RenderFrameHostImpl, content::RenderFrameHostImpl, void, void (content::RenderFrameHostImpl::*)(dou...


Olivier Tilloy (osomon) wrote :

Just got the exact same backtrace from a different crash: this time the test app was sitting idle (supposedly waiting for a position update), the screen didn’t go blank, the app simply crashed.

That’s on arale running the latest rc-proposed image (#284), which has oxide 1.13.6-0ubuntu0.15.04.1~overlay1.

Olivier Tilloy (osomon) wrote :

Just got the exact same crash and backtrace with an even simpler test app, on the same device:

$ cat test-geo.qml
import QtQuick 2.4
import com.canonical.Oxide 1.12
WebView {
  url: Qt.resolvedUrl("test-geo.html")
  onGeolocationPermissionRequested: request.allow()
  onJavaScriptConsoleMessage: console.log(message)
}

$ cat test-geo.html
<html>
 <body>
  <script>
   navigator.geolocation.getCurrentPosition(
    function(pos) { console.log("SUCCESS:", pos); },
    function(error) { console.log("ERROR:", error.code, error.message); },
    {enableHighAccuracy: false, maximumAge: Infinity, timeout: 3000}
   );
  </script>
 </body>
</html>
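
When running a test page like this one, logging the position object itself may not produce a readable console message, depending on how the object is stringified. A small formatter can help; the sketch below is only an illustration, `describePosition` is a hypothetical name, and it assumes the standard `coords.latitude`, `coords.longitude` and `coords.accuracy` fields.

```javascript
// Hypothetical helper for readable log output; not part of the test app above.
// Assumes `pos` has the standard coords fields of a geolocation position.
function describePosition(pos) {
    var c = pos.coords;
    return 'lat=' + c.latitude.toFixed(5) +
           ' lon=' + c.longitude.toFixed(5) +
           ' accuracy=' + Math.round(c.accuracy) + 'm';
}
```

In the success callback this could be used as `console.log("SUCCESS: " + describePosition(pos))`.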

Changed in canonical-devices-system-image:
milestone: 11 → ww08-2016
Olivier Tilloy (osomon) wrote :

I can observe a crash on my desktop setup (x86-64, xenial, oxide 1.13.6-0ubuntu1) with the example app above, although the backtrace is not the same, and the crash doesn’t always happen (about once every 15 runs):

#0 0x0000000000000000 in ?? ()
#1 0x00007fd52f3249c0 in QScopedPointerDeleter<QEvent>::cleanup (pointer=0x1f6e0e0) at ../../include/QtCore/../../src/corelib/tools/qscopedpointer.h:54
#2 QScopedPointer<QEvent, QScopedPointerDeleter<QEvent> >::~QScopedPointer (this=<synthetic pointer>, __in_chrg=<optimized out>) at ../../include/QtCore/../../src/corelib/tools/qscopedpointer.h:101
#3 QCoreApplicationPrivate::sendPostedEvents (receiver=receiver@entry=0x0, event_type=event_type@entry=0, data=0x1939fa0) at kernel/qcoreapplication.cpp:1590
#4 0x00007fd52f324e98 in QCoreApplication::sendPostedEvents (receiver=receiver@entry=0x0, event_type=event_type@entry=0) at kernel/qcoreapplication.cpp:1451
#5 0x00007fd52f378643 in postEventSourceDispatch (s=0x1a37420) at kernel/qeventdispatcher_glib.cpp:271
#6 0x00007fd52dcff127 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#7 0x00007fd52dcff380 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#8 0x00007fd52dcff42c in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#9 0x00007fd52f378a4f in QEventDispatcherGlib::processEvents (this=0x1a652c0, flags=...) at kernel/qeventdispatcher_glib.cpp:418
#10 0x00007fd52f31fd7a in QEventLoop::exec (this=this@entry=0x7ffc8591f400, flags=..., flags@entry=...) at kernel/qeventloop.cpp:204
#11 0x00007fd52f327e1c in QCoreApplication::exec () at kernel/qcoreapplication.cpp:1229
#12 0x00007fd52f65bc3c in QGuiApplication::exec () at kernel/qguiapplication.cpp:1542
#13 0x00007fd52fc11495 in QApplication::exec () at kernel/qapplication.cpp:2976
#14 0x00000000004050da in main (argc=2, argv=<optimized out>) at main.cpp:598

Olivier Tilloy (osomon) wrote :

Inspecting the backtrace in comment #16:

at frame 10 (frame_tree.cc:182):
  std::queue<FrameTreeNode*> queue;

at frame 7 (stl_deque.h:458) in _Deque_base(), with _Tp=oxide::FetchTextureResourcesTaskInfo* and _Alloc=std::allocator<oxide::FetchTextureResourcesTaskInfo*>

This looks suspicious: how can instantiating a queue of FrameTreeNode* use oxide::FetchTextureResourcesTaskInfo* as a template type? That’s beyond my field of expertise, but it could very well be memory corruption indeed.

Pat McGowan (pat-mcgowan) wrote :

Oxide fix has been identified and is in a silo

Changed in canonical-devices-system-image:
status: Confirmed → In Progress
assignee: Thomas Voß (thomas-voss) → David Barth (dbarth)
importance: High → Critical
David Barth (dbarth) on 2016-03-25
Changed in oxide:
status: New → Fix Committed
Changed in canonical-devices-system-image:
status: In Progress → Fix Committed
Changed in unav:
milestone: none → 0.57
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in oxide-qt (Ubuntu):
status: New → Confirmed
David Barth (dbarth) on 2016-04-07
Changed in oxide-qt (Ubuntu):
status: Confirmed → Fix Released
Changed in oxide:
status: Fix Committed → Fix Released
Changed in canonical-devices-system-image:
status: Fix Committed → Fix Released