unity8 sometimes hangs on boot
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Canonical System Image | | Critical | Unassigned | |
| autopilot (Ubuntu) | | Undecided | Unassigned | |
| libusermetrics (Ubuntu) | | Critical | Pete Woods | |
| lxc-android-config (Ubuntu) | | Undecided | Unassigned | |
| qtbase-opensource-src (Ubuntu) | | Medium | Unassigned | |
| ubuntu-system-settings-online-accounts (Ubuntu) | | Undecided | Unassigned | |
| unity8 (Ubuntu) | | Undecided | Unassigned | |
Bug Description
The following gdbus call is failing with a "Error: Timeout was reached" message:
gdbus call --session --dest com.canonical.
This is being seen on krillin devices starting with image 106 from ubuntu-
A copy of ~/.cache/
I have 3 test cases where the problem was observed:
http://
http://
http://
In all cases, the test is using adt-run (from autopkgtest) to drive a test on the phone device. adt-run uses the above gdbus call to determine if the desktop is active. In all the examples, the device was freshly flashed.
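A minimal Python sketch of this kind of liveness probe, assuming the check amounts to "run a command and treat a timeout as not responding". The helper name `responds_within` and the 5-second default are hypothetical and are not taken from adt-run; the real gdbus arguments are left to the caller because the full call is elided above.

```python
import subprocess
import sys


def responds_within(command, timeout_s=5.0):
    """Return True if `command` exits with status 0 before `timeout_s` elapses.

    Hypothetical helper: a service that never answers is reported as False
    instead of blocking the caller forever.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-in commands; on a device this would be the gdbus call.
    quick = [sys.executable, "-c", "pass"]
    stuck = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(responds_within(quick))
    print(responds_within(stuck, timeout_s=1.0))
```

Bounding the call this way turns a hung session into an ordinary test failure rather than a stalled test run.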
== Test Case ==
# Prepare debugging
adb shell
sudo apt-get clean # so that you wouldn't run out of disk space
sudo apt install qtbase5-dbg libc6-dbg libdbus-
# Add also libusermetrics debug symbols, unless you're testing a PPA version
echo "deb http://
sudo apt-key adv --keyserver keyserver.
sudo apt-get update
sudo apt install libusermetricso
# Start the reboot loop
# This reboots the device in a loop. If the bug is not fixed by the proposed solution, the device will eventually hang with Unity 8 showing a black background. Other kinds of hangs (such as only the Google logo showing, with no adb) are not related to this bug. The highest number of reboots without errors so far is 54, so roughly 100 reboots are probably needed for testing.
bzr branch lp:unity8
cd unity8
while true; do adb shell rm -R "~phablet/
# When it fails
adb shell
sudo gdb -p $(pidof unity8)
bt full
--
At this point, the backtrace should show:
#0 syscall () at ../sysdeps/
#1 0xb6301e12 in _q_futex (op=0, val=3, timeout=0x0, addr=<optimized out>)
at thread/
#2 lockInternal_
at thread/
#3 QBasicMutex:
at thread/
#4 0xb6301eb6 in lock (this=0x1523b44) at thread/qmutex.h:59
#5 lock (timeout=-1, this=0x1523b38) at thread/
#6 QMutex::lock (this=this@
#7 0xb5f39586 in QDBusMutexLocker (m=0x1523d6c, s=0x1523d48,
a=ToggleWat
#8 QDBusDispatchLocker (s=0x1523d48, a=ToggleWatchAc
this=<synthetic pointer>) at qdbusthreaddebu
#9 qDBusRealToggle
at qdbusintegrator
#10 0xb5ae18f6 in ?? () from /lib/arm-
With this, it is known that the problem is related to QDBus locking.
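The reboot loop in the test case can also be expressed as a small harness. This is only a sketch under stated assumptions: `reboot` and `shell_is_up` are hypothetical callables supplied by the caller (for example thin wrappers around adb commands), and the default of 100 reboots simply mirrors the estimate given in the test case.

```python
def reboot_until_hang(reboot, shell_is_up, max_reboots=100):
    """Reboot repeatedly and report when the shell first fails to come up.

    Returns the 1-based reboot count at which `shell_is_up()` returned
    False, or None if every boot up to `max_reboots` was healthy.
    """
    for attempt in range(1, max_reboots + 1):
        reboot()
        if not shell_is_up():
            return attempt
    return None


if __name__ == "__main__":
    # Simulated device that hangs on its 7th boot.
    boots = {"count": 0}

    def fake_reboot():
        boots["count"] += 1

    def fake_shell_is_up():
        return boots["count"] != 7

    print(reboot_until_hang(fake_reboot, fake_shell_is_up))  # prints 7
```

Returning the reboot count rather than a plain pass/fail keeps the number available for comparing runs.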
--
---
Timeline/Updates:
2015-02-20: libusermetrics lands, causing (apparently) this boot problem to start happening rarely. http://
2015-03-25: qtbase dbus update to support threads (instead of one main thread) in PPA 018 fixes the boot issue, but autopilot test suites start failing randomly.
2015-03-27: an autopilot fix fixes a simple test case, and seems to fix UITK suite as a whole, but on krillin only
2015-04-10: Further patches from upstream fix all AP tests.
2015-04-23: Upstream continues to work on the patches but they have not yet been merged. AP tests pass, but the U1 account usually gets removed after a reboot, even though apps install flawlessly for the rest of that boot once the U1 account has been added.
Related branches
- PS Jenkins bot: Needs Fixing (continuous-integration) on 2015-04-08
- Christopher Lee (community): Needs Fixing on 2015-04-02
-
Diff: 151 lines (+26/-83), 2 files modified:
autopilot/introspection/_search.py (+1/-18)
autopilot/tests/unit/test_introspection_search.py (+25/-65)
- PS Jenkins bot: Approve (continuous-integration) on 2015-03-26
- Albert Astals Cid (community): Abstain on 2015-03-26
- Thomi Richards: Pending requested 2015-03-26
- Leo Arias: Pending requested 2015-03-26
- Christopher Lee: Pending requested 2015-03-26
-
Diff: 22 lines (+1/-5), 1 file modified:
autopilot/introspection/_search.py (+1/-5)
- PS Jenkins bot: Approve (continuous-integration) on 2015-04-28
- Timo Jyrinki (community): Approve on 2015-04-28
-
Diff: 169 lines (+17/-31), 8 files modified:
debian/control (+1/-3)
online-accounts-service/main.cpp (+6/-6)
online-accounts-service/ui-proxy.cpp (+2/-0)
online-accounts-ui/main.cpp (+2/-18)
online-accounts-ui/online-accounts-ui.pro (+1/-4)
online-accounts-ui/ui-server.cpp (+2/-0)
tests/online-accounts-service/tst_ui_proxy.cpp (+2/-0)
tests/online-accounts-ui/qml/tst_AuthorizationPage.qml (+1/-0)
- PS Jenkins bot: Approve (continuous-integration) on 2015-05-22
- Albert Astals Cid (community): Needs Information on 2015-05-22
- Martin Pitt: Needs Fixing on 2015-04-28
-
Diff: 62 lines (+52/-0), 2 files modified:
debian/tests/control (+17/-0)
debian/tests/doesnt-hang-on-boot (+35/-0)
- PS Jenkins bot: Needs Fixing (continuous-integration) on 2015-04-27
- Unity Team: Pending requested 2015-04-27
-
Diff: 83 lines (+9/-9), 6 files modified:
src/libusermetricsinput/MetricManager.cpp (+2/-1)
src/libusermetricsoutput/UserMetrics.cpp (+2/-1)
src/usermetricsservice/DBusDataSet.cpp (+1/-2)
src/usermetricsservice/DBusDataSource.cpp (+1/-2)
src/usermetricsservice/DBusUserData.cpp (+1/-2)
src/usermetricsservice/main.cpp (+2/-1)
- PS Jenkins bot: Needs Fixing (continuous-integration) on 2015-04-27
- Timo Jyrinki: Needs Fixing on 2015-04-27
-
Diff: 168 lines (+18/-18), 9 files modified:
plugins/Ubuntu/DownloadDaemonListener/DownloadTracker.cpp (+1/-1)
plugins/Ubuntu/SystemImage/SystemImage.cpp (+1/-1)
plugins/Unity/Session/dbusunitysessionservice.cpp (+4/-4)
plugins/Wizard/System.cpp (+1/-1)
src/libunity8-private/abstractdbusservicemonitor.cpp (+5/-5)
src/libunity8-private/unitydbusobject.cpp (+1/-1)
src/libunity8-private/unitydbusvirtualobject.cpp (+1/-1)
tests/plugins/Unity/Launcher/CMakeLists.txt (+2/-2)
tests/plugins/Unity/Session/CMakeLists.txt (+2/-2)
- PS Jenkins bot: Approve (continuous-integration) on 2015-04-27
- Unity Team: Pending requested 2015-04-27
-
Diff: 40 lines (+6/-3), 3 files modified:
src/usermetricsservice/DBusDataSet.cpp (+2/-1)
src/usermetricsservice/DBusDataSource.cpp (+2/-1)
src/usermetricsservice/DBusUserData.cpp (+2/-1)
| Michael Terry (mterry) wrote : | #1 |
| Michael Terry (mterry) wrote : | #2 |
Looks like a bad interaction between libusermetrics and dbus that results in a mutex lock...
| Michael Terry (mterry) wrote : | #3 |
Hmm, neither of those packages changed recently. But nor did unity8... I doubt this is unity8's direct fault.
| summary: |
- unity8 sometimes fails to respond to com.canonical.UnityGreeter IsActive + unity8 sometimes hangs on boot |
| Michał Sawicz (saviq) wrote : | #4 |
Bugs #1417773 and #1418707 show similar dbus-related lockups :/
| Michael Terry (mterry) wrote : | #5 |
Interesting. They may be the same bug. But thankfully, this version of it is easily reproducible. Just flash and boot until you hit it.
| Launchpad Janitor (janitor) wrote : | #6 |
Status changed to 'Confirmed' because the bug affects multiple users.
| Changed in unity8 (Ubuntu): | |
| status: | New → Confirmed |
| Michał Sawicz (saviq) wrote : | #7 |
I got a symbolic trace out of all the threads. It seems to be a dbus lock between usermetrics and networkmanager bits.
We suspect a relation to QTBUG https:/
| Changed in unity8 (Ubuntu): | |
| status: | Confirmed → Triaged |
| importance: | Undecided → High |
| assignee: | nobody → Michał Sawicz (saviq) |
| Albert Astals Cid (aacid) wrote : | #8 |
Yeah my current guess is that QDBusConnection
| Changed in unity8 (Ubuntu): | |
| assignee: | Michał Sawicz (saviq) → Albert Astals Cid (aacid) |
| status: | Triaged → In Progress |
| Albert Astals Cid (aacid) wrote : | #9 |
Have been using https:/
| Michał Sawicz (saviq) wrote : | #10 |
https:/
| Changed in unity8 (Ubuntu): | |
| status: | In Progress → Invalid |
| assignee: | Albert Astals Cid (aacid) → nobody |
| Changed in qtbase-opensource-src (Ubuntu): | |
| assignee: | nobody → Timo Jyrinki (timo-jyrinki) |
| Timo Jyrinki (timo-jyrinki) wrote : | #11 |
So, 107969 broke things and couldn't be landed. We landed Qt 5.4.1 now, open to new suggestions that are bullet proof.
| Changed in qtbase-opensource-src (Ubuntu): | |
| status: | New → Incomplete |
| Albert Astals Cid (aacid) wrote : | #12 |
Timo if you could create a silo with
https:/
https:/
https:/
https:/
https:/
https:/
https:/
https:/
https:/
It'd be great; supposedly that will land in Qt 5.5 and make all the dbus threading "better".
| Changed in qtbase-opensource-src (Ubuntu): | |
| status: | Incomplete → Confirmed |
| Francis Ginther (fginther) wrote : | #13 |
Another boottest failure, this one seen with krillin image 152:
https:/
| Timo Jyrinki (timo-jyrinki) wrote : | #14 |
Albert's list of cherry-picked patches testable in vivid-021 landing silo together with bug #1431798 fix.
| Changed in qtbase-opensource-src (Ubuntu): | |
| status: | Confirmed → In Progress |
| Michał Sawicz (saviq) wrote : | #15 |
Without silo 21, I was able to get the deadlock with 6-8 reboots. With the silo, no deadlock in 30 reboots and a wipe.
| Timo Jyrinki (timo-jyrinki) wrote : | #16 |
The current set of qtbase DBus patches reliably regress when running UITK autopilot tests: http://
The qtbase update - current version - has now moved to silo 018 to be able to land the qtdeclarative update from silo 021 still.
| Changed in qtbase-opensource-src (Ubuntu): | |
| status: | In Progress → Incomplete |
| Albert Astals Cid (aacid) wrote : | #17 |
Two more patches that are part of the patchset and would make sense having
https:/
https:/
| Changed in qtbase-opensource-src (Ubuntu): | |
| status: | Incomplete → In Progress |
| Changed in qtbase-opensource-src (Ubuntu): | |
| importance: | Undecided → Critical |
| Timo Jyrinki (timo-jyrinki) wrote : | #18 |
Adding libusermetrics as upgrade of it in http://
If fixing Qt seems a time consuming process, an option would be to revert the part of the libusermetrics upgrade that triggers the bug.
Also adding autopilot as at least the current set of patches regressed on running Autopilot tests which may mean Autopilot is relying on old behavior and the patches themselves might be correct. We'll know more when the new 2 patches have built and tests run.
| Changed in libusermetrics (Ubuntu): | |
| status: | New → Incomplete |
| Changed in autopilot (Ubuntu): | |
| status: | New → Incomplete |
| Francis Ginther (fginther) wrote : | #19 |
Some more boottest failures, these with krillin image 158
https:/
https:/
| Francis Ginther (fginther) wrote : | #20 |
And some more with 158:
https:/
https:/
https:/
https:/
https:/
| Timo Jyrinki (timo-jyrinki) wrote : | #21 |
I ran some Autopilot testing last night and am continuing a bit now. The two new patches do not fix the AP problems. As examples:
- Archive image (#145):
* UITK: 1-2 AP failures
* webbrowser-app: 1-2 AP failures
* calculator: 0 AP failures
* ubuntu-
- Image (#147) added with silo 018 (qtbase DBus patches)
* UITK: 7-8 failures (similar to earlier #145 + previous qtbase DBus fix build)
* webbrowser-app: 5-7 failures
* calculator: 2 failures
* ubuntu-
Details at http://
Even if the qtbase patches themselves would be correct and functioning well and they just happen to make Autopilot break, we can't let Autopilot regress so badly. Either there should be something more done to Qt or an Autopilot landing fixing the issues should be put to the same silo.
As for libusermetrics, the diff for its landing in #106 is at http://
| Changed in autopilot (Ubuntu): | |
| status: | Incomplete → New |
| Changed in qtbase-opensource-src (Ubuntu): | |
| status: | In Progress → Incomplete |
| Timo Jyrinki (timo-jyrinki) wrote : | #22 |
All the failure traces seem to be stating that autopilot wouldn't be able to launch the application. In case of eg address-book-app once a test fails, all subsequent tests fail too. In case of eg UITK, it's just that random tests fail.
Here's a simplified test case:
while [ 1 ] ; do phablet-test-run ubuntuuitoolkit
When the failure happens, the application seems to start fine, but autopilot thinks it did not. After waiting for some while (~38 seconds as opposed to ~6 seconds on successful run) it errors out.
Without silo 018 the above test case never fails.
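The failure mode described here (the app starts, but the launcher keeps waiting and errors out after a much longer delay) is what a poll-with-deadline loop produces when its condition never becomes true. The following is a generic sketch of that pattern, not Autopilot's actual implementation; `wait_until` and its parameters are hypothetical names.

```python
import time


def wait_until(condition, timeout_s, poll_interval_s=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns True or `timeout_s` elapses.

    Returns the elapsed seconds on success and raises TimeoutError
    otherwise. `clock` and `sleep` are injectable so the loop can be
    exercised without real waiting.
    """
    start = clock()
    while True:
        if condition():
            return clock() - start
        if clock() - start >= timeout_s:
            raise TimeoutError(f"condition not met within {timeout_s} s")
        sleep(poll_interval_s)
```

With a loop like this, a successful launch returns as soon as the condition holds, while a launch whose readiness signal never arrives always costs the full timeout before the error is reported.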
| description: | updated |
| Albert Astals Cid (aacid) wrote : | #23 |
Timo, can you try running the tests again with https:/
while [ 1 ] ; do phablet-test-run ubuntuuitoolkit
with it for over an hour and had no failure.
| Richard Huddie (rhuddie) wrote : | #24 |
I tried reproducing this issue on silo 18 using address_
The test failed on the 10th attempt in the same way as in the logs: http://
Watching the test, the address-book-app froze as it was launching, displaying a frozen spinner on the loading screen. Autopilot then timed out waiting for the app to launch and reported the introspection failure.
Subsequent tests then failed with the maliit-server error as reported in the same logs.
The only crash file reported was: _usr_lib_
| Albert Astals Cid (aacid) wrote : | #25 |
Richard, which hardware are you using?
Can you try reproducing the issue again and then gdb'ing in to the frozen app and posting a backtrace? I've been running it in a loop for minutes and had nothing.
| Richard Huddie (rhuddie) wrote : | #26 |
I was using mako devel-proposed image #148 with the silo installed.
I'll try and reproduce again and get some more info.
| Timo Jyrinki (timo-jyrinki) wrote : | #27 |
I've reflashed the device so this is with vivid #150 + 018. The MP in #23 (just adjusted for AP 1.5 branch) is now in the 018 PPA.
The simplified test case seems fixed for me too, but unfortunately it'd look like running a whole suite like phablet-test-run ubuntuuitoolkit simply hangs at some point - in the sense that test app loads but nothing further happens. Killing the process gets the tests to continue. This ended up with: http://
The second UITK run with qtbase + AP seemingly got further without interruption, but still with more (7) AP failures compared to before updating qtbase. However, at the end of the run the test app seemed to be hanging. I tried attaching gdb to it, realized I didn't have debug symbols and detached. After this it seemed to continue and the testing completed immediately: http://
So it'd seem the AP patch does change behavior and fixes the simple test case, but is not the real fix. Or it's possible it fixes the AP end but there is some sort of hanging issue with the updated Qt.
| Richard Huddie (rhuddie) wrote : | #28 |
I tried again to reproduce the issue, using #150 + silo 18 (including autopilot update). I've not been able to reproduce it again after many retries.
| Albert Astals Cid (aacid) wrote : | #29 |
running phablet-test-run ubuntuuitoolkit hanged for me with https:/
| Timo Jyrinki (timo-jyrinki) wrote : | #30 |
webbrowser-app with the new qtbase + AP http://
| Albert Astals Cid (aacid) wrote : | #31 |
So i tried with silo 13 two times in a row and all the SDK autopilot tests passed fine.
| Francis Ginther (fginther) wrote : | #32 |
A boottest failure with image 160:
http://
| Timo Jyrinki (timo-jyrinki) wrote : | #33 |
Unfortunately I couldn't reproduce success with my mako last night with 018 (qtbase + autopilot) + 013 (Mir 0.12.1) + rsalveti (glibc fix). Here are the results until aborted: http://
I'll try again with the new #151 image including Mir, glibc and other things + 018, but I doubt it'll change this picture.
@Albert did you test on krillin or mako?
| Albert Astals Cid (aacid) wrote : | #34 |
krillin.
| Timo Jyrinki (timo-jyrinki) wrote : | #35 |
Looking in more detail, it seems the new Mir may have helped mako in the address-book-app "hang" case, even though for example UITK and webbrowser-app still have more failures. The address-book-app that failed reliably several times earlier, now in the comment #33 linked results failed only 1 test on both runs, similar to what happens on archive.
Rerunning now on image #151 + silo 018, UITK still seems problematic which is not a surprise since it should be similar to yesterday's combination of #150 + silo 018 + silo 013 + rsalveti PPA. This time executing tests seemed to have hanged at one point, and touching the screen interestingly resulted in Unity 8 crashing: http://
| Albert Astals Cid (aacid) wrote : | #36 |
The crash in http://
| Albert Astals Cid (aacid) wrote : | #37 |
https:/
| tags: | added: lt-blocker lt-category-visible |
| Timo Jyrinki (timo-jyrinki) wrote : | #38 |
The earlier simplified test case does not fail anymore or at least not easily. Whether on stock image or using the PPA, the problem with running the same test over and over is that Unity8 crashes or restarts sooner or later when running the command, for example that crasher bug noticed.
I ran a new baseline set of results on Saturday with image #153 to see how the situation looks like on mako after the Mir, UITK, glibc landings. It's still the same - UITK, webbrowser, calculator, ubuntu-
In other words, to reproduce the autopilot problem when using the PPA you can run any of the following on mako:
phablet-test-run ubuntuuitoolkit
phablet-test-run webbrowser_app
phablet-test-run ubuntu_
phablet-test-run ubuntu_
phablet-test-run ubuntu_weather_app
Prepare the device by first running on device sudo apt install ubuntu-
| description: | updated |
| Albert Astals Cid (aacid) wrote : | #39 |
My opinion is that all those failures were probably there already, since the ones I've been able to find/replicate don't seem qdbus related.
Now why is this happening with these patches? These patches make qdbus serialize much less on the main thread, i.e. more things are executed in different threads. This removes synchronization points that qdbus previously imposed on the main thread, so threads run more "freely".
This can make code that was already racy actually hit its races, and thus cause the autopilot tests to fail randomly depending on whether a race is hit or not.
Now I can't say that none of these failures are qdbus related, since I've not been able to reproduce them all; I would need help from the people with expertise on those particular apps to have a look at them.
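The effect described in this comment can be illustrated with a deliberately contrived Python sketch. It is not QDBus code: the single lock stands in for work being funnelled through one thread, and the barrier stands in for an unlucky scheduling order that the serialization used to prevent. All names are hypothetical.

```python
import threading


def run_two_increments(serialized):
    """Run two threads that each add 1 to a shared counter; return the result.

    With serialized=True one lock wraps the whole read-modify-write.
    With serialized=False a barrier forces both threads to read the
    counter before either writes it, which is the interleaving a latent
    race needs in order to show up.
    """
    state = {"counter": 0}
    big_lock = threading.Lock()
    both_have_read = threading.Barrier(2)

    def worker():
        if serialized:
            with big_lock:
                state["counter"] = state["counter"] + 1
        else:
            seen = state["counter"]
            both_have_read.wait(timeout=5)
            state["counter"] = seen + 1

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["counter"]


if __name__ == "__main__":
    print(run_two_increments(serialized=True))   # prints 2
    print(run_two_increments(serialized=False))  # prints 1 (one update lost)
```

The unsynchronized read-modify-write is wrong in both cases; the serialized variant merely never gives it a chance to fail, which matches the argument that the failures may have been latent before the patches.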
| Francis Ginther (fginther) wrote : | #40 |
A boottest failure with image 167 (krillin):
https:/
| Timo Jyrinki (timo-jyrinki) wrote : | #41 |
Addition to my previous comment, for weather + calculator apps you need to run phablet-
phablet-
phablet-
The other tests come from the mentioned deb packages.
| Timo Jyrinki (timo-jyrinki) wrote : | #42 |
And one more addition to anyone not constantly running autopilot and aware of the pre-requirements:
phablet-config autopilot --dbus-probe enable
is needed before running tests. This is documented at https:/
| Albert Astals Cid (aacid) wrote : | #43 |
Ok, got one thread deadlock in the calculator app that seems dbus related, investigating a bit more
| Timo Jyrinki (timo-jyrinki) wrote : | #44 |
As per Albert's instructions I took three updated patches and five new patches from upstream. Four of them needed rebasing for Qt 5.4, but a local build now finished and a PPA build with version number ubuntu6~
| Launchpad Janitor (janitor) wrote : | #45 |
Status changed to 'Confirmed' because the bug affects multiple users.
| Changed in autopilot (Ubuntu): | |
| status: | New → Confirmed |
| Timo Jyrinki (timo-jyrinki) wrote : | #46 |
The bug is now finally confirmed as fixed without regressions in autopilot tests, in the silo 018 (qtbase version ending ubuntu6). The silo is now handed off to QA.
| Changed in qtbase-opensource-src (Ubuntu): | |
| status: | Incomplete → In Progress |
| Changed in libusermetrics (Ubuntu): | |
| status: | Incomplete → Invalid |
| Albert Astals Cid (aacid) wrote : | #47 |
Bad news: I tried to start Plasma 5 with the patches, only to be greeted by a zillion crashing processes. Here is a backtrace of one of them:
Thread 1 (Thread 0x7f658362a780 (LWP 10972)):
[KCrash Handler]
#6 0x00007f6583662168 in QDBusConnection
#7 0x00007f6583662ce2 in QDBusConnection
#8 0x00007f65836a8695 in QDBusServiceWat
#9 0x00007f65836a8695 in QDBusServiceWat
#10 0x00007f653fde3254 in () at /usr/lib/
#11 0x00007f653fdcbd38 in () at /usr/lib/
#12 0x00007f65823bd608 in KPluginFactory:
#13 0x00007f65832ece7d in () at /usr/lib/
#14 0x00007f65832ed45f in () at /usr/lib/
#15 0x00007f65832edc88 in () at /usr/lib/
#16 0x00007f65832eec53 in () at /usr/lib/
#17 0x00007f65832ef03a in kdemain () at /usr/lib/
#18 0x00007f6582f38a40 in __libc_start_main (main=0x400720 <main>, argc=1, argv=0x7fff1373
#19 0x000000000040074e in _start ()
Also konsole is now failing to start (this may be a misuse of api on KUniqueApplication) but still not good to regress like that
| Timo Jyrinki (timo-jyrinki) wrote : | #48 |
Indeed the testing was finished only on the phone + Unity 7 desktop (Qt 5 apps only). QA signoff is now put on hold once again because of the Plasma 5 regression.
The phone Autopilot results are at http://
| Timo Jyrinki (timo-jyrinki) wrote : | #49 |
Yesterday ubuntu7~test1 was done with http://
Now building ubuntu7~test2 with https:/
| Albert Astals Cid (aacid) wrote : | #50 |
It seems the updated patch does fix the crashes in Plasma 5 startup, but applications using KUniqueApplication, like konsole or kfontview, fail to start. These applications already got a warning before (i.e. without the patchset) that they were misusing qdbus, and I think Qt changing so that they fail to start is "ok". The problem is us introducing this before Qt does (Qt 5.4 vs Qt 5.5), and so suddenly and close to the vivid release.
Maybe this patchset could be applied either only to armhf or only to vivid-rtm (if we end up with that solution for this release) so the desktop apps using Qt5 have time to adapt?
| Timo Jyrinki (timo-jyrinki) wrote : | #51 |
The autopilot tests with ubuntu7~test3 are ready and look good compared to the stock archive results (unity8 and music-app are flaky and could be rerun): http://
vivid-rtm should be available late next week so this could land there. Modifying the packaging to selectively patch only armhf might be an option but given vivid is <2 weeks from release even that's a bit far-fetched idea at this point. For vivid+1 that could be done so that we're not out of sync with rtm there.
| Tony Espy (awe) wrote : | #52 |
As the bug description seems to focus on test hangs, and mentions krillin and mako, I want to add that I see this on normal boots of mako ( vivid-proposed / #166 ) and arale ( vivid-proposed / #165 ); roughly 30% of the time.
| Changed in canonical-devices-system-image: | |
| status: | New → In Progress |
| importance: | Undecided → Critical |
| milestone: | none → ww17-2015 |
| Changed in autopilot (Ubuntu): | |
| status: | Confirmed → Fix Released |
| Timo Jyrinki (timo-jyrinki) wrote : | #53 |
This is still pending on an upstream fix that doesn't break certain KDE apps using deprecated functions. Meanwhile, I can (and will) prepare a landing to vivid-rtm PPA if it seems a fix that can be landed to Ubuntu proper continues to take time.
| Timo Jyrinki (timo-jyrinki) wrote : | #54 |
With the current build 5.4.1+dfsg-
| Timo Jyrinki (timo-jyrinki) wrote : | #55 |
Weekend's test results at http://
| Timo Jyrinki (timo-jyrinki) wrote : | #56 |
As an update ~test3 updated ppc64el symbols and "ubuntu7" is a no-change rebuild of ~test3 with a saner version number. Putting forward to QA for further testing and signoff.
| Timo Jyrinki (timo-jyrinki) wrote : | #57 |
QA has found an app installation problem that sometimes happens with silo 018, fixed at least by removing and re-adding the U1 account. I did not see such a problem, but I also flashed clean, updated and only then added the U1 account for installing apps, so that may be the reason I was able to install apps without problems.
I did see both with and without the PPA that sometimes adding U1 account in general stalls on vivid, so that's not a regression.
On IRC I got some info about how DBus is used with app installations: "dbus is used to monitor installation progress (for the progress bar in the preview). apps store scope just passes dbus object path to unity8 dash and doesn't interact with dbus directly (at least when it comes to installation); probably unity8/dash logs are first to look at"
I now updated to latest image, added U1 account, installed an app, and then upgraded to silo 018, and I can see the problem now. When I click "Install", dbus.log gets:
---
Activating service name='com.
Successfully activated service 'com.ubuntu.
---
but unity8-dash.log gets:
---
RequestAccess failed: QDBusError(
---
Then when I clicked the back arrow from the app page, I got a request for U1 credentials, which I canceled. When I go to u-s-s the U1 account is not there anymore. After adding the U1 account again, app installation appears to work smoothly.
It seems to me the pattern is that the U1 account gets removed randomly when a reboot is involved. One time I observed the account had survived a reboot, but disappeared at the moment I clicked "Install". But it seems that without a reboot and with account added app installation seems to work for multiple apps without problems.
I did clean QML cache at one point so it does not seem to be related (also, qtdeclarative isn't touched in the PPA).
After ppa-purging the problem doesn't occur anymore. So the problem has to come from the QDBus changes in the silo, even if those provably fix the Unity 8 hanging issue during boot and no issues were seen in any of the autopilot tests.
| Timo Jyrinki (timo-jyrinki) wrote : | #58 |
| description: | updated |
| Timo Jyrinki (timo-jyrinki) wrote : | #59 |
https:/
But it'd seem possible that a similar fix to a racy code would be needed somewhere that gets called when "Install" button gets pressed for the first time after a reboot. It now often removes the account when it's pressed, although sometimes you can do a full "add U1 account, reboot, install app" cycle without problems.
| Alejandro J. Cura (alecu) wrote : | #60 |
With latest silo 18 on mako #183 I still got the account deleted after reboot.
And the spinner is still shown many times after clicking on "Sign In" in the online accounts screen, both when started from system-
While the spinner is shown, both online-accounts-ui and unity8 seem to go on a merry dance while chewing at the cpu. (perhaps it's online-accounts-lib the one inside the unity8 process, or it's qtdbus)
The top that alesage sees on arale looks similar to what I see on mako: http://
But, starting dbus-monitor doesn't show any unusual traffic that might explain the high cpu usage.
| Michał Sawicz (saviq) wrote : | #61 |
Note that the activity indicator itself might be the CPU hogger - see bug #1431957.
| Timo Jyrinki (timo-jyrinki) wrote : | #62 |
It seems I've found a workaround for the original unity8 hang on boot, which we can include in lxc-android-config without touching Qt while the upstream fixing of QDBus is not 100% complete.
| Changed in lxc-android-config (Ubuntu): | |
| assignee: | nobody → Timo Jyrinki (timo-jyrinki) |
| status: | New → In Progress |
| Timo Jyrinki (timo-jyrinki) wrote : | #63 |
Scratch that. We started testing on arale, and it seems that on arale at least the delay workaround is not enough. It also seems 30 reboots are not enough to reliably reproduce the hang, as once it happened after 54 reboots.
| Changed in lxc-android-config (Ubuntu): | |
| assignee: | Timo Jyrinki (timo-jyrinki) → nobody |
| status: | In Progress → Incomplete |
| description: | updated |
| Vincent Ladeuil (vila) wrote : | #64 |
I've created https:/
Given the last comment above it may need to be tweaked s/30/60/ .
| Vincent Ladeuil (vila) wrote : | #65 |
Also reproduced on arale with the branch above after 25 reboots.
$ adb shell system-image-cli -i
current build number: 22
device name: arale
channel: ubuntu-
last update: 2010-01-01 00:29:22
version version: 22
version ubuntu: 30e975dd988b3be
version device: 998baf0435d37e4
version custom: 76da72854bde96b
In addition, the first time I ran this test I ended up with a black screen with only the tiny ubuntu logo after two reboots only.
Rebooting the phone from the power button was enough to recover it though.
Looks like this test can reproduce more than one bug ;)
| Pat McGowan (pat-mcgowan) wrote : | #66 |
Tried running the new test but I get /tmp/autopkgtes
| Vincent Ladeuil (vila) wrote : | #67 |
@Pat: Can you re-try adding '-d' to your adt-run options and share the log ?
/tmp/autopkgtes
I'm eager to learn how such a failure can happen and where it needs to be fixed.
| Vincent Ladeuil (vila) wrote : | #68 |
@Pat: Just in case, I'm on vivid and use autopkgtest 3.13, that may be related.
| description: | updated |
| Timo Jyrinki (timo-jyrinki) wrote : | #69 |
Some weekend numbers of how many reboots needed to reproduce the problem on mako: 30, 12, 15, 59, 15, 9, 26, 10, 6. There's no clear upper limit.
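As a rough reading of these numbers, the sketch below treats every boot as an independent trial with the same hang probability (a geometric model). That model is an assumption made here for illustration, not something established in this report, and nine samples give only a coarse estimate. The function names are hypothetical.

```python
import math

# Reboots needed to reproduce the hang on mako, as listed in this comment.
reboots_to_hang = [30, 12, 15, 59, 15, 9, 26, 10, 6]


def estimate_hang_probability(samples):
    """Maximum-likelihood per-boot hang probability under a geometric model."""
    return len(samples) / sum(samples)


def clean_reboots_needed(p_hang, confidence):
    """Smallest n with (1 - p_hang) ** n <= 1 - confidence.

    In words: how many consecutive clean boots an unfixed device would
    survive with probability at most 1 - confidence.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_hang))


if __name__ == "__main__":
    p = estimate_hang_probability(reboots_to_hang)
    print(round(p, 3))                    # prints 0.049
    print(clean_reboots_needed(p, 0.95))  # prints 60
    print(clean_reboots_needed(p, 0.99))  # prints 91
```

Under these assumptions the hang occurs about once in 20 boots, and a fix needs on the order of 60 to 90 consecutive clean reboots before the absence of a hang is convincing.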
In other testing, reverting the libusermetrics landing from February does not seem to cure the problem - I was able to reproduce the problem also with the revert I've pushed to https:/
I've improved the test case in the description several times to get fuller backtraces. With the latest version I was able to see that both in case of the current libusermetrics and the reverted one the backtrace leads back to usermetrics's DBus usage. Which is not to say there's anything wrong with libusermetrics, as the bug is in multi-threaded handling of DBus inside Qt. The DBus call being called is:
#17 0xffffffff in dbus_bus_add_match (connection=
msg = 0xf76058
| Timo Jyrinki (timo-jyrinki) wrote : | #70 |
| Timo Jyrinki (timo-jyrinki) wrote : | #71 |
(note that the "path" and "member" may be different on each hang)
| description: | updated |
| Changed in libusermetrics (Ubuntu): | |
| status: | Invalid → In Progress |
| importance: | Undecided → Critical |
| assignee: | nobody → Pete Woods (pete-woods) |
| Timo Jyrinki (timo-jyrinki) wrote : | #72 |
There was a tip from tvoss to try to use QDBusConnection
There are 3 PPA:s now all doing that:
- https:/
- https:/
- https:/
Using 007, my mako is now at 105 reboots and counting.
| Timo Jyrinki (timo-jyrinki) wrote : | #73 |
"+ similar changes to unity8 from Albert", the 010 description was meant to be.
| Timo Jyrinki (timo-jyrinki) wrote : | #74 |
Ok 007 was validated on mako and arale (>= 100 reboots), but we're switching silos now. Also qt5-proper was tested but it won't be needed at this point.
A new commit on Pete's libusermetrics branch combines his and my changes in a better way and that is now being built in silo _017_. We're going to revalidate that and if that's good, put it to QA for sign-off. Unity 8 changes would not be needed at this point.
So, testing should be done with https:/
| Timo Jyrinki (timo-jyrinki) wrote : | #75 |
arale testing stopped at 127 successful reboots, mako at 83. Now in QA sign-off.
| description: | updated |
| Timo Jyrinki (timo-jyrinki) wrote : | #76 |
This bug is now fixed (worked around) in the overlay PPA, i.e. unity8 doesn't hang anymore on boot:
libusermetrics (1.1.1+
[ Pete Woods ]
* Stop using shared DBus connections (LP: #1421009)
-- CI Train Bot <email address hidden> Mon, 27 Apr 2015 15:38:43 +0000
The bug will stay open for the possibility of getting the reworked QDBus from upstream at some point.
| Changed in libusermetrics (Ubuntu): | |
| status: | In Progress → Fix Released |
| Changed in canonical-devices-system-image: | |
| status: | In Progress → Fix Committed |
| Changed in lxc-android-config (Ubuntu): | |
| status: | Incomplete → Invalid |
| Changed in qtbase-opensource-src (Ubuntu): | |
| importance: | Critical → High |
| Changed in canonical-devices-system-image: | |
| status: | Fix Committed → Fix Released |
| tags: | removed: lt-blocker |
| Changed in qtbase-opensource-src (Ubuntu): | |
| importance: | High → Medium |
| Changed in unity8 (Ubuntu): | |
| importance: | High → Undecided |
| Łukasz Zemczak (sil2100) wrote : | #77 |
Removing it from the landing team tracker as in theory this is no longer a real issue.
| Timo Jyrinki (timo-jyrinki) wrote : | #78 |
Upstream has postponed their QDBus rework to Qt 5.6, so it seems there won't be anything to test for some time.
| Changed in qtbase-opensource-src (Ubuntu): | |
| status: | In Progress → Incomplete |
| assignee: | Timo Jyrinki (timo-jyrinki) → nobody |
| Launchpad Janitor (janitor) wrote : | #79 |
This bug was fixed in the package ubuntu-
---------------
ubuntu-
[ Alberto Mardegan ]
* Fix build with Qt 5.5 (LP: #1387537, #1421009, #1448878, #1447175)
* Return the error name to the client (LP: #1441873)
[ CI Train Bot ]
* New rebuild forced.
-- CI Train Bot <email address hidden> Wed, 24 Jun 2015 13:36:39 +0000
| Changed in ubuntu-system-settings-online-accounts (Ubuntu): | |
| status: | New → Fix Released |
| Timo Jyrinki (timo-jyrinki) wrote : | #80 |
Qt 5.6 is now in yakkety and xenial-overlay, with a largely re-engineered QDBus (...with new but different bugs).
| Changed in qtbase-opensource-src (Ubuntu): | |
| status: | Incomplete → Fix Released |


I just got this after 7 tries. On the screen is the Ubuntu spinner. Here's a stacktrace:
(gdb) bt full linux-gnueabihf/libc.so.6 :lockInternal() (op=0, val=3, timeout=0x0, addr=<optimized out>) at thread/qmutex_linux.cpp:154 :lockInternal() (timeout=-1, elapsedTimer=0x0, d_ptr=...) at thread/qmutex_linux.cpp:195 :lockInternal() (this=this@entry=0x132b2e4) qmutex_linux.cpp:211 qmutex.cpp:628 entry=0x132b414) qmutex.cpp:223 Watch(QDBusConnectionPrivate*, DBusWatch*, int) (m=0x132b414, s=0x132b3f0, a=ToggleWatchAction, this=<synthetic pointer>) g_p.h:191
{<QDBusMutexLocker> = {<QDBusLockerBase> = {<No data fields>}, self = 0x132b3f0, mutex = 0x132b414, action = ToggleWatchAction}, <No data fields>} Watch(QDBusConnectionPrivate*, DBusWatch*, int) (s=0x132b3f0, a=ToggleWatchAction, this=<synthetic pointer>) g_p.h:206
{<QDBusMutexLocker> = {<QDBusLockerBase> = {<No data fields>}, self = 0x132b3f0, mutex = 0x132b414, action = ToggleWatchAction}, <No data fields>} Watch(QDBusConnectionPrivate*, DBusWatch*, int) (d=0x132b3f0, watch=0x132c480, fd=41) at qdbusintegrator.cpp:344
{<QDBusMutexLocker> = {<QDBusLockerBase> = {<No data fields>}, self = 0x132b3f0, mutex = 0x132b414, action = ToggleWatchAction}, <No data fields>} 0x132c3e8) dbus-transport-socket.c:167
socket_transport = 0x132c3e8 0x132c3e8, flags=1, timeout_milliseconds=<optimized out>) at ../../dbus/dbus-transport-socket.c:1210
socket_transport = 0x132c3e8
poll_timeout = <optimized out> _do_iteration (transport=0x132c3e8, flags=1, timeout_milliseconds=-1) at ../../dbus/dbus-transport.c:1001 n_do_iteration_unlocked (connection=0x132c808, pending=<optimized out>, flags=1, timeout_milliseconds=-1) dbus-connection.c:1227 n_send_preallocated_unlocked_no_update (connection=connection@entry=0x132c808, preallocated=0x0, message=message@entry=0x14ee998, client_serial=client_serial@entry=0x0) at ../../dbus/dbus-connection.c:205...
#0 0xffffffff in syscall () at /lib/arm-
#1 0xffffffff in QBasicMutex:
addr2 = 0x0
val2 = 0
#2 0xffffffff in QBasicMutex:
#3 0xffffffff in QBasicMutex:
at thread/
#4 0xffffffff in QMutex::lock() (this=0x132b2e4) at thread/qmutex.h:67
self = 0xb6efd410
success = true
current = 0x132b2d8
#5 0xffffffff in QMutex::lock() (timeout=-1, this=0x132b2d8)
at thread/
self = 0xb6efd410
success = true
current = 0x132b2d8
#6 0xffffffff in QMutex::lock() (this=this@
at thread/
current = 0x132b2d8
#7 0xffffffff in qDBusRealToggle
at qdbusthreaddebu
locker =
i = <optimized out>
#8 0xffffffff in qDBusRealToggle
at qdbusthreaddebu
locker =
i = <optimized out>
#9 0xffffffff in qDBusRealToggle
locker =
i = <optimized out>
#10 0xffffffff in check_write_watch (transport=
at ../../dbus/
needed = <optimized out>
transport = 0x132c3e8
#11 0xffffffff in socket_do_iteration (transport=
poll_fd = {fd = 41, events = 0, revents = -16705}
poll_res = <optimized out>
#12 0xffffffff in _dbus_transport
#13 0xffffffff in _dbus_connectio
at ../../dbus/
#14 0xffffffff in _dbus_connectio