Unity 8 fails to start on staging (xenial) on the phone when --wipe is used for flashing

Bug #1604421 reported by Timo Jyrinki on 2016-07-19
This bug affects 1 person
Affects                                  Importance  Assigned to
Canonical System Image                   Critical    Łukasz Zemczak
gsettings-ubuntu-touch-schemas (Ubuntu)  Undecided   Unassigned
ubuntu-touch-session (Ubuntu)            Critical    Łukasz Zemczak
unity8 (Ubuntu)                          Undecided   Unassigned

Bug Description

Update: this only happens when --wipe (or --bootstrap) is used for flashing. It is possible to get a functional xenial/staging running by doing a non-destructive flash update from rc-proposed.

Unity 8 fails to start on staging (=xenial + xenial-overlay) on my krillin.

unity-system-compositor.log: http://paste.ubuntu.com/20035267/ - shows Opening/Closing/Opening/Closing when starting unity8.

unity8.log attached.

Using ubuntu-touch/staging/bq-aquaris.en channel on krillin. version_detail: ubuntu=20160719,device=20160606-ab415b2,custom=20160701-981-38-14,version=49

Crash file at http://people.ubuntu.com/~timo-jyrinki/unity8/_usr_bin_unity8.32011.crash

Xenial can be flashed on the phone with eg:
ubuntu-device-flash touch --channel=ubuntu-touch/staging/ubuntu --developer-mode --password=0000 --wipe --bootstrap
(ubuntu channel)
ubuntu-device-flash touch --channel=ubuntu-touch/staging/bq-aquaris.en --developer-mode --password=0000 --wipe --recovery-image recovery-krillin.img
(Bq channel)


Timo Jyrinki (timo-jyrinki) wrote :
description: updated
Changed in canonical-devices-system-image:
milestone: none → xenial
description: updated
Timo Jyrinki (timo-jyrinki) wrote :

Tested now also on mako, same problem. ubuntu channel, ubuntu=20160719,device=20160402,custom=20160719,version=44

Changed in unity8 (Ubuntu):
assignee: nobody → Daniel d'Andrada (dandrader)
Changed in unity8 (Ubuntu):
status: New → In Progress
Daniel d'Andrada (dandrader) wrote :

I give up. Everything is crashing repeatedly there and I don't know how to control services there as upstart seems no longer used.

systemctl also doesn't seem to be working correctly. At least not like it does on the desktop.

systemctl list-units for instance fails with:
Failed to list units: No such method 'ListUnitsFiltered'

Changed in unity8 (Ubuntu):
status: In Progress → Confirmed
assignee: Daniel d'Andrada (dandrader) → nobody
Changed in canonical-devices-system-image:
assignee: nobody → Michał Sawicz (saviq)
Timo Jyrinki (timo-jyrinki) wrote :

It's something very recent. Image 40 on ubuntu staging does boot to Unity 8. I will continue bisecting tomorrow.

That is, on my mako ubuntu-device-flash touch --channel=ubuntu-touch/staging/ubuntu --developer-mode --password=0000 --wipe --revision=40 --bootstrap boots to unity8.

Timo Jyrinki (timo-jyrinki) wrote :

@Daniel I don't think systemd is to blame, older xenial images worked. Upstart is still in use as well, for example for starting/stopping unity8 via /sbin/initctl.

It's the image 40 which is the last one that works. 41 and newer are broken. The diff in packages between 40 and 41 is quite big: http://paste.ubuntu.com/20295674/

Timo Jyrinki (timo-jyrinki) wrote :

version_detail: ubuntu=20160623,device=20160402,custom=20160623,version=40

Timo Jyrinki (timo-jyrinki) wrote :

I did upgrades from image 40 and finally ended up with everything upgraded and Unity 8 still starting fine. So it seems it would be tarball related.

The 41 details: version_detail: ubuntu=20160713,device=20160402,custom=20160713,version=41

-> custom tarball was updated from 20160623 to 20160713
-> device tarball was not updated
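
The comparison between the two version_detail strings can be done mechanically. This is only a sketch: diff_version_detail is a made-up helper name, and the two strings are copied from the comments above.

```shell
#!/bin/sh
# Print the version_detail fields whose values differ between two images.
# diff_version_detail is a hypothetical helper, not part of any Ubuntu tool.
diff_version_detail() {
    old=$1
    new=$2
    # Split the new string on commas, one key=value pair per line.
    printf '%s\n' "$new" | tr ',' '\n' | while IFS='=' read -r key value; do
        # Look up the same key in the old string.
        old_value=$(printf '%s\n' "$old" | tr ',' '\n' | sed -n "s/^$key=//p")
        if [ "$old_value" != "$value" ]; then
            printf '%s: %s -> %s\n' "$key" "$old_value" "$value"
        fi
    done
}

# version_detail of image 40 (working) and image 41 (broken), from this report.
image40='ubuntu=20160623,device=20160402,custom=20160623,version=40'
image41='ubuntu=20160713,device=20160402,custom=20160713,version=41'
diff_version_detail "$image40" "$image41"
```

For these two strings it reports the ubuntu, custom and version fields and leaves out device, which matches the two notes above.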

Changed in canonical-devices-system-image:
status: New → Confirmed
importance: Undecided → Critical
Łukasz Zemczak (sil2100) wrote :

One thought is: since it's unlikely that the custom tarball actually could break anything, maybe the reason why dist-upgrading works is because unity8 ran once on 40, setting the state of the system to some 'state' which seems to work with all packages upgraded. On a fresh flash of 41 it might have not worked because it was the 'first boot'.

This, of course, only makes sense if no one did tests of flashing 40, booting up and then flashing 41 without --bootstrap or --wipe (so hopefully conserving the state of the rw system).

Timo Jyrinki (timo-jyrinki) wrote :

I think testing flashing 40 + flashing newer without --bootstrap/--wipe was never done.

You can try using my (still working, flashed formerly and dist-upgraded) krillin's home directory:

https://private-fileshare.canonical.com/~tjyrinki/krillin/

Likewise I could extract pieces of / or other parts to try to pinpoint the difference.

Timo Jyrinki (timo-jyrinki) wrote :

Updated to reflect the fact that it's possible to get staging working by not using wipe/bootstrap.

summary: - Unity 8 fails to start on staging (xenial) on the phone
+ Unity 8 fails to start on staging (xenial) on the phone when --wipe is
+ used for flashing
description: updated

I confirm Timo's findings. vivid + upgrade (with udf) works, fresh xenial doesn't boot. Here is the diff between the writable parts of the two installations. Left is vivid+upgrade, right is xenial.

Michał Sawicz (saviq) wrote :

The unity8 crash seems to be a glib fatal error:

"No GSettings schemas are installed on the system"

Also supporting this:

$ gsettings list-schemas
No schemas installed

This could also explain why it works on upgrade, if vivid finds and prepares the schemas fine.

Changed in unity8 (Ubuntu):
status: Confirmed → Incomplete
Michał Sawicz (saviq) wrote :

Confirmed, something on vivid makes the schemas work, if you wipe or bootstrap, glib can't find them.

glib-compile-schemas doesn't help, something else must be different between a vivid upgrade and fresh xenial.

Łukasz Zemczak (sil2100) wrote :

Commitlog for 41 if needed.

Łukasz Zemczak (sil2100) wrote :

Looking at the commitlog, from the packages that could have touched anything related to schemas, I see two package uploads happening between image #40 (xenial working) and image #41 (xenial broken):

- gsettings-ubuntu-touch-schemas (0.0.7+16.04.20160615.1-0ubuntu1) > gsettings-ubuntu-schemas:
http://launchpadlibrarian.net/267243088/gsettings-ubuntu-touch-schemas_0.0.6+16.04.20160414-0ubuntu1_0.0.7+16.10.20160615.1-0ubuntu1.diff.gz
- unity (7.4.0+16.04.20160705-0ubuntu1) > unity-schemas:
http://launchpadlibrarian.net/272410280/unity_7.4.0+16.04.20160526.1-0ubuntu1_7.4.0+16.04.20160705-0ubuntu1.diff.gz

no longer affects: glib2.0 (Ubuntu)
Michał Sawicz (saviq) wrote :

The problem seems to be:

$ initctl get-env -g XDG_DATA_DIRS
/usr/share/ubuntu-touch:/usr/share/ubuntu-touch:/usr/share/ubuntu-touch:/usr/share/ubuntu-touch::/custom/usr/share/:/custom/usr/share/

Compared to a working, upgraded device:
$ initctl get-env -g XDG_DATA_DIRS
/custom/xdg/data:/usr/share/ubuntu-touch:/usr/share/ubuntu-touch:/usr/share/ubuntu-touch:/usr/local/share:/usr/share:/custom/usr/share/
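
GLib looks for compiled schemas in glib-2.0/schemas under each XDG_DATA_DIRS entry, and the broken value above has no /usr/share in it. A rough way to see which entries actually hold a schema cache is sketched below. find_schema_caches is a made-up helper name, and the assumption that the device's schemas live under /usr/share is not stated anywhere in this report.

```shell
#!/bin/sh
# List each XDG_DATA_DIRS entry and whether it holds a compiled GSettings
# schema cache (glib-2.0/schemas/gschemas.compiled).
# find_schema_caches is a hypothetical helper, not part of glib.
find_schema_caches() {
    # Fall back to the XDG default when the argument is empty.
    dirs=${1:-/usr/local/share:/usr/share}
    old_ifs=$IFS
    IFS=:
    for dir in $dirs; do
        # Skip empty entries such as the one produced by "::".
        [ -n "$dir" ] || continue
        if [ -f "$dir/glib-2.0/schemas/gschemas.compiled" ]; then
            printf 'found    %s\n' "$dir"
        else
            printf 'missing  %s\n' "$dir"
        fi
    done
    IFS=$old_ifs
}

# Check the current shell's value; on the phone the value reported by
# "initctl get-env -g XDG_DATA_DIRS" can be passed in instead.
find_schema_caches "${XDG_DATA_DIRS:-}"
```

If every entry comes back as missing, that would be consistent with "No schemas installed", but it doesn't say why the list ended up that way.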

Changed in unity8 (Ubuntu):
status: Incomplete → Invalid
Łukasz Zemczak (sil2100) wrote :

It seems as if XDG_DATA_DIRS gets cleared out at some time (you can see the "::" where the previous XDG_DATA_DIRS content should have been), wonder what component could have caused this. There was no change in ubuntu-touch-session that actually sets the .profile file.

As Pat pointed out on IRC a valid lead would be to check what's being run in /etc/profile.d/ on the broken machine. The only script that touches XDG_DATA_DIRS is my old hack for enabling customized notification sounds, /etc/profile.d/add_custom_to_xdg_data.sh. Maybe it would be a good experiment to try and remove it in the broken system. Doesn't look like much could have gotten broken there. Could anyone list the contents of the /etc/profile.d directory on the xenial non-booting device?
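
The "::" symptom is easy to test for. A small sketch, with has_empty_entry as a made-up helper name and both values copied from the initctl output above:

```shell
#!/bin/sh
# Return 0 if a colon-separated path list contains an empty entry
# (leading ':', trailing ':' or '::'), 1 otherwise.
# has_empty_entry is a hypothetical helper name.
has_empty_entry() {
    case $1 in
        '') return 1 ;;               # empty list: nothing to check
        :* | *: | *::*) return 0 ;;   # some entry is empty
        *) return 1 ;;
    esac
}

# Values reported in this bug for the broken and the working device.
broken='/usr/share/ubuntu-touch:/usr/share/ubuntu-touch:/usr/share/ubuntu-touch:/usr/share/ubuntu-touch::/custom/usr/share/:/custom/usr/share/'
working='/custom/xdg/data:/usr/share/ubuntu-touch:/usr/share/ubuntu-touch:/usr/share/ubuntu-touch:/usr/local/share:/usr/share:/custom/usr/share/'

for value in "$broken" "$working"; do
    if has_empty_entry "$value"; then
        echo "empty entry found"
    else
        echo "no empty entry"
    fi
done
```

The broken value trips the check and the working one does not. It only detects the symptom, it doesn't tell which script produced it.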

Andrea Azzarone (azzar1) on 2016-08-16
no longer affects: unity (Ubuntu)
Changed in ubuntu-touch-session (Ubuntu):
assignee: nobody → Łukasz Zemczak (sil2100)
Łukasz Zemczak (sil2100) wrote :

Ok, it seems that the culprit is my quick-fix add_custom_to_xdg_data.sh in profile.d! I tracked it down and it seems to be causing the invalid .profile. The reason is yet unknown, but I suppose something changed and now it was applied in a different order than on vivid. I checked the values and it seems that /etc/profile.d/add_custom_to_xdg_data.sh was run *before* /usr/bin/ubuntu-touch-session, not the other way around as it was intended. This causes corruption as ubuntu-touch-session was written in a way assuming that XDG_DATA_DIRS is empty (or at least sane), whereas in this case it only had this one additional entry we wanted for customized notification sounds.

Anyway, need to figure out what happened exactly, but removing add_custom_to_xdg_data.sh from /etc/profile.d is a quick workaround for now.

Changed in ubuntu-touch-session (Ubuntu):
importance: Undecided → Critical
status: New → In Progress
Changed in gsettings-ubuntu-touch-schemas (Ubuntu):
status: New → Invalid
Changed in canonical-devices-system-image:
status: Confirmed → In Progress
assignee: Michał Sawicz (saviq) → Łukasz Zemczak (sil2100)
Changed in canonical-devices-system-image:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ubuntu-touch-session - 0.108+16.10.20160817.1-0ubuntu1

---------------
ubuntu-touch-session (0.108+16.10.20160817.1-0ubuntu1) yakkety; urgency=medium

  [ Łukasz 'sil2100' Zemczak ]
  * Modify the add_custom_to_xdg_data.sh profile.d hook to not modify
    the XDG_DATA_DIRS if it's not set yet. (LP: #1604421)
  * Do not save the ubuntu-touch-session environment to the local
    .profile files as this is not required (and bad practice).

 -- Łukasz Zemczak <email address hidden> Wed, 17 Aug 2016 13:53:09 +0000
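
The first changelog entry describes a guard in the profile.d hook. The actual script is not quoted in this report, so the following is only a sketch of what such a guard could look like. append_custom_data_dir is a made-up name, /custom/usr/share/ is taken from the XDG_DATA_DIRS values quoted earlier, and both the empty-value check and the duplicate check are additions not mentioned in the changelog.

```shell
#!/bin/sh
# Sketch of a profile.d-style hook that appends a directory to
# XDG_DATA_DIRS only when the variable is already set.
# append_custom_data_dir is a hypothetical name, not the real hook.
append_custom_data_dir() {
    custom_dir=$1
    # Leave the variable alone if it is unset (or empty, an assumption here),
    # so whatever sets the base value later starts from a clean state.
    if [ -z "${XDG_DATA_DIRS:-}" ]; then
        return 0
    fi
    case ":$XDG_DATA_DIRS:" in
        *":$custom_dir:"*) ;;   # already present, avoid duplicates
        *) XDG_DATA_DIRS="$XDG_DATA_DIRS:$custom_dir" ;;
    esac
    export XDG_DATA_DIRS
}

# Case 1: hook runs before anything has set XDG_DATA_DIRS.
unset XDG_DATA_DIRS
append_custom_data_dir /custom/usr/share/
echo "${XDG_DATA_DIRS-<unset>}"

# Case 2: hook runs after the base value is in place.
XDG_DATA_DIRS=/usr/local/share:/usr/share
append_custom_data_dir /custom/usr/share/
echo "$XDG_DATA_DIRS"
```

In the first case the variable stays unset, in the second the custom directory is appended once at the end.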

Changed in ubuntu-touch-session (Ubuntu):
status: In Progress → Fix Released