/var/log owned by wrong group (android_input) (again)

Bug #1451565 reported by Selene ToyKeeper
This bug affects 2 people
Affects                   Status     Importance  Assigned to    Milestone
Canonical System Image    Confirmed  High        John McAleely
livecd-rootfs (Ubuntu)    Confirmed  Undecided   Unassigned

Bug Description

Bug 1425529 (or bug 1425869) is back. No syslog on arale image 31 and krillin vivid 188 (which are about two weeks apart), and possibly on other builds too. I'm not sure how widespread it is, but I've run into this several times lately.

Sorry for the dupe, I was asked to create a new bug for it instead of re-opening one of the previous two.

Test Case 1:
1. make the system writable
2. sudo vim.tiny /etc/system-image/channel.ini
3. Change the image number in the lower lines of the file to one 10 images back
4. remove writable and rm .local/share/ubuntu-push-client/level.db
5. reboot
6. upgrade; after the upgrade, check /var/log

Test Case 2:
1. Flash the latest rc-proposed to the device under test
2. run some tests
3. make the system writable
4. add a silo
5. reflash the same image to the same device (we use bootstrap from fastboot for flashing purposes)

Oliver Grawert (ogra) wrote :

hmm, i can not confirm that on either of my devices here ...

Selene ToyKeeper (toykeeper) wrote :

I'm seeing the issue on arale image 32 too, flashed just a moment ago. Krillin rtm 274 seems fine though, so today's milestone candidate isn't affected.

Selene ToyKeeper (toykeeper) wrote :

Image 38 seems fine. I'm not sure if it's fixed or if it's intermittent.

Selene ToyKeeper (toykeeper) wrote :

Scratch that. It's definitely intermittent. I reflashed 38 and now /var/log is broken again.

Selene ToyKeeper (toykeeper) wrote :

As far as I can tell, the first time I flash an image the permissions are correct. The second and later times I flash the same image, the permissions are incorrect. I'm not sure why it matters that the image has been flashed before.

Selene ToyKeeper (toykeeper) wrote :

FWIW, removing the UDF cache has no effect on the results for this bug. However, changing to a different image number seems to fix the permission issues.

So. To make the flash work correctly, make sure the new image isn't the same number as the old image. To make the permissions break, flash the same image the device is already running.

Selene ToyKeeper (toykeeper) wrote :

Additionally, the permissions of /var/log aren't the only thing which change when re-flashing the same image.

I collected the output of 'ls -alR /' during the first and second flash, filtered it a bit to remove most of the noise, and diff'd the two to look for differences. Some of the results are unusual.

One oddity is a change in mmcblk0 partition numbers, though this might make sense if the system swaps between two as a failsafe mechanism.
 /dev/disk/by-path:
-lrwxrwxrwx root root platform-mtk-msdc.0 -> ../../mmcblk0boot1
+lrwxrwxrwx root root platform-mtk-msdc.0 -> ../../mmcblk0boot0
...
-lrwxrwxrwx root root 57f8f4bc-abf4-655f-bf67-946fc0f9f25b -> ../../mmcblk0p14
+lrwxrwxrwx root root 57f8f4bc-abf4-655f-bf67-946fc0f9f25b -> ../../mmcblk0p15

More unusual is an apparent difference in the version numbers of installed click apps. For example:
 /sys/kernel/security/apparmor/policy/profiles:
-drwxr-xr-x root root com.canonical.cincodias_CINCODIAS_0.2.19
+drwxr-xr-x root root com.canonical.cincodias_CINCODIAS_0.2.20
-drwxr-xr-x root root com.canonical.elpais_webapp-ELPAIS_0.6.28
+drwxr-xr-x root root com.canonical.elpais_webapp-ELPAIS_0.6.29
-drwxr-xr-x root root com.canonical.scopes.bbc-sport_bbc-sport_1.3.1.31
+drwxr-xr-x root root com.canonical.scopes.bbc-sport_bbc-sport_1.3.1.32
-drwxr-xr-x root root com.ubuntu.calculator_ubuntu-calculator-app_2.0.155.32
+drwxr-xr-x root root com.ubuntu.calculator_ubuntu-calculator-app_2.0.155.31
-drwxr-xr-x root root com.ubuntu.music_music_2.1.857.13
+drwxr-xr-x root root com.ubuntu.music_music_2.1.857.12
-drwxr-xr-x root root com.ubuntu.scopes.youtube_youtube_1.0.18-134.30
+drwxr-xr-x root root com.ubuntu.scopes.youtube_youtube_1.0.18-134.28
-drwxr-xr-x root root com.ubuntu.weather_weather_1.1.403.20
+drwxr-xr-x root root com.ubuntu.weather_weather_1.1.403.19

And of course the known symptom:
 /userdata/system-data/var/log:
-drwxrwxr-x root syslog .
+drwxrwxr-x root android_input .
--rw-r----- syslog adm auth.log
--rw-r----- syslog adm syslog

The filtered 'ls' output and diff are attached.
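
For anyone who wants to repeat the comparison, here is a minimal sketch of the filter-and-diff step in Python. It is not the script used for the attached files. The function names are hypothetical, and it assumes the GNU long listing layout (mode, link count, owner, group, size, three date fields, name); device nodes, which show "major, minor" in place of a size, would need extra handling.

```python
import difflib


def filter_listing(text):
    """Reduce `ls -alR` output to 'mode owner group name' lines.

    Assumption: GNU long format with nine whitespace-separated fields.
    Directory headers such as '/var/log:' are kept unchanged; blank lines
    and 'total N' lines are dropped as noise.
    """
    kept = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("total "):
            continue
        fields = line.split(None, 8)
        if len(fields) == 9 and len(fields[0]) >= 10:
            mode, _links, owner, group = fields[:4]
            # Drop link count, size and timestamps, which differ on every flash.
            kept.append(f"{mode} {owner} {group} {fields[8]}")
        else:
            kept.append(line)
    return kept


def diff_listings(first, second):
    """Return unified-diff lines between two filtered listings."""
    return list(
        difflib.unified_diff(
            filter_listing(first),
            filter_listing(second),
            "first-flash",
            "second-flash",
            lineterm="",
        )
    )
```

Feeding it the saved listing from each flash should leave only lines whose mode, owner, group or name changed.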

Allan LeSage (allanlesage) wrote :

While reviewing a silo I was investigating lp:1471338 with boiko and we were both stymied trying to find the syslog--this fix would be really really helpful for basic debugging. IIRC I was on arale shortly before OTA-5.

Oliver Grawert (ogra) wrote :

hm, that looks very much like udev misbehaving or as if Android's ueventd tinkers with the UUIDs when the container comes up (note that we should mount by label everywhere nowadays though)

or perhaps there is a difference in the disk related udev rules inside the initrd vs the ones on the rootfs

Selene ToyKeeper (toykeeper) wrote :

I forgot to work around this bug today and ended up not able to get syslog data for ~18 hours of tests. D'oh. My own fault, but I hope we'll find a fix for this soon. :)

Pat McGowan (pat-mcgowan) wrote :

@john is this yours?

Changed in canonical-devices-system-image:
assignee: nobody → John McAleely (john.mcaleely)
Jean-Baptiste Lallement (jibel) wrote :

It happened again on arale 113

Changed in canonical-devices-system-image:
importance: Undecided → High
status: New → Confirmed
milestone: none → ww40-2015
John McAleely (john.mcaleely) wrote :

@toykeeper: you say you 'reflash the same image'. How?

Changed in canonical-devices-system-image:
status: Confirmed → Incomplete
Changed in livecd-rootfs (Ubuntu):
status: Confirmed → Incomplete
Dave Morley (davmor2) wrote :

I have seen this most frequently when doing the following steps:
1. make the system writable
2. sudo vim.tiny /etc/system-image/channel.ini
3. Change the image number in the lower lines of the file to one 10 images back
4. remove writable and rm .local/share/ubuntu-push-client/level.db
5. reboot
6. upgrade; after the upgrade, check /var/log

1. Flash the latest rc-proposed to the device under test
2. run some tests
3. make the system writable
4. add a silo
5. reflash the same image to the same device (we use bootstrap from fastboot for flashing purposes)

description: updated
Selene ToyKeeper (toykeeper) wrote :

I find it trivial to reproduce this issue:
1. adb reboot bootloader
2. ubuntu-device-flash foo
3. Note that syslog works normally.
4. adb reboot bootloader
5. ubuntu-device-flash foo
6. Note that syslog no longer works.

Steps 4/5/6 can be repeated if desired. The only way I've found to get a good flash again is to flash a different build number.

Changed in canonical-devices-system-image:
status: Incomplete → Confirmed
Tony Espy (awe) wrote :

Just to add to the fire, we have a report from a user in India who purchased an MX4 which has no syslog. It's running OTA6, and no other software has been side-loaded. The phone also is having interoperability problems with certain operators. Let's hope these issues aren't caused by other permissions being wrong per comment #7.

Tony Espy (awe) wrote :

It seems possible that devices sold in India may have been re-flashed with a new custom tarball at the factory, so the scenario posited in my previous comment isn't as exceptional as first thought... However, it still makes it more important that we get to the root of this, as we really don't want this to continue to occur on out-of-the-box customer devices.

Tony Espy (awe) wrote :

So the double-flash at the factory is still just a theory...

What was confirmed was that the user restored the device to factory default settings using System Settings.

tags: added: hotfix
Changed in canonical-devices-system-image:
status: Confirmed → Won't Fix
status: Won't Fix → Incomplete
milestone: ww40-2015 → backlog
Jean-Baptiste Lallement (jibel) wrote :

@John, could someone have a look again? It is still happening on the OTA7 candidate, even after flashing with --bootstrap. It makes bug investigation very difficult because most of the time you only realize there is no syslog when you need it.

Pat McGowan (pat-mcgowan) wrote :

On my krillin running ota7 stable via update, syslog was assigned group syslog not adm

Oliver Grawert (ogra) wrote :

the /var/log directory needs to be "root:syslog" ... the syslog file (and others) in the directory needs to be "syslog:adm"
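
A minimal sketch for checking those expectations on a device, assuming Python is available there. The helper names and the EXPECTED table are hypothetical; only the root:syslog and syslog:adm values come from the comment above.

```python
import grp
import os
import pwd


def owner_and_group(path):
    """Return (user, group) names for path, falling back to numeric ids."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return user, group


def ownership_problems(expected):
    """Compare actual ownership with {path: (user, group)}.

    Returns a list of (path, wanted, found) tuples; found is None when the
    path does not exist.
    """
    problems = []
    for path, want in expected.items():
        if not os.path.exists(path):
            problems.append((path, want, None))
            continue
        got = owner_and_group(path)
        if got != want:
            problems.append((path, want, got))
    return problems


# Expected ownership as described in the comment above.
EXPECTED = {
    "/var/log": ("root", "syslog"),
    "/var/log/syslog": ("syslog", "adm"),
}

if __name__ == "__main__":
    for path, want, got in ownership_problems(EXPECTED):
        print(f"{path}: expected {want}, found {got}")
```

Running it right after a flash would flag the broken case reported here, where /var/log ends up as root:android_input.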

Łukasz Zemczak (sil2100) wrote :

Is this still the case? I didn't reflash my phone but the group-assignment looks fine on my device. Can anyone try to reproduce?

tags: removed: hotfix
Selene ToyKeeper (toykeeper) wrote :

Yes, this is still happening as of tonight's latest image (arale rc-proposed 183). The first flash works, while every subsequent flash of the same image has the wrong permissions.

Launchpad Janitor (janitor) wrote :

[Expired for livecd-rootfs (Ubuntu) because there has been no activity for 60 days.]

Changed in livecd-rootfs (Ubuntu):
status: Incomplete → Expired
Changed in livecd-rootfs (Ubuntu):
status: Expired → Confirmed
Changed in canonical-devices-system-image:
status: Incomplete → Confirmed
Jean-Baptiste Lallement (jibel) wrote :

This is happening again very frequently when reflashing a phone (krillin rc-proposed 286 in this case). I insist because it really is an annoyance when debugging any issue on the phone: when you hit an issue and realize there is no syslog, it is already too late to fix it manually.

John McAleely (john.mcaleely) wrote :

@ondra - can you take a look at this during the next bring up you do (without mentioning the details here).
