Duplicate files in winning path should prevent updates

Bug #1250181 reported by Alan Pope 🍺🐧🐱 🦄 on 2013-11-11
16
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Ubuntu system image
Critical
Barry Warsaw
ubuntu-download-manager
Fix Released
Critical
Manuel de la Peña
system-image (Ubuntu)
Critical
Barry Warsaw
ubuntu-download-manager (Ubuntu)
Undecided
Unassigned

Bug Description

Trying to OTA update from 16 to 20 on mako. See screenshot.

root@ubuntu-phablet:/var/log/system-image# system-image-cli -i
current build number: 16
device name: mako
channel: trusty-proposed
last update: 2013-11-09 13:52:54
version version: 16
version ubuntu: 20131109
version device: 20131109

From /var/log/system-image/client.log

[systemimage] Nov 11 18:24:00 2013 (8816) Running group download reactor
[systemimage] Nov 11 18:25:20 2013 (8816) Group download reactor done
[systemimage] Nov 11 18:25:23 2013 (8816) uncaught exception in state machine
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/systemimage/state.py", line 138, in run_until
    step()
  File "/usr/lib/python3/dist-packages/systemimage/state.py", line 453, in _download_files
    raise SignatureError(dst)
systemimage.gpg.SignatureError: /android/cache/recovery/device-bc45abbc8b271608f690cc12f7e3dd6e3b6527fdd818a22ea1d2d204350dc56e.delta-device-f67595b49ad4c8233c0ad4645695dd2154242dc3e29b4a60a3afc39058b126a6.tar.xz

Related branches

Barry Warsaw (barry) on 2013-11-11
tags: added: client

root@ubuntu-phablet:/var/log/system-image# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/loop0 2.0G 1.2G 720M 62% /
udev 935M 4.0K 935M 1% /dev
tmpfs 188M 676K 187M 1% /run
/dev/mmcblk0p23 5.7G 3.3G 2.4G 58% /home
/dev/loop1 95M 93M 1.5M 99% /lib/modules
none 4.0K 0 4.0K 0% /android
tmpfs 188M 676K 187M 1% /run
tmpfs 936M 4.0K 936M 1% /etc/fstab
/dev/disk/by-partlabel/cache 552M 10M 542M 2% /android/cache
/dev/disk/by-partlabel/persist 16M 4.2M 12M 27% /android/persist
/dev/disk/by-partlabel/modem 64M 54M 11M 83% /android/firmware
none 4.0K 0 4.0K 0% /sys/fs/cgroup
tmpfs 936M 28K 936M 1% /tmp
none 5.0M 0 5.0M 0% /run/lock
none 936M 72K 936M 1% /run/shm
none 100M 20K 100M 1% /run/user
tmpfs 936M 0 936M 0% /var/lib/sudo
tmpfs 936M 548K 936M 1% /var/lib/lxc/android/rootfs
udev 935M 4.0K 935M 1% /var/lib/lxc/android/rootfs/socket

root@ubuntu-phablet:/var/log/system-image# system-image-cli -n
Upgrade path is 17:18:19:20

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in system-image (Ubuntu):
status: New → Confirmed
Oliver Grawert (ogra) wrote :

i have the same issue on maguro ... noticable is also that the download progress bar went up to 115% before i got the error message in the UI

Watching it when i click "retry" I notice the percentage bar leaps around quite a bit.
Made a video showing it.

https://www.youtube.com/watch?v=Q4wicglMkes

Barry Warsaw (barry) on 2013-11-12
Changed in ubuntu-system-image:
status: New → Confirmed
importance: Undecided → Critical
assignee: nobody → Barry Warsaw (barry)
milestone: none → 2.0
Changed in ubuntu-download-manager:
assignee: nobody → Manuel de la Peña (mandel)
importance: Undecided → Critical
status: New → Confirmed
Barry Warsaw (barry) wrote :

I think I figured this out, but I'll need confirmation from Manuel. Here's my theory.

I flashed my manta to r15 on trusty-proposed, then initiated an update via -cli. The SignatureError is completely reproducible. I recreated the environment locally (on amd64 desktop) by copying over the client.ini and channel.ini and staged up the expected directories. I put a breakpoint right before the SignatureError and initiated an update locally, reproduced the SignatureError and hitting the breakpoint.

From the pdb prompt, I got a listing of the tar.xz and tar.xz.asc files that were failing the signature test and md5 checksummed them in their destination location. Then I got the source path on the server and wget'd them to a different location. Indeed the md5 checksums do *not* match. So clearly udm is somehow corrupting the files when it downloaded them.

There's more, and here's where it gets interesting. At the breakpoint I did a len(downloads) and got a count of 30. For some reason I then did len(set(downloads)) and got a count of 28. Well *that's* strange, but what it tells me is that there were 2 duplicate downloads in the udm group download request. Although I haven't verified (the paths in the index.json are quite difficult to read), I bet there is some image in the 15->20 upgrade path that names the same data file.

So my hypothesis is this: when udm gets a group download request which names a destination (i.e. local path) twice, it will dutifully download the second request right on top of the first. udm *doesn't* raise the duplicate-file exception because the file did not exist before the group download request.

If my theory is correct, then it should be fairly simple to filter out any duplicate downloads before the udm group download request. There may be some corner cases I need to worry about (e.g. what order do files in a duplicate download get applied in, but I think that will naturally play itself out).

I've added a udm bugtask so that Manuel can verify and add additional safeguards in udm.

Testing now...

Barry Warsaw (barry) on 2013-11-12
Changed in system-image (Ubuntu):
importance: Undecided → High
importance: High → Critical
assignee: nobody → Barry Warsaw (barry)
Barry Warsaw (barry) wrote :

IRC discussion:

It's definitely unexpected that you'd see duplicate data files in a delta upgrade path. This is an error condition on the server. Thus the client has three options:

1) fail, probably by raising a DuplicateDataFile error and providing some reasonable error string in the D-Bus error signal.

2) Ignore duplicates and carry on as if nothing's wrong.

3) Throw away the delta upgrade path and re-calculate as if --filter=full

#2 is problematic because duplicate data files *could* lead to unbootable devices. It's an unexpected condition with unknown effects on the upgrade results.

#3 is just masking over the problem.

So we'll do #1. It means that if this problem reoccurs (stgraber is fixing the server side) the user will be stuck because upgrades via the u/i will be impossible until the server is fixed. It should still be possible to perform an upgrade via cli with either --filter=full or -b 0. Still, this gives us the best opportunity to notice the problem and fix the server.

Manuel de la Peña (mandel) wrote :

I have proposed a branch in u-d-m that stops it from allowing two local files to be the same. It will not fix the issue perse because the upgrade won't be possible but we will know the exact reason.

Barry Warsaw (barry) on 2013-11-12
summary: - download update from 16 to 20 failed on mako
+ Duplicate files in index.json should prevent updates
Changed in ubuntu-system-image:
status: Confirmed → In Progress
summary: - Duplicate files in index.json should prevent updates
+ Duplicate files in winning path should prevent updates
Barry Warsaw (barry) on 2013-11-12
Changed in ubuntu-system-image:
status: In Progress → Fix Committed
Vincent Ladeuil (vila) wrote :

I bricked my phone while trying to update it with system upgrade on the phone (via the UI, not via 'adb shell system-image-cli -c trusty-proposed -b 0 -v).

I was running image #15 at the time.

I failed to upgrade to image #20 (~260M wow) with an undecipherable error message (mainly because the file name displayed was too long).

Since the progress bar behavior was a bit... surprising (went over 100, ~110, oops, let's get back to ~90) I tried again, same error.

After several attempts, trying to keep the phone from suspending and watching the progress bar going backwards even before reaching > 100, I resigned (I didn't have IRC access at that point to get some feedback).

Later in the day, image #17 was proposed. Huh ? Still missing IRC access I thought some bug was detected server side and the images were re-shuffled and happily tried that.

The download went well, I tapped the install & restart... rebooted to the google logo and.... it's still there :-/

Vincent Ladeuil (vila) wrote :

Following popey instructions I got the device un-bricked:

$ adb reboot-bootloader
$ fastboot flash boot trusty-preinstalled-boot-armhf+mako.img

image downloaded from http://cdimage.ubuntu.com/ubuntu-touch/daily-preinstalled/20131107.1/trusty-preinstalled-boot-armhf+mako.img

rebooting the device

That gave me a functional #17 image.

Vincent Ladeuil (vila) wrote :
Download full text (3.8 KiB)

From imgae #17, the UI stayed at the spinning 'checking for updates' for as long as I could wait (at least 10 minutes), so I went with:

# adb shell system-image-cli -c trusty-proposed -b 0 -v
[systemimage] Nov 13 10:47:19 2013 (5513) running state machine [trusty-proposed/mako]
[systemimage] Nov 13 10:47:19 2013 (5513) Looking for blacklist: https://system-image.ubuntu.com/gpg/blacklist.tar.xz
[systemimage] Nov 13 10:47:20 2013 (5513) Requesting group download:
 https://system-image.ubuntu.com/gpg/blacklist.tar.xz -> /var/lib/system-image/keyring.tar.xz
 https://system-image.ubuntu.com/gpg/blacklist.tar.xz.asc -> /var/lib/system-image/keyring.tar.xz.asc

[systemimage] Nov 13 10:47:20 2013 (5513) Running group download reactor
[systemimage] Nov 13 10:47:31 2013 (5513) Group download reactor done
[systemimage] Nov 13 10:47:31 2013 (5513) Local blacklist file: /var/lib/system-image/blacklist.tar.xz
[systemimage] Nov 13 10:47:31 2013 (5513) Looking for: https://system-image.ubuntu.com/channels.json
[systemimage] Nov 13 10:47:31 2013 (5513) Requesting group download:
 https://system-image.ubuntu.com/channels.json -> /tmp/system-image-z9hrkv/channels.json
 https://system-image.ubuntu.com/channels.json.asc -> /tmp/system-image-z9hrkv/channels.json.asc

[systemimage] Nov 13 10:47:31 2013 (5513) Running group download reactor
[systemimage] Nov 13 10:47:32 2013 (5513) Group download reactor done
[systemimage] Nov 13 10:47:32 2013 (5513) Local channels file: /tmp/system-image-z9hrkv/channels.json
[systemimage] Nov 13 10:47:32 2013 (5513) got channel: trusty-proposed
[systemimage] Nov 13 10:47:32 2013 (5513) found channel/device entry: trusty-proposed/mako
[systemimage] Nov 13 10:47:32 2013 (5513) Requesting group download:
 https://system-image.ubuntu.com/trusty-proposed/mako/index.json -> /tmp/system-image-z9hrkv/index.json
 https://system-image.ubuntu.com/trusty-proposed/mako/index.json.asc -> /tmp/system-image-z9hrkv/index.json.asc

[systemimage] Nov 13 10:47:32 2013 (5513) Running group download reactor
[systemimage] Nov 13 10:47:32 2013 (5513) Group download reactor done
[systemimage] Nov 13 10:47:33 2013 (5513) Upgrade path is 20
[systemimage] Nov 13 10:47:33 2013 (5513) Requesting group download:
 http://system-image.ubuntu.com/pool/ubuntu-918c4fd296961d2c273658540f8072cfe03d57d5fdc97b750073f7daa0149987.tar.xz -> /android/cache/recovery/ubuntu-918c4fd296961d2c273658540f8072cfe03d57d5fdc97b750073f7daa0149987.tar.xz
 http://system-image.ubuntu.com/pool/ubuntu-918c4fd296961d2c273658540f8072cfe03d57d5fdc97b750073f7daa0149987.tar.xz.asc -> /android/cache/recovery/ubuntu-918c4fd296961d2c273658540f8072cfe03d57d5fdc97b750073f7daa0149987.tar.xz.asc
 http://system-image.ubuntu.com/pool/device-0e42c8ad6c2174c9f3fa4a580a620e57b9df4b3c0aed54b425f4e2990ad4f6f7.tar.xz -> /android/cache/recovery/device-0e42c8ad6c2174c9f3fa4a580a620e57b9df4b3c0aed54b425f4e2990ad4f6f7.tar.xz
 http://system-image.ubuntu.com/pool/device-0e42c8ad6c2174c9f3fa4a580a620e57b9df4b3c0aed54b425f4e2990ad4f6f7.tar.xz.asc -> /android/cache/recovery/device-0e42c8ad6c2174c9f3fa4a580a620e57b9df4b3c0aed54b425f4e2990ad4f6f7.tar.xz.asc
 http://system-image.ubuntu.com/trusty-proposed/mako/version-20...

Read more...

On Nov 13, 2013, at 09:57 AM, Vincent Ladeuil wrote:

>Note that the 7 minutes without feedback almost triggered a "bah, broken
>somehow, let's reboot" which get delayed because I was typing this
>comment ;)

Yeah, -vv would have spewed much more logging output to your console :).

The problem here is that s-i really isn't doing much at this point. It's
asked udm to download a bunch of files and all it's doing is passing through
the progress signals it gets from udm's D-Bus API. -vv shows you all those
progress signals. -v does not, but then it's also not showing much progress
on the command line.

I've thought about trying to do some occasional progress output when a single
-v is used, but haven't come up with a good idea yet. If you have thoughts,
or even a wish list, please open a new bug just for this.

P.S. The "ui wait for 10 minutes" will be solved with s-i 2.0.

Vincent Ladeuil (vila) wrote :

> Yeah, -vv would have spewed much more logging output to your console :).

Works for me, I only use the cmdline when I feel the UI is lying in some wa, so -vv it is.

PS Jenkins bot (ps-jenkins) wrote :

Fix committed into lp:ubuntu-download-manager at revision 173, scheduled for release in ubuntu-download-manager, milestone 0.3

Changed in ubuntu-download-manager:
status: Confirmed → Fix Committed
Barry Warsaw (barry) on 2013-11-27
Changed in ubuntu-system-image:
status: Fix Committed → Fix Released
Barry Warsaw (barry) on 2013-11-27
Changed in system-image (Ubuntu):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :
Download full text (5.8 KiB)

This bug was fixed in the package ubuntu-download-manager - 0.3+14.04.20131219-0ubuntu1

---------------
ubuntu-download-manager (0.3+14.04.20131219-0ubuntu1) trusty; urgency=low

  [ Loïc Minier ]
  * Always honor UBUNTU_DOWNLOADER_DEBUG; we should consider switching
    release builds to -UDEBUG, but this is too intrusive right now; LP:
    #1240656. (LP: #1240656)

  [ Michael McCracken ]
  * change command for 'make check' to work when not in tree (LP:
    #1249470). (LP: #1249470)

  [ Manuel de la Pena ]
  * Fix a number of logging problems found when we went to production.
    (LP: #1240656, #1241009, #1241005, #1240967)
  * Move download ids to be strings, that way applications have more
    predictable ways to connect to the current downloads. At the moment
    only not confined apps can use predictable ids. (LP: #1234965)
  * Group all system related files under the same dir.
  * Moved all downloads related to the same directory.
  * Renamed the DownloadDaemon to Daemon to remove the redundancy in the
    name.
  * Rename DownloadFactory to Factory to remove the redundancy in the
    name.
  * Rename the DownloaderQueue to Queue to remove the redundancy in the
    name.
  * Rename DownloadManager to Manager to remove the name redundancy.
  * Rename SIngleDownlaod to FileDownload so that it makes more sense.
  * Start using namespaces planning ahead when we have a client library.
  * Add the use of a new namespace for system realted classes.
  * Sort the files under the test project so that working on it is
    simpler.
  * Add documentation about the state machine to be used for the
    downloads and its initial skeleton.
  * Ensure that finished is raised when the group download is empty.
    (LP: #1245597)
  * Allow to pass the path of the service to be started so that it is
    easier to test. (LP: #1195657)
  * Create the download state machine and add the states defined in the
    docs diagram.
  * Provide a processing signal for when the downloading is done and a
    process is being executed. (LP: #1248770)
  * Provide setters and getters for the daemon command line args. Added
    self signed ssl certs for testing purposes. (LP: #1249336)
  * Add the header transitions. There is not yet nothing done because
    parsing the header info for the attachment name is harder than
    expected.
  * Ensure that group downloads do not allow two same local paths in the
    same group. (LP: #1250181)
  * Remove all the checks against NULL before calling delete. (LP:
    #1250409)
  * Remove the not needed pimpl pattern from the DBusConnection class.
    (LP: #1251008)
  * Remove the pimpl patter from the Application class. (LP: #1251003)
  * Add the transitions for the init state.
  * Added the downloading state transitions.
  * Remove the pimpl patter from the system network info. Because pimpl
    is out we can use the new signal connection. (LP: #1250946)
  * Remove the use of the pimpl pattern in the factory object because is
    not needed. Fix some issues with the tests after the change. (LP:
    #1250927)
  * Added downloading not connected state transitions.
  * Remove the pimpl pattern from the ProcessFactory. (LP: #1251269)
  ...

Read more...

Changed in ubuntu-download-manager (Ubuntu):
status: New → Fix Released
Changed in ubuntu-download-manager:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers