apt won't redownload Release.gpg after inconsistent cache updates made while UCA is being updated

Bug #1657440 reported by Andreas Hasenack on 2017-01-18
This bug affects 5 people
Affects         Status         Importance   Assigned to   Milestone
APT             Fix Released   Unknown
apt (Ubuntu)    Fix Released   Medium       Unassigned
Xenial          Fix Released   Medium       Unassigned
Yakkety         Fix Released   Medium       Unassigned

Bug Description

# apt --version
apt 1.2.18 (amd64)

xenial

I got myself into a situation where a repository has a Release and a Release.gpg file, but apt is just ignoring the gpg one and won't download it via apt update for some reason:

The repository in question is http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/xenial-updates/newton/. See how locally I have just the Release file:

root@juju-cb14ed-0-lxd-3:/var/lib/apt/lists# l *Release*
-rw-r--r-- 1 root root 100K Jan 15 18:03 archive.ubuntu.com_ubuntu_dists_xenial-backports_InRelease
-rw-r--r-- 1 root root 242K Apr 21 2016 archive.ubuntu.com_ubuntu_dists_xenial_InRelease
-rw-r--r-- 1 root root 100K Jan 18 11:42 archive.ubuntu.com_ubuntu_dists_xenial-updates_InRelease
-rw-r--r-- 1 root root 100K Jan 18 11:42 security.ubuntu.com_ubuntu_dists_xenial-security_InRelease
-rw-r--r-- 1 root root 7.7K Jan 18 11:45 ubuntu-cloud.archive.canonical.com_ubuntu_dists_xenial-updates_newton_Release

Now I try an update. See how the Release.gpg file gets a "Hit:" instead of a "Get:":
root@juju-cb14ed-0-lxd-3:/var/lib/apt/lists# apt update
Get:1 http://security.ubuntu.com/ubuntu xenial-security InRelease [102 kB]
Hit:2 http://archive.ubuntu.com/ubuntu xenial InRelease
Ign:3 http://ubuntu-cloud.archive.canonical.com/ubuntu xenial-updates/newton InRelease
Get:4 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [102 kB]
Hit:5 http://ubuntu-cloud.archive.canonical.com/ubuntu xenial-updates/newton Release
Get:6 http://ubuntu-cloud.archive.canonical.com/ubuntu xenial-updates/newton Release.gpg [543 B]
Hit:7 http://archive.ubuntu.com/ubuntu xenial-backports InRelease
Fetched 205 kB in 0s (395 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
8 packages can be upgraded. Run 'apt list --upgradable' to see them.

And I can't install packages:
root@juju-cb14ed-0-lxd-3:/var/lib/apt/lists# apt dist-upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following NEW packages will be installed:
  python3-setuptools
The following packages will be upgraded:
  dh-python dnsmasq-base python-pkg-resources python-setuptools python3-cryptography python3-pkg-resources python3-requests python3-urllib3
8 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 1,193 kB of archives.
After this operation, 808 kB of additional disk space will be used.
Do you want to continue? [Y/n]
WARNING: The following packages cannot be authenticated!
  dh-python dnsmasq-base python-setuptools python-pkg-resources python3-pkg-resources python3-setuptools python3-cryptography python3-requests python3-urllib3
Install these packages without verification? [y/N] n
E: Some packages could not be authenticated
root@juju-cb14ed-0-lxd-3:/var/lib/apt/lists#

Somehow apt is thinking it has the Release.gpg file, but it doesn't?
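
One way to spot this state is to scan the lists directory for a Release file that has no Release.gpg next to it. The following is a minimal Python sketch, not part of the original report; the function name find_unsigned_release_files is made up for illustration, and the default path is the /var/lib/apt/lists directory shown in the listing above.

```python
from pathlib import Path


def find_unsigned_release_files(lists_dir="/var/lib/apt/lists"):
    """Return names of *_Release files that have no matching *_Release.gpg.

    *_InRelease files carry an inline signature and do not match the
    "*_Release" pattern, so they are never reported.
    """
    unsigned = []
    for release in sorted(Path(lists_dir).glob("*_Release")):
        # The detached signature is expected at "<name>_Release.gpg".
        signature = release.with_name(release.name + ".gpg")
        if not signature.exists():
            unsigned.append(release.name)
    return unsigned
```

On the machine from the listing above, calling find_unsigned_release_files() would be expected to report the ubuntu-cloud.archive.canonical.com ..._newton_Release entry, since that is the only Release file there without a signature.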

This server is behind a squid proxy.

[Impact]
Running apt update against an apt repository that does not use InRelease while that repository is being updated can cause the gpg file to not be downloaded and updated. As a result, packages from the repository cannot be authenticated.

The Ubuntu Cloud Archive is one of the archives that meets this criterion.

The impact on downstream deployment automation is that if it adds the UCA repo to a system and calls apt update while the UCA is being updated by Canonical, the repo can get into a state where the Release.gpg file is missing and all package installs fail with the "unauthenticated packages" error.

[Test Case]
A detailed python script was attached.

To reproduce this outside that script you would want to:
1. Add the UCA repo
2. Do the following in a loop starting at 43 minutes after the hour and run it until 55 minutes after the hour:
2.1 Remove these files to simulate the UCA repo being added the first time.
/var/lib/apt/lists/ubuntu-cloud.archive.canonical.com_ubuntu_dists_xenial-updates_newton_Release
/var/lib/apt/lists/ubuntu-cloud.archive.canonical.com_ubuntu_dists_xenial-updates_newton_Release.gpg
/var/lib/apt/lists/ubuntu-cloud.archive.canonical.com_ubuntu_dists_xenial-updates_newton_main_binary*Packages

2.2 apt-get update
3. Check the state of the 3 files you deleted. If you have the _Release file but not the _Release.gpg you have recreated the issue.
4. If you have not recreated the issue, go back to step 2 and continue the loop.
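
The steps above can be sketched in Python roughly as follows. This is not the attached script: it uses only the standard library, the helper names (remove_cached_files, is_broken, in_window, reproduce) and the 5-second pause between runs are assumptions made for illustration, and it has to run as root because it deletes files under /var/lib/apt/lists and calls apt-get update.

```python
import glob
import os
import subprocess
import time

LISTS_DIR = "/var/lib/apt/lists"
# File name prefix taken from step 2.1 above.
PREFIX = "ubuntu-cloud.archive.canonical.com_ubuntu_dists_xenial-updates_newton_"


def remove_cached_files(lists_dir=LISTS_DIR, prefix=PREFIX):
    """Step 2.1: delete Release, Release.gpg and the Packages indexes."""
    for pattern in ("Release", "Release.gpg", "main_binary*Packages"):
        for path in glob.glob(os.path.join(lists_dir, prefix + pattern)):
            os.remove(path)


def is_broken(lists_dir=LISTS_DIR, prefix=PREFIX):
    """Step 3: the Release file is present but Release.gpg is missing."""
    release = os.path.join(lists_dir, prefix + "Release")
    return os.path.exists(release) and not os.path.exists(release + ".gpg")


def in_window(minute, start=43, end=55):
    """True between 43 and 55 minutes past the hour (inclusive)."""
    return start <= minute <= end


def reproduce(update_cmd=("apt-get", "update"), pause=5):
    """Repeat steps 2.1 to 3 while inside the window.

    Returns True as soon as the broken state is seen, False once the
    window has passed (or immediately if started outside the window).
    """
    while in_window(time.localtime().tm_min):
        remove_cached_files()
        subprocess.run(list(update_cmd), check=False)  # step 2.2
        if is_broken():
            return True
        time.sleep(pause)  # hypothetical pause, not from the report
    return False
```

The path arguments are parameters so the file handling can be tried against a scratch directory before pointing it at the real lists directory.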

[Regression Potential]
Unknown

Andreas Hasenack (ahasenack) wrote :

The proxy shows no attempts to download Release.gpg, just InRelease:

18/Jan/2017:12:29:02 +0000 69 y.y.y.y TCP_MISS/404 631 GET http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/xenial-updates/newton/InRelease - FIRSTUP_PARENT/x.x.x.x text/html
18/Jan/2017:12:29:02 +0000 67 y.y.y.y TCP_REFRESH_UNMODIFIED/304 335 GET http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/xenial-updates/newton/Release - FIRSTUP_PARENT/x.x.x.x -
18/Jan/2017:12:29:02 +0000 133 y.y.y.y TCP_REFRESH_UNMODIFIED/304 420 GET http://archive.ubuntu.com/ubuntu/dists/xenial/InRelease - FIRSTUP_PARENT/x.x.x.x -
18/Jan/2017:12:29:02 +0000 134 y.y.y.y TCP_REFRESH_UNMODIFIED/304 438 GET http://security.ubuntu.com/ubuntu/dists/xenial-security/InRelease - FIRSTUP_PARENT/x.x.x.x -
18/Jan/2017:12:29:02 +0000 67 y.y.y.y TCP_REFRESH_UNMODIFIED/304 423 GET http://archive.ubuntu.com/ubuntu/dists/xenial-updates/InRelease - FIRSTUP_PARENT/x.x.x.x -
18/Jan/2017:12:29:02 +0000 66 y.y.y.y TCP_REFRESH_UNMODIFIED/304 420 GET http://archive.ubuntu.com/ubuntu/dists/xenial-backports/InRelease - FIRSTUP_PARENT/x.x.x.x -
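
For longer logs, the same check can be scripted. The sketch below is only an illustration: it assumes whitespace-separated fields in exactly the order of the excerpt above (timestamp, offset, elapsed time, client, result/status, size, method, URL), and the function names requested_urls and was_requested are hypothetical.

```python
def requested_urls(log_lines):
    """Yield (http_status, url) for each GET line in the proxy log."""
    for line in log_lines:
        fields = line.split()
        if len(fields) < 8 or fields[6] != "GET":
            continue
        # "TCP_MISS/404" -> "404"
        status = fields[4].split("/")[-1]
        yield status, fields[7]


def was_requested(log_lines, filename):
    """True if any requested URL ends with "/<filename>"."""
    return any(url.endswith("/" + filename)
               for _, url in requested_urls(log_lines))
```

Given the six lines above, was_requested(lines, "Release.gpg") is False while was_requested(lines, "Release") is True, which matches the observation that only InRelease and Release were fetched.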

Andreas Hasenack (ahasenack) wrote :

This was reproduced by paelzer in #ubuntu-devel earlier:

http://paste.ubuntu.com/23821784/

David Kalnischkies (donkult) wrote :

That sounds like what this commit describes: https://anonscm.debian.org/cgit/apt/apt.git/commit/?id=84eec207be35b8c117c430296d4c212b079c00c1
Hence tagged as such, as it's available in the 1.4 series. Not sure if this should be backported to 1.2 or not.

Changed in apt (Ubuntu):
status: New → Fix Committed
Samuel Matzek (smatzek) wrote :

This upstream patch needs to be backported to the 1.2 series for Xenial. If left unfixed in Xenial, it opens a timing window every hour with Ubuntu Cloud Archive in which users can fall into the "Some packages could not be authenticated" state without Release.gpg that is described in the description. This state will not self-correct until an hour has passed and another apt-get update is run. This really impacts automated deployment technologies such as Juju and Ansible, because if they hit this hourly window with UCA, their fallback retries of apt-get update will not work and the automated deployments fail. The noted upstream Debian bug is specifically about trying to do an apt-get update without adding the keys first. The timing window that users can hit will occur even if you add the keys first.

Now for background information to explain the assertions above. Ubuntu Cloud Archive updates its files and, more importantly, the timestamps on its files, including the Release and Release.gpg files, every hour. The timestamps are set to 45 minutes past the hour. The UCA servers start to reflect these changes around 50 minutes after the hour, with a rolling update of the Packages files and then the Release.* files. They are not updated as an atomic unit as seen from an HTTP client.

So the order of events is:
1. User or automation adds keys by installing the 'ubuntu-cloud-keyring' apt package.
2. User adds the UCA repo using the Ansible apt_repository module or another technique, possibly just adding the repo to a sources list file under /etc/apt/sources.list.d.
3. Either the tooling (apt_repository module) or the user triggers an apt-get update or some other apt cache update. If this cache update hits the timing window when UCA is being updated, you can get into the state where you have the Release file but not the Release.gpg file, without the cache update or apt-get update reporting a failure. A Python reproduction program that uses python-apt directly can show this. I will attach my recreation program and output showing the error case.
4. At this point, as shown in the original description, no further apt-get updates will fix the situation and any package installs from UCA will fail with "Some packages could not be authenticated".

While the timing window may seem small, probably a minute each hour, with complex multi-node OpenStack deployments using Ansible we are seeing this occur fairly frequently. Given the 'juju' in the host name in the original description I suspect that multi-node orchestrated Juju charm deployments using UCA are also hitting this often.

The bug is particularly harmful to automated deploy tooling because, while the deploy tooling normally has apt-get update retries or periodic updates throughout the process, once this error state is entered the apt-get updates do not correct it until after an hour has passed and UCA has updated itself. The deployment tooling normally times out and fails much sooner than an hour of retries.

Here is the annotated log output of the recreation script:
#####
# In this snippet we see apt update not pulling down the Release.gpg that was deleted right before the update to test
# its ability to pull down a ne...


Samuel Matzek (smatzek) wrote :
Changed in apt (Ubuntu):
status: Fix Committed → Confirmed
Vej (vej) on 2017-02-09
summary: - apt won't redownload Release.gpg
+ apt won't redownload Release.gpg after inconsistent cache updates made
+ while UCA is being updated
Julian Andres Klode (juliank) wrote :

Fixed in 1.4~beta1. This will be cherry-picked into the 1.3 and 1.2 (yakkety and xenial) branches in the coming weeks.

Changed in apt (Ubuntu):
status: Confirmed → Fix Released
Changed in apt (Ubuntu Xenial):
status: New → Triaged
Changed in apt (Ubuntu Yakkety):
status: New → Triaged
Julian Andres Klode (juliank) wrote :

@Samuel Matzek (smatzek) Your comment is far too long, I did not read that. I only read the first sentence, and thus conclude that you believe the patch fixes this issue.

If you want to help, provide *concise* instructions to test this by editing the bug report using the guidelines specified in https://wiki.ubuntu.com/StableReleaseUpdates

Changed in apt:
status: Unknown → Fix Released
Changed in apt (Ubuntu):
importance: Undecided → Medium
Changed in apt (Ubuntu Xenial):
importance: Undecided → Medium
Changed in apt (Ubuntu Yakkety):
importance: Undecided → Medium
Vej (vej) on 2017-02-13
tags: added: xenial yakkety
Samuel Matzek (smatzek) wrote :

[Impact]
Running apt update against an apt repository that does not use InRelease while that repository is being updated can cause the gpg file to not be downloaded and updated. As a result, packages from the repository cannot be authenticated.

The Ubuntu Cloud Archive is one of the archives that meets this criterion.

The impact on downstream deployment automation is that if it adds the UCA repo to a system and calls apt update while the UCA is being updated by Canonical, the repo can get into a state where the Release.gpg file is missing and all package installs fail with the "unauthenticated packages" error.

[Test Case]
A detailed python script was attached.

To reproduce this outside that script you would want to:
1. Add the UCA repo
2. Do the following in a loop starting at 43 minutes after the hour and run it until 55 minutes after the hour:
2.1 Remove these files to simulate the UCA repo being added the first time.
/var/lib/apt/lists/ubuntu-cloud.archive.canonical.com_ubuntu_dists_xenial-updates_newton_Release
/var/lib/apt/lists/ubuntu-cloud.archive.canonical.com_ubuntu_dists_xenial-updates_newton_Release.gpg
/var/lib/apt/lists/ubuntu-cloud.archive.canonical.com_ubuntu_dists_xenial-updates_newton_main_binary*Packages

2.2 apt-get update
3. Check the state of the 3 files you deleted. If you have the _Release file but not the _Release.gpg you have recreated the issue.
4. If you have not recreated the issue, go back to step 2 and continue the loop.

[Regression Potential]
Unknown

Changed in apt (Ubuntu Xenial):
status: Triaged → In Progress
Changed in apt (Ubuntu Yakkety):
status: Triaged → In Progress
Łukasz Zemczak (sil2100) wrote :

Since this bug (and a few others) is mentioned in the SRU changelog, please update the description to include the SRU template. There seems to be a master bug for the SRU, but each bug should *at least* have a clearly written test-case. Thanks!

Vej (vej) wrote :

@Lukasz This one has an SRU template in comment #8.

I copied it into the main description, *without* checking if this works to reproduce the bug.

description: updated

Hello Andreas, or anyone else affected,

Accepted apt into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/apt/1.2.20 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in apt (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed
Samuel Matzek (smatzek) wrote :

I set up a test environment to re-run the recreation script I attached above while using the fix from xenial-proposed.

What I found was that the fix helps but is not perfect.

While the UCA repo is in the hourly update time window, apt-get update can still leave the user in the error case where the Release file is present but the Release.gpg file is not. However, WITH this fix a subsequent apt-get update resolves the issue and pulls down the Release.gpg file. In contrast, WITHOUT the fix no number of apt-get update calls would fix the issue until after the next hourly UCA update.

So my verdict is that this fix should go through as it allows automated tooling to simply do apt-get update retries and self-resolve the missing gpg issue.
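
As a rough illustration of what such a retry could look like in tooling, here is a minimal Python sketch. The function name update_until_signed, the number of attempts and the 30-second delay are assumptions for illustration only, and it applies only to repositories that publish Release and Release.gpg rather than InRelease. The run and sleep parameters exist so the logic can be exercised without calling apt.

```python
import os
import subprocess
import time


def update_until_signed(release_path, update_cmd=("apt-get", "update"),
                        attempts=5, delay=30,
                        run=subprocess.run, sleep=time.sleep):
    """Run the update command until release_path + ".gpg" exists.

    Returns the number of update runs it took, or None if the signature
    is still missing after `attempts` runs.
    """
    signature = release_path + ".gpg"
    for attempt in range(1, attempts + 1):
        run(list(update_cmd), check=False)
        if os.path.exists(signature):
            return attempt
        if attempt < attempts:
            sleep(delay)  # hypothetical wait between retries
    return None
```

With an apt that lacks the fix, this loop would be expected to run out of attempts, since repeated updates do not fetch Release.gpg until the next hourly UCA update.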

Any further changes are probably required in the Ubuntu Cloud Archive itself to close the "partially updated" window that is part of the error case trigger.

Vej (vej) wrote :

> So my verdict is that this fix should go through as it allows automated tooling to simply do apt-get update retries and self-resolve the missing gpg issue.
I agree and will update the tags accordingly.

tags: added: verification-done-xenial verification-needed-yakkety
removed: verification-needed
Chris J Arges (arges) wrote :

Hello Andreas, or anyone else affected,

Accepted apt into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/apt/1.3.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in apt (Ubuntu Yakkety):
status: In Progress → Fix Committed
tags: added: verification-needed
Jon Grimm (jgrimm) on 2017-03-27
tags: removed: verification-needed
Julian Andres Klode (juliank) wrote :

Verified in yakkety, running the script once with 1.3.4 where it fails, and once with 1.3.5 where it succeeds.

tags: added: verification-done-yakkety
removed: verification-needed-yakkety
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package apt - 1.3.5

---------------
apt (1.3.5) yakkety; urgency=medium

  * Microrelease covering important fixes of 1.4~rc2 (LP: #1668280)

  [ David Kalnischkies ]
  * don't install new deps of candidates for kept back pkgs
  * keep Release.gpg on untrusted to trusted IMS-Hit (Closes: 838779)
    (LP: #1657440)
  * reset HOME, USER(NAME), TMPDIR & SHELL in DropPrivileges (Closes: 842877)
  * add TMP/TEMP/TEMPDIR to the TMPDIR DropPrivileges dance
  * react to trig-pend only if we have nothing else to do
  * correct cross & disappear progress detection
  * improve arch-unqualified dpkg-progress parsing
  * don't perform implicit crossgrades involving M-A:same
  * do not configure unconfigured to be removed packages
  * skip unconfigure for unconfigured to-be removed pkgs
  * get pdiff files from the same mirror as the index
  * let {dsc,tar,diff}-only implicitly enable download-only
  * ensure generation of valid EDSP error stanzas
  * fix minimum pkgs option for dpkg --recursive usage
  * don't show update stats if cache generation is disabled
  * don't lock dpkg in 'apt-get clean'
  * don't lock dpkg in update commands
  * avoid validate/delete/load race in cache generation
  * fix 'install --no-download' mode
  * remove 'old' FAILED files in the next acquire call (Closes: 846476)
  * stop rred from leaking debug messages on recovered errors (Closes: #850759)

  [ Edgar Fuß ]
  * http: clear content before reporting the failure (Closes: #465572)

  [ Paul Wise ]
  * show output as documented for APT::Periodic::Verbose 2 (Closes: 845599)

  [ John R. Lenton ]
  * bash-completion: Only complete understood file paths for install
    (LP: #1645815)

  [ Lukasz Kawczynski ]
  * Honour Acquire::ForceIPv4/6 in the https transport

  [ Julian Andres Klode ]
  * basehttp: Only read Content-Range on 416 and 206 responses (LP: #1657567)
  * Only merge acquire items with the same meta key (Closes: #838441)
  * Do not package names representing .dsc/.deb/... files (Closes: #854794)
  * Don't use -1 fd and AT_SYMLINK_NOFOLLOW for faccessat()
    Thanks to James Clarke for debugging these issues
  * CMake: Install statvfs.h to include/sys, not just include/

 -- Julian Andres Klode <email address hidden> Mon, 27 Feb 2017 15:02:40 +0100

Changed in apt (Ubuntu Yakkety):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for apt has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package apt - 1.2.20

---------------
apt (1.2.20) xenial; urgency=medium

  * Microrelease covering fixes of 1.4~rc2 (LP: #1668285)

  [ David Kalnischkies ]
  * don't install new deps of candidates for kept back pkgs
  * keep Release.gpg on untrusted to trusted IMS-Hit (Closes: 838779)
    (LP: #1657440)
  * reset HOME, USER(NAME), TMPDIR & SHELL in DropPrivileges (Closes: 842877)
  * add TMP/TEMP/TEMPDIR to the TMPDIR DropPrivileges dance
  * let {dsc,tar,diff}-only implicitly enable download-only
  * don't show update stats if cache generation is disabled
  * don't lock dpkg in 'apt-get clean'
  * don't lock dpkg in update commands
  * avoid validate/delete/load race in cache generation
  * remove 'old' FAILED files in the next acquire call (Closes: 846476)
  * stop rred from leaking debug messages on recovered errors (Closes: #850759)

  [ Paul Wise ]
  * show output as documented for APT::Periodic::Verbose 2 (Closes: 845599)

  [ John R. Lenton ]
  * bash-completion: Only complete understood file paths for install
    (LP: #1645815)

  [ Lukasz Kawczynski ]
  * Honour Acquire::ForceIPv4/6 in the https transport

  [ Julian Andres Klode ]
  * basehttp: Only read Content-Range on 416 and 206 responses (LP: #1657567)
  * Only merge acquire items with the same meta key (Closes: #838441)
  * Do not package names representing .dsc/.deb/... files (Closes: #854794)
  * Don't use -1 fd and AT_SYMLINK_NOFOLLOW for faccessat()
    Thanks to James Clarke for debugging these issues

 -- Julian Andres Klode <email address hidden> Mon, 27 Feb 2017 15:29:18 +0100

Changed in apt (Ubuntu Xenial):
status: Fix Committed → Fix Released