curtin should retry fetching from archives after transient failure
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
curtin | Fix Released | High | Blake Rouse | |
curtin (Ubuntu) | Fix Released | Medium | Unassigned | |
Trusty | Fix Released | Medium | Unassigned | |
Vivid | Fix Released | Medium | Unassigned | |
Bug Description
=== Begin SRU Template ===
[Description]
During installation, curtin runs 'apt-get update' in the target root. This is a prerequisite for installing new packages in the target.
'apt-get update' is widely known to fail as a result of transient network failures. This is commonly worked around by simply sleeping and retrying the operation.
The solution implemented is to extend the 'subp' (subprocess) helper in curtin/util to take a 'retries' argument.
If provided, it is an iterable in which each item is the time to sleep before trying again. If no retries value is provided, only one attempt is made.
The apt_update helper in curtin/util.py then invokes subp with retries=(1, 2, 3).
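The retry behaviour described above could look roughly like the following. This is a minimal sketch, not the actual curtin code: the ProcessExecutionError class, the injectable 'sleep' parameter, and the exact 'chroot' command line in apt_update are assumptions made for illustration.

```python
import subprocess
import time


class ProcessExecutionError(Exception):
    """Hypothetical error type raised when the final attempt fails."""

    def __init__(self, cmd, exit_code, stdout, stderr):
        super().__init__(
            "Unexpected error while running command: %r (exit code %d)"
            % (cmd, exit_code))
        self.cmd = cmd
        self.exit_code = exit_code
        self.stdout = stdout
        self.stderr = stderr


def subp(args, retries=None, sleep=time.sleep):
    """Run args; after a non-zero exit, sleep and try again.

    'retries' is an iterable of sleep times. One extra attempt is made
    per item, so retries=(1, 2, 3) allows up to four attempts in total.
    Without 'retries', only one attempt is made.
    """
    # The trailing None marks the last attempt, after which we give up.
    waits = list(retries or []) + [None]
    for wait in waits:
        proc = subprocess.run(
            args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        if proc.returncode == 0:
            # Success is returned immediately and never retried.
            return proc.stdout, proc.stderr
        if wait is None:
            raise ProcessExecutionError(
                args, proc.returncode, proc.stdout, proc.stderr)
        sleep(wait)


def apt_update(target):
    # Assumed command layout; the real helper may pass other options.
    return subp(["chroot", target, "apt-get", "update"], retries=(1, 2, 3))
```

Because the exit status check is the only thing that decides whether to retry, a successful 'apt-get update' returns on the first attempt exactly as before.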
[Impact]
Installation fails when a simple retry of 'apt-get update' would have succeeded.
[Test Case]
As this is a transient failure, it is hard to catch and hard to test for.
Installation should be more reliable now: any 'apt-get update' operation that returns non-zero is retried up to 3 times.
[Regression Potential]
The only likely regression path here would be retrying 'apt-get update' even after it returns successfully. That seems fairly unlikely, as the code in subp that checks the exit status has not changed.
[Other]
Related bugs:
* bug 972077: apt repository disk format has race conditions
=== End SRU Template ===
We run into transient network issues where index files fail to download. The deployment ends up being marked as failed. A subsequent deployment then succeeds, but the test has already failed. Curtin should be able to retry when such an error happens.
Here's console output:
=======
Get:28 http://
Get:29 http://
Get:30 http://
Fetched 13.8 MB in 5s (2426 kB/s)
W: Failed to fetch http://
E: Some index files failed to download. They have been ignored, or old ones used instead.
Unexpected error while running command.
Command: ['chroot', '/tmp/tmp8mxme7
Exit code: 100
Reason: -
Stdout: ''
Stderr: ''
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'curthooks']
Exit code: 3
Reason: -
Stdout: "Ign http://
Stderr: ''
Success
ci-info: +++++++Authorized keys
=======
Related branches
- curtin developers: Pending requested
- Diff: 106 lines (+43/-9), 2 files modified
  curtin/deps/install.py (+17/-4)
  curtin/util.py (+26/-5)
description: | updated |
tags: | added: oil |
Changed in curtin: | |
status: | New → Triaged |
importance: | Undecided → High |
Changed in curtin (Ubuntu): | |
importance: | Undecided → Medium |
status: | New → In Progress |
Changed in curtin (Ubuntu Trusty): | |
status: | New → Confirmed |
Changed in curtin (Ubuntu Vivid): | |
status: | New → Confirmed |
Changed in curtin (Ubuntu Trusty): | |
importance: | Undecided → Medium |
Changed in curtin (Ubuntu Vivid): | |
importance: | Undecided → Medium |
description: | updated |
tags: | added: verification-done, removed: verification-needed |
Juju charm helpers does this to handle transient errors better:
http://bazaar.launchpad.net/~charm-helpers/charm-helpers/devel/view/head:/charmhelpers/fetch/__init__.py#L405