local-provider precise failed to upgrade
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| juju-core | | Critical | Dimiter Naydenov | |
| 1.22 | | Critical | Dimiter Naydenov | |
Bug Description
As of commit 2ba3166, the local-provider upgrade tests fail on precise. We see the agents are downloaded and started, but they never call the state-server. The env says they are still 1.21.2, but the log does show 1.23 is started.
The last passing commit was 570cda4. The suspect commits are:
commit c555742 Merge pull request #1634 from waigani/
commit 9141c96 Merge pull request #1598 from waigani/
commit f223dd1 Merge pull request #1625 from waigani/
commit cb8a990 Merge pull request #1648 from axw/tag-
commit 9d172de Merge pull request #1640 from axw/state-
commit 2ba3166 Merge pull request #1626 from wallyworld/
As the logs of various runs show, the upgrades were still in progress when the 10-minute timeout expired. As a hack, the upgrade timeout was extended to 20 minutes, and the test then passed in 18 minutes. Recent changes have increased the time needed for local precise upgrades.
The question is what to do. Does juju need to be faster? Should the timeout be extended? MAAS has a timeout of 30 minutes because it is very slow.
| Changed in juju-ci-tools: | |
| status: | New → Triaged |
| importance: | Undecided → Critical |
| Dimiter Naydenov (dimitern) wrote : | #1 |
| Changed in juju-core: | |
| status: | Triaged → In Progress |
| assignee: | nobody → Dimiter Naydenov (dimitern) |
| Dimiter Naydenov (dimitern) wrote : | #2 |
Found a working solution. The root of the problem is that cloud-init on precise (0.6.3) misinterprets apt-get install commands for packages that need --target-release precise-
1. Always set apt-get-update to true in cloud-init userdata when the series is precise (otherwise neither the cloud-tools archive will be added nor any packages will be installed)
2. Add "--target-release", "precise-
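The two steps above can be sketched roughly as follows. This is an illustration in Python, not juju's actual Go code: the helper name `build_userdata`, its parameters, and the dict layout are hypothetical, and the key name `apt_update` is assumed to correspond to the setting called apt-get-update above. The target release value is left as a placeholder because it is cut off in this report.

```python
def build_userdata(series, packages, target_release=None, apt_update=False):
    """Return a cloud-config style dict (hypothetical helper, for illustration)."""
    config = {"apt_update": apt_update, "packages": []}
    if series == "precise":
        # Step 1: on precise, always request an apt update first.
        config["apt_update"] = True
        if target_release:
            # Step 2: add the option and its value as separate list entries.
            config["packages"] += ["--target-release", target_release]
    config["packages"] += list(packages)
    return config


# "<target>" is a placeholder, not the real archive pocket name.
precise_cfg = build_userdata("precise", ["cloud-utils", "curl"], "<target>")
trusty_cfg = build_userdata("trusty", ["cloud-utils", "curl"], "<target>")
print(precise_cfg)
print(trusty_cfg)
```

For any series other than precise the sketch leaves both the update flag and the package list untouched, which matches the report's statement that only precise needs the special handling.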
After doing the above I managed to successfully bootstrap a trusty local environment and then manually add machines with series: precise, quantal, raring, saucy, trusty, utopic, and vivid. All of them succeeded, despite giving warnings for not finding some archive index files for quantal and raring (now past EOL I guess).
I'm continuing to test the same fix with a local environment bootstrapped on precise.
| no longer affects: | juju-ci-tools |
| Dimiter Naydenov (dimitern) wrote : | #3 |
I've successfully tested a precise local environment and found a few more things needed for the fix: even though precise cloud-init needs "--target-release" and "precise-
packages:
- --target-release
- precise-
- cloud-utils
- curl
- bridge-utils
becomes:
Installing package: --target-release precise-
Installing package: curl
Installing package: bridge-utils
when passed via sshinit. Confirmed the above works on newer cloud-init versions.
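One way to picture the transformation described above is a small grouping step. This is a sketch only: `group_install_commands` is a hypothetical name, the code is Python rather than juju's Go, and it assumes the truncated first "Installing package" line also carries the package that follows the option value (cloud-utils in the list above). `<target>` stands in for the value that is cut off in this report.

```python
def group_install_commands(packages):
    """Join '--target-release', its value, and the next package into one entry."""
    commands = []
    i = 0
    while i < len(packages):
        # Only group when both the value and a package follow the option.
        if packages[i] == "--target-release" and i + 2 < len(packages):
            commands.append(" ".join(packages[i:i + 3]))
            i += 3
        else:
            commands.append(packages[i])
            i += 1
    return commands


grouped = group_install_commands(
    ["--target-release", "<target>", "cloud-utils", "curl", "bridge-utils"]
)
for entry in grouped:
    print("Installing package:", entry)
```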
I'm about to propose the fix for 1.22.
| Dimiter Naydenov (dimitern) wrote : | #4 |
Fix proposed: https:/
Just to be on the safe side, I did a final test with MAAS - bootstrapping a trusty node, then adding a precise one. It worked without issues.
| Dimiter Naydenov (dimitern) wrote : | #5 |
The fix for 1.22 has landed, working on testing the port of the same fix for 1.23.
| Dimiter Naydenov (dimitern) wrote : | #6 |
Filed new bug 1425245 to improve the tests for the fix, as suggested on the review.
| Dimiter Naydenov (dimitern) wrote : | #7 |
Fix for 1.23 proposed: https:/
Exactly the same set of tests as before - all successful.
| Changed in juju-core: | |
| status: | In Progress → Fix Committed |
| Changed in juju-core: | |
| status: | Fix Committed → Fix Released |
| Changed in juju-core: | |
| milestone: | 1.23 → 1.23-beta1 |


I can confirm even local deploy from a trusty machine to a precise lxc container in 1.22 is broken:
E: Command line option --target-release precise-updates/cloud-tools cloud-utils cloud-image-utils is not understood
2015-02-24 09:50:32,914 - cc_apt_update_upgrade.py[WARNING]: Failed to install packages: ['--target-release precise-updates/cloud-tools cloud-utils cloud-image-utils', 'curl cpu-checker bridge-utils rsyslog-gnutls']
2015-02-24 09:50:32,916 - __init__.py[WARNING]: Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cloudinit/CloudConfig/__init__.py", line 117, in run_cc_modules
    cc.handle(name, run_args, freq=freq)
  File "/usr/lib/python2.7/dist-packages/cloudinit/CloudConfig/__init__.py", line 78, in handle
    [name, self.cfg, self.cloud, cloudinit.log, args])
  File "/usr/lib/python2.7/dist-packages/cloudinit/__init__.py", line 327, in sem_and_run
    func(*args)
  File "/usr/lib/python2.7/dist-packages/cloudinit/CloudConfig/cc_apt_update_upgrade.py", line 126, in handle
    raise errors[0]
CalledProcessError: Command '['apt-get', '--option', 'Dpkg::Options::=--force-confold', '--assume-yes', 'install', '--target-release precise-updates/cloud-tools cloud-utils cloud-image-utils', 'curl cpu-checker bridge-utils rsyslog-gnutls']' returned non-zero exit status 100
2015-02-24 09:50:32,916 - __init__.py[ERROR]: config handling of apt-update-upgrade, None, [] failed
2015-02-24 09:50:32,923 - cloud-init-cfg[ERROR]: errors running cloud_config [config]: ['apt-update-upgrade']
errors running cloud_config [config]: ['apt-update-upgrade']
I'm working on a fix for 1.22 and 1.23 - all deploys to precise (lxc or not) are affected, bootstrap to precise is fine though.
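The CalledProcessError in the log shows the option, its value, and the package names reaching apt-get as a single list element, which is why apt-get reports the whole string as an option it does not understand. The Python sketch below only illustrates the difference between that single argument and properly separated tokens. The joined string is reassembled from the log in this report; using `shlex.split` is an illustration, not how juju or cloud-init handle it.

```python
import shlex

# One list element containing spaces, as seen in the failing command.
joined = "--target-release precise-updates/cloud-tools cloud-utils cloud-image-utils"

# What the failing run effectively passed: the whole string as one argument.
bad_argv = ["apt-get", "--assume-yes", "install", joined]

# What apt-get needs: the option, its value, and each package as separate arguments.
good_argv = ["apt-get", "--assume-yes", "install"] + shlex.split(joined)

print(bad_argv)
print(good_argv)
```

The commands are only built and printed here, never executed, so the sketch is safe to run on any machine.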