Upgrades from 1.20.11 to 1.25.2 fail because of status

Bug #1519995 reported by Curtis Hovey
This bug affects 1 person
Affects     Status        Importance  Assigned to  Milestone
juju-core   Invalid       Undecided   Unassigned
1.25        Fix Released  Critical    Tim Penhey

Bug Description

As seen in
    http://reports.vapour.ws/releases/3357/job/aws-upgrade-20-trusty-amd64/attempt/333
the aws-upgrade-20-trusty-amd64 job is failing because status hangs for 300 seconds.

The results are consistent for 1.25.2 across 5 retests. In the example, after the call to upgrade, the script starts polling the progress using status. We see two dots, indicating that the first two calls to status returned quickly
    1.20.11: 1, 0, 2, dummy-sink/0, dummy-source/0 ..
but the 3rd call to status hung for 300 seconds. At that point a timeout is raised because there was no visibility into the progress of the upgrade. By the time the script reports this, we often see that the upgrade has succeeded or is very near completion.

So we know upgrades work. Anyone who interrupted the hung status and tried again would probably see that status succeed and that the upgrade was progressing.
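
Since a retried status would probably succeed, one way a polling script could keep visibility is to bound each status attempt separately from the overall wait. The following is a minimal sketch of that idea, not the juju-ci-tools code; poll_command and its parameters are hypothetical names chosen for illustration.

```python
import subprocess
import time


def poll_command(cmd, overall_timeout, per_call_timeout, interval=1.0):
    """Run cmd repeatedly until it succeeds or overall_timeout elapses.

    Hypothetical helper: each attempt is killed after per_call_timeout
    seconds, so a single hung call cannot use up the whole budget.
    """
    deadline = time.monotonic() + overall_timeout
    last_error = None
    while time.monotonic() < deadline:
        try:
            # check_output raises on a non-zero exit or on timeout.
            return subprocess.check_output(cmd, timeout=per_call_timeout)
        except (subprocess.TimeoutExpired,
                subprocess.CalledProcessError) as e:
            last_error = e
        time.sleep(interval)
    raise RuntimeError(
        'Timed out waiting for %s to succeed: %s' % (cmd[0], last_error))


# Possible use (the timeout values are assumptions, not from the report):
# poll_command(['juju', 'status', '--format', 'yaml'], 300, 30)
```

With a short per-call timeout, a hung third call would be abandoned and retried rather than blocking for the full 300 seconds.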

1.20.11: 1, 0, 2, dummy-sink/0, dummy-source/0 ..
2015-11-25 07:59:28 ERROR Timed out waiting for juju status to succeed: Command 'juju' returned non-zero exit status 1
Traceback (most recent call last):
  File "/mnt/jenkinshome/juju-ci-tools/deploy_stack.py", line 477, in boot_context
    yield
  File "/mnt/jenkinshome/juju-ci-tools/deploy_stack.py", line 559, in _deploy_job
    assess_upgrade(client, juju_path, skip_juju_run)
  File "/mnt/jenkinshome/juju-ci-tools/deploy_stack.py", line 310, in assess_upgrade
    client.wait_for_version(client.get_matching_agent_version(), timeout)
  File "/mnt/jenkinshome/juju-ci-tools/jujupy.py", line 494, in wait_for_version
    versions = self.get_status(300).get_agent_versions()
  File "/mnt/jenkinshome/juju-ci-tools/jujupy.py", line 319, in get_status
    'Timed out waiting for juju status to succeed: %s' % e)
Exception: Timed out waiting for juju status to succeed: Command 'juju' returned non-zero exit status 1

The Python script calls juju status, parses the YAML output, and returns a helper object for getting data out of status.
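
A rough sketch of what such a helper might do with the parsed status, producing the kind of "version: agents" line shown in the log above. This is an illustration only: group_agent_versions and format_versions are hypothetical names, and the layout assumed here (machines and service units each carrying an agent-version key) is an assumption about the parsed YAML, not something taken from the report.

```python
from collections import defaultdict


def group_agent_versions(status):
    """Map each agent version to the set of machines and units running it.

    Assumes `status` is the already-parsed status document (a dict).
    """
    versions = defaultdict(set)
    for machine_name, machine in status.get('machines', {}).items():
        versions[machine.get('agent-version', 'unknown')].add(machine_name)
    for service in status.get('services', {}).values():
        for unit_name, unit in service.get('units', {}).items():
            versions[unit.get('agent-version', 'unknown')].add(unit_name)
    return versions


def format_versions(versions):
    """Render one 'version: agent, agent' line per version, sorted."""
    return '\n'.join(
        '%s: %s' % (version, ', '.join(sorted(agents)))
        for version, agents in sorted(versions.items()))
```

An upgrade check can then wait until the grouping contains a single key equal to the target version.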

Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Incomplete
tags: added: blockr
tags: added: blockrer
removed: blockr
tags: added: blocker
removed: blockrer
Tim Penhey (thumper)
Changed in juju-core:
status: Incomplete → Invalid
Canonical Juju QA Bot (juju-qa-bot) wrote : Fix Released in juju-core 1.25

Juju-CI verified that this issue is Fix Released in juju-core 1.25:
    http://reports.vapour.ws/releases/3374
