restore failed: cannot detect whether old instance is still running: only some instances were found

Bug #1425807 reported by Curtis Hovey
This bug affects 1 person
Affects    Status        Importance  Assigned to     Milestone
juju-core  Fix Released  Critical    Horacio Durán

Bug Description

functional-ha-backup-restore failed

ERROR:root:Restore failed:
2015-02-26 04:27:37 INFO juju.cmd supercommand.go:37 running juju [1.23-alpha1-precise-amd64 gc]
ERROR cannot detect whether old instance is still running: only some instances were found
error: exit status 1
2015-02-26 04:27:43 ERROR juju.cmd supercommand.go:430 subprocess encountered error code 1
Traceback (most recent call last):
  File "/var/lib/jenkins/juju-ci-tools/assess_recovery.py", line 208, in main
    restore_missing_state_server(client, backup_file)
  File "/var/lib/jenkins/juju-ci-tools/assess_recovery.py", line 129, in restore_missing_state_server
    raise Exception(message)
Exception: Restore failed:
2015-02-26 04:27:37 INFO juju.cmd supercommand.go:37 running juju [1.23-alpha1-precise-amd64 gc]
ERROR cannot detect whether old instance is still running: only some instances were found
error: exit status 1
2015-02-26 04:27:43 ERROR juju.cmd supercommand.go:430 subprocess encountered error code 1

The last pass was for commit 80daee2. Then 9 branches merged. This may relate to
Commit 80daee2 Merge pull request #1667 from perrito666/deprecate_restore_plugin …

Curtis Hovey (sinzui)
Changed in juju-core:
milestone: none → 1.23
Changed in juju-core:
assignee: nobody → Horacio Durán (hduran-8)
Dimiter Naydenov (dimitern) wrote :

Excerpt from the IRC discussion of my analysis of the likely cause for the blocker - http://paste.ubuntu.com/10432441/
and the proposed fix - https://github.com/juju/juju/pull/1693

Dimiter Naydenov (dimitern) wrote :

Nope, that won't fix the issue, but it will at least improve things a bit. With no bash syntax errors in the juju-backup script, we guarantee that when the job reports "juju-restore correctly refused to restore because the state-server was still up." the cause is not exit status 1 from bash, but the juju-backup refusing to do it (e.g. ERROR old bootstrap instance ["i-2cd64cc3" "i-265eacd6" "i-5f6b08a5"] still seems to exist; will not replace error: exit status 1).
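
To make the distinction concrete, here is a minimal Python sketch of how a test harness could tell a genuine refusal apart from an unrelated failure. It is an illustration only, not the code in assess_recovery.py: the function name classify_restore_failure and the category labels are hypothetical, and the only detail taken from this report is the wording of the refusal message.

```python
# Wording taken from the refusal message quoted in this report.
REFUSAL_MARKER = "still seems to exist; will not replace"


def classify_restore_failure(returncode, output):
    """Classify a restore attempt from its exit status and combined output.

    Hypothetical helper: the labels below are illustrative, not juju's.
    """
    if returncode == 0:
        return "succeeded"
    if REFUSAL_MARKER in output:
        # The command itself declined because the old instances are still up.
        return "refused"
    # Any other non-zero exit (shell syntax error, provider error, ...).
    return "other-failure"


# Sample outputs copied from the logs in this report.
refusal_output = (
    'ERROR old bootstrap instance ["i-2cd64cc3" "i-265eacd6" "i-5f6b08a5"] '
    "still seems to exist; will not replace\n"
    "error: exit status 1\n"
)
detect_output = (
    "ERROR cannot detect whether old instance is still running: "
    "only some instances were found\n"
    "error: exit status 1\n"
)

print(classify_restore_failure(1, refusal_output))
print(classify_restore_failure(1, detect_output))
```

Both sample runs exit with status 1, so the exit status alone cannot separate them; only the message text does.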

See this:

/mnt/jenkinshome/jobs/functional-ha-backup-restore/workspace/extracted-bin/usr/lib/juju-1.23-alpha1/bin/juju-backup: line 5: [: ==: unary operator expected
/mnt/jenkinshome/jobs/functional-ha-backup-restore/workspace/extracted-bin/usr/lib/juju-1.23-alpha1/bin/juju-backup: line 14: [: ==: unary operator expected
DEPRECATED: Use "juju backups create" instead of "juju backup".
backup ID: "20150226-032251.67412a10-d498-4057-84bc-7d6d85e21e03"
checksum: "z0U4g6RnIaEhqLOpDEjYiZYkKdk="
checksum format: "SHA-1, base64 encoded"
size (B): 20609472
stored: 2015-02-26 03:23:11 +0000 UTC
started: 2015-02-26 03:22:51.741872779 +0000 UTC
finished: 2015-02-26 03:23:04.317915569 +0000 UTC
notes: ""
environment ID: "67412a10-d498-4057-84bc-7d6d85e21e03"
machine ID: "3"
created on host: "ip-172-31-18-95"
juju version: 1.23-alpha1
20150226-032251.67412a10-d498-4057-84bc-7d6d85e21e03
downloading to juju-backup-20150226-032251.tar.gz

State-Server backup at /mnt/jenkinshome/jobs/functional-ha-backup-restore/workspace/juju-backup-20150226-032251.tar.gz
juju-restore correctly refused to restore because the state-server was still up.
WARNING: Could not find the instance_id in output:
2015-02-26 03:23:13 INFO juju.cmd supercommand.go:37 running juju [1.23-alpha1-precise-amd64 gc]
ERROR old bootstrap instance ["i-2cd64cc3" "i-265eacd6" "i-5f6b08a5"] still seems to exist; will not replace
error: exit status 1
2015-02-26 03:23:14 ERROR juju.cmd supercommand.go:430 subprocess encountered error code 1

The "juju-restore correctly ... " message appears *before* the actual error returned by the juju-restore command, because EnvJujuClient.backup() threw an exception earlier due to the bash syntax error.
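
For context, bash usually prints "unary operator expected" when an unquoted variable inside [ ... ] expands to nothing, which leaves the test with an operator and no left-hand operand. The juju-backup script itself is not shown here, so whether that is what happened on its lines 5 and 14 is an assumption. The Python sketch below only reproduces the message, using a hypothetical variable named VALUE, and shows that quoting the variable avoids it.

```python
import os
import shutil
import subprocess


def run_bash_test(expression):
    """Run a bash snippet; return (exit_status, stderr), or None if bash is not installed."""
    bash = shutil.which("bash")
    if bash is None:
        return None
    # LC_ALL=C keeps bash's error messages in English.
    env = dict(os.environ, LC_ALL="C")
    proc = subprocess.run([bash, "-c", expression], env=env,
                          capture_output=True, text=True)
    return proc.returncode, proc.stderr


# VALUE is a hypothetical variable name; it is unset in both snippets.
# Unquoted: $VALUE expands to nothing, so bash sees `[ == "x" ]`.
unquoted = run_bash_test('unset VALUE; [ $VALUE == "x" ]')
# Quoted: bash sees `[ "" == "x" ]`, an ordinary (false) string comparison.
quoted = run_bash_test('unset VALUE; [ "$VALUE" == "x" ]')

print("unquoted:", unquoted)
print("quoted:  ", quoted)
```

In the unquoted case the test fails with an error rather than a plain false result, so a caller that only checks for a non-zero exit status cannot tell a broken script from a real refusal.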

Anyway, I've suggested that perrito666 revert #1667, as "juju backups create" is apparently not yet stable enough to replace the juju-restore plugin.

So, once this lands it should hopefully unblock CI https://github.com/juju/juju/pull/1694

Changed in juju-core:
status: Triaged → Fix Committed
Ian Booth (wallyworld) wrote :

Marked as Fix Released, as the functional HA restore tests passed in the latest CI run, which is still executing.

Changed in juju-core:
status: Fix Committed → Fix Released
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.23 → 1.23-beta1