Restore doesn't

Bug #1336967 reported by Curtis Hovey
This bug affects 1 person
Affects     Status         Importance  Assigned to  Milestone
juju-core   Fix Released   High        Unassigned
1.20        Fix Released   Critical    Unassigned

Bug Description

Restore from backup fails frequently; success is rare. There is no single cause of failure when restoring a single or HA state server. AWS is no better than HP. The restore process does not log meaningful data to examine. What is certain is that the process is too brittle: there are many race conditions, which restore reports only vaguely as:
    A. (During set up of the new bootstrap) error: cannot open state: no reachable servers
    B. (During set up of the new bootstrap) Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable) E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?
    C. (Starting Juju machine agent (jujud-machine-0)) error: cannot restore bootstrap machine: cannot get public address of bootstrap machine: cannot get machine 0: EOF
    D. (After restore completes) juju status exits with an error; this continues for 10 minutes, then the test gives up.
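
Failure D describes a test that keeps retrying juju status for 10 minutes before giving up. A minimal Python sketch of that kind of polling loop is shown here for illustration only. The function name wait_for_success, its parameters, and the 5-second retry interval are hypothetical and are not taken from the Juju test suite; only the 10-minute limit and the juju status command come from the report.

```python
import subprocess
import time


def wait_for_success(command, timeout=600.0, interval=5.0,
                     run=subprocess.call, clock=time.monotonic,
                     sleep=time.sleep):
    """Run `command` repeatedly until it exits 0 or `timeout` seconds pass.

    Returns the number of attempts made on success and raises
    TimeoutError if the command is still failing at the deadline.
    `run`, `clock`, and `sleep` are injectable so the loop can be
    exercised without a real environment (hypothetical design choice).
    """
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        # subprocess.call returns the exit code of the command.
        if run(command) == 0:
            return attempts
        # Give up if another full wait would pass the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(
                "%r still failing after %d attempts" % (command, attempts))
        sleep(interval)
```

Under these assumptions, a restore test could call wait_for_success(["juju", "status"]) after the restore completes. A loop like this only bounds how long the test waits; it does not record why each attempt failed, which is the missing diagnostic data the report complains about.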

I am marking this as a critical regression because for several weeks, the restore tests always passed. In truth, Juju stable has never passed these tests.

Curtis Hovey (sinzui)
description: updated
Ian Booth (wallyworld)
summary: - Restore doen't
+ Restore doesn't
Curtis Hovey (sinzui)
no longer affects: juju-core/1.20
Curtis Hovey (sinzui)
description: updated
Curtis Hovey (sinzui)
Changed in juju-core:
status: Triaged → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
importance: Critical → High
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released