Activity log for bug #1336967

Date Who What changed Old value New value Message
2014-07-02 21:48:35 Curtis Hovey bug added bug
2014-07-02 21:48:59 Curtis Hovey nominated for series juju-core/1.20
2014-07-02 21:48:59 Curtis Hovey bug task added juju-core/1.20
2014-07-02 21:49:06 Curtis Hovey juju-core/1.20: status New Triaged
2014-07-02 21:49:09 Curtis Hovey juju-core/1.20: importance Undecided Critical
2014-07-02 21:49:12 Curtis Hovey juju-core/1.20: milestone 1.20.0
2014-07-02 21:49:43 Curtis Hovey description
Old value:
Restore from backup frequently fails, success is rare. There is no single cause for failure when restoring a single or HA state server. Aws is not better than Hp. The restore process does not generate a log of meaningful data to examine. What is certain is that the process is too brittle, there are many race conditions that are vaguely reported by restore as:
    A .(During seting of the new bootstrap) error: cannot open state: no reachable servers
    B. (Starting Juju machine agent (jujud-machine-0)) error: cannot restore bootstrap machine: cannot get public address of bootstrap machine: cannot get machine 0: EOF
    C. (After restore completes): juju status exits with an error, this happens for 10 minutes then the test gives up.
I am marking this as a critical regression because for several weeks, the restore tests always passed. In truth, Juju stable has never passed these tests.
New value:
Restore from backup frequently fails, success is rare. There is no single cause for failure when restoring a single or HA state server. Aws is not better than Hp. The restore process does not generate a log of meaningful data to examine. What is certain is that the process is too brittle, there are many race conditions that are vaguely reported by restore as:
    A. (During set up of the new bootstrap) error: cannot open state: no reachable servers
    B. (Starting Juju machine agent (jujud-machine-0)) error: cannot restore bootstrap machine: cannot get public address of bootstrap machine: cannot get machine 0: EOF
    C. (After restore completes): juju status exits with an error, this happens for 10 minutes then the test gives up.
I am marking this as a critical regression because for several weeks, the restore tests always passed. In truth, Juju stable has never passed these tests.
2014-07-03 09:18:08 Ian Booth summary Restore doen't Restore doesn't
2014-07-03 15:15:51 Curtis Hovey bug task deleted juju-core/1.20
2014-07-04 14:12:04 Curtis Hovey nominated for series juju-core/1.20
2014-07-04 14:12:04 Curtis Hovey bug task added juju-core/1.20
2014-07-04 14:12:13 Curtis Hovey juju-core/1.20: milestone 1.20.1
2014-07-04 14:12:17 Curtis Hovey juju-core/1.20: importance Undecided Critical
2014-07-04 14:12:21 Curtis Hovey juju-core/1.20: status New Triaged
2014-07-04 14:15:59 Curtis Hovey description
Old value:
Restore from backup frequently fails, success is rare. There is no single cause for failure when restoring a single or HA state server. Aws is not better than Hp. The restore process does not generate a log of meaningful data to examine. What is certain is that the process is too brittle, there are many race conditions that are vaguely reported by restore as:
    A. (During set up of the new bootstrap) error: cannot open state: no reachable servers
    B. (Starting Juju machine agent (jujud-machine-0)) error: cannot restore bootstrap machine: cannot get public address of bootstrap machine: cannot get machine 0: EOF
    C. (After restore completes): juju status exits with an error, this happens for 10 minutes then the test gives up.
I am marking this as a critical regression because for several weeks, the restore tests always passed. In truth, Juju stable has never passed these tests.
New value:
Restore from backup frequently fails, success is rare. There is no single cause for failure when restoring a single or HA state server. Aws is not better than Hp. The restore process does not generate a log of meaningful data to examine. What is certain is that the process is too brittle, there are many race conditions that are vaguely reported by restore as:
    A. (During set up of the new bootstrap) error: cannot open state: no reachable servers
    B. (During set up of the new bootstrap) Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable) E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?")
    C. (Starting Juju machine agent (jujud-machine-0)) error: cannot restore bootstrap machine: cannot get public address of bootstrap machine: cannot get machine 0: EOF
    D. (After restore completes): juju status exits with an error, this happens for 10 minutes then the test gives up.
I am marking this as a critical regression because for several weeks, the restore tests always passed. In truth, Juju stable has never passed these tests.
2014-07-09 23:51:00 Curtis Hovey juju-core/1.20: status Triaged Fix Committed
2014-07-10 02:59:36 Curtis Hovey juju-core: status Triaged Fix Committed
2014-07-10 12:45:27 Curtis Hovey juju-core/1.20: status Fix Committed Fix Released
2014-08-21 15:37:19 Curtis Hovey juju-core: importance Critical High
2014-09-08 14:19:26 Curtis Hovey juju-core: status Fix Committed Fix Released