Juju documentation claims that it's possible to restore controller from backup after full controller failure

Bug #1803943 reported by Bence
This bug affects 3 people
Affects: Canonical Juju
Status: Won't Fix
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

symptoms:
2018-11-16 10:04:34 ERROR juju.worker.dependency engine.go:632 "state" manifold worker returned unexpected error: no reachable servers
2018-11-16 10:04:34 WARNING juju.mongo open.go:166 TLS handshake failed: x509: certificate signed by unknown authority
2018-11-16 10:04:34 WARNING juju.mongo open.go:166 TLS handshake failed: x509: certificate signed by unknown authority

test plan to reproduce:
install a fresh bionic VM with lxd and juju
bootstrap a controller
back up the controller
destroy the controller
bootstrap another controller
restore it from the backup
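
The plan above can be sketched as an ordered command list. This is only an illustrative sketch: the juju commands, controller name, and backup filename are copied from the transcript further down in this report, while the helper names `repro_commands` and `run_plan` are hypothetical. By default it only prints the commands (dry run) and executes nothing.

```python
import shlex
import subprocess

# Values taken from the transcript in this report.
CONTROLLER = "lxd-controller"
BACKUP_FILE = "juju-backup.tar.gz"


def repro_commands(controller=CONTROLLER, backup_file=BACKUP_FILE):
    """Return the reproduction steps as argument lists, in order."""
    return [
        ["juju", "bootstrap", "localhost", controller],
        ["juju", "create-backup", "-m", f"{controller}:controller",
         "--filename", backup_file],
        # Prompts for confirmation, as in the transcript.
        ["juju", "destroy-controller", controller],
        ["juju", "bootstrap", "localhost", controller],
        # The transcript switches to the controller model before restoring.
        ["juju", "switch", "controller"],
        ["juju", "restore-backup", "--file", backup_file],
    ]


def run_plan(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print("$ " + shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_plan(repro_commands())
```

Calling `run_plan(repro_commands(), dry_run=False)` would actually run the steps, including destroying the controller, so it should only be used on a throwaway test VM like the one described below.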

Init system
create a new VM in VirtualBox (bionic desktop, minimal install, NAT networking)
set up port forwarding
install sshd:
juju@jujutest:~$ sudo apt install openssh-server
install lxd and juju:
juju@jujutest:~$ sudo apt install lxd
juju@jujutest:~$ sudo snap install juju --classic
juju 2.4.6 from 'canonical' installed

Bootstrap a controller

juju@jujutest:~$ juju bootstrap localhost lxd-controller
ERROR Get http://unix.socket/1.0: dial unix /var/lib/lxd/unix.socket: connect: permission denied
juju@jujutest:~$ newgrp lxd
juju@jujutest:~$ juju bootstrap localhost lxd-controller
Creating Juju controller "lxd-controller" on localhost/localhost
Looking for packaged Juju agent version 2.4.6 for amd64
To configure your system to better support LXD containers, please see: https://github.com/lxc/lxd/blob/master/doc/production-setup.md
Launching controller instance(s) on localhost/localhost...
 - juju-5ec3ad-0 (arch=amd64)
Installing Juju agent on bootstrap instance
Fetching Juju GUI 2.14.0
Waiting for address
Attempting to connect to 10.206.131.45:22
Connected to 10.206.131.45
Running machine configuration script...
Bootstrap agent now started
Contacting Juju controller at 10.206.131.45 to verify accessibility...
Bootstrap complete, "lxd-controller" controller now available
Controller machines are in the "controller" model
Initial model "default" added

juju@jujutest:~$ juju status
Model Controller Cloud/Region Version SLA Timestamp
default lxd-controller localhost/localhost 2.4.6 unsupported 10:45:55+01:00

Model "admin/default" is empty.
juju@jujutest:~$ juju switch controller
lxd-controller:admin/default -> lxd-controller:admin/controller
juju@jujutest:~$ juju status
Model Controller Cloud/Region Version SLA Timestamp
controller lxd-controller localhost/localhost 2.4.6 unsupported 10:47:28+01:00

Machine State DNS Inst id Series AZ Message
0 started 10.206.131.45 juju-5ec3ad-0 bionic Running

Backup a controller
juju@jujutest:~$ juju create-backup -m lxd-controller:controller --filename juju-backup.tar.gz
Remote backup was not created.
Downloaded to juju-backup.tar.gz.

Destroy controller
juju@jujutest:~$ juju list-controllers
Use --refresh flag with this command to see the latest information.

Controller Model User Access Cloud/Region Models Machines HA Version
lxd-controller* controller admin superuser localhost/localhost 2 1 none 2.4.6

juju@jujutest:~$ juju destroy-controller lxd-controller
WARNING! This command will destroy the "lxd-controller" controller.
This includes all machines, applications, data and other resources.

Continue? (y/N):y
Destroying controller
Waiting for hosted model resources to be reclaimed
All hosted models reclaimed, cleaning up controller machines

juju@jujutest:~$ lxc list
To start your first container, try: lxc launch ubuntu:18.04

+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+

Bootstrap another controller

juju@jujutest:~$ juju bootstrap localhost lxd-controller
Creating Juju controller "lxd-controller" on localhost/localhost
Looking for packaged Juju agent version 2.4.6 for amd64
To configure your system to better support LXD containers, please see: https://github.com/lxc/lxd/blob/master/doc/production-setup.md
Launching controller instance(s) on localhost/localhost...
 - juju-d04293-0 (arch=amd64)
Installing Juju agent on bootstrap instance
Fetching Juju GUI 2.14.0
Waiting for address
Attempting to connect to 10.206.131.149:22
Connected to 10.206.131.149
Running machine configuration script...
Bootstrap agent now started
Contacting Juju controller at 10.206.131.149 to verify accessibility...
Bootstrap complete, "lxd-controller" controller now available
Controller machines are in the "controller" model
Initial model "default" added

juju@jujutest:~$ juju status
Model Controller Cloud/Region Version SLA Timestamp
default lxd-controller localhost/localhost 2.4.6 unsupported 11:00:10+01:00

Model "admin/default" is empty.

juju@jujutest:~$ juju switch controller
lxd-controller:admin/default -> lxd-controller:admin/controller
juju@jujutest:~$ juju status
Model Controller Cloud/Region Version SLA Timestamp
controller lxd-controller localhost/localhost 2.4.6 unsupported 11:01:41+01:00

Machine State DNS Inst id Series AZ Message
0 started 10.206.131.149 juju-d04293-0 bionic Running

Restore it from backup
term1:
juju@jujutest:~$ juju restore-backup --file juju-backup.tar.gz

term2:
juju@jujutest:~$ lxc list
+---------------+---------+-----------------------+------+------------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+---------------+---------+-----------------------+------+------------+-----------+
| juju-d04293-0 | RUNNING | 10.206.131.149 (eth0) | | PERSISTENT | 0 |
+---------------+---------+-----------------------+------+------------+-----------+
juju@jujutest:~$ lxc exec juju-d04293-0 bash
root@juju-d04293-0:~# tail -f /var/log/juju/machine-0.log

2018-11-16 10:04:34 ERROR juju.worker.dependency engine.go:632 "state" manifold worker returned unexpected error: no reachable servers
2018-11-16 10:04:34 WARNING juju.mongo open.go:166 TLS handshake failed: x509: certificate signed by unknown authority
2018-11-16 10:04:34 WARNING juju.mongo open.go:166 TLS handshake failed: x509: certificate signed by unknown authority
2018-11-16 10:04:34 WARNING juju.mongo open.go:166 TLS handshake failed: x509: certificate signed by unknown authority
2018-11-16 10:04:34 ERROR juju.worker.dependency engine.go:632 "api-caller" manifold worker returned unexpected error: api connection broken unexpectedly
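
To make the repeated errors easier to see at a glance, the log lines above can be grouped and counted. The following is a minimal sketch, not a Juju tool: the line layout (date, time, level, logger, file:line, message) is an assumption inferred from the five lines quoted here, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Layout assumed from the machine-0.log lines quoted in this report:
# date time LEVEL logger file:line message
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) (?P<source>\S+:\d+) (?P<message>.*)$"
)


def summarize(lines):
    """Count (level, logger, message) triples among lines matching LOG_LINE."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match["level"], match["logger"], match["message"])] += 1
    return counts


# The five lines quoted above, verbatim.
SAMPLE = """\
2018-11-16 10:04:34 ERROR juju.worker.dependency engine.go:632 "state" manifold worker returned unexpected error: no reachable servers
2018-11-16 10:04:34 WARNING juju.mongo open.go:166 TLS handshake failed: x509: certificate signed by unknown authority
2018-11-16 10:04:34 WARNING juju.mongo open.go:166 TLS handshake failed: x509: certificate signed by unknown authority
2018-11-16 10:04:34 WARNING juju.mongo open.go:166 TLS handshake failed: x509: certificate signed by unknown authority
2018-11-16 10:04:34 ERROR juju.worker.dependency engine.go:632 "api-caller" manifold worker returned unexpected error: api connection broken unexpectedly
""".splitlines()

if __name__ == "__main__":
    for (level, logger, message), n in summarize(SAMPLE).most_common():
        print(f"{n}x {level} {logger}: {message}")
```

Grouping by message rather than by timestamp keeps the summary short when the same TLS handshake warning repeats many times per second.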

Tytus Kurek (tkurek)
tags: added: cpe-onsite
Bence (szabo-bence)
tags: added: 4010
Richard Harding (rharding) wrote :

Thanks. Yes, work to enable restoring to a new controller is on our roadmap. At this time that's not a supported restore mechanism, so we suggest using HA with backups to help recover. At least one of the HA controller machines must still be available to restore to for the restore to succeed.

Changed in juju:
status: New → Triaged
importance: Undecided → Medium
summary: - Restoring from backup fails (certificate issue)
+ Restoring backup to new controller fails
Tytus Kurek (tkurek) wrote : Re: Restoring backup to new controller fails

Hi Rick,

Thank you for the update. This means, however, that the following part of the documentation should be updated:

https://docs.jujucharms.com/2.4/en/controllers-backup

Take a look at the section "Restoring due to complete cluster failure" at the bottom.

Michał Ajduk (majduk)
summary: - Restoring backup to new controller fails
+ Juju documentation claims that it's possible to restore controller from
+ backup after full controller failure
Felipe Reyes (freyes)
tags: added: sts
Ian Booth (wallyworld) wrote :

The Juju doc this bug was raised against is now obsolete, replaced by

https://juju.is/docs/olm/controller-backups

with a link to a new stand-alone restore tool.

https://discourse.charmhub.io/t/restoring-from-a-backup/3665

The restore tool does not yet support restoring to a brand new controller.

Marking as Won't Fix since we're not updating the old doc any more.

Changed in juju:
status: Triaged → Won't Fix