2 machines with same IP address after juju restore-backup; deploy not possible

Bug #1819214 reported by Heather Lanigan
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
Canonical Juju  Triaged  Low         Unassigned

Bug Description

juju 2.5.2 from 2.5/candidate snap
juju bootstrap localhost one --bootstrap-series xenial
juju enable-ha
juju switch controller
wait for juju show-controller | grep ha to show 3 ha-enabled
juju create-backup

juju status

Model       Controller  Cloud/Region         Version  SLA          Timestamp
controller  one         localhost/localhost  2.5.2    unsupported  12:52:09-05:00

Machine  State    DNS             Inst id        Series  AZ  Message
0        started  10.121.191.143  juju-adc312-0  xenial      Running
1        started  10.121.191.161  juju-adc312-1  xenial      Running
2        started  10.121.191.178  juju-adc312-2  xenial      Running

juju remove-machine 1 2
wait for machines to disappear
juju restore-backup --file <from-above>

$ juju status
Model       Controller  Cloud/Region         Version  SLA          Timestamp
controller  one         localhost/localhost  2.5.2    unsupported  12:57:38-05:00

Machine  State    DNS             Inst id        Series  AZ  Message
0        down     10.121.191.143  juju-adc312-0  xenial      Running
1        started  10.121.191.143  juju-adc312-1  xenial      Running
2        down     10.121.191.178  juju-adc312-2  xenial      Running

from debug-log:
juju.worker.peergrouper cannot set replicaset: Found two member configurations with same _id field, members.0._id == members.1._id == 1

from syslog:
Mar 8 18:07:40 juju-adc312-0 mongod.37017[9415]: [conn4] replSetReconfig got BadValue: Found two member configurations with same _id field, members.0._id == members.1._id == 1 while validating { _id: "juju", version: 8, members: [ { _id: 1, host: "10.121.191.143:37017", priority: 1.0, tags: { juju-machine-id: "0" }, votes: 1 }, { _id: 1, host: "10.121.191.143:37017", priority: 0.0, tags: { juju-machine-id: "1" }, votes: 0 }, { _id: 3, host: "10.121.191.178:37017", priority: 0.0, tags: { juju-machine-id: "2" }, votes: 0 } ] }
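
For anyone reading the rejected config by eye, here is a minimal Python sketch of the duplicate check that the error message describes. The member list is copied from the syslog line above (priority and votes omitted); `find_duplicates` is a hypothetical helper written for illustration, not part of juju or MongoDB, and it only approximates what mongod validates.

```python
from collections import defaultdict

# Members copied from the replSetReconfig error above (priority/votes omitted).
members = [
    {"_id": 1, "host": "10.121.191.143:37017", "tags": {"juju-machine-id": "0"}},
    {"_id": 1, "host": "10.121.191.143:37017", "tags": {"juju-machine-id": "1"}},
    {"_id": 3, "host": "10.121.191.178:37017", "tags": {"juju-machine-id": "2"}},
]


def find_duplicates(members, field):
    """Map each repeated value of `field` to the juju machine ids sharing it."""
    seen = defaultdict(list)
    for member in members:
        seen[member[field]].append(member["tags"]["juju-machine-id"])
    # Keep only values that more than one member claims.
    return {value: ids for value, ids in seen.items() if len(ids) > 1}


print(find_duplicates(members, "_id"))
print(find_duplicates(members, "host"))
```

Both checks flag machines 0 and 1: they share the same `_id` and the same host address, which matches the duplicated DNS entry in the second juju status output.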

$ juju-db.bash
MongoDB shell version: 3.2.15
connecting to: 127.0.0.1:37017/juju
2019-03-08T18:15:12.217+0000 W NETWORK [thread1] SSL peer certificate validation failed: unable to get local issuer certificate
2019-03-08T18:15:12.217+0000 W NETWORK [thread1] The server certificate does not match the host name 127.0.0.1
2019-03-08T18:15:12.248+0000 E QUERY [thread1] Error: Authentication failed. :
DB.prototype._authOrThrow@src/mongo/shell/db.js:1441:20
@(auth):6:1
@(auth):1:2

exception: login failed
Connection to 10.121.191.143 closed.

$ lxc list
+---------------+---------+-----------------------+------+------------+-----------+
| NAME          | STATE   | IPV4                  | IPV6 | TYPE       | SNAPSHOTS |
+---------------+---------+-----------------------+------+------------+-----------+
| juju-adc312-0 | RUNNING | 10.121.191.143 (eth0) |      | PERSISTENT |           |
+---------------+---------+-----------------------+------+------------+-----------+

http://people.canonical.com/~heather/juju-backup-20190308-175227.tar.gz

Unfortunately I haven't been able to reproduce this.

tags: added: backup-restore
Ian Booth (wallyworld)
tags: removed: restore-backup
Canonical Juju QA Bot (juju-qa-bot) wrote :

This bug has not been updated in 2 years, so we're marking it Low importance. If you believe this is incorrect, please update the importance.

Changed in juju:
importance: Undecided → Low
tags: added: expirebugs-bot