Restore-backup cannot initiate replica set

Bug #1626573 reported by Aaron Bentley
This bug affects 1 person
Affects: Canonical Juju
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

As seen here:
http://reports.vapour.ws/releases/issue/57dff624749a565f5f2982f0

23:32:08 ERROR cmd supercommand.go:458 cannot perform restore: <nil>: restore failed: cannot reset replicaSet: cannot initiate replica set: cannot dial mongo to initiate replicaset: auth fails
23:32:08 DEBUG cmd supercommand.go:459 (error details: [{github.com/juju/juju/cmd/juju/backups/restore.go:417: } {github.com/juju/juju/api/backups/restore.go:137: cannot perform restore: <nil>} {github.com/juju/juju/api/apiclient.go:618: } {github.com/juju/retry/retry.go:187: } {github.com/juju/juju/rpc/client.go:149: } {restore failed: cannot reset replicaSet: cannot initiate replica set: cannot dial mongo to initiate replicaset: auth fails}])
2016-09-18 23:32:08 INFO Call of juju restore exited with an error

Aaron Bentley (abentley)
description: updated
Nicholas Skaggs (nskaggs) wrote :

Same issue as bug 1606308?

Aaron Bentley (abentley) wrote :

That bug is marked fix-released, and I tend to agree; between July 24 and August 31, we were seeing failures with this symptom as many as 7 times a day, and then it dropped dramatically. I think this is a new bug (or maybe the same old one that caused the failure on June 16).

Changed in juju:
milestone: 2.0-rc2 → 2.0.1
Tim Penhey (thumper)
Changed in juju:
milestone: 2.0.1 → 2.1.0
Tim Penhey (thumper) wrote :

OK, I think I have worked out what is going on here.

In an HA environment, any one of the three apiservers may do the backup. One of the things that is saved is the agent configuration file, which includes the machine tag and password.

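When we restore, we always create 'machine-0'. This works only when machine-0 is the machine that did the backup, which is one time in three on average. Otherwise the restored tag and password belong to a different machine, which would explain the "auth fails" error.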

Probably the easiest fix is to stop assuming that the new machine is machine-0 and instead give it the identity of the machine that did the backup.
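
The reasoning in this comment can be illustrated with a small simulation. This is only a sketch of the hypothesis, not Juju code: every name below (`take_backup`, `restore`, `always_machine_0`, `tag_from_backup`, the password table) is hypothetical, and it assumes each controller agent has its own password and that the backup carries the tag and password of whichever controller produced it.

```python
# Hypothetical model of the failure mode described in the comment above.
# None of these names come from the Juju code base.

CONTROLLERS = ["machine-0", "machine-1", "machine-2"]

# Assumption: each controller agent authenticates with its own password.
PASSWORDS = {tag: f"secret-for-{tag}" for tag in CONTROLLERS}


def take_backup(backup_machine):
    """Return the agent config saved in a backup made by backup_machine."""
    return {"tag": backup_machine, "password": PASSWORDS[backup_machine]}


def restore(agent_config, pick_tag):
    """Simulate a restore and report whether authentication would succeed.

    The new machine takes the tag chosen by pick_tag but logs in with the
    password stored in the backup.
    """
    new_tag = pick_tag(agent_config)
    return PASSWORDS[new_tag] == agent_config["password"]


def always_machine_0(agent_config):
    # Behaviour described in the comment: the tag is hardcoded.
    return "machine-0"


def tag_from_backup(agent_config):
    # Suggested change: reuse the tag of the machine that made the backup.
    return agent_config["tag"]


def successes(pick_tag):
    """Count how many of the three possible backups restore cleanly."""
    return sum(restore(take_backup(m), pick_tag) for m in CONTROLLERS)


if __name__ == "__main__":
    print("hardcoded machine-0:", successes(always_machine_0), "of 3")
    print("tag from backup:    ", successes(tag_from_backup), "of 3")
```

Under these assumptions the hardcoded tag authenticates only for the backup taken on machine-0, while taking the tag from the backup authenticates in all three cases. Whether the real restore path behaves this way is exactly what the comment is hypothesising.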

Changed in juju-core:
status: New → Won't Fix
no longer affects: juju-core
Curtis Hovey (sinzui)
tags: added: mongodb
Changed in juju:
milestone: 2.1.0 → 2.2.0
Curtis Hovey (sinzui)
Changed in juju:
milestone: 2.2-beta1 → 2.2-beta2
Curtis Hovey (sinzui)
Changed in juju:
milestone: 2.2-beta2 → 2.2-beta3
Changed in juju:
milestone: 2.2-beta3 → 2.2-beta4
Changed in juju:
milestone: 2.2-beta4 → 2.2-rc1
Tim Penhey (thumper) wrote :

This hasn't happened for some time; perhaps time has been on our side and it was fixed by another piece of work.

Changed in juju:
status: Triaged → Incomplete
milestone: 2.2-rc1 → none
Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired