Comment 16 for bug 1720251

Jason Hobbs (jason-hobbs) wrote : Re: [Bug 1720251] Re: new controller machines stay in "adding-vote" forever

This is failing on 2.3, so I don't think that fix worked.

On Tue, Dec 12, 2017 at 2:20 AM, John A Meinel <email address hidden>
wrote:

> Both what rs.config reports inside the Mongo controller and what the
> peergrouper is logging are relevant for this.
>
> Most likely this is a bug we've seen where the peer grouper decides that
> the best address to assign to all of the Mongo replicas is "127.0.0.1",
> which is critically the wrong one.
> A fix has already gone into 2.3 so that localhost is never selected. It is
> possible that we would backport that fix.
>
>
> On Tue, Dec 12, 2017 at 12:54 AM, Chris Gregan <email address hidden>
> wrote:
>
> > Bumped to field critical
> >
> > --
> > You received this bug notification because you are a member of Canonical
> > Field Critical, which is subscribed to the bug report.
> > Matching subscriptions: juju bugs
> > https://bugs.launchpad.net/bugs/1720251
> >
> > Title:
> > new controller machines stay in "adding-vote" forever
> >
> > To manage notifications about this bug go to:
> > https://bugs.launchpad.net/juju/+bug/1720251/+subscriptions
> >
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1720251
>
> Title:
> new controller machines stay in "adding-vote" forever
>
> Status in juju:
> Triaged
>
> Bug description:
> juju version: 2.2.4
> maas version: 2.2.3-6114-g672ff26-0ubuntu1~16.04.1
>
> After running enable-ha, my two new controller machines have been in
> 'adding-vote' state for over an hour. Their machine logs have this:
>
> 2017-09-28 21:59:47 ERROR juju.worker.dependency engine.go:546 "state"
> manifold worker returned unexpected error: cannot connect to mongodb:
> no reachable servers
>
> Here is the status:
> http://paste.ubuntu.com/25635827/
>
> I've attached a crashdump from the deployment.
>
> This doesn't happen every time - we've had quite a few runs where it
> didn't happen, and this is the first where it did.
>
> The longest I've let it stay in this state is 8 hours - it seems like
> it will be stuck forever.