Comment 17 for bug 1720251

John A Meinel (jameinel) wrote : Re: [Bug 1720251] Re: new controller machines stay in "adding-vote" forever

If you can ssh into the controller machine, and connect to Mongo, you
should be able to run:
> rs.config()
> rs.status()

on the Mongo console.
If you can at least ssh into the machine that has the primary mongo, this
should get you access to the Mongo shell:
agent=$(cd /var/lib/juju/agents; echo machine-*)
pw=$(sudo grep statepassword /var/lib/juju/agents/${agent}/agent.conf | cut '-d ' -sf2)
/usr/lib/juju/mongo3.2/bin/mongo --ssl -u ${agent} -p $pw \
    --authenticationDatabase admin --sslAllowInvalidHostnames \
    --sslAllowInvalidCertificates localhost:37017/juju

At this point, I'd probably be more interested in 'juju debug-log
--include-module juju.peergrouper --replay'
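
For the rs.config() check, here is a minimal sketch of a filter that flags
replica-set members whose host is a loopback address, which is the symptom
described in my earlier reply quoted below. The helper name loopback_members
is made up for illustration (it is not a juju or Mongo tool), and it assumes
the config is printed in the usual shell form with lines like
"host" : "10.0.0.5:37017".

```shell
# Hypothetical helper, not part of juju or MongoDB.
# Reads rs.conf() output on stdin and prints every member "host" line
# that points at 127.0.0.1 or localhost.
# Exit status is grep's: 0 if a loopback member was found, 1 otherwise.
loopback_members() {
    grep -E '"host"[[:space:]]*:[[:space:]]*"(127\.0\.0\.1|localhost):'
}

# Usage sketch (assumes $agent and $pw are set as in the commands above):
#   /usr/lib/juju/mongo3.2/bin/mongo --ssl -u ${agent} -p $pw \
#       --authenticationDatabase admin --sslAllowInvalidHostnames \
#       --sslAllowInvalidCertificates localhost:37017/juju \
#       --quiet --eval 'printjson(rs.conf())' | loopback_members
```

If that prints anything, the peergrouper has most likely picked the wrong
address for those members.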

On Tue, Dec 12, 2017 at 4:44 PM, Jason Hobbs <email address hidden>
wrote:

> This is failing on 2.3, so I don't think that fix worked.
>
> On Tue, Dec 12, 2017 at 2:20 AM, John A Meinel <email address hidden>
> wrote:
>
> > Getting information about what rs.config says inside the Mongo
> > controller, and what log output is happening for the peergrouper are
> > both relevant for this.
> >
> > Most likely this is a bug we've seen where the peer grouper decides that
> > the best address to assign to all of the Mongo replicas is "127.0.0.1",
> > which is critically the wrong one.
> > A fix has already gone into 2.3 so that localhost is never selected. It
> > is possible that we would backport that fix.
> >
> >
> > On Tue, Dec 12, 2017 at 12:54 AM, Chris Gregan <email address hidden>
> > wrote:
> >
> > > Bumped to field critical
> > >
> > > --
> > > You received this bug notification because you are a member of
> > > Canonical Field Critical, which is subscribed to the bug report.
> > > Matching subscriptions: juju bugs
> > > https://bugs.launchpad.net/bugs/1720251
> > >
> > > Title:
> > > new controller machines stay in "adding-vote" forever
> > >
> > > To manage notifications about this bug go to:
> > > https://bugs.launchpad.net/juju/+bug/1720251/+subscriptions
> > >
> >
> > --
> > You received this bug notification because you are subscribed to the bug
> > report.
> > https://bugs.launchpad.net/bugs/1720251
> >
> > Title:
> > new controller machines stay in "adding-vote" forever
> >
> > Status in juju:
> > Triaged
> >
> > Bug description:
> > juju version: 2.2.4
> > maas version: 2.2.3-6114-g672ff26-0ubuntu1~16.04.1
> >
> > After running enable-ha, my two new controller machines have been in
> > 'adding-vote' state for over an hour. Their machine logs have this:
> >
> > 2017-09-28 21:59:47 ERROR juju.worker.dependency engine.go:546 "state"
> > manifold worker returned unexpected error: cannot connect to mongodb:
> > no reachable servers
> >
> > Here is the status:
> > http://paste.ubuntu.com/25635827/
> >
> > I've attached a crashdump from the deployment.
> >
> > This doesn't happen every time - we've had quite a few runs where it
> > didn't happen and this is the first where it did.
> >
> > The longest I've let it stay in this state is 8 hours - it seems like
> > it will be stuck forever.
> >
>