Juju bootstrap fails because mongodb is unreachable

Bug #1337340 reported by Chris Glass
This bug affects 4 people
Affects     Status         Importance   Assigned to      Milestone
juju-core   Fix Released   High         Andrew Wilkins
1.20        Fix Released   High         Andrew Wilkins

Bug Description

Juju version: 1.19.4-trusty-amd64

I hit an issue this morning where bootstrapping failed because juju seemingly could not start or connect to the mongodb server:

http://pastebin.ubuntu.com/7742147/

Destroying and re-bootstrapping the environment did not reproduce the problem, even though the versions used were exactly the same (ubuntu, maas, juju and charm versions are pinned).

"can't dial mongo to initiate replicaset: no reachable servers"

Chris Glass (tribaal)
description: updated
Curtis Hovey (sinzui) wrote :

This may relate to slow servers. I am seeing this in KVM testing too.

description: updated
tags: added: bootstrap mongodb
Changed in juju-core:
status: New → Triaged
milestone: none → 1.21-alpha1
importance: Undecided → High
Dean Henrichsmeyer (dean) wrote :

FWIW, we're seeing this with servers that are not slow.

Ian Booth (wallyworld) wrote :

Hi, I notice the issue raised occurred when running Juju 1.19.4.
Before 1.20 was released, some work was done to improve how Juju agents talk to Mongo. I'd be very interested to know if the issue is still observed when running Juju 1.20. I am hoping the issue has already been fixed.

James Troup (elmo) wrote :

I'm seeing this with juju 1.20 on Softlayer servers which are not slow.

https://pastebin.canonical.com/113156/

tags: added: canonical-is
Ian Booth (wallyworld) wrote :

We expect this issue has been fixed by the changes made to address bug 1339240, but would love confirmation that this is the case.

Changed in juju-core:
assignee: nobody → Andrew Wilkins (axwalk)
status: Triaged → In Progress
Andrew Wilkins (axwalk) wrote :

The original issue is a little different to the one linked by elmo (note the different error message, "Closed explicitly" vs. "no reachable servers"). I believe the issue that elmo reports has been fixed (lp:1339240).

The original error occurs because Mongo took more than 30s to start listening on its socket, probably because it was setting up replica sets, which can take a while. The timeout has been increased on trunk and in 1.20.1.
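
To illustrate the failure mode described above, here is a minimal Python sketch of a "wait until the server is listening" loop with a deadline. It is not Juju's code (Juju is written in Go and dials Mongo through its own driver); the function name `wait_for_listener` and its parameters are hypothetical. It only shows why a fixed 30s deadline fails when the server needs longer to open its socket, and why raising the timeout helps.

```python
import socket
import time


def wait_for_listener(host, port, timeout=30.0, interval=0.5):
    """Retry a TCP connect until it succeeds or the deadline passes.

    Hypothetical helper for illustration only. Returns True as soon as
    something accepts connections on (host, port), and False if nothing
    is listening by the time `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            # Analogous to giving up with "no reachable servers".
            return False
        try:
            # Never wait on a single attempt longer than the time left.
            with socket.create_connection((host, port),
                                          timeout=min(interval, remaining)):
                return True
        except OSError:
            # Refused or timed out: pause briefly, then try again.
            time.sleep(min(interval, max(0.0, deadline - time.monotonic())))
```

With a loop of this shape, a server that starts listening after 35s is reported as unreachable when `timeout=30.0`, while the same call with a larger `timeout` succeeds without any other change.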

Andrew Wilkins (axwalk) wrote :

Marking this as Fix Committed as per the above comments; please reopen if the issue persists.

Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released