zookeeper: WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@247] - Too many connections from /127.0.0.1 - max is 10

Bug #1070519 reported by dann frazier
This bug affects 2 people
Affects                Status         Importance  Assigned to       Milestone
OEM Priority Project   Fix Released   High        Unassigned
Precise                Fix Released   Undecided   Unassigned
Raring                 Fix Released   High        Unassigned
Saucy                  Fix Released   Undecided   Unassigned
juju (Ubuntu)          Fix Committed  High        Kapil Thangavelu

Bug Description

I'm running juju 0.6-1ubuntu1 on a quantal/maas cloud of highbank/maas nodes. I have a demo loop that deploys hadoop, waits, adds a few nodes, waits, adds a few more nodes, then tears things down to just the bootstrap node and restarts. This very quickly causes a hang where any juju command fails to make progress.

ubuntu@laptop:~$ juju status
2012-10-23 13:34:17,456 INFO Connecting to environment...
<hang>

Manually inspecting the bootstrap node shows that the ssh connection does occur - and until I ^c the juju command, /var/log/zookeeper/zookeeper.log fills with these messages - about 2/second:

2012-10-23 15:43:16,754 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@247] - Too many connections from /127.0.0.1 - max is 10
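
Since the limit in the warning is applied per client address, it can help to tally the warnings by address. A minimal sketch, assuming the log line format shown above; count_conn_warnings is a hypothetical helper name, not part of zookeeper or juju:

```shell
# Hypothetical helper: reads ZooKeeper log lines on stdin and prints
# "<client address> <number of 'Too many connections' warnings>".
count_conn_warnings() {
  grep 'Too many connections from' |
    sed -E 's|.*Too many connections from /([^ ]+) .*|\1|' |
    sort | uniq -c | awk '{print $2, $1}'
}

# Example use on the bootstrap node (path taken from the report):
#   count_conn_warnings < /var/log/zookeeper/zookeeper.log
```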

netstat output doesn't suggest a ton of active connections - I'll attach a copy of this output in case it helps.
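
To compare the netstat output against the per-address limit, the server-side sockets on port 2181 can be grouped by client address and TCP state, so that sockets in states other than ESTABLISHED are visible too. This is only a sketch, assuming the usual Linux "netstat -tn" column layout (proto, recv-q, send-q, local address, foreign address, state); count_zk_conns is a hypothetical helper name:

```shell
# Hypothetical helper: reads `netstat -tn` style output on stdin and prints
# "<client address> <state> <count>" for sockets whose local port is the
# ZooKeeper client port (default 2181). Listening sockets are skipped.
count_zk_conns() {
  awk -v port="${1:-2181}" '
    $1 ~ /^tcp/ && $6 != "LISTEN" && $4 ~ (":" port "$") {
      ip = $5
      sub(/:[0-9]+$/, "", ip)   # strip the client port
      n[ip " " $6]++
    }
    END { for (k in n) print k, n[k] }' | sort
}

# Example use on the bootstrap node:
#   netstat -tn | count_zk_conns 2181
```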

I've also found that running "sudo restart zookeeper" on the node frees it up, and I can again run juju commands.

See demo.sh in lp:~dannf/+junk/arm-maas-demo for the demo loop. Dropping the "sleep" timeouts to a low value (e.g. 10s) seems to make the hang occur faster.

dann frazier (dannf) wrote :

Note that the zookeeper process appears to be spinning, not just hung.

Clint Byrum (clint-fewbar) wrote :

I think I've seen this too, dann. Will investigate soon; it may be a simple matter of raising some defaults.

Changed in juju (Ubuntu):
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in juju (Ubuntu):
status: New → Confirmed
Ara Pulido (ara)
Changed in oem-priority:
importance: Undecided → High
Kapil Thangavelu (hazmat) wrote :

The solution requires modifying the zookeeper config. It's resolved in trunk, but not in the 0.6 ppa. It can be resolved manually via
echo "maxClientCnxns=500" >> /etc/zookeeper/conf/zoo.cfg on the bootstrap node.

Kapil Thangavelu (hazmat) wrote :

The maxClientCnxns adjustment also needs a restart of zookeeper.
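
The manual workaround can be sketched as a small function that is safe to run more than once: it rewrites an existing maxClientCnxns line instead of appending a second one. This is an illustration only, not the fix that landed in juju; set_max_client_cnxns is a hypothetical helper name, and the value 500 and the config path are the ones quoted in the comments above.

```shell
# Hypothetical helper: set maxClientCnxns in a zoo.cfg-style file.
# Usage: set_max_client_cnxns <config file> <value>
set_max_client_cnxns() {
  cfg=$1
  value=$2
  if grep -q '^maxClientCnxns=' "$cfg"; then
    # Setting already present: rewrite it via a temp file, then copy the
    # result back so the original file's owner and mode are kept.
    tmp=$(mktemp) || return 1
    sed "s/^maxClientCnxns=.*/maxClientCnxns=$value/" "$cfg" > "$tmp" &&
      cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return $status
  else
    # Setting absent: append it, as in the one-line workaround.
    printf 'maxClientCnxns=%s\n' "$value" >> "$cfg"
  fi
}

# On the bootstrap node, as root (restart command as used in the report):
#   set_max_client_cnxns /etc/zookeeper/conf/zoo.cfg 500
#   restart zookeeper
```

The function needs write access to the config file, so it has to run as root on the bootstrap node; the new limit takes effect only after zookeeper is restarted.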

Chris Van Hoof (vanhoof) wrote :

Thanks Kapil -- Is this destined for the ppa as well or are you after
verification first?

--chris

Changed in juju (Ubuntu):
assignee: nobody → Kapil Thangavelu (hazmat)
status: Confirmed → In Progress
Kapil Thangavelu (hazmat) wrote :

It's destined for the ppa (waiting on builders atm). It's already tested and landed on the 0.6 branch (juju-origin: lp:juju/0.6).

Kapil Thangavelu (hazmat) wrote :

Landed in the 0.6 ppa (precise default).

Changed in juju (Ubuntu):
status: In Progress → Fix Committed
Ara Pulido (ara) wrote :

I have confirmed with Vanhoof that the PPA is OK.
