nova-api fails to query ServiceGroup status from Zookeeper
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Compute (nova) | Fix Released | Low | Sean Dague | | |
Bug Description
I am running with the ZooKeeper servicegroup driver on CentOS 6.4 (Python 2.6) with the RDO distro of Grizzly.
All nova services are successfully connecting to ZooKeeper, which I've verified using zkCli.
However, when I run `nova service-list` I get an HTTP 500 error from nova-api. The nova-api log (/var/log/
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack File "/usr/lib/
, line 93, in service_is_up
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack return self._driver.
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack File "/usr/lib/
/zk.py", line 116, in is_up
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack all_members = self.get_
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack File "/usr/lib/
/zk.py", line 141, in get_all
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack raise exception.
r")
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack ServiceGroupUna
eeperDriver is temporarily unavailable.
The problem seems to be around evzookeeper (using version 0.4.0).
To isolate the problem, I added some evzookeeper.
When I do the get() operation from within the ZooKeeperDriver get_all() method, the web request hangs indefinitely. However, if I recreate the evzookeeper.
diff --git a/nova/
index 2a3edae..7de2488 100644
--- a/nova/
+++ b/nova/
@@ -122,7 +122,14 @@ class ZooKeeperDriver
monitor = self._monitors.
if monitor is None:
path = "%s/%s" % (CONF.zookeeper
- monitor = membership.
+
+ null = open(os.devnull, "w")
+ local_session = evzookeeper.
+ recv_timeout=
+ CONF.zookeeper.
+ zklog_fd=null)
+
+ monitor = membership.
# Note(maoy): When initialized for the first time, it takes a
# while to retrieve all members from zookeeper. To prevent
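The diff above is truncated in this report, so the exact calls it makes cannot be recovered here. As a rough sketch of the pattern it appears to follow (when no monitor is cached for a group, build a fresh session locally, with its log output sent to os.devnull, and hand that session to the new monitor), the following uses stand-in classes. StubSession, StubMonitor, MonitorCache, the "/servicegroups" root and the sample host and timeout values are all hypothetical and are not the evzookeeper or nova API.

```python
import os


class StubSession(object):
    """Hypothetical stand-in for a ZooKeeper session object."""

    def __init__(self, hosts, recv_timeout, zklog_fd):
        self.hosts = hosts
        self.recv_timeout = recv_timeout
        self.zklog_fd = zklog_fd


class StubMonitor(object):
    """Hypothetical stand-in for a group-membership monitor."""

    def __init__(self, session, path):
        self.session = session
        self.path = path


class MonitorCache(object):
    """Lazily creates one monitor per group, each with its own session."""

    def __init__(self, hosts, recv_timeout, root="/servicegroups"):
        # All three values are assumptions made for this illustration.
        self._hosts = hosts
        self._recv_timeout = recv_timeout
        self._root = root
        self._monitors = {}
        # One shared sink for the client library's log output.
        self._null = open(os.devnull, "w")
        self.sessions_created = 0

    def get_monitor(self, group_id):
        monitor = self._monitors.get(group_id)
        if monitor is None:
            path = "%s/%s" % (self._root, group_id)
            # Build the session here, in the caller's context, instead of
            # reusing a session that was created elsewhere.
            local_session = StubSession(self._hosts,
                                        recv_timeout=self._recv_timeout,
                                        zklog_fd=self._null)
            self.sessions_created += 1
            monitor = StubMonitor(local_session, path)
            self._monitors[group_id] = monitor
        return monitor

    def close(self):
        self._null.close()
```

In this sketch a second lookup for the same group returns the cached monitor, so a new session is created only the first time a group is queried.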
tags: added: compute
tags: added: low-hanging-fruit
Changed in nova:
assignee: nobody → Paul Green (paul-green-u)
Changed in nova:
importance: Undecided → Low
Changed in nova:
assignee: Paul Green (paul-green-u) → Sean Dague (sdague)
Changed in nova:
milestone: none → juno-rc1
status: Fix Committed → Fix Released
Changed in nova:
milestone: juno-rc1 → 2014.2
Update: I work for the same company that Jeff worked for when he entered this report. This problem still manifests itself in Havana and Icehouse. We have reproduced the issue using the following packages:
CentOS 6.5
Python 2.6
evzookeeper 0.4.0
zc-zookeeper-static 3.4.4
The stack trace of our current failures is similar to the original report.
The code change posted by Jeff and included with this report resolves the issue.