ServiceBinaryExists - binary for nova-conductor already exists
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Undecided | Corey Bryant |
Icehouse | Fix Released | Undecided | Unassigned |
nova (Ubuntu) | Fix Released | High | Corey Bryant |
Trusty | Triaged | High | Unassigned |
Utopic | Fix Released | High | Corey Bryant |
Bug Description
We're hitting an intermittent issue where ServiceBinaryExists is raised for nova-conductor on deployment.
From nova-conductor's upstart log ( /var/log/
2014-05-15 12:02:25.206 34494 INFO nova.openstack.
2014-05-15 12:02:25.241 34494 INFO nova.openstack.
2014-05-15 12:02:25.242 34494 INFO nova.openstack.
2014-05-15 12:02:25.244 34494 INFO nova.openstack.
2014-05-15 12:02:25.246 34494 INFO nova.openstack.
2014-05-15 12:02:25.246 34501 AUDIT nova.service [-] Starting conductor node (version 2014.1)
2014-05-15 12:02:25.247 34502 AUDIT nova.service [-] Starting conductor node (version 2014.1)
2014-05-15 12:02:25.247 34494 INFO nova.openstack.
2014-05-15 12:02:25.249 34503 AUDIT nova.service [-] Starting conductor node (version 2014.1)
2014-05-15 12:02:25.251 34504 AUDIT nova.service [-] Starting conductor node (version 2014.1)
2014-05-15 12:02:25.254 34505 AUDIT nova.service [-] Starting conductor node (version 2014.1)
2014-05-15 12:02:25.250 34494 INFO nova.openstack.
2014-05-15 12:02:25.261 34494 INFO nova.openstack.
2014-05-15 12:02:25.263 34494 INFO nova.openstack.
2014-05-15 12:02:25.266 34494 INFO nova.openstack.
2014-05-15 12:02:25.267 34507 AUDIT nova.service [-] Starting conductor node (version 2014.1)
2014-05-15 12:02:25.268 34506 AUDIT nova.service [-] Starting conductor node (version 2014.1)
2014-05-15 12:02:25.271 34508 AUDIT nova.service [-] Starting conductor node (version 2014.1)
/usr/lib/
match = pattern.
/usr/lib/
match = pattern.
Traceback (most recent call last):
File "/usr/lib/
timer()
File "/usr/lib/
cb(*args, **kw)
File "/usr/lib/
2014-05-15 12:02:25.862 34502 ERROR oslo.messaging.
result = function(*args, **kwargs)
File "/usr/lib/
service.start()
File "/usr/lib/
self.
File "/usr/lib/
service = self.conductor_
File "/usr/lib/
return self._manager.
File "/usr/lib/
return func(*args, **kwargs)
File "/usr/lib/
svc = self.db.
File "/usr/lib/
return IMPL.service_
File "/usr/lib/
return f(*args, **kwargs)
File "/usr/lib/
binary=
ServiceBinaryEx
2014-05-15 12:02:25.864 34503 ERROR nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.
After looking into the traceback, I came across the code below, which starts a service. Of note is the second (inner) except path, which was introduced in January to handle races when creating a service record. The log above shows several conductor workers starting within milliseconds of each other, so it looks like we are hitting a similar race and may need to handle ServiceBinaryExists the same way ServiceTopicExists is being handled. I'll submit a patch for this.
From nova/service.py:
158 def start(self):
159 verstr = version.
160 LOG.audit(
161 {'topic': self.topic, 'version': verstr})
162 self.basic_
163 self.manager.
164 self.model_
165 ctxt = context.
166 try:
167 self.service_ref = self.conductor_
168 self.host, self.binary)
169 self.service_id = self.service_
170 except exception.NotFound:
171 try:
172 self.service_ref = self._create_
173 except exception.
174 # NOTE(danms): If we race to create a record with a sibling
175 # worker, don't fail here.
176 self.service_ref = self.conductor_
177 self.host, self.binary)
$ git blame -L 158,178 nova/service.py
b3f5aba0 (Andy Smith 2010-12-09 15:25:14 -0800 158) def start(self):
481d6ff1 (Daniel P. Berrange 2012-12-17 12:17:59 +0000 159) verstr = version.
481d6ff1 (Daniel P. Berrange 2012-12-17 12:17:59 +0000 160) LOG.audit(
481d6ff1 (Daniel P. Berrange 2012-12-17 12:17:59 +0000 161) {'topic': self.topic, 'version': verstr})
e8386a27 (Rafi Khardalian 2013-01-23 01:55:09 +0000 162) self.basic_
065257fb (Vishvananda Ishaya 2010-09-23 12:43:41 -0700 163) self.manager.
9c98cfb4 (Vishvananda Ishaya 2010-08-30 00:55:19 -0700 164) self.model_
5e3da586 (Vishvananda Ishaya 2010-10-01 05:57:17 -0700 165) ctxt = context.
57a103b3 (Vishvananda Ishaya 2010-09-02 14:13:22 -0700 166) try:
e9d34263 (Russell Bryant 2013-01-14 17:39:29 -0500 167) self.service_ref = self.conductor_
e9d34263 (Russell Bryant 2013-01-14 17:39:29 -0500 168) self.host, self.binary)
e34bc343 (Wenhao Xu 2013-01-21 19:07:34 +0800 169) self.service_id = self.service_
57a103b3 (Vishvananda Ishaya 2010-09-02 14:13:22 -0700 170) except exception.NotFound:
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 171) try:
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 172) self.service_ref = self._create_
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 173) except exception.
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 174) # NOTE(danms): If we race to create a record
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 175) # worker, don't fail here.
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 176) self.service_ref = self.conductor_
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 177) self.host, self.binary)
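To illustrate the idea, here is a minimal, self-contained sketch of the proposed handling. It is not the actual nova patch: FakeServiceTable, ServiceNotFound and get_or_create_service are hypothetical stand-ins for the services table, exception.NotFound and the lookup/create logic in start(). Only the exception names ServiceTopicExists and ServiceBinaryExists come from this report. The point is that a worker which loses the creation race falls back to reading the record its sibling just created.

```python
import threading


class ServiceNotFound(Exception):
    """Hypothetical stand-in for nova's exception.NotFound."""


class ServiceTopicExists(Exception):
    """Stand-in for the exception already handled in start()."""


class ServiceBinaryExists(Exception):
    """Stand-in for the exception seen in this bug."""


class FakeServiceTable:
    """Hypothetical in-memory services table, unique on (host, binary)."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def get(self, host, binary):
        with self._lock:
            try:
                return self._rows[(host, binary)]
            except KeyError:
                raise ServiceNotFound((host, binary))

    def create(self, host, binary, topic):
        with self._lock:
            if (host, binary) in self._rows:
                # Mirrors the uniqueness failure reported in the traceback.
                raise ServiceBinaryExists((host, binary))
            row = {"id": len(self._rows) + 1, "host": host,
                   "binary": binary, "topic": topic}
            self._rows[(host, binary)] = row
            return row


def get_or_create_service(table, host, binary, topic):
    """Look up the service record, creating it if it is missing."""
    try:
        return table.get(host, binary)
    except ServiceNotFound:
        try:
            return table.create(host, binary, topic)
        except (ServiceTopicExists, ServiceBinaryExists):
            # A sibling worker created the record first; reuse it
            # instead of failing.
            return table.get(host, binary)
```

With this shape, any number of workers calling get_or_create_service for the same host and binary end up sharing a single record, regardless of which worker wins the race.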
tags: | added: conductor |
affects: | ubuntu → nova (Ubuntu) |
tags: | added: icehouse-backport-potential |
Changed in nova: | |
milestone: | none → juno-2 |
status: | Fix Committed → Fix Released |
Changed in nova (Ubuntu Utopic): | |
status: | New → Fix Released |
importance: | Undecided → High |
Changed in nova (Ubuntu Trusty): | |
importance: | Undecided → High |
status: | New → Triaged |
Changed in nova: | |
milestone: | juno-2 → 2014.2 |
Which release is this on? Or is it trunk/master for Juno as of 6/5?