Volume detach fails if there are multiple BDM entries
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Expired | Undecided | Unassigned |
Bug Description
Steps to reproduce:
1. Attaching a volume to an instance fails because of an RPC timeout when nova-api calls nova-compute to create the BDM.
2. Attaching the same volume to the same instance succeeds on the second attempt.
3. There are now two BDMs for this volume, and one of them has an empty connection_info. When we try to detach the volume, an error is thrown because of the stale BDM entry created in step 1 (the failing decode is sketched after the traceback below):
[req-b14eb2a2-...]
Traceback (most recent call last):
  ... (file paths and most intermediate RPC-dispatch and nova-compute frames are truncated in the original report)
  File "/usr/lib/...", in ...
    connection_info = jsonutils. ...
  File "/usr/lib/...", in ...
    return json.loads( ...
  File "/usr/lib/...", in ...
    raise TypeError("%s can't be decoded" % type(text))
TypeError: <type 'NoneType'> can't be decoded
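
The failure visible at the bottom of the traceback is that the stale BDM row carries connection_info = NULL and the detach path tries to JSON-decode that value. The following is a minimal, standalone sketch of that decode step (illustrative Python only, not nova's actual code; safe_decode here just mimics the oslo-style type guard named in the error message):

```python
# Minimal sketch (not nova code): why a BDM row whose connection_info is
# None blows up when the detach path tries to deserialize it.
import json

def safe_decode(text):
    # Mirrors the guard seen in the traceback: only str/bytes can be
    # decoded; anything else (e.g. None) raises TypeError.
    if not isinstance(text, (str, bytes)):
        raise TypeError("%s can't be decoded" % type(text))
    return text if isinstance(text, str) else text.decode("utf-8")

stale_bdm = {"volume_id": "vol-123", "connection_info": None}  # entry from step 1
good_bdm = {"volume_id": "vol-123",
            "connection_info": '{"driver_volume_type": "iscsi"}'}  # entry from step 2

for bdm in (stale_bdm, good_bdm):
    try:
        info = json.loads(safe_decode(bdm["connection_info"]))
        print("decoded connection_info:", info)
    except TypeError as exc:
        print("detach would fail here:", exc)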
It is not easy to catch the timeout and then delete the BDM entry, because the entry may only be created after the timeout has already fired (we have seen this in our environment). We might also accidentally delete an entry created by a concurrent attach request.
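
One way a reader might instead handle the duplicates at detach time (purely a sketch with an assumed row layout and hypothetical helper name, not something nova does or that this report proposes): prefer the BDM whose connection_info is populated and treat null-connection_info duplicates for the same instance/volume pair as stale.

```python
# Hypothetical cleanup sketch, not nova code: given all BDM rows for one
# (instance, volume) pair, pick the usable one and flag stale duplicates.
def split_bdms(bdms):
    """Return (usable_bdm, stale_bdms) for a single instance/volume pair."""
    usable = [b for b in bdms if b.get("connection_info")]
    stale = [b for b in bdms if not b.get("connection_info")]
    if not usable:
        raise RuntimeError("no BDM with connection_info; nothing to detach")
    # If the attach path ever produced two populated rows, a different
    # tie-breaker (e.g. newest updated_at) would be needed; assume one here.
    return usable[0], stale

bdms = [
    {"id": 101, "volume_id": "vol-123", "connection_info": None},  # from step 1
    {"id": 102, "volume_id": "vol-123",
     "connection_info": '{"driver_volume_type": "iscsi"}'},         # from step 2
]
good, stale = split_bdms(bdms)
print("detach using BDM", good["id"],
      "; treat as stale:", [b["id"] for b in stale])
```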
Need more information about the environment: which version of nova is this? If master, what is the git hash? Which virt driver?