infinite recursion when deleting an instance with no network interfaces
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | High | Peter Feiner |
Havana | Fix Released | High | Ihar Hrachyshka |
Bug Description
In some situations when an instance has "no network information" (a phrase that I'm using loosely), deleting the instance results in infinite recursion. The stack looks like this:
2013-11-15 18:50:28.995 DEBUG nova.network.
func(*args, **kwargs)
File "/opt/stack/
**args)
File "/opt/stack/
result = getattr(proxyobj, method)(ctxt, **kwargs)
File "/opt/stack/
return function(self, context, *args, **kwargs)
File "/opt/stack/
return f(self, context, *args, **kw)
File "/opt/stack/
return function(self, context, *args, **kwargs)
File "/opt/stack/
function(self, context, *args, **kwargs)
File "/opt/stack/
return function(self, context, *args, **kwargs)
File "/opt/stack/
do_
File "/opt/stack/
return f(*args, **kwargs)
File "/opt/stack/
reservation
File "/opt/stack/
rv = f(*args, **kwargs)
File "/opt/stack/
self.
File "/opt/stack/
network_info = self._get_
File "/opt/stack/
instance)
File "/opt/stack/
result = self._get_
File "/opt/stack/
nw_info=res)
RECURSION STARTS HERE
File "/opt/stack/
nw_info = api._get_
File "/opt/stack/
nw_info=res)
... REPEATS AD NAUSEUM ...
File "/opt/stack/
nw_info = api._get_
File "/opt/stack/
nw_info=res)
File "/opt/stack/
nw_info = api._get_
File "/opt/stack/
res = f(self, context, *args, **kwargs)
File "/opt/stack/
LOG.debug('%s', ''.join(
Here's a step-by-step explanation of how the infinite recursion arises:
1. somebody calls nova.network.
2. in the above call, the network info is successfully retrieved as result = self._get_
3. however, since the instance has "no network information", result is the empty list (i.e., [])
4. the result is put in the cache by calling nova.network.
5. update_
if not nw_info:
nw_info = api._get_
which erroneously equates [] and None. Hence the check should be "if nw_info is None:"
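The steps above can be reproduced in isolation. The following is only a minimal sketch of the pattern, not nova's actual code: every name in it (`lookup_nw_info`, `get_nw_info`, `update_cache_buggy`, `update_cache_fixed`, `CACHE`) is hypothetical and stands in for the truncated functions in the description. It assumes the lookup legitimately returns an empty list and that the cache updater calls back into the lookup whenever it believes it was given nothing.

```python
CACHE = {}


def lookup_nw_info():
    # Hypothetical stand-in for the real network lookup.
    # The instance has no NICs, so the legitimate answer is an empty list.
    return []


def get_nw_info(update_cache):
    # Fetch the network info, then hand the result to the cache updater.
    res = lookup_nw_info()
    update_cache(nw_info=res)
    return res


def update_cache_buggy(nw_info=None):
    # "not nw_info" is true for None AND for [], so a valid empty result
    # is mistaken for "nothing was passed in" and fetched again, forever.
    if not nw_info:
        nw_info = get_nw_info(update_cache_buggy)
    CACHE["nw_info"] = nw_info


def update_cache_fixed(nw_info=None):
    # Only refetch when the caller really passed nothing.
    if nw_info is None:
        nw_info = get_nw_info(update_cache_fixed)
    CACHE["nw_info"] = nw_info


def recurses_forever(update_cache):
    # Report whether the call chain exhausts Python's recursion limit.
    try:
        get_nw_info(update_cache)
    except RecursionError:
        return True
    return False


buggy_result = recurses_forever(update_cache_buggy)
fixed_result = recurses_forever(update_cache_fixed)
print(buggy_result, fixed_result)
```

In this sketch the buggy variant never reaches the cache write, because each call re-enters the lookup before it can store anything. The fixed variant stores the empty list on the first pass.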
I should clarify that the instance _did_ have network information at some point (i.e., I booted it normally with a NIC); however, some time after I issued a "nova delete" request, the network information was gone (i.e., in nova list, the networks column was empty for the instance while it was in the deleting task state).
I came across this problem when doing performance testing with the latest openstack code (i.e., the master branches as of this morning of all of the github.
There's an outstanding max recursion issue (https:/
Although the fix is simple enough, I'm not going to fire off a review immediately because I haven't put much thought into how to test it.
tags: | added: api network |
Changed in nova: | |
importance: | Undecided → High |
status: | New → Triaged |
Changed in nova: | |
assignee: | Armando Migliaccio (armando-migliaccio) → nobody |
assignee: | nobody → Peter Feiner (pete5) |
status: | Triaged → In Progress |
tags: | added: havana-backport-potential |
tags: | removed: havana-backport-potential |
Changed in nova: | |
status: | In Progress → Fix Committed |
milestone: | none → icehouse-1 |
Changed in nova: | |
status: | Fix Committed → Fix Released |
tags: | removed: in-stable-havana |
Changed in nova: | |
milestone: | icehouse-1 → 2014.1 |
Peter, thanks for the analysis. This seems nasty enough that I don't think there would be a ton of nitpicking over unit tests. It seems like the [] vs. None distinction could just be exercised in existing tests to verify the fix.
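As a rough illustration of what such a test might look like, here is a sketch using the standard `unittest` and `unittest.mock` modules. It does not use nova's test fixtures or real function names: `update_cache` and its `fetch` parameter are hypothetical simplifications, written with the proposed `is None` check, so that the two cases (empty list versus `None`) can be asserted separately.

```python
import unittest
from unittest import mock


def update_cache(cache, nw_info=None, fetch=None):
    # Hypothetical, simplified cache update using the proposed check.
    if nw_info is None:
        nw_info = fetch()
    cache["nw_info"] = nw_info


class UpdateCacheTest(unittest.TestCase):
    def test_empty_list_is_cached_without_refetch(self):
        # An empty list is a valid result and must not trigger a refetch.
        fetch = mock.Mock(return_value=[])
        cache = {}
        update_cache(cache, nw_info=[], fetch=fetch)
        fetch.assert_not_called()
        self.assertEqual(cache["nw_info"], [])

    def test_none_triggers_fetch(self):
        # None means "nothing supplied", so the info is fetched once.
        fetch = mock.Mock(return_value=["vif0"])
        cache = {}
        update_cache(cache, nw_info=None, fetch=fetch)
        fetch.assert_called_once_with()
        self.assertEqual(cache["nw_info"], ["vif0"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(UpdateCacheTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

With the original `if not nw_info:` check, the first test would fail because the mocked fetch would be called for the empty list.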