Network interface allocation corrupts instance info cache
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Medium | Sahid Orentino |
Liberty | Fix Released | Medium | Mark Goddard |
Bug Description
Allocation of network interfaces for an instance can result in corruption of the instance info cache in Nova. The result is that the cache may contain duplicate entries for network interfaces. This can cause failure to boot nodes, as seen with the Libvirt driver.
Seen on Ubuntu / devstack / commit b0013d93ffeaed5
The issue can be reproduced using an instance with a large number of interfaces, for example using the heat stack in the attached YAML file heat-stack-
This issue was found by SecurityFun23 when testing the fix for bug #1467581.
The problem appears to be that in nova.network.
The perceived problem in a more visual form:
Request:
- Allocate interfaces for an instance (nova.network.
- n x Neutron API port create/updates
-------
Notification:
- External event notification from Neutron - network-changed (nova.compute.
- Refresh instance network cache (network_
- Query ports for device in Neutron
- Add new ports to instance info cache
-------
Request:
- Refresh instance network cache with new interfaces (get_instance_
- Unconditionally add duplicate interfaces to cache.
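The sequence above can be modelled with a small, self-contained sketch. This is not Nova code: `InfoCache`, `refresh_from_neutron`, `add_allocated_unconditionally`, `add_allocated_deduplicated` and the port IDs are all hypothetical names chosen for illustration, and the cache is reduced to a plain list of port IDs. It only shows how an event-driven refresh that lands between port creation and the final cache update can produce duplicates, and how a membership check on the port ID would avoid them.

```python
from dataclasses import dataclass, field


@dataclass
class InfoCache:
    """Toy stand-in for an instance's network info cache (hypothetical)."""
    port_ids: list = field(default_factory=list)


def refresh_from_neutron(cache, neutron_port_ids):
    """Event-driven refresh: add ports Neutron reports that the cache lacks."""
    for port_id in neutron_port_ids:
        if port_id not in cache.port_ids:
            cache.port_ids.append(port_id)


def add_allocated_unconditionally(cache, new_port_ids):
    """Allocation path as described in the report: append without checking."""
    cache.port_ids.extend(new_port_ids)


def add_allocated_deduplicated(cache, new_port_ids):
    """Assumed remedy for illustration: skip ports already in the cache."""
    for port_id in new_port_ids:
        if port_id not in cache.port_ids:
            cache.port_ids.append(port_id)


def simulate(add_allocated):
    """Replay the three steps of the race in a fixed order."""
    neutron_ports = []
    cache = InfoCache()

    # 1. The allocation request creates the ports in Neutron.
    created = ["port-a", "port-b"]
    neutron_ports.extend(created)

    # 2. A network-changed event triggers a refresh before step 3 runs,
    #    so the cache already learns about the new ports.
    refresh_from_neutron(cache, neutron_ports)

    # 3. The allocation request updates the cache with the ports it created.
    add_allocated(cache, created)
    return cache.port_ids


if __name__ == "__main__":
    print(simulate(add_allocated_unconditionally))
    print(simulate(add_allocated_deduplicated))
```

With the unconditional variant each port appears twice in the simulated cache. With the de-duplicating variant each port appears once, regardless of whether the notification arrives before the final update.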
Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
tags: added: network
Changed in nova:
assignee: Mark Goddard (mgoddard) → Roman Podoliaka (rpodolyaka)
Possibly caused by the fix for bug #1407664, which allows an empty instance info cache to be updated based on Neutron ports. The unit test 'test_get_instance_nw_info_ignores_neutron_ports' seems to suggest that during a cache refresh, ports seen in Neutron with a matching instance/device ID should be ignored. However, the test only covers the case where there is already a network in the cache. Adding a similar test case for when there is no network in the cache currently fails. Without having looked into it too much, I think that the fix for bug #1467581 may also fix #1407664, in which case the fix for #1407664 could potentially be reverted.