Nova creates duplicate Neutron ports on instance reschedule

Bug #1609526 reported by Major Hayden on 2016-08-03
This bug affects 19 people
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

Consider this environment:

* Running stable/mitaka (latest available)
* Four hypervisors
* Two glance nodes (A and B)
* The glance nodes are storing images locally but the image files aren't in sync between both hosts

When I request a new instance, the following happens:

* Instance is scheduled to hypervisor A
* Hypervisor A checks to see if the image is available for use -- SUCCESS
* Hypervisor A calls neutron for a network port -- SUCCESS
* Hypervisor A tries to download image from glance server A -- FAILURE (glance server A doesn't have the image cached on its filesystem)
* Instance is rescheduled to hypervisor B
* Hypervisor B checks to see if the image is available for use -- SUCCESS
* Hypervisor B calls neutron for a network port -- SUCCESS
* Hypervisor B downloads an image from glance server B -- SUCCESS (glance server B has the image on its filesystem)

The instance comes up on hypervisor B with two ports attached. The second one (requested by hypervisor B) will be up and fully functional. The first port (requested by hypervisor A) will be marked as 'down' and won't be usable.
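The leftover port is easy to spot from the Neutron API. Here is a minimal sketch (not part of the original report; it uses python-neutronclient, and the auth URL, credentials, and instance UUID are placeholders) that lists an instance's ports and flags the DOWN one:

# Minimal sketch: list the ports Neutron has bound to a given instance and
# flag any that are DOWN. The auth URL, credentials, and instance UUID below
# are placeholders.
from keystoneauth1 import loading, session
from neutronclient.v2_0 import client as neutron_client

loader = loading.get_plugin_loader('password')
auth = loader.load_from_options(
    auth_url='http://controller:5000/v3',
    username='admin', password='secret',
    project_name='admin',
    user_domain_name='Default', project_domain_name='Default')
neutron = neutron_client.Client(session=session.Session(auth=auth))

instance_uuid = 'REPLACE-WITH-INSTANCE-UUID'

# Neutron stores the owning instance UUID in the port's device_id field.
for port in neutron.list_ports(device_id=instance_uuid)['ports']:
    print(port['id'], port['status'], port['fixed_ips'])
    if port['status'] == 'DOWN':
        print('  ^ likely leftover from a failed build on another hypervisor')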

It seems like nova-compute should call neutron to say "I don't need that network port any longer since I can't get what I need to build the rest of the instance" and clean up that port. Without the cleanup, an instance can end up with a lot of ports attached and potentially waste a lot of IPv4 address space.
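For illustration, a minimal sketch of that cleanup, reusing the neutron client and placeholder UUID from the snippet above. list_ports and delete_port are standard python-neutronclient calls, but inside Nova this would go through its own network API rather than a raw client:

# Delete the DOWN port(s) left behind by the failed build attempt so the
# reschedule starts clean. Illustrative only; not Nova's actual code path.
def cleanup_ports_for_failed_build(neutron, instance_uuid):
    for port in neutron.list_ports(device_id=instance_uuid)['ports']:
        if port['status'] == 'DOWN':
            neutron.delete_port(port['id'])
            print('deleted stale port %s' % port['id'])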

I wrote more details on this issue here: https://major.io/2016/08/03/openstack-instances-come-online-with-multiple-network-ports-attached/

summary: - nova doesn't clean up network ports when an image fails to download from glance
         + nova should clean up network ports when an image fails to download from glance

Today I faced the same issue in my devstack.

stack@szxbzci0004 ~/nova (master *) $ git log -1
commit e9d503a1202fadd5163e343424cf15285f5dc016
Merge: 5426d95 a6ad102
Author: Jenkins <email address hidden>
Date: Thu Sep 1 03:15:49 2016 +0000

    Merge "Update placement config reno"

I have two compute nodes, but one of them (A) has an RBD configuration issue, so when libvirt tries to launch the instance a LibvirtError is raised and the instance is rescheduled to the other compute node (B). The linux bridge isn't cleaned up on compute node A, and the instance launches successfully on compute node B, but it allocates a port again, so the instance ends up running with two ports.

See my operation details:
http://paste.openstack.org/show/565674/
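The shape of the fix being asked for is roughly the following. This is illustrative pseudocode only, not Nova's actual build path; deallocate_for_instance names the network-API deallocation step, and the driver/network_api objects are placeholders:

# Illustrative sketch: if the hypervisor driver fails to spawn (e.g. a
# LibvirtError from a bad RBD config), release the network resources that
# were allocated on this host before the instance is rescheduled, so the
# next build attempt doesn't leave a duplicate port and bridge behind.
def build_instance(driver, network_api, context, instance, network_info):
    try:
        driver.spawn(context, instance, network_info)   # fails on node A
    except Exception:
        # Hypothetical cleanup hook run before the reschedule is raised.
        network_api.deallocate_for_instance(context, instance)
        raise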

Changed in nova:
status: New → Confirmed
Changed in nova:
assignee: nobody → Zhenyu Zheng (zhengzhenyu)
cloudbuilders (operations-8) wrote :

We've come across this problem as well.
We have 4 Glance nodes, with the images mounted on an NFS volume. One of the Glance instances went down and failed to mount the NFS share when it rebooted. We started seeing VMs with more than one port assigned (showing more than one IP per VM in Horizon).

It seems to us that Nova should tell Neutron either to delete the unused port, or to update it instead of creating a new one.
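As a sketch of the "update it instead" option, using python-neutronclient's update_port and the standard binding:host_id port attribute; the port ID and host name are placeholders:

# Rebind the port from the failed build to the new hypervisor instead of
# asking Neutron for a fresh one. Requires admin; illustrative only.
def rebind_port_to_new_host(neutron, port_id, new_host):
    neutron.update_port(port_id, {'port': {'binding:host_id': new_host}})

# e.g. rebind_port_to_new_host(neutron, 'PORT-UUID', 'hypervisor-b')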

Maciej Szankin (mszankin) wrote :

Zhenyu Zheng, how is the work going? It has been some time since your last activity. If you are actively working on this item, can you confirm? Otherwise, please unassign yourself.

summary: - nova should clean up network ports when an image fails to download from glance
         + Nova creates duplicate Neutron ports on instance reschedule
Changed in nova:
importance: Undecided → Medium
Changed in nova:
assignee: Zhenyu Zheng (zhengzhenyu) → nobody
