commit ad20a87028523f0a1fdf2e9319fac4537c9fbbf3
Author: Matt Riedemann <email address hidden>
Date: Wed Nov 15 19:15:44 2017 -0500
Always deallocate networking before reschedule if using Neutron
When a server build fails on a selected compute host, the compute
service will cast to conductor which calls the scheduler to select
another host to attempt the build if retries are not exhausted.
With commit 08d24b733ee9f4da44bfbb8d6d3914924a41ccdc, if retries
are exhausted or the scheduler raises NoValidHost, conductor will
deallocate networking for the instance. In the case of neutron, this
means unbinding any ports that the user provided with the server
create request and deleting any ports that nova-compute created during
the allocate_for_instance() operation during server build.
When an instance is deleted, its networking is deallocated in the same
way - unbind pre-existing ports, delete ports that nova created.
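The unbind-versus-delete rule described above can be sketched as follows. This is a minimal illustration, not nova's actual code: the `Port` class, its fields, and `deallocate_ports()` are hypothetical names invented for this example, and the real cleanup goes through the Neutron API rather than in-memory objects.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Port:
    """Hypothetical stand-in for a Neutron port attached to an instance."""
    port_id: str
    preexisting: bool  # True if the user supplied the port in the create request
    bound_host: Optional[str] = None


def deallocate_ports(ports: List[Port]) -> Tuple[List[Port], List[str]]:
    """Unbind user-provided ports and collect nova-created ports for deletion.

    Returns the ports that were kept (now unbound) and the IDs of the
    ports that would be deleted.
    """
    unbound = []
    deleted_ids = []
    for port in ports:
        if port.preexisting:
            # The user owns this port: keep it, but clear the host binding.
            port.bound_host = None
            unbound.append(port)
        else:
            # Nova created this port during the build: it should be deleted.
            deleted_ids.append(port.port_id)
    return unbound, deleted_ids
```

For example, given one user-provided port and one nova-created port bound to the failed host, the first is returned unbound and the second is listed for deletion.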
The problem is when rescheduling from a failed host, if we successfully
reschedule and build on a secondary host, any ports created from the
original host are not cleaned up until the instance is deleted. For
Ironic or SR-IOV ports, those are always deallocated.
The ComputeDriver.deallocate_networks_on_reschedule() method defaults
to False just so that the Ironic driver could override it, but really
we should always clean up neutron ports before rescheduling.
Looking over bug report history, there are some mentions of different
networking backends handling reschedules with multiple ports differently,
in that sometimes it works and sometimes it fails. Regardless of the
networking backend, however, we are at worst taking up port quota for
the tenant for ports that will not be bound to whatever host the instance
ends up on.
There could also be legacy reasons for this behavior with nova-network,
so that is side-stepped here by just restricting this check to whether
or not neutron is being used. When we eventually remove nova-network we
can then also remove the deallocate_networks_on_reschedule() method and
SR-IOV check.
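The resulting decision can be sketched as a single boolean check. This is an illustration under stated assumptions, not the code from the change itself: the function name and its three parameters are hypothetical, standing in for "is neutron in use", the result of the driver's deallocate_networks_on_reschedule() method, and the SR-IOV check mentioned above.

```python
def should_deallocate_before_reschedule(
    using_neutron: bool,
    driver_requests_deallocation: bool,
    has_sriov_ports: bool,
) -> bool:
    """Decide whether to clean up networking before rescheduling a build.

    Hypothetical sketch: with neutron, networking is always deallocated.
    The other two conditions preserve the older behavior for nova-network,
    where only the driver override (e.g. Ironic) or SR-IOV ports trigger
    the cleanup.
    """
    return using_neutron or driver_requests_deallocation or has_sriov_ports
```

Under this sketch, once nova-network is gone the first argument is always true, so the other two checks become redundant, which matches the note above about removing them later.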
NOTE(mriedem): There are a couple of changes to the unit test for code
that didn't exist in Pike, due to the change for alternate hosts
Iae904afb6cb4fcea8bb27741d774ffbe986a5fb4 and the change to pass the
request spec to conductor Ie5233bd481013413f12e55201588d37a9688ae78.
Change-Id: Ib2abf73166598ff14fce4e935efe15eeea0d4f7d
Closes-Bug: #1597596
(cherry picked from commit 3a503a8f2b934f19049531c5c92130ca7cdd6a7f)
(cherry picked from commit 9203326f84cd35243e9e6a73cd5fac62af27aaf5)
Reviewed: https://review.openstack.org/555907
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ad20a87028523f0a1fdf2e9319fac4537c9fbbf3
Submitter: Zuul
Branch: stable/pike