When performing a live migration (KVM) with multi_host set to True, two things go wrong:
1) The networks (bridge and VLAN) on the destination node are not set up by nova-network.
*) If they are not configured before the migration, the instance fails to start on the destination node and is rolled back to the source node.
2) dnsmasq is not updated on the destination node.
*) The dnsmasq hosts file is not updated on the migration destination, so dnsmasq there does not reply to DHCP requests from the migrated instance.
*) Additionally, DHCP requests are still answered by the migration source node until a new instance is created on that compute node. When that happens, the dnsmasq hosts files are rewritten and dnsmasq is sent SIGHUP, after which it no longer responds to DHCP requests from the migrated instance.
If both of the above occur, the migrated instance loses IP access when its lease expires.
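The DHCP part of the failure can be illustrated with a toy model. This is only a sketch and not nova or dnsmasq code: `render_hosts`, `ToyDnsmasq`, the host names and the `mac,name,ip` line layout are all hypothetical, and it assumes (as the behaviour described above suggests) that each compute node's hosts file lists only the instances placed on that node.

```python
def render_hosts(instances, host):
    """Build hosts-file lines for the instances currently placed on `host`.

    Assumption: in multi_host mode each node's file covers only its own instances.
    """
    return sorted(
        "{mac},{name},{ip}".format(**inst)
        for inst in instances
        if inst["host"] == host
    )


class ToyDnsmasq:
    """Hypothetical stand-in for a per-node dnsmasq process."""

    def __init__(self):
        self.loaded = []

    def sighup(self, hosts_lines):
        # Model of a reload: replace the in-memory entries with the file contents.
        self.loaded = list(hosts_lines)

    def answers(self, mac):
        # The toy daemon replies only to MACs it has an entry for.
        return any(line.startswith(mac + ",") for line in self.loaded)


VM1_MAC = "02:16:3e:00:00:01"
instances = [{"name": "vm1", "mac": VM1_MAC, "ip": "10.0.0.3", "host": "source"}]
daemons = {"source": ToyDnsmasq(), "dest": ToyDnsmasq()}
for node, daemon in daemons.items():
    daemon.sighup(render_hosts(instances, node))

# Live migration moves vm1, but neither hosts file is refreshed.
instances[0]["host"] = "dest"
after_migration = (daemons["source"].answers(VM1_MAC), daemons["dest"].answers(VM1_MAC))

# A new instance on the source node triggers a rewrite and SIGHUP there only.
instances.append(
    {"name": "vm2", "mac": "02:16:3e:00:00:02", "ip": "10.0.0.4", "host": "source"}
)
daemons["source"].sighup(render_hosts(instances, "source"))
after_rewrite = (daemons["source"].answers(VM1_MAC), daemons["dest"].answers(VM1_MAC))
```

In this model the source node keeps answering for vm1 right after the migration, and once its file is rewritten no node answers for vm1 at all, which matches the lease-expiry symptom.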
I have included a patch that will fix this in the short term, but a more elegant resolution is required.
Tested and fixed on diablo/stable. This bug is also present in essex.
I discussed this a bit offline with the networking team. It seems a little challenging to do the correct implementation for essex, but here is the basic plan:
Notes below from Trey Morris:
We'll need to pull the network_setup functionality out of IP allocation/deallocation and add a callable trigger for that functionality to the network API. It looks like for allocate_ip the functionality is already split out into the _setup_network() function. We need to do something similar for deallocate_ip, like _teardown_network(), and create a setup_networks() function with a corresponding network_api call.
setup/unsetup_all could be optimized into one function with a default parameter, something like:
def setup_networks(self, context, teardown=False, **kwargs):
    if teardown:
        call_func = self._teardown_network
    else:
        call_func = self._setup_network

    *pull instance variables from kwargs*
    nw_info = self.get_instance_nw_info(instance_stuff...)
    for vif in nw_info:
        call_func(context, vif['network'])

The flow as I see it (from compute) would be

def live_migrate():
    self.network_api.setup_networks(instance, teardown=True)
    perform migrate as it was before
    self.network_api.setup_networks(instance)
allocate_fixed_ip would still call self._setup_network(context, network) and the single network would be configured just as it was before. Deallocate could do the same, only it would call self._teardown_network(context, network) instead of performing the teardown in-function.
My only addition might be that you would want to tear down the network on the old host after the migrate, which means you might have to pass the host in the call somewhere.
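A minimal, self-contained sketch of how the plan could look with the host passed explicitly. This is not the nova implementation: `FakeNetworkManager`, the `host` argument, the `migrate` callback and the exact signatures are hypothetical, and the ordering (set up on the destination before migrating, tear down on the source afterwards) is one reading of the notes above combined with the original report.

```python
class FakeNetworkManager:
    """Hypothetical stand-in for the network manager; it only records calls."""

    def __init__(self, nw_info):
        self._nw_info = nw_info  # instance id -> list of vif dicts
        self.calls = []

    def get_instance_nw_info(self, context, instance_id):
        return self._nw_info[instance_id]

    def _setup_network(self, context, network, host):
        # Real code would create the bridge/VLAN and refresh dnsmasq on `host`.
        self.calls.append(("setup", network, host))

    def _teardown_network(self, context, network, host):
        self.calls.append(("teardown", network, host))

    def setup_networks(self, context, instance_id, host, teardown=False):
        # One entry point for both directions, selected by the default parameter.
        call_func = self._teardown_network if teardown else self._setup_network
        for vif in self.get_instance_nw_info(context, instance_id):
            call_func(context, vif["network"], host)


def live_migrate(manager, context, instance_id, source, dest, migrate):
    # Prepare the destination first so the instance can start there.
    manager.setup_networks(context, instance_id, host=dest)
    # Perform the migration as before (supplied by the caller in this sketch).
    migrate(instance_id, source, dest)
    # Clean up the old host once the instance has moved.
    manager.setup_networks(context, instance_id, host=source, teardown=True)


manager = FakeNetworkManager({"i-1": [{"network": "net-a"}, {"network": "net-b"}]})
moved = []
live_migrate(manager, None, "i-1", "src", "dst",
             lambda inst, s, d: moved.append((inst, s, d)))
```

The sketch does not handle a failed migration; a real implementation would presumably need to undo the destination setup on rollback.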
In the meantime, the above patch will at least make things work.