The compute manager's virtapi allows the compute driver to wait for external events via the wait_for_instance_event() method. The common use case is for a compute driver to wait for the vifs to be plugged by neutron before proceeding through the spawn. The pattern is also present in the libvirt driver: see _create_domain_and_network() in libvirt's driver.py, which uses the wait_for_instance_event context manager.
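To make the pattern concrete, here is a minimal, self-contained sketch of how such a context manager can work. This is an illustration only, not Nova's actual implementation; MiniVirtAPI, its deliver() method, and the use of threading.Event are my own simplifications:

```python
import threading
from contextlib import contextmanager

class MiniVirtAPI:
    """Toy stand-in for the compute manager's virtapi (illustration only)."""

    def __init__(self):
        self._events = {}

    @contextmanager
    def wait_for_instance_event(self, instance, event_names, deadline=300):
        # Register a latch per expected event *before* starting work,
        # so events that arrive mid-operation are not lost.
        latches = {name: threading.Event() for name in event_names}
        self._events[instance] = latches
        try:
            # The driver does its work (e.g. spawn) inside the block.
            yield
            # On exit, block until every expected event has arrived,
            # or raise if the deadline passes first.
            for name, latch in latches.items():
                if not latch.wait(deadline):
                    raise TimeoutError('never received event %s' % name)
        finally:
            del self._events[instance]

    def deliver(self, instance, event_name):
        # Called when an external event (e.g. network-vif-plugged)
        # arrives for this instance; wakes any waiter.
        latches = self._events.get(instance)
        if latches and event_name in latches:
            latches[event_name].set()
```

A driver would wrap its spawn in the context manager, and the compute manager would call deliver() when the event lands over RPC. If the event is routed elsewhere (as in the evacuate case below), the wait() times out.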
The flow for events coming into Nova is through nova/api/openstack/compute/server_external_events.py, which eventually calls compute_api.external_instance_event() to dispatch the events. In external_instance_event() you'll see it uses instance.host to call compute_rpcapi.external_instance_event(), so the RPC message goes to whichever host is currently set. In the case of evacuate, at that point in time (while the new host is spawning the recreated VM) instance.host is still set to the original host, which is down. So the compute driver that initiated the action and is waiting for the event will never get it.
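The misrouting is easy to see in a toy model. The following is not Nova code; the Instance class, hosts_up set, and delivered dict are stand-ins I made up to show dispatch-by-instance.host dropping the event during an evacuate:

```python
class Instance:
    """Toy instance record; only the host field matters here."""
    def __init__(self, host):
        self.host = host

def external_instance_event(instance, event, hosts_up, delivered):
    # Mirrors the behaviour described above: the event is cast to
    # whatever host instance.host currently names.
    target = instance.host
    if target in hosts_up:
        delivered.setdefault(target, []).append(event)
    # else: the RPC message goes to a down host and is silently lost

# During an evacuate, instance.host still points at the dead source
# host while the destination host is spawning and waiting.
inst = Instance(host='compute-1')      # original host, now down
delivered = {}
external_instance_event(inst, 'network-vif-plugged',
                        hosts_up={'compute-2'}, delivered=delivered)
# Nothing reaches compute-2, the host actually waiting for the event.
```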
The question was raised of why libvirt doesn't suffer the same fate. I can't answer that authoritatively, but libvirt has a number of conditions that must all be met before it will wait for the event. Here's what it currently checks before waiting for a vif-plugged event:
    timeout = CONF.vif_plugging_timeout
    if (self._conn_supports_start_paused and utils.is_neutron() and
            not vifs_already_plugged and power_on and timeout):
        events = self._get_neutron_events(network_info)
    else:
        events = []
But it does seem, from reading the code, that if all those conditions are met and the operation is an evacuate, libvirt would fail in the same way, though I have not tried it.
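The gating logic can be sketched as a plain function, which shows why the failure is easy to miss: any one condition being false means the driver never waits at all. This is a simplified stand-in for the libvirt check, not the driver's actual code, and the event-tuple shape is my own illustration:

```python
def events_to_wait_for(conn_supports_start_paused, using_neutron,
                       vifs_already_plugged, power_on, timeout,
                       network_info):
    # Mirrors the gate quoted above: wait only when every condition holds.
    if (conn_supports_start_paused and using_neutron and
            not vifs_already_plugged and power_on and timeout):
        # Stand-in for self._get_neutron_events(network_info).
        return [('network-vif-plugged', vif) for vif in network_info]
    return []

# All conditions met: the driver would wait, and on an evacuate the
# event routed to the dead source host would never arrive.
assert events_to_wait_for(True, True, False, True, 300,
                          ['vif-a']) == [('network-vif-plugged', 'vif-a')]
# Any condition false (here: vifs already plugged): no wait, no bug.
assert events_to_wait_for(True, True, True, True, 300, ['vif-a']) == []
```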