nova-compute doesn't shut down gracefully on SIGTERM, e.g. booting a VM fails with:
09:29:18 AUDIT nova.compute.manager [req-9cdbba9c-af3b-4845-9deb-c68bffe63d75 None] [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] Starting instance...
09:29:18 INFO nova.openstack.common.service [-] Caught SIGTERM, exiting
...
09:29:37 INFO nova.compute.manager [-] [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] VM Started (Lifecycle Event)
09:29:37 INFO nova.compute.manager [-] [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] VM Paused (Lifecycle Event)
...
09:34:37 WARNING nova.virt.libvirt.driver [req-9cdbba9c-af3b-4845-9deb-c68bffe63d75 None] Timeout waiting for vif plugging callback for instance 7ea3e761-6b85-49db-8dcc-79f6f2286df8
09:34:37 INFO nova.compute.manager [-] [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] VM Stopped (Lifecycle Event)
09:34:38 INFO nova.virt.libvirt.driver [req-9cdbba9c-af3b-4845-9deb-c68bffe63d75 None] [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] Deleting instance files /var/lib/nova/instances/7ea3e761-6b85-49db-8dcc-79f6f2286df8
09:34:38 ERROR nova.compute.manager [req-9cdbba9c-af3b-4845-9deb-c68bffe63d75 None] [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] Instance failed to spawn
09:34:38 TRACE nova.compute.manager [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] Traceback (most recent call last):
09:34:38 TRACE nova.compute.manager [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1773, in _spawn
09:34:38 TRACE nova.compute.manager [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] block_device_info)
09:34:38 TRACE nova.compute.manager [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2299, in spawn
09:34:38 TRACE nova.compute.manager [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] block_device_info)
09:34:38 TRACE nova.compute.manager [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3745, in _create_domain_and_network
09:34:38 TRACE nova.compute.manager [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] raise exception.VirtualInterfaceCreateException()
09:34:38 TRACE nova.compute.manager [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8] VirtualInterfaceCreateException: Virtual Interface creation failed
09:34:38 TRACE nova.compute.manager [instance: 7ea3e761-6b85-49db-8dcc-79f6f2286df8]
Ok, so my current understanding of the problem is shown in this sequence diagram: http://goo.gl/QAfcKU
Basically, graceful shutdown for RPC servers like nova-compute works like this: upon receiving SIGTERM, the RPC thread pool is resized to 0, no new RPC requests are accepted, and nova-compute waits until all running RPC threads finish. At the same time, the in-flight spawn relies on nova-compute receiving a notification from nova-api that neutron has finished plugging in the VIFs. So nova-compute is stuck waiting for a message it can no longer handle, and it exits after the graceful shutdown timeout (300s), leaving the VM in a paused state. After nova-compute is restarted, all VMs in a half-provisioned state are put into the ERROR state.
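To make the sequence above easier to follow, here is a minimal toy model of it using only the Python standard library. It is a sketch under assumptions, not nova or oslo code: `ToyRpcServer`, `simulate`, the "spawned"/"failed" outcomes and the short `wait_timeout` (standing in for the 300s wait seen in the log) are all hypothetical names chosen for illustration.

```python
import threading


class ToyRpcServer:
    """Hypothetical stand-in for an RPC server; not nova or oslo code."""

    def __init__(self, wait_timeout):
        self.accepting = True
        self.wait_timeout = wait_timeout
        self.vif_plugged = threading.Event()
        self.workers = []
        self.result = None

    def dispatch(self, handler):
        # Once shutdown has started, new requests are refused.
        if not self.accepting:
            return False
        worker = threading.Thread(target=handler)
        worker.start()
        self.workers.append(worker)
        return True

    def spawn_instance(self):
        # In-flight request: blocks until the "VIF plugged" callback arrives
        # or the wait times out.
        if self.vif_plugged.wait(self.wait_timeout):
            self.result = "spawned"
        else:
            self.result = "failed"

    def on_vif_plugged(self):
        # The callback is itself an RPC request that needs a free worker.
        self.vif_plugged.set()

    def stop(self):
        # Models SIGTERM: stop accepting requests, keep running workers alive.
        self.accepting = False

    def wait(self):
        # Graceful shutdown: wait for all in-flight workers to finish.
        for worker in self.workers:
            worker.join()


def simulate(sigterm_before_callback, wait_timeout=0.2):
    server = ToyRpcServer(wait_timeout)
    server.dispatch(server.spawn_instance)
    if sigterm_before_callback:
        server.stop()
    delivered = server.dispatch(server.on_vif_plugged)
    server.wait()
    return delivered, server.result


if __name__ == "__main__":
    print("no SIGTERM:", simulate(False))
    print("SIGTERM before callback:", simulate(True))
```

In this model the spawn request can only complete if the callback is dispatched, and the callback can only be dispatched while the server still accepts requests. Stopping the server between the two therefore guarantees that the in-flight spawn runs into its timeout, which is the behaviour visible in the log above.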