Comment 14 for bug 1477475

Roman Podoliaka (rpodolyaka) wrote :

In other words, I think we are fighting symptoms here (just as we did when we fixed the original problem in 7.0) instead of fixing the root cause.

I suggest we fix the docs and tests instead.

The procedure for compute nodes must be the following (this is from the Nova perspective; we'll need to repeat this exercise for other components as well):

1. Disable scheduling of new VMs on this nova-compute instance:

   nova service-disable <host> nova-compute

2. Shut off all the VMs running on the node to be re-installed:

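    # nova list prints a table, so keep only the ID column
    for VM in $(nova list --host <host> --minimal | awk -F'|' '/^\| [0-9a-f]/ {gsub(/ /, "", $2); print $2}'); do
       nova stop "$VM"
    done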

Alternatively, (live) migrate the VMs first (please see https://docs.google.com/document/d/1nZ4QHfOOqyioOUDgnrO3i7NQkUeiD7qyq0jlrIcuuNU/edit for details).

3. Preserve partitions.

4. Start redeployment.

5. Enable nova-compute service:

   nova service-enable <host> nova-compute

If we didn't (live) migrate the VMs:

6. Start VMs:

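    # nova list prints a table, so keep only the ID column
    for VM in $(nova list --host <host> --status SHUTOFF --minimal | awk -F'|' '/^\| [0-9a-f]/ {gsub(/ /, "", $2); print $2}'); do
       nova start "$VM"
    done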

Please see the Google doc for a detailed description of maintenance mode and VM migration.
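
For the docs and tests, the Nova-side steps above (1-2 and 5-6) could be wrapped roughly as follows. This is only a sketch: the function names extract_ids, prepare_node and restore_node are made up for illustration, it assumes the nova CLI prints its usual "|"-delimited table, and it does not cover steps 3 and 4 (preserving partitions and redeploying). Depending on the setup, nova list may also need --all-tenants to see instances owned by other tenants.

```shell
#!/bin/sh
# Hypothetical helper: read a nova CLI table on stdin and print only the
# ID column. Data rows start with "| <hex digit>"; header and border rows
# do not, so they are skipped.
extract_ids() {
    awk -F'|' '/^\| [0-9a-f]/ { gsub(/ /, "", $2); print $2 }'
}

# Hypothetical wrapper for steps 1-2: disable scheduling on the host,
# then stop every VM that nova reports on it.
prepare_node() {
    HOST=$1
    nova service-disable "$HOST" nova-compute
    for VM in $(nova list --host "$HOST" --minimal | extract_ids); do
        nova stop "$VM"
    done
}

# Hypothetical wrapper for steps 5-6: re-enable the service, then start
# the VMs that are in SHUTOFF state on the host.
restore_node() {
    HOST=$1
    nova service-enable "$HOST" nova-compute
    for VM in $(nova list --host "$HOST" --status SHUTOFF --minimal | extract_ids); do
        nova start "$VM"
    done
}
```

Usage would be "prepare_node <host>" before the redeployment and "restore_node <host>" after it. Note that restore_node, like the loop in step 6, starts every SHUTOFF VM on the host, including any that were already stopped before the maintenance began; recording the list of VMs stopped in step 2 would avoid that.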