On the source node, yes, I found the disk is still present in the _resize directory:
# ls /var/lib/nova/instances/a1b29f17-e910-4216-9165-e148e62d1ba1_resize/ -lha
total 20M
drwxr-xr-x 2 nova nova 4.0K Nov 14 12:48 .
drwxr-xr-x 29 nova nova 4.0K Nov 14 12:50 ..
-rw------- 1 root root 54K Nov 14 12:50 console.log
-rw-r--r-- 1 root root 20M Nov 14 12:50 disk
-rw-r--r-- 1 nova nova 79 Nov 14 12:48 disk.info
In the logs I see that it did try to copy the disk to the destination node. If the disk is small enough, the copy will succeed, but usually it will fail because the destination hosts (our rbd hypervisors) are booted from a very small ramdisk which cannot hold the instance root disk.
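Since the failure mode is just the destination running out of space, the fit could in principle be checked up front. A minimal shell sketch; the helper name is my own invention and the default libvirt instances_path is an assumption, not anything Nova provides:

```shell
#!/bin/sh
# Hypothetical helper: given the instance disk size and the free space on
# the destination's instances path (both in bytes), report whether the
# resize copy can fit. This is a sketch; Nova does not expose this check.
fits_on_destination() {
    need=$1
    free=$2
    if [ "$free" -gt "$need" ]; then
        echo "fits"
    else
        echo "too small"
    fi
}

# Usage sketch against a live source/destination pair, assuming the
# default /var/lib/nova/instances path:
#   need=$(stat -c %s /var/lib/nova/instances/<uuid>_resize/disk)
#   free=$(df --output=avail -B1 /var/lib/nova/instances | tail -n 1)
#   fits_on_destination "$need" "$free"
```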
Even though it copied my test instance disk, the end result is that the instance goes to error state due to the error in my initial bug report. The resize is not revertible:
ubuntu@b0ca2dda2a32:~$ openstack server resize --flavor s1.small --wait a1b29f17-e910-4216-9165-e148e62d1ba1
Error resizing server: a1b29f17-e910-4216-9165-e148e62d1ba1
Error resizing server
ubuntu@b0ca2dda2a32:~$ openstack server resize --revert a1b29f17-e910-4216-9165-e148e62d1ba1
Cannot 'revertResize' instance a1b29f17-e910-4216-9165-e148e62d1ba1 while it is in vm_state error (HTTP 409) (Request-ID: req-7ec082bf-7eb3-4b7f-be1f-fe2221ad0f39)
ubuntu@b0ca2dda2a32:~$ openstack server set --state active a1b29f17-e910-4216-9165-e148e62d1ba1
ubuntu@b0ca2dda2a32:~$ openstack server resize --revert a1b29f17-e910-4216-9165-e148e62d1ba1
Cannot 'revertResize' instance a1b29f17-e910-4216-9165-e148e62d1ba1 while it is in vm_state active (HTTP 409) (Request-ID: req-fd828140-405b-4747-b6c5-7f846a46e2dc)
So even though I have the disk present on the source and (maybe) the destination node, I don't see a way to easily recover the instance back to its previous state.
Lastly, I tested what happens if I delete the instance while it is stuck in this state. The disk was deleted from the destination node, but it was not cleaned up on the source node.
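Since the leftover directory is named <uuid>_resize, stale copies on the source node can at least be identified and cross-checked against the API before removing them by hand. A minimal sketch; the helper function is hypothetical and the default instances_path is an assumption:

```shell
#!/bin/sh
# Hypothetical helper: recover the instance UUID from a leftover
# <uuid>_resize directory path, so it can be checked against
# "openstack server list" before the directory is deleted manually.
uuid_from_resize_dir() {
    d=${1##*/}                     # strip leading path components
    printf '%s\n' "${d%_resize}"   # strip the _resize suffix
}

# Example (path assumed to match the default instances_path):
uuid_from_resize_dir /var/lib/nova/instances/a1b29f17-e910-4216-9165-e148e62d1ba1_resize
# prints: a1b29f17-e910-4216-9165-e148e62d1ba1
```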