On the source node, yes, I found the disk is still present in the _resize directory:
# ls /var/lib/nova/instances/a1b29f17-e910-4216-9165-e148e62d1ba1_resize/ -lha
total 20M
drwxr-xr-x 2 nova nova 4.0K Nov 14 12:48 .
drwxr-xr-x 29 nova nova 4.0K Nov 14 12:50 ..
-rw------- 1 root root 54K Nov 14 12:50 console.log
-rw-r--r-- 1 root root 20M Nov 14 12:50 disk
-rw-r--r-- 1 nova nova 79 Nov 14 12:48 disk.info
In the logs I see that it did try to copy the disk to the destination node. If the disk is small enough, the copy will succeed, but usually it will fail because the destination hosts (our rbd hypervisors) are booted from a very small ramdisk which cannot hold the instance root disk.
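Since the failure mode is just the destination running out of space, the fit could in principle be checked up front. A minimal shell sketch; the helper name is my own invention and the default libvirt instances_path is an assumption, not anything Nova provides:

```shell
#!/bin/sh
# Hypothetical helper: given the instance disk size and the free space on
# the destination's instances path (both in bytes), report whether the
# resize copy can fit. This is a sketch; Nova does not expose this check.
fits_on_destination() {
    need=$1
    free=$2
    if [ "$free" -gt "$need" ]; then
        echo "fits"
    else
        echo "too small"
    fi
}

# Usage sketch against a live source/destination pair, assuming the
# default /var/lib/nova/instances path:
#   need=$(stat -c %s /var/lib/nova/instances/<uuid>_resize/disk)
#   free=$(df --output=avail -B1 /var/lib/nova/instances | tail -n 1)
#   fits_on_destination "$need" "$free"
```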
Even though it copied my test instance disk, the end result is that the instance goes to error state due to the error in my initial bug report. The resize is not revertible:
ubuntu@b0ca2dda2a32:~$ openstack server resize --flavor s1.small --wait a1b29f17-e910-4216-9165-e148e62d1ba1
Error resizing server: a1b29f17-e910-4216-9165-e148e62d1ba1
Error resizing server
ubuntu@b0ca2dda2a32:~$ openstack server resize --revert a1b29f17-e910-4216-9165-e148e62d1ba1
Cannot 'revertResize' instance a1b29f17-e910-4216-9165-e148e62d1ba1 while it is in vm_state error (HTTP 409) (Request-ID: req-7ec082bf-7eb3-4b7f-be1f-fe2221ad0f39)
ubuntu@b0ca2dda2a32:~$ openstack server set --state active a1b29f17-e910-4216-9165-e148e62d1ba1
ubuntu@b0ca2dda2a32:~$ openstack server resize --revert a1b29f17-e910-4216-9165-e148e62d1ba1
Cannot 'revertResize' instance a1b29f17-e910-4216-9165-e148e62d1ba1 while it is in vm_state active (HTTP 409) (Request-ID: req-fd828140-405b-4747-b6c5-7f846a46e2dc)
So even though I have the disk present on the source and (maybe) the destination node, I don't see a way to easily recover the instance back to its previous state.
Lastly, I tested what happens if I delete the instance while it is stuck in this state. The disk was deleted from the destination node, but it was not cleaned up on the source node.
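Since the leftover directory is named <uuid>_resize, stale copies on the source node can at least be identified and cross-checked against the API before removing them by hand. A minimal sketch; the helper function is hypothetical and the default instances_path is an assumption:

```shell
#!/bin/sh
# Hypothetical helper: recover the instance UUID from a leftover
# <uuid>_resize directory path, so it can be checked against
# "openstack server list" before the directory is deleted manually.
uuid_from_resize_dir() {
    d=${1##*/}                     # strip leading path components
    printf '%s\n' "${d%_resize}"   # strip the _resize suffix
}

# Example (path assumed to match the default instances_path):
uuid_from_resize_dir /var/lib/nova/instances/a1b29f17-e910-4216-9165-e148e62d1ba1_resize
# prints: a1b29f17-e910-4216-9165-e148e62d1ba1
```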