Should rollback the instance if finish_resize fails

Bug #1396003 reported by Charlotte Han on 2014-11-25
This bug affects 1 person
Affects: OpenStack Compute (nova)
Importance: Low
Assigned to: Unassigned

Bug Description

The resize failed in the finish_resize function; the instance disappeared and cannot be rolled back.

The log is at:
http://paste.openstack.org/show/137968/

Charlotte Han (hanrong) on 2014-11-25
tags: added: in-stable-icehouse
Madhurya (madhurya-jesu) on 2015-01-07
Changed in nova:
assignee: nobody → Madhurya (madhurya-jesu)
Madhurya (madhurya-jesu) on 2015-01-09
Changed in nova:
status: New → Incomplete
Madhurya (madhurya-jesu) wrote :

Hi Rong Han, can you please elaborate on your bug? When I set '-allow_resize_to_same_host=true' and '-allow_migrate_to_same_host=true' in nova.conf and restarted nova-compute and nova-scheduler, I was able to resize the instance. Could you please post your nova.conf file?

Charlotte Han (hanrong) wrote :

# Allow destination machine to match source for resize. Useful
# when testing in single-host environments. (boolean value)
#allow_resize_to_same_host=false

Charlotte Han (hanrong) wrote :

I use the default configuration of 'allow_resize_to_same_host=false' in nova.conf.
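
For anyone checking which value is actually in effect, here is a minimal Python sketch. It assumes the option sits in the [DEFAULT] section of the file and treats a commented-out or missing line as the default of false shown in the excerpt above. The helper name resize_to_same_host_allowed is hypothetical, and this is not how Nova itself loads its options.

```python
import configparser


def resize_to_same_host_allowed(conf_text: str) -> bool:
    """Return the effective allow_resize_to_same_host value.

    Hypothetical helper for illustration only; it parses the text with
    the standard library instead of Nova's own configuration loader.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # A commented-out or missing option falls back to the default (false).
    return parser.getboolean(
        "DEFAULT", "allow_resize_to_same_host", fallback=False
    )


if __name__ == "__main__":
    sample = "[DEFAULT]\n#allow_resize_to_same_host=false\n"
    print(resize_to_same_host_allowed(sample))
```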

Robert Collins (lifeless) wrote :

So it sounds like the default configuration will break VMs on resize?

Changed in nova:
status: Incomplete → New
Changed in nova:
status: New → Confirmed
importance: Undecided → Low
Jian Wen (wenjianhn) on 2015-11-19
tags: removed: in-stable-icehouse
Jian Wen (wenjianhn) wrote :

Can anyone else reproduce this bug?

Jian Wen (wenjianhn) on 2015-11-19
Changed in nova:
assignee: Madhurya (madhurya-jesu) → nobody
Charlotte Han (hanrong) wrote :

To reproduce: stop neutron-openvswitch-agent before resizing, then resize or migrate the instance. The instance ends up in the error state.

Jian Wen (wenjianhn) wrote :

I reproduced the bug by adding an exception to finish_migration().
"vif_type=binding_failed" means Neutron failed to bind the port to the compute host.
You cannot roll back the instance because Neutron does not support that.
I don't think we can fix the bug in Nova.

p.s.
The error log is not friendly to users.
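
The reproduction described above can be pictured with a toy model. The sketch below is not Nova code: ToyInstance and the functions finish_migration and resize defined here are hypothetical stand-ins, and the error message is only illustrative. It mimics the reported behaviour, where an exception raised in the finish step leaves the instance in the error state with nothing rolled back.

```python
from dataclasses import dataclass


@dataclass
class ToyInstance:
    """Hypothetical stand-in for an instance record (not a Nova object)."""
    host: str
    state: str = "active"


def finish_migration(instance: ToyInstance, dest: str, fail: bool) -> None:
    """Toy finish step; 'fail' injects the exception used to reproduce the bug."""
    if fail:
        raise RuntimeError("vif_type=binding_failed")
    instance.host = dest


def resize(instance: ToyInstance, dest: str, fail: bool = False) -> None:
    """Toy resize without rollback: a failure leaves the instance in error."""
    instance.state = "resizing"
    try:
        finish_migration(instance, dest, fail)
    except RuntimeError:
        # No rollback is attempted here, which mirrors the reported problem.
        instance.state = "error"
        raise
    instance.state = "active"
```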

Changed in nova:
status: Confirmed → Invalid
Charlotte Han (hanrong) wrote :

If the destination host's neutron server is unavailable, we should not resize to this destination host, or we should let the instance revert to its source host.
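
The second option in the comment above (revert to the source host when the destination fails) could look roughly like the following Python sketch. Everything in it is hypothetical and simplified: BindingFailed, finish_on_destination, and resize_with_rollback are illustrative names, the instance is a plain dict, and no real Nova or Neutron call is made.

```python
class BindingFailed(Exception):
    """Toy stand-in for a port binding failure on the destination host."""


def finish_on_destination(instance: dict, dest: str, binding_ok: bool) -> None:
    """Toy finish step that fails when the port cannot be bound."""
    if not binding_ok:
        raise BindingFailed(dest)
    instance["host"] = dest


def resize_with_rollback(instance: dict, dest: str, binding_ok: bool) -> bool:
    """Try to finish on 'dest'; on failure put the instance back on its source.

    Returns True if the resize completed, False if it was rolled back.
    """
    source = instance["host"]
    instance["state"] = "resizing"
    try:
        finish_on_destination(instance, dest, binding_ok)
    except BindingFailed:
        # Roll back: restore the source host and make the instance usable again.
        instance["host"] = source
        instance["state"] = "active"
        return False
    instance["state"] = "active"
    return True
```

The first option (refusing the destination up front) would instead check the destination before calling the finish step; whether such a check is feasible depends on what Neutron exposes, which this sketch does not model.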

You are right.

Looks like we are able to revert it manually now.
See bug 1296519.

On Thu, Nov 19, 2015 at 3:39 PM, Rong Han ZTE <email address hidden> wrote:

> if destination host's neutron server is unavailable, we should not
> resize to this destination host, or we can let it revert to it's source
> host.

--
Best,

Jian

Changed in nova:
status: Invalid → Confirmed
summary: - Resize failed in finish_resize function, the instance disappeared and
- can not rollback.
+ Should rollback the instance if finish_resize fails
Charlotte Han (hanrong) wrote :

I tried the same operation on the Kilo release, and the same error occurs.

Charlotte Han (hanrong) wrote :

I then ran reset-state --active and hard-rebooted the instance, but the instance still ends up in the error state.
The log is as follows:
http://paste.openstack.org/show/479397/

Charlotte Han (hanrong) wrote :

I tried the same operation on the Mitaka release, and the same error occurs.

See https://bugs.launchpad.net/nova/+bug/1586309: the delete operation does not clean up the resources on the source compute node.

The openvswitch agent service on the destination compute node is active, but for some reason I don't know, the public net is not available on the destination node.

In a devstack OpenStack environment with two compute nodes, I created two networks.
When I created an instance on the public net, the resize failed with "Unexpected vif_type".
When I created an instance on the private net, the resize succeeded.
So when an instance uses a network that cannot be resized to another compute node, please roll the instance back to the source compute node.
[stack@SBCJSlot5Rack2Centos7 instances]$ nova net-list
+--------------------------------------+---------+------+
| ID                                   | Label   | CIDR |
+--------------------------------------+---------+------+
| 44652d4f-95ee-4785-8c16-de74bbb722c0 | public  | None |
| d40836c3-f036-4151-bbaa-cf92125ce552 | private | None |
+--------------------------------------+---------+------+

Changed in nova:
assignee: nobody → Charlotte Han (hanrong)
status: Confirmed → In Progress
Charlotte Han (hanrong) wrote :

Can we use "revert_resize" to recover instances that are left in the error state after a failed migration or resize?

Charlotte Han (hanrong) on 2016-06-28
Changed in nova:
importance: Low → High
Sean Dague (sdague) wrote :

Patch in merge conflict

Changed in nova:
importance: High → Low
status: In Progress → Confirmed
assignee: Charlotte Han (hanrong) → nobody

Change abandoned by Sean Dague (<email address hidden>) on branch: master
Review: https://review.openstack.org/334747
Reason: This review is > 6 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
