Do not set VM to error state when raising MigrationError

Bug #1328367 reported by Xiang BZ Zhou
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Low
Assigned to: Unassigned

Bug Description

Control Node: 101.0.0.20 (also runs the compute service, but it is not used here)
Compute Node: 101.0.0.30

nova version:
2014.1.b2-847-ga891e04

In the control node's nova.conf:
allow_resize_to_same_host = True

In the compute node's nova.conf:
allow_resize_to_same_host = False
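
The mismatch between the two files can be detected before it causes a failed resize. The following is a minimal sketch, not part of nova: it assumes the option lives in the [DEFAULT] section, treats a missing option as False, and uses hypothetical helper names (read_flag, flags_consistent) with the two configurations inlined as strings.

```python
import configparser


def read_flag(conf_text, option="allow_resize_to_same_host"):
    """Return the boolean value of `option` from the [DEFAULT] section."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Assumption for this sketch: a missing option counts as False.
    return parser["DEFAULT"].getboolean(option, fallback=False)


def flags_consistent(*conf_texts):
    """True when every configuration yields the same flag value."""
    return len({read_flag(text) for text in conf_texts}) == 1


# Inlined stand-ins for the two nova.conf files described above.
CONTROLLER_CONF = """
[DEFAULT]
allow_resize_to_same_host = True
"""

COMPUTE_CONF = """
[DEFAULT]
allow_resize_to_same_host = False
"""

if __name__ == "__main__":
    # The two nodes disagree, so this prints False.
    print(flags_consistent(CONTROLLER_CONF, COMPUTE_CONF))
```

In a real deployment the strings would be replaced by the contents of each node's nova.conf.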

detail:
1. Boot an instance on the compute node:
nova boot --image 51c4a908-c028-4ce2-bbd1-8b0e15d8d829 --flavor 84 --nic net-id=308840da-6440-4599-923a-2edd290971d3 --availability-zone nova:compute.localdomain migrate_test

2. Resize it to flavor 1:
nova resize migrate_test 1

3. The resize fails and the instance is set to the ERROR state.

#nova list
+--------------------------------------+--------------+--------+-------------+---------+-------------------+
| a1424990-182a-4bc2-8c17-aa4808a49472 | migrate_test | ERROR  | resize_prep | Running | private=20.0.0.15 |
+--------------------------------------+--------------+--------+-------------+---------+-------------------+

#nova show
....
| config_drive | |
| created | 2014-06-09T09:31:35Z |
| fault | {"message": "<class 'nova.exception.MigrationError'>", "code": 500, "details": " File \"/opt/stack/nova/nova/compute/manager.py\", line 3104, in prep_resize |
| | node) |
| | File \"/opt/stack/nova/nova/compute/manager.py\", line 3058, in _prep_resize |
| | raise exception.MigrationError(msg) |
| | ", "created": "2014-06-10T03:54:39Z"} |
| flavor | m1.micro (84) |
| hostId | f73013b029032929598a4a54586e4469c2c7cd676c147f6601f73c58
....

error log in compute node:

2014-06-10 11:54:48.372 ERROR nova.compute.manager [req-6a4ac25a-7d24-40c6-9f8d-435b4adb6fff admin admin] [instance: a1424990-182a-4bc2-8c17-aa4808a49472] Setting instance vm_state to ERROR
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] Traceback (most recent call last):
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] File "/opt/stack/nova/nova/compute/manager.py", line 5231, in _error_out_instance_on_exception
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] yield
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] File "/opt/stack/nova/nova/compute/manager.py", line 3111, in prep_resize
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] filter_properties)
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] File "/opt/stack/nova/nova/compute/manager.py", line 3104, in prep_resize
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] node)
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] File "/opt/stack/nova/nova/compute/manager.py", line 3058, in _prep_resize
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] raise exception.MigrationError(msg)
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472] MigrationError: destination same as source!
2014-06-10 11:54:48.372 TRACE nova.compute.manager [instance: a1424990-182a-4bc2-8c17-aa4808a49472]

bug reason:
1. nova-scheduler is allowed to schedule the resize onto the same compute node (because of the control node's nova.conf).

2. nova-compute is not allowed to resize on the same host (because of the compute node's nova.conf).

3.
a) On the compute side, _prep_resize() sets the instance to the error state:
....
self._set_instance_error_state(context, instance['uuid'])
...
and then raises the exception.

b) The compute node tries to reschedule the instance, which fails again:
....
self._reschedule_resize_or_reraise(context, image, instance,
     exc_info, instance_type, reservations, request_spec,
     filter_properties)
...
c) The compute node stores the instance fault info:
....
compute_utils.add_instance_fault_from_exc(context, self.conductor_api,
    instance, exc_info[0], exc_info=exc_info)
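
The sequence in steps 3a to 3c can be modelled with a small, self-contained simulation. This is only an illustrative sketch under stated assumptions, not nova's code: the Instance class, the simplified prep_resize() signature, and the local MigrationError are hypothetical stand-ins, and rescheduling is reduced to a comment.

```python
class MigrationError(Exception):
    """Local stand-in for nova.exception.MigrationError."""


class Instance:
    """Hypothetical, minimal instance record for this sketch."""

    def __init__(self, host):
        self.host = host
        self.vm_state = "active"
        self.fault = None


def prep_resize(instance, dest_host, allow_resize_to_same_host):
    """Simplified model of steps 3a-3c on the compute node."""
    try:
        if instance.host == dest_host and not allow_resize_to_same_host:
            # 3a: the error state is set before the exception is raised.
            instance.vm_state = "error"
            raise MigrationError("destination same as source!")
        # Otherwise the resize would proceed (not modelled here).
    except MigrationError as exc:
        # 3b: a reschedule would be attempted here; it is omitted
        #     because in the reported setup it fails again.
        # 3c: the fault is recorded on the instance.
        instance.fault = str(exc)
    return instance


if __name__ == "__main__":
    # The scheduler (controller config) picked the source host, but the
    # compute node's own config forbids a same-host resize.
    vm = prep_resize(Instance("compute.localdomain"),
                     "compute.localdomain",
                     allow_resize_to_same_host=False)
    print(vm.vm_state, "-", vm.fault)
```

The point of the model is that the instance ends up in the error state even though it was never touched by the resize itself.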

additional:
Whatever scheduler filters are in use, an instance should not be set to the ERROR state just because the scheduler cannot find an appropriate host for the resize.
Also, we cannot do anything with a VM in the ERROR state unless we change its state in the database.
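
One way the requested behaviour could look is a context manager that restores the previous vm_state when a MigrationError occurs and only records the fault. This is a hypothetical sketch of the idea, not a patch against nova: the name keep_state_on_migration_error, the dict-based instance, and the local MigrationError class are all assumptions made for illustration.

```python
from contextlib import contextmanager


class MigrationError(Exception):
    """Local stand-in for nova.exception.MigrationError."""


@contextmanager
def keep_state_on_migration_error(instance):
    """Restore the instance's vm_state if a MigrationError escapes."""
    original_state = instance["vm_state"]
    try:
        yield
    except MigrationError as exc:
        # Roll back instead of leaving the instance in the error state.
        instance["vm_state"] = original_state
        instance["task_state"] = None
        instance["fault"] = str(exc)
        raise  # the caller still sees the failure


if __name__ == "__main__":
    vm = {"vm_state": "active", "task_state": "resize_prep", "fault": None}
    try:
        with keep_state_on_migration_error(vm):
            raise MigrationError("destination same as source!")
    except MigrationError:
        pass
    print(vm["vm_state"], vm["task_state"], vm["fault"])
```

Re-raising keeps the failure visible to the caller while the instance stays usable without a manual state change.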

Xiang BZ Zhou (zhouxbj)
description: updated
Tracy Jones (tjones-i)
tags: added: compute
tags: added: migrate resize
Matt Riedemann (mriedem) wrote :

You should be able to use the reset-state API to get it out of ERROR state, unless it's deleting in which case you've got bug 1299139.

Changed in nova:
importance: Undecided → Low
Matt Riedemann (mriedem)
tags: added: migration
removed: migrate
melanie witt (melwitt)
Changed in nova:
status: New → Confirmed
lizheming (lizheming-li)
Changed in nova:
assignee: nobody → lizheming (lizheming-li)
lizheming (lizheming-li)
Changed in nova:
assignee: lizheming (lizheming-li) → nobody
Davanum Srinivas (DIMS) (dims-v) wrote :

fyi, there's an effort to eliminate the flag - https://review.openstack.org/#/c/118604/

jichenjc (jichenjc) wrote :

The patch mentioned above has been merged. I believe we can close the bug?

Changed in nova:
status: Confirmed → Won't Fix
status: Won't Fix → Invalid