2018-01-08 04:39:30 |
yuanyue |
bug |
|
|
added bug |
2018-01-08 04:40:42 |
yuanyue |
description |
Description
===========
If nova-compute crahes on the host and rebuild the instance on this host,
it will stuck in rebulding which is reasonable because no one consumes
the message from MQ. However, if restart nova-compute on this host, the
instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
Description
===========
If nova-compute crashes on the host and rebuild the instance on this host,
it will stuck in rebulding which is reasonable because no one consumes
the message from MQ. However, if restart nova-compute on this host, the
instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
|
2018-01-08 04:41:28 |
yuanyue |
description |
Description
===========
If nova-compute crashes on the host and rebuild the instance on this host,
it will stuck in rebulding which is reasonable because no one consumes
the message from MQ. However, if restart nova-compute on this host, the
instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
Description
===========
If nova-compute crashes on the host and I rebuild the instance on this host,
it will stuck in rebulding which is reasonable because no one consumes
the message from MQ. However, if restart nova-compute on this host, the
instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
|
2018-01-08 04:41:55 |
yuanyue |
description |
Description
===========
If nova-compute crashes on the host and I rebuild the instance on this host,
it will stuck in rebulding which is reasonable because no one consumes
the message from MQ. However, if restart nova-compute on this host, the
instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
Description
===========
If nova-compute crashes on the host and I rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if restart nova-compute on this host, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
|
2018-01-08 04:42:34 |
yuanyue |
description |
Description
===========
If nova-compute crashes on the host and I rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if restart nova-compute on this host, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if restart nova-compute on this host, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
|
2018-01-08 04:43:27 |
yuanyue |
description |
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if restart nova-compute on this host, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if nova-compute on this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
|
2018-01-08 04:45:56 |
yuanyue |
description |
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if nova-compute on this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if nova-compute on this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
$nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
$nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
|
2018-01-08 04:46:28 |
yuanyue |
description |
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if nova-compute on this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
$nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
$nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if nova-compute on this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
$nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
$nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
first:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
|
2018-01-08 04:47:17 |
yuanyue |
description |
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if nova-compute on this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
$nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
$nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
first:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the instance on this host, it will stuck in rebulding which is reasonable because no one consumes the message from MQ. However, if nova-compute on this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
$nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
$nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
first:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later it stucks in following state:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to ERROR state _init_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} |
|
2018-01-08 04:50:13 |
yuanyue |
summary |
vm rebuild fail but confusing state |
vm rebuild fails but confusing state |
|
2018-01-17 14:23:16 |
xulei |
nova: assignee |
|
xulei (605423512-j) |
|
2018-01-19 08:22:17 |
OpenStack Infra |
nova: status |
New |
In Progress |
|
2019-10-07 10:26:51 |
OpenStack Infra |
nova: assignee |
xulei (605423512-j) |
Stephen Finucane (stephenfinucane) |
|