Resource tracker: unable to start nova compute

Bug #1444439 reported by Gary Kotton
This bug affects 2 people
Affects                    Status         Importance   Assigned to   Milestone
OpenStack Compute (nova)   Fix Released   Critical     Gary Kotton
Kilo                       Fix Released   Critical     Unassigned

Bug Description

After a resize failed and the instance was subsequently deleted, I am unable to restart nova compute because of the exception below. The instance was deleted via the nova API.

The DB is as follows:
mysql> select * from migrations;
+---------------------+---------------------+------------+----+------------------+------------------+---------------+----------------+--------------------------------------+----------------------+----------------------+------------------+------------------+---------+
| created_at | updated_at | deleted_at | id | source_compute | dest_compute | dest_host | status | instance_uuid | old_instance_type_id | new_instance_type_id | source_node | dest_node | deleted |
+---------------------+---------------------+------------+----+------------------+------------------+---------------+----------------+--------------------------------------+----------------------+----------------------+------------------+------------------+---------+
| 2015-04-15 09:44:02 | 2015-04-15 09:44:08 | NULL | 1 | Ubuntu1404Server | Ubuntu1404Server | 10.160.94.173 | post-migrating | 42264e24-1385-41f1-8dfc-120a1891ab05 | 10 | 11 | domain-c167(DVS) | domain-c167(DVS) | 0 |
| 2015-04-15 09:48:13 | 2015-04-15 10:19:48 | NULL | 2 | Ubuntu1404Server | Ubuntu1404Server | 10.160.94.173 | reverted | fcab4bde-d93e-4d79-ae35-9d1306da10a4 | 10 | 11 | domain-c167(DVS) | domain-c167(DVS) | 0 |
| 2015-04-15 10:23:56 | 2015-04-15 10:24:03 | NULL | 3 | Ubuntu1404Server | Ubuntu1404Server | 10.160.94.173 | post-migrating | d074bbc0-b912-4c85-a02b-aabf56d45f0b | 10 | 11 | domain-c167(DVS) | domain-c167(DVS) | 0 |
| 2015-04-15 10:27:45 | 2015-04-15 10:28:16 | NULL | 4 | Ubuntu1404Server | Ubuntu1404Server | 10.160.94.173 | reverted | 21e59c96-fa2f-45e3-9070-e982a2dafea6 | 10 | 11 | domain-c167(DVS) | domain-c167(DVS) | 0 |
| 2015-04-15 10:28:43 | 2015-04-15 10:29:16 | NULL | 5 | Ubuntu1404Server | Ubuntu1404Server | 10.160.94.173 | confirming | 21e59c96-fa2f-45e3-9070-e982a2dafea6 | 10 | 11 | domain-c167(DVS) | domain-c167(DVS) | 0 |
| 2015-04-15 10:35:15 | 2015-04-15 10:53:16 | NULL | 6 | Ubuntu1404Server | Ubuntu1404Server | 10.160.94.173 | confirmed | 4abd75b5-bb91-4ce7-a928-2a96941ea9cb | 10 | 14 | domain-c167(DVS) | domain-c167(DVS) | 0 |
| 2015-04-15 10:35:39 | 2015-04-15 10:53:17 | NULL | 7 | Ubuntu1404Server | Ubuntu1404Server | 10.160.94.173 | confirmed | 5e01bddb-3978-4f6f-a4d3-6d24ed31afa4 | 14 | 10 | domain-c167(DVS) | domain-c167(DVS) | 0 |
| 2015-04-15 10:55:01 | 2015-04-15 10:55:02 | NULL | 8 | Ubuntu1404Server | Ubuntu1404Server | 10.160.94.173 | migrating | 20017567-5c83-4918-b269-525169009026 | 10 | 15 | domain-c167(DVS) | domain-c167(DVS) | 0 |
+---------------------+---------------------+------------+----+------------------+------------------+---------------+----------------+--------------------------------------+----------------------+----------------------+------------------+------------------+---------+
8 rows in set (0.00 sec)

2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/openstack/common/threadgroup.py", line 145, in wait
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup x.wait()
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/openstack/common/threadgroup.py", line 47, in wait
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup return self.thread.wait()
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 175, in wait
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup return self._exit_event.wait()
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup return hubs.get_hub().switch()
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 294, in switch
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup return self.greenlet.switch()
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup result = function(*args, **kwargs)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/openstack/common/service.py", line 497, in run_service
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup service.start()
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/service.py", line 183, in start
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup self.manager.pre_start_hook()
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/manager.py", line 1287, in pre_start_hook
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup self.update_available_resource(nova.context.get_admin_context())
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/manager.py", line 6236, in update_available_resource
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup rt.update_available_resource(context)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/resource_tracker.py", line 402, in update_available_resource
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup self._update_available_resource(context, resources)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 445, in inner
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup return f(*args, **kwargs)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/resource_tracker.py", line 445, in _update_available_resource
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup self._update_usage_from_migrations(context, resources, migrations)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/resource_tracker.py", line 709, in _update_usage_from_migrations
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup instance = migration.instance
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/objects/migration.py", line 80, in instance
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup return objects.Instance.get_by_uuid(self._context, self.instance_uuid)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/objects/base.py", line 161, in wrapper
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup args, kwargs)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/conductor/rpcapi.py", line 325, in object_class_action
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup objver=objver, args=args, kwargs=kwargs)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 156, in call
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup retry=self.retry)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup timeout=timeout, retry=retry)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup retry=retry)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 341, in _send
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup raise result
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup InstanceNotFound_Remote: Instance 42264e24-1385-41f1-8dfc-120a1891ab05 could not be found.
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/conductor/manager.py", line 423, in _object_dispatch
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup return getattr(target, method)(*args, **kwargs)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/objects/base.py", line 163, in wrapper
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup result = fn(cls, context, *args, **kwargs)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/objects/instance.py", line 564, in get_by_uuid
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup use_slave=use_slave)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/db/api.py", line 651, in instance_get_by_uuid
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup columns_to_join, use_slave=use_slave)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 233, in wrapper
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup return f(*args, **kwargs)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 1744, in instance_get_by_uuid
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup columns_to_join=columns_to_join, use_slave=use_slave)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 1756, in _instance_get_by_uuid
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup raise exception.InstanceNotFound(instance_id=uuid)
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup InstanceNotFound: Instance 42264e24-1385-41f1-8dfc-120a1891ab05 could not be found.
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup
2015-04-15 04:47:04.821 TRACE nova.openstack.common.threadgroup
2015-04-15 04:47:04.824 INFO oslo_vmware.api [req-2d7c5d01-438c-494c-a382-b7c55b71d8be None None] Logging out and terminating the current session with ID = f4fcb.
nicira@Ubuntu1404Server:~/devstack$

Tags: scheduler
Gary Kotton (garyk)
Changed in nova:
importance: Undecided → Critical
assignee: nobody → Gary Kotton (garyk)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/173794

Changed in nova:
status: New → In Progress
tags: added: kilo-rc-potential
Gary Kotton (garyk)
tags: added: scheduler
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/173794
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ee7a7446cc6947a6bacacb6cb514934cc22e5782
Submitter: Jenkins
Branch: master

commit ee7a7446cc6947a6bacacb6cb514934cc22e5782
Author: Gary Kotton <email address hidden>
Date: Wed Apr 15 05:14:42 2015 -0700

    Resource tracker: unable to restart nova compute

    The resource tracker calculates its used resources. In certain cases
    of failed migrations and an instance being deleted the resource tracker
    causes an exception in nova compute. If this situation arises then nova
    compute may not even be able to restart.

    Change-Id: I4a154e0cae3b8e22bd59ed05ba708e07eed8dea7
    Closes-bug: #1444439
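
The failure mode described in this commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the merged patch (the review linked above is authoritative): `InstanceNotFound`, `FakeMigration`, `usable_migrations`, and the `memory_mb` value are hypothetical stand-ins chosen for the example. It assumes the tracker iterates over migration records and looks up each migration's instance, as the traceback in the bug description suggests, and shows one way to skip records whose instance has already been deleted instead of letting the exception abort startup.

```python
import logging

LOG = logging.getLogger(__name__)


class InstanceNotFound(Exception):
    """Hypothetical local stand-in for nova's InstanceNotFound exception."""


class FakeMigration:
    """Hypothetical migration record with a lazy instance lookup."""

    def __init__(self, instance_uuid, instances_db):
        self.instance_uuid = instance_uuid
        self._instances_db = instances_db

    @property
    def instance(self):
        # Mimics the failing lookup from the traceback: the migration row
        # still exists, but the instance it points to has been deleted.
        try:
            return self._instances_db[self.instance_uuid]
        except KeyError:
            raise InstanceNotFound(
                "Instance %s could not be found." % self.instance_uuid)


def usable_migrations(migrations):
    """Return (migration, instance) pairs, skipping deleted instances."""
    pairs = []
    for migration in migrations:
        try:
            instance = migration.instance
        except InstanceNotFound as exc:
            # Assumption: a stale migration should not block the service,
            # so it is logged and ignored rather than re-raised.
            LOG.debug("Skipping migration for deleted instance: %s", exc)
            continue
        pairs.append((migration, instance))
    return pairs


# Illustrative data only: one migration points at a deleted instance
# (UUID taken from the traceback), one at an instance that still exists.
instances_db = {"fcab4bde-d93e-4d79-ae35-9d1306da10a4": {"memory_mb": 512}}
migrations = [
    FakeMigration("42264e24-1385-41f1-8dfc-120a1891ab05", instances_db),
    FakeMigration("fcab4bde-d93e-4d79-ae35-9d1306da10a4", instances_db),
]
kept = usable_migrations(migrations)
```

In this sketch the stale record is dropped during usage accounting while the remaining migration is still processed, so a single orphaned row cannot prevent the loop from completing.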

Changed in nova:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/175360

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/kilo)

Reviewed: https://review.openstack.org/175360
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=389368bcfe498323b369f68682babb92a5b0ca54
Submitter: Jenkins
Branch: stable/kilo

commit 389368bcfe498323b369f68682babb92a5b0ca54
Author: Gary Kotton <email address hidden>
Date: Wed Apr 15 05:14:42 2015 -0700

    Resource tracker: unable to restart nova compute

    The resource tracker calculates its used resources. In certain cases
    of failed migrations and an instance being deleted the resource tracker
    causes an exception in nova compute. If this situation arises then nova
    compute may not even be able to restart.

    Change-Id: I4a154e0cae3b8e22bd59ed05ba708e07eed8dea7
    Closes-bug: #1444439
    (cherry picked from commit ee7a7446cc6947a6bacacb6cb514934cc22e5782)

Thierry Carrez (ttx)
tags: removed: kilo-rc-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/179284

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/179284
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5228d4e418734164ffa5ccd91d2865d9cc659c00
Submitter: Jenkins
Branch: master

commit 906ab9d6522b3559b4ad36d40dec3af20397f223
Author: He Jie Xu <email address hidden>
Date: Thu Apr 16 07:09:34 2015 +0800

    Update rpc version aliases for kilo

    Update all of the rpc client API classes to include a version alias
    for the latest version implemented in Kilo. This alias is needed when
    doing rolling upgrades from Kilo to Liberty. With this in place, you can
    ensure all services only send messages that both Kilo and Liberty will
    understand.

    Closes-Bug: #1444745

    Conflicts:
     nova/conductor/rpcapi.py

    NOTE(alex_xu): The conflict is due to there are some logs already added
    into the master.

    Change-Id: I2952aec9aae747639aa519af55fb5fa25b8f3ab4
    (cherry picked from commit 78a8b5802ca148dcf37c5651f75f2126d261266e)

commit f191a2147a21c7e50926b288768a96900cf4c629
Author: Hans Lindgren <email address hidden>
Date: Fri Apr 24 13:10:39 2015 +0200

    Add security group calls missing from latest compute rpc api version bump

    The recent compute rpc api version bump missed out on the security group
    related calls that are part of the api.

    One possible reason is that both compute and security group client side
    rpc api:s share a single target, which is of little value and only cause
    mistakes like this.

    This change eliminates future problems like this by combining them into
    one to get a 1:1 relationship between client and server api:s.

    Change-Id: I9207592a87fab862c04d210450cbac47af6a3fd7
    Closes-Bug: #1448075
    (cherry picked from commit bebd00b117c68097203adc2e56e972d74254fc59)

commit a2872a9262985bd0ee2c6df4f7593947e0516406
Author: Dan Smith <email address hidden>
Date: Wed Apr 22 09:02:03 2015 -0700

    Fix migrate_flavor_data() to catch instances with no instance_extra rows

    The way the query was being performed previously, we would not see any
    instances that didn't have a row in instance_extra. This could happen if
    an instance hasn't been touched for several releases, or if the data
    set is old.

    The fix is a simple change to use outerjoin instead of join. This patch
    includes a test that ensures that instances with no instance_extra rows
    are included in the migration. If we query an instance without such a
    row, we create it before doing a save on the instance.

    Closes-Bug: #1447132
    Change-Id: I2620a8a4338f5c493350f26cdba3e41f3cb28de7
    (cherry picked from commit 92714accc49e85579f406de10ef8b3b510277037)

commit e3a7b83834d1ae2064094e9613df75e3b07d77cd
Author: OpenStack Proposal Bot <email address hidden>
Date: Thu Apr 23 02:18:41 2015 +0000

    Updated from global requirements

    Change-Id: I5d4acd36329fe2dccb5772fed3ec55b442597150

commit 8c9b5e620eef3233677b64cd234ed2551e6aa182
Author: Divya <email address hidden>
Date: Tue Apr 21 08:26:29 2015 +0200

    Control create/delete flavor api permissions using policy.json

    The permissions of ...

Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-1 → 12.0.0