periodically seeing KeyError: 'old_instance_type_memory_mb' in compute.log

Bug #1160489 reported by James Belden
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Dan Smith

Bug Description

When running multiple migrations concurrently, I see an error the repeats in the compute.log file. The trace can be seen here:

http://pastebin.com/XRr70jpU

Here is the steps I followed to recreate this issue:

(1). Boot the instances

nova boot --image be8b6475-26d8-410f-aaa5-8b278d98c8f9 --flavor 1 MIGRATE1
nova boot --image be8b6475-26d8-410f-aaa5-8b278d98c8f9 --flavor 1 MIGRATE2
nova boot --image be8b6475-26d8-410f-aaa5-8b278d98c8f9 --flavor 1 MIGRATE3

[root@vs347 ~]# nova list
+--------------------------------------+----------+--------+-------------------+
| ID | Name | Status | Networks |
+--------------------------------------+----------+--------+-------------------+
| ccf6212b-bb8d-4826-97db-e18b4d4cb2b0 | MIGRATE1 | ACTIVE | demonet=172.0.0.2 |
| 739a6d71-cc63-4a34-b14f-7aef6f635a31 | MIGRATE2 | ACTIVE | demonet=172.0.0.4 |
| 11dabeb8-2799-40e1-a355-4d58f935a9c6 | MIGRATE3 | ACTIVE | demonet=172.0.0.3 |
+--------------------------------------+----------+--------+-------------------+

bvt-instance-00000029 - olyblade02
bvt-instance-0000002a - olyblade02
bvt-instance-0000002b - olyblade02

(2). migrate the instances simultaneously

[root@vs347 ~]# nova migrate MIGRATE1
[root@vs347 ~]# nova migrate MIGRATE2
[root@vs347 ~]# nova migrate MIGRATE3
[root@vs347 ~]#

* State as of 4:24 p.m.

+--------------------------------------+----------+--------+-------------------+
| ID | Name | Status | Networks |
+--------------------------------------+----------+--------+-------------------+
| ccf6212b-bb8d-4826-97db-e18b4d4cb2b0 | MIGRATE1 | RESIZE | demonet=172.0.0.2 |
| 739a6d71-cc63-4a34-b14f-7aef6f635a31 | MIGRATE2 | RESIZE | demonet=172.0.0.4 |
| 11dabeb8-2799-40e1-a355-4d58f935a9c6 | MIGRATE3 | RESIZE | demonet=172.0.0.3 |
+--------------------------------------+----------+--------+-------------------+

* Now they are in resize-verify state so do confirm

[root@vs347 ~]# nova resize-confirm MIGRATE1
[root@vs347 ~]# nova resize-confirm MIGRATE2
[root@vs347 ~]# nova resize-confirm MIGRATE3
[root@vs347 ~]#

[root@vs347 ~]# nova list
+--------------------------------------+----------+--------+-------------------+
| ID | Name | Status | Networks |
+--------------------------------------+----------+--------+-------------------+
| ccf6212b-bb8d-4826-97db-e18b4d4cb2b0 | MIGRATE1 | ERROR | demonet=172.0.0.2 |
| 739a6d71-cc63-4a34-b14f-7aef6f635a31 | MIGRATE2 | ACTIVE | demonet=172.0.0.4 |
| 11dabeb8-2799-40e1-a355-4d58f935a9c6 | MIGRATE3 | ACTIVE | demonet=172.0.0.3 |
+--------------------------------------+----------+--------+-------------------+

[root@vs347 ~]# nova show MIGRATE1
+-------------------------------------+------------------------------------------------------------------------------------------------------------------------+
| Property | Value |
+-------------------------------------+------------------------------------------------------------------------------------------------------------------------+
| status | ERROR |
| updated | 2013-03-22T21:52:21Z |
| OS-EXT-STS:task_state | None |
| OS-EXT-SRV-ATTR:host | vs347.rch.kstart.ibm.com |
| key_name | None |
| image | Rhel6MasterFile (be8b6475-26d8-410f-aaa5-8b278d98c8f9) |
| hostId | a3883219ffb68903fc2ecc1e0a490f3f250fdf3773f446bcb398441c |
| OS-EXT-STS:vm_state | error |
| OS-EXT-SRV-ATTR:instance_name | bvt-instance-00000029 |
| OS-EXT-SRV-ATTR:hypervisor_hostname | olyblade01.rch.stglabs.ibm.com |
| flavor | m1.tiny (1) |
| demonet network | 172.0.0.2 |
| id | ccf6212b-bb8d-4826-97db-e18b4d4cb2b0 |
| security_groups | [{u'name': u'default'}] |
| user_id | 3ccf55fd609b45319f24fe681338886d |
| name | MIGRATE1 |
| created | 2013-03-22T21:19:00Z |
| tenant_id | 67b1c37f4ca64283908c7077e9e59997 |
| OS-DCF:diskConfig | MANUAL |
| metadata | {} |
| accessIPv4 | |
| accessIPv6 | |
| fault | {u'message': u'KeyError', u'code': 500, u'details': u'\'old_instance_type_memory_mb\' |
| | File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 229, in decorated_function |
| | return function(self, context, *args, **kwargs) |
| | File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1956, in confirm_resize |
| | rt.confirm_resize(context, migration) |
| | File "/usr/lib/python2.6/site-packages/nova/compute/resource_tracker.py", line 373, in confirm_resize |
| | self.update_available_resource(elevated) |
| | File "/usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py", line 242, in inner |
| | retval = f(*args, **kwargs) |
| | File "/usr/lib/python2.6/site-packages/nova/compute/resource_tracker.py", line 269, in update_available_resource |
| | self._update_usage_from_migrations(resources, migrations) |
| | File "/usr/lib/python2.6/site-packages/nova/compute/resource_tracker.py", line 468, in _update_usage_from_migrations |
| | migration) |
| | File "/usr/lib/python2.6/site-packages/nova/compute/resource_tracker.py", line 428, in _update_usage_from_migration |
| | itype = instance_types.extract_instance_type(instance, \'old_\') |
| | File "/usr/lib/python2.6/site-packages/nova/compute/instance_types.py", line 250, in extract_instance_type |
| | instance_type[key] = type_fn(sys_meta[type_key]) |
| | ', u'created': u'2013-03-22T21:52:21Z'} |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-AZ:availability_zone | nova |
| config_drive | |
+-------------------------------------+------------------------------------------------------------------------------------------------------------------------+

Revision history for this message
James Belden (jbelden) wrote :

I am seeing this issue with periodic tasks as well - not just running multiple migrations

Revision history for this message
Dan Smith (danms) wrote :

Can you post an un-wrapped log trace?

Also, can you please confirm the exact version of grizzly you're working on? Ideally, a git hash...

Changed in nova:
assignee: nobody → Dan Smith (danms)
importance: Undecided → High
status: New → Incomplete
Revision history for this message
James Belden (jbelden) wrote :
Download full text (5.1 KiB)

Here is the trace:

2013-03-22 14:13:45.950 ERROR nova.compute.manager [req-9d363741-c3b1-4a1d-b87c-8084ceed2993 3ccf55fd609b45319f24fe681338886d 67b1c37f4ca64283908c7077e9e59997] [instance: 2fe0499d-7707-4447-879b-15093e7d1ee9] 'old_instance_type_memory_mb'. Setting instance vm_state to ERROR

2013-03-22 14:13:46.411 ERROR nova.openstack.common.rpc.amqp [req-9d363741-c3b1-4a1d-b87c-8084ceed2993 3ccf55fd609b45319f24fe681338886d 67b1c37f4ca64283908c7077e9e59997] Exception during message handling
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 393, in _process_data
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp rval = self.proxy.dispatch(ctxt, version, method, **args)
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/dispatcher.py", line 133, in dispatch
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp return getattr(proxyobj, method)(ctxt, **kwargs)
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/exception.py", line 120, in wrapped
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp temp_level, payload)
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp self.gen.next()
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/exception.py", line 97, in wrapped
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp return f(self, context, *args, **kw)
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 265, in decorated_function
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp function(self, context, *args, **kwargs)
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 242, in decorated_function
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp e, sys.exc_info())
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp self.gen.next()
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 229, in decorated_function
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp return function(self, context, *args, **kwargs)
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1956, in confirm_resize
2013-03-22 14:13:46.411 11393 TRACE nova.openstack.common.rpc.amqp rt.confirm_resiz...

Read more...

Revision history for this message
Dan Smith (danms) wrote :

Thanks!

That git hash is not in my tree:

dan@guaranine:/opt/stack/nova$ git show 35e2906530e8ded6643479f53df505e09d23f31b
fatal: bad object 35e2906530e8ded6643479f53df505e09d23f31b

Perhaps you're on a modified copy? Can you figure out what hash in in the upstream tree you're branched from?

Revision history for this message
Dan Smith (danms) wrote :

I'm unable to reproduce this on current master with devstack and a bunch of poking. Are you able to reproduce it that way?

Revision history for this message
James Belden (jbelden) wrote :

commit b0998792222bd3aede65feeb32d1c2d6e6f8080b

Revision history for this message
Dan Smith (danms) wrote :

Okay, I think I've repro'd it.

Changed in nova:
milestone: none → grizzly-rc2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/25433

Changed in nova:
status: Incomplete → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/25433
Committed: http://github.com/openstack/nova/commit/985d703d2731fe9ec968ef6698bbf39743ede32a
Submitter: Jenkins
Branch: master

commit 985d703d2731fe9ec968ef6698bbf39743ede32a
Author: Dan Smith <email address hidden>
Date: Tue Mar 26 13:14:56 2013 -0700

    Always store old instance_type during a migration

    Previously, we were trying to optimize for the migration (as opposed
    to resize) case by not storing the old_ instance_type. However,
    this is confusing to the vast amount of shared code that works on
    migrations and resizes because the "is old == new" logic must be
    replicated everywhere. To cut our losses and hopefully avoid other
    hidden dragons, this patch removes the optimization and makes both
    paths the same.

    Fixes bug #1160489

    Change-Id: I2ad7b02239baada5fd2174f5b61cdc90ec91e6f5

Changed in nova:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (milestone-proposed)

Fix proposed to branch: milestone-proposed
Review: https://review.openstack.org/25521

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (milestone-proposed)

Reviewed: https://review.openstack.org/25521
Committed: http://github.com/openstack/nova/commit/1fa7879071310fda7d8393716441dd02be38fc30
Submitter: Jenkins
Branch: milestone-proposed

commit 1fa7879071310fda7d8393716441dd02be38fc30
Author: Dan Smith <email address hidden>
Date: Tue Mar 26 13:14:56 2013 -0700

    Always store old instance_type during a migration

    Previously, we were trying to optimize for the migration (as opposed
    to resize) case by not storing the old_ instance_type. However,
    this is confusing to the vast amount of shared code that works on
    migrations and resizes because the "is old == new" logic must be
    replicated everywhere. To cut our losses and hopefully avoid other
    hidden dragons, this patch removes the optimization and makes both
    paths the same.

    Fixes bug #1160489

    Change-Id: I2ad7b02239baada5fd2174f5b61cdc90ec91e6f5
    (cherry picked from commit 985d703d2731fe9ec968ef6698bbf39743ede32a)

Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: grizzly-rc2 → 2013.1
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.