Aggregate metadata is not correctly handled by compute

Bug #1232179 reported by Christopher Lefelhocz
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Bob Ball

Bug Description

Hopefully this is not covered under a different bug. What I'm seeing is that compute does not properly handle metadata in the "aggregate" class of API commands. Specifically, after adding metadata to an aggregate and then performing an add-host, compute logs the following exception:
2013-09-27 17:48:44.396 DEBUG nova.openstack.common.rpc.amqp [-] received {u'_context_roles': [u'admin'], u'_context_request_id': u'req-643da000-e5ae-482c-ba01-027ee61f5085', u'_context_quota_class': None, u'_context_user_name': u'admin', u'_context_project_name': u'admin', u'_context_service_catalog': [{u'endpoints_links': [], u'endpoints': [{u'adminURL': u'http://172.24.4.10:8776/v1/0fa0013cceb4430ba09e88a18d344537', u'region': u'RegionOne', u'publicURL': u'http://172.24.4.10:8776/v1/0fa0013cceb4430ba09e88a18d344537', u'internalURL': u'http://172.24.4.10:8776/v1/0fa0013cceb4430ba09e88a18d344537', u'id': u'1733df26b558409ca93bb1ffe8851929'}], u'type': u'volume', u'name': u'cinder'}], u'_context_tenant': u'0fa0013cceb4430ba09e88a18d344537', u'_context_auth_token': '<SANITIZED>', u'args': {u'aggregate': {u'name': u'foo', u'availability_zone': u'nova', u'deleted': False, u'created_at': u'2013-09-27T17:47:24.000000', u'updated_at': None, u'hosts': [u'devstack2'], u'deleted_at': None, u'id': 2, u'metadata': {u'hypervisor': u'true', u'availability_zone': u'nova'}}, u'host': u'devstack2', u'slave_info': None}, u'namespace': None, u'_context_instance_lock_checked': False, u'_context_timestamp': u'2013-09-27T17:48:44.352804', u'_context_is_admin': True, u'version': u'2.14', u'_context_project_id': u'0fa0013cceb4430ba09e88a18d344537', u'_context_user': u'bd2e35f5cdfa4272b7f03d601c95a61d', u'_unique_id': u'13c5f2d81c484bbebdd8214574060ab3', u'_context_read_deleted': u'no', u'_context_user_id': u'bd2e35f5cdfa4272b7f03d601c95a61d', u'method': u'add_aggregate_host', u'_context_remote_address': u'172.24.4.10'} from (pid=10210) _safe_log /opt/stack/nova/nova/openstack/common/rpc/common.py:277
2013-09-27 17:48:44.397 DEBUG nova.openstack.common.rpc.amqp [-] unpacked context: {'read_deleted': u'no', 'project_name': u'admin', 'user_id': u'bd2e35f5cdfa4272b7f03d601c95a61d', 'roles': [u'admin'], 'timestamp': u'2013-09-27T17:48:44.352804', 'auth_token': '<SANITIZED>', 'remote_address': u'172.24.4.10', 'quota_class': None, 'is_admin': True, 'user': u'bd2e35f5cdfa4272b7f03d601c95a61d', 'service_catalog': [{u'endpoints': [{u'adminURL': u'http://172.24.4.10:8776/v1/0fa0013cceb4430ba09e88a18d344537', u'region': u'RegionOne', u'id': u'1733df26b558409ca93bb1ffe8851929', u'internalURL': u'http://172.24.4.10:8776/v1/0fa0013cceb4430ba09e88a18d344537', u'publicURL': u'http://172.24.4.10:8776/v1/0fa0013cceb4430ba09e88a18d344537'}], u'endpoints_links': [], u'type': u'volume', u'name': u'cinder'}], 'request_id': u'req-643da000-e5ae-482c-ba01-027ee61f5085', 'instance_lock_checked': False, 'project_id': u'0fa0013cceb4430ba09e88a18d344537', 'user_name': u'admin', 'tenant': u'0fa0013cceb4430ba09e88a18d344537'} from (pid=10210) _safe_log /opt/stack/nova/nova/openstack/common/rpc/common.py:277
2013-09-27 17:48:44.400 ERROR nova.openstack.common.rpc.amqp [req-643da000-e5ae-482c-ba01-027ee61f5085 admin admin] Exception during message handling
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp File "/opt/stack/nova/nova/openstack/common/rpc/amqp.py", line 461, in _process_data
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp **args)
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp File "/opt/stack/nova/nova/openstack/common/rpc/dispatcher.py", line 172, in dispatch
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp result = getattr(proxyobj, method)(ctxt, **kwargs)
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp File "/opt/stack/nova/nova/exception.py", line 90, in wrapped
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp payload)
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp File "/opt/stack/nova/nova/exception.py", line 73, in wrapped
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp return f(self, context, *args, **kw)
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp File "/opt/stack/nova/nova/compute/manager.py", line 4993, in add_aggregate_host
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp slave_info=slave_info)
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp File "/opt/stack/nova/nova/virt/xenapi/driver.py", line 628, in add_to_aggregate
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp return self._pool.add_to_aggregate(context, aggregate, host, **kwargs)
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp File "/opt/stack/nova/nova/virt/xenapi/pool.py", line 77, in add_to_aggregate
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp if not pool_states.is_hv_pool(aggregate['metadetails']):
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp KeyError: 'metadetails'
2013-09-27 17:48:44.400 TRACE nova.openstack.common.rpc.amqp

Based on the evidence, it looks like the caller is passing the aggregate's metadata under the key metadata (a change made back in August?), while the Xen code is still looking for it under metadetails. The key names need to match.
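To illustrate the mismatch, here is a minimal sketch in plain Python. The aggregate dict is trimmed from the RPC message in the log above; get_aggregate_metadata is a hypothetical helper made up for this example, not a function in nova and not the fix that was eventually merged.

```python
# Aggregate payload trimmed from the RPC message in the log above.
aggregate = {
    "name": "foo",
    "hosts": ["devstack2"],
    "metadata": {"hypervisor": "true", "availability_zone": "nova"},
}


def get_aggregate_metadata(aggregate):
    """Hypothetical helper: return the metadata under either key name."""
    if "metadata" in aggregate:
        return aggregate["metadata"]
    # Fall back to the older key name; empty dict if neither is present.
    return aggregate.get("metadetails", {})


# The lookup from the traceback fails because the payload only has 'metadata'.
try:
    aggregate["metadetails"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(lookup_failed)
print(get_aggregate_metadata(aggregate))
```

A tolerant lookup like this only hides the disagreement. Making both sides agree on one key name is the cleaner option.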

This was produced on a devstack with nova at commit ab5a99bbca4d68002c887b5dd7b3741b57f650ee.

tags: added: compute
Joe Gordon (jogo) wrote :

I assume you are using xenapi? Adding xenapi tag

tags: added: xenserver
John Garbutt (johngarbutt) wrote :

Yeah, the impact here is that XenAPI pool is broken (again) in havana.

I am proposing we pull this in Icehouse (or look to deprecate it, at least).
No one is stepping up to maintain and test this, it seems.

Changed in nova:
status: New → Incomplete
status: Incomplete → Triaged
importance: Undecided → Medium
Bob Ball (bob-ball) wrote :

Add this to the Icehouse discussions. I agree that we either need to test it or deprecate it, but we need a discussion about which we do.

Bob Ball (bob-ball) wrote :

Very interestingly I get this same stack trace from the latest Havana on a host that has not been joined to any aggregate - and I only see it on first run.

Is the above error fatal, or do things continue to work?
My devstack VM continues to work as a single host even without this:

2013-10-02 03:09:23.950 ERROR nova.openstack.common.rpc.amqp [req-23bdcc58-49c5-4dbd-918c-b97dd1ab16aa admin admin] Exception during message handling
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  File "/opt/stack/nova/nova/openstack/common/rpc/amqp.py", line 461, in _process_data
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  **args)
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  File "/opt/stack/nova/nova/openstack/common/rpc/dispatcher.py", line 172, in dispatch
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  result = getattr(proxyobj, method)(ctxt, **kwargs)
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  File "/opt/stack/nova/nova/exception.py", line 90, in wrapped
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  payload)
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  File "/opt/stack/nova/nova/exception.py", line 73, in wrapped
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  return f(self, context, *args, **kw)
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  File "/opt/stack/nova/nova/compute/manager.py", line 5002, in add_aggregate_host
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  slave_info=slave_info)
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  File "/opt/stack/nova/nova/virt/xenapi/driver.py", line 627, in add_to_aggregate
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  return self._pool.add_to_aggregate(context, aggregate, host, **kwargs)
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  File "/opt/stack/nova/nova/virt/xenapi/pool.py", line 77, in add_to_aggregate
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp  if not pool_states.is_hv_pool(aggregate['metadetails']):
2013-10-02 03:09:23.950 TRACE nova.openstack.common.rpc.amqp KeyError: 'metadetails'

Bob Ball (bob-ball) wrote :

Broken by https://review.openstack.org/#/c/43157/ which added the translation from metadetails to metadata for the aggregate instance object.
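As a rough sketch of how a rename like that breaks an old consumer (this is not the code from that review; translate_aggregate and the sample row are hypothetical and only show the shape of the problem):

```python
def translate_aggregate(db_aggregate):
    """Hypothetical translation: copy fields, renaming 'metadetails' to 'metadata'."""
    translated = {}
    for key, value in db_aggregate.items():
        if key == "metadetails":
            key = "metadata"
        translated[key] = value
    return translated


# Made-up row in the old shape, with the metadata under 'metadetails'.
db_row = {"id": 2, "name": "foo", "metadetails": {"hypervisor": "true"}}
translated = translate_aggregate(db_row)

# Code that still indexes 'metadetails' no longer finds the key.
print("metadetails" in translated)
print(translated["metadata"])
```

Anything downstream that still does aggregate['metadetails'], as pool.py line 77 does in the traceback, hits the KeyError once it is handed the translated form.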

Christopher Lefelhocz (christopher-lefelhoc) wrote :

We'll (Christopher to add, John to discuss at summit) add a proposal blueprint for discussion in the next few days.

Bob Ball (bob-ball)
Changed in nova:
assignee: nobody → Bob Ball (bob-ball)
status: Triaged → In Progress
Bob Ball (bob-ball) wrote :

Fixed by https://review.openstack.org/#/c/49400/ - not sure why the automatic link wasn't added

tags: added: havana-rc-potential
Thierry Carrez (ttx)
tags: added: havana-backport-potential
removed: havana-rc-potential
Bob Ball (bob-ball) wrote :

A better fix is at https://review.openstack.org/#/c/50466 - although the original (https://review.openstack.org/#/c/49400/) may be a simpler backport.

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/50466
Committed: http://github.com/openstack/nova/commit/f2f58eef93aec45e46a9a2ab06fc8b00a9420350
Submitter: Jenkins
Branch: master

commit f2f58eef93aec45e46a9a2ab06fc8b00a9420350
Author: Dan Smith <email address hidden>
Date: Tue Oct 8 12:54:27 2013 -0700

    Make XenAPI use Aggregate object

    This makes the XenAPI driver use the Aggregate object for its work,
    and avoids the need to call back through virtapi to conductor
    directly. It also allows us to convert the two aggregate-related
    compute manager methods fully to new-world objects.

    Related to blueprint compute-manager-objects
    Related to blueprint virt-objects

    Closes-bug: #1232179
    Change-Id: Ib38ab0e4d6feefebda37888f150752167474b693

Changed in nova:
status: In Progress → Fix Committed
Changed in nova:
milestone: none → icehouse-1
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-1 → 2014.1