Leases fail to start because freepool not found

Bug #1847821 reported by Jason Anderson
Affects: Blazar
Status: Fix Released
Importance: Medium
Assigned to: Jason Anderson
Milestone: (none)

Bug Description

Sometimes a lease will fail to start, and Blazar reports the reason as the
freepool not being found. On inspecting the system, however, we confirmed that
the freepool existed and was working fine for other leases.

This was discovered in the Stein branch.

Example trace output:
---------------------
2019-10-11 14:15:11.727 8 WARNING blazar.utils.openstack.nova [-] Removing hosts added to aggregate 34838: [u'eddfa86d-f879-4114-9876-bc716411a2ce']: HostNotInFreePool: Host f9995c9d-fc60-4f20-a206-76b9b8d9eda2 not in freepool 'freepool'
2019-10-11 14:15:12.055 8 WARNING blazar.utils.openstack.nova [-] Adding hosts back to freepool: [u'eddfa86d-f879-4114-9876-bc716411a2ce']: HostNotInFreePool: Host f9995c9d-fc60-4f20-a206-76b9b8d9eda2 not in freepool 'freepool'
2019-10-11 14:15:12.461 8 ERROR blazar.status [-] Lease 36836677-dfaf-48b4-9e1e-cf62ca6c2754 went into ERROR status: Aggregate freepool could not be found. (HTTP 404) (Request-ID: req-2ce0dd30-032a-494e-9216-52df65ec0f2f): NotFound: Aggregate freepool could not be found. (HTTP 404) (Request-ID: req-2ce0dd30-032a-494e-9216-52df65ec0f2f)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service [-] Error occurred while handling start_lease event for lease 36836677-dfaf-48b4-9e1e-cf62ca6c2754.: NotFound: Aggregate freepool could not be found. (HTTP 404) (Request-ID: req-2ce0dd30-032a-494e-9216-52df65ec0f2f)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service Traceback (most recent call last):
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/blazar/manager/service.py", line 248, in _exec_event
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service event_fn(lease_id=event['lease_id'], event_id=event['id'])
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/blazar/status.py", line 235, in wrapper
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service {'status': cls.ERROR})
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service self.force_reraise()
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service six.reraise(self.type_, self.value, self.tb)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/blazar/status.py", line 219, in wrapper
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service result = func(*args, **kwargs)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/blazar/manager/service.py", line 638, in start_lease
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service status.reservation.ACTIVE)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/blazar/manager/service.py", line 668, in _basic_action
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service reservation['resource_id']
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/blazar/plugins/oshosts/host_plugin.py", line 215, in on_start
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service pool.add_computehost(host_reservation['aggregate_id'], hosts)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/blazar/utils/openstack/nova.py", line 392, in add_computehost
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service self.nova.aggregates.add_host(freepool_agg.name, host)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/novaclient/v2/aggregates.py", line 84, in add_host
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service body, "aggregate")
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/novaclient/base.py", line 366, in _create
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service resp, body = self.api.client.post(url, body=body)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 334, in post
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service return self.request(url, 'POST', **kwargs)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service File "/var/lib/kolla/venv/lib/python2.7/site-packages/novaclient/client.py", line 83, in request
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service raise exceptions.from_response(resp, body, url, method)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service NotFound: Aggregate freepool could not be found. (HTTP 404) (Request-ID: req-2ce0dd30-032a-494e-9216-52df65ec0f2f)
2019-10-11 14:15:12.559 8 ERROR blazar.manager.service

description: updated
Revision history for this message
Tetsuro Nakamura (tetsuro0907) wrote :
Changed in blazar:
assignee: nobody → Jason Anderson (jasonandersonatuchicago)
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to blazar (master)

Reviewed: https://review.opendev.org/688214
Committed: https://git.openstack.org/cgit/openstack/blazar/commit/?id=0346b02ccea75b992d0b72a6dbaa1f05864450e0
Submitter: Zuul
Branch: master

commit 0346b02ccea75b992d0b72a6dbaa1f05864450e0
Author: Jason Anderson <email address hidden>
Date: Fri Oct 11 16:51:28 2019 -0500

    Fix issue moving hosts back to freepool

    If an aggregate fails to be created for some reason, for example because
    the requested node is not in the freepool, some cleanup occurs. Part of
    the cleanup is ensuring the target node is moved _back_ to the freepool.
    This should theoretically "heal" cases where the node wasn't in the
    freepool but also wasn't in some lease aggregate.

    However, this call to Nova's API refers to the freepool aggregate by its
    name, which is not supported here: the ID must be used[1]. This caused
    this operation to fail and raise a NotFound, confusingly (because Nova
    couldn't find an aggregate with ID='freepool' for example.)

    [1]:
    https://docs.openstack.org/api-ref/compute/?expanded=add-host-detail#add-host

    Closes-Bug: #1847821
    Change-Id: I7af4d407b183578617131f0de42becb3dc2bc415
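
The root cause described above can be illustrated with a small self-contained sketch (the classes and names below are illustrative stand-ins, not Blazar's or novaclient's actual code): the "add host" call accepts only the aggregate's numeric ID, so passing the name 'freepool' produces the misleading 404, and the fix is to resolve the name to an ID first.

```python
# Illustrative sketch of bug #1847821: Nova's "add host to aggregate" API
# accepts only the aggregate's numeric ID, not its name. The stub classes
# below are hypothetical stand-ins for novaclient objects.

class Aggregate:
    def __init__(self, id, name):
        self.id = id
        self.name = name


class AggregateAPI:
    """Minimal stand-in for a Nova aggregates manager."""

    def __init__(self, aggregates):
        self._by_id = {a.id: a for a in aggregates}
        self.added = []  # records (aggregate_id, host) calls that succeeded

    def list(self):
        return list(self._by_id.values())

    def add_host(self, aggregate_id, host):
        # Like Nova, only an ID is accepted; a name is simply "not found".
        if aggregate_id not in self._by_id:
            raise LookupError(
                "Aggregate %s could not be found. (HTTP 404)" % aggregate_id)
        self.added.append((aggregate_id, host))


def add_host_to_freepool(api, host, freepool_name="freepool"):
    # Buggy pre-fix behavior was equivalent to: api.add_host(freepool_name, host)
    # Fixed behavior: resolve the freepool aggregate by name, then pass its ID.
    freepool = next(a for a in api.list() if a.name == freepool_name)
    api.add_host(freepool.id, host)


# "lease-agg" and the IDs here are illustrative; the host UUID is from the trace.
api = AggregateAPI([Aggregate(34838, "lease-agg"), Aggregate(1, "freepool")])
add_host_to_freepool(api, "f9995c9d-fc60-4f20-a206-76b9b8d9eda2")
```

With the pre-fix call `api.add_host("freepool", host)`, the stub raises the same confusing "Aggregate freepool could not be found" error seen in the trace, even though the freepool exists.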

Changed in blazar:
status: New → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to blazar (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/689541

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to blazar (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/689542

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to blazar (stable/train)

Reviewed: https://review.opendev.org/689541
Committed: https://git.openstack.org/cgit/openstack/blazar/commit/?id=b30cece36ea143a97544799fb807644a51fd1c60
Submitter: Zuul
Branch: stable/train

commit b30cece36ea143a97544799fb807644a51fd1c60
Author: Jason Anderson <email address hidden>
Date: Fri Oct 11 16:51:28 2019 -0500

    Fix issue moving hosts back to freepool

    If an aggregate fails to be created for some reason, for example because
    the requested node is not in the freepool, some cleanup occurs. Part of
    the cleanup is ensuring the target node is moved _back_ to the freepool.
    This should theoretically "heal" cases where the node wasn't in the
    freepool but also wasn't in some lease aggregate.

    However, this call to Nova's API refers to the freepool aggregate by its
    name, which is not supported here: the ID must be used[1]. This caused
    this operation to fail and raise a NotFound, confusingly (because Nova
    couldn't find an aggregate with ID='freepool' for example.)

    [1]:
    https://docs.openstack.org/api-ref/compute/?expanded=add-host-detail#add-host

    Closes-Bug: #1847821
    Change-Id: I7af4d407b183578617131f0de42becb3dc2bc415
    (cherry picked from commit 0346b02ccea75b992d0b72a6dbaa1f05864450e0)

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to blazar (stable/stein)

Reviewed: https://review.opendev.org/689542
Committed: https://git.openstack.org/cgit/openstack/blazar/commit/?id=5d302258741954292b2dd82163af32f8379f7707
Submitter: Zuul
Branch: stable/stein

commit 5d302258741954292b2dd82163af32f8379f7707
Author: Jason Anderson <email address hidden>
Date: Fri Oct 11 16:51:28 2019 -0500

    Fix issue moving hosts back to freepool

    If an aggregate fails to be created for some reason, for example because
    the requested node is not in the freepool, some cleanup occurs. Part of
    the cleanup is ensuring the target node is moved _back_ to the freepool.
    This should theoretically "heal" cases where the node wasn't in the
    freepool but also wasn't in some lease aggregate.

    However, this call to Nova's API refers to the freepool aggregate by its
    name, which is not supported here: the ID must be used[1]. This caused
    this operation to fail and raise a NotFound, confusingly (because Nova
    couldn't find an aggregate with ID='freepool' for example.)

    [1]:
    https://docs.openstack.org/api-ref/compute/?expanded=add-host-detail#add-host

    Closes-Bug: #1847821
    Change-Id: I7af4d407b183578617131f0de42becb3dc2bc415
    (cherry picked from commit 0346b02ccea75b992d0b72a6dbaa1f05864450e0)
    (cherry picked from commit b30cece36ea143a97544799fb807644a51fd1c60)

tags: added: in-stable-stein
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/blazar 4.0.1

This issue was fixed in the openstack/blazar 4.0.1 release.
