health-policy cannot work

Bug #1655511 reported by XueFeng Liu
This bug affects 1 person
Affects: senlin
Status: Fix Released
Importance: Medium
Assigned to: XueFeng Liu

Bug Description

In the newest senlin version, the health policy cannot perform cluster_check because the project in the health manager's context is None.

More info:
2017-01-10 17:31:01.948 ERROR oslo.service.loopingcall [-] Fixed interval looping call 'senlin.engine.health_manager.HealthManager._poll_cluster' failed
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall Traceback (most recent call last):
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 136, in _run_loop
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall result = func(*self.args, **self.kw)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/opt/stack/senlin/senlin/engine/health_manager.py", line 135, in _poll_cluster
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall self.rpc_client.call(self.ctx, 'cluster_check', req)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/opt/stack/senlin/senlin/rpc/client.py", line 56, in call
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall return client.call(ctxt, method, req=req)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 465, in call
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall return self.prepare().call(ctxt, method, **kwargs)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall retry=self.retry)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 97, in _send
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall timeout=timeout, retry=retry)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 467, in send
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall retry=retry)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 458, in _send
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall raise result
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall ResourceNotFound_Remote: The cluster (d2f0dc8d-b8d1-432d-a67a-36974746fdd0) could not be found.
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall Traceback (most recent call last):
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/opt/stack/senlin/senlin/engine/service.py", line 71, in wrapped
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall return func(self, ctx, req_obj)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/opt/stack/senlin/senlin/engine/service.py", line 1370, in cluster_check
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall db_cluster = co.Cluster.find(ctx, req.identity)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/opt/stack/senlin/senlin/objects/cluster.py", line 73, in find
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall raise exc.ResourceNotFound(type='cluster', id=identity)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall ResourceNotFound: The cluster (d2f0dc8d-b8d1-432d-a67a-36974746fdd0) could not be found.
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall

Changed in senlin:
assignee: nobody → XueFeng Liu (jonnary-liu)
status: New → In Progress
Qiming Teng (tengqim) wrote :

A cluster check request initiated from the health manager doesn't have a proper context. But when the request arrives at the cluster_check RPC call in the engine service, we check whether user and project are set, and assign them if they are not.

I don't understand what is broken, or how.

description: updated
XueFeng Liu (jonnary-liu) wrote :

Yes, I have fixed the cluster_check context before.
I think project_safe was changed not long ago.
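
For illustration only, here is a minimal, self-contained sketch of how a project-scoped lookup could fail before the handler gets a chance to fill in a missing user and project. This is not Senlin's code: `Context`, `Cluster`, `CLUSTERS`, `find_cluster`, the `is_admin` flag, and this `cluster_check` are hypothetical stand-ins, and the admin bypass is just one possible remedy, not necessarily what the merged patch does.

```python
from dataclasses import dataclass
from typing import Optional


class ResourceNotFound(Exception):
    """Hypothetical stand-in for the engine's 'not found' error."""


@dataclass
class Context:
    # Hypothetical request context; a health-manager-style caller
    # may leave user and project unset.
    user: Optional[str] = None
    project: Optional[str] = None
    is_admin: bool = False


@dataclass
class Cluster:
    id: str
    user: str
    project: str


# Hypothetical in-memory "database" holding one cluster.
CLUSTERS = {"c1": Cluster(id="c1", user="u1", project="p1")}


def find_cluster(ctx, identity, project_safe=True):
    """Look up a cluster, optionally restricted to the caller's project."""
    cluster = CLUSTERS.get(identity)
    if cluster is None:
        raise ResourceNotFound(identity)
    # With project_safe on, a non-admin context whose project does not
    # match (including project=None) cannot see the cluster at all.
    if project_safe and not ctx.is_admin and cluster.project != ctx.project:
        raise ResourceNotFound(identity)
    return cluster


def cluster_check(ctx, identity):
    """Sketch of a handler that fills in user/project after the lookup."""
    cluster = find_cluster(ctx, identity)  # fails first if project is None
    if not ctx.user or not ctx.project:
        ctx.user = cluster.user
        ctx.project = cluster.project
    return "checked %s for project %s" % (cluster.id, ctx.project)
```

In this sketch, calling `cluster_check(Context(), "c1")` raises `ResourceNotFound` even though the cluster exists, because the lookup runs before the fallback assignment. Passing `Context(is_admin=True)` lets the lookup succeed, after which user and project are filled in from the cluster.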

Qiming Teng (tengqim)
Changed in senlin:
status: In Progress → Triaged
importance: Undecided → Medium
milestone: none → ocata-3
Changed in senlin:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to senlin (master)

Reviewed: https://review.openstack.org/417697
Committed: https://git.openstack.org/cgit/openstack/senlin/commit/?id=26fc50b3f8904bc85bb5a8c4699c5bdd74835a58
Submitter: Jenkins
Branch: master

commit 26fc50b3f8904bc85bb5a8c4699c5bdd74835a58
Author: jonnary <email address hidden>
Date: Thu Jan 12 13:19:18 2017 +0800

    Fix cluster-check cannot work problem

    This patch fixes cluster-check context in health manage.

    Change-Id: I444c468683c06dcba1b9928b8e5b82dee21b1e50
    Closes-Bug: #1655511

Changed in senlin:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/senlin 3.0.0.0b3

This issue was fixed in the openstack/senlin 3.0.0.0b3 development milestone.
