senlin

health-policy cannot work

Bug #1655511 reported by XueFeng Liu on 2017-01-11

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
senlin	Fix Released	Medium	XueFeng Liu	senlin ocata-3

Bug Description

In the newest senlin version, the health policy cannot perform cluster_check because the project in the health manager's context is None.

More info:
2017-01-10 17:31:01.948 ERROR oslo.service.loopingcall [-] Fixed interval looping call 'senlin.engine.health_manager.HealthManager._poll_cluster' failed
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall Traceback (most recent call last):
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 136, in _run_loop
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall result = func(*self.args, **self.kw)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/opt/stack/senlin/senlin/engine/health_manager.py", line 135, in _poll_cluster
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall self.rpc_client.call(self.ctx, 'cluster_check', req)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/opt/stack/senlin/senlin/rpc/client.py", line 56, in call
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall return client.call(ctxt, method, req=req)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 465, in call
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall return self.prepare().call(ctxt, method, **kwargs)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall retry=self.retry)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 97, in _send
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall timeout=timeout, retry=retry)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 467, in send
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall retry=retry)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 458, in _send
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall raise result
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall ResourceNotFound_Remote: The cluster (d2f0dc8d-b8d1-432d-a67a-36974746fdd0) could not be found.
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall Traceback (most recent call last):
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/opt/stack/senlin/senlin/engine/service.py", line 71, in wrapped
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall return func(self, ctx, req_obj)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/opt/stack/senlin/senlin/engine/service.py", line 1370, in cluster_check
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall db_cluster = co.Cluster.find(ctx, req.identity)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall File "/opt/stack/senlin/senlin/objects/cluster.py", line 73, in find
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall raise exc.ResourceNotFound(type='cluster', id=identity)
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall ResourceNotFound: The cluster (d2f0dc8d-b8d1-432d-a67a-36974746fdd0) could not be found.
2017-01-10 17:31:01.948 TRACE oslo.service.loopingcall

XueFeng Liu (jonnary-liu) on 2017-01-11

Changed in senlin:
assignee:	nobody → XueFeng Liu (jonnary-liu)
status:	New → In Progress

Qiming Teng (tengqim) wrote on 2017-01-11:

A cluster check request initiated from the health manager doesn't carry a proper context. But when the request arrives at the cluster_check RPC call in the engine service, we check whether user and project are set, and assign them if they are not.

I don't understand what is broken, or how.

XueFeng Liu (jonnary-liu) on 2017-01-11

description:

updated

XueFeng Liu (jonnary-liu) wrote on 2017-01-11:

Yes, I fixed the cluster_check context before.
I think project_safe was changed not long ago.
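
The failure mode discussed above can be illustrated with a minimal sketch. It is not senlin's code: Context, Cluster, find_cluster and the in-memory CLUSTERS store are hypothetical stand-ins, and the ordering (project-scoped lookup first, user/project assignment afterwards) is an assumption based on the traceback, where Cluster.find raises inside cluster_check. Under those assumptions, a context whose project is None never reaches the step that would fill it in.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Context:
    # Hypothetical stand-in for a request context (not senlin's class).
    user: Optional[str] = None
    project: Optional[str] = None


@dataclass
class Cluster:
    # Hypothetical stand-in for a stored cluster record.
    id: str
    user: str
    project: str


class ResourceNotFound(Exception):
    pass


# Hypothetical in-memory store standing in for the database.
CLUSTERS: Dict[str, Cluster] = {
    "c1": Cluster(id="c1", user="alice", project="demo"),
}


def find_cluster(ctx: Context, identity: str,
                 project_safe: bool = True) -> Cluster:
    cluster = CLUSTERS.get(identity)
    if cluster is None:
        raise ResourceNotFound(identity)
    # Assumed meaning of project_safe: hide clusters that belong to a
    # different project than the one in the caller's context.
    if project_safe and cluster.project != ctx.project:
        raise ResourceNotFound(identity)
    return cluster


def cluster_check(ctx: Context, identity: str) -> Cluster:
    # The scoped lookup runs first, so a context with project=None
    # fails here ...
    cluster = find_cluster(ctx, identity)
    # ... and never reaches the fallback that assigns user and project.
    if not ctx.user or not ctx.project:
        ctx.user = cluster.user
        ctx.project = cluster.project
    return cluster
```

In this sketch, cluster_check(Context(), "c1") raises ResourceNotFound even though the cluster exists, while the same call with Context(user="alice", project="demo") succeeds.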

Qiming Teng (tengqim) on 2017-01-11

Changed in senlin:
status:	In Progress → Triaged
importance:	Undecided → Medium
milestone:	none → ocata-3

OpenStack Infra (hudson-openstack) on 2017-01-12

Changed in senlin:
status:	Triaged → In Progress

OpenStack Infra (hudson-openstack) wrote on 2017-01-15: Fix merged to senlin (master)

Reviewed: https://review.openstack.org/417697
Committed: https://git.openstack.org/cgit/openstack/senlin/commit/?id=26fc50b3f8904bc85bb5a8c4699c5bdd74835a58
Submitter: Jenkins
Branch: master

commit 26fc50b3f8904bc85bb5a8c4699c5bdd74835a58
Author: jonnary <email address hidden>
Date: Thu Jan 12 13:19:18 2017 +0800

Fix cluster-check cannot work problem

This patch fixes cluster-check context in health manage.

Change-Id: I444c468683c06dcba1b9928b8e5b82dee21b1e50
Closes-Bug: #1655511

Changed in senlin:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2017-01-26: Fix included in openstack/senlin 3.0.0.0b3

This issue was fixed in the openstack/senlin 3.0.0.0b3 development milestone.
