keystone unavailable may cause agent-central unavailable forever

Bug #1287613 reported by ZhiQiang Fan
This bug affects 1 person
Affects     Status        Importance  Assigned to   Milestone
Ceilometer  Fix Released  Low         ZhiQiang Fan
Havana      Won't Fix     Low         Unassigned

Bug Description

There is a chance that ceilometer-agent-central tries to trigger interval_task while keystone is not available. An exception is then raised but not caught, so the enclosing oslo loopingcall stops calling the task function. As a result the agent-central service stays unavailable forever, even after keystone is available again!

This problem was found in a havana environment but may still occur in the master branch (not verified!)

My opinion is simple: skip this interval if keystone is not available.
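
The failure mode and the proposed fix can be illustrated with a small self-contained sketch. It does not use ceilometer or oslo code: `run_fixed_interval`, `KeystoneUnavailable`, `FakeKeystone` and `make_task` are hypothetical stand-ins, and the loop only assumes what is described above, namely that an uncaught exception from the task ends the loop for good.

```python
class KeystoneUnavailable(Exception):
    """Hypothetical stand-in for the error raised when keystone is down."""


class FakeKeystone:
    """Hypothetical keystone stub that fails on the given call numbers."""

    def __init__(self, failing_calls):
        self.failing_calls = set(failing_calls)
        self.calls = 0

    def authenticate(self):
        self.calls += 1
        if self.calls in self.failing_calls:
            raise KeystoneUnavailable("keystone is not available")
        return "token-%d" % self.calls


def run_fixed_interval(task, iterations):
    """Simplified looping call: an uncaught exception ends the loop."""
    completed = 0
    try:
        for _ in range(iterations):
            task()
            completed += 1
    except Exception:
        pass  # assumption: the loop stops and the task is never called again
    return completed


def make_task(keystone, polled, skip_on_failure):
    """Build an interval_task that optionally skips intervals on failure."""
    def interval_task():
        try:
            token = keystone.authenticate()
        except KeystoneUnavailable:
            if skip_on_failure:
                return  # skip this interval, try again on the next one
            raise
        polled.append(token)
    return interval_task


# Unprotected task: keystone fails on the 2nd call and polling never resumes.
polled_unprotected = []
run_fixed_interval(make_task(FakeKeystone([2]), polled_unprotected, False), 5)

# Protected task: the failing interval is skipped, later intervals still run.
polled_protected = []
run_fixed_interval(make_task(FakeKeystone([2]), polled_protected, True), 5)

print(polled_unprotected)
print(polled_protected)
```

In the unprotected run only the first interval is polled, although keystone recovers from the third call on. In the protected run the failed interval is skipped and polling continues.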

ZhiQiang Fan (aji-zqfan)
Changed in ceilometer:
assignee: nobody → ZhiQiang Fan (aji-zqfan)
Changed in ceilometer:
status: New → In Progress
gordon chung (chungg) wrote :

is this a bug that should be raised against oslo or expected behaviour?

gordon chung (chungg) wrote :

marking as rc1 target... fix seems simple enough.

Changed in ceilometer:
milestone: none → icehouse-rc1
importance: Undecided → Low
Ildiko Vancsa (ildiko-vancsa) wrote :

Hi Gordon,

The AgentManager in central/manager.py has a parent class in ceilometer/agent.py:
https://github.com/openstack/ceilometer/blob/master/ceilometer/agent.py#L100 .
The interval_task is introduced there. That AgentManager class in turn has a parent class, which comes from oslo-incubator: https://github.com/openstack/oslo-incubator/blob/master/openstack/common/service.py#L419 .
In Ceilometer we use the __init__ from that class and override the start method, which uses the interval_task. So I assume this modification has nothing to do with oslo. I hope I did not miss anything.

Best Regards,
Ildiko
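
The class relationship described in the comment above can be sketched roughly as follows. This is only an illustration, not the ceilometer or oslo-incubator code: `Service`, `BaseAgentManager` and `CentralAgentManager` are hypothetical stand-in names, and `start` calls `interval_task` once directly, whereas the real code schedules it periodically.

```python
class Service(object):
    """Stand-in for the service base class that comes from oslo-incubator."""

    def __init__(self):
        self.started = False

    def start(self):
        self.started = True


class BaseAgentManager(Service):
    """Stand-in for the AgentManager in ceilometer/agent.py."""

    runs = 0  # counts how often interval_task ran (illustration only)

    def interval_task(self):
        # Introduced at this level, not in the oslo-incubator base class.
        self.runs += 1

    def start(self):
        # Overrides the parent's start(); __init__ is inherited unchanged.
        self.started = True
        self.interval_task()


class CentralAgentManager(BaseAgentManager):
    """Stand-in for the AgentManager of the central agent."""


manager = CentralAgentManager()
manager.start()
print(manager.started, manager.runs)
```

In this sketch the exception handling around interval_task would belong in the ceilometer-side classes, since the base class knows nothing about the task.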

OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/78079
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=ffff1cb13281894dcbbdfa4bc6f63fec4673f10c
Submitter: Jenkins
Branch: master

commit ffff1cb13281894dcbbdfa4bc6f63fec4673f10c
Author: ZhiQiang Fan <email address hidden>
Date: Wed Mar 5 10:58:59 2014 +0800

    Skip central agent interval_task when keystone fails

    There is a chance when ceilometer-agent-central try to trigger interval_task
    but keystone is not available, then an exception is raised but not caught, the
    outside oslo.loopingcall will stop calling the task function, finally leading
    to the agent-central service is no longer available forever, even the keystone
    is available again.

    We can skip the particular interval_task when keystone is not available.

    Change-Id: I0849cafadac8fb8a7670aaa4cc76dc708bdb25a1
    Closes-Bug: #1287613

Changed in ceilometer:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (stable/havana)

Fix proposed to branch: stable/havana
Review: https://review.openstack.org/82291

Thierry Carrez (ttx)
Changed in ceilometer:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in ceilometer:
milestone: icehouse-rc1 → 2014.1
OpenStack Infra (hudson-openstack) wrote : Change abandoned on ceilometer (stable/havana)

Change abandoned by ZhiQiang Fan (<email address hidden>) on branch: stable/havana
Review: https://review.openstack.org/82291
