check dispatcher when invoke in case of pre-load failed

Bug #1301777 reported by Jia Dong
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Ceilometer | Fix Released | Low | Srinivas Sakhamuri |

Bug Description

In this test scenario:
1. When the collector service starts and the db is not yet ready, it logs an error like 'Could not load 'database': could not connect to...' and its dispatcher_manager is empty ([ ]), but the service keeps running;
2. Because there is no dispatcher, recording sample data from a pollster or notification fails with an error from stevedore (RuntimeError: No ceilometer.dispatcher extensions found);
3. Later the db becomes ready, but there is no mechanism to check the db status and reconnect, so the collector service remains unable to record data.
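
The three steps above can be reproduced in miniature. The sketch below is not Ceilometer code: `EmptyManager`, `load_dispatchers`, and `Collector` are hypothetical stand-ins, and `EmptyManager.map` only imitates the RuntimeError quoted in step 2.

```python
class EmptyManager(list):
    """Hypothetical stand-in for an extension manager that may be empty."""

    def map(self, func, *args):
        if not self:
            # Imitates the error quoted in step 2 of the report.
            raise RuntimeError('No ceilometer.dispatcher extensions found')
        return [func(ext, *args) for ext in self]


def load_dispatchers(db_ready):
    """Return the loaded dispatchers; empty when the db is unreachable."""
    return EmptyManager(['database'] if db_ready else [])


class Collector:
    def __init__(self, db_ready):
        # Loaded once at startup and never refreshed afterwards.
        self.dispatcher_manager = load_dispatchers(db_ready)

    def record(self, sample):
        return self.dispatcher_manager.map(lambda ext, s: (ext, s), sample)


collector = Collector(db_ready=False)   # step 1: db down at startup
failure = None
try:
    collector.record({'counter_name': 'cpu'})
except RuntimeError as exc:
    failure = str(exc)                  # step 2: every write fails
# Step 3: the db coming back changes nothing, because nothing ever
# calls load_dispatchers() again.
```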

So I think we should add a check in DispatchedService, like:

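    # Fragment of a DispatchedService method; assumes the module already
    # has `from stevedore import named`, plus cfg, LOG and _.
    if list(self.dispatcher_manager):
        # Dispatchers were loaded earlier; nothing to do.
        return
    # The pre-load at startup found nothing, so try loading again.
    self.dispatcher_manager = named.NamedExtensionManager(
        namespace=self.DISPATCHER_NAMESPACE,
        names=cfg.CONF.dispatcher,
        invoke_on_load=True,
        invoke_args=[cfg.CONF])
    if not list(self.dispatcher_manager):
        LOG.warning(_('Failed to load any dispatchers for %s when '
                      're-fetch'),
                    self.DISPATCHER_NAMESPACE)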

Before invoking the dispatcher in the collector service, we should check whether a dispatcher exists.
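
A minimal sketch of such a check-before-invoke step, under stated assumptions: `loader` is a hypothetical callable standing in for building the NamedExtensionManager, and the class and method names are illustrative rather than the code that was eventually merged.

```python
class DispatchedService:
    """Illustrative only: re-checks its dispatchers before every use."""

    def __init__(self, loader):
        # `loader` is a hypothetical callable returning a list of
        # dispatchers; it returns [] while the backend is unreachable.
        self._loader = loader
        self.dispatchers = loader()

    def _ensure_dispatchers(self):
        """Reload the dispatchers if the earlier load found none."""
        if not self.dispatchers:
            self.dispatchers = self._loader()
        return bool(self.dispatchers)

    def record(self, sample):
        if not self._ensure_dispatchers():
            # Still nothing to dispatch to: report it instead of raising.
            return False
        for dispatcher in self.dispatchers:
            dispatcher(sample)
        return True


# Usage: the backend is down for the first two loads, then comes up.
stored = []
attempts = iter([[], [], [stored.append]])
service = DispatchedService(lambda: next(attempts))

first = service.record({'volume': 1})    # reload still finds nothing
second = service.record({'volume': 2})   # reload succeeds, sample stored
```

In this sketch the sample passed while no dispatcher is available is simply dropped; whether to drop, queue, or raise is a separate design decision.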

gordon chung (chungg) wrote :

This is not likely to occur, but I guess if the db is down during collector startup, this could be useful.

Changed in ceilometer:
importance: Undecided → Low
ZhiQiang Fan (aji-zqfan) wrote :

@gordon chung (chungg)

It does occur on SLES 11 SP3 with OpenStack 2013.2.2 or earlier; 2013.2.3 fixes the service start dependency.

ZhiQiang Fan (aji-zqfan)
Changed in ceilometer:
assignee: nobody → ZhiQiang Fan (aji-zqfan)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (master)

Fix proposed to branch: master
Review: https://review.openstack.org/86155

Changed in ceilometer:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (stable/havana)

Fix proposed to branch: stable/havana
Review: https://review.openstack.org/87133

OpenStack Infra (hudson-openstack) wrote : Change abandoned on ceilometer (master)

Change abandoned by ZhiQiang Fan (<email address hidden>) on branch: master
Review: https://review.openstack.org/86155

ZhiQiang Fan (aji-zqfan)
Changed in ceilometer:
assignee: ZhiQiang Fan (aji-zqfan) → nobody
status: In Progress → New
Srinivas Sakhamuri (srinivas-sakhamuri) wrote :

This bug can cause issues when the collector is started while the MySQL connection times out: the collector service does not recover when MySQL becomes available.

Changed in ceilometer:
assignee: nobody → Srinivas Sakhamuri (srinivas-sakhamuri)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (master)

Fix proposed to branch: master
Review: https://review.openstack.org/127128

Changed in ceilometer:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/127128
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=48f0c4b423ea54ba89a85cc49819308295eb45a5
Submitter: Jenkins
Branch: master

commit 48f0c4b423ea54ba89a85cc49819308295eb45a5
Author: Srinivas Sakhamuri <email address hidden>
Date: Thu Oct 9 03:39:30 2014 +0000

    Allow collector service database connection retry

    At startup the collector service loads the database dispatcher.
    If the DB is momentarily offline, the collector service starts but
    continues to run without the ability to retry the DB connection.
    This fix allows the connection to be retried the next time the
    collector service attempts to write to the database.

    Change-Id: I10fef61a9c525e1c84d8cadbca86d6bde84927ba
    Closes-Bug: 1301777
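
The retry-on-write behaviour described in the commit message can be sketched as follows. This is an illustration under stated assumptions, not the committed code: `DatabaseDispatcher`, its `connect` factory, and the in-memory `db` dict are all hypothetical.

```python
import logging

LOG = logging.getLogger(__name__)


class DatabaseDispatcher:
    """Illustrative dispatcher that retries its connection on write."""

    def __init__(self, connect):
        # `connect` is a hypothetical factory that returns a connection
        # object or raises while the database is offline.
        self._connect = connect
        self._conn = self._try_connect()

    def _try_connect(self):
        try:
            return self._connect()
        except Exception as err:
            LOG.error('Failed to connect to db, will retry on next '
                      'write: %s', err)
            return None

    def record_metering_data(self, data):
        if self._conn is None:
            # Retry at write time rather than only at startup.
            self._conn = self._try_connect()
        if self._conn is None:
            return False  # the sample is dropped in this sketch
        self._conn.append(data)
        return True


# Usage: a fake database that is offline at startup.
db = {'online': False, 'rows': []}


def connect():
    if not db['online']:
        raise IOError('could not connect')
    return db['rows']


dispatcher = DatabaseDispatcher(connect)
before = dispatcher.record_metering_data({'volume': 1})
db['online'] = True
after = dispatcher.record_metering_data({'volume': 2})
```

Here the connection is compared against `None` rather than tested for truthiness, so an empty but valid connection object is not mistaken for a failed one.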

Changed in ceilometer:
status: In Progress → Fix Committed
Eoghan Glynn (eglynn)
Changed in ceilometer:
milestone: none → kilo-1
Thierry Carrez (ttx)
Changed in ceilometer:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in ceilometer:
milestone: kilo-1 → 2015.1.0