Host polling monitor fails and stops

Bug #1785945 reported by Masahito Muroi
This bug affects 1 person
Affects   Status         Importance   Assigned to      Milestone
Blazar    Fix Released   High         Masahito Muroi

Bug Description

The compute host polling monitor fails to poll the status of compute hosts and logs the error messages below. After these errors, the polling monitor stops polling compute host information.

There are two issues in this bug:

1. ComputeHostMonitor fails to authenticate to the Nova API.
2. PollingMonitor terminates its polling loop when any error occurs in a monitor instance.

ERROR blazar.plugins.oshosts.host_plugin [-] Skipping health check. No valid authentication is available: AuthorizationFailure: No valid authentication is available
ERROR blazar.plugins.oshosts.host_plugin Traceback (most recent call last):
ERROR blazar.plugins.oshosts.host_plugin File "/host_shared/blazar/blazar/plugins/oshosts/host_plugin.py", line 776, in _poll_resource_failures
ERROR blazar.plugins.oshosts.host_plugin hvs = self.nova.hypervisors.list()
ERROR blazar.plugins.oshosts.host_plugin File "/usr/local/lib/python2.7/dist-packages/novaclient/api_versions.py", line 393, in substitution
ERROR blazar.plugins.oshosts.host_plugin return methods[-1].func(obj, *args, **kwargs)
ERROR blazar.plugins.oshosts.host_plugin File "/usr/local/lib/python2.7/dist-packages/novaclient/v2/hypervisors.py", line 57, in list
ERROR blazar.plugins.oshosts.host_plugin return self._list_base(detailed=detailed)
ERROR blazar.plugins.oshosts.host_plugin File "/usr/local/lib/python2.7/dist-packages/novaclient/v2/hypervisors.py", line 48, in _list_base
ERROR blazar.plugins.oshosts.host_plugin return self._list(path, 'hypervisors')
ERROR blazar.plugins.oshosts.host_plugin File "/usr/local/lib/python2.7/dist-packages/novaclient/base.py", line 257, in _list
ERROR blazar.plugins.oshosts.host_plugin resp, body = self.api.client.get(url)
ERROR blazar.plugins.oshosts.host_plugin File "/usr/local/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 328, in get
ERROR blazar.plugins.oshosts.host_plugin return self.request(url, 'GET', **kwargs)
ERROR blazar.plugins.oshosts.host_plugin File "/usr/local/lib/python2.7/dist-packages/novaclient/client.py", line 77, in request
ERROR blazar.plugins.oshosts.host_plugin **kwargs)
ERROR blazar.plugins.oshosts.host_plugin File "/usr/local/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 487, in request
ERROR blazar.plugins.oshosts.host_plugin resp = super(LegacyJsonAdapter, self).request(*args, **kwargs)
ERROR blazar.plugins.oshosts.host_plugin File "/usr/local/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 213, in request
ERROR blazar.plugins.oshosts.host_plugin return self.session.request(url, method, **kwargs)
ERROR blazar.plugins.oshosts.host_plugin File "/usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py", line 688, in request
ERROR blazar.plugins.oshosts.host_plugin raise exceptions.AuthorizationFailure(msg)
ERROR blazar.plugins.oshosts.host_plugin AuthorizationFailure: No valid authentication is available
ERROR blazar.plugins.oshosts.host_plugin
ERROR blazar.monitor.base [-] Caught an exception while executing a callback. local variable 'failed_hosts' referenced before assignment: UnboundLocalError: local variable 'failed_hosts' referenced before assignment
ERROR blazar.monitor.base Traceback (most recent call last):
ERROR blazar.monitor.base File "/host_shared/blazar/blazar/monitor/base.py", line 60, in call_monitor_plugin
ERROR blazar.monitor.base reservation_flags = callback(*args, **kwargs)
ERROR blazar.monitor.base File "/host_shared/blazar/blazar/plugins/oshosts/host_plugin.py", line 754, in poll
ERROR blazar.monitor.base failed_hosts, recovered_hosts = self._poll_resource_failures()
ERROR blazar.monitor.base File "/host_shared/blazar/blazar/plugins/oshosts/host_plugin.py", line 790, in _poll_resource_failures
ERROR blazar.monitor.base return failed_hosts, recovered_hosts
ERROR blazar.monitor.base UnboundLocalError: local variable 'failed_hosts' referenced before assignment
ERROR blazar.monitor.base
ERROR oslo.service.loopingcall [-] Fixed interval looping call 'blazar.monitor.polling_monitor.PollingMonitor.call_monitor_plugin' failed: UnboundLocalError: local variable 'reservation_flags' referenced before assignment
ERROR oslo.service.loopingcall Traceback (most recent call last):
ERROR oslo.service.loopingcall File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 193, in _run_loop
ERROR oslo.service.loopingcall result = func(*self.args, **self.kw)
ERROR oslo.service.loopingcall File "/host_shared/blazar/blazar/monitor/base.py", line 65, in call_monitor_plugin
ERROR oslo.service.loopingcall if reservation_flags:
ERROR oslo.service.loopingcall UnboundLocalError: local variable 'reservation_flags' referenced before assignment
ERROR oslo.service.loopingcall
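
The chain of errors in the log above can be reproduced with a minimal sketch. It is not the Blazar source: `list_hypervisors`, `poll`, and the stand-in `AuthorizationFailure` class are hypothetical names used only for illustration. The sketch assumes each function catches the exception, logs it, and then falls through to code that uses a variable bound only on the success path, which is what the three tracebacks suggest.

```python
import logging

LOG = logging.getLogger(__name__)


class AuthorizationFailure(Exception):
    """Stand-in for the keystoneauth1 exception seen in the log."""


def list_hypervisors():
    # Hypothetical stand-in for self.nova.hypervisors.list().
    raise AuthorizationFailure("No valid authentication is available")


def poll_resource_failures():
    try:
        hvs = list_hypervisors()
        failed_hosts = [hv for hv in hvs if hv["state"] == "down"]
        recovered_hosts = [hv for hv in hvs if hv["state"] == "up"]
    except Exception as e:
        LOG.exception("Skipping health check. %s", e)
    # If the call above raised, neither name was ever bound, so this
    # line raises UnboundLocalError.
    return failed_hosts, recovered_hosts


def poll():
    failed_hosts, recovered_hosts = poll_resource_failures()
    # Hypothetical return value; the real plugin returns reservation flags.
    return {"failed": failed_hosts, "recovered": recovered_hosts}


def call_monitor_plugin(callback, *args, **kwargs):
    try:
        reservation_flags = callback(*args, **kwargs)
    except Exception as e:
        LOG.exception("Caught an exception while executing a callback. %s", e)
    # Same pattern one level up: reservation_flags is unbound after a
    # failure, so a second UnboundLocalError escapes to the caller.
    if reservation_flags:
        return reservation_flags
    return None
```

Calling `call_monitor_plugin(poll)` in this sketch raises `UnboundLocalError` out of `call_monitor_plugin`, mirroring the final traceback, where the error reaches the looping call and the loop stops.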

OpenStack Infra (hudson-openstack) wrote : Fix proposed to blazar (master)

Fix proposed to branch: master
Review: https://review.openstack.org/589705

Changed in blazar:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/589706

Changed in blazar:
milestone: none → rocky-rc1
OpenStack Infra (hudson-openstack) wrote : Fix merged to blazar (master)

Reviewed: https://review.openstack.org/589705
Committed: https://git.openstack.org/cgit/openstack/blazar/commit/?id=b54d05b9587baf6a57979d8a6ba8def8a1b6017e
Submitter: Zuul
Branch: master

commit b54d05b9587baf6a57979d8a6ba8def8a1b6017e
Author: Masahito Muroi <email address hidden>
Date: Wed Aug 8 11:59:21 2018 +0900

    Set username when initializing PhysicalHostMonitorPlugin

    PhysicalHostMonitorPlugin fails to set the username parameter during its
    initialization, which results in authorization errors when the plugin
    calls the Nova API.

    Change-Id: I64b9f7c3a97e70acca996c52ff739c6c4397dd8f
    Partial-Bug: #1785945

Changed in blazar:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/589706
Committed: https://git.openstack.org/cgit/openstack/blazar/commit/?id=ed82418a01c7e3a2da7059fc0ee17b5e90878eb1
Submitter: Zuul
Branch: master

commit ed82418a01c7e3a2da7059fc0ee17b5e90878eb1
Author: Masahito Muroi <email address hidden>
Date: Wed Aug 8 13:26:04 2018 +0900

    Catch any exception raised in call_monitor_plugin()

    The call_monitor_plugin method is executed by polling monitor and
    periodic healing. The two use oslo_service.threadgroup for their
    periodic execution, but the threadgroup doesn't catch any exception.
    If call_monitor_plugin() raises any exception, the thread
    automatically terminates.

    To keep the threadgroup running, this patch makes call_monitor_plugin()
    catch any exception internally, preventing them from being raised to the
    threadgroup.

    Change-Id: Id26401c03e3e89d308a089097315996973cc1dfb
    Closes-Bug: #1785945
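
The approach described in the commit message above can be sketched as follows. This is a simplified illustration rather than the merged patch: `update_flags`, `applied_flags`, and `make_flaky_callback` are hypothetical helpers, and the flag dictionary is a placeholder. The point it shows is that all use of the callback's result stays inside the `try` block, so no exception, including one from an unbound variable, can escape to the periodic caller.

```python
import logging

LOG = logging.getLogger(__name__)

# Hypothetical sink that records which flags were applied.
applied_flags = []


def update_flags(reservation_flags):
    # Hypothetical stand-in for whatever the monitor does with the flags.
    applied_flags.append(reservation_flags)


def call_monitor_plugin(callback, *args, **kwargs):
    """Run a monitor callback without letting any exception escape."""
    try:
        reservation_flags = callback(*args, **kwargs)
        # The result is only used on the success path, inside the try.
        if reservation_flags:
            update_flags(reservation_flags)
    except Exception as e:
        LOG.exception("Caught an exception while executing a callback. %s", e)


def make_flaky_callback():
    """Build a callback that fails on its first call and then succeeds."""
    calls = {"count": 0}

    def poll():
        calls["count"] += 1
        if calls["count"] == 1:
            raise RuntimeError("No valid authentication is available")
        return {"reservation-1": {"flag": True}}  # placeholder flags

    return poll, calls
```

Invoking `call_monitor_plugin(poll)` repeatedly with the flaky callback logs the first failure and keeps going on later iterations, which is the behavior a periodic caller needs to stay alive.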
