aodh-api is restarted every 5 minutes

Bug #1689710 reported by Xav Paice on 2017-05-10
This bug affects 4 people
Affects: OpenStack AODH Charm
Status: Incomplete
Importance: Low
Assigned to: Unassigned

Bug Description

Xenial, Mitaka, juju 2.1.2, charm version stable/17.02

Every 5 minutes, the aodh-api logs show what looks very much like a restart of the process. Checking with ps confirms that the process is being restarted.

At the same minutes past each hour, the juju unit log on the host running aodh shows the update-status hook running (http://pastebin.ubuntu.com/24547150/).

Looking at the charm code, it appears that the write to configs triggers aodh.reload_and_restart(), which includes ch_host.service_restart('aodh-api').

If we could avoid re-writing the config every time the update-status hook runs, that would help immensely.
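
As an illustration of the request above, here is a minimal sketch of a write-if-changed guard, so that a restart is only triggered when the rendered config actually differs from what is on disk. This is not the charm's code: `write_if_changed`, `render_and_maybe_restart` and the `restart` callback are hypothetical names used for illustration only.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """Return the SHA-256 hex digest of a file, or None if it does not exist."""
    try:
        with open(path, "rb") as handle:
            return hashlib.sha256(handle.read()).hexdigest()
    except FileNotFoundError:
        return None


def write_if_changed(path, content):
    """Write content to path only when it differs; return True if written."""
    data = content.encode("utf-8")
    if file_digest(path) == hashlib.sha256(data).hexdigest():
        return False  # identical content: leave the file untouched
    with open(path, "wb") as handle:
        handle.write(data)
    return True


def render_and_maybe_restart(path, content, restart):
    """Call restart() only if the rendered config changed (hypothetical helper)."""
    if write_if_changed(path, content):
        restart()
        return True
    return False


if __name__ == "__main__":
    restarts = []
    with tempfile.TemporaryDirectory() as tmp:
        conf = os.path.join(tmp, "aodh.conf")
        # Simulate three hook runs that all render the same content.
        for _ in range(3):
            render_and_maybe_restart(
                conf, "[DEFAULT]\ndebug = False\n",
                lambda: restarts.append("aodh-api"))
    print(len(restarts))  # only the first run writes the file and restarts
```

With a guard like this, repeated update-status runs that render identical content would not touch the file or the service.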

Xav Paice (xavpaice) on 2017-05-10
description: updated
Alex Kavanagh (ajkavanagh) wrote :

@afreiberger As you've already determined, the keystone issue seems to be caused by the keystone-ldap charm which is a reactive charm. aodh is ALSO a reactive charm. This is also happening with designate, which is ALSO a reactive charm.

The common theme is that they are all reactive charms and use the charms.openstack library. What's probably happening in each of the charms is that a reactive handler is triggering the same behaviour over and over again, and that behaviour has the unintended side-effect of triggering a restart. An example of this was in the designate charm, where the common haproxy code in charms.openstack was causing the haproxy.cfg file to be re-written on every update-status, with the same information, but unfortunately in a non-deterministic order (this is almost fixed).
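
To illustrate the ordering problem described above: if a file is rendered from unordered collections, two runs with the same data can produce different bytes, which looks like a config change. The sketch below is an assumption about the general technique, not the charms.openstack fix; `render_listen_stanzas` is a hypothetical function and the output is simplified, not real haproxy syntax.

```python
def render_listen_stanzas(services):
    """Render simplified listen stanzas in a stable, sorted order.

    `services` maps a stanza name to a set of "host:port" backends.
    Sorting both levels makes the output independent of dict insertion
    order and of set iteration order.
    """
    lines = []
    for name in sorted(services):
        lines.append("listen {}".format(name))
        for backend in sorted(services[name]):
            lines.append("    server {}".format(backend))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    first = {
        "aodh-api_public": {"10.0.0.2:8042", "10.0.0.1:8042"},
        "aodh-api_int": {"10.0.0.1:8042"},
    }
    second = {
        "aodh-api_int": {"10.0.0.1:8042"},
        "aodh-api_public": {"10.0.0.1:8042", "10.0.0.2:8042"},
    }
    # Same information supplied in a different order renders identically.
    print(render_listen_stanzas(first) == render_listen_stanzas(second))
```

Byte-identical output across runs is what allows a change-detection step to conclude that nothing changed and skip the restart.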

The options are either working through all of the charms and stopping the unintended behaviour (which is tricky, as charms.reactive _wants_ all handlers that have true conditions to run on every hook invocation, and it's hard to 'gate' behaviour to only run once - the states get a bit out of control), or fixing layer-openstack so that it includes a custom reactive 'update-status' handler that DOESN'T run the charms.reactive handler system but does allow the introspection of interface (relation) states.
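
The second option can be pictured with a toy dispatcher: when the hook is update-status, only a status handler runs and the general handlers (which may render configs and restart services) are skipped. This is a simplified sketch and does not use or reproduce the charms.reactive API; `dispatch` and the handler callables are hypothetical.

```python
def dispatch(hook_name, handlers, status_handler):
    """Run the callables appropriate for this hook and return their results.

    For 'update-status' only status_handler runs; for any other hook
    every handler in `handlers` runs, in order.
    """
    if hook_name == "update-status":
        return [status_handler()]
    return [handler() for handler in handlers]


if __name__ == "__main__":
    def render_config():
        return "rendered config"

    def restart_service():
        return "restarted service"

    def assess_status():
        return "assessed status"

    general = [render_config, restart_service]
    print(dispatch("update-status", general, assess_status))
    print(dispatch("config-changed", general, assess_status))
```

The trade-off is that anything a handler would legitimately need to do during update-status has to be moved into the status handler explicitly.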

I'll raise a bug in charms.openstack to track this, and reference it back here.

Xav Paice (xavpaice) wrote :

Note that https://code.launchpad.net/~billy-olsen/charm-helpers/lp1698343/+merge/326495 includes a charmhelpers fix which may be the solution to all this, but it needs a fresh release of charmhelpers since it's not included in the current one on PyPI. Current version is 0.16.0.

Alex Kavanagh (ajkavanagh) wrote :

Related bug in charms.openstack (the base of the OpenStack reactive charms): https://bugs.launchpad.net/charms.openstack/+bug/1702316

Changed in charm-aodh:
status: New → Triaged
Alex Kavanagh (ajkavanagh) wrote :

This bug is now fixed in master:

root@juju-7dbb96-0:/var/log/aodh# date
Fri Aug 25 15:02:30 UTC 2017
root@juju-7dbb96-0:/var/log/aodh# ps ax | grep aodh
 1782 ? Ss 0:00 bash /var/lib/juju/init/jujud-unit-aodh-0/exec-start.sh
 1786 ? Sl 0:01 /var/lib/juju/tools/unit-aodh-0/jujud unit --data-dir /var/lib/juju --unit-name aodh/0 --debug
21191 ? Ss 0:00 /usr/bin/python /usr/bin/aodh-api --config-file=/etc/aodh/aodh.conf --log-file=/var/log/aodh/aodh-api.log
21200 ? Ss 0:00 /usr/bin/python /usr/bin/aodh-evaluator --config-file=/etc/aodh/aodh.conf --log-file=/var/log/aodh/aodh-evaluator.log
21209 ? Ssl 0:00 /usr/bin/python /usr/bin/aodh-notifier --config-file=/etc/aodh/aodh.conf --log-file=/var/log/aodh/aodh-notifier.log
21218 ? Ssl 0:00 /usr/bin/python /usr/bin/aodh-listener --config-file=/etc/aodh/aodh.conf --log-file=/var/log/aodh/aodh-listener.log
22666 pts/0 S+ 0:00 tail -f unit-aodh-0.log
23801 pts/1 S+ 0:00 grep --color=auto aodh
root@juju-7dbb96-0:/var/log/aodh# date
Fri Aug 25 15:09:47 UTC 2017
root@juju-7dbb96-0:/var/log/aodh# ps ax | grep aodh
 1782 ? Ss 0:00 bash /var/lib/juju/init/jujud-unit-aodh-0/exec-start.sh
 1786 ? Sl 0:01 /var/lib/juju/tools/unit-aodh-0/jujud unit --data-dir /var/lib/juju --unit-name aodh/0 --debug
21191 ? Ss 0:00 /usr/bin/python /usr/bin/aodh-api --config-file=/etc/aodh/aodh.conf --log-file=/var/log/aodh/aodh-api.log
21200 ? Ss 0:00 /usr/bin/python /usr/bin/aodh-evaluator --config-file=/etc/aodh/aodh.conf --log-file=/var/log/aodh/aodh-evaluator.log
21209 ? Ssl 0:00 /usr/bin/python /usr/bin/aodh-notifier --config-file=/etc/aodh/aodh.conf --log-file=/var/log/aodh/aodh-notifier.log
21218 ? Ssl 0:00 /usr/bin/python /usr/bin/aodh-listener --config-file=/etc/aodh/aodh.conf --log-file=/var/log/aodh/aodh-listener.log
22666 pts/0 S+ 0:00 tail -f unit-aodh-0.log
25851 pts/1 S+ 0:00 grep --color=auto aodh

Note that the PIDs are unchanged between the two listings, taken about 7 minutes apart, even though an update-status runs every 5 minutes:

2017-08-25 15:04:16 INFO juju-log Reactive main running for hook update-status
2017-08-25 15:04:16 INFO juju-log Invoking reactive handler: reactive/layer_openstack.py:54:default_update_status
2017-08-25 15:04:16 INFO juju-log Invoking reactive handler: reactive/aodh_handlers.py:70:render_unclustered
2017-08-25 15:04:17 WARNING juju-log DEPRECATION: should not use port_map parameter in APIConfigurationAdapter.__init__()
2017-08-25 15:04:17 WARNING juju-log DEPRECATION: should not use service_name parameter in APIConfigurationAdapter.__init__()
2017-08-25 15:04:17 INFO juju-log Creating choice loader with dirs: [['templates/'], ['/var/lib/juju/agents/unit-aodh-0/.venv/lib/python3.5/site-packages/charmhelpers/contrib/openstack/templates']]
2017-08-25 15:04:17 WARNING juju-log Not adding haproxy listen stanza for aodh-api_int port is already in use
2017-08-25 15:04:17 WARNING juju-log Not adding haproxy listen stanza for aodh-api_public port is already in use
2017-08-25 15:04:17 INFO juju-log Writing file /etc/haproxy/haproxy.cfg root:root 444
2017-08-25 15:04:17 INF...
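
The manual PID comparison above can also be scripted. The following is a rough sketch that assumes `ps ax` style output with the columns PID, TTY, STAT, TIME and COMMAND; `pids_by_command` and `restarted` are hypothetical helper names.

```python
def pids_by_command(ps_output, needle):
    """Map each command line containing `needle` to its PID."""
    result = {}
    for line in ps_output.splitlines():
        fields = line.split(None, 4)  # PID, TTY, STAT, TIME, COMMAND
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip the header and malformed lines
        if needle in fields[4]:
            result[fields[4]] = int(fields[0])
    return result


def restarted(before, after, needle):
    """Return commands present in both snapshots whose PID changed."""
    old = pids_by_command(before, needle)
    new = pids_by_command(after, needle)
    return sorted(cmd for cmd in old if cmd in new and new[cmd] != old[cmd])


if __name__ == "__main__":
    line = ("21191 ? Ss 0:00 /usr/bin/python /usr/bin/aodh-api "
            "--config-file=/etc/aodh/aodh.conf "
            "--log-file=/var/log/aodh/aodh-api.log")
    # The same line in both snapshots means no restart was detected.
    print(restarted(line, line, "/usr/bin/aodh-"))
```

Matching on `/usr/bin/aodh-` rather than plain `aodh` keeps the `grep` and `tail` processes out of the comparison.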


Alex Kavanagh (ajkavanagh) wrote :

Changing to incomplete and lowering the priority as I believe this is now resolved due to changes in charms.openstack around ordering of things, etc.

Changed in charm-aodh:
importance: Undecided → Low
status: Triaged → Incomplete