Keystone: Apache restarts frequently (SIGTERM) possibly due to frequent change in the backend

Bug #1698343 reported by Nusrath Mohammed
20
This bug affects 4 people
Affects Status Importance Assigned to Milestone
OpenStack Keystone LDAP integration
Fix Released
Medium
Billy Olsen
keystone (Juju Charms Collection)
Invalid
Undecided
Unassigned

Bug Description

Hi,
We are seeing frequent SIGTERM errors

[Fri Jun 16 09:15:01.508555 2017] [mpm_event:notice] [pid 1426216:tid 140643397523328] AH00491: caught SIGTERM, shutting down
[Fri Jun 16 09:15:02.575654 2017] [mpm_event:notice] [pid 1429831:tid 140469605418880] AH00489: Apache/2.4.10 (Ubuntu) mod_wsgi/3.4 Python/2.7.6 configured -- resuming normal operations
[Fri Jun 16 09:15:02.575726 2017] [core:notice] [pid 1429831:tid 140469605418880] AH00094: Command line: '/usr/sbin/apache2'
[Fri Jun 16 09:17:40.960667 2017] [mpm_event:notice] [pid 1429831:tid 140469605418880] AH00491: caught SIGTERM, shutting down
[Fri Jun 16 09:17:42.028744 2017] [mpm_event:notice] [pid 1431988:tid 139635145045888] AH00489: Apache/2.4.10 (Ubuntu) mod_wsgi/3.4 Python/2.7.6 configured -- resuming normal operations
[Fri Jun 16 09:17:42.028826 2017] [core:notice] [pid 1431988:tid 139635145045888] AH00094: Command line: '/usr/sbin/apache2'
[Fri Jun 16 09:20:04.190324 2017] [mpm_event:notice] [pid 1431988:tid 139635145045888] AH00491: caught SIGTERM, shutting down
[Fri Jun 16 09:20:17.377800 2017] [mpm_event:notice] [pid 1435080:tid 139844850268032] AH00489: Apache/2.4.10 (Ubuntu) mod_wsgi/3.4 Python/2.7.6 configured -- resuming normal operations
[Fri Jun 16 09:20:17.377871 2017] [core:notice] [pid 1435080:tid 139844850268032] AH00094: Command line: '/usr/sbin/apache2'

It appears that the domain-backend is frequently flapping and causing Apache to restart. Charms installed on trusty (4.2.0-57-generic):
trusty/keystone git+https://github.com/openstack/charm-keystone;revno=stable/17.02
trusty/keystone-ldap git+https://github.com/openstack/charm-keystone-ldap;revno=stable/17.02

Keystone.config (partial)

[identity]
driver = sql
default_domain_id = <masked>
domain_specific_drivers_enabled = True
domain_config_dir = /etc/keystone/domains

[credential]
driver = sql

[trust]
driver = sql

[os_inherit]

[catalog]
driver = sql

[endpoint_filter]

[token]
driver = sql
provider = keystone.token.providers.uuid.Provider
expiration = 3600

[ldap]
[resource]
admin_project_domain_name = admin_domain
admin_project_name = admin

Needs to check why the backends is creating events that are causing needless relation-changed charm events.

summary: - Apache on Keystone frequently restart possibly due to frequent change in
- the backend
+ Keystone: Apache restarts frequently (SIGTERM) possibly due to frequent
+ change in the backend
Revision history for this message
Nick Moffitt (nick-moffitt) wrote :

We have thousands of these log messages on each keystone server:

2017-06-09 03:50:40 INFO domain-backend-relation-changed * Restarting web server apache2

While it's possible that the root cause of this bug lies further back, the issue here is that apache2 is restarted across all keystones every time this relation fires. This is causing needless alerts and potential service availability problems.

Revision history for this message
Drew Freiberger (afreiberger) wrote :

It appears from the keystone-ldap reactive charm, if the relation is connected and the config is complete, every time the reactive sweeper comes along (every update-status run?) it's going to end up calling keystone_ldap.render_config(domain.trigger_restart)...Can we get something that tracks this restarted state so that this only happens once per boot, or once per relation-changed event in charm-keystone-ldap?

Revision history for this message
Drew Freiberger (afreiberger) wrote :
Download full text (5.8 KiB)

2017-06-27 22:51:00 INFO juju-log Unit is ready
2017-06-27 22:51:02 INFO juju-log Application Version: 9.3.0
2017-06-27 22:51:05 INFO juju-log domain-backend:180: Registered config file: /etc/keystone/keystone.conf
2017-06-27 22:51:05 INFO juju-log domain-backend:180: Registered config file: /etc/keystone/logging.conf
2017-06-27 22:51:05 INFO juju-log domain-backend:180: Registered config file: /etc/haproxy/haproxy.cfg
2017-06-27 22:51:05 INFO juju-log domain-backend:180: Registered config file: /etc/apache2/sites-available/openstack_https_frontend.conf
2017-06-27 22:51:05 INFO juju-log domain-backend:180: Registered config file: /etc/keystone/policy.json
2017-06-27 22:51:05 INFO juju-log domain-backend:180: Registered config file: /etc/cron.d/keystone-token-flush
2017-06-27 22:51:05 INFO juju-log domain-backend:180: Registered config file: /etc/apache2/sites-enabled/wsgi-openstack-api.conf
2017-06-27 22:51:11 INFO domain-backend-relation-changed * Restarting web server apache2
2017-06-27 22:51:13 INFO domain-backend-relation-changed AH00557: apache2: apr_sockaddr_info_get() failed for juju-machine-1-lxc-2
2017-06-27 22:51:13 INFO domain-backend-relation-changed AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
2017-06-27 22:51:14 INFO domain-backend-relation-changed ...done.
2017-06-27 22:51:14 INFO juju-log domain-backend:180: Configuring Keystone to use a pre-configured admin token.
2017-06-27 22:51:14 INFO juju-log domain-backend:180: Configuring Keystone to use a pre-configured admin token.
2017-06-27 22:51:15 INFO juju-log domain-backend:180: Configuring Keystone to use a pre-configured admin token.
2017-06-27 22:51:15 INFO domain-backend-relation-changed haproxy is running.
2017-06-27 22:51:15 INFO domain-backend-relation-changed * apache2 is running
2017-06-27 22:51:15 INFO juju-log domain-backend:180: Unit is ready
2017-06-27 22:51:18 INFO juju-log domain-backend:180: Application Version: 9.3.0
2017-06-27 22:51:20 INFO juju-log domain-backend:181: Registered config file: /etc/keystone/keystone.conf
2017-06-27 22:51:20 INFO juju-log domain-backend:181: Registered config file: /etc/keystone/logging.conf
2017-06-27 22:51:20 INFO juju-log domain-backend:181: Registered config file: /etc/haproxy/haproxy.cfg
2017-06-27 22:51:20 INFO juju-log domain-backend:181: Registered config file: /etc/apache2/sites-available/openstack_https_frontend.conf
2017-06-27 22:51:20 INFO juju-log domain-backend:181: Registered config file: /etc/keystone/policy.json
2017-06-27 22:51:20 INFO juju-log domain-backend:181: Registered config file: /etc/cron.d/keystone-token-flush
2017-06-27 22:51:20 INFO juju-log domain-backend:181: Registered config file: /etc/apache2/sites-enabled/wsgi-openstack-api.conf
2017-06-27 22:51:27 INFO domain-backend-relation-changed * Restarting web server apache2
2017-06-27 22:51:29 INFO domain-backend-relation-changed AH00557: apache2: apr_sockaddr_info_get() failed for juju-machine-1-lxc-2
2017-06-27 22:51:29 INFO domain-backend-relation-changed AH00558: apache2: Could not reliably determine the server's full...

Read more...

Revision history for this message
Drew Freiberger (afreiberger) wrote :

seeing above every 5-12 minutes in keystone unit logs. nothing reflected in keystone-ldap unit logs other than a single relation-changed when the units restarted this morning.

Changed in keystone (Juju Charms Collection):
status: New → Invalid
Changed in charm-keystone-ldap:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Billy Olsen (billy-olsen)
milestone: none → 17.08
Revision history for this message
Billy Olsen (billy-olsen) wrote :

Worked this through a bit with brad-marshall last night and the ordering of the ldap-config options as written to the ldap section is changing when its written out. This is due to the unordered nature of traversing dicts.

Revision history for this message
Billy Olsen (billy-olsen) wrote :

Proposed a change to charmhelpers library which does the config-flags parsing [0]. That should solve the problem seen by the keystone-ldap charm, but I need to confirm which other charms might be subject to this particular problem.

[0] https://code.launchpad.net/~billy-olsen/charm-helpers/lp1698343/+merge/326495

Changed in charm-keystone-ldap:
status: Triaged → In Progress
Revision history for this message
Drew Freiberger (afreiberger) wrote :
Revision history for this message
Xav Paice (xavpaice) wrote :

Billy's change is merged, but needs a fresh release of charmhelpers since it's not included in the current one on Pypi. Current version is 0.16.0.

Once that is released, we should be clear to rebuild the charm.

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Related bug in the charms.openstack project (the base of OpenStack reactive charms):
https://bugs.launchpad.net/charms.openstack/+bug/1702316

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-keystone-ldap (master)

Fix proposed to branch: master
Review: https://review.openstack.org/480709

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-keystone-ldap (master)

Reviewed: https://review.openstack.org/480709
Committed: https://git.openstack.org/cgit/openstack/charm-keystone-ldap/commit/?id=92f0fb511d78fd0d127cd504b28dcab3d00b2c72
Submitter: Jenkins
Branch: master

commit 92f0fb511d78fd0d127cd504b28dcab3d00b2c72
Author: Billy Olsen <email address hidden>
Date: Wed Jul 5 12:16:42 2017 -0700

    Rebuild charm to sync charmhelpers

    Add a rebuild nonce to the charm-keystone-ldap charm the same
    as other reactive charms. This allows the charm to trigger a
    rebuild when upstream dependencies need to be synced into a
    reactive charm.

    This sync will pull in a change for parsing config flags that
    will ensure the ordering is consistent each time the config
    flag parsing is executed (even across hook invocations).

    Change-Id: I4a4b58a2176314b69c577810a9e15d0814e24ae5
    Closes-Bug: #1698343

Changed in charm-keystone-ldap:
status: In Progress → Fix Committed
tags: added: backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-keystone-ldap (stable/17.02)

Fix proposed to branch: stable/17.02
Review: https://review.openstack.org/483108

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on charm-keystone-ldap (stable/17.02)

Change abandoned by Billy Olsen (<email address hidden>) on branch: stable/17.02
Review: https://review.openstack.org/483108
Reason: Abandoning this backport. After discussion, these 3 commits fixing the keystone-ldap charm's restarting of keystone will be squashed into one backport commit.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-keystone-ldap (stable/17.02)

Reviewed: https://review.openstack.org/492367
Committed: https://git.openstack.org/cgit/openstack/charm-keystone-ldap/commit/?id=186069de6a40fdb4f50a85378e71f2a0eae8965b
Submitter: Jenkins
Branch: stable/17.02

commit 186069de6a40fdb4f50a85378e71f2a0eae8965b
Author: Billy Olsen <email address hidden>
Date: Fri Aug 11 16:10:55 2017 -0700

    Add guard state around configure_domain_name()

    This change adds a guard state around the configure_domain_name()
    function so that it only gets called ONCE when the relation connects.

    This is to prevent restarting keystone everytime the update-status
    hook runs on this charm.

    Note: this change also includes the rebuild change from commit 9427ccf
    and the subsequent patch to this patchset which fixes a hook error on
    removing a backend relation. All cherry-picks are referenced below.

    Change-Id: I85d14ddb97e78be966f0cc8dbbcea312599d7327
    Depends-On: I44d3d5ec0c930a9cde80200bb1e5eb4dfae58d02
    Closes-Bug: #1698343
    (cherry picked from commit 2ba38505215a0583e13c2f223290f8f88ab31ebc)
    (cherry picked from commit 317e55074d1a0f28b799102aba53573e2398e9b1)
    (cherry picked from commit 92f0fb511d78fd0d127cd504b28dcab3d00b2c72)

James Page (james-page)
Changed in charm-keystone-ldap:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.