glance re-spawns a child when terminating

Bug #1714240 reported by Bernhard M. Wiedemann on 2017-08-31
This bug affects 1 person
Affects: Glance
Status: Fix Released
Importance: Medium
Assigned to: Bernhard M. Wiedemann
Milestone: queens-1

Bug Description

When a SIGTERM is sent to the main glance-api process,
api.log shows:
2017-08-31 13:10:30.996 10618 INFO glance.common.wsgi [-] Removed dead child 10628
2017-08-31 13:10:31.004 10618 INFO glance.common.wsgi [-] Started child 10642
2017-08-31 13:10:31.006 10642 INFO eventlet.wsgi.server [-] (10642) wsgi starting up on https://10.162.184.83:5510
2017-08-31 13:10:31.008 10642 INFO eventlet.wsgi.server [-] (10642) wsgi exited, is_accepting=True
2017-08-31 13:10:31.009 10642 INFO glance.common.wsgi [-] Child 10642 exiting normally

This happens because kill_children sends a SIGTERM to all children,
and wait_on_children then restarts one as soon as it notices a dead child.
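
The sequence can be reproduced with a toy model that creates no real processes. This is only a sketch, not the Glance code or the actual patch: the names kill_children and wait_on_children come from the report, while ToySupervisor, run_child, the running flag, _wait and check_running_before_respawn are hypothetical. It assumes the SIGTERM arrives while the parent is blocked waiting for a child to exit.

```python
from collections import deque


class ToySupervisor:
    """Toy model of a parent that supervises worker children (no real fork)."""

    def __init__(self, workers, check_running_before_respawn):
        # Hypothetical switch: False models the reported behaviour,
        # True models one possible guard against it.
        self.check_running_before_respawn = check_running_before_respawn
        self.running = True
        self.children = set()
        self.exited = deque()  # pids that died but were not yet reaped
        self.log = []
        self._next_pid = 100
        for _ in range(workers):
            self.run_child()

    def run_child(self):
        pid = self._next_pid
        self._next_pid += 1
        self.children.add(pid)
        self.log.append(f"Started child {pid}")

    def kill_children(self):
        # Stand-in for the SIGTERM handler: stop the loop, terminate all children.
        self.running = False
        self.exited.extend(sorted(self.children))

    def _wait(self):
        # Stand-in for a blocking wait; the SIGTERM is assumed to arrive here.
        if not self.exited:
            self.kill_children()
        return self.exited.popleft()

    def wait_on_children(self):
        while self.running:
            pid = self._wait()
            self.children.discard(pid)
            self.log.append(f"Removed dead child {pid}")
            if self.check_running_before_respawn and not self.running:
                break  # shutting down: do not replace the dead child
            self.run_child()


buggy = ToySupervisor(workers=2, check_running_before_respawn=False)
buggy.wait_on_children()
print(buggy.log[-2:])  # ['Removed dead child 100', 'Started child 102']

guarded = ToySupervisor(workers=2, check_running_before_respawn=True)
guarded.wait_on_children()
print(guarded.log[-1])  # Removed dead child 100
```

In the unguarded run the loop condition is only re-checked after the respawn, so exactly one replacement child is started during shutdown, which matches the "Removed dead child" / "Started child" pair in the log above.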

We noticed this because it triggered fencing in our cloud's Pacemaker setup: systemd seems to have a race condition in the cgroup code that should detect that all related services have terminated.

# systemctl status openstack-glance-api
● openstack-glance-api.service - OpenStack Image Service API server
   Loaded: loaded (/usr/lib/systemd/system/openstack-glance-api.service; disabled; vendor preset: disabled)
   Active: deactivating (final-sigterm) since Thu 2017-08-31 10:13:48 UTC; 1min 14s ago
 Main PID: 25077 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 512)
   CGroup: /system.slice/openstack-glance-api.service
Aug 31 10:13:48 d08-9e-01-b4-9e-42 systemd[1]: Stopping OpenStack Image Service API server...
Aug 31 10:15:21 d08-9e-01-b4-9e-42 systemd[1]: openstack-glance-api.service: State 'stop-final-sigterm' timed out. Killing.
Aug 31 10:15:21 d08-9e-01-b4-9e-42 systemd[1]: Stopped OpenStack Image Service API server.
Aug 31 10:15:21 d08-9e-01-b4-9e-42 systemd[1]: openstack-glance-api.service: Unit entered failed state.
Aug 31 10:15:21 d08-9e-01-b4-9e-42 systemd[1]: openstack-glance-api.service: Failed with result 'timeout'.

description: updated
Erno Kuvaja (jokke) on 2017-09-20
Changed in glance:
assignee: nobody → Bernhard M. Wiedemann (ubuntubmw)
status: New → In Progress
importance: Undecided → Medium
Changed in glance:
milestone: none → queens-1

Reviewed: https://review.openstack.org/499592
Committed: https://git.openstack.org/cgit/openstack/glance/commit/?id=877cd166b56ec4b7f5530ea9bf1587077692275b
Submitter: Jenkins
Branch: master

commit 877cd166b56ec4b7f5530ea9bf1587077692275b
Author: Bernhard M. Wiedemann <email address hidden>
Date: Thu Aug 31 15:11:41 2017 +0200

    Avoid restarting a child when terminating

    When sending a SIGTERM to the main glance-api process,
    it was sending a SIGTERM to its children
    but then also re-spawning its first dead child.

    Closes-bug: #1714240

    Change-Id: Ibef426c198d287bbdac4e764fd654edba4b7c2d6

Changed in glance:
status: In Progress → Fix Released

This issue was fixed in the openstack/glance 16.0.0.0b1 development milestone.

Reviewed: https://review.openstack.org/513843
Committed: https://git.openstack.org/cgit/openstack/glance/commit/?id=892af4718b97019dc25bee6fc28fa092f319c54d
Submitter: Zuul
Branch: stable/pike

commit 892af4718b97019dc25bee6fc28fa092f319c54d
Author: Bernhard M. Wiedemann <email address hidden>
Date: Thu Aug 31 15:11:41 2017 +0200

    Avoid restarting a child when terminating

    When sending a SIGTERM to the main glance-api process,
    it was sending a SIGTERM to its children
    but then also re-spawning its first dead child.

    Closes-bug: #1714240

    Change-Id: Ibef426c198d287bbdac4e764fd654edba4b7c2d6
    (cherry picked from commit 877cd166b56ec4b7f5530ea9bf1587077692275b)

tags: added: in-stable-pike

This issue was fixed in the openstack/glance 15.0.2 release.
