Make sure rhel-push-plugin.service is stopped after pacemaker stops
When issuing a normal reboot command on an overcloud node the following
stop sequence can take place:
 -------------     -----------------------------
 | Pacemaker |     | paunch-container-shutdown |
 -------------     -----------------------------
        |                        |
         \                      /
          \                    /
              ----------
              | docker |
              ----------
If there are docker plugins that are allowed to stop before docker and
also before pacemaker, stopping them while pacemaker is shutting down
can cause a series of timeouts and a failure to stop containers:
Sep 13 17:53:00.821030 controller-0.localdomain pacemakerd[6147]: notice: Shutting down Pacemaker
Sep 13 17:54:15.798026 controller-0.localdomain lrmd[6284]: warning: galera-bundle-docker-0_monitor_60000 process (PID 226329) timed out
Sep 13 17:54:15.799004 controller-0.localdomain lrmd[6284]: warning: galera-bundle-docker-0_monitor_60000:226329 - timed out after 20000ms
One of these plugins is 'rhel-push-plugin.service'. It seems that when
this plugin is free to stop before docker on shutdown, docker commands
are very likely to start timing out.
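The ordering problem described above can be sketched as a small
dependency model. This is only an illustration, not systemd's actual
algorithm: the unit labels are shorthand, the "stops after" edges are
taken from the diagram and from the title of this change, and the helper
names stop_order and must_stop_before are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def stop_order(stops_after):
    """Return one valid stop sequence.

    stops_after maps a unit to the set of units that must stop before it.
    """
    return list(TopologicalSorter(stops_after).static_order())


def must_stop_before(unit, stops_after):
    """Return every unit that is forced (transitively) to stop before `unit`."""
    seen = set()
    stack = list(stops_after.get(unit, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(stops_after.get(current, ()))
    return seen


# Edges from the diagram: docker stops only after both of its dependents.
# The plugin has no constraint at all, so it may stop at any point.
before_fix = {
    "docker": {"pacemaker", "paunch-container-shutdown"},
    "rhel-push-plugin": set(),
}

# Assumed effect of the change: the plugin may only stop after pacemaker.
after_fix = {**before_fix, "rhel-push-plugin": {"pacemaker"}}

if __name__ == "__main__":
    print("unconstrained:", must_stop_before("rhel-push-plugin", before_fix))
    print("constrained:  ", must_stop_before("rhel-push-plugin", after_fix))
    print("one valid stop order:", stop_order(after_fix))
```

In the unconstrained model nothing prevents the plugin from stopping
while pacemaker is still shutting down its containers; with the extra
edge, every valid stop order places pacemaker ahead of the plugin.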
Before:
Before adding the symlink, rebooting a node took 15 minutes, with a
number of timeouts on shutdown and some failed actions on boot.
After:
A reboot completes in a couple of minutes, with no failed actions at
boot and no timeouts during shutdown.
NB: We add the symlink unconditionally as systemd will ignore it if the
service is not installed.
Closes-Bug: #1792701
Change-Id: I6f6d27f2457efcc49d9edd8a2f98484c5f7c0933
(cherry picked from commit e288dbd8252765020816639b9b53f8212292cfaf)
Reviewed: https://review.openstack.org/606849
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=5f43470da1f80675ac6144136ec8e60f23f9356b
Submitter: Zuul
Branch: stable/queens
commit 5f43470da1f80675ac6144136ec8e60f23f9356b
Author: Michele Baldessari <email address hidden>
Date: Sat Sep 15 15:19:26 2018 +0200