Minor updates in composable HA break due to haproxy rules being applied too late
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | High | Michele Baldessari |
Bug Description
Any role that has haproxy needs custom iptables rules that open up the traffic for all the haproxy stanzas. This is normally not spectacularly interesting or important when the role containing haproxy also contains all other controller services (mysql/
In such a composable HA scenario minor updates can potentially break. Imagine the following scenario. Note that a minor update only runs the update tasks, host_prep_tasks and the docker_config tasks, aka the transient containers.
Now imagine the following scenario:
1) Minor update on controller-2, followed by controller-1
At this point the haproxy rules have disappeared from controller-2 and controller-1 because they run on the deployment steps which are not run during minor update.
2) Minor update of controller-0
At this point any transient container that tries to update or poke the DB will be stuck with:
2020-04-07 15:00:53.606 12 WARNING oslo_db.
Because those haproxy ports (3306 in this specific case) will not appear until we run the converge step.
Only stein and train and onwards are affected (queens created iptables rules inside a transient container which was run at the right time)
Changed in tripleo:
status: Triaged → In Progress
Reviewed: https://review.opendev.org/718159
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=6220fe1bd319b8fabcc46b4b3fc705ac2f5526ed
Submitter: Zuul
Branch: master
commit 6220fe1bd319b8fabcc46b4b3fc705ac2f5526ed
Author: Michele Baldessari <email address hidden>
Date: Tue Apr 7 18:00:13 2020 +0200
Move the haproxy iptables rules creation to host_prep_tasks
The reason for this is that under deploy_tasks they won't be run during
an update (until the converge command is run). This is problematic
because in a composable HA being updated the haproxy firewall rules
might disappear due to other tasks cleaning the rules up and they won't
be recreated until converge. The problem is that the temporary
containers will run during the minor update and try to access the db
which is now effectively firewalled off.
Historically this was at step 2, because haproxy was configured during
that step. Nothing should prevent us from creating the rules before and
that is what we do for the non-haproxy rules too anyway.
While moving it we need to take out the code from
::tripleo::profile::base::haproxy and use it directly because we do not
have the required 'step' variable set in host_prep_tasks and silly
puppet has no way of passing a hiera value on the command line (or via
other simple means)
Tested as follows:
1) Deployed a fresh Train environment with this patch and correctly
observed the haproxy fw rules:
[root@controller-0 ~]# iptables -nvL INPUT |grep _haproxy |wc -l
27
2) Ran a minor update of controller-2, controller-1 and controller-0
(in that order) and verified that afterwards all _haproxy rules
are in place *before* the converge.
3) Confirmed that in the minor update logs we do see the step where
haproxy rules are enforced (previously this was not the case):
$ grep 'Run puppet on the host to apply IPtables rules' update-controller-2.log
TASK [Run puppet on the host to apply IPtables rules] *******************
4) Run a full minor update + converge of a composable HA environment
Closes-Bug: #1871646
Change-Id: Icba8a8292d1e2675c7da3513d00a4a0f4191747e
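A pre-converge check in the spirit of test step 2 above, added here as an example and not part of the bug report or the tripleo-heat-templates change: it counts the *_haproxy INPUT rules on a controller and confirms that the MySQL port 3306 the bug describes is actually opened by one of them. The `_haproxy` comment tag follows the grep used above; the iptables-save line format and the rule names in the constructed sample are assumptions.

```shell
# Added example (not from the bug or the tripleo change): verify that a
# controller still has its haproxy firewall rules after a minor update, before
# converge. "_haproxy" rule-comment tag and line format are assumptions.

# Constructed stand-in for `iptables-save -t filter` output on a controller.
SAMPLE_RULES='-A INPUT -p tcp -m multiport --dports 3306 -m state --state NEW -m comment --comment "100 mysql_haproxy ipv4" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 5000 -m state --state NEW -m comment --comment "100 keystone_haproxy ipv4" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 22 -m state --state NEW -m comment --comment "003 accept ssh ipv4" -j ACCEPT'

# count_haproxy_rules RULES: number of INPUT ACCEPT rules tagged *_haproxy
# (the count that `iptables -nvL INPUT |grep _haproxy |wc -l` reports).
count_haproxy_rules() {
    printf '%s\n' "$1" | grep -c -- '-A INPUT .*_haproxy.* -j ACCEPT' || true
}

# haproxy_port_open RULES PORT: succeeds if PORT is in the --dport(s) list of
# some *_haproxy ACCEPT rule; multiport lists such as 3306,4567 are split.
haproxy_port_open() {
    printf '%s\n' "$1" | awk -v port="$2" '
        /-A INPUT/ && /_haproxy/ && /-j ACCEPT/ {
            for (i = 1; i <= NF; i++)
                if ($i == "--dports" || $i == "--dport") {
                    n = split($(i + 1), p, ",")
                    for (j = 1; j <= n; j++) if (p[j] == port) found = 1
                }
        }
        END { exit (found ? 0 : 1) }'
}

# check_controller RULES MIN_RULES PORT...: reports gaps; returns 1 on any gap.
check_controller() {
    rules="$1"; min="$2"; shift 2
    rc=0
    n=$(count_haproxy_rules "$rules")
    if [ "$n" -lt "$min" ]; then
        echo "FAIL: only $n _haproxy rules present (expected at least $min)"
        rc=1
    fi
    for port in "$@"; do
        if haproxy_port_open "$rules" "$port"; then
            echo "ok: port $port is accepted by a _haproxy rule"
        else
            echo "FAIL: no _haproxy ACCEPT rule for port $port"
            rc=1
        fi
    done
    return $rc
}

# Stand-alone run over the constructed sample; on a real controller replace
# SAMPLE_RULES with "$(iptables-save -t filter)" and run this before converge.
check_controller "$SAMPLE_RULES" 2 3306 5000
```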