Comment 2 for bug 1657108

Michele Baldessari (michele) wrote :

Ok so the bug is really somewhere in tripleo::firewall. Here is what happens:
1) Image starts with prepopulated /etc/sysconfig/iptables only allowing ssh and icmp
2) During boot (either because the puppet class "firewall" enforces it or because the image has it configured) the "iptables" systemd service starts and sets the above iptables rules
3) We have the following in puppet-pacemaker:
Service['pcsd'] -> exec { 'auth-successful-across-all-nodes':
  command => "${::pacemaker::pcs_bin} cluster auth ${cluster_members} -u hacluster -p ${::pacemaker::hacluster_pwd}",
4) tripleo::firewall guarantees the following:
Class['tripleo::firewall::pre'] -> Class['tripleo::firewall::post']
Service<||> -> Class['tripleo::firewall::post']

So we have this sequence overall:
A) Class['tripleo::firewall::pre'] -> B) Service['pcsd'] -> C) exec { 'auth-successful-across-all-nodes'} -> D) Class['tripleo::firewall::post']

The problem is that when C) runs, the cluster ports are not open yet, so it hangs while retrying many times.

Potential solutions:
A) Empty /etc/sysconfig/iptables in the image itself (it makes little sense to have it anyway)
B) Find another way to purge the rules in there. I tried the following:
diff --git a/manifests/firewall.pp b/manifests/firewall.pp
index 8c6a53b..d577ca1 100644
--- a/manifests/firewall.pp
+++ b/manifests/firewall.pp
@@ -56,7 +56,8 @@ class tripleo::firewall(
     # Only purges IPv4 rules
     if $purge_firewall_rules {
       resources { 'firewall':
-        purge => true
+        purge => true,
+        before => Class['tripleo::firewall::pre'],
       }
     }

and with setting:
parameter_defaults:
  PurgeFirewallRules: true

This does clean the live rules on the system, but for some reason the iptables service is then started again, which reprovisions the previous rules.

C) Add some special rules in firewall::pre that open up the cluster ports (or any service that might be impacted)
D) Any other approaches here?
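
For reference, option C) could look roughly like the sketch below. It is only an illustration, not a tested patch: the rule names and numbers are made up, the ports are the usual pacemaker/corosync defaults (2224 for pcsd, 3121 for pacemaker_remote, 21064 for dlm, 5404/5405 udp for corosync) and may differ per deployment, and it assumes the plain puppetlabs-firewall 'firewall' type with the 'action' parameter (newer releases of that module use 'jump' instead). It also assumes the rules are declared inside tripleo::firewall::pre and that this class is applied before Service['pcsd'], as in the sequence above.

```puppet
# Hypothetical rules, meant to be declared inside tripleo::firewall::pre.
# Names and numbering are illustrative only.

# TCP ports used by pcsd, pacemaker_remote and dlm (usual defaults).
firewall { '010 pacemaker tcp (illustrative)':
  proto  => 'tcp',
  dport  => [2224, 3121, 21064],
  action => 'accept',
}

# UDP ports used by corosync (usual defaults).
firewall { '011 corosync udp (illustrative)':
  proto  => 'udp',
  dport  => [5404, 5405],
  action => 'accept',
}
```

With something like this in place, the 'pcs cluster auth' exec in C) would find port 2224 open even when the image ships a restrictive /etc/sysconfig/iptables. The downside is that every service started before the post rules would need the same treatment.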

Note that with commit 2ca3cb03ad5f05469e5ae181981e559ccc77371f "firewall: stop using stdlib stages" we stated that:
- use ordering to make sure we start all Services in catalog before post
  rules. It ensure that we don't drop all traffic before starting the
  services, which could lead to services errors (e.g. trying to reach database
  or amqp)

The problem is that the above holds true only when iptables starts out with a clean rule set.