undercloud : firewall rules not persisted across reboot

Bug #1854980 reported by Harald Jensås
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Kevin Carter

Bug Description

[stack@leafs ~]$ rpm -q openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-12.0.1-0.20191129004429.8baf366.el7.noarch

/**
  * Install undercloud
  */

[stack@leafs ~]$ sudo iptables -L INPUT -v
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target prot opt in out source destination
 203K 50M neutron-openvswi-INPUT all -- any any anywhere anywhere
 468K 104M ACCEPT all -- any any anywhere anywhere /* 000 accept related established rules ipv4 */ ctstate RELATED,ESTABLISHED
    0 0 ACCEPT icmp -- any any anywhere anywhere /* 001 accept all icmp ipv4 */ ctstate NEW
16267 976K ACCEPT all -- lo any anywhere anywhere /* 002 accept all to lo interface ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any 172.20.0.0/26 anywhere tcp dpt:ssh /* 003 accept ssh from ctlplane subnet 172.20.0.1/26 ipv4 */ ctstate NEW
    0 0 ACCEPT udp -- any any anywhere anywhere udp dpt:ntp /* 105 ntp ipv4 */ ctstate NEW
    0 0 ACCEPT vrrp -- any any anywhere anywhere /* 106 keepalived vrrp ipv4 */ ctstate NEW
    0 0 ACCEPT vrrp -- any any anywhere anywhere /* 106 neutron_l3 vrrp ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:snmp-tcp-port /* 107 haproxy stats ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:25672 /* 109 rabbitmq ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:amqp /* 109 rabbitmq ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:epmd /* 109 rabbitmq ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:openstack-id /* 111 keystone ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13000 /* 111 keystone ipv4 */ ctstate NEW
   14 840 ACCEPT tcp -- any any anywhere anywhere tcp dpt:commplex-main /* 111 keystone ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13292 /* 112 glance_api ipv4 */ ctstate NEW
    8 480 ACCEPT tcp -- any any anywhere anywhere tcp dpt:armtechdaemon /* 112 glance_api ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13774 /* 113 nova_api ipv4 */ ctstate NEW
    9 540 ACCEPT tcp -- any any anywhere anywhere tcp dpt:8774 /* 113 nova_api ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13888 /* 113 zaqar_api ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:hbci /* 113 zaqar_api ipv4 */ ctstate NEW
   11 660 ACCEPT tcp -- any any anywhere anywhere tcp dpt:ddi-tcp-1 /* 113 zaqar_api ipv4 */ ctstate NEW
   10 600 ACCEPT tcp -- any any anywhere anywhere tcp dpt:cslistener /* 113 zaqar_api ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13696 /* 114 neutron api ipv4 */ ctstate NEW
   11 660 ACCEPT tcp -- any any anywhere anywhere tcp dpt:9696 /* 114 neutron api ipv4 */ ctstate NEW
    0 0 ACCEPT udp -- any any anywhere anywhere udp dpt:bootps /* 115 neutron dhcp input ipv4 */ ctstate NEW
    0 0 ACCEPT udp -- any any anywhere anywhere udp dpt:4789 /* 118 neutron vxlan networks ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:websm /* 119 novajoin ipv4 */ ctstate NEW
    1 60 ACCEPT tcp -- any any 172.20.0.0/26 anywhere tcp dpt:memcache /* 121 memcached 172.20.0.1/26 ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13808 /* 122 swift proxy ipv4 */ ctstate NEW
   19 1140 ACCEPT tcp -- any any anywhere anywhere tcp dpt:webcache /* 122 swift proxy ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:6002 /* 123 swift storage ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:6001 /* 123 swift storage ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:x11 /* 123 swift storage ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:rsync /* 123 swift storage ipv4 */ ctstate NEW
    0 0 ACCEPT udp -- any any 172.20.0.0/26 anywhere udp dpt:snmp /* 124 snmp 172.20.0.1/26 ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13004 /* 125 heat_api ipv4 */ ctstate NEW
   22 1320 ACCEPT tcp -- any any anywhere anywhere tcp dpt:8004 /* 125 heat_api ipv4 */ ctstate NEW
    1 60 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13385 /* 133 ironic api ipv4 */ ctstate NEW
   29 1740 ACCEPT tcp -- any any anywhere anywhere tcp dpt:6385 /* 133 ironic api ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13989 /* 133 mistral ipv4 */ ctstate NEW
   22 1320 ACCEPT tcp -- any any anywhere anywhere tcp dpt:sunwebadmins /* 133 mistral ipv4 */ ctstate NEW
    0 0 ACCEPT udp -- any any anywhere anywhere udp dpt:tftp /* 134 ironic conductor TFTP ipv4 */ ctstate NEW
    1 60 ACCEPT tcp -- any any anywhere anywhere tcp dpt:radan-http /* 135 ironic conductor HTTP ipv4 */ ctstate NEW
    0 0 ACCEPT gre -- any any anywhere anywhere /* 136 neutron gre networks ipv4 */
   26 1560 ACCEPT tcp -- any any anywhere anywhere tcp dpt:mmcc /* 137 ironic-inspector ipv4 */ ctstate NEW
    0 0 ACCEPT udp -- any any anywhere anywhere udp dpt:bootps /* 137 ironic-inspector dhcp input ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13778 /* 138 placement ipv4 */ ctstate NEW
   34 2040 ACCEPT tcp -- any any anywhere anywhere tcp dpt:8778 /* 138 placement ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:13787 /* 155 docker-registry ipv4 */ ctstate NEW
    0 0 ACCEPT tcp -- any any anywhere anywhere tcp dpt:msgsrvr /* 155 docker-registry ipv4 */ ctstate NEW
  360 23304 LOG all -- any any anywhere anywhere /* 998 log all ipv4 */ ctstate NEW limit: avg 20/min burst 15 LOG level warning
  412 26424 ACCEPT all -- any any anywhere anywhere /* 999 drop all ipv4 */ ctstate NEW
3179K 598M ACCEPT all -- any any anywhere anywhere ctstate RELATED,ESTABLISHED
 122K 7291K ACCEPT all -- lo any anywhere anywhere
  865 93594 INPUT_direct all -- any any anywhere anywhere
  865 93594 INPUT_ZONES_SOURCE all -- any any anywhere anywhere
  865 93594 INPUT_ZONES all -- any any anywhere anywhere
    0 0 DROP all -- any any anywhere anywhere ctstate INVALID
  864 93534 REJECT all -- any any anywhere anywhere reject-with icmp-host-prohibited

/**
  * Do a reboot
  */

[stack@leafs ~]$ sudo reboot
Connection to leafs.lab.example.com closed by remote host.
Connection to leafs.lab.example.com closed.

[hjensas@hjensas ~]$ ssh <email address hidden>
<email address hidden>'s password:
Last login: Tue Dec 3 17:28:29 2019 from 192.168.122.1

/**
  * Firewall rules are gone
  */

[stack@leafs ~]$ sudo iptables -L INPUT -v
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target prot opt in out source destination
 1854 434K ACCEPT all -- any any anywhere anywhere ctstate RELATED,ESTABLISHED
  211 12677 ACCEPT all -- lo any anywhere anywhere
    3 204 INPUT_direct all -- any any anywhere anywhere
    3 204 INPUT_ZONES_SOURCE all -- any any anywhere anywhere
    3 204 INPUT_ZONES all -- any any anywhere anywhere
    0 0 DROP all -- any any anywhere anywhere ctstate INVALID
    2 144 REJECT all -- any any anywhere anywhere reject-with icmp-host-prohibited
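
One way to list exactly which rules disappeared is to diff the rule comments between the two listings. This is a minimal Python sketch, assuming both `iptables -L INPUT -v` outputs were saved as text. The function names are made up for illustration, and rules without a comment are ignored.

```python
import re
from typing import List, Set

# Matches the /* ... */ comment that `iptables -L` prints for commented rules.
COMMENT_RE = re.compile(r"/\*\s*(.*?)\s*\*/")


def rule_comments(listing: str) -> Set[str]:
    """Return the set of rule comments found in an `iptables -L -v` listing."""
    return {m.group(1) for m in COMMENT_RE.finditer(listing)}


def missing_after_reboot(before: str, after: str) -> List[str]:
    """Comments present before the reboot but absent afterwards, sorted."""
    return sorted(rule_comments(before) - rule_comments(after))
```

Feeding it the two listings above should return every numbered tripleo rule comment, since none of them appear in the post-reboot output.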

Harald Jensås (harald-jensas) wrote :

Reverting commit 50367fbe3563d34976deb377ed32b6f26aeca44f fixes the issue.

Kevin Carter (kevin-carter) wrote :

In the deployment output, or the deployment logs, can you check if the task "Save firewall rules .*" was executed?

We can see in the role that the rule save commands are being defined [0], and there is an appropriate notify [1], however, we'll need to confirm that those tasks are actually executed as expected.

[0] https://github.com/openstack/tripleo-ansible/blob/8ca1ca46a966d45502d055fff16b0416b49ee009/tripleo_ansible/roles/tripleo-firewall/handlers/main.yml
[1] https://github.com/openstack/tripleo-ansible/blob/cbdbfe9cfb03047d63a37103cc217d98da519d38/tripleo_ansible/roles/tripleo-firewall/tasks/tripleo_firewall_add.yml#L79

Harald Jensås (harald-jensas) wrote :

[stack@leafs ~]$ grep "Save firewall rules .*" install-undercloud.log

2019-12-04 01:28:17.989 15496 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] RUNNING HANDLER [tripleo-firewall : Save firewall rules ipv4] ************************************************************************************************************
2019-12-04 01:28:18.306 15496 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] RUNNING HANDLER [tripleo-firewall : Save firewall rules ipv6] ************************************************************************************************************

 ^^^ Looks like the "Save firewall rules" tasks run.
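
The same check can be scripted if the log is large. This is a small Python sketch that only assumes the "RUNNING HANDLER [tripleo-firewall : ...]" line format shown in the grep output above. The function name is hypothetical.

```python
import re
from typing import List

# Pattern based on the handler lines seen in install-undercloud.log above.
HANDLER_RE = re.compile(
    r"RUNNING HANDLER \[tripleo-firewall : (Save firewall rules \w+)\]"
)


def executed_save_handlers(log_text: str) -> List[str]:
    """Return the names of the 'Save firewall rules' handlers found in the log."""
    return [m.group(1) for m in HANDLER_RE.finditer(log_text)]
```

An empty list would mean the handlers never ran; a non-empty one only shows they were started, not that the saved rules were correct.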

Harald Jensås (harald-jensas) wrote :

When reproducing this on a different node I got a slightly different result. The previous comments are from an undercloud running on a libvirt VM on my laptop, where I consistently see all firewall rules lost on reboot. When I try this on an undercloud running in an rdocloud instance, only two firewall rules are lost on reboot. See attached file.

Kevin Carter (kevin-carter) wrote :

Do we see package differences between your local test environment and the rdocloud instance?
Maybe the two missing rules are being applied outside of the tripleo_firewall role?

We can modify the role to flush in tasks instead of using the handler, however given we're seeing the rule save task execute I'm not sure that will make a difference.

Harald Jensås (harald-jensas) wrote :

The rdocloud instance has:
 openstack-tripleo-heat-templates-12.0.1-0.20191203161714.d85e4ad.el7.noarch

It also deployed the undercloud using a git clone of the tripleo-heat-templates master branch.

We can assume other packages also changed. (I had to rebuild the libvirt instance with queens so I lost the details there.)

These are the missing rules:
https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/ironic/ironic-inspector-container-puppet.yaml#L185-L193

I believe these are applied by the tripleo_firewall role?

Harald Jensås (harald-jensas) wrote :

I re-deployed the libvirt undercloud; it now uses the same delorean repo as my rdocloud instance:
baseurl=https://trunk.rdoproject.org/centos7/83/43/83439525641e9908d75ed8ba8bced96fcef640af_6db41c91

This node also loses the deployment/ironic/ironic-inspector-container-puppet.yaml#L185-L193 rules in the IPv4 table. I believe these are removed by the non-persistent rules handler:
https://github.com/openstack/tripleo-ansible/blob/8ca1ca46a966d45502d055fff16b0416b49ee009/tripleo_ansible/roles/tripleo-firewall/handlers/main.yml#L46

This is done incorrectly: it is not supposed to remove the rules that have ironic-inspector in the comment. The puppet code works like this:

a) Only run if there is a line that includes ironic-inspector but does NOT include "\-m comment \--comment".
 https://github.com/openstack/puppet-tripleo/blob/master/manifests/firewall.pp#L185
b) The sed command then prints (';p') the lines matching /-m comment --comment.*ironic-inspector/ and deletes the remaining lines containing "ironic-inspector".
https://github.com/openstack/puppet-tripleo/blob/master/manifests/firewall.pp#L182

Conclusion: the tripleo_firewall implementation is incorrect. It removes ironic-inspector firewall rules that have a comment as well as the ironic-inspector rules without one. It should only remove rules without a comment.
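
To make the intended behaviour concrete, here is a rough Python sketch of steps a) and b) as described above. It is not the puppet or Ansible code, only an illustration. The function name and the sample rule lines in the usage note are hypothetical, and for simplicity it checks only that the comment flag is present, not that it precedes "ironic-inspector" as the sed expression does.

```python
from typing import List

COMMENT_FLAG = "-m comment --comment"


def strip_non_persistent(lines: List[str], marker: str = "ironic-inspector") -> List[str]:
    """Drop lines that mention `marker` but carry no iptables comment."""

    def uncommented(line: str) -> bool:
        return marker in line and COMMENT_FLAG not in line

    # Step a: do nothing unless at least one uncommented marker line exists.
    # (Redundant for a pure filter, kept to mirror the guard described above.)
    if not any(uncommented(line) for line in lines):
        return list(lines)

    # Step b: keep commented marker lines and unrelated lines, drop the rest.
    return [line for line in lines if not uncommented(line)]
```

With a commented "137 ironic-inspector" INPUT rule and an uncommented `-A ironic-inspector ...` chain rule as input, only the second one should be dropped.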

The IPv6 table rules on the nodes are in worse shape, so more is wrong there.

On one node I have:

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target prot opt in out source destination
67246 12M neutron-openvswi-INPUT all * * ::/0 ::/0
72871 14M ACCEPT all * * ::/0 ::/0 ctstate RELATED,ESTABLISHED
 3194 256K ACCEPT all lo * ::/0 ::/0
    6 520 INPUT_direct all * * ::/0 ::/0
    6 520 INPUT_ZONES_SOURCE all * * ::/0 ::/0
    6 520 INPUT_ZONES all * * ::/0 ::/0
    0 0 DROP all * * ::/0 ::/0 ctstate INVALID
    0 0 REJECT all * * ::/0 ::/0 reject-with icmp6-adm-prohibited

On the other node, only the neutron-openvswi-INPUT rule is present in the INPUT chain.

Chain INPUT (policy ACCEPT 89361 packets, 19M bytes)
 pkts bytes target prot opt in out source destination
89361 19M neutron-openvswi-INPUT all * * ::/0 ::/0

A couple of things I notice:

1. The firewalld service is running?

puppetlabs-firewall will disable that for RedHat distros:
https://github.com/puppetlabs/puppetlabs-firewall/blob/f4d3f95cdb36304e0fce2924aa2eb48ca6358851/manifests/linux/redhat.pp#L53

Are we supposed to run firewalld now?

● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (run...


OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/699477

Changed in tripleo:
assignee: nobody → Kevin Carter (kevin-carter)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by Harald Jensås (<email address hidden>) on branch: master
Review: https://review.opendev.org/699477
Reason: https://review.opendev.org/699489 and https://review.opendev.org/699486 should fix this issue.

Changed in tripleo:
milestone: ussuri-1 → ussuri-2
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/699486
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=4b4d93ad041e64c7ad2da9d6072160764e5b7d41
Submitter: Zuul
Branch: master

commit 4b4d93ad041e64c7ad2da9d6072160764e5b7d41
Author: Kevin Carter <email address hidden>
Date: Tue Dec 17 13:38:27 2019 -0600

    Update firewall role handlers

    These changes will do the following

    > Ensure that the iptables services for both ipv4 and ipv6 are enabled
      and started
    > Ensure the firewalld service is stopped and disabled.
    > Ensure that the tripleo-iptables services for both ipv4 and ipv6 are
      enabled and started
    > Extend the non-persistent rule lookup to all iptables sysconfig files.

    Related-Bug: #1854980
    Change-Id: I23898d7536edbedc6abca34eede8c680fe2c39d3
    Signed-off-by: Kevin Carter <email address hidden>
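
The service states listed in this commit message can be written down as a simple expected-state table and compared against what a node reports. This Python sketch is only an illustration: the unit names are assumptions inferred from the commit message, and the observed states are assumed to come from something like `systemctl is-active <unit>` collected separately.

```python
from typing import Dict, List

# Assumed unit names; the commit message only names the services loosely.
DESIRED_STATE: Dict[str, str] = {
    "iptables": "active",
    "ip6tables": "active",
    "tripleo-iptables": "active",
    "tripleo-ip6tables": "active",
    "firewalld": "inactive",
}


def state_mismatches(observed: Dict[str, str]) -> List[str]:
    """Return the units whose observed state differs from the desired one."""
    return sorted(
        unit
        for unit, wanted in DESIRED_STATE.items()
        if observed.get(unit, "unknown") != wanted
    )
```

A unit missing from the observed mapping is reported as a mismatch, which errs on the side of flagging nodes where a service was never installed.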

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/699489
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=cc4919f95153207e764640c32b6e49a876fe8b64
Submitter: Zuul
Branch: master

commit cc4919f95153207e764640c32b6e49a876fe8b64
Author: Kevin Carter <email address hidden>
Date: Tue Dec 17 14:12:34 2019 -0600

    Update regex for presistent save roles

    The regex we were using was too general and was not removing the right
    rules. This change will make it so that the task will remove lines that
    do NOT have a comment, AND contain either "ironic-inspector" OR "neutron-".

    This change should provide parity with the old persistent save rule
    function we had in puppet.

    Closes-Bug: #1854980
    Change-Id: I6a7d82938bfb138aa3de875184af8b38babed84e
    Signed-off-by: Kevin Carter <email address hidden>
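
The condition in this commit message (no comment, AND either "ironic-inspector" OR "neutron-") can be expressed as one regular expression with a negative lookahead. This is a Python sketch of that idea, not the expression used in the merged change. The function name and the sample lines in the usage note are hypothetical.

```python
import re

# A line is non-persistent when it has no iptables comment anywhere
# and mentions ironic-inspector or a neutron- prefixed name.
NON_PERSISTENT_RE = re.compile(
    r"^(?!.*-m comment --comment).*(?:ironic-inspector|neutron-)"
)


def is_non_persistent(line: str) -> bool:
    """True if the saved-rules line should be removed before persisting."""
    return NON_PERSISTENT_RE.search(line) is not None
```

For example, `-A neutron-openvswi-INPUT -j ACCEPT` would be removed, while an INPUT rule carrying `--comment "137 ironic-inspector ipv4"` would be kept.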

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 1.1.0

This issue was fixed in the openstack/tripleo-ansible 1.1.0 release.
