Established connection doesn't stop when rule is removed

Bug #1657260 reported by Slawek Kaplonski
This bug affects 2 people
Affects: neutron
Status: Fix Released
Importance: Critical
Assigned to: Kevin Benton

Bug Description

If the iptables driver is used for security groups (e.g. in the Linuxbridge L2 agent), there is an issue with rule updates. When you have a rule that allows some kind of traffic (for example, SSH from some source IP address) and an established, active connection that matches this rule, the connection stays active even if the rule is removed or changed.
This is because the iptables chain for each SG starts with a rule that accepts packets with "state RELATED,ESTABLISHED".
I'm not sure whether this is in fact a bug or just a design decision to get better iptables performance.
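
The behaviour described above can be illustrated with a toy first-match model of a rule chain. This is only a sketch: the `Packet`, `Rule`, and `evaluate` names, the "ALLOW"/"DROP" targets, and the 192.0.2.10 address are hypothetical and are not taken from neutron or iptables itself.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Packet:
    proto: str
    dport: int
    src: str
    ct_state: str  # toy stand-in for the conntrack state: "NEW" or "ESTABLISHED"


@dataclass
class Rule:
    name: str
    matches: Callable[[Packet], bool]
    target: str  # "ALLOW" or "DROP" in this toy model


def evaluate(chain: List[Rule], packet: Packet) -> str:
    """Return the target of the first matching rule; drop if nothing matches."""
    for rule in chain:
        if rule.matches(packet):
            return rule.target
    return "DROP"


# First rule in the chain: pass anything that belongs to a known connection.
established_rule = Rule(
    "related-established",
    lambda p: p.ct_state in ("RELATED", "ESTABLISHED"),
    "ALLOW",
)
# Hypothetical SG rule: SSH from one source address.
ssh_rule = Rule(
    "allow-ssh",
    lambda p: p.proto == "tcp" and p.dport == 22 and p.src == "192.0.2.10",
    "ALLOW",
)

chain_with_ssh = [established_rule, ssh_rule]
chain_without_ssh = [established_rule]  # the SG rule has been removed

new_conn = Packet("tcp", 22, "192.0.2.10", "NEW")
old_conn = Packet("tcp", 22, "192.0.2.10", "ESTABLISHED")

print(evaluate(chain_with_ssh, new_conn))     # ALLOW
print(evaluate(chain_without_ssh, new_conn))  # DROP
print(evaluate(chain_without_ssh, old_conn))  # ALLOW: still tracked as established
```

In this model, removing the SSH rule only blocks new connections. An already tracked connection keeps matching the first rule until its conntrack entry is deleted.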

Brian Haley (brian-haley) wrote :

When a SG rule is removed, conntrack -D is run to drop all the connections that might be using the old rule. Did you see that happen in the logs?

Slawek Kaplonski (slaweq) wrote :

Brian: I didn't find anything like that in the logs. I found this issue when I was working on a fullstack test for security groups in the Linuxbridge agent (https://review.openstack.org/#/c/417532/).
If you say that it should work properly, I will check it once again and also look into what could be wrong.

Brian Haley (brian-haley) wrote :

I just tried on a devstack from yesterday and it worked. Check in the agent log that conntrack -D was called for the instance's IP address(es) - I tried both IPv4 and IPv6.

Slawek Kaplonski (slaweq) wrote :

Brian: ok, I will check it today but later. Thx once again

Slawek Kaplonski (slaweq) wrote :

Brian: I made a dummy "patch" to test it with fullstack tests. You can check it at https://review.openstack.org/#/c/425923/
I tested it locally and it fails for the Linuxbridge agent on my host. I want to be sure that it's not an issue only on my host, so I pushed this patch to test it on the gate.
When I was working on this fullstack patch some time ago, I remember that I also tested it on another host with devstack installed and had the same issue.
My test was:
1. Spawn a VM (cirros or ubuntu, I don't remember exactly).
2. Connect with netcat to this VM from the qrouter (or qdhcp) namespace. The connection was not possible without a proper security group rule.
3. Add the SG rule. The netcat connection became possible.
4. While the netcat connection was still active, I removed the SG rule and checked that it was removed from iptables. The active connection stayed active.
5. Disconnect netcat. I couldn't connect to the VM again in the same way.

So IMHO there is some issue with this.
Can you write exactly how you ran your test?

Slawek Kaplonski (slaweq) wrote :

@Brian: At http://logs.openstack.org/23/425923/1/check/gate-neutron-dsvm-fullstack-ubuntu-xenial/a9b4fe6/testr_results.html.gz you can check the result of this test on the gate as well. It fails as I expected. IMHO it is related to what I described in this bug report.

Brian Haley (brian-haley) wrote :

Slawek, I tested this manually on devstack using OVS with the iptables_hybrid firewall driver.

1. Added ICMP and SSH rules to default SG
2. Booted VM and associated floating IP
3. Ran ping; deleted ICMP rule; ping stopped; added rule back; ping resumed
4. Ran ssh; deleted SSH rule; session "hung"; added rule back; session resumed

I also looked in the q-agt log and saw conntrack -D was run:

Running command (rootwrap daemon): ['conntrack', '-D', '-p', 'icmp', '-f', 'ipv4', '-d', '10.0.0.5', '-w', '2'] execute_rootwrap_daemon /opt/stack/neutron/neutron/agent/linux/utils.py:113

Running command (rootwrap daemon): ['conntrack', '-D', '-p', 'tcp', '-f', 'ipv4', '-d', '10.0.0.5', '-w', '2'] execute_rootwrap_daemon /opt/stack/neutron/neutron/agent/linux/utils.py:113

Slawek Kaplonski (slaweq) wrote :

Brian: so it looks like it fails only for the "non-hybrid" driver (and maybe only with the Linuxbridge agent).
You can see that OVS with the hybrid driver also passes in the same fullstack test: http://logs.openstack.org/23/425923/1/check/gate-neutron-dsvm-fullstack-ubuntu-xenial/a9b4fe6/console.html#_2017-01-26_21_32_18_925037

In the Linuxbridge agent logs this conntrack -D call is also visible, but the connection is not stopped. I will look into it more and write here if I find anything.

Changed in neutron:
assignee: nobody → Slawek Kaplonski (slaweq)
Slawek Kaplonski (slaweq) wrote :

I think I found the issue. In OVSHybridIptablesFirewallDriver, CT zones are set. In IptablesFirewallDriver they are not set, but the conntrack manager still tries to remove the entry from such a zone, and that fails.
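
The difference can be sketched by building the conntrack command with and without a zone. `build_conntrack_delete_cmd` is a hypothetical helper written for illustration, not neutron's actual code. The flag layout mirrors the commands logged earlier in this thread, where `-w` selects the CT zone.

```python
from typing import List, Optional


def build_conntrack_delete_cmd(
    ip_address: str,
    protocol: str,
    ethertype: str = "ipv4",
    zone: Optional[int] = None,
) -> List[str]:
    """Build a 'conntrack -D' argument list (hypothetical helper).

    The zone filter (-w) is appended only when a zone is actually known,
    so entries created without a zone can still be matched.
    """
    cmd = ["conntrack", "-D", "-p", protocol, "-f", ethertype, "-d", ip_address]
    if zone is not None:
        cmd += ["-w", str(zone)]
    return cmd


# With a zone, as in the hybrid driver log lines quoted earlier:
print(build_conntrack_delete_cmd("10.0.0.5", "tcp", zone=2))
# Without a zone, for a driver that never assigned one:
print(build_conntrack_delete_cmd("10.0.0.5", "tcp"))
```

If the entry was created without a zone, a delete that filters on a zone would presumably not match it, which is consistent with the connection staying active.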

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/426429

Changed in neutron:
status: New → In Progress
Changed in neutron:
milestone: none → ocata-rc1
Changed in neutron:
importance: Undecided → Critical
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/426429
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=10bfa690885f06316ccec1fee39e51ca64058443
Submitter: Jenkins
Branch: master

commit 10bfa690885f06316ccec1fee39e51ca64058443
Author: Sławek Kapłoński <email address hidden>
Date: Fri Jan 27 23:19:25 2017 +0000

    Clear conntrack entries without zones if CT zones are not used

    CT zones are used only in OVSHybridIptablesFirewallDriver.
    Such zones are not set in IptablesFirewallDriver class but
    even if iptables driver was is not using CT zones, it was
    used by conntrack manager class during delete of conntrack
    entry.
    This cause issue that for Linuxbridge agent established and
    active connection stayed active even after security group
    rule was deleted.
    This patch changes conntrack manager class that it will not
    use CT zone (-w option) if zone for port was not assigned
    earlier.

    Change-Id: Ib9c8d0a09d0858ff6f36db406c6b2a9191f304d1
    Closes-bug: 1657260

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 10.0.0.0rc1

This issue was fixed in the openstack/neutron 10.0.0.0rc1 release candidate.

Jakub Libosvar (libosvar) wrote :

The issue still exists and currently is failing fullstack tests.

I was able to reproduce the issue locally. After SG rule was removed, I still see

tcp 6 431985 ESTABLISHED src=20.0.0.10 dst=20.0.0.9 sport=42308 dport=3355 src=20.0.0.9 dst=20.0.0.10 sport=3355 dport=42308 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1

in conntrack, and increasing counters in iptables for the following rule:

Chain neutron-linuxbri-i886980ff-0 (1 references)
 pkts bytes target prot opt in out source destination
   25 1460 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */

linuxbridge-agent logs say:

2017-03-15 11:50:06.179 11755 DEBUG neutron.agent.linux.ip_conntrack [req-b776ebc8-72f4-4385-98d4efa38ecb63a9 - - - - -] No zone for device tap886980ff-0c. Will not try to clear conntrack state. Zone map: {} _get_conntrack_cmds /opt/stack/neutron/neutron/agent/linux/ip_conntrack.py:83
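
To check for leftover entries like the one above in a script, the conntrack listing can be parsed. The following is a minimal sketch, assuming the line layout shown in this comment (protocol, protocol number, timeout, optional state, then key=value pairs for the original and reply directions). `parse_conntrack_line` is a hypothetical helper.

```python
from typing import Any, Dict

TUPLE_KEYS = ("src", "dst", "sport", "dport")


def parse_conntrack_line(line: str) -> Dict[str, Any]:
    """Parse one line of conntrack listing output into a dict (sketch)."""
    tokens = line.split()
    entry: Dict[str, Any] = {
        "proto": tokens[0],
        "timeout": int(tokens[2]),
        "state": None,
        "flags": [],
        "original": {},
        "reply": {},
        "attrs": {},
    }
    for token in tokens[3:]:
        if token.startswith("[") and token.endswith("]"):
            entry["flags"].append(token[1:-1])
        elif "=" in token:
            key, value = token.split("=", 1)
            if key in TUPLE_KEYS:
                # The first occurrence of a key is the original direction,
                # the second occurrence is the reply direction.
                direction = "reply" if key in entry["original"] else "original"
                entry[direction][key] = value
            else:
                entry["attrs"][key] = value
        else:
            entry["state"] = token
    return entry


SAMPLE = (
    "tcp 6 431985 ESTABLISHED src=20.0.0.10 dst=20.0.0.9 sport=42308 dport=3355 "
    "src=20.0.0.9 dst=20.0.0.10 sport=3355 dport=42308 [ASSURED] mark=0 "
    "secctx=system_u:object_r:unlabeled_t:s0 use=1"
)

entry = parse_conntrack_line(SAMPLE)
print(entry["state"], entry["original"]["dst"], entry["original"]["dport"])
print("zone" in entry["attrs"])  # False: this sample line has no zone= field
```

The sample line is the entry quoted above. It carries no `zone=` field, which fits the "Zone map: {}" message in the agent log.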

Changed in neutron:
status: Fix Released → Confirmed
Slawek Kaplonski (slaweq) wrote :

So it looks like CT zones should also be added for the "non-hybrid" driver.

Changed in neutron:
assignee: Slawek Kaplonski (slaweq) → Jakub Libosvar (libosvar)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/446099

Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Jakub Libosvar (<email address hidden>) on branch: master
Review: https://review.openstack.org/446099
Reason: Abandoned in favor of https://review.openstack.org/#/c/441353/5

Changed in neutron:
assignee: Jakub Libosvar (libosvar) → Kevin Benton (kevinbenton)
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/441353
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c76164c058a0cfeee3eb46b523a9ad012f93dd51
Submitter: Jenkins
Branch: master

commit c76164c058a0cfeee3eb46b523a9ad012f93dd51
Author: Kevin Benton <email address hidden>
Date: Fri Mar 3 11:18:28 2017 -0800

    Move conntrack zones to IPTablesFirewall

    The regular IPTablesFirewall needs zones to support safely
    clearly conntrack entries.

    In order to support the single bridge use case, the conntrack
    manager had to be refactored slightly to allow zones to be
    either unique to ports or unique to networks.

    Since all ports in a network share a bridge in the IPTablesDriver
    use case, a zone per port cannot be used since there is no way
    to distinguish which zone traffic should be checked against when
    traffic enters the bridge from outside the system.

    A zone per network is adequate for the single bridge per network
    solution since it implicitly does not suffer from the double-bridge
    cross in a single network that led to per port usage in OVS.[1]

    This had to adjust the functional firewall tests to use the correct
    bridge name now that it's relevant in the non hybrid IPTables case.

    1. Ibe9e49653b2a280ea72cb95c2da64cd94c7739da

    Closes-Bug: #1668958
    Closes-Bug: #1657260
    Change-Id: Ie88237d3fe4807b712a7ec61eb932748c38952cc
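
The "unique to ports or unique to networks" idea from the commit message above can be sketched with a toy allocator that hands out one zone ID per key. `ZoneAllocator` and the port dictionaries are hypothetical and only illustrate the keying choice. They are not the refactored conntrack manager.

```python
from typing import Dict, Hashable


class ZoneAllocator:
    """Toy allocator: one conntrack zone ID per distinct key."""

    def __init__(self, first_zone: int = 1) -> None:
        self._next_zone = first_zone
        self._zones: Dict[Hashable, int] = {}

    def get_zone(self, key: Hashable) -> int:
        # Reuse the zone if this key was seen before, else take the next ID.
        if key not in self._zones:
            self._zones[key] = self._next_zone
            self._next_zone += 1
        return self._zones[key]


# Hypothetical ports: two on the same network, one on another.
ports = [
    {"id": "port-a", "network_id": "net-1"},
    {"id": "port-b", "network_id": "net-1"},
    {"id": "port-c", "network_id": "net-2"},
]

per_port = ZoneAllocator()
per_network = ZoneAllocator()

port_zones = {p["id"]: per_port.get_zone(p["id"]) for p in ports}
network_zones = {p["id"]: per_network.get_zone(p["network_id"]) for p in ports}

print(port_zones)     # {'port-a': 1, 'port-b': 2, 'port-c': 3}
print(network_zones)  # {'port-a': 1, 'port-b': 1, 'port-c': 2}
```

With per-network keys, all ports that share a bridge also share a zone, so traffic entering the bridge from outside can be checked against a single zone.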

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/455399

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.0.0b1

This issue was fixed in the openstack/neutron 11.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ocata)

Reviewed: https://review.openstack.org/455399
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9a920fe0a561d36db95e27ac5673a9dba4d845d3
Submitter: Jenkins
Branch: stable/ocata

commit 9a920fe0a561d36db95e27ac5673a9dba4d845d3
Author: Kevin Benton <email address hidden>
Date: Fri Mar 3 11:18:28 2017 -0800

    Move conntrack zones to IPTablesFirewall

    The regular IPTablesFirewall needs zones to support safely
    clearly conntrack entries.

    In order to support the single bridge use case, the conntrack
    manager had to be refactored slightly to allow zones to be
    either unique to ports or unique to networks.

    Since all ports in a network share a bridge in the IPTablesDriver
    use case, a zone per port cannot be used since there is no way
    to distinguish which zone traffic should be checked against when
    traffic enters the bridge from outside the system.

    A zone per network is adequate for the single bridge per network
    solution since it implicitly does not suffer from the double-bridge
    cross in a single network that led to per port usage in OVS.[1]

    This had to adjust the functional firewall tests to use the correct
    bridge name now that it's relevant in the non hybrid IPTables case.

    1. Ibe9e49653b2a280ea72cb95c2da64cd94c7739da

    Closes-Bug: #1668958
    Closes-Bug: #1657260
    Change-Id: Ie88237d3fe4807b712a7ec61eb932748c38952cc
    (cherry picked from commit c76164c058a0cfeee3eb46b523a9ad012f93dd51)

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/460903

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/460906

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/462614

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/newton)

Reviewed: https://review.openstack.org/460903
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7cbb44836b75c202e4bed822d6ce212dafbea225
Submitter: Jenkins
Branch: stable/newton

commit 7cbb44836b75c202e4bed822d6ce212dafbea225
Author: Sławek Kapłoński <email address hidden>
Date: Fri Jan 27 23:19:25 2017 +0000

    Clear conntrack entries without zones if CT zones are not used

    CT zones are used only in OVSHybridIptablesFirewallDriver.
    Such zones are not set in IptablesFirewallDriver class but
    even if iptables driver was is not using CT zones, it was
    used by conntrack manager class during delete of conntrack
    entry.
    This cause issue that for Linuxbridge agent established and
    active connection stayed active even after security group
    rule was deleted.
    This patch changes conntrack manager class that it will not
    use CT zone (-w option) if zone for port was not assigned
    earlier.

    Closes-bug: 1657260

    Conflicts:
     neutron/tests/fullstack/test_securitygroup.py

    Change-Id: Ib9c8d0a09d0858ff6f36db406c6b2a9191f304d1
    (cherry-picked from 10bfa690885f06316ccec1fee39e51ca64058443)

tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/460906
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f142cde767c9ff1d9f787048bb4754b95aea8e84
Submitter: Jenkins
Branch: stable/newton

commit f142cde767c9ff1d9f787048bb4754b95aea8e84
Author: Kevin Benton <email address hidden>
Date: Fri Mar 3 11:18:28 2017 -0800

    Move conntrack zones to IPTablesFirewall

    The regular IPTablesFirewall needs zones to support safely
    clearly conntrack entries.

    In order to support the single bridge use case, the conntrack
    manager had to be refactored slightly to allow zones to be
    either unique to ports or unique to networks.

    Since all ports in a network share a bridge in the IPTablesDriver
    use case, a zone per port cannot be used since there is no way
    to distinguish which zone traffic should be checked against when
    traffic enters the bridge from outside the system.

    A zone per network is adequate for the single bridge per network
    solution since it implicitly does not suffer from the double-bridge
    cross in a single network that led to per port usage in OVS.[1]

    This had to adjust the functional firewall tests to use the correct
    bridge name now that it's relevant in the non hybrid IPTables case.

    1. Ibe9e49653b2a280ea72cb95c2da64cd94c7739da

    Closes-Bug: #1668958
    Closes-Bug: #1657260
    Change-Id: Ie88237d3fe4807b712a7ec61eb932748c38952cc
    (cherry picked from commit c76164c058a0cfeee3eb46b523a9ad012f93dd51)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 9.4.0

This issue was fixed in the openstack/neutron 9.4.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 10.0.2

This issue was fixed in the openstack/neutron 10.0.2 release.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/462614
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
