neutron-meter-agent - makes traffic between internal networks NATed

Bug #1539664 reported by Dmitry Sutyagin
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Mirantis OpenStack
Fix Released
Medium
MOS Neutron
6.1.x
Fix Released
High
MOS Maintenance
7.0.x
Won't Fix
Medium
MOS Maintenance
8.0.x
Won't Fix
Medium
MOS Neutron
9.x
Fix Released
High
Sergey Belous

Bug Description

If neutron-meter-agent is installed and enabled, and a meter-label is created, all traffic between internal networks becomes NATed, which is unexpected and potentially causes firewall/routing issues.

Verified on 6.1

Steps to reproduce:
1. create 2 internal non-colliding networks
2. create a router
3. add interfaces for each network in the router
4. add gateway to the router
5. spawn 2 VMs, one in each internal network
6. assign Floating IPs to VMs
7. allow ICMP from the two internal network CIDRs via security group
8. start pinging one VM from another by internal IP
8. install and configure neutron-meter-agent
9. create a meter-label with `neutron meter-label-create --shared testlabel`
10. do `crm resource restart p_neutron-l3-agent` to quickly refresh the namespaces (might be not necessary, not sure)

Expected result: ping will recover after a pause
Observed result: ping does not recover

The following traffic is observed in qrouter:

before creating meter-label:
root@node-5:~# tcpdump -i qr-ac3acd7b-10 -n -l
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on qr-ac3acd7b-10, link-type EN10MB (Ethernet), capture size 65535 bytes
15:32:43.973916 IP 192.168.1.2 > 192.168.0.2: ICMP echo request, id 35329, seq 30, length 64
15:32:43.974536 IP 192.168.0.2 > 192.168.1.2: ICMP echo reply, id 35329, seq 30, length 64
15:32:44.974347 IP 192.168.1.2 > 192.168.0.2: ICMP echo request, id 35329, seq 31, length 64
15:32:44.974940 IP 192.168.0.2 > 192.168.1.2: ICMP echo reply, id 35329, seq 31, length 64

after creating meter-label:
15:46:09.943503 IP 192.168.1.2 > 192.168.0.2: ICMP echo request, id 38657, seq 0, length 64
15:46:09.943529 IP 172.16.22.133 > 192.168.0.2: ICMP echo request, id 38657, seq 0, length 64
15:46:10.945857 IP 192.168.1.2 > 192.168.0.2: ICMP echo request, id 38657, seq 1, length 64
15:46:10.946377 IP 172.16.22.133 > 192.168.0.2: ICMP echo request, id 38657, seq 1, length 64

I will also attach iptables-save output before and after meter rules are injected.

Revision history for this message
Dmitry Sutyagin (dsutyagin) wrote :
Revision history for this message
Dmitry Sutyagin (dsutyagin) wrote :
Revision history for this message
Dmitry Sutyagin (dsutyagin) wrote :

The issue is caused by that fact that the order of following rules is reverted:

before:
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom

after:
-A POSTROUTING -j neutron-postrouting-bottom
-A POSTROUTING -j neutron-l3-agent-POSTROUTING

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

As this does not affect the default use case and only potentially causes issues, I tentatively mark this as Medium until RCA is performed.

No longer fixing Medium bugs in 8.0, closing as Won't Fix

tags: added: area-neutron
Changed in mos:
status: New → Won't Fix
Revision history for this message
Dmitry Sutyagin (dsutyagin) wrote :

Some intermediate results:

The issue is caused by the the way "agent/linux/iptables_manager.py" -> "IpTablesManager" -> "_modify_rules" function works.

Metering agent requests to add the following rules to iptables:
-A PREROUTING -j neutron-meter-PREROUTING
-A OUTPUT -j neutron-meter-OUTPUT
-A POSTROUTING -j neutron-meter-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-postrouting-bottom -j neutron-meter-snat
-A neutron-meter-snat -j neutron-meter-float-snat

One of these rules is already present in iptables, which is "-A POSTROUTING -j neutron-postrouting-bottom". This rule is necessary to be added to make this ruleset valid, otherwise the next two dependent lines will fail. This causes two issues further in the code:
1. Since it is already present in the original filter, it gets deleted from the original filter list to prevent duplicates. At this stage we lose the correct position of this rule in the filter.
2. "_modify_rules" function does not know how to insert rules in specific positions - it essentially puts new rules before all the already present. This makes it impossible to use rulesets such as the one pushed by metering agent without disturbing the order of the rules.

I see these possible solutions:
a) The proper ordering and injecting needs to be performed outside of "_modify_rules", in our case metering agent should get the full iptables and work with it to inject the rules correctly, then provide all the rules to the "_modify_rules" - not just the new ones.
b) "_modify_rules" can be more clever and insert rules not only at the beginning but trying to preserve the positions of "duplicates" and dependent elements in the original list, inserting the new rules where appropriate.

All in all it looks like the logic of iptables management needs some rework, because currently it looks like a set of hacks.

IMHO "rules" object should contain not only the rules, but also the position where to insert them, and probably a few new functions can be introduced which can fill that field depending on the input (first in chain, last in chain, after regexp element, before regexp element - something like that) and at the same time get rid of duplicates. This could make the code of "_modify_rules" way cleaner and provide more flexibility to iptables management.

Revision history for this message
Dmitry Sutyagin (dsutyagin) wrote :

The bug is born in "agent/linux/iptables_manager.py" -> "IpTablesManager" -> "__init__"

Metering agent only creates a generic IpTablesManager and then calls apply(). During the __init__ of IpTablesManager "neutron-postrouting-bottom" rule is put into it's structure:
https://github.com/openstack/neutron/blob/master/neutron/agent/linux/iptables_manager.py#L365
Which then causes this rule to change it's position during "_modify_rules" in case it was already created by a different service, as described in the above comment. In our case l3-agent has previously created this rule.

So it looks like
# Add a neutron-postrouting-bottom chain. It's intended to be
# shared among the various neutron components. We set it as the
# last chain of POSTROUTING chain.
Does not work as expected by the initial code author - even though it is set as the last rule in the current IpTablesManager's table, the currently active table is not taken into account here, but later referenced in "_modify_rules" which results in this rule's original position being discarded. Probably since this is a rule with special logic then "_modify_rules" should get special section for it too.

As far as I can see, this bug affects not only neutron-metering-agent but any other service which needs to update iptables by adding a new set of rules via IpTablesManager.

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (stable/juno)

Fix proposed to branch: stable/juno
Change author: Dmitry Sutyagin <email address hidden>
Review: https://review.fuel-infra.org/16628

Revision history for this message
Dmitry Sutyagin (dsutyagin) wrote :

I have created a tentative fix for 6.1

Please review

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (openstack-ci/fuel-6.1/2014.2)

Fix proposed to branch: openstack-ci/fuel-6.1/2014.2
Change author: Dmitry Sutyagin <email address hidden>
Review: https://review.fuel-infra.org/16636

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on openstack/neutron (stable/juno)

Change abandoned by Dmitry Sutyagin <email address hidden> on branch: stable/juno
Review: https://review.fuel-infra.org/16628
Reason: wrong branch

Revision history for this message
Dmitry Sutyagin (dsutyagin) wrote :

created an upstream bug https://bugs.launchpad.net/neutron/+bug/1544508, , attached a patch for master there

tags: added: wontfix-low
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/neutron (openstack-ci/fuel-6.1/2014.2)

Reviewed: https://review.fuel-infra.org/16636
Submitter: Vitaly Sedelnik <email address hidden>
Branch: openstack-ci/fuel-6.1/2014.2

Commit: 1ba05e819f1bbdbf44ed79c1ace764a0bbfdc540
Author: Dmitry Sutyagin <email address hidden>
Date: Thu Feb 11 10:16:05 2016

Make metering agent iptables stateless

Prevents metering agent from misplacing the
'-A POSTROUTING -j neutron-postrouting-bottom' rule
which causes premature NATing. Metering agent operates
only within "FORWARD" chain so this change should have
no impact on current functionality.

Change-Id: I1a4a7f9d2f986cb257d1f1e8fd378c4909645770
Closes-bug: #1539664

tags: added: on-verification
Revision history for this message
Alexander Gromov (agromov) wrote :

Verified on 6.1+mu5.

tags: removed: on-verification
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Hi dev team, could you please include the fix in MOS 9.0 (as far it is fixed in 6.1 updates already) ?

Thank you!

tags: added: keep-in-9.0
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (9.0/mitaka)

Fix proposed to branch: 9.0/mitaka
Change author: Dmitry Sutyagin <email address hidden>
Review: https://review.fuel-infra.org/18878

tags: added: wait-for-stable
Revision history for this message
Sergey Belous (sbelous) wrote :

Patch to stable/mitaka on review https://review.openstack.org/#/c/300288/

Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Priority changed to High (we already have inline fix on review for 9.0 release).
Please make sure that the fix will be included to MOS 9.0 release.

Thank you!

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on openstack/neutron (9.0/mitaka)

Change abandoned by Sergey Belous <email address hidden> on branch: 9.0/mitaka
Review: https://review.fuel-infra.org/18878
Reason: Waiting for merge this patch https://review.openstack.org/#/c/300288/ in stable/mitaka

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/neutron (9.0/mitaka)

Reviewed: https://review.fuel-infra.org/19544
Submitter: Pkgs Jenkins <email address hidden>
Branch: 9.0/mitaka

Commit: 9e90051f07e35d29848e7f679a02d0cd992d541a
Author: Jenkins <email address hidden>
Date: Wed Apr 13 13:22:17 2016

Merge the tip of origin/stable/mitaka into origin/9.0/mitaka

05a4a34 Notify resource_versions from agents only when needed
fff909e Values for [ml2]/physical_network_mtus should not be unique
6814411 Imported Translations from Zanata
a2d1c46 firewall: don't warn about a driver that does not accept bridge
fa5eb53 Add uselist=True to subnet rbac_entries relationship
c178bd9 Fix race conditions in IP availability API tests
ee32ea5 Switched from fixtures to mock to mock out starting RPC consumers
77696d8 Imported Translations from Zanata
3190494 Fix zuul_cloner errors during tox job setup
04fb147 Refactor and fix dummy process fixture
844cae4 Switches metering agent to stateless iptables
19ea6ba Remove obsolete keepalived PID files before start
aafa702 Add IPAllocation object to session info to stop GC
005d49d Ensure metadata agent doesn't use SSL for UNIX socket
905fd05 DVR: Increase the link-local address pair range
93d719a SG protocol validation to allow numbers or names
33d3b8c L3 agent: match format used by iptables
7b2fcaa Use right class method in IP availability tests
93cdf8e Make L3 HA interface creation concurrency safe
d934669 ovsfw: Remove vlan tag before injecting packets to port
33c01f4 Imported Translations from Zanata
05ac012 test_network_ip_availability: Skip IPv6 tests when configured so
38894cc Retry updating agents table in case of deadlock
aac460b Allow to use several nics for physnet with SR-IOV
90b9cd3 port security: gracefully handle resources with no bindings
7174bc4 Ignore exception when deleting linux bridge if doesn't exist
93d29d1 Don't delete br-int to br-tun patch on startup
211e0a6 functional: Update ref used from ovs branch-2.5.
c6ef57a ovs-fw: Mark conntrack entries invalid if no rule is matched
ef6ea62 l3: Send notify on router_create when ext gw is specified
eb8ddb9 Move db query to fetch down bindings under try/except
da1eee3 Close XenAPI sessions in neutron-rootwrap-xen-dom0
1d51172 Watch for 'new' events in ovsdb monitor for ofport
bd3e9c3 Removes host file contents from DHCP agent logs

Closes-Bug: #1569735
Closes-Bug: #1569738
Closes-Bug: #1539664
Closes-Bug: #1560727
Closes-Bug: #1558613
Closes-Bug: #1545756

Change-Id: Ia30076744f13666f950fee78a86a8c81f7207206

Revision history for this message
Kristina Berezovskaia (kkuznetsova) wrote :

Verify on 9.0:
[root@fuel ~]# shotgun2 short-report
cat /etc/fuel_build_id:
 389
cat /etc/fuel_build_number:
 389
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-9.0.0-1.mos6346.noarch
 fuel-bootstrap-cli-9.0.0-1.mos282.noarch
 fuel-migrate-9.0.0-1.mos8378.noarch
 rubygem-astute-9.0.0-1.mos745.noarch
 fuel-misc-9.0.0-1.mos8378.noarch
 network-checker-9.0.0-1.mos72.x86_64
 fuel-mirror-9.0.0-1.mos136.noarch
 fuel-openstack-metadata-9.0.0-1.mos8693.noarch
 fuel-notify-9.0.0-1.mos8378.noarch
 nailgun-mcagents-9.0.0-1.mos745.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8693.noarch
 python-fuelclient-9.0.0-1.mos315.noarch
 fuelmenu-9.0.0-1.mos270.noarch
 fuel-9.0.0-1.mos6346.noarch
 fuel-utils-9.0.0-1.mos8378.noarch
 fuel-setup-9.0.0-1.mos6346.noarch
 fuel-library9.0-9.0.0-1.mos8378.noarch
 shotgun-9.0.0-1.mos88.noarch
 fuel-agent-9.0.0-1.mos282.noarch
 fuel-ui-9.0.0-1.mos2696.noarch
 fuel-ostf-9.0.0-1.mos934.noarch
 python-packetary-9.0.0-1.mos136.noarch
 fuel-nailgun-9.0.0-1.mos8693.noarch
(neutron + vxlan)

Reproduced on 9.0 iso 134 (2016-03-30)

Steps:
1) download and install metering agent
2) check rules
3) create meter-label
4) check rules:
sudo ip net e qrouter-<id_router> iptables-save

Now the rules are in the correct order:
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.