With DVR Pings to floating IPs replied with fixed-ips if VMs are on the same network

Bug #1462154 reported by Stephen Ma on 2015-06-05
This bug affects 10 people
Affects: neutron
Importance: High
Assigned to: Brian Haley

Bug Description

On my single-node devstack setup, there are 2 VMs hosted. VM1 has no floating IP assigned. VM2 has a floating IP assigned. From VM1, I ping VM2 using its floating IP. The ping output reports that the replies come from VM2's fixed IP address.
The replies should come from VM2's floating IP address.

This is a DVR problem as it doesn't happen when the L3 agent's mode is 'legacy'.

This may be a problem with the NAT rules defined by the DVR L3-agent.

I used the latest neutron code on the master branch to reproduce. The agent_mode is set to 'dvr_snat'.

Here is how the problem is reproduced:

VM1 and VM2 run on the same host.

VM1 has fixed IP of 10.11.12.4, no floating-ip associated.
VM2 has fixed IP of 10.11.12.5 floating-ip=10.127.10.226

Logged into VM1 from the qrouter namespace.

From VM1, ping 10.127.10.226. The ping output at VM1 reports that the
replies are from VM2's fixed IP address:

# ssh cirros@10.11.12.4
cirros@10.11.12.4's password:
$ ping 10.127.10.226
PING 10.127.10.226 (10.127.10.226): 56 data bytes
64 bytes from 10.11.12.5: seq=0 ttl=64 time=4.189 ms
64 bytes from 10.11.12.5: seq=1 ttl=64 time=1.254 ms
64 bytes from 10.11.12.5: seq=2 ttl=64 time=2.386 ms
64 bytes from 10.11.12.5: seq=3 ttl=64 time=2.064 ms
^C
--- 10.127.10.226 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 1.254/2.473/4.189 ms
$

If I associate a floating IP with VM1 and repeat the same test, ping reports that the replies come from VM2's floating IP:

# ssh cirros@10.11.12.4
cirros@10.11.12.4's password:
$ ping 10.127.10.226
PING 10.127.10.226 (10.127.10.226): 56 data bytes
64 bytes from 10.127.10.226: seq=0 ttl=63 time=16.750 ms
64 bytes from 10.127.10.226: seq=1 ttl=63 time=2.417 ms
64 bytes from 10.127.10.226: seq=2 ttl=63 time=1.558 ms
64 bytes from 10.127.10.226: seq=3 ttl=63 time=1.042 ms
64 bytes from 10.127.10.226: seq=4 ttl=63 time=2.770 ms
^C
--- 10.127.10.226 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 1.042/4.907/16.750 ms
$

Numan Siddique (numansiddique) wrote :

@Stephen - If you are planning to work on this bug or have already started, then please feel free to assign back to yourself.

Changed in neutron:
assignee: nobody → Numan Siddique (numansiddique)
status: New → In Progress
Numan Siddique (numansiddique) wrote :

I tested it and I was able to reproduce it.
In my setup, VM1 is 10.0.0.3 and VM2 is 10.0.0.5 with FIP 172.168.1.9; both are hosted on the same compute node.

In the q-router namespace, there is a DNAT rule (shown below)

Chain neutron-l3-agent-PREROUTING (1 references)
 pkts bytes target prot opt in out source destination
    0 0 REDIRECT tcp -- qr-+ * 0.0.0.0/0 169.254.169.254 tcp dpt:80 redir ports 9697
   12 1008 DNAT all -- * * 0.0.0.0/0 172.168.1.9 to:10.0.0.5

Because of this rule, the ping packet destined for the floating IP (172.168.1.9) is not received by the snat namespace of the controller node.

Below is the tcpdump of the q-router interface

15:48:51.418852 fa:16:3e:48:fa:e5 > fa:16:3e:01:b5:31, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 20248, offset 0, flags [DF], proto ICMP (1), length 84)
    10.0.0.3 > 172.168.1.9: ICMP echo request, id 29185, seq 0, length 64
15:48:51.418920 fa:16:3e:01:b5:31 > Broadcast, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.5 tell 10.0.0.1, length 28
15:48:51.419430 fa:16:3e:ef:ce:6b > fa:16:3e:01:b5:31, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.5 is-at fa:16:3e:ef:ce:6b, length 28
15:48:51.419446 fa:16:3e:01:b5:31 > fa:16:3e:ef:ce:6b, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 20248, offset 0, flags [DF], proto ICMP (1), length 84)
    10.0.0.3 > 10.0.0.5: ICMP echo request, id 29185, seq 0, length 64
15:48:52.418927 fa:16:3e:48:fa:e5 > fa:16:3e:01:b5:31, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 20480, offset 0, flags [DF], proto ICMP (1), length 84)
    10.0.0.3 > 172.168.1.9: ICMP echo request, id 29185, seq 1, length 64
15:48:52.418996 fa:16:3e:01:b5:31 > fa:16:3e:ef:ce:6b, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 20480, offset 0, flags [DF], proto ICMP (1), length 84)

I manually deleted the DNAT rule from iptables and it seemed to work fine initially, but it had side effects.

I am not sure if it's worth fixing.

Thanks
Numan
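
One way to read the tcpdump above: the router namespace rewrites the destination to the fixed IP, but the reply from a VM on the same subnet is delivered directly and never passes back through the router, so nothing restores the floating IP. The following is a minimal Python sketch of that reasoning, not neutron code. The addresses come from the comment above, `10.0.1.7` is a hypothetical client on another subnet, and the function name is made up for illustration.

```python
import ipaddress

# Addresses taken from the comment above; the model is a simplification.
SUBNET = ipaddress.ip_network("10.0.0.0/24")
FLOATING_IP = "172.168.1.9"
FIXED_IP = "10.0.0.5"


def reply_source_seen_by_client(client_ip: str) -> str:
    """Return the source address the client sees on the echo reply."""
    # Step 1: the request reaches the router namespace, where the
    # unconditional DNAT rule rewrites the destination to the fixed IP.
    # Connection tracking remembers the original destination.
    dnat_state = {(client_ip, FIXED_IP): FLOATING_IP}

    # Step 2: the server answers from its fixed IP. A client on the
    # server's own subnet gets the reply directly at layer 2, so the
    # reply does not re-enter the router namespace.
    reply_src = FIXED_IP
    passes_router = ipaddress.ip_address(client_ip) not in SUBNET

    # Step 3: only replies that traverse the router have the address
    # translated back by connection tracking.
    if passes_router:
        reply_src = dnat_state[(client_ip, FIXED_IP)]
    return reply_src


print(reply_source_seen_by_client("10.0.0.3"))  # client on the same subnet
print(reply_source_seen_by_client("10.0.1.7"))  # hypothetical other subnet
```

This matches the ttl=64 seen in the original report: the reply was not decremented by a router hop.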

Changed in neutron:
status: In Progress → Opinion
assignee: Numan Siddique (numansiddique) → nobody
Changed in neutron:
status: Opinion → New
Numan Siddique (numansiddique) wrote :

Maybe this can be addressed by applying the DNAT rule only to packets coming from external ports.
I will explore this further.

Changed in neutron:
assignee: nobody → Numan Siddique (numansiddique)
Jian Wen (wenjianhn) on 2015-06-26
Changed in neutron:
status: New → Confirmed
Changed in neutron:
status: Confirmed → In Progress
Stephen Ma (stephen-ma) on 2015-06-27
tags: added: l3-dvr-backlog
tags: added: juno-backport-potential kilo-backport-potential
Numan Siddique (numansiddique) wrote :

Adding the rule below fixes the problem.

 sudo ip netns exec qrouter-5d722d78-8aee-44de-a655-5a485e1d595b iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -d 10.0.0.0/24 -m conntrack --ctstate DNAT -j SNAT --to 10.0.0.1

Working on the patch.

Changed in neutron:
importance: Undecided → Medium

Change abandoned by Kyle Mestery (<email address hidden>) on branch: master
Review: https://review.openstack.org/200855
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Fix proposed to branch: master
Review: https://review.openstack.org/233334

Changed in neutron:
assignee: Numan Siddique (numansiddique) → Stephen Ma (stephen-ma)
Changed in neutron:
importance: Medium → High
ZongKai LI (zongkai) on 2015-10-29
Changed in neutron:
assignee: Stephen Ma (stephen-ma) → ZongKai LI (lzklibj)

Well, I found that maybe we can use the following workaround to handle this issue:
1) modify the existing iptables rule to do DNAT only for ingress traffic, like:
 -A neutron-l3-agent-PREROUTING -d 192.168.0.100/32 -m mac --mac-source 1A:14:03:D1:85:7B -j DNAT --to-destination 20.0.1.6
where 20.0.1.6 is the VM's private fixed IP, 192.168.0.100 is the floating IP, and 1A:14:03:D1:85:7B is the MAC of the related fpr-* device.
2) delete the floating IP address from the rfp-* device.

I tested 3 cases:
1) an external node pings the floating IP: no side effects found yet;
2) a VM with a floating IP pings an external address: no side effects found yet;
3) a VM without a floating IP, on the same node and same subnet, pings the floating IP.

For case 3), I verified with tcpdump on sg-* and fg-* that the ICMP packets go the snat way, no longer the local way.

ZongKai LI (zongkai) wrote :

I think it's OK to modify the current DNAT iptables rules for floating IPs by adding a source MAC match.

But I'm not sure whether there are any explicit or potential issues with removing the floating IP address from the rfp-* device.

This needs advice and discussion before writing code.

Stephen Ma (stephen-ma) wrote :

@ZongKai Li

I have also tried an iptables rule change together with deleting the FIP from the rfp device:

1. Changed the IP prerouting dnat rule to be:
  -A neutron-l3-agent-PREROUTING ! -i qr-+ -d <fip> -j DNAT --to-destination <fixed-ip>

2. Have the FIP added to the RFP device just as it does now. After the L3-agent did the arping to the FIP from the fip namespace, deleted the FIP from the rfp device. If the arping is not done, the status of the FIP is ERROR.

After this, pings and ssh to VM2 using the FIP worked. However, I found that after an L3 agent restart, the FIP count is reset to 0. The reason is that the L3 agent, after restarting, counts the number of FIPs configured on the rfp device. Since no FIPs are configured on the rfp device now, it thinks the FIP count is 0.
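
The `! -i qr-+` condition in the rule above relies on iptables treating a trailing `+` in an interface name as a prefix wildcard. Here is a minimal Python sketch of that matching logic, to show which ingress devices the DNAT rule would still apply to. The helper names and the device names are hypothetical.

```python
def iface_matches(pattern: str, name: str) -> bool:
    """iptables-style interface match: a trailing '+' is a prefix wildcard."""
    if pattern.endswith("+"):
        return name.startswith(pattern[:-1])
    return name == pattern


def dnat_applies(in_iface: str, excluded_pattern: str = "qr-+") -> bool:
    """Model the '! -i qr-+' condition: DNAT only if the pattern does NOT match."""
    return not iface_matches(excluded_pattern, in_iface)


# Device names below are hypothetical examples.
for dev in ("qr-1a2b3c4d-5e", "rfp-5d722d78-8"):
    print(dev, dnat_applies(dev))
```

With this condition, traffic entering from an internal router port is left untouched, while traffic arriving over the rfp device is still translated.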

ZongKai LI (zongkai) wrote :

@Stephen Ma

Thanks, that's a great help!

Regarding the FIP count, I think you mean https://github.com/openstack/neutron/blob/master/neutron/agent/l3/dvr_fip_ns.py#L248-L258 , in the scan_fip_ports method.

It will also affect https://github.com/openstack/neutron/blob/master/neutron/agent/l3/dvr_local_router.py#L129-L130 : if a floating IP disassociate event happens just after the l3-agent has restarted, the removal processing will not be completed.

It could also affect https://github.com/openstack/neutron/blob/master/neutron/agent/l3/dvr_local_router.py#L454-L459 : it will make fip_ns try to re-create the rtr_2_fip link.

That's all I can see now.

About arping, I'm not sure; I will test it later.

ZongKai LI (zongkai) wrote :

I did some more investigation on this issue. What I have so far:
1) the arping Stephen mentioned in comment #10 does not seem to be something to worry about.
2) FIP count (dist_fip_count): we can get its value another way, such as from the "ip route" output in the fip netns, not only from the rfp device.
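
A rough Python sketch of point 2), assuming each floating IP shows up in the fip namespace as a host route pointing at an fpr-* device. The sample text is made up for illustration and was not captured from a real system. The function name is hypothetical and is not the neutron implementation.

```python
import ipaddress

# Made-up sample resembling `ip route` output in a fip namespace.
SAMPLE_IP_ROUTE = """\
default via 10.127.10.1 dev fg-0123456789a
10.127.10.0/24 dev fg-0123456789a proto kernel scope link src 10.127.10.230
10.127.10.226 via 169.254.31.28 dev fpr-5d722d78-8
169.254.31.28/31 dev fpr-5d722d78-8 proto kernel scope link src 169.254.31.29
"""


def floating_ips_from_routes(route_text: str, dev_prefix: str = "fpr-") -> list[str]:
    """Collect host routes (no prefix length) that point at an fpr-* device."""
    found = []
    for line in route_text.splitlines():
        fields = line.split()
        # Skip empty lines, network routes, and lines without a device.
        if not fields or "/" in fields[0] or "dev" not in fields:
            continue
        try:
            ipaddress.IPv4Address(fields[0])
        except ValueError:
            continue  # first field is not an address, e.g. "default"
        dev_idx = fields.index("dev")
        if dev_idx + 1 >= len(fields):
            continue
        if fields[dev_idx + 1].startswith(dev_prefix):
            found.append(fields[0])
    return found


fips = floating_ips_from_routes(SAMPLE_IP_ROUTE)
print(len(fips), fips)
```

Counting this way would not depend on the floating IP being configured on the rfp device, which is the restart problem described earlier.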

Change abandoned by Carl Baldwin (<email address hidden>) on branch: master
Review: https://review.openstack.org/233334
Reason: Looks like https://review.openstack.org/#/c/240677 supersedes this.

@ZongKai Li: Do you need assistance in writing the unit tests for https://review.openstack.org/#/c/240677?

ZongKai LI (zongkai) wrote :

@Stephen Ma, Sure, that will be a great help!

And FYI, based on Carl's comments, I tried to override some related code to make Legacy/HA/DVR routers behave the same way, with no FIP address hosted on devices like qg/rfp.

The FipNamespace method "scan_fip_ports" (now renamed) is changed to:
1) get (fixed_ip, priority) pairs from the fip ns by ip rule,
2) get (fixed_ip, floatingip) pairs from the qrouter ns by iptables.
This lets the ip rules for floating IPs keep the same priority after an l3-agent restart.

For the method get_router_cidrs, both its name and the way it processes are changed.
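
The second lookup above, recovering (fixed_ip, floatingip) pairs from the qrouter namespace's iptables rules, could look roughly like this in Python. The sketch assumes rule text in the shape quoted earlier in this thread; real `iptables-save` output may differ, and the function name is hypothetical. The ip rule lookup from step 1) is not covered.

```python
import re

# Matches DNAT rules shaped like the ones quoted earlier in this thread.
DNAT_RULE = re.compile(
    r"-A neutron-l3-agent-PREROUTING\b.*?-d (?P<fip>[\d.]+)(?:/32)?\b"
    r".*?-j DNAT --to-destination (?P<fixed>[\d.]+)"
)


def fixed_to_floating(rules_text: str) -> dict[str, str]:
    """Map each fixed IP to its floating IP, based on DNAT rules only."""
    pairs = {}
    for line in rules_text.splitlines():
        match = DNAT_RULE.search(line)
        if match:
            pairs[match.group("fixed")] = match.group("fip")
    return pairs


# Sample rules: the DNAT line reuses addresses from an earlier comment.
sample = """\
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A neutron-l3-agent-PREROUTING -d 192.168.0.100/32 -j DNAT --to-destination 20.0.1.6
"""
print(fixed_to_floating(sample))
```

The metadata REDIRECT rule is ignored because it has no DNAT target.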

Changed in neutron:
assignee: ZongKai LI (lzklibj) → Stephen Ma (stephen-ma)

Fix proposed to branch: master
Review: https://review.openstack.org/246855

Changed in neutron:
assignee: Stephen Ma (stephen-ma) → ZongKai LI (lzklibj)

Fix proposed to branch: master
Review: https://review.openstack.org/246894

Alan Pevec (apevec) on 2015-11-24
tags: removed: juno-backport-potential

Change abandoned by ZongKai LI (<email address hidden>) on branch: master
Review: https://review.openstack.org/240677

Changed in neutron:
assignee: ZongKai LI (lzklibj) → nobody
status: In Progress → Incomplete

I'm not sure why this was marked incomplete.

Changed in neutron:
status: Incomplete → In Progress
assignee: nobody → ZongKai LI (lzklibj)
Carl Baldwin (carl-baldwin) wrote :

The fix is under active discussion: https://review.openstack.org/#/c/246855/

Change abandoned by Doug Wiegley (<email address hidden>) on branch: master
Review: https://review.openstack.org/246894
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Change [1] is active but a partial fix. Can someone provide a summary?

[1] https://review.openstack.org/#/c/246855/

@Carl: regarding comment #22, bug reports that fail to reach 'Fix Released' for any reason will get garbage collected only if they are in Incomplete state with no assignee and no targeted milestone. When I see that a fix (or fixes) fails to merge and the associated patches get abandoned, the bug report should be abandoned too (unless there's enough interest to pull it together, like in this case). Garbage collection gives us a 60-day grace period.

Fix proposed to branch: master
Review: https://review.openstack.org/266731

Changed in neutron:
assignee: nobody → ZongKai LI (lzklibj)
status: Incomplete → In Progress
Changed in neutron:
assignee: ZongKai LI (lzklibj) → Carl Baldwin (carl-baldwin)
Changed in neutron:
assignee: Carl Baldwin (carl-baldwin) → ZongKai LI (lzklibj)

Change abandoned by Carl Baldwin (<email address hidden>) on branch: master
Review: https://review.openstack.org/266731
Reason: This doesn't seem to be getting any attention.

The ball has been dropped here. The review to focus on is https://review.openstack.org/#/c/246855 but I still have my concerns. I'm going to try to get Stephen to work with me on this next week so that we can finally nail this down.

Hong Hui Xiao (xiaohhui) wrote :

I ran into this bug while debugging something else. My concern is not that the fixed IP is returned.

If I ping from a VM on a host other than the host of the floating IP, I get the reply from the floating IP.

$ ping 172.168.1.51 -c 1 -W 1
PING 172.168.1.51 (172.168.1.51): 56 data bytes
64 bytes from 172.168.1.51: seq=0 ttl=61 time=8.911 ms

The source VM doesn't have a floating IP associated, and the request goes this way:
Source VM -> Shared NAT at centralized router -> fip namespace -> qrouter namespace -> dest VM

The behavior should be consistent regardless of whether it is a multi-host DVR use case.
I think the multi-host DVR use case mentioned above gets the right result for the wrong reason. A request from an internal IP to a floating IP should be DNATed to the associated fixed IP and then routed to the router interface. That is the most common case for Neutron routers.

This only happens if the VMs are on the same network. If they are on different networks we don't see this issue. So I have refined the bug description to reflect it.

summary: - With DVR Pings to floating IPs replied with fixed-ips
+ With DVR Pings to floating IPs replied with fixed-ips if VMs are on the
+ same network

Fix proposed to branch: master
Review: https://review.openstack.org/289172

Changed in neutron:
assignee: ZongKai LI (lzklibj) → Swaminathan Vasudevan (swaminathan-vasudevan)
Changed in neutron:
milestone: none → mitaka-rc1

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/246894
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Any update on this one?

Yes, I am working on fixing a functional test that is failing and am also trying to add some more functional tests to address it.

Changed in neutron:
milestone: mitaka-rc1 → newton-1
tags: added: mitaka-rc-potential
tags: removed: mitaka-rc-potential

Change abandoned by Swaminathan Vasudevan (<email address hidden>) on branch: master
Review: https://review.openstack.org/289172
Reason: Will abandon this patch since we have an alternate one that works better.
https://review.openstack.org/#/c/285982/3

tags: added: liberty-backport-potential mitaka-backport-potential

Reviewed: https://review.openstack.org/285982
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=1cea77b0aafbada6cad89a6fe0f5450004aef4e1
Submitter: Jenkins
Branch: master

commit 1cea77b0aafbada6cad89a6fe0f5450004aef4e1
Author: Hong Hui Xiao <email address hidden>
Date: Mon Feb 29 11:07:15 2016 +0000

    DVR: Fix issue of SNAT rule for DVR with floating ip

    With current code, there are 2 issues.

    1) The prevent snat rule that is added for floating ip will be
    cleaned, when restarting the l3 agent. Without this rule, the fixed
    ip will be SNATed to floating ip, even if the network request is to
    an internal IP.

    2) The prevent snat rule will not be cleaned, even if the external
    device(rfp device) is deleted. So, when the floating ips are removed
    from DVR router, there are still dump rules in iptables. Restarting
    the l3 agent can clean these dump rules.

    The fix in this patch will handle DVR floating ip nat rules at the
    same step to handle nat rules for other routers(legacy router, dvr
    edge router)

    After the change in [1], the fip nat rules for external port have
    been extracted together into a method. Add all rules in that method
    in the same step can fix the issue of ping floating ip, but reply
    with fixed ip.

    [1] https://review.openstack.org/#/c/286392/

    Change-Id: I018232c03f5df2237a11b48ac877793d1cb5c1bf
    Closes-Bug: #1549311
    Related-Bug: #1462154

Changed in neutron:
status: In Progress → Fix Committed
Carl Baldwin (carl-baldwin) wrote :

I suspect we've got the wrong fix for this bug. If I'm currently following correctly, we "fixed" this by SNAT to the router gateway address in the qrouter namespace. Is that a correct understanding?

If so, then this is wrong, IMO. We can't just SNAT to the router gateway address in a bunch of qrouter namespaces all over the network. Shared SNAT has to be centralized unless you can figure out some way to coordinate the use of source ports between all of these independent places where SNAT is applied. I've tried to say this a number of times but the message doesn't seem to be getting across. Am I missing something?

Changed in neutron:
status: Fix Committed → Confirmed
status: Confirmed → In Progress
Changed in neutron:
assignee: Swaminathan Vasudevan (swaminathan-vasudevan) → Hong Hui Xiao (xiaohhui)
Eugene Nikanorov (enikanorov) wrote :

While observing this issue on one of our clouds, we found that the root cause was the VM getting an incorrect default gw from the DHCP server.
dnsmasq for some reason advertises default gw = dhcp port ip, despite what is in the dnsmasq config.
The ping reply then goes through the dhcp namespace to the snat gateway, and is not SNATed there.

One possible way to highlight the issue would be to disable IP forwarding in the dhcp namespace.
In that case the traffic would simply not come back.

Changed in neutron:
assignee: Hong Hui Xiao (xiaohhui) → Brian Haley (brian-haley)

Reviewed: https://review.openstack.org/289172
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=a388f78c8cb4b1c860bfc11029b5210955f1932d
Submitter: Jenkins
Branch: master

commit a388f78c8cb4b1c860bfc11029b5210955f1932d
Author: Hong Hui Xiao <email address hidden>
Date: Thu May 12 05:48:15 2016 +0000

    DVR: Pings to floatingip returns with fixed-ip on same network

    Pinging a floatingip of VM1 from a second VM(VM2) which has SNAT
    enabled connected to a DVR router on the same network returns
    with fixed-ip address rather than the floatingip address.

    The NAT forwarding rules for floatingip in the router namespace
    does not check for the in coming port and tries to add the rule
    for all in coming ports.

    This causes the packets that are originating from the router
    namespace to be modified and forwarded directly to the VM2 fixed
    ip instead of forwarding the traffic to the SNAT namespace.

    The fix in here will make sure that for all routers, the floatingip
    forwarding rules will be applied only to the 'rfp-' internal ports
    and not to all ports.

    Change-Id: I9453beffd94bf685afd74b0820506fb6b7c996c4
    Closes-Bug: #1462154
    Co-Authored-By: Hong Hui Xiao <email address hidden>
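
As an illustration of the commit message above, a forwarding rule restricted to one ingress device might be built like this in Python. This is a sketch under assumptions: the rule shape is taken from earlier comments in this thread with an `-i` match added, the function name and the device name are hypothetical, and it is not a copy of the merged neutron code.

```python
def floating_forward_rule(floating_ip: str, fixed_ip: str, in_device: str) -> str:
    """Build a PREROUTING DNAT rule that matches only one ingress device."""
    return (
        f"-A neutron-l3-agent-PREROUTING -d {floating_ip}/32 "
        f"-i {in_device} -j DNAT --to-destination {fixed_ip}"
    )


# Addresses from the original report; the rfp device name is made up.
print(floating_forward_rule("10.127.10.226", "10.11.12.5", "rfp-5d722d78-8"))
```

A request arriving on a qr-* port would no longer match such a rule, so it would follow the normal path toward the SNAT namespace instead of being rewritten locally.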

Changed in neutron:
status: In Progress → Fix Released

This issue was fixed in the openstack/neutron 9.0.0.0b1 development milestone.

tags: added: neutron-proactive-backport-potential

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: /stable/mitaka
Review: https://review.openstack.org/296850
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/246855
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Reviewed: https://review.openstack.org/349549
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=10532404b5f28ac9489bc300e1c9a94aa5ad86c0
Submitter: Jenkins
Branch: stable/mitaka

commit 10532404b5f28ac9489bc300e1c9a94aa5ad86c0
Author: Hong Hui Xiao <email address hidden>
Date: Mon Feb 29 11:07:15 2016 +0000

    DVR: Fix issue of SNAT rule for DVR with floating ip

    With current code, there are 2 issues.

    1) The prevent snat rule that is added for floating ip will be
    cleaned, when restarting the l3 agent. Without this rule, the fixed
    ip will be SNATed to floating ip, even if the network request is to
    an internal IP.

    2) The prevent snat rule will not be cleaned, even if the external
    device(rfp device) is deleted. So, when the floating ips are removed
    from DVR router, there are still dump rules in iptables. Restarting
    the l3 agent can clean these dump rules.

    The fix in this patch will handle DVR floating ip nat rules at the
    same step to handle nat rules for other routers(legacy router, dvr
    edge router)

    After the change in [1], the fip nat rules for external port have
    been extracted together into a method. Add all rules in that method
    in the same step can fix the issue of ping floating ip, but reply
    with fixed ip.

    [1] https://review.openstack.org/#/c/286392/

    Conflicts:
        neutron/agent/l3/dvr_fip_ns.py
        neutron/tests/functional/agent/l3/test_dvr_router.py

    Change-Id: I018232c03f5df2237a11b48ac877793d1cb5c1bf
    Closes-Bug: #1549311
    Related-Bug: #1462154
    (cherry picked from commit 1cea77b0aafbada6cad89a6fe0f5450004aef4e1)

tags: added: in-stable-mitaka

Reviewed: https://review.openstack.org/378374
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=fcb12b5c2744a08ce99286141a632c8b50e4e385
Submitter: Jenkins
Branch: stable/mitaka

commit fcb12b5c2744a08ce99286141a632c8b50e4e385
Author: Hong Hui Xiao <email address hidden>
Date: Thu May 12 05:48:15 2016 +0000

    DVR: Pings to floatingip returns with fixed-ip on same network

    Pinging a floatingip of VM1 from a second VM(VM2) which has SNAT
    enabled connected to a DVR router on the same network returns
    with fixed-ip address rather than the floatingip address.

    The NAT forwarding rules for floatingip in the router namespace
    does not check for the in coming port and tries to add the rule
    for all in coming ports.

    This causes the packets that are originating from the router
    namespace to be modified and forwarded directly to the VM2 fixed
    ip instead of forwarding the traffic to the SNAT namespace.

    The fix in here will make sure that for all routers, the floatingip
    forwarding rules will be applied only to the 'rfp-' internal ports
    and not to all ports.

    Change-Id: I9453beffd94bf685afd74b0820506fb6b7c996c4
    Closes-Bug: #1462154
    Co-Authored-By: Hong Hui Xiao <email address hidden>
    (cherry picked from commit a388f78c8cb4b1c860bfc11029b5210955f1932d)

tags: removed: neutron-proactive-backport-potential
tags: removed: kilo-backport-potential
tags: removed: liberty-backport-potential mitaka-backport-potential

This issue was fixed in the openstack/neutron 8.3.0 release.
