snat is used instead of dnat_and_snat when L3GW resides on chassis

Bug #1960405 reported by frigo
Affects: neutron
Status: Invalid
Importance: Medium
Assigned to: frigo

Bug Description

I run RHOSP 16.2 (Train) with OVN and enable_distributed_floating_ip.
On the same chassis I have a VM running and the L3GW port scheduled, and a FIP is associated with the VM.
I would expect the "dnat_and_snat" NAT to be used and traffic to egress with the FIP. However, since the L3GW is scheduled there too, I see the "snat" NAT being used instead (the router's NAT rules can be listed as in the sketch below).

I think this is a bug (unless I'm wrong):
- a VM with a FIP should use that FIP for egress traffic; external firewalls expect it
- the L3GW port is expected to move; if the port moves to a chassis where traffic is already flowing with the FIP, the presence of the L3GW port should not disrupt that traffic
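
To confirm which NAT rules the router actually has, the northbound DB can be queried directly; a minimal sketch, using the router name visible in the NB dump further below:

# List the router's NAT rules; with enable_distributed_floating_ip we
# expect one dnat_and_snat entry per FIP next to the router's snat entry:
ovn-nbctl lr-nat-list neutron-abf4070d-6134-4bf8-b398-a9e201b66b08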

* Reproduction steps
We assume 2 chassis, cpu34d and cpu35d.
# Create a router and make sure its gateway port is scheduled on cpu35d:

openstack router create router1 --availability-zone-hint a35
openstack router set --external-gateway external1 --fixed-ip subnet=tenant_35 router1
openstack port show 34ef841f-545f-4fab-9447-11bf18ae0e1a \
  -c binding_host_id -c device_owner -c fixed_ips
+-----------------+-------------------------------------------------------------------------------+
| Field           | Value                                                                         |
+-----------------+-------------------------------------------------------------------------------+
| binding_host_id | cpu35d                                                                        |
| device_owner    | network:router_gateway                                                        |
| fixed_ips       | ip_address='10.64.245.126', subnet_id='e955b866-324d-491f-888c-2760b713d3b0'  |
+-----------------+-------------------------------------------------------------------------------+
openstack router add subnet router1 mysub
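
As a sanity check, the scheduling can also be confirmed on the OVN side; a sketch (the lrp- name is simply the neutron port UUID with an "lrp-" prefix):

# Show the gateway chassis assigned to the router's gateway port in the NB DB:
ovn-nbctl lrp-get-gateway-chassis lrp-34ef841f-545f-4fab-9447-11bf18ae0e1a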

# We run a VM on cpu35d with floating IP 10.64.254.128 associated
openstack server create myserver --key-name stack --security-group prodlike --network private \
 --image cirros --flavor m1.small --availability-zone nova:cpu35d
openstack server add floating ip myserver 10.64.254.128
ssh cirros@10.64.254.128
ping external

# We run another VM, on the other chassis:
openstack server create myserver2 --key-name stack --security-group prodlike --network private \
 --image cirros --flavor m1.small --availability-zone nova:cpu34d
openstack server add floating ip myserver2 10.64.254.135
ssh cirros@10.64.254.135
ping external

We observe the egress traffic:
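
One way to observe it, as a sketch (the uplink interface name "eth2" is an assumption for this environment):

# On the chassis, capture egress ICMP on the provider network uplink to see
# which source IP the traffic leaves with:
tcpdump -ni eth2 icmp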

* Expected output: what did you hope to see?
from myserver: traffic coming from the fip:
IP 10.64.254.128 > external

from myserver2:
IP 10.64.254.135 > external

* Actual output:
from myserver: traffic coming from the L3GW port:
IP 10.64.245.126 > external

from myserver2:
IP 10.64.254.135 > external

* Version:
  ** OpenStack version RHOSP 16.2 (Train)
  ** RHEL 8.4
  ** deployed with tripleo (ovn-2021-central-21.09.1-20.el8fdp.x86_64, python3-neutron-15.3.5-2.20210608154816.el8ost.4.noarch)

From the OVN northbound DB:
router 6a1c6c1d-e365-4684-96d6-9b06e4ad5862 (neutron-abf4070d-6134-4bf8-b398-a9e201b66b08) (aka router1)
    port lrp-34ef841f-545f-4fab-9447-11bf18ae0e1a
        mac: "fa:16:3e:22:3f:c0"
        networks: ["10.64.245.126/27"]
        gateway chassis: [1126ea9a-2860-4e5c-9ab5-ca1e8959edee]
    port lrp-e56e108f-d731-45e1-ba45-444219572859
        mac: "fa:16:3e:90:a7:3b"
        networks: ["192.168.200.1/27"]
    nat 8e72663f-2c9d-49fa-9749-223df298c646
        external ip: "10.64.254.128"
        logical ip: "192.168.200.21"
        type: "dnat_and_snat"
    nat b0d9f69a-c8d4-4413-8f00-1c6f0b9e643f
        external ip: "10.64.245.126"
        logical ip: "192.168.200.0/27"
        type: "snat"
    nat f5808a3a-f40c-4ccf-a9fe-b83209011555
        external ip: "10.64.254.135"
        logical ip: "192.168.200.27"
        type: "dnat_and_snat"
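
One detail worth checking here: with enable_distributed_floating_ip, the dnat_and_snat rows should also have their external_mac and logical_port columns set, which is what tells OVN to perform the FIP NAT on the chassis where the VM's port resides. A sketch of how to inspect the FIP's row:

# Dump the NAT row for the FIP; for a distributed FIP, external_mac and
# logical_port should be populated:
ovn-nbctl find NAT type=dnat_and_snat external_ip=10.64.254.128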

From ovn-trace we read:

ingress(dp="router1", inport="lrp-e56e10")
------------------------------------------
 0. lr_in_admission (northd.c:10285): eth.dst == fa:16:3e:90:a7:3b && inport == "lrp-e56e10", priority 50, uuid 3711664a
    xreg0[0..47] = fa:16:3e:90:a7:3b;
    next;
 1. lr_in_lookup_neighbor (northd.c:10365): 1, priority 0, uuid a7f32214
    reg9[2] = 1;
    next;
 2. lr_in_learn_neighbor (northd.c:10374): reg9[2] == 1, priority 100, uuid bb51e95e
    next;
10. lr_in_ip_routing (northd.c:9179): ip4.dst == 0.0.0.0/0, priority 1, uuid 57ef0971
    ip.ttl--;
    reg8[0..15] = 0;
    reg0 = 10.64.245.97;
    reg1 = 10.64.245.126;
    eth.src = fa:16:3e:22:3f:c0;
    outport = "lrp-34ef84";
    flags.loopback = 1;
    next;
11. lr_in_ip_routing_ecmp (northd.c:10670): reg8[0..15] == 0, priority 150, uuid 9b75d8f6
    next;
12. lr_in_policy (northd.c:10795): 1, priority 0, uuid a68ffb22
    reg8[0..15] = 0;
    next;
13. lr_in_policy_ecmp (northd.c:10797): reg8[0..15] == 0, priority 150, uuid c2a799d9
    next;
14. lr_in_arp_resolve (northd.c:10831): ip4, priority 0, uuid 3dff7cbd
    get_arp(outport, reg0);
    /* MAC binding to 00:1c:73:00:00:11. */
    next;
17. lr_in_gw_redirect (northd.c:12774): ip4.src == 192.168.200.21 && outport == "lrp-34ef84" && is_chassis_resident("464051"), priority 100, uuid f142bc57
    eth.src = fa:16:3e:c3:29:10;
    reg1 = 10.64.254.128;
    next;
18. lr_in_arp_request (northd.c:11488): 1, priority 0, uuid 39a08290
    output;

egress(dp="router1", inport="lrp-e56e10", outport="lrp-34ef84")
---------------------------------------------------------------
 0. lr_out_undnat (northd.c:12318): ip && ip4.src == 192.168.200.21 && outport == "lrp-34ef84", priority 100, uuid 8d276994
    eth.src = fa:16:3e:c3:29:10;
    ct_dnat;

ct_dnat /* assuming no un-dnat entry, so no change */
-----------------------------------------------------
 2. lr_out_snat (northd.c:12410): ip && ip4.src == 192.168.200.0/27 && outport == "lrp-34ef84" && is_chassis_resident("cr-lrp-34ef84"), priority 156, uuid 7a29e2cf
    ct_snat(10.64.245.126);

ct_snat(ip4.src=10.64.245.126)
------------------------------
 4. lr_out_delivery (northd.c:11536): outport == "lrp-34ef84", priority 100, uuid d00dd976
    output;
    /* output to "lrp-34ef84", type "patch" */
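
For reference, a trace like the one above can be reproduced with an invocation along these lines; the destination address and TTL are assumptions, the other values come from the trace output itself:

# Trace a packet from the VM (192.168.200.21) entering the router from its
# tenant-side port and following the default route (8.8.8.8 is an arbitrary
# external destination chosen for the sketch):
ovn-trace router1 'inport == "lrp-e56e108f-d731-45e1-ba45-444219572859" &&
    eth.dst == fa:16:3e:90:a7:3b && ip4.src == 192.168.200.21 &&
    ip4.dst == 8.8.8.8 && ip.ttl == 64'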

I think what is happening is that the lr_out_snat flow guarded by is_chassis_resident("cr-lrp-34ef84") runs in the end and overrides the source IP with the gateway's own address.
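
Since that flow only matches on the chassis hosting the chassisredirect port, it can help to confirm where cr-lrp-34ef84 is actually bound; a sketch against the southbound DB:

# The chassisredirect port cr-lrp-<uuid> is claimed by the chassis hosting
# the L3 gateway; check its current binding:
ovn-sbctl find Port_Binding logical_port=cr-lrp-34ef841f-545f-4fab-9447-11bf18ae0e1a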

Tags: ovn
tags: added: ovn
Changed in neutron:
importance: Undecided → Medium
Revision history for this message
Elvira García Ruiz (elviragr) wrote:

I'm currently trying to reproduce this. I first tried with no AZs and wasn't able to reproduce the problem; my results matched the expected output. I will try it using availability zones later today.

Revision history for this message
frigo (rigault-francois) wrote:

Thanks for the help! I need to work a bit to provide more reliable reproduction steps.

I can reproduce it when the AZ where the L3GW is scheduled contains a single chassis: deploying 1 controller plus cpu34d and cpu35d, each in its own AZ, and making sure the L3GW port is bound to cpu35d. It's unrealistic that anyone would actually run such a setup.
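
For context, a sketch of how a chassis is placed into an OVN availability zone in such a setup (exact values are assumptions for this environment):

# On the gateway chassis (cpu35d), the AZ is declared through ovn-cms-options;
# "a35" matches the --availability-zone-hint used when creating the router:
ovs-vsctl set Open_vSwitch . \
    external-ids:ovn-cms-options="enable-chassis-as-gw,availability-zones=a35"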

Revision history for this message
Elvira García Ruiz (elviragr) wrote:

Hi! I tried again with a different AZ for each compute and still couldn't reproduce.

Revision history for this message
frigo (rigault-francois) wrote:

Really! Well, I could reproduce it; let me try to come up with some steps to reproduce. (If that's fine, I'm assigning this ticket to myself for the moment.)

Changed in neutron:
assignee: nobody → frigo (rigault-francois)
Revision history for this message
frigo (rigault-francois) wrote:

I can't reproduce it anymore. I think the problem was that ovn-controller was not claiming ports as needed, due to my own development environment. I can't know for sure, but to investigate such an issue I would first look for any

    Claiming virtual lport
    Releasing lport

messages in the ovn-controller logs on the chassis, and try to correlate them with actions running at that time in the neutron server.
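
A sketch of what that search could look like (the log path is an assumption; on a TripleO deployment ovn-controller typically logs under /var/log/containers):

# Look for port claim/release events and correlate their timestamps with
# neutron server activity (log path is an assumption, adjust as needed):
grep -E 'Claiming|Releasing lport' /var/log/containers/openvswitch/ovn-controller.log
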
Closing.

Changed in neutron:
status: New → Invalid