OVN Router sending ARP instead of sending traffic to the gateway
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | New | Undecided | Unassigned |
Bug Description
Summary:
When a VM has a Floating IP, any attempt to reach a destination on a routed network results in an ARP request being sent for that destination instead of the traffic being sent to the gateway.
Description:
I have two VMs:
$ openstack server list -f yaml
- Flavor: ''
ID: f875fc7c-
Image: Fedora_32
Name: fedora_no_fip
Networks: infra_external=
Status: ACTIVE
- Flavor: ''
ID: 4dd45015-
Image: Fedora_32
Name: fedora_test
Networks: infra_internal=
Status: ACTIVE
The one without the FIP (fedora_no_fip) can reach anything without problems. For example, ping 1.1.1.1:
[root@overcloud
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
22:52:13.970470 P fa:16:3e:47:ee:dd ethertype IPv4 (0x0800), length 100: (tos 0x0, ttl 64, id 59289, offset 0, flags [DF], proto ICMP (1), length 84)
172.20.10.201 > 1.1.1.1: ICMP echo request, id 1, seq 36, length 64
22:52:13.978619 P 00:e0:67:15:cc:2f ethertype 802.1Q (0x8100), length 104: vlan 4, p 0, ethertype IPv4, (tos 0x0, ttl 56, id 38296, offset 0, flags [none], proto ICMP (1), length 84)
1.1.1.1 > 172.20.10.201: ICMP echo reply, id 1, seq 36, length 64
But, when I try the same from the VM with the Floating IP, I can see that an ARP is being sent for 1.1.1.1:
[root@overcloud
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
22:55:42.779383 B fa:16:3e:d7:80:3a ethertype 802.1Q (0x8100), length 48: vlan 4, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 1.1.1.1 tell 172.20.10.107, length 28
22:55:42.779476 Out fa:16:3e:d7:80:3a ethertype ARP (0x0806), length 44: Ethernet (len 6), IPv4 (len 4), Request who-has 1.1.1.1 tell 172.20.10.107, length 28
22:55:42.779510 Out fa:16:3e:d7:80:3a ethertype ARP (0x0806), length 44: Ethernet (len 6), IPv4 (len 4), Request who-has 1.1.1.1 tell 172.20.10.107, length 28
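To spot the same symptom in a longer capture, the ARP requests can be filtered by whether their target lies outside the external subnet. This is only a minimal sketch: it assumes tcpdump's `Request who-has X tell Y` text format as it appears above and the 172.20.0.0/16 CIDR reported for the external subnet in this report. The function name `off_subnet_arp_requests` is hypothetical.

```python
import ipaddress
import re

# Matches the ARP request text tcpdump prints, for example
# "Request who-has 1.1.1.1 tell 172.20.10.107"
ARP_REQUEST = re.compile(r"Request who-has ([0-9.]+) tell ([0-9.]+)")


def off_subnet_arp_requests(lines, cidr):
    """Return (target, sender) pairs for ARP requests whose target is outside cidr."""
    subnet = ipaddress.ip_network(cidr)
    hits = []
    for line in lines:
        match = ARP_REQUEST.search(line)
        if match is None:
            continue  # not an ARP request line
        target, sender = match.groups()
        if ipaddress.ip_address(target) not in subnet:
            hits.append((target, sender))
    return hits


capture = [
    "22:55:42.779383 B fa:16:3e:d7:80:3a ethertype 802.1Q (0x8100), length 48: "
    "vlan 4, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), "
    "Request who-has 1.1.1.1 tell 172.20.10.107, length 28",
    "172.20.10.201 > 1.1.1.1: ICMP echo request, id 1, seq 36, length 64",
]

print(off_subnet_arp_requests(capture, "172.20.0.0/16"))
```

An ARP request for an address inside 172.20.0.0/16 (the gateway, for instance) is expected and is not reported by this filter.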
The router has the gateway network set:
$ openstack router show infra_r1 -f yaml
admin_state_up: true
availability_
availability_zones: null
created_at: '2020-05-
description: ''
external_
enable_snat: true
external_
- ip_address: 172.20.10.118
subnet_id: bf21b56a-
network_id: 2561f8db-
flavor_id: null
id: 15c1b81d-
interfaces_info:
- ip_address: 192.168.10.1
port_id: 65a28088-
subnet_id: 27382151-
location:
cloud: ''
project:
domain_id: null
domain_name: Default
id: 0e446e02e899455
name: admin
region_name: regionOne
zone: null
name: infra_r1
project_id: 0e446e02e899455
revision_number: 3
routes: []
status: ACTIVE
tags: []
updated_at: '2020-05-
Reproducer for me has been:
1. Deploy OpenStack with OVN DVR (Using TripleO, so the settings by default here: https:/
2. Create an external network that is a VLAN:
$ openstack network show infra_external -f yaml
admin_state_up: true
availability_
availability_zones: []
created_at: '2020-05-
description: ''
dns_domain: ''
id: 2561f8db-
ipv4_address_scope: null
ipv6_address_scope: null
is_default: false
is_vlan_
location:
cloud: ''
project:
domain_id: null
domain_name: Default
id: 0e446e02e899455
name: admin
region_name: regionOne
zone: null
mtu: 9000
name: infra_external
port_security_
project_id: 0e446e02e899455
provider:
provider:
provider:
qos_policy_id: null
revision_number: 2
router:external: true
segments: null
shared: false
status: ACTIVE
subnets:
- bf21b56a-
tags: []
updated_at: '2020-05-
3. Subnet with the corresponding details:
$ openstack subnet show infra_external_
allocation_pools:
- end: 172.20.10.250
start: 172.20.10.70
cidr: 172.20.0.0/16
created_at: '2020-05-
description: ''
dns_nameservers:
- 8.8.8.8
dns_publish_
enable_dhcp: true
gateway_ip: 172.20.0.254
host_routes: []
id: bf21b56a-
ip_version: 4
ipv6_address_mode: null
ipv6_ra_mode: null
location:
cloud: ''
project:
domain_id: null
domain_name: Default
id: 0e446e02e899455
name: admin
region_name: regionOne
zone: null
name: infra_external_
network_id: 2561f8db-
prefix_length: null
project_id: 0e446e02e899455
revision_number: 0
segment_id: null
service_types: []
subnetpool_id: null
tags: []
updated_at: '2020-05-
4. Internal network and a router, with the infra_external network set as the gateway (output provided earlier)
5. Create two VMs, one with a FIP and one directly attached to infra_external
6. Try to ping anything that would need to be routed by the gateway for infra_external_
gateway_ip: 172.20.0.254
I can ping that gateway fine; the issue only appears when the traffic would need to be routed by 172.20.0.254.
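For reference, the behaviour I would expect follows the usual next-hop rule: ARP for the destination itself only when it is inside the connected subnet, otherwise ARP for the gateway. The sketch below illustrates that rule with the CIDR and gateway_ip from the subnet in step 3. It is not a description of how OVN implements routing, and the function name `arp_target` is hypothetical.

```python
import ipaddress


def arp_target(dst_ip, subnet_cidr, gateway_ip):
    """Return the IP address a router on subnet_cidr would be expected to ARP for.

    On-link destinations are resolved directly; everything else is
    resolved via the gateway (assumes a single default route).
    """
    subnet = ipaddress.ip_network(subnet_cidr)
    dst = ipaddress.ip_address(dst_ip)
    if dst in subnet:
        return str(dst)
    return gateway_ip


# Values taken from the external subnet in this report.
CIDR = "172.20.0.0/16"
GATEWAY = "172.20.0.254"

for destination in ("172.20.0.254", "172.20.10.201", "1.1.1.1"):
    print(destination, "->", arp_target(destination, CIDR, GATEWAY))
```

Under this rule a ping to 1.1.1.1 should produce an ARP request for 172.20.0.254, not for 1.1.1.1 as seen in the capture from the VM with the Floating IP.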
Versions:
$ cat /etc/redhat-release
CentOS Linux release 8.1.1911 (Core)
# rpm -qa | grep ovn
ovn-20.
puppet-
ovn-host-
$ rpm -qa | grep tripleo-
openstack-
For the containers, I'm just using current-tripleo, but let me know if there is something else specific that I can get for you:
# podman image list | egrep 'ovn|neutron'
docker.
docker.
docker.
docker.
docker.
I'll share some ovn-trace outputs in the comments. This is getting a bit lengthy.
Expected Results:
OVN shouldn't send an ARP request for a destination on a routed network; it should send the traffic to the gateway.
Severity for me is not very high. It's just a home lab, but if there is a wider issue it could be a problem.
Two logical switches, one for each network:
()[root@overcloud-controller-0 /]# ovn-nbctl ls-list
e5bcc681-9bec-42b7-bedf-12ce8e9611de (neutron-2561f8db-e1c8-4185-9056-0883686a8a53)
0304d31c-f512-43bc-949e-4d45f754082c (neutron-9d4c5e96-bba6-4716-adb2-3d6c2ddd3903)
()[root@overcloud-controller-0 /]# ovn-nbctl show e5bcc681-9bec-42b7-bedf-12ce8e9611de
switch e5bcc681-9bec-42b7-bedf-12ce8e9611de (neutron-2561f8db-e1c8-4185-9056-0883686a8a53) (aka infra_external)
    port 9075cf11-d5e4-4e60-84f8-5dd38ff72833
        type: localport
        addresses: ["fa:16:3e:91:da:cc 172.20.10.70"]
    port e696d78b-13c4-4781-8bd5-f6a7db16daee
        type: router
        router-port: lrp-e696d78b-13c4-4781-8bd5-f6a7db16daee
    port provnet-2561f8db-e1c8-4185-9056-0883686a8a53
        type: localnet
        tag: 4
        addresses: ["unknown"]
    port 75a72825-0c32-4a86-8896-72b9cbfb6995
        addresses: ["fa:16:3e:47:ee:dd 172.20.10.201"]
()[root@overcloud-controller-0 /]# ovn-nbctl show 0304d31c-f512-43bc-949e-4d45f754082c
switch 0304d31c-f512-43bc-949e-4d45f754082c (neutron-9d4c5e96-bba6-4716-adb2-3d6c2ddd3903) (aka infra_internal)
    port b975d1ca-3b33-4177-bbcb-d07439f1638e
        type: localport
        addresses: ["fa:16:3e:be:97:a0 192.168.10.10"]
    port 65a28088-761c-461c-912c-7d0a3781ab6b
        type: router
        router-port: lrp-65a28088-761c-461c-912c-7d0a3781ab6b
    port 12427559-a937-4b50-a64c-aef54a3284d8
        addresses: ["fa:16:3e:7c:36:ff 192.168.10.102"]
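The trace that follows takes its arguments from the listings above: the datapath name, the VM's logical port, its MAC and IP, and a destination MAC. A small helper makes it easier to repeat the trace with a different eth.dst or ip4.dst. This is only a sketch: `build_ovn_trace` is a hypothetical name, and it builds the `ovn-trace DATAPATH MICROFLOW` argument list without running anything.

```python
import shlex


def build_ovn_trace(datapath, inport, eth_src, eth_dst, ip_src, ip_dst):
    """Build the argv for an ovn-trace call describing one IPv4 packet."""
    microflow = (
        f'inport == "{inport}" && eth.src == {eth_src} && ip4.src == {ip_src} '
        f"&& eth.dst == {eth_dst} && ip4.dst == {ip_dst}"
    )
    return ["ovn-trace", datapath, microflow]


# Values copied from the infra_internal switch listing in this report.
argv = build_ovn_trace(
    datapath="infra_internal",
    inport="12427559-a937-4b50-a64c-aef54a3284d8",
    eth_src="fa:16:3e:7c:36:ff",
    eth_dst="fa:16:3e:be:97:a0",
    ip_src="192.168.10.102",
    ip_dst="1.1.1.1",
)

# shlex.join quotes the microflow so it can be pasted into a shell.
print(shlex.join(argv))
```

Passing the list to `subprocess.run` on a host where `ovn-trace` can reach the southbound database would execute it; that step is left out here because it depends on the deployment.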
()[root@overcloud-controller-0 /]# ovn-trace infra_internal 'inport == "12427559-a937-4b50-a64c-aef54a3284d8" && eth.src == fa:16:3e:7c:36:ff && ip4.src == 192.168.10.102 && eth.dst == fa:16:3e:be:97:a0 && ip4.dst == 1.1.1.1'
# ip,reg14=0x3,vlan_tci=0x0000,dl_src=fa:16:3e:7c:36:ff,dl_dst=fa:16:3e:be:97:a0,nw_src=192.168.10.102,nw_dst=1.1.1.1,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
ingress(dp="infra_internal", inport="124275")
---------------------------------------------
0. ls_in_port_sec_l2 (ovn-northd.c:4516): inport == "124275" && eth.src == {fa:16:3e:7c:36:ff}, priority 50, uuid f869e22a
    next;
1. ls_in_port_sec_ip (ovn-northd.c:4188): inport == "124275" && eth.src == fa:16:3e:7c:36:ff && ip4.src == {192.168.10.102}, priority 90, uuid ec3f6e49
    next;
3. ls_in_pre_acl (ovn-northd.c:4706): ip, priority 100, uuid 8ca99cd5
    reg0[0] = 1;
    next;
5. ls_in_pre_stateful (ovn-northd.c:4895): reg0[0] == 1, priority 100, uuid dd15ba61
    ct_next;
ct_next(ct_state=est|trk /* default (use --ct to customize) */)
---------------------------------------------------------------
6. ls_in_acl (ovn-northd.c:5086): (!ct.trk || (!ct.new && ct.est && !ct.rpl && ct_label.blocked == 0)) && (inport == @pg_63bc7fdf_3061_410f_9e82_80278b987928 && ip4), priority 2002, uuid 655e4046
    next;
19. ls_in_l2_lkup (ovn-northd.c:6757): eth.dst == fa:16:3e:be:97:a0, priority 50, uuid e74c5d8a
    outport = "b975d1";
    output;
egress(dp="infra_internal", inport="124275", outport="b975d1")
--------------------------------------------------------------
1. ls_out_pre_acl (ovn-northd.c:4708): ip, priority 100, uuid 79c0a63a
    reg0[0] = 1;
    next;
2. ls_out_pre_stateful (ovn-northd.c...