2021-06-21 11:14:19 |
Hemanth Nakkina |
bug |
|
|
added bug |
2021-06-21 11:14:52 |
Hemanth Nakkina |
attachment added |
|
test_snat_arp_entry.sh https://bugs.launchpad.net/neutron/+bug/1933092/+attachment/5506010/+files/test_snat_arp_entry.sh |
|
2021-06-21 11:15:25 |
Hemanth Nakkina |
description |
In one cloud environment, the FIP attached to the Octavia load balancer IP is not reachable. After analysis, we found that the ARP entry for the SNAT IP is missing in the qrouter namespace where the Amphora VM is running, so the return packets are not forwarded from qrouter to snat on the active l3-agent node.
Version:
Ubuntu Ussuri packages (16.3.2 point release)
DVR+SNAT+L3HA enabled
The expectation is a PERMANENT ARP entry for the SNAT IP in the qrouter namespace on all compute nodes, for example:
192.168.33.238 dev qr-4ee692e0-7a lladdr fa:16:3e:25:6a:73 used 38/38/38 probes 0 PERMANENT
How to reproduce:
I am attaching a script that simulates the problem (without Octavia) with the following steps:
1. Create a network, subnet, and router, and attach the network to the router.
2. Verify that the qrouter namespace on all compute nodes has ARP entries for the SNAT IP.
3. If the ARP entries exist, delete the network, subnet, and router.
4. Repeat steps 1 to 3 until a missing ARP entry is observed.
I can reproduce the missing ARP entry sometimes in the 3rd loop and sometimes in the 6th loop.
I observed that ARP entries for the SNAT IP are updated at the following places [1] [2], but get_snat_interfaces() and get_ports_by_subnet() are not updated with the SNAT IP in the non-working cases.
[1] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L570
[2] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L317 |
In one cloud environment, the FIP attached to the Octavia load balancer VIP is not reachable. After analysis, we found that the ARP entry for the SNAT IP is missing in the qrouter namespace where the Amphora VM is running, so the return packets are not forwarded from qrouter to snat on the active l3-agent node.
Version:
Ubuntu Ussuri packages (16.3.2 point release)
DVR+SNAT+L3HA enabled
The expectation is a PERMANENT ARP entry for the SNAT IP in the qrouter namespace on all compute nodes, for example:
192.168.33.238 dev qr-4ee692e0-7a lladdr fa:16:3e:25:6a:73 used 38/38/38 probes 0 PERMANENT
How to reproduce:
I am attaching a script that simulates the problem (without Octavia) with the following steps:
1. Create a network, subnet, and router, and attach the network to the router.
2. Verify that the qrouter namespace on all compute nodes has ARP entries for the SNAT IP.
3. If the ARP entries exist, delete the network, subnet, and router.
4. Repeat steps 1 to 3 until a missing ARP entry is observed.
I can reproduce the missing ARP entry sometimes in the 3rd loop and sometimes in the 6th loop.
I observed that ARP entries for the SNAT IP are updated at the following places [1] [2], but get_snat_interfaces() and get_ports_by_subnet() are not updated with the SNAT IP in the non-working cases.
[1] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L570
[2] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L317 |
|
2021-06-21 12:04:20 |
Dominique Poulain |
bug |
|
|
added subscriber Dominique Poulain |
2021-06-23 11:18:56 |
Akihiro Motoki |
tags |
|
l3-dvr-backlog |
|
2021-06-30 08:49:43 |
David Negreira |
bug |
|
|
added subscriber David Negreira |
2021-06-30 15:07:41 |
Slawek Kaplonski |
neutron: assignee |
|
Slawek Kaplonski (slaweq) |
|
2021-07-02 08:14:09 |
Vecdi Burak Bengi |
bug |
|
|
added subscriber Vecdi Burak Bengi |
2021-07-02 11:35:03 |
OpenStack Infra |
neutron: status |
New |
In Progress |
|
2021-07-02 11:54:39 |
Hemanth Nakkina |
neutron: assignee |
Slawek Kaplonski (slaweq) |
Hemanth Nakkina (hemanth-n) |
|
2021-07-05 11:22:14 |
OpenStack Infra |
neutron: status |
In Progress |
Fix Released |
|
2021-07-06 17:10:46 |
OpenStack Infra |
tags |
l3-dvr-backlog |
in-stable-wallaby l3-dvr-backlog |
|
2021-07-06 17:11:19 |
OpenStack Infra |
tags |
in-stable-wallaby l3-dvr-backlog |
in-stable-victoria in-stable-wallaby l3-dvr-backlog |
|
2021-07-06 17:11:49 |
OpenStack Infra |
tags |
in-stable-victoria in-stable-wallaby l3-dvr-backlog |
in-stable-ussuri in-stable-victoria in-stable-wallaby l3-dvr-backlog |
|
2021-07-07 03:22:19 |
Hemanth Nakkina |
bug task added |
|
neutron (Ubuntu) |
|
2021-07-07 03:23:00 |
Hemanth Nakkina |
nominated for series |
|
Ubuntu Focal |
|
2021-07-07 03:23:00 |
Hemanth Nakkina |
bug task added |
|
neutron (Ubuntu Focal) |
|
2021-07-07 03:23:00 |
Hemanth Nakkina |
nominated for series |
|
Ubuntu Hirsute |
|
2021-07-07 03:23:00 |
Hemanth Nakkina |
bug task added |
|
neutron (Ubuntu Hirsute) |
|
2021-07-07 03:23:00 |
Hemanth Nakkina |
nominated for series |
|
Ubuntu Groovy |
|
2021-07-07 03:23:00 |
Hemanth Nakkina |
bug task added |
|
neutron (Ubuntu Groovy) |
|
2021-07-07 03:23:00 |
Hemanth Nakkina |
nominated for series |
|
Ubuntu Impish |
|
2021-07-07 03:23:00 |
Hemanth Nakkina |
bug task added |
|
neutron (Ubuntu Impish) |
|
2021-07-07 03:23:56 |
Hemanth Nakkina |
bug task added |
|
cloud-archive |
|
2021-07-07 03:24:14 |
Hemanth Nakkina |
nominated for series |
|
cloud-archive/victoria |
|
2021-07-07 03:24:14 |
Hemanth Nakkina |
bug task added |
|
cloud-archive/victoria |
|
2021-07-07 03:24:14 |
Hemanth Nakkina |
nominated for series |
|
cloud-archive/ussuri |
|
2021-07-07 03:24:14 |
Hemanth Nakkina |
bug task added |
|
cloud-archive/ussuri |
|
2021-07-07 03:24:14 |
Hemanth Nakkina |
nominated for series |
|
cloud-archive/xena |
|
2021-07-07 03:24:14 |
Hemanth Nakkina |
bug task added |
|
cloud-archive/xena |
|
2021-07-07 03:24:14 |
Hemanth Nakkina |
nominated for series |
|
cloud-archive/wallaby |
|
2021-07-07 03:24:14 |
Hemanth Nakkina |
bug task added |
|
cloud-archive/wallaby |
|
2021-07-07 03:59:03 |
Hemanth Nakkina |
description |
In one cloud environment, the FIP attached to the Octavia load balancer VIP is not reachable. After analysis, we found that the ARP entry for the SNAT IP is missing in the qrouter namespace where the Amphora VM is running, so the return packets are not forwarded from qrouter to snat on the active l3-agent node.
Version:
Ubuntu Ussuri packages (16.3.2 point release)
DVR+SNAT+L3HA enabled
The expectation is a PERMANENT ARP entry for the SNAT IP in the qrouter namespace on all compute nodes, for example:
192.168.33.238 dev qr-4ee692e0-7a lladdr fa:16:3e:25:6a:73 used 38/38/38 probes 0 PERMANENT
How to reproduce:
I am attaching a script that simulates the problem (without Octavia) with the following steps:
1. Create a network, subnet, and router, and attach the network to the router.
2. Verify that the qrouter namespace on all compute nodes has ARP entries for the SNAT IP.
3. If the ARP entries exist, delete the network, subnet, and router.
4. Repeat steps 1 to 3 until a missing ARP entry is observed.
I can reproduce the missing ARP entry sometimes in the 3rd loop and sometimes in the 6th loop.
I observed that ARP entries for the SNAT IP are updated at the following places [1] [2], but get_snat_interfaces() and get_ports_by_subnet() are not updated with the SNAT IP in the non-working cases.
[1] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L570
[2] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L317 |
[Impact]
Load balancers deployed on the cloud are unreachable.
[Test Case]
1. Deploy OpenStack with at least 4 compute nodes and the DVR+SNAT+L3HA networking features enabled.
2. Execute the script test_snat_arp_entry.sh.
3. The script loops 20 times, each time creating a network and a router, connecting the router to the external and internal networks, and checking whether ARP entries are populated properly in the qrouter namespaces.
4. The script stops if ARP entries are missing.
5. If the script completes all 20 loops, there are no issues.
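The per-loop ARP check in the test case can be illustrated with a small Python sketch. This is not the attached test_snat_arp_entry.sh; it only assumes that the neighbour table of each qrouter namespace has already been collected as text in the format of the sample entry from the original report. The function names and the node names are hypothetical.

```python
from typing import Dict, List


def has_permanent_arp_entry(neigh_output: str, snat_ip: str) -> bool:
    """Return True if the neighbour-table text has a PERMANENT entry for snat_ip.

    Assumes one entry per line, with the IP as the first field and the
    entry state as the last field, as in the sample line from the report.
    """
    for line in neigh_output.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == snat_ip and fields[-1] == "PERMANENT":
            return True
    return False


def nodes_missing_entry(outputs_by_node: Dict[str, str], snat_ip: str) -> List[str]:
    """Return the (hypothetical) node names whose output lacks the entry."""
    return sorted(
        node
        for node, output in outputs_by_node.items()
        if not has_permanent_arp_entry(output, snat_ip)
    )


if __name__ == "__main__":
    sample = (
        "192.168.33.238 dev qr-4ee692e0-7a lladdr fa:16:3e:25:6a:73 "
        "used 38/38/38 probes 0 PERMANENT"
    )
    # "compute-1" and "compute-2" are made-up node names for illustration.
    outputs = {"compute-1": sample, "compute-2": ""}
    print(nodes_missing_entry(outputs, "192.168.33.238"))
```

A loop such as the one in the test case would stop as soon as nodes_missing_entry() returns a non-empty list.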
[Regression Potential]
The issue happens only occasionally, when a router is created, its external gateway is set, and an internal subnet is attached to it in quick succession. In other cases, the ARP entry for the SNAT IP is already added.
The fix only adds extra logic that adds the ARP entry using SNAT information retrieved from the router. In working cases, this extra logic executes the commands to add the ARP entry twice, which should not cause further issues.
[Original Bug Report]
In one cloud environment, the FIP attached to the Octavia load balancer VIP is not reachable. After analysis, we found that the ARP entry for the SNAT IP is missing in the qrouter namespace where the Amphora VM is running, so the return packets are not forwarded from qrouter to snat on the active l3-agent node.
Version:
Ubuntu Ussuri packages (16.3.2 point release)
DVR+SNAT+L3HA enabled
The expectation is a PERMANENT ARP entry for the SNAT IP in the qrouter namespace on all compute nodes, for example:
192.168.33.238 dev qr-4ee692e0-7a lladdr fa:16:3e:25:6a:73 used 38/38/38 probes 0 PERMANENT
How to reproduce:
I am attaching a script that simulates the problem (without Octavia) with the following steps:
1. Create a network, subnet, and router, and attach the network to the router.
2. Verify that the qrouter namespace on all compute nodes has ARP entries for the SNAT IP.
3. If the ARP entries exist, delete the network, subnet, and router.
4. Repeat steps 1 to 3 until a missing ARP entry is observed.
I can reproduce the missing ARP entry sometimes in the 3rd loop and sometimes in the 6th loop.
I observed that ARP entries for the SNAT IP are updated at the following places [1] [2], but get_snat_interfaces() and get_ports_by_subnet() are not updated with the SNAT IP in the non-working cases.
[1] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L570
[2] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L317 |
|
2021-07-07 10:20:49 |
Hemanth Nakkina |
tags |
in-stable-ussuri in-stable-victoria in-stable-wallaby l3-dvr-backlog |
in-stable-ussuri in-stable-victoria in-stable-wallaby l3-dvr-backlog sts sts-sru-needed |
|
2021-07-07 11:39:04 |
Hemanth Nakkina |
attachment added |
|
Debdiff for impish https://bugs.launchpad.net/cloud-archive/wallaby/+bug/1933092/+attachment/5509595/+files/lp1933092_impish.debdiff |
|
2021-07-07 11:39:35 |
Hemanth Nakkina |
attachment added |
|
Debdiff for hirsute https://bugs.launchpad.net/cloud-archive/wallaby/+bug/1933092/+attachment/5509596/+files/lp1933092_hirsute.debdiff |
|
2021-07-07 11:40:04 |
Hemanth Nakkina |
attachment added |
|
Debdiff for groovy https://bugs.launchpad.net/cloud-archive/wallaby/+bug/1933092/+attachment/5509597/+files/lp1933092_groovy.debdiff |
|
2021-07-07 11:40:23 |
Hemanth Nakkina |
attachment added |
|
Debdiff for focal https://bugs.launchpad.net/cloud-archive/wallaby/+bug/1933092/+attachment/5509598/+files/lp1933092_focal.debdiff |
|
2021-07-07 11:40:49 |
Hemanth Nakkina |
attachment added |
|
Debdiff for UCA wallaby https://bugs.launchpad.net/cloud-archive/wallaby/+bug/1933092/+attachment/5509599/+files/lp1933092_wallaby.debdiff |
|
2021-07-07 11:41:12 |
Hemanth Nakkina |
attachment added |
|
Debdiff for UCA victoria https://bugs.launchpad.net/cloud-archive/wallaby/+bug/1933092/+attachment/5509600/+files/lp1933092_victoria.debdiff |
|
2021-07-07 11:41:37 |
Hemanth Nakkina |
attachment added |
|
Debdiff for UCA ussuri https://bugs.launchpad.net/cloud-archive/wallaby/+bug/1933092/+attachment/5509601/+files/lp1933092_ussuri.debdiff |
|
2021-07-07 12:16:00 |
Hemanth Nakkina |
attachment added |
|
Debdiff for UCA xena https://bugs.launchpad.net/cloud-archive/wallaby/+bug/1933092/+attachment/5509624/+files/lp1933092_xena.debdiff |
|
2021-07-07 12:32:02 |
Ubuntu Foundations Team Bug Bot |
tags |
in-stable-ussuri in-stable-victoria in-stable-wallaby l3-dvr-backlog sts sts-sru-needed |
in-stable-ussuri in-stable-victoria in-stable-wallaby l3-dvr-backlog patch sts sts-sru-needed |
|
2021-07-07 12:32:10 |
Ubuntu Foundations Team Bug Bot |
bug |
|
|
added subscriber Ubuntu Sponsors Team |
2021-07-23 00:30:47 |
Mathew Hodson |
neutron (Ubuntu Focal): importance |
Undecided |
Medium |
|
2021-07-23 00:32:02 |
Mathew Hodson |
neutron (Ubuntu Groovy): importance |
Undecided |
Medium |
|
2021-07-23 00:33:09 |
Mathew Hodson |
neutron (Ubuntu Hirsute): importance |
Undecided |
Medium |
|
2021-07-23 00:33:24 |
Mathew Hodson |
neutron (Ubuntu Impish): importance |
Undecided |
Medium |
|
2021-07-27 18:19:18 |
Launchpad Janitor |
neutron (Ubuntu Impish): status |
New |
Fix Released |
|
2021-07-30 06:45:31 |
Hemanth Nakkina |
neutron (Ubuntu Focal): status |
New |
Fix Released |
|
2021-07-30 06:45:49 |
Hemanth Nakkina |
cloud-archive/ussuri: status |
New |
Fix Released |
|
2021-08-06 12:38:05 |
Bernard Cafarelli |
tags |
in-stable-ussuri in-stable-victoria in-stable-wallaby l3-dvr-backlog patch sts sts-sru-needed |
in-stable-ussuri in-stable-victoria in-stable-wallaby l3-dvr-backlog neutron-proactive-backport-potential patch sts sts-sru-needed |
|
2021-09-23 19:54:08 |
OpenStack Infra |
tags |
in-stable-ussuri in-stable-victoria in-stable-wallaby l3-dvr-backlog neutron-proactive-backport-potential patch sts sts-sru-needed |
in-stable-train in-stable-ussuri in-stable-victoria in-stable-wallaby l3-dvr-backlog neutron-proactive-backport-potential patch sts sts-sru-needed |
|
2021-10-13 11:53:36 |
Hemanth Nakkina |
cloud-archive/victoria: status |
New |
Fix Released |
|
2021-10-13 11:53:49 |
Hemanth Nakkina |
cloud-archive/wallaby: status |
New |
Fix Released |
|
2021-10-13 11:54:01 |
Hemanth Nakkina |
cloud-archive/xena: status |
New |
Fix Released |
|
2021-10-13 11:54:14 |
Hemanth Nakkina |
neutron (Ubuntu Groovy): status |
New |
Fix Released |
|
2021-10-13 11:54:27 |
Hemanth Nakkina |
neutron (Ubuntu Hirsute): status |
New |
Fix Released |
|