VPNaaS connection not started after linuxbridge router switchover or other changes - duplicate IPtables reported regarding --pol ipsec

Bug #1943449 reported by Christian Rohmann
This bug affects 3 people
Affects: neutron
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

On OpenStack Ussuri (on Ubuntu bionic) running Neutron using the linux bridge driver we observed an issue with VPNaaS ...

After a user reconfigured an existing VPN setup via Terraform, the site connections remained in the "DOWN" state:

```
openstack vpn ipsec site connection list
+--------------------------------------+------------------------------------------+-----------------+--------------------------+--------+
| ID | Name | Peer Address | Authentication Algorithm | Status |
+--------------------------------------+------------------------------------------+-----------------+--------------------------+--------+
| b7634f92-bac8-4fed-aa28-e9181482176e | site-connection-1-REDACTED | xxx.xxx.xxx.99 | psk | DOWN |
| 543fa57d-e15e-444b-b5d2-20752196f57a | site-connection-2-REDACTED | xxx.xxx.xxx.156 | psk | DOWN |
+--------------------------------------+------------------------------------------+-----------------+--------------------------+--------+
```
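A minimal sketch of how the affected connections could be picked out of such a listing programmatically. It assumes the listing is requested in JSON form (`-f json`) and that the JSON keys match the column headers shown above; the helper name `down_connections` is hypothetical, and the sample statuses are mixed purely for illustration.

```python
import json


def down_connections(listing_json):
    """Return the names of site connections whose Status is not ACTIVE."""
    rows = json.loads(listing_json)
    return [row["Name"] for row in rows if row.get("Status") != "ACTIVE"]


# Illustrative sample shaped like the table above (assumed JSON keys).
sample = json.dumps([
    {"ID": "b7634f92-bac8-4fed-aa28-e9181482176e",
     "Name": "site-connection-1-REDACTED", "Status": "DOWN"},
    {"ID": "543fa57d-e15e-444b-b5d2-20752196f57a",
     "Name": "site-connection-2-REDACTED", "Status": "ACTIVE"},
])

print(down_connections(sample))
```

Such a check could be run periodically after a router failover to notice connections that never come back up.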

At first, only the endpoint group and the connection were reconfigured. When the connection then remained DOWN, the user also tried tearing down the connection, endpoints, VPN service, policies, ... leaving only the network and the router in place (both are actively used and host other resources such as instances).

Looking at the neutron logs on the active network node (HA router), we saw tons of messages about duplicate iptables rules, all of them for this very setup and with pol "ipsec":

[...]
neutron-l3-agent.log:2021-09-13 10:46:24.382 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.16.0/24 -d 10.30.122.0/24 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.382 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.8.0/21 -d 10.30.122.0/24 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.382 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.0.0/21 -d 10.30.122.0/24 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.382 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.16.0/24 -d 10.10.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.383 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.8.0/21 -d 10.10.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.383 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.0.0/21 -d 10.10.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.383 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.16.0/24 -d 10.96.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.383 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.8.0/21 -d 10.96.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.384 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.0.0/21 -d 10.96.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.384 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.16.0/24 -d 10.30.122.0/24 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.384 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.8.0/21 -d 10.30.122.0/24 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.384 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.0.0/21 -d 10.30.122.0/24 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.385 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.0.0/21 -d 192.168.0.0/24 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.385 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.16.0/24 -d 10.10.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.385 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.8.0/21 -d 10.10.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.385 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.0.0/21 -d 10.10.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.386 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.16.0/24 -d 10.96.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.386 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.8.0/21 -d 10.96.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
neutron-l3-agent.log:2021-09-13 10:46:24.386 4087474 WARNING neutron.agent.linux.iptables_manager [req-a169b585-589e-4d62-95d0-2eb6c2156ec2 df1d997e986b41d0b273945def7df72d 08cf58d22b314283b77bfa68a8611001 - - -] Duplicate iptables rule detected. This may indicate a bug in the iptables rule generation code. Line: -A neutron-l3-agent-POSTROUTING -s 10.0.0.0/21 -d 10.96.0.0/16 -m policy --dir out --pol ipsec -j ACCEPT
[...]
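
To get an overview of which rules are reported and how often, the warnings can be tallied with a short script. This is only a sketch: it assumes each warning sits on a single line in the format shown above, and the names `RULE_RE` and `count_duplicate_rules` are hypothetical helpers, not part of neutron.

```python
import re
from collections import Counter

# Captures the rule text that follows "Line: " in the l3-agent warning.
RULE_RE = re.compile(r"Duplicate iptables rule detected\..*?Line: (?P<rule>-A .+)$")


def count_duplicate_rules(log_lines):
    """Count how often each iptables rule is reported as a duplicate."""
    counts = Counter()
    for line in log_lines:
        match = RULE_RE.search(line.rstrip("\n"))
        if match:
            counts[match.group("rule").strip()] += 1
    return counts


# Shortened sample lines; the request and project IDs are omitted here.
PREFIX = ("WARNING neutron.agent.linux.iptables_manager [-] "
          "Duplicate iptables rule detected. This may indicate a bug in the "
          "iptables rule generation code. Line: ")
RULE_A = ("-A neutron-l3-agent-POSTROUTING -s 10.0.16.0/24 -d 10.30.122.0/24 "
          "-m policy --dir out --pol ipsec -j ACCEPT")
RULE_B = ("-A neutron-l3-agent-POSTROUTING -s 10.0.0.0/21 -d 192.168.0.0/24 "
          "-m policy --dir out --pol ipsec -j ACCEPT")
sample_lines = [PREFIX + RULE_A, PREFIX + RULE_B, PREFIX + RULE_A, "unrelated line"]

for rule, seen in count_duplicate_rules(sample_lines).most_common():
    print(seen, rule)
```

Against a real log, the function could be fed the open file object for neutron-l3-agent.log directly, since it only iterates over lines.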

We then simply restarted the neutron-l3-agent.service which caused the active router instance to switch to another node and things got back in working order quite quickly:

```
openstack vpn ipsec site connection list
+--------------------------------------+------------------------------------------+-----------------+--------------------------+--------+
| ID | Name | Peer Address | Authentication Algorithm | Status |
+--------------------------------------+------------------------------------------+-----------------+--------------------------+--------+
| b7634f92-bac8-4fed-aa28-e9181482176e | site-connection-1-REDACTED | xxx.xxx.xxx.99 | psk | ACTIVE |
| 543fa57d-e15e-444b-b5d2-20752196f57a | site-connection-2-REDACTED | xxx.xxx.xxx.156 | psk | ACTIVE |
+--------------------------------------+------------------------------------------+-----------------+--------------------------+--------+
```

However, the messages about duplicate iptables rules were logged again on the restarted, now inactive network / router node. So there must be some cleanup / rule generation issue, and I believe the problem will return once the active router instance is switched back to the previous master node.

I tried to find an existing bug and found
 * https://bugs.launchpad.net/neutron/+bug/1447651
 * https://bugs.launchpad.net/neutron/+bug/1845145

to be somewhat related (duplicate iptables rules).

Tags: vpnaas
summary: - VPNaaS reconfiguration causes duplicate IPtable rules causes the VPN
+ VPNaaS reconfiguration creates duplicate IPtables rules causes the VPN
connection to remain DOWN
Akihiro Motoki (amotoki)
tags: added: vpnaas
Revision history for this message
Christian Rohmann (christian-rohmann) wrote (last edit ): Re: VPNaaS reconfiguration creates duplicate IPtables rules causes the VPN connection to remain DOWN

@amotoki can I deliver any more data to help narrow down the issue here?

Revision history for this message
Lajos Katona (lajos-katona) wrote :

Hi, do you have a chance to provide reproduction steps with devstack, perhaps on current master? That would make it easier to check the issue and find a solution.

Revision history for this message
Christian Rohmann (christian-rohmann) wrote :

Hey Lajos, sorry for the late response, but thanks for your kind reply.

Yes, I shall be setting up a devstack with VPNaaS and get back to you with more details.

Revision history for this message
Niklas Schwarz (schwarzn) wrote :

Hi Lajos,

I have created the steps to reproduce the issue with the current release.

In the attachment you will find configuration files for a multinode devstack setup and a script to create the required resources for the VPN. I have tested this setup with 1 control node and 2 additional compute nodes.

Revision history for this message
Christian Rohmann (christian-rohmann) wrote :

@Lajos I know you are in active search of new maintainers for the VPNaaS feature:

http://lists.openstack.org/pipermail/openstack-discuss/2022-April/028123.html

Am I reading the post correctly that nobody is currently actively working on bugs?
Or is there any chance that someone will pick up this very issue with duplicate iptables rules before a maintainer is found? We will gladly test more or provide further details if you let me know what you need.

Revision history for this message
Mohammed Naser (mnaser) wrote :
Revision history for this message
Christian Rohmann (christian-rohmann) wrote :

Hello Mohammed. Sorry for the extreme delay in responding to you.
I have now set up a cloud with 3 ctrl nodes and 4 compute nodes on Ubuntu Focal 20.04 running OpenStack Xena and can quite quickly reproduce the issue(s), with the Neutron L3 agent complaining

a) about duplicate iptable rules (in relation to IPSEC) and
b) a non-working IPSEC connection on keepalived after switching the master of a router to a new node

Attached please find the Terraform code setting up a router + network in two projects and then connecting them both using Neutron VPNaaS.

I set up an instance on each side for easy connectivity testing and debugging.

Revision history for this message
Christian Rohmann (christian-rohmann) wrote :

I did reproduce the issue as written just before ... and attached all the DEBUG logs of the neutron daemons (control plane and compute/network nodes)...

My steps were:

1) Create two projects "vpndebug1" and "vpndebug2" ...
2) Run the terraform code to set everything up, resulting in two routers:

vpndebug1-gateway ef8664da-5a53-467a-ad40-1a99ccc2d817
vpndebug2-gateway d969ffea-9d4c-4a3a-9ead-0d66624142db

3) Validate connectivity between VMs / instances
4) Trigger backup/master switches of the keepalived for the Neutron routers.
5) Monitor connectivity via the IPSEC VPN failing - one side reports "DOWN" via the API, the other doesn't.

In the logs there were no errors or warnings (yet), but I observed that only "ipsec status" was called; no other ipsec-related commands were run.

Without doing any more switch-overs of the router, I then toggled the "ipsec connection" and also the "ipsec service" itself via "set --disable" and "set --enable". At some point "ipsec start" and e.g. "ipsec stroke up-nb 2c446a92-0113-4418-b066-931a2afb58fd" were then called on the active router, and the connection came back up / was healthy again.

Together with this I also observed the reporting of duplicate iptables rules.

summary: - VPNaaS reconfiguration creates duplicate IPtables rules causes the VPN
- connection to remain DOWN
+ VPNaaS connection not started after linuxbridge router switchover or
+ other changes - duplicate IPtables reported regarding --pol ipsec
Revision history for this message
Christian Rohmann (christian-rohmann) wrote :

I narrowed it down further while having the issue again in PROD:

1) Moving the active router (replicated interface) does not seem to fix the issue once the service is DOWN and not working.

2) Running "openstack vpn service set --disable $SERVICE_ID" followed by "openstack vpn service set --enable $SERVICE_ID" does seem to fix things though.

So I strongly believe this is somewhat of a race condition or deadlock.
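
The disable/enable sequence from step 2 can be scripted. The following is a minimal sketch, assuming the `openstack` CLI with the VPNaaS plugin is installed and credentials are already loaded in the environment; `toggle_commands` and `toggle_vpn_service` are hypothetical helper names, and the service ID shown is a placeholder.

```python
import subprocess


def toggle_commands(service_id):
    """Build the disable and enable commands quoted in step 2."""
    base = ["openstack", "vpn", "service", "set"]
    return [base + ["--disable", service_id], base + ["--enable", service_id]]


def toggle_vpn_service(service_id, run=subprocess.run):
    """Disable, then re-enable the VPN service; raises if a command fails."""
    for cmd in toggle_commands(service_id):
        run(cmd, check=True)


# Print the commands instead of executing them; SERVICE_ID is a placeholder.
for cmd in toggle_commands("SERVICE_ID"):
    print(" ".join(cmd))
```

Calling `toggle_vpn_service("<real service ID>")` would actually run both commands in order. This only automates the workaround and does nothing about the underlying cause.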

Revision history for this message
Christian Rohmann (christian-rohmann) wrote :

After adding a few log lines, we sadly found that the issue was simply that the already merged patches are missing from the UCA package neutron-vpnaas. I opened a bug / SRU: https://bugs.launchpad.net/cloud-archive/+bug/1995861

Looking deeper into the code, we also found the reason for the duplicate iptables rules. I shall be pushing a patchset for those soon.
