VPNaaS Ipsec does not correctly determine master L3 HA Router

Bug #1471940 reported by Tuomas Juntunen
This bug report is a duplicate of: Bug #1478012: VPNaaS: Support VPNaaS with L3 HA.
This bug affects 2 people
Affects    Status        Importance    Assigned to     Milestone
neutron    In Progress   Undecided     venkata anil

Bug Description

I have just upgraded OpenStack from Juno to Kilo and I am testing all the features.

We run Ubuntu 14.04; all neutron packages are version 1:2015.1.0-0ubuntu1~cloud0.

When I try to create a VPN IPsec Site Connection, the master L3 router is not chosen. Instead, the connection always seems to default to the wrong node, and "ip route get <ip>" fails in the router namespace. The IPsec Site Connection is left in the PENDING_CREATE state.

2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec Traceback (most recent call last):
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec File "/usr/lib/python2.7/dist-packages/neutron_vpnaas/services/vpn/device_drivers/ipsec.py", line 255, in enable
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec self.start()
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec File "/usr/lib/python2.7/dist-packages/neutron_vpnaas/services/vpn/device_drivers/ipsec.py", line 430, in start
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec ipsec_site_conn['id'])
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec File "/usr/lib/python2.7/dist-packages/neutron_vpnaas/services/vpn/device_drivers/ipsec.py", line 387, in _get_nexthop
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec routes = self._execute(['ip', 'route', 'get', ip_addr])
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec File "/usr/lib/python2.7/dist-packages/neutron_vpnaas/services/vpn/device_drivers/ipsec.py", line 335, in _execute
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec extra_ok_codes=extra_ok_codes)
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 580, in execute
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec extra_ok_codes=extra_ok_codes, **kwargs)
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 137, in execute
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec raise RuntimeError(m)
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec RuntimeError:
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec Command: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-12891752-0afb-4d5f-8a8e-b46a9716accc', 'ip', 'route', 'get', 'myip']
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec Exit code: 2
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec Stdin:
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec Stdout:
2015-07-06 20:54:56.064 6859 TRACE neutron_vpnaas.services.vpn.device_drivers.ipsec Stderr: RTNETLINK answers: Network is unreachable

I don't remember experiencing this in Juno.
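
One way to see which node is affected is to run the failing command from the traceback by hand on each node hosting the router. The sketch below only builds that command. The helper name `route_get_cmd` is hypothetical and not part of neutron-vpnaas; the router ID and the placeholder address `myip` are copied from the log above, and the `qrouter-<router id>` namespace pattern is assumed from the trace.

```python
def route_get_cmd(router_id, ip_addr):
    """Build the `ip route get` call seen in the traceback.

    Hypothetical debugging helper; it is not part of neutron-vpnaas.
    """
    # Namespace naming is assumed from the trace: qrouter-<router id>.
    namespace = "qrouter-" + router_id
    return ["ip", "netns", "exec", namespace, "ip", "route", "get", ip_addr]


if __name__ == "__main__":
    cmd = route_get_cmd("12891752-0afb-4d5f-8a8e-b46a9716accc", "myip")
    # Run the printed command as root on each node that hosts the router.
    print(" ".join(cmd))
```

A node that replies "RTNETLINK answers: Network is unreachable", as in the trace, has no usable route to the peer inside that namespace.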

Tags: l3-ha rfe vpnaas
Assaf Muller (amuller)
tags: added: l3-ha
removed: ha l3
Paul Michali (pcm) wrote :

Can you try to reproduce it in Juno? I don't know whether anyone has ever tested VPN with L3 HA. It would be good to know whether it never worked (e.g. isn't supported yet) or is a regression.

Please list the commands used to set up the VPN connection (especially the router selection), just for reference.

I assume you are using OpenStack and not a DevStack environment?

Tuomas Juntunen (tuomas-juntunen) wrote : RE: [Bug 1471940] Re: VPNaaS Ipsec does not correctly determine master L3 HA Router

Hi

Yes, it's OpenStack, not DevStack, and as mentioned we just upgraded from Juno to Kilo.

I don't have a Juno environment to test it in anymore. The setup was done from Horizon.


Changed in neutron:
assignee: nobody → venkata anil (anil-venkata)
Paul Michali (pcm) wrote :

There aren't any unit or functional tests for VPNaaS with an HA router and, AFAIK, it has never been tried. I would say that, at this time, it is not a supported capability. I'd be surprised if it worked in Juno (and if it did, it was probably by coincidence).

If this is a desired capability, it would be useful to create a Request for Feature Enhancement (RFE) and see whether the community wants to prioritize it and engage someone to work on it.

Tuomas Juntunen (tuomas-juntunen) wrote :

I went through my documentation, and when we had VPNaaS working we didn't have l3-ha implemented in our Juno deployment. So, to conclude, I bet it wasn't working then either.

I don't see the point of using VPNaaS if you can't have redundant routers at the same time. We will disable it for now and use VyOS, pfSense, or similar to implement VPN.

Could someone add the rfe tag to this bug?

Thanks

venkata anil (anil-venkata) wrote :

I can take on writing the RFE for this feature.

tags: added: rfe
Tuomas Juntunen (tuomas-juntunen) wrote :

Thank you

venkata anil (anil-venkata) wrote :

Analysis of why vpnaas is not working with l3 ha:

Binding between neutron router and vpn service:
The present implementation of vpnaas has a 1:1 binding between a vpn service and a router-id, i.e. the ipsec (vpn) process runs inside the router namespace (unlike neutron lbaas, which has a separate namespace).
An ipsec connection consists of source and dest cidr, source and dest gateway addresses, encryption, and authentication. This information is stored in the kernel network stack as ipsec policies and states. Because the local cidr and gateway addresses are present in the router namespace, ipsec has to run in the router namespace.

What happens in the vpn agent during vpn connection create:
When a vpn service is updated (ipsec connection create), each vpn agent hosting that router is notified. The vpn agent then spawns/kills the ipsec process in its router namespace.
The vpn agent also registers for router_create/delete/update notifications, and spawns/kills the ipsec process in its router namespace based on those events.

Impact of l3-ha on vpn:
An l3-ha router is hosted on more than one l3 agent.
So when a vpn service is created for an l3-ha router, all the vpn agents hosting this router are notified because of [1].

Each vpn agent hosting this router then does the following in the router's namespace:
1) tries to reach the external gateway peer address
2) tries to spawn the haproxy process
Because no gateway address is present on the backup router, the vpn agent hosting the backup router fails at step 1.
Even if step 1 were skipped, step 2 would fail, because it needs ip addresses configured on the router's internal ports.
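
The two steps and their failure on the backup can be sketched as a toy model. Everything here is hypothetical (class, function, host names, and the documentation-range address); it only mirrors the analysis above and is not the agent's actual code. Step 2 is modelled as spawning the ipsec process, following the correction in the next comment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RouterReplica:
    """Toy model of one HA router instance on one agent (names are hypothetical)."""
    host: str
    gateway_ip: Optional[str]      # None models a backup with no gateway address
    internal_ips_configured: bool  # False models a backup without internal port IPs


def start_vpn(replica: RouterReplica) -> str:
    # Step 1: reaching the peer requires a gateway address in the namespace.
    if replica.gateway_ip is None:
        return "failed at step 1: no gateway address"
    # Step 2: the ipsec process needs addresses on the router's internal ports.
    if not replica.internal_ips_configured:
        return "failed at step 2: internal ports have no IP addresses"
    return "ipsec process started"


replicas = [
    RouterReplica("node-a", "203.0.113.10", True),  # master
    RouterReplica("node-b", None, False),           # backup
]
# Every agent hosting the router is notified, so every replica is tried.
results = {r.host: start_vpn(r) for r in replicas}
```

In this model only the master replica starts the process; the backup stops at step 1, which matches the "Network is unreachable" error in the report.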

Some solutions we can think of to fix this issue:
1) Follow the approach used for radvd: the l3-ha agent spawns the ipsec process on the master node, as it does for radvd. When the master router comes up, the state change monitor informs the l3-agent, which spawns the ipsec process.

2) Otherwise, when the ha_state of a router (l3-agent-list-hosting-router) changes, send a router_updated notification to all agents. Each agent checks the router's ha_state and decides whether to spawn or kill the haproxy process.
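
Both options come down to the same per-agent decision: run the ipsec process only where the router is master. A minimal sketch of that decision follows. The function name and the "master"/"backup" state strings are assumptions for illustration, not the agent's actual API.

```python
def ipsec_action(ha_state: str, process_running: bool) -> str:
    """Decide what an agent should do with its ipsec process for one router.

    Hypothetical helper; 'master' and 'backup' are assumed ha_state values.
    """
    if ha_state == "master" and not process_running:
        return "spawn"
    if ha_state != "master" and process_running:
        return "kill"
    return "noop"


# Failover example: node-a loses mastership, node-b gains it.
before = {"node-a": ("master", True), "node-b": ("backup", False)}
after = {"node-a": ("backup", True), "node-b": ("master", False)}
actions = {host: ipsec_action(state, running) for host, (state, running) in after.items()}
```

After the failover in this example, the old master kills its process and the new master spawns one, which is also why existing connections are re-established rather than carried over.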

In both cases, however, existing vpn connections will be lost (even though l3 ha has connection tracking), because spawning the haproxy process creates the connections again.

[1]
https://github.com/openstack/neutron-vpnaas/blob/master/neutron_vpnaas/services/vpn/service_drivers/__init__.py#L84

venkata anil (anil-venkata) wrote :

Typo in my previous comment #7:

I used the name haproxy instead of the ipsec process (openswan, libreswan, strongswan).

vpnaas spawns an ipsec process (openswan, libreswan, strongswan), not an haproxy process.

Sorry for the typo.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron-vpnaas (master)

Fix proposed to branch: master
Review: https://review.openstack.org/200636

Changed in neutron:
status: New → In Progress
venkata anil (anil-venkata) wrote :

I had a conversation with Paul Michali and Henry Gessau on IRC about whether to
1) update this bug as an RFE, or
2) report a new bug as an RFE and mark this one as a duplicate.
They suggested the second option, so I am marking this bug as a duplicate.
