l3_router: no support for multiple router gateways to floating_net

Bug #1810536 reported by Alexandru Avadanii
This bug affects 2 people
Affects: networking-vpp
Status: Fix Released
Importance: Undecided
Assigned to: Naveen Joy
Milestone: (not set)

Bug Description

When two routers, each serving a separate subnet, set their gateway on the same floating network,
vpp-agent (L3 router) crashes because VPP cannot assign overlapping addresses to the same interface.
I will follow up with a short investigation in a separate comment; for now, here are the steps to reproduce the issue.

############################################################################
# Steps to reproduce the issue
############################################################################
neutron net-create opnfv_fuel_test_net_1
neutron subnet-create --name 10_10_10 opnfv_fuel_test_net_1 10.10.10.0/24
neutron router-create opnfv_fuel_test_router_1 # get opnfv_fuel_test_router_1_ID
neutron router-interface-add ${opnfv_fuel_test_router_1_ID} 10_10_10
neutron router-gateway-set opnfv_fuel_test_router_1 floating_net

neutron net-create opnfv_fuel_test_net_2
neutron subnet-create --name 11_11_11 opnfv_fuel_test_net_2 11.11.11.0/24
neutron router-create opnfv_fuel_test_router_2 # get opnfv_fuel_test_router_2_ID
neutron router-interface-add ${opnfv_fuel_test_router_2_ID} 11_11_11
neutron router-gateway-set opnfv_fuel_test_router_2 floating_net

############################################################################
# tail -1 /var/log/neutron/vpp-agent.log
############################################################################
2019-01-04 14:12:51.530 16704 CRITICAL networking_vpp.agent.server [-] Failed VPP call to sw_interface_add_del_address(, {'address_length': 24, 'del_all': False, 'is_add': True, 'sw_if_index': 11, 'address': '\n\x00\x02r', 'is_ipv6': False}): retval is -127

############################################################################
# service vpp status (relevant line corresponding to above err)
############################################################################
Jan 04 14:12:51 gtw01 vnet[16287]: ip4_add_del_interface_address_internal: failed to add 10.0.2.114/24 which conflicts with 10.0.2.123/24 for interface loop0
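
The 'address' field in the vpp-agent log line is a packed IPv4 address, which makes it hard to match against the VPP message by eye. A minimal sketch, using only the Python standard library, that decodes the logged bytes into dotted-quad form (the variable names are illustrative and not taken from networking-vpp):

```python
import socket

# 'address' value copied from the vpp-agent CRITICAL log line (4 packed bytes).
packed = b"\n\x00\x02r"

# inet_ntoa converts a 4-byte packed IPv4 address to dotted-quad notation.
decoded = socket.inet_ntoa(packed)
print(decoded)  # 10.0.2.114
```

This is the same address that VPP refuses to add to loop0 in the service log.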

############################################################################
# Openstack floating network
############################################################################
openstack network create --external --default --provider-network-type flat \
  --provider-physical-network physnet1 floating_net

# 10.0.2.254 is a valid gw on our TOR switch
openstack subnet create --gateway 10.0.2.254 --no-dhcp \
  --allocation-pool start=10.0.2.113,end=10.0.2.253 \
  --network floating_net --subnet-range 10.0.2.0/24 floating_subnet

############################################################################
# cat /etc/neutron/neutron.conf # only relevant parts included
############################################################################
[DEFAULT]
core_plugin = neutron.plugins.ml2.plugin.Ml2Plugin
service_plugins = vpp-router,metering

############################################################################
# cat /etc/neutron/plugins/ml2/ml2_conf.ini # only relevant parts included
############################################################################
[ml2]
type_drivers = flat,vlan
tenant_network_types = vlan
mechanism_drivers = vpp
extension_drivers=port_security

[ml2_type_flat]
flat_networks = *

[ml2_type_vlan]
network_vlan_ranges = physnet2:1000:1999,physnet1

[ml2_vpp]
jwt_signing = False
etcd_insecure_explicit_disable_https = True
l3_hosts = gtw01
enable_l3_ha = False
gpe_locators =
gpe_src_cidr =
enable_vpp_restart = False
etcd_pass =
etcd_user =
etcd_port = 4001
etcd_host = 172.16.10.36
physnets = physnet2:GigabitEthernet0/5/0,physnet1:tap0

############################################################################
# cat /etc/vpp/startup.conf
############################################################################
unix {
  cli-listen /run/vpp/cli.sock
  log /var/log/vpp.log
  full-coredump
  nodaemon
  startup-config /etc/vpp/commands.txt
  gid neutron
}
api-trace {
  on
}
api-segment {
  gid neutron
}
cpu {
  main-core 1
}
dpdk {
  socket-mem 1024
  dev 0000:00:05.0
}

############################################################################
# cat /etc/vpp/commands.txt
############################################################################
create tap host-if-name vpp_ext_tap host-bridge br-floating rx-ring-size 1024 tx-ring-size 1024
set interface state tap0 up

Alexandru Avadanii (alexandru-avadanii) wrote :

It is possible I am missing something obvious, in which case feel free to disregard my observations.

The current l3_router design implies creating (or reusing) a BVI loop device for each router [1], then trying to assign an IP address to it [2], corresponding to the Neutron-assigned gateway IP address of said router.

Although this works fine for one router, the second router will try to reuse the same BVI and assign an overlapping IP address in the same CIDR, hitting the VPP limitation via [3, 4].

This is clearly not going to change in VPP anytime soon, so l3_router should be adjusted accordingly.
I'm not exactly familiar with the codebase of either project, but I did play around a bit with some hacks in this direction, trying to assign the gateway IP in the floating network outside VPP (i.e. on the Linux kernel side), with no idea about the implications for security and the like. However, that would require tracking routes for said IPs inside l3_router, which is just another rabbit hole ...

[1] https://github.com/openstack/networking-vpp/blob/master/networking_vpp/agent/server.py#L1834-L1838
[2] https://github.com/openstack/networking-vpp/blob/master/networking_vpp/agent/server.py#L1892-L1894
[3] https://github.com/FDio/vpp/blob/master/src/vnet/interface_api.c#L335
[4] https://github.com/FDio/vpp/blob/master/src/vnet/ip/ip4_forward.c#L577-L581
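
To illustrate why the second gateway address is rejected, here is a rough model of the conflict, assuming the check amounts to "either prefix covers the other's address". This is only a sketch with a hypothetical helper name, not VPP's actual code from [4]:

```python
import ipaddress


def conflicts(new, existing):
    """Return True if either interface prefix contains the other's address."""
    new_if = ipaddress.ip_interface(new)
    old_if = ipaddress.ip_interface(existing)
    return new_if.ip in old_if.network or old_if.ip in new_if.network


# The two gateway addresses from the VPP log share 10.0.2.0/24, so they clash.
print(conflicts("10.0.2.114/24", "10.0.2.123/24"))
# An address from an unrelated subnet would not clash.
print(conflicts("10.0.2.114/24", "10.10.10.1/24"))
```

Any two gateway IPs allocated from floating_subnet fall in the same /24, so under this model every router after the first hits the conflict.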

Naveen Joy (najoy)
Changed in networking-vpp:
assignee: nobody → Naveen Joy (najoy)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to networking-vpp (master)

Fix proposed to branch: master
Review: https://review.openstack.org/634075

OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-vpp (master)

Reviewed: https://review.openstack.org/634075
Committed: https://git.openstack.org/cgit/openstack/networking-vpp/commit/?id=9598c5963d0638cb5ea55f9b5f24363ee6433c23
Submitter: Zuul
Branch: master

commit 9598c5963d0638cb5ea55f9b5f24363ee6433c23
Author: Naveen Joy <email address hidden>
Date: Mon Jan 28 14:33:03 2019 -0800

    Support for multiple router gateways on the floating network.

    Since the VPP does not support overlapping IP addresses on a
    subnet, the solution adds an independent route to the VPP's
    local interface each time an additional router is added on the
    floating network. The first router's gateway IP address is set
    on the BVI loopback and this address becomes the primary gateway
    IP address. Each subsequent router's gateway IP address is added
    as a local route. When the primary gateway is deleted, if a
    valid local IP address exists, it is migrated to the BVI
    loopback interface. Upon deleting an external gateway, the BVI
    loopback is only deleted if no valid local IPs exist on the
    floating network.

    Change-Id: I945056fad31e599833e706ee627fe444e32ed606
    Closes-Bug: #1810536
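
The bookkeeping described in the commit message can be sketched as a small state model. This is a toy illustration under stated assumptions: the class and attribute names are hypothetical, no VPP calls are made, and the real change in networking_vpp/agent/server.py may be structured differently.

```python
class FloatingGateways:
    """Toy model of gateway IP placement on one floating network."""

    def __init__(self):
        self.bvi_exists = False   # whether the BVI loopback is present
        self.bvi_address = None   # primary gateway IP, set on the BVI loopback
        self.local_routes = []    # further gateway IPs, kept as local routes

    def add_gateway(self, ip):
        if ip == self.bvi_address or ip in self.local_routes:
            return  # already tracked, nothing to do
        if self.bvi_address is None:
            # First router: its gateway IP becomes the primary address.
            self.bvi_exists = True
            self.bvi_address = ip
        else:
            # Subsequent routers: add the gateway IP as a local route.
            self.local_routes.append(ip)

    def delete_gateway(self, ip):
        if ip == self.bvi_address:
            if self.local_routes:
                # Migrate a remaining local IP onto the BVI loopback.
                self.bvi_address = self.local_routes.pop(0)
            else:
                # No local IPs left: the BVI loopback can be deleted.
                self.bvi_address = None
                self.bvi_exists = False
        elif ip in self.local_routes:
            self.local_routes.remove(ip)


gws = FloatingGateways()
gws.add_gateway("10.0.2.123")     # first router, primary address on the BVI
gws.add_gateway("10.0.2.114")     # second router, held as a local route
gws.delete_gateway("10.0.2.123")  # primary removed, 10.0.2.114 is migrated
print(gws.bvi_address, gws.local_routes, gws.bvi_exists)
```

Which local IP gets migrated when several exist is an assumption here (the oldest one); the commit message only requires that a valid one is chosen.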

Changed in networking-vpp:
status: In Progress → Fix Released