2017-02-22 03:16:00 |
venkata anil |
description |
Branch: stable/newton
Setup:
l3_ha = True
l2_population = True
arp_responder = True
enable_distributed_routing = False
Node1: controller/network
Node2: controller/network
Node3: compute
step 1: created external network
step 2: created a router
step 3: set router gateway to external network
step 4: created internal network with subnet, for example: 2.2.2.0/24
step 5: added internal network subnet to router using command: neutron router-interface-add <router> <subnet-id>
step 6: created an instance (VM1) on the internal network
Note: VM1 is scheduled to boot on the compute host or Node1
Symptoms:
Missing ARP entry on Node1 for the router's internal interface
VM1 does not know where to forward outgoing traffic
Debugging:
# neutron port-list | grep 2.2.2.
| 1a593597-72de-4309-ada0-8e2dacba36ce | | fa:16:3e:c3:f8:fb | {"subnet_id": "5f60e3c1-b176-4cc7-ba44-b236b3001b35", "ip_address": "2.2.2.2"} |
| 2007ca70-8cde-43bf-9254-66fd2e7fb327 | | fa:16:3e:ca:2a:c0 | {"subnet_id": "5f60e3c1-b176-4cc7-ba44-b236b3001b35", "ip_address": "2.2.2.1"} |
| ba642200-e389-436d-aee8-a2198414f221 | | fa:16:3e:0b:5d:3d | {"subnet_id": "5f60e3c1-b176-4cc7-ba44-b236b3001b35", "ip_address": "2.2.2.3"} |
# neutron port-show 2007ca70-8cde-43bf-9254-66fd2e7fb327
+-----------------------+--------------------------------------------------------------------------------+
| Field | Value |
+-----------------------+--------------------------------------------------------------------------------+
| admin_state_up | True |
| allowed_address_pairs | |
| binding:host_id | controller2 |
| binding:profile | {} |
| binding:vif_details | {"port_filter": true} |
| binding:vif_type | bridge |
| binding:vnic_type | normal |
| created_at | 2017-02-01T22:10:39Z |
| description | |
| device_id | c0f504ff-575e-4e7e-b25e-1f5ddc29a390 |
| device_owner | network:ha_router_replicated_interface |
| extra_dhcp_opts | |
| fixed_ips | {"subnet_id": "5f60e3c1-b176-4cc7-ba44-b236b3001b35", "ip_address": "2.2.2.1"} |
| id | 2007ca70-8cde-43bf-9254-66fd2e7fb327 |
| mac_address | fa:16:3e:ca:2a:c0 |
| name | |
| network_id | 048938b7-b108-4f43-9222-3560f6d91fef |
| port_security_enabled | False |
| project_id | 58a54da3f0404bc4ad7a266e73c9e7cc |
| revision_number | 68 |
| security_groups | |
| status | ACTIVE |
| tenant_id | 58a54da3f0404bc4ad7a266e73c9e7cc |
| updated_at | 2017-02-03T05:09:49Z |
+-----------------------+--------------------------------------------------------------------------------+
The following debug message in the log shows that the l2pop notification sent to the agent includes only the 2.2.2.2 and 2.2.2.3 IPs; the router interface IP 2.2.2.1 is missing.
/var/log/neutron/neutron-server.log
2017-02-01 22:20:03.925 35239 DEBUG neutron.plugins.ml2.drivers.l2pop.rpc [req-f3f31e84-b43e-4393-9ac1-8da9c5c5952e - - - - -] Notify l2population agent compute1 at q-agent-notifier the message add_fdb_entries with {u'048938b7-b108-4f43-9222-3560f6d91fef': {'ports': {u'10.153.36.74': [('00:00:00:00:00:00', '0.0.0.0'), PortInfo(mac_address=u'fa:16:3e:c3:f8:fb', ip_address=u'2.2.2.2')], u'10.153.36.75': [('00:00:00:00:00:00', '0.0.0.0'), PortInfo(mac_address=u'fa:16:3e:0b:5d:3d', ip_address=u'2.2.2.3')]}, 'network_type': u'vxlan', 'segment_id': 18}} _notification_host /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/l2pop/rpc.py:57
[root@compute1 ] # arp -an | grep 2.2.2
? (2.2.2.2) at fa:16:3e:c3:f8:fb [ether] PERM on vxlan-18
? (2.2.2.3) at fa:16:3e:0b:5d:3d [ether] PERM on vxlan-18
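The comparison made by eye above can be sketched in Python. This is only an illustration: the payload is copied from the debug message, with PortInfo objects written as plain (mac, ip) tuples, and the helper name notified_ips is hypothetical, not part of neutron.

```python
# IP addresses reported by `neutron port-list | grep 2.2.2.`
port_ips = {"2.2.2.1", "2.2.2.2", "2.2.2.3"}

# add_fdb_entries payload from the debug message; PortInfo objects are
# written here as plain (mac, ip) tuples for simplicity.
fdb_entries = {
    "048938b7-b108-4f43-9222-3560f6d91fef": {
        "ports": {
            "10.153.36.74": [
                ("00:00:00:00:00:00", "0.0.0.0"),
                ("fa:16:3e:c3:f8:fb", "2.2.2.2"),
            ],
            "10.153.36.75": [
                ("00:00:00:00:00:00", "0.0.0.0"),
                ("fa:16:3e:0b:5d:3d", "2.2.2.3"),
            ],
        },
        "network_type": "vxlan",
        "segment_id": 18,
    }
}

# The all-zero entry is the flooding entry, not a real port.
FLOODING_ENTRY = ("00:00:00:00:00:00", "0.0.0.0")


def notified_ips(entries):
    """Collect every port IP carried by the notification (hypothetical helper)."""
    ips = set()
    for network in entries.values():
        for port_infos in network["ports"].values():
            for mac, ip in port_infos:
                if (mac, ip) != FLOODING_ENTRY:
                    ips.add(ip)
    return ips


missing = sorted(port_ips - notified_ips(fdb_entries))
print(missing)  # ['2.2.2.1']
```

The only address that never reaches compute1 is 2.2.2.1, the router's internal interface.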
Looking at the code:
L2pop notification for these ports is sent here:
https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/l2pop/rpc.py#L51
Port info is gathered here for non-distributed HA router:
https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/l2pop/db.py#L99
Tracing down from this line, the query filters out ports whose device_owner is in HA_ROUTER_PORTS, where HA_ROUTER_PORTS = (const.DEVICE_OWNER_HA_REPLICATED_INT, const.DEVICE_OWNER_ROUTER_SNAT).
The port-show output above shows that the port's device_owner is network:ha_router_replicated_interface, so this filter excludes it. |
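A minimal Python sketch of the filtering described above. It is not the actual query in db.py: the function ports_for_fdb is a hypothetical stand-in, the constant values are written out as I understand them from neutron-lib, and the device_owner values for 2.2.2.2 and 2.2.2.3 are assumed for illustration (the report shows device_owner only for 2.2.2.1).

```python
# Constant values as understood from neutron-lib (assumption, not from the report,
# except the first one, which matches the port-show output).
DEVICE_OWNER_HA_REPLICATED_INT = "network:ha_router_replicated_interface"
DEVICE_OWNER_ROUTER_SNAT = "network:router_centralized_snat"
HA_ROUTER_PORTS = (DEVICE_OWNER_HA_REPLICATED_INT, DEVICE_OWNER_ROUTER_SNAT)

ports = [
    # Router internal interface, as shown by `neutron port-show`.
    {"ip": "2.2.2.1", "mac": "fa:16:3e:ca:2a:c0",
     "device_owner": "network:ha_router_replicated_interface"},
    # device_owner values below are assumed for illustration only.
    {"ip": "2.2.2.2", "mac": "fa:16:3e:c3:f8:fb", "device_owner": "network:dhcp"},
    {"ip": "2.2.2.3", "mac": "fa:16:3e:0b:5d:3d", "device_owner": "compute:nova"},
]


def ports_for_fdb(all_ports):
    """Hypothetical stand-in for the l2pop DB query: drop HA router ports."""
    return [p for p in all_ports if p["device_owner"] not in HA_ROUTER_PORTS]


advertised = [p["ip"] for p in ports_for_fdb(ports)]
print(advertised)  # ['2.2.2.2', '2.2.2.3']
```

Under these assumptions the result matches the debug message: only 2.2.2.2 and 2.2.2.3 are advertised.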
When both l2pop and arp_responder are enabled for the linuxbridge agent, the vxlan device is created in "proxy" mode. In this mode, ARP entries must be added statically by the linuxbridge agent. Because of [1], the l2pop driver does not notify agents about HA router ports, so the linuxbridge agent cannot add an ARP entry for the router port. With no ARP entry for the router, the vxlan device drops ARP requests from the VM (destined to the router), leaving the VM unable to communicate with the router.
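As a rough model of the behaviour described above (a toy lookup, not the kernel's vxlan code), the proxy-mode device can only answer ARP requests for addresses present in its static neighbour table. The table below is taken from the `arp -an` output on compute1; the function name proxy_arp_lookup is hypothetical.

```python
# Static neighbour entries installed on vxlan-18 (from `arp -an` on compute1).
neigh_table = {
    "2.2.2.2": "fa:16:3e:c3:f8:fb",
    "2.2.2.3": "fa:16:3e:0b:5d:3d",
}


def proxy_arp_lookup(table, target_ip):
    """Return the MAC a proxy-mode device could answer with, or None.

    None models the case described in the report, where the request is
    dropped because no static entry exists.
    """
    return table.get(target_ip)


for ip in ("2.2.2.1", "2.2.2.3"):
    mac = proxy_arp_lookup(neigh_table, ip)
    print(ip, "->", mac if mac else "no entry, request dropped")
```

The request for 2.2.2.1 (the router) finds no entry, which is the failure the VM sees.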
This issue affects only the linuxbridge agent, not the ovs agent.
A temporary workaround that lets the VM communicate with the HA router is to disable arp_responder when l2pop is enabled.
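A sketch of the workaround as a configuration change, assuming both options live in the [vxlan] section of the linuxbridge agent configuration file (the section name is my assumption; the report lists the options without their sections). The sample text and the helper disable_arp_responder are illustrative only.

```python
import configparser
import io

# Sample agent configuration; the [vxlan] section name is an assumption.
SAMPLE = """\
[vxlan]
l2_population = True
arp_responder = True
"""


def disable_arp_responder(ini_text):
    """Return the config text with arp_responder turned off when l2pop is on."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if parser.has_section("vxlan") and parser.getboolean(
        "vxlan", "l2_population", fallback=False
    ):
        parser.set("vxlan", "arp_responder", "False")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


updated = disable_arp_responder(SAMPLE)
print(updated)
```

The sketch leaves l2_population untouched and changes only arp_responder, which is the trade-off the workaround accepts.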
If users need both the arp_responder and l2pop features for the linuxbridge agent, we need an implementation that decouples them, i.e. https://bugs.launchpad.net/neutron/+bug/1518392
[1] https://review.openstack.org/#/c/255237/ |
|