Activity log for bug #1661717

Date Who What changed Old value New value Message
2017-02-03 19:13:22 Xiang Wang bug added bug
2017-02-09 22:47:20 Assaf Muller tags l2-pop l3-ha linuxbridge
2017-02-10 06:23:30 venkata anil neutron: assignee venkata anil (anil-venkata)
2017-02-22 03:16:00 venkata anil description

Branch: stable/newton

Setup:
l3_ha = True
l2_population = True
arp_responder = True
enable_distributed_routing = False

Node1: controller/network
Node2: controller/network
Node3: compute

step 1: created external network
step 2: created a router
step 3: set router gateway to external network
step 4: created internal network with subnet, for example: 2.2.2.0/24
step 5: added internal network subnet to router using command: neutron router-interface-add <router> <subnet-id>
step 6: created an instance (VM1) on the internal network
Note: VM1 scheduled to boot on compute host or Node1

Symptoms:
Missing ARP entry on Node1 for the internal interface on router
VM1 does not know where to forward outgoing traffic

Debugging:

# neutron port-list | grep 2.2.2.
| 1a593597-72de-4309-ada0-8e2dacba36ce | | fa:16:3e:c3:f8:fb | {"subnet_id": "5f60e3c1-b176-4cc7-ba44-b236b3001b35", "ip_address": "2.2.2.2"} |
| 2007ca70-8cde-43bf-9254-66fd2e7fb327 | | fa:16:3e:ca:2a:c0 | {"subnet_id": "5f60e3c1-b176-4cc7-ba44-b236b3001b35", "ip_address": "2.2.2.1"} |
| ba642200-e389-436d-aee8-a2198414f221 | | fa:16:3e:0b:5d:3d | {"subnet_id": "5f60e3c1-b176-4cc7-ba44-b236b3001b35", "ip_address": "2.2.2.3"} |

# neutron port-show 2007ca70-8cde-43bf-9254-66fd2e7fb327
+-----------------------+--------------------------------------------------------------------------------+
| Field                 | Value                                                                          |
+-----------------------+--------------------------------------------------------------------------------+
| admin_state_up        | True                                                                           |
| allowed_address_pairs |                                                                                |
| binding:host_id       | controller2                                                                    |
| binding:profile       | {}                                                                             |
| binding:vif_details   | {"port_filter": true}                                                          |
| binding:vif_type      | bridge                                                                         |
| binding:vnic_type     | normal                                                                         |
| created_at            | 2017-02-01T22:10:39Z                                                           |
| description           |                                                                                |
| device_id             | c0f504ff-575e-4e7e-b25e-1f5ddc29a390                                           |
| device_owner          | network:ha_router_replicated_interface                                         |
| extra_dhcp_opts       |                                                                                |
| fixed_ips             | {"subnet_id": "5f60e3c1-b176-4cc7-ba44-b236b3001b35", "ip_address": "2.2.2.1"} |
| id                    | 2007ca70-8cde-43bf-9254-66fd2e7fb327                                           |
| mac_address           | fa:16:3e:ca:2a:c0                                                              |
| name                  |                                                                                |
| network_id            | 048938b7-b108-4f43-9222-3560f6d91fef                                           |
| port_security_enabled | False                                                                          |
| project_id            | 58a54da3f0404bc4ad7a266e73c9e7cc                                               |
| revision_number       | 68                                                                             |
| security_groups       |                                                                                |
| status                | ACTIVE                                                                         |
| tenant_id             | 58a54da3f0404bc4ad7a266e73c9e7cc                                               |
| updated_at            | 2017-02-03T05:09:49Z                                                           |
+-----------------------+--------------------------------------------------------------------------------+

In the log, the following debug message was seen; only the 2.2.2.2 and 2.2.2.3 IPs are sent out in the l2pop notification to the agent.

/var/log/neutron/neutron-server.log
2017-02-01 22:20:03.925 35239 DEBUG neutron.plugins.ml2.drivers.l2pop.rpc [req-f3f31e84-b43e-4393-9ac1-8da9c5c5952e - - - - -] Notify l2population agent compute1 at q-agent-notifier the message add_fdb_entries with {u'048938b7-b108-4f43-9222-3560f6d91fef': {'ports': {u'10.153.36.74': [('00:00:00:00:00:00', '0.0.0.0'), PortInfo(mac_address=u'fa:16:3e:c3:f8:fb', ip_address=u'2.2.2.2')], u'10.153.36.75': [('00:00:00:00:00:00', '0.0.0.0'), PortInfo(mac_address=u'fa:16:3e:0b:5d:3d', ip_address=u'2.2.2.3')]}, 'network_type': u'vxlan', 'segment_id': 18}} _notification_host /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/l2pop/rpc.py:57

[root@compute1 ] # arp -an | grep 2.2.2
? (2.2.2.2) at fa:16:3e:c3:f8:fb [ether] PERM on vxlan-18
? (2.2.2.3) at fa:16:3e:0b:5d:3d [ether] PERM on vxlan-18

Looking at the code:

L2pop notification for these ports is sent here:
https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/l2pop/rpc.py#L51

Port info is gathered here for non-distributed HA router:
https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/l2pop/db.py#L99

Tracing down from this line, it filters out ports that are in HA_ROUTER_PORTS, where
HA_ROUTER_PORTS = (const.DEVICE_OWNER_HA_REPLICATED_INT, const.DEVICE_OWNER_ROUTER_SNAT)

From the above we see that the port's device_owner is network:ha_router_replicated_interface, which would get filtered out by this.

When both l2pop and arp_responder are enabled for the linuxbridge agent, the vxlan device is created in "proxy" mode. In this mode, ARP entries must be statically added by the linuxbridge agent. Because of [1], the l2pop driver won't notify the HA router port, so the linuxbridge agent can't add an ARP entry for the router port. As there is no router ARP entry, the vxlan device drops ARP request packets from the VM (destined to the router), making the VM unable to communicate with the router.

This issue occurs only with the linuxbridge agent and not with the ovs agent.

A temporary solution for the VM to communicate with the HA router is to disable arp_responder when l2pop is enabled. If users need both the arp_responder and l2pop features for the linuxbridge agent, we need an implementation which decouples them, i.e. https://bugs.launchpad.net/neutron/+bug/1518392

[1] https://review.openstack.org/#/c/255237/
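The filtering described in the description above can be illustrated with a small standalone sketch. It is not the Neutron code: `build_fdb_ports` and `missing_arp_entries` are hypothetical helpers, the string assumed for `const.DEVICE_OWNER_ROUTER_SNAT` is an assumption, and the `device_owner` values of the two non-router ports as well as the router's tunnel IP are placeholders that do not appear in the report. Only the MAC addresses, fixed IPs, agent IPs and the router port's `device_owner` come from the output above.

```python
from collections import namedtuple

# Same field names as the PortInfo tuples seen in the neutron-server log.
PortInfo = namedtuple("PortInfo", ["mac_address", "ip_address"])

# Device owners skipped when port info is gathered. The first string is the
# device_owner shown by "neutron port-show"; the second is ASSUMED to be the
# value of const.DEVICE_OWNER_ROUTER_SNAT.
HA_ROUTER_PORTS = (
    "network:ha_router_replicated_interface",
    "network:router_centralized_snat",
)

# Ports on 2.2.2.0/24. "example:other" and "<controller2-tunnel-ip>" are
# placeholders, not values from the report.
PORTS = [
    {"agent_ip": "10.153.36.74", "mac": "fa:16:3e:c3:f8:fb",
     "ip": "2.2.2.2", "device_owner": "example:other"},
    {"agent_ip": "<controller2-tunnel-ip>", "mac": "fa:16:3e:ca:2a:c0",
     "ip": "2.2.2.1",
     "device_owner": "network:ha_router_replicated_interface"},
    {"agent_ip": "10.153.36.75", "mac": "fa:16:3e:0b:5d:3d",
     "ip": "2.2.2.3", "device_owner": "example:other"},
]


def build_fdb_ports(ports):
    """Group PortInfo by agent IP, skipping ports owned by HA routers."""
    fdb = {}
    for port in ports:
        if port["device_owner"] in HA_ROUTER_PORTS:
            continue  # the HA router interface is never announced
        info = PortInfo(port["mac"], port["ip"])
        fdb.setdefault(port["agent_ip"], []).append(info)
    return fdb


def missing_arp_entries(ports, fdb):
    """Return IPs for which an agent would have no static ARP entry."""
    announced = {i.ip_address for infos in fdb.values() for i in infos}
    return sorted(p["ip"] for p in ports if p["ip"] not in announced)


if __name__ == "__main__":
    fdb = build_fdb_ports(PORTS)
    for agent_ip, infos in sorted(fdb.items()):
        print(agent_ip, infos)
    print("no ARP entry for:", missing_arp_entries(PORTS, fdb))
```

With the vxlan device in "proxy" mode, an IP reported by `missing_arp_entries` corresponds to an address for which ARP requests from the VM go unanswered, which matches the absent 2.2.2.1 entry in the `arp -an` output.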
2017-02-22 03:17:16 venkata anil summary L2pop filters out port info for HA router internal interface when sending out notification [linuxbridge agent] vm can't communicate with router with l2pop
2017-02-22 19:06:48 Jakub Libosvar neutron: milestone pike-1
2017-02-22 20:00:15 venkata anil bug added subscriber Thomas Morin
2017-02-22 20:34:48 Miguel Angel Ajo neutron: importance Undecided Medium
2017-02-22 20:35:11 Miguel Angel Ajo neutron: importance Medium High
2017-03-07 11:48:43 venkata anil neutron: status New In Progress
2017-05-18 01:19:15 Armando Migliaccio neutron: milestone pike-1 pike-2
2017-09-02 19:43:20 venkata anil neutron: assignee venkata anil (anil-venkata)
2017-09-02 19:43:36 venkata anil neutron: status In Progress Confirmed
2020-02-07 19:03:22 Brian Haley neutron: status Confirmed Won't Fix