'Unable to retrieve the agent ip' when using l2population mechanism driver

Bug #1407959 reported by Itzik Brown
This bug affects 2 people
Affects: neutron | Status: Fix Released | Importance: Low | Assigned to: Hong Hui Xiao

Bug Description

If a host has more than one agent registered, l2_population doesn't work even when only one of them is alive, i.e. the tunnel isn't brought up.

There is an error in Neutron's log:
Unable to retrieve the agent ip, check the agent configuration.

After deleting the dead agent with neutron agent-delete <agent-id> and restarting neutron-openvswitch-agent, the tunnel is brought up.

The 'dead' agent is a LinuxBridge agent.

Version
=======
RHEL 7.0
openstack-neutron-2014.2.1-5.el7ost

Changed in neutron:
assignee: nobody → venkata anil (anil-venkata)
Itzik Brown (itzikb1)
description: updated
Romil Gupta (romilg) wrote :

Could you please let me know in which scenario people may run two L2 agents on a single node?

Itzik Brown (itzikb1) wrote :

It can happen, for example, when switching from Linux Bridge to Open vSwitch.

Changed in neutron:
assignee: venkata anil (anil-venkata) → nobody
Romil Gupta (romilg)
Changed in neutron:
assignee: nobody → Romil Gupta (romilg)
Assaf Muller (amuller) wrote :

Itzik, can you explain exactly how you are switching from LB to OVS? I'd imagine you'd migrate all VMs and other resources (routers, and DHCP if it's a network node) off the node, shut down the LB agent and start the OVS agent. You shouldn't have the LB and OVS agents up at the same time.

Eugene Nikanorov (enikanorov) wrote :

Waiting for more input from the submitter

Changed in neutron:
importance: Undecided → Low
status: New → Confirmed
tags: added: l2-pop ovs
tags: added: linuxbridge
Changed in neutron:
status: Confirmed → Incomplete
Itzik Brown (itzikb1) wrote :

As mentioned, this is the case of migrating from one agent to the other, not of running both agents at the same time.

Changed in neutron:
status: Incomplete → New
Assaf Muller (amuller) wrote :

Then the old agent should be deleted via neutron agent-delete.

Armando Migliaccio (armando-migliaccio) wrote :

This happens because the query that looks up the agent is broken. See [1] and [2]. Even though we changed the logic slightly, this can still happen. The only way to make this bulletproof is to change what's passed to the filter function, e.g. hostname, agent type, etc.

[1] https://review.openstack.org/#/c/236970/
[2] https://review.openstack.org/#/c/237785/
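
The ambiguity described in this comment can be illustrated with a minimal sketch. The agent records, field names and both helper functions are hypothetical and are not Neutron's actual query code; they only show why filtering on the host alone can return the dead LinuxBridge agent, while also filtering on agent type and liveness picks the agent that owns the tunnel endpoint.

```python
# Hypothetical, simplified agent records (not Neutron's real DB model).
AGENTS = [
    {"host": "compute-1", "agent_type": "Linux bridge agent",
     "alive": False, "tunneling_ip": None},
    {"host": "compute-1", "agent_type": "Open vSwitch agent",
     "alive": True, "tunneling_ip": "192.0.2.10"},
]


def first_agent_on_host(agents, host):
    """Naive lookup: return whichever record for the host comes first."""
    for agent in agents:
        if agent["host"] == host:
            return agent
    return None


def live_agent_on_host(agents, host, agent_type):
    """Stricter lookup: match host and agent type, and require a live agent."""
    for agent in agents:
        if (agent["host"] == host
                and agent["agent_type"] == agent_type
                and agent["alive"]):
            return agent
    return None


# The naive lookup lands on the dead agent, which has no tunnel IP.
naive = first_agent_on_host(AGENTS, "compute-1")
# The stricter lookup finds the live OVS agent and its tunnel IP.
strict = live_agent_on_host(AGENTS, "compute-1", "Open vSwitch agent")
```

With the naive lookup the caller sees no tunnel IP for the host, which is consistent with the "Unable to retrieve the agent ip" message in the bug description.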

tags: removed: linuxbridge ovs
Changed in neutron:
assignee: Romil Gupta (romilg) → nobody
status: New → Confirmed
Armando Migliaccio (armando-migliaccio) wrote :

@Romil: are you still interested in fixing this?

tags: added: low-hanging-fruit
Armando Migliaccio (armando-migliaccio) wrote :

Fixing this bug will teach someone quite a bit about Neutron internals, so in that sense it's a nice 'starter bug'.

Hong Hui Xiao (xiaohhui)
Changed in neutron:
assignee: nobody → Hong Hui Xiao (xiaohhui)
Hong Hui Xiao (xiaohhui) wrote :

With [1], I don't think the error in the description will be reported anymore. When I debugged the code, I found that all agents (metadata, dhcp, l3, ovs) are returned by get_dvr_active_network_ports and get_nondvr_active_network_ports. So the query still needs to be refined.

[1] https://review.openstack.org/#/c/237785

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/239862

Changed in neutron:
status: Confirmed → In Progress
Hong Hui Xiao (xiaohhui) wrote :

As [1] and [2] are in-stable-liberty, I'm setting the tag liberty-backport-potential.

[1] https://bugs.launchpad.net/neutron/+bug/1507684
[2] https://bugs.launchpad.net/neutron/+bug/1508205

tags: added: liberty-backport-potential
Changed in neutron:
assignee: Hong Hui Xiao (xiaohhui) → Vivekanandan Narasimhan (vivekanandan-narasimhan)
Changed in neutron:
assignee: Vivekanandan Narasimhan (vivekanandan-narasimhan) → Hong Hui Xiao (xiaohhui)
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/239862
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=311c9d9d929b4a53d2f54824e599c12f653b77f6
Submitter: Jenkins
Branch: master

commit 311c9d9d929b4a53d2f54824e599c12f653b77f6
Author: Hong Hui Xiao <email address hidden>
Date: Wed Oct 28 03:28:27 2015 -0400

    Datapath on L2pop only for agents with tunneling-ip

    This patch is a regression issue for patch[1].

    Only those agents which expose tunneling-ip will be considered in
    determining data-path tunnels in deployments with l2pop ON.

    [1] https://review.openstack.org/#/c/236970/

    Change-Id: I7c3b911d5e7448b4e8dee15bb50df33a6e9d5cfe
    Closes-Bug: #1407959
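
The rule stated in the commit message above (only agents that expose a tunneling IP take part in l2pop data-path decisions) can be sketched as follows. This is an illustration under assumptions, not the merged patch: the record layout, the `configurations` dictionary, the `tunneling_ip` key name and the helper function are all assumed for the example.

```python
# Hypothetical agent records for one host; the layout and the
# "tunneling_ip" key are assumptions made for this sketch.
HOST_AGENTS = [
    {"agent_type": "Metadata agent", "configurations": {}},
    {"agent_type": "DHCP agent", "configurations": {}},
    {"agent_type": "L3 agent", "configurations": {}},
    {"agent_type": "Open vSwitch agent",
     "configurations": {"tunneling_ip": "192.0.2.10"}},
]


def agents_with_tunneling_ip(agents):
    """Keep only agents whose configuration reports a non-empty tunnel IP."""
    return [
        agent for agent in agents
        if agent.get("configurations", {}).get("tunneling_ip")
    ]


tunnel_agents = agents_with_tunneling_ip(HOST_AGENTS)
```

Filtering this way drops the metadata, DHCP and L3 agents that were reported as being returned by the active-port queries, leaving only the agent that can terminate a tunnel.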

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/247848

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/247848
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2a27d7d25eb6718c8c97ece42d9fc53d57ebb817
Submitter: Jenkins
Branch: stable/liberty

commit 2a27d7d25eb6718c8c97ece42d9fc53d57ebb817
Author: Hong Hui Xiao <email address hidden>
Date: Wed Oct 28 03:28:27 2015 -0400

    Datapath on L2pop only for agents with tunneling-ip

    This patch is a regression issue for patch[1].

    Only those agents which expose tunneling-ip will be considered in
    determining data-path tunnels in deployments with l2pop ON.

    [1] https://review.openstack.org/#/c/236970/

    Change-Id: I7c3b911d5e7448b4e8dee15bb50df33a6e9d5cfe
    Closes-Bug: #1407959
    (cherry picked from commit 311c9d9d929b4a53d2f54824e599c12f653b77f6)

tags: added: in-stable-liberty
Thierry Carrez (ttx) wrote : Fix included in openstack/neutron 8.0.0.0b1

This issue was fixed in the openstack/neutron 8.0.0.0b1 development milestone.

Changed in neutron:
status: Fix Committed → Fix Released
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 7.0.1

This issue was fixed in the openstack/neutron 7.0.1 release.

ngenibre (leosum) wrote :

Hi,

I'm still having the same issue while following the online Liberty documentation: http://docs.openstack.org/liberty/install-guide-rdo/launch-instance.html#create-virtual-networks

When I set up the private subnet, I get the following warning followed by an error:

============>
2016-03-23 19:46:50.656 12893 WARNING neutron.plugins.ml2.drivers.l2pop.mech_driver [req-65351eeb-2e1c-4483-81c2-524bcf64c366 bf168041f9c5446b81bca4d737f547e7 a86464f29aac4ec5848fb481e6362fcb - - -] Unable to retrieve active L2 agent on host chavloli001
2016-03-23 19:46:50.867 12893 INFO neutron.wsgi [req-65351eeb-2e1c-4483-81c2-524bcf64c366 bf168041f9c5446b81bca4d737f547e7 a86464f29aac4ec5848fb481e6362fcb - - -] 10.123.35.173 - - [23/Mar/2016 19:46:50] "DELETE /v2.0/networks/d081a988-004a-4c72-bd54-7bb87cba2095.json HTTP/1.1" 204 173 0.335705
2016-03-23 19:46:56.942 12893 INFO neutron.wsgi [-] (12893) accepted ('10.123.35.173', 56377)
2016-03-23 19:46:57.000 12893 INFO neutron.plugins.ml2.db [req-601d0c61-6394-46b8-8459-11fe0dcb3976 bf168041f9c5446b81bca4d737f547e7 a86464f29aac4ec5848fb481e6362fcb - - -] Added segment e6c14ae5-e05d-49e8-902e-00d57736fe6e of type vxlan for network eac8b214-952b-4670-a53d-9f111327eb3e
2016-03-23 19:46:57.012 12893 INFO neutron.wsgi [req-601d0c61-6394-46b8-8459-11fe0dcb3976 bf168041f9c5446b81bca4d737f547e7 a86464f29aac4ec5848fb481e6362fcb - - -] 10.123.35.173 - - [23/Mar/2016 19:46:57] "POST /v2.0/networks.json HTTP/1.1" 201 590 0.068782
2016-03-23 19:47:08.783 12892 INFO neutron.wsgi [-] (12892) accepted ('10.123.35.173', 56380)
2016-03-23 19:47:08.825 12892 INFO neutron.wsgi [req-b0952284-a5dd-488e-997b-598f8b04b42c bf168041f9c5446b81bca4d737f547e7 a86464f29aac4ec5848fb481e6362fcb - - -] 10.123.35.173 - - [23/Mar/2016 19:47:08] "GET /v2.0/networks.json?fields=id&name=private HTTP/1.1" 200 275 0.041099
2016-03-23 19:47:08.927 12892 INFO neutron.wsgi [req-ec830fb4-1f0c-41d0-ae9d-6f3a8f54d140 bf168041f9c5446b81bca4d737f547e7 a86464f29aac4ec5848fb481e6362fcb - - -] 10.123.35.173 - - [23/Mar/2016 19:47:08] "POST /v2.0/subnets.json HTTP/1.1" 201 738 0.097313
2016-03-23 19:47:09.087 12894 ERROR neutron.plugins.ml2.managers [req-27a5c63e-3d3a-4383-992b-317b18a53faa - - - - -] Failed to bind port 3c2a5757-0765-4ac3-8552-c687709d9277 on host chavloli001
2016-03-23 19:47:09.087 12894 ERROR neutron.plugins.ml2.managers [req-27a5c63e-3d3a-4383-992b-317b18a53faa - - - - -] Failed to bind port 3c2a5757-0765-4ac3-8552-c687709d9277 on host chavloli001
2016-03-23 19:47:09.099 12894 INFO neutron.plugins.ml2.plugin [req-27a5c63e-3d3a-4383-992b-317b18a53faa - - - - -] Attempt 2 to bind port 3c2a5757-0765-4ac3-8552-c687709d9277
2016-03-23 19:47:11.229 12894 WARNING neutron.plugins.ml2.rpc [req-cf691abb-d90e-4b88-8c3c-5e8f5869db1f - - - - -] Device tap3c2a5757-07 requested by agent lb005056b23058 on network eac8b214-952b-4670-a53d-9f111327eb3e not bound, vif_type: binding_failed

======>
2016-03-23 20:14:55.009 16201 INFO neutron.plugins.ml2.drivers.type_vlan [-] VlanTypeDriver initialization complete
2016-03-23 20:14:55.009 16201 INFO neutron.plugins.ml2.managers [-] Initializing driver for type 'vxlan'
2016-03-23 20:14:55.010 16201 INFO neutron.plugins...

ngenibre (leosum) wrote :

Any update?

tags: removed: liberty-backport-potential