l3agent can't create router if there are multiple external networks

Bug #1824571 reported by Szabolcs Tenczer
This bug affects 11 people
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Medium
Miguel Lavalle

Bug Description

When there is more than one external network, the L3 agent is unable to create routers and fails with the following error:

2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 701, in _process_routers_if_compatible
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router)
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 548, in _process_router_if_compatible
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent target_ex_net_id = self._fetch_external_net_id()
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 376, in _fetch_external_net_id
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent raise Exception(msg)
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network.

It happens in a DVR scenario on both dvr and dvr_snat agents. It started after upgrading from Rocky to Stein; before the upgrade it worked fine. The gateway_external_network_id option is not set in my config, because I want the L3 agent to be able to use multiple external networks.

Changed in neutron:
importance: Undecided → High
status: New → Confirmed
tags: added: l3-dvr-backlog
Revision history for this message
Brian Haley (brian-haley) wrote :

This could be related to https://review.openstack.org/#/c/567369/ or at least is something to look at.

Miguel Lavalle (minsel)
Changed in neutron:
assignee: nobody → Miguel Lavalle (minsel)
Revision history for this message
Slawek Kaplonski (slaweq) wrote :

I set it as critical because at least 3 different people have already hit this issue and asked about it in the neutron IRC channel and on the OpenStack mailing list.

Changed in neutron:
importance: High → Critical
Revision history for this message
Miguel Lavalle (minsel) wrote :

I can replicate this issue in an environment running master branch. When setting the external gateway:

openstack router set --external-gateway {external-net-id} {router or router-id}

I get the following traceback in the l3 agent:

May 25 23:30:09 network neutron-l3-agent[15565]: ERROR neutron.agent.l3.agent [-] Failed to process compatible router: e6d235a5-46f8-4a27-b7d5-adeb8ed8b619: Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network.
May 25 23:30:09 network neutron-l3-agent[15565]: ERROR neutron.agent.l3.agent Traceback (most recent call last):
May 25 23:30:09 network neutron-l3-agent[15565]: ERROR neutron.agent.l3.agent File "/opt/stack/neutron/neutron/agent/l3/agent.py", line 759, in _process_routers_if_compatible
May 25 23:30:09 network neutron-l3-agent[15565]: ERROR neutron.agent.l3.agent
May 25 23:30:09 network neutron-l3-agent[15565]: ERROR neutron.agent.l3.agent File "/opt/stack/neutron/neutron/agent/l3/agent.py", line 598, in _process_router_if_compatible
May 25 23:30:09 network neutron-l3-agent[15565]: ERROR neutron.agent.l3.agent # by forcing a check by RPC.
May 25 23:30:09 network neutron-l3-agent[15565]: ERROR neutron.agent.l3.agent File "/opt/stack/neutron/neutron/agent/l3/agent.py", line 426, in _fetch_external_net_id
May 25 23:30:09 network neutron-l3-agent[15565]: ERROR neutron.agent.l3.agent raise Exception(msg)
May 25 23:30:09 network neutron-l3-agent[15565]: ERROR neutron.agent.l3.agent Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network.
May 25 23:30:09 network neutron-l3-agent[15565]: ERROR neutron.agent.l3.agent

Revision history for this message
Miguel Lavalle (minsel) wrote :

This regression was introduced when we merged https://review.opendev.org/#/c/567369/, which removed the external_network_bridge option. Per the commit message in change https://review.opendev.org/#/c/59359, "l3 agent can handle any networks by setting the neutron parameter external_network_bridge and gateway_external_network_id to empty". After removing the external_network_bridge option, we no longer perform this check: https://github.com/openstack/neutron/blob/946faaf361003a0a67081be28565d3815c354510/neutron/agent/l3/agent.py#L311-L315, so the L3 agent can no longer handle several external networks.

Changed in neutron:
status: Confirmed → In Progress
Revision history for this message
Brian Haley (brian-haley) wrote :

Proposed fix at https://review.opendev.org/#/c/661509/ (somehow the bug didn't get auto-updated).

LIU Yulong (dragon889)
Changed in neutron:
importance: Critical → High
importance: High → Critical
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/661835

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/661509
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0b3f5f429d2e495eb78d78d46186092ac735e0d5
Submitter: Zuul
Branch: master

commit 0b3f5f429d2e495eb78d78d46186092ac735e0d5
Author: Miguel Lavalle <email address hidden>
Date: Sun May 26 19:15:25 2019 -0500

    Support multiple external networks in L3 agent

    Change [1] removed the deprecated option external_network_bridge. Per
    commit message in change [2], "l3 agent can handle any networks by
    setting the neutron parameter external_network_bridge and
    gateway_external_network_id to empty". So the consequence of [1] was to
    introduce a regression whereby multiple external networks are not
    supported by the L3 agent anymore.

    This change proposes a new simplified rule. If
    gateway_external_network_id is defined, that is the network that the L3
    agent will use. If not and multiple external networks exist, the L3
    agent will handle any of them.

    [1] https://review.opendev.org/#/c/567369/
    [2] https://review.opendev.org/#/c/59359

    Change-Id: Idd766bd069eda85ab6876a78b8b050ee5ab66cf6
    Closes-Bug: #1824571
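
The simplified rule described in the commit message above can be sketched roughly as follows. This is only an illustration, not the code from the merged change: the function names `select_external_network` and `agent_can_handle` are hypothetical, and the behaviour when exactly one or zero external networks exist is an assumption, since the commit message only covers the "option defined" and "multiple networks" cases.

```python
from typing import Optional, Sequence


def select_external_network(
    gateway_external_network_id: Optional[str],
    external_network_ids: Sequence[str],
) -> Optional[str]:
    """Return the network ID the agent is pinned to, or None for "any".

    Hypothetical helper illustrating the rule in the commit message.
    """
    # Rule 1: an explicitly configured network always wins.
    if gateway_external_network_id:
        return gateway_external_network_id
    # Assumption (not stated in the commit message): with exactly one
    # external network, the agent targets that network.
    if len(external_network_ids) == 1:
        return external_network_ids[0]
    # Rule 2: option unset and several external networks exist, so the
    # agent handles any of them. The zero-network case also lands here,
    # which is an assumption of this sketch.
    return None


def agent_can_handle(
    router_ext_net_id: str,
    gateway_external_network_id: Optional[str],
    external_network_ids: Sequence[str],
) -> bool:
    """True if a router with this gateway network is acceptable to the agent."""
    target = select_external_network(
        gateway_external_network_id, external_network_ids
    )
    return target is None or target == router_ext_net_id
```

With the option left empty and two external networks, `agent_can_handle` accepts a router on either network; with the option set to one of them, routers on the other network are rejected.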

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/stein)

Reviewed: https://review.opendev.org/661835
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=62e0bee8206491690bd6b6b72fc38d9694d5083f
Submitter: Zuul
Branch: stable/stein

commit 62e0bee8206491690bd6b6b72fc38d9694d5083f
Author: Miguel Lavalle <email address hidden>
Date: Sun May 26 19:15:25 2019 -0500

    Support multiple external networks in L3 agent

    Change [1] removed the deprecated option external_network_bridge. Per
    commit message in change [2], "l3 agent can handle any networks by
    setting the neutron parameter external_network_bridge and
    gateway_external_network_id to empty". So the consequence of [1] was to
    introduce a regression whereby multiple external networks are not
    supported by the L3 agent anymore.

    This change proposes a new simplified rule. If
    gateway_external_network_id is defined, that is the network that the L3
    agent will use. If not and multiple external networks exist, the L3
    agent will handle any of them.

    [1] https://review.opendev.org/#/c/567369/
    [2] https://review.opendev.org/#/c/59359

    Change-Id: Idd766bd069eda85ab6876a78b8b050ee5ab66cf6
    Closes-Bug: #1824571
    (cherry picked from commit 0b3f5f429d2e495eb78d78d46186092ac735e0d5)

tags: added: in-stable-stein
Revision history for this message
Florian Guitton (f-guitton) wrote :

Hello everybody,

Does anyone know when we can expect a new 14.x version of the packages to be published with this patch?
We have a critical update planned that will require this fix, and it would be tremendous to be able to rely on upstream packages.

Best wishes,

Revision history for this message
Brian Haley (brian-haley) wrote :

Florian - we try and tag all the stable branches on every release milestone, the next one being T-1 on June 6th [0]. Looking at stable/stein reviews we have only two outstanding, so once they merge I will send out a review to tag it.

[0] https://releases.openstack.org/train/schedule.html#t-1

Revision history for this message
Florian Guitton (f-guitton) wrote :

Thank you very much Brian, very informative!
And great news... looking forward to the review!

no longer affects: ubuntu
Miguel Lavalle (minsel)
Changed in neutron:
status: Fix Released → Confirmed
Miguel Lavalle (minsel)
Changed in neutron:
importance: Critical → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 14.0.2

This issue was fixed in the openstack/neutron 14.0.2 release.

Changed in neutron:
status: Confirmed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 15.0.0.0b1

This issue was fixed in the openstack/neutron 15.0.0.0b1 development milestone.

Revision history for this message
Marek Grudzinski (ivve) wrote :

So I have kept Rocky neutron in an overall Stein installation, but I'm still getting this issue in neutron 14.0.2. Here are the logs:

Exception during message handling: TooManyExternalNetworks: More than one external network exists.
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 166, in _process_incoming
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/api/rpc/handlers/l3_rpc.py", line 254, in get_external_network_id
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server net_id = self.plugin.get_external_network_id(context)
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/db/external_net_db.py", line 149, in get_external_network_id
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server raise n_exc.TooManyExternalNetworks()
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server TooManyExternalNetworks: More than one external network exists.
2019-10-14 12:15:40.682 25 ERROR oslo_messaging.rpc.server

and

Failed to process compatible router: a1d8edbd-2156-4cae-9d3c-803bb32acd94: Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network.
2019-10-14 12:15:39.950 58 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2019-10-14 12:15:39.950 58 ERROR neutron.agent.l3.agent File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 701, in _process_routers_if_compatible
2019-10-14 12:15:39.950 58 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router)
2019-10-14 12:15:39.950 58 ERROR neutron.agent.l3.agent File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 548, in _process_router_if_compatible
2019-10-14 12:15:39.950 58 ERROR neutron.agent.l3.agent target_ex_net_id = self._fetch_external_net_id()
2019-10-14 12:15:39.950 58 ERROR neutron.agent.l3.agent File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 376, in _fetch_external_net_id
2019-10-14 12:15:39.950 58 ERROR neutron.agent.l3.agent raise Exception(msg)
2019-10-14 12:15:39.950 58 ERROR neutron.agent.l3.agent Exception: The...


Revision history for this message
Marek Grudzinski (ivve) wrote :

Just to clarify, I upgraded from 13.0.4 to 14.0.2 and I am getting this message. I am using HA and DVR in this deployment.

Revision history for this message
Marek Grudzinski (ivve) wrote :

My bad, kolla-ansible didn't properly respond to its upgrade playbooks and decided only to upgrade to 14.0.0 where the problem actually occurs.

Revision history for this message
Laurent Dumont (baconpackets) wrote :

If anyone has a second, I'm not sure if I'm hitting the same issue but I would just like to confirm. I am seeing two similar logs when I create two external networks within the same project.

I'm using Terraform to spawn the Networks and if I create both my PROC and EXT network with the "external" flag, the router will not spawn and will show both the "RouterNotCompatibleWithAgent" and "TooManyExternalNetworks" errors.

I'm using kolla 8 with a stein install. Is there a way to check I'm seeing the same issue?
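
One rough way to compare against this report is to scan the neutron logs for the error strings quoted in this thread. The sketch below is only a string match on those messages, so a hit suggests, but does not prove, the same root cause. The function name `count_signatures` and the signature labels are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Error strings taken from the log excerpts and comments in this report.
SIGNATURES = {
    "agent_option_required": re.compile(
        r"'gateway_external_network_id' option must be configured"
    ),
    "too_many_external_networks": re.compile(r"TooManyExternalNetworks"),
    "router_not_compatible": re.compile(r"RouterNotCompatibleWithAgent"),
}


def count_signatures(lines: Iterable[str]) -> Counter:
    """Count how many log lines match each known error signature."""
    counts: Counter = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Any iterable of lines works, for example an open file object for the L3 agent or neutron server log; the log location depends on the deployment.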

Revision history for this message
Marek Grudzinski (ivve) wrote :

I'm taking back my earlier statement. The log message still occurs in 14.0.2 in a HA DVR environment.

Revision history for this message
chalansonnet (schalans) wrote :

Hello,

In our environment we can also reproduce the problem (Kolla Ansible / CentOS 7 / Stein).
We tried two workarounds:

- Remove DVR and reconfigure Neutron: this worked, and the problem went away with this configuration.
- Keep DVR: we found what I think is a bug in a Python library.
The latest Kolla Docker image of the L3 agent was built with python2-pyroute2-0.5.3-4.el7.noarch.
There is an RPC bug reported at https://bugs.launchpad.net/neutron/+bug/1856572

Updating to python2-pyroute2-0.5.6-1.el7.noarch (the l3_agent and openvswitch containers both need to be updated) did the job, and the problem was resolved.
