[RFE]L3 Router should support ECMP

Bug #1880532 reported by XiaoYu Zhu on 2020-05-25
Assigned to: XiaoYu Zhu

Bug Description

ECMP (equal-cost multi-path) is a routing technique that allows multiple different links to reach the same destination. Since ECMP is already supported by the Linux kernel, Neutron can support it simply by using the Linux command line to add a route entry in the qrouter namespace.

An ECMP command looks like:
ip route replace to <destination_ip> nexthop via <nexthop_ip1> nexthop via <nexthop_ip2>

This produces a route entry like the following:
           nexthop via <nexthop_ip1> dev qr-xxxxxxxx-nn weight 1
           nexthop via <nexthop_ip2> dev qr-xxxxxxxx-nn weight 1
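
As a rough illustration of the command above, here is a minimal Python sketch that assembles the argument list for an ECMP route inside a router namespace. The function name `build_ecmp_route_cmd`, the example addresses, and the `qrouter-<router_id>` namespace naming are assumptions made for this sketch; it is not the code proposed in the spec.

```python
def build_ecmp_route_cmd(router_id, destination, nexthops):
    """Build argv for `ip route replace` with several next hops.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if len(nexthops) < 2:
        raise ValueError("an ECMP route needs at least two next hops")
    # Assumption: the router namespace is named qrouter-<router_id>.
    cmd = ["ip", "netns", "exec", "qrouter-%s" % router_id,
           "ip", "route", "replace", "to", destination]
    for hop in nexthops:
        # One "nexthop via <ip>" clause per equal-cost path.
        cmd += ["nexthop", "via", hop]
    return cmd


if __name__ == "__main__":
    argv = build_ecmp_route_cmd("ROUTER_ID", "10.0.0.100/32",
                                ["10.0.0.11", "10.0.0.12"])
    print(" ".join(argv))
```

Returning a list rather than a single string keeps the arguments separate, which avoids shell quoting issues if the command is later passed to a process runner.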

The router will then randomly pick one <nexthop_ip> and write its MAC address into the packet's dst_mac field whenever a packet is sent to the <destination_ip>.
Since Octavia has proposed an active-active load balancing design at https://review.opendev.org/#/c/723864/ and this design needs ECMP support in Neutron, I hope ECMP can be supported as soon as possible.
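
The next-hop choice described above can be pictured with a toy model. Linux multipath routing generally selects a next hop per flow by hashing packet header fields, so packets of one flow keep using the same next hop. The sketch below only illustrates that idea; the function `pick_nexthop`, the choice of SHA-256, and the addresses are assumptions for illustration and do not reproduce the kernel's actual algorithm.

```python
import hashlib


def pick_nexthop(flow, nexthops):
    """Pick a next hop for a flow (toy model, not the kernel algorithm).

    flow     -- tuple of strings, e.g. (src_ip, dst_ip)
    nexthops -- list of (ip, weight) pairs, weight >= 1
    """
    # Expand each next hop into `weight` slots so heavier hops win more often.
    slots = [ip for ip, weight in nexthops for _ in range(weight)]
    # Hash the flow so the same flow always maps to the same slot.
    digest = hashlib.sha256("|".join(flow).encode()).digest()
    return slots[int.from_bytes(digest[:4], "big") % len(slots)]


if __name__ == "__main__":
    hops = [("10.0.0.11", 1), ("10.0.0.12", 1)]
    print(pick_nexthop(("192.0.2.1", "10.0.0.100"), hops))
```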

XiaoYu Zhu (honglan0914) on 2020-05-25
summary: - L3 Router should support ECMP
+ [RFE]L3 Router should support ECMP
description: updated
XiaoYu Zhu (honglan0914) on 2020-05-25
description: updated
XiaoYu Zhu (honglan0914) on 2020-05-25
description: updated
Changed in neutron:
assignee: nobody → XiaoYu Zhu (honglan0914)

Hello XiaoYu:

Can you bring this RFE to the next Neutron L3 meeting or Neutron drivers meeting?


Thanks and regards.

Slawek Kaplonski (slaweq) wrote :

Do you plan to extend the existing extraroutes API or to introduce something new?

LIU Yulong (dragon889) wrote :

This should be the spec for this RFE:

XiaoYu Zhu (honglan0914) wrote :

To Slawek Kaplonski:
Yes, I do. And yes, the spec file is at #3.
And I'm sorry for missing the meeting on Wednesday...

Slawek Kaplonski (slaweq) wrote :


I see that Liu Yulong has a couple of valid questions related to this; see the logs from the last L3 subteam meeting: http://eavesdrop.openstack.org/meetings/neutron_l3/2020/neutron_l3.2020-05-27-14.02.log.html#l-16

I will point it here just for the record:

14:16:26 <liuyulong> From the spec I can only see the "background", "the problem from that background", "code changes".
14:17:25 <liuyulong> It missed the important section "the detail of the new approach"
14:17:35 <liuyulong> why it works? and how it works?
14:20:16 <liuyulong> And, in other word, this is something like a load balancer upon the Octavia load balancer.
14:24:41 <liuyulong> And why not add the LB "real server" to the octavia load balancer?
14:27:03 <liuyulong> One more thing is the route is cross the subnet, so the traffic in the same subnet should not go to the router, so the ECMP is actually not work.
14:27:23 <liuyulong> So why not add the route directly to the guest client VM?

Can you reply here to those questions from Liu? Thanks in advance.

tags: added: l3-dvr-backlog
XiaoYu Zhu (honglan0914) wrote :

To Slawek Kaplonski:

Thanks for your attention. I have given my replies in my spec file, and I'll paste them here as follows:

<Q0:the detail of the new approach>
I thought I had explained the implementation method clearly, but it seems that it is not enough. I will fill in this part if necessary.
<Q1:why it works? and how it works>
First we create a port with the VIP and never bind it to any VM. Then Octavia creates some load balancers, each with its own real IP, all sharing the one VIP.
Octavia sends a request to Neutron to create an ECMP instance. The Neutron l3-agent then executes a command in the qrouter namespace to create an ECMP entry like the one I wrote above, filling the <destination_ip> field with the VIP and the <nexthop_ip> fields with the real IPs.
Now, when packets arrive at the router with the VIP as their destination IP, they match the ECMP entry, so the qrouter changes their destination MAC address to the MAC address of a real IP, which belongs to a load balancer.
<Q2:load balancer upon the Octavia load balancer>
Yes, you are right. ECMP is another layer of load balancing, which provides more flexibility.
<Q3:add the LB "real server" to the octavia load balancer>
I'm not sure what you mean... but the "back ends" referred to here are exactly the Octavia backend pool.
<Q4:same subnet will not go to the router>
In fact I also added the ARP surrogate setting to the router, but forgot to write it in my spec file. I'll add that part later.
<Q5:why not add the route directly to the guest client VM>
It is because the simple ECMP approach cannot provide L7 load balancing features. Complex load balancing should be done by haproxy, LVS or another load balancer. The two levels of load balancing (ECMP + load balancer) can be easily managed by Octavia...

In addition, Mr. Liu Yulong has also asked several other questions in my spec file, as follows:

<Q6:How about DVR scenario>
It can work in the DVR scenario: just put the ECMP entries into every router related to a load balancing node.
<Q7:How about bug 1774459>
This bug will not be triggered, because the router does not need the MAC address of the VIP.

Changed in neutron:
importance: Undecided → Wishlist
XiaoYu Zhu (honglan0914) on 2020-06-01
description: updated
XiaoYu Zhu (honglan0914) wrote :

To YangJianFeng:

As Jay Liu has replied to you in his spec, the command you wrote, "openstack router set <routerid> --route destination=<des_ip1>,gateway=<nexthop_ip1> --route destination=<des_ip1>,gateway=<nexthop_ip2>", results in only one route entry, with <nexthop_ip2>, because the second route replaces the first one. I think this behavior of the command has its reason and original purpose, and I don't think we should change it.
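
The replace behavior described above can be modeled in a few lines. This is a simplified sketch of "last route with the same destination wins", assuming each route is a dict with `destination` and `nexthop` keys; `apply_route_replace` is a hypothetical name, the addresses are examples, and the sketch is not the actual l3-agent code.

```python
def apply_route_replace(routes):
    """Model installing routes one by one with replace semantics.

    The table is keyed by destination, so a later route with the same
    destination overwrites the earlier one instead of adding a second path.
    """
    table = {}
    for route in routes:
        table[route["destination"]] = route["nexthop"]
    return table


if __name__ == "__main__":
    routes = [
        {"destination": "10.0.0.100/32", "nexthop": "10.0.0.11"},
        {"destination": "10.0.0.100/32", "nexthop": "10.0.0.12"},
    ]
    # Only the second next hop survives for the shared destination.
    print(apply_route_replace(routes))
```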

Slawek Kaplonski (slaweq) wrote :


I added a question in the proposed spec for this feature but I will repeat it here too:

I don't understand why we need a new attribute "ecmp_router" instead of reusing (maybe with some changes in the logic) the existing "extra_routes".
What if e.g. my ecmp_route overlaps with an extra_route in the router?

Please provide this information before we discuss that RFE in the drivers meeting.
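
To make the overlap question concrete, here is a small sketch that detects overlapping destinations between two route lists using Python's standard `ipaddress` module. The function `find_overlaps`, the example addresses, and the assumption that both kinds of route carry a `destination` CIDR are illustrative only; the actual attribute layout is defined by the spec, not here.

```python
import ipaddress


def find_overlaps(ecmp_routes, extra_routes):
    """Return (ecmp_destination, extra_destination) pairs that overlap."""
    overlaps = []
    for ecmp in ecmp_routes:
        ecmp_net = ipaddress.ip_network(ecmp["destination"])
        for extra in extra_routes:
            extra_net = ipaddress.ip_network(extra["destination"])
            # Only compare networks of the same IP version.
            if (ecmp_net.version == extra_net.version
                    and ecmp_net.overlaps(extra_net)):
                overlaps.append((ecmp["destination"], extra["destination"]))
    return overlaps


if __name__ == "__main__":
    ecmp = [{"destination": "10.0.0.100/32"}]
    extra = [{"destination": "10.0.0.0/24", "nexthop": "10.0.1.1"},
             {"destination": "172.16.0.0/16", "nexthop": "10.0.1.2"}]
    print(find_overlaps(ecmp, extra))
```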

XiaoYu Zhu (honglan0914) wrote :

Hi Slawek,
I have considered this before, and my thought is:
The current extra_route processing logic executes a Linux command to add the route directly after receiving an update message. To support ECMP, Neutron would need to query the database after receiving the request, check whether the same router already has extra_route entries with the same destination address, and if so merge them into one ECMP route. That increases the time complexity and also breaks the original "replace the route with the same destination address" logic.
And modifying the original routing table database format... I feel it would break too much of the original code logic.
As for overlapping extra routes, the original route will be replaced by the new route that has the same destination, which I think is logically correct.
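
The merge step discussed above (combining extra_route entries that share a destination into one multi-nexthop route) might look roughly like this. It is a sketch assuming routes are dicts with `destination` and `nexthop` keys; `merge_into_ecmp` is a hypothetical name, the addresses are examples, and this is not taken from Neutron's code.

```python
def merge_into_ecmp(extra_routes):
    """Group routes by destination, collecting the next hops of each.

    Returns a dict mapping destination -> list of next hops, in first-seen
    order and without duplicates. A destination with a single next hop is
    an ordinary route; two or more next hops form an ECMP route.
    """
    merged = {}
    for route in extra_routes:
        hops = merged.setdefault(route["destination"], [])
        if route["nexthop"] not in hops:
            hops.append(route["nexthop"])
    return merged


if __name__ == "__main__":
    routes = [
        {"destination": "10.0.0.100/32", "nexthop": "10.0.0.11"},
        {"destination": "10.0.0.100/32", "nexthop": "10.0.0.12"},
        {"destination": "192.168.5.0/24", "nexthop": "10.0.0.1"},
    ]
    print(merge_into_ecmp(routes))
```

Grouping costs a single pass over the route list, but as noted above it changes the meaning of two routes with the same destination from "replace" to "add a path".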

Slawek Kaplonski (slaweq) wrote :

I would like to discuss it at the next Neutron drivers meeting, which will be on Friday: http://eavesdrop.openstack.org/#Neutron_drivers_Meeting. It would be great if you could join in case there are any additional questions, but the RFE will be discussed even if you are not able to attend.

tags: added: rfe-triaged
removed: rfe
Slawek Kaplonski (slaweq) wrote :

We discussed this RFE at the drivers meeting on 10.07.2020 and agreed with XiaoYu Zhu that it's worth checking whether the existing extra routes API can be reused for this. Please check that and comment here so we can get back to this RFE once again.

XiaoYu Zhu (honglan0914) wrote :

I gave my response and an update on Gerrit a few days ago, but it seems I'd better summarize it properly here:

I have read the RouterInfo part of the code again and determined that it is feasible to use the existing APIs, either the extra routes API or router update. I can do it by merging two or more routes with the same destination; it just makes the process of updating the routes a little more complicated.

I have uploaded the changes I made to RouterInfo to https://github.com/z503755743/NEUTRON-ECMP
It's not as complete as I described in the spec yet, but it can be used to add ECMP routes now.

Slawek Kaplonski (slaweq) wrote :

We discussed this proposal at today's drivers meeting and agreed to approve it. Please now move on with the spec and implementation for this feature.

tags: added: rfe-approved
removed: rfe-triaged