Connecting two or more distributed routers to a subnet doesn't work properly

Bug #1447227 reported by Assaf Muller on 2015-04-22
This bug affects 2 people
Affects: neutron | Status: Invalid | Importance: Low | Assigned to: zainub

Bug Description

The DVR code currently assumes that only one router may be attached to a subnet, but this is not the case. OVS flows, for example, will not work correctly for east/west (E/W) traffic, as incoming traffic is always assumed to come from one of the two routers.

The simple solution is to block the attachment of a distributed router to a subnet already attached to another distributed router.
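The blocking approach suggested above could be sketched as a plugin-side validation check at interface-attach time. This is a toy illustration only, not actual Neutron plugin code; every class and function name here is invented:

```python
# Hypothetical guard: reject attaching a distributed router to a subnet
# that already has an interface on another distributed router.
# All names below are illustrative, not real Neutron APIs.

class MultipleDistributedRoutersError(Exception):
    pass


class Subnet:
    def __init__(self, cidr):
        self.cidr = cidr
        self.routers = []  # routers with an interface on this subnet


class Router:
    def __init__(self, name, distributed=False):
        self.name = name
        self.distributed = distributed


def attach_interface(router, subnet):
    """Attach router to subnet, enforcing one distributed router per subnet."""
    if router.distributed and any(r.distributed for r in subnet.routers):
        raise MultipleDistributedRoutersError(
            "subnet %s is already attached to a distributed router" % subnet.cidr)
    subnet.routers.append(router)


subnet = Subnet("10.0.0.0/24")
attach_interface(Router("router1", distributed=True), subnet)  # succeeds
try:
    attach_interface(Router("router2", distributed=True), subnet)
except MultipleDistributedRoutersError as e:
    print("rejected:", e)
```

Note that a legacy (non-distributed) router would still be allowed to attach alongside a distributed one under this check; whether that combination should also be blocked is a separate question.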

Eugene Nikanorov (enikanorov) wrote :

What's the use case of attaching 2 routers to a subnet?

tags: added: api
Changed in neutron:
status: New → Incomplete
Assaf Muller (amuller) wrote :

I have no idea. The API allows it though. I'm not saying this bug report is high or medium priority, I just wanted the issue known and tracked.

Changed in neutron:
status: Incomplete → Confirmed
status: Confirmed → Triaged
importance: Undecided → Low

Fix proposed to branch: master
Review: https://review.openstack.org/191671

Changed in neutron:
assignee: nobody → ZongKai LI (lzklibj)
status: Triaged → In Progress

Change abandoned by Kyle Mestery (<email address hidden>) on branch: master
Review: https://review.openstack.org/191671
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Kevin Fox (kevpn) wrote :

Bummer. We use multiple routers per subnet on one of our production clouds and are starting to look into enabling DVR on that cloud.

The use case is:

We have two external networks: an Internet-facing one, and a shared private network for sharing servers between tenants but not with the Internet. Each tenant has a tenant network and two routers, one to each external network. This allows tenant users to attach floating IPs on either (or both) networks to their VMs. It has worked very well in production.

Thanks,
Kevin

John Schwarz (jschwarz) on 2016-02-15
Changed in neutron:
assignee: ZongKai LI (lzklibj) → John Schwarz (jschwarz)
John Schwarz (jschwarz) wrote :

This works for me:

* 2 tenant networks and subnets, 10.0.0.0/24 and 20.0.0.0/24,
* 2 routers:
   * router1 is connected to external network and to 10.0.0.0/24 (using 10.0.0.1, so default gateway)
   * router2 is connected to 20.0.0.0/24 (using 20.0.0.1, so default gateway) and 10.0.0.0/24 (as 10.0.0.3).
* 2 VMs, each connected to only one of the above subnets. VM1 is at 10.0.0.0/24 and should have external connectivity.

Using DVR routers for both router1 and router2 allows VM1 to ping 8.8.8.8 and VM2.
The only modification needed is adding a static route in VM1 so that it knows 20.0.0.0/24 is reachable via 10.0.0.3, but the same would be needed with legacy routers, so it's not a DVR issue.

So "works for me".
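The static route John describes can be pictured as a longest-prefix-match lookup in VM1's routing table: a default route via router1 (10.0.0.1) plus the extra entry for 20.0.0.0/24 via router2's leg (10.0.0.3). A stdlib-only toy illustration, not Neutron code:

```python
# Simulate VM1's routing table from the topology above.
import ipaddress

routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "10.0.0.1"),    # default via router1
    (ipaddress.ip_network("20.0.0.0/24"), "10.0.0.3"),  # static via router2
]


def next_hop(dest):
    """Return the gateway for dest by longest-prefix match."""
    dest = ipaddress.ip_address(dest)
    _, hop = max(((net, hop) for net, hop in routes if dest in net),
                 key=lambda pair: pair[0].prefixlen)
    return hop


print(next_hop("8.8.8.8"))    # 10.0.0.1 -> north/south via router1
print(next_hop("20.0.0.5"))   # 10.0.0.3 -> east/west via router2
```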

Assaf Muller (amuller) wrote :

@John the bug report is about one DVR router connected to two internal subnets.

John Schwarz (jschwarz) wrote :

@Assaf, in my test, router2 was connected to 2 internal subnets (10.0.0.0/24, 20.0.0.0/24) and it worked. Am I missing something?

Assaf Muller (amuller) wrote :

@John, you're right I missed that. When I looked at this there was an issue in the OVS flows where VM 2 (In your example) is hosted. Can you show ovs-ofctl dump-flows {br_int, br_tun}?

John Schwarz (jschwarz) wrote :

Sure thing.
The first node (which hosts VM2) also hosts the snat namespace of router1, so that's why it has more ovs rules.
In other words: the first node contains the snat of router1 and VM2, and the second node contains VM1.

first node: http://pastebin.com/wq5cTBns
second node: http://pastebin.com/M5SZnLiD

John Schwarz (jschwarz) wrote :

Due to a power outage overnight my setup crashed and I had a bit of trouble getting it to work again. I've deleted router1's leg to 10.0.0.0/24, so the sg device was recreated (and its MAC address changed). I'm including new printouts.
These printouts are also more thorough: in addition to the 'ovs-ofctl dump-flows' outputs, they include 'ovs-ofctl show' for both br-int and br-tun, as well as 'ip a' for the relevant qrouter/snat namespaces.

first node: http://pastebin.com/kBbYd5y2
second node: http://pastebin.com/eAMkPYKp

John.

John Schwarz (jschwarz) on 2016-03-03
Changed in neutron:
assignee: John Schwarz (jschwarz) → nobody
Changed in neutron:
status: In Progress → Confirmed
zainub (zainub.wahid) on 2016-11-16
Changed in neutron:
assignee: nobody → zainub (zainub.wahid)
Oleg Bondarev (obondarev) wrote :

Shouldn't this case be handled by specifying the proper host routes for such a subnet (connected to several routers)?

Changed in neutron:
status: Confirmed → Opinion
Kevin Benton (kevinbenton) wrote :

It's actually not clear to me what the failure mode is here.

If you attach two routers to a subnet and then have routes setup to point to one router for some subnets and the rest to another, are you saying that traffic fails to be forwarded to the correct subnet due to the flows?

Changed in neutron:
status: Opinion → Incomplete
Assaf Muller (amuller) wrote :

@Kevin the bug report was correct in April of 2015. Back then the OVS flows were not set up correctly to account for the two-router case. I don't know what the state is today; the bug may still exist.

Oleg Bondarev (obondarev) wrote :

Works for me on Mitaka and on master. I followed the steps from John's comment #6, but added a host-route on the subnet connected to the 2 DVR routers instead of manually adding a static route on the VM. Marking as invalid.
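The host-route approach Oleg describes is a subnet update through the Neutron API's host_routes attribute; DHCP then pushes the routes to guests as classless static routes, so no per-VM configuration is needed. A minimal sketch of the request body, using the addresses from John's topology (the subnet ID in the URL is a placeholder):

```python
# Build the PUT body for updating a subnet's host_routes.
# host_routes is the standard Neutron subnets API attribute; the
# subnet ID below is a placeholder, not a real resource.
import json

body = {
    "subnet": {
        "host_routes": [
            {"destination": "20.0.0.0/24", "nexthop": "10.0.0.3"},
        ]
    }
}

# Sent as: PUT /v2.0/subnets/<subnet-id>
print(json.dumps(body, indent=2))
```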

Changed in neutron:
status: Incomplete → Invalid