Comment 0 for bug 1252900

Thiago Martins (martinx) wrote :

Hello!

Currently, the Havana L3 Router has a serious issue that makes it almost useless (sorry, I do not want to be rude; I am just trying to bring more attention to this problem).

When tenant network traffic passes through the L3 Router (the namespace on the Network Node), it becomes very slow and intermittent. The issue also affects traffic that hits a "Floating IP" on its way into the tenant subnet.

The affected topology is: "Per-Tenant Router with Private Networks".

As a reference, I'm using the following Grizzly guide for my Havana deployment:

https://github.com/mseknibilel/OpenStack-Grizzly-Install-Guide/blob/OVS_MultiNode/OpenStack_Grizzly_Install_Guide.rst

Extra info:

http://docs.openstack.org/havana/install-guide/install/apt/content/section_networking-routers-with-private-networks.html

The symptoms are:

1- "Slow connection to Canonical or when browsing the web from within a tenant subnet"

aptitude update ; aptitude safe-upgrade

From within a tenant instance, this takes about 1 hour to finish, on a link capable of finishing it in 2~3 minutes.

2- SSH connections using Floating IPs freeze about 10 times per minute.

Connecting from the outside world into an instance using its Floating IP address is a pain.
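
To put symptom 1 in numbers, here is a tiny sketch of the slowdown factor implied by the timings above (about 1 hour observed versus 2~3 minutes expected). The function name `slowdown` is just an illustrative helper, not part of any OpenStack tool.

```python
def slowdown(observed_minutes, expected_minutes):
    """Return how many times slower the observed run is than the expected one."""
    return observed_minutes / expected_minutes


# Timings taken from the report: ~60 minutes observed, 2~3 minutes expected.
low = slowdown(60, 3)
high = slowdown(60, 2)
print("roughly %.0fx to %.0fx slower" % (low, high))  # roughly 20x to 30x slower
```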

We're discussing this issue on the OpenStack mailing list; here is the related thread: http://lists.openstack.org/pipermail/openstack/2013-November/002705.html

Also, I made a video about it, watch it here: http://www.youtube.com/watch?v=jVjiphMuuzM

Tested versions:

* OpenStack Havana on top of Ubuntu 12.04.3 using Ubuntu Cloud Archive

* Tested with Open vSwitch versions:

1.10.2 from UCA
1.11.0 compiled for Ubuntu 12.04.3 using "dpkg-buildpackage"
1.9.0 from Ubuntu package "openvswitch-datapath-lts-raring-dkms"

* Not tested:

Havana with Ubuntu 12.04.1 + OVS 1.4.0 (does not support VXLAN).

* Tenant subnet tested types:

VXLAN
GRE
VLAN

No matter which subnet type you choose, it is always slow.

Apparently, if you upgrade a Grizzly deployment from Ubuntu 12.04.1 + OVS 1.4.0 to Ubuntu 12.04.3 with OVS 1.9.0, it triggers this problem with Grizzly too. So I think this problem might be related to Open vSwitch itself, but I need more time to check this.

My Havana-based private cloud is open for you guys to debug, just ask for access! =)

My current plan is to test Havana with OVS 1.4.0, but I don't have much time this week to do this job.

I'm not sure whether the problem is with OVS or not; I'll try to test it this week.

Also, in my video, you guys can see how I "fixed" it by starting a Squid proxy-cache server within the tenant router namespace, proving that the problem appears ONLY when you try to establish a connection from a tenant subnet directly to the External network.

I mean, the connection between a tenant and its router is okay, and from its router to the Internet is also okay, but from a tenant to the Internet it is not. So Squid was a perfect choice to verify this theory at the Namespace router...
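
For anyone who wants to time the legs separately, here is a minimal sketch of how a command could be timed from inside the router namespace. It assumes the namespace is named "qrouter-<router UUID>" (the usual Neutron naming) and that it runs as root on the Network Node. The helper names `netns_cmd` and `time_command` are hypothetical, made up for this illustration.

```python
import subprocess
import time


def netns_cmd(router_id, cmd):
    """Wrap cmd so that it runs inside the router's network namespace."""
    # Assumption: the namespace is named "qrouter-<router UUID>".
    return ["ip", "netns", "exec", "qrouter-" + router_id] + list(cmd)


def time_command(cmd, timeout=600):
    """Run cmd, discard its output, and return (elapsed_seconds, return_code)."""
    start = time.monotonic()
    result = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return time.monotonic() - start, result.returncode
```

For example, `time_command(netns_cmd(router_id, ["wget", "-q", "-O", "/dev/null", url]))` would time a download on the router-to-Internet leg, with `router_id` and `url` being values you supply. Running the same `wget` through `time_command` from inside an instance gives the tenant-to-Internet number to compare against.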

Best!
Thiago