[RFE] Does not support shared N-S qos per-tenant

Bug #1787793 reported by Na Zhu
Affects: neutron | Status: New | Importance: Wishlist | Assigned to: Unassigned

Bug Description

The data center's north-south (N-S) bandwidth is bought from an ISP, so it is always limited and tenants are charged when they use it.
For example, when a tenant associates a floating IP with a port, he needs to select the floating IP bandwidth size he wants and pay for that bandwidth. Besides floating IPs, IPsec VPN also consumes N-S bandwidth. To save money, the tenant wants the cloud provider to let the floating IP and the IPsec VPN share one bandwidth allocation, because sometimes the floating IP traffic is large and sometimes the IPsec VPN traffic is large. If the two share one allocation, he only needs to pay for one bandwidth, whose size can be less than the sum of two separate ones.

Revision history for this message
Slawek Kaplonski (slaweq) wrote :

First of all, QoS for VPN traffic is not implemented yet so we can't "share" bandwidth between FIP and VPN traffic for now :)

You can apply a QoS policy with a limit to the port directly - then it will be shared between different types of traffic, but I can understand that it's not exactly what You have in mind.
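For reference, a per-port limit like the one described in the comment above can be set up with the standard OpenStack client roughly as follows; the policy name, rate, and port ID are illustrative:

```shell
# Sketch: a per-port egress bandwidth limit, shared by all traffic
# on that port (FIP, VPN, anything else traversing it).
openstack network qos policy create port-limit
openstack network qos rule create --type bandwidth-limit \
    --max-kbps 10000 --egress port-limit
openstack port set --qos-policy port-limit <port-id>
```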

Hongbin Lu (hongbin.lu)
summary: - Does not support shared N-S qos per-tenant
+ [RFE] Does not support shared N-S qos per-tenant
tags: added: rfe
Revision history for this message
Miguel Lavalle (minsel) wrote :

Let's discuss in the QoS meeting next week as a first step

Changed in neutron:
importance: Undecided → Wishlist
Revision history for this message
Miguel Lavalle (minsel) wrote :

VPN QoS is going to be implemented by Zhaobo during Stein https://review.openstack.org/#/c/558986/. But beyond this fact, if I understand correctly, the essence of this RFE is sharing a QoS bandwidth limit per tenant. Let's say the tenant has two floating IPs. The question is how these floating IPs can share a tenant-level QoS bandwidth limit. Some thoughts:

1) From the point of view of the API, we could add a new attribute ('tenant_level', with values True or False, default False) to the QoS policy resource. When set to True, this attribute indicates that the policy is shared by the N-S traffic of the two tenant floating IPs

2) From the data plane point of view, the question is how to set up such a shared QoS bandwidth limit so that it is shared by these two floating IPs
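A hypothetical API interaction for thought 1 might look like the following; note that the 'tenant_level' attribute is only the proposal here, it does not exist in the neutron QoS API today, and the endpoint host and token are illustrative:

```shell
# Hypothetical: create a QoS policy flagged as tenant-level
# (the 'tenant_level' attribute is proposed, not an existing field).
curl -s -X POST http://controller:9696/v2.0/qos/policies \
    -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" \
    -d '{"policy": {"name": "tenant-ns-limit", "tenant_level": true}}'
```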

Revision history for this message
LIU Yulong (dragon889) wrote :

AFAIK, vpnaas uses the centralized router (namespace) to carry the IPsec VPN traffic. How can we share bandwidth between distributed floating IPs and a centralized VPN? Within the neutron scope, I don't think the distributed floating IP traffic and the centralized VPN traffic can share one bandwidth. Maybe it can be limited by a higher-level physical device, such as the ToR, the core switch, or a physical firewall.

By the way, the endpoint IP for the VPN traffic is the neutron L3 centralized router gateway IP address. If so, I think the BP https://blueprints.launchpad.net/neutron/+spec/router-gateway-ip-qos can be the way to limit the VPN bandwidth for the cloud. Users can buy a large bandwidth for the gateway IP since it is shared by all subnets, and can adjust the bandwidth at any time for both the floating IP and the gateway IP. But users still need to pay for bandwidth twice, once for the floating IP and once for the gateway IP.

Revision history for this message
Slawek Kaplonski (slaweq) wrote :

It's maybe not a very good idea, but what if we treated bandwidth like any other resource and gave users a quota of bandwidth which they can use as they want? Then a user could use the bandwidth he pays for in whatever way he wants.
Maybe it should then be divided into a separate "FIP BW Quota" and "Port BW Quota", as those are different kinds of traffic.
What do You think about such an idea?

Revision history for this message
zhaobo (zhaobo6) wrote :

@yulong, currently vpnaas does not support the DVR topology; there is still much work to be done. ;-( How to share one bandwidth? The way to make it work is to introduce a new tc wrapper with a new QoS model, as we can easily share bandwidth with tc, right? I already introduced a new tc-wrapper in the draft patch. Also, I have to say I disagree with doing the limiting outside of OpenStack, as that doesn't sound like a cloud. ;-)

For the VPN and GW QoS conflicts, we must fix them if we reach an agreement on how to treat the different business goals towards QoS. So let's discuss the details further.

In my opinion, the user should just pay for one thing, call it a "Project QoS BW", and every bandwidth consumer then gets its share of the cost from that.

Revision history for this message
LIU Yulong (dragon889) wrote :

@Slawek Kaplonski, a bandwidth quota could be introduced to neutron, since users cannot be allowed to use bandwidth unrestrainedly. But users may complain about paying for a large bandwidth quota they do not actually use.

@zhaobo, bandwidth sharing across floating IPs in the DVR scenario also seems hard to implement within the neutron scope. As for that tc-wrapper, I think it can be useful for traffic shaping, averaging, minimum bandwidth assurance, etc. Hmm, as for physical devices, they can be managed by many plugins for neutron, such as OVN, MidoNet, ODL, etc. So we can say it is a cloud. ;-)

If the VPN QoS only cares about the traffic of the VPN connections, then it can be implemented as a new L3 agent extension. The gateway IP QoS is aimed at the router gateway IP; it will limit all traffic through that IP, so it will include the VPN traffic. It's an L3 agent extension too. So I think maybe the VPN QoS will install its tc rules into the same namespace and onto the same device? Are we then facing conflicts between the existing tc rules and the new tc-wrapper classful rules? If so, maybe changing the device or namespace for the VPN QoS can avoid the issue. If not, just do it, and let the cloud administrator decide which L3 extension he wants to use. :)

Revision history for this message
LIU Yulong (dragon889) wrote :

After the discussion during the QoS meeting yesterday [1], I think some points need further study.
For floating IP bandwidth sharing, let me first state some limitations of neutron:
1. In the DVR scenario, floating IPs are hosted on different physical hosts, so bandwidth sharing currently seems hard to achieve.
2. For one project, centralized floating IPs located on different routers may still have no way to share bandwidth; the routers may be scheduled to different physical hosts too.
3. If you have multiple external networks (floating IP networks) and each has its own NIC, then the sharing cannot be achieved. (P.S., neutron does not allow setting multiple gateways on a router, so this does not hurt much.)
   Sharing floating IP bandwidth across such external networks is not available.
4. So the last available scenario is floating IPs whose bandwidth is hosted by one single centralized router.

So for scenario 4, here are some reference tc classifier rules [2] [3], as asked for by Miguel Lavalle at the meeting.
Assuming we share 10Mbps between IPs 1.1.1.1 and 2.2.2.2, the rules will be:

tc qdisc add dev bond1 root handle 1: htb default 2
tc class add dev bond1 parent 1: classid 1:1 htb rate 10mbit ceil 10mbit # the shared bandwidth, 10Mbps
tc class add dev bond1 parent 1: classid 1:2 htb rate 999mbit # the default class
tc class add dev bond1 parent 1:1 classid 1:10 htb rate 5mbit ceil 10mbit # IP 1 share class
tc class add dev bond1 parent 1:1 classid 1:11 htb rate 5mbit ceil 10mbit # IP 2 share class
tc filter add dev bond1 parent 1: protocol ip prio 1 u32 match ip src 1.1.1.1 flowid 1:10 # IP 1 filter match rule, egress
tc filter add dev bond1 parent 1: protocol ip prio 1 u32 match ip src 2.2.2.2 flowid 1:11 # IP 2 filter match rule, egress

For my test, I installed these rules on the physical NIC bond1, and the neutron external network type is flat. Any other
network type, such as vlan, vxlan or gre, may change the tc filter protocol, e.g. 802.1q for vlan.

So, after installing these tc rules, IPs 1.1.1.1 and 2.2.2.2 can both work at 5Mbps. If 2.2.2.2 has no data traffic,
then 1.1.1.1 can borrow the bandwidth from 2.2.2.2 and reach the maximum of 10Mbps.

[1] http://eavesdrop.openstack.org/meetings/neutron_qos/2018/neutron_qos.2018-08-28-15.01.log.html
[2] https://wiki.linuxfoundation.org/networking/ifb
[3] http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm
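The rules above could also be generated by a small script for an arbitrary set of floating IPs sharing one limit; this is only a sketch, with the device name, rates, and IPs illustrative, and it prints the commands rather than applying them (pipe the output to sh as root to actually install the rules):

```shell
#!/bin/sh
# Sketch: emit htb sharing rules for a list of floating IPs.
DEV=bond1
SHARE_MBIT=10
IPS="1.1.1.1 2.2.2.2"

gen_rules() {
    echo "tc qdisc add dev $DEV root handle 1: htb default 2"
    # parent class holding the shared limit
    echo "tc class add dev $DEV parent 1: classid 1:1 htb rate ${SHARE_MBIT}mbit ceil ${SHARE_MBIT}mbit"
    # default class for all other traffic
    echo "tc class add dev $DEV parent 1: classid 1:2 htb rate 999mbit"
    count=$(echo "$IPS" | wc -w)
    rate=$((SHARE_MBIT / count))   # guaranteed share per IP
    n=10
    for ip in $IPS; do
        # each IP gets its guaranteed rate but may borrow up to the ceiling
        echo "tc class add dev $DEV parent 1:1 classid 1:$n htb rate ${rate}mbit ceil ${SHARE_MBIT}mbit"
        echo "tc filter add dev $DEV parent 1: protocol ip prio 1 u32 match ip src $ip flowid 1:$n"
        n=$((n + 1))
    done
}

gen_rules
```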

Revision history for this message
Miguel Lavalle (minsel) wrote :

This topic has been scheduled for discussion during the Denver PTG on Friday 14th at 10:45: https://etherpad.openstack.org/p/neutron-stein-ptg

Revision history for this message
Miguel Lavalle (minsel) wrote :

Now that we have a potential technical solution at the tc level, we will work on a spec that addresses other issues around this proposal, like API design, testing, etc.

tags: added: rfe-triaged
removed: rfe
Revision history for this message
Miguel Lavalle (minsel) wrote :

Another approach to solving this is the one proposed in these two patches: https://review.openstack.org/#/c/568526/ and https://review.openstack.org/#/c/424468. By handling QoS at the router gateway, the limits will be shared by the tenant

Miguel Lavalle (minsel)
tags: added: rfe-approved
removed: rfe-triaged