Performance degradation when multiple subnets added

Bug #1556807 reported by Yuli
This bug affects 2 people
Affects: DragonFlow
Status: Fix Released
Importance: High
Assigned to: Yuli

Bug Description

Hello

There is a performance degradation when creating multiple subnetworks.

time spent to create 100 subnets: 265 secs
time spent to create +100 subnets: 558 secs
time spent to create +100 subnets: 807 secs
time spent to create +100 subnets: 1025 secs
time spent to create +100 subnets: 1267 secs
time spent to create +100 subnets: 1529 secs
time spent to create +100 subnets: 1766 secs
time spent to create +100 subnets: 2067 secs
time spent to create +100 subnets: 2294 secs

Total time to add 1000 subnets is 11578 secs, about 193 mins, which is more than 3 hours,
whereas it takes just about 4.5 mins to create the first 100 subnets.
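
For reference, the totals can be rechecked from the nine batch times listed above. This is only a small Python sketch; the variable names are illustrative and not taken from the attached script.

```python
# Batch times in seconds, copied from the nine lines listed in the report.
batch_secs = [265, 558, 807, 1025, 1267, 1529, 1766, 2067, 2294]

total_secs = sum(batch_secs)
total_mins = total_secs / 60

# Average growth between one batch of 100 subnets and the next.
avg_step = (batch_secs[-1] - batch_secs[0]) / (len(batch_secs) - 1)

# How much slower the last listed batch is than the first one.
slowdown = batch_secs[-1] / batch_secs[0]

print(f"total: {total_secs} secs = {total_mins:.0f} mins = {total_mins / 60:.1f} hours")
print(f"each batch takes about {avg_step:.0f} secs longer than the previous one")
print(f"last batch is {slowdown:.1f}x slower than the first")
```

Each batch takes roughly 250 secs longer than the one before it, so the total time grows much faster than the number of subnets.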

I am running this test on a single box.

You need to fix the quota and file limit issues first in order to run this test:
https://bugs.launchpad.net/dragonflow/+bug/1552228
https://bugs.launchpad.net/dragonflow/+bug/1556598

Yuli (stremovsky) wrote :

Attaching script to reproduce this bug.

This script does the following:
1. Creates a router named ROUTER-1
2. Creates 10 networks named NETWORK-X
   2.1 Creates 100 subnets for each NETWORK-X and attaches each subnet to ROUTER-1 using neutron.add_interface_router()
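
The attached script is not reproduced here. The steps above could be sketched roughly as follows, assuming `neutron` is a python-neutronclient `Client` (or any object with the same four methods). The function name, the per-network timing and the `10.N.S.0/24` addressing plan are assumptions made for illustration, not details of the real script.

```python
import time


def run_stress_test(neutron, networks=10, subnets_per_network=100):
    """Create the subnets and return the seconds spent on each network.

    `neutron` is expected to behave like neutronclient.v2_0.client.Client.
    """
    router = neutron.create_router({"router": {"name": "ROUTER-1"}})
    router_id = router["router"]["id"]

    batch_secs = []
    for n in range(networks):
        net = neutron.create_network({"network": {"name": f"NETWORK-{n}"}})
        net_id = net["network"]["id"]

        start = time.monotonic()
        for s in range(subnets_per_network):
            subnet = neutron.create_subnet({"subnet": {
                "network_id": net_id,
                "ip_version": 4,
                "cidr": f"10.{n}.{s}.0/24",  # hypothetical addressing plan
            }})
            # Attach the new subnet to the single shared router.
            neutron.add_interface_router(
                router_id, {"subnet_id": subnet["subnet"]["id"]})
        batch_secs.append(time.monotonic() - start)
        print(f"time spent to create {subnets_per_network} subnets: "
              f"{batch_secs[-1]:.0f} secs")
    return batch_secs
```

Passing a small stub object instead of a real client is enough to check the call sequence without an OpenStack deployment.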

Changed in dragonflow:
importance: Undecided → High
Yuli (stremovsky)
Changed in dragonflow:
assignee: nobody → Yuli (stremovsky)
Yuli (stremovsky) wrote :

One cause of the degradation is the arrays that we store inside objects in the etcd database.

For example, the lrouter object in the database contains an array of ports.

If we look at the add_lrouter_port() function, it does the following:
1. fetches the lrouter object from NoSQL
2. decodes the JSON
3. adds the new port to the list of ports
4. serializes the object back to a JSON string
5. stores the new lrouter object back to NoSQL

When we have a lot of ports, as I do in the stress test (I have hundreds of ports on the router),
this becomes an issue. On each new port, the code has to cope with more data:
the JSON strings are bigger, more memory is required to decode them,
and then we encode them back to a string.
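
To illustrate why this gets slower, here is a minimal sketch of the five steps using a plain dict in place of the database. It is not the Dragonflow code: the `store` dict, the port fields and the `bytes_per_batch()` helper are made up for the example. It only counts how many characters are decoded and encoded per batch of 100 ports.

```python
import json

store = {}  # in-memory stand-in for the NoSQL (etcd) database


def add_lrouter_port(router_id, port):
    """Read-modify-write of the whole lrouter object, as in the five steps."""
    key = f"/lrouter/{router_id}"
    raw = store.get(key, '{"ports": []}')   # 1. fetch the lrouter object
    lrouter = json.loads(raw)               # 2. decode the JSON
    lrouter["ports"].append(port)           # 3. add the new port
    new_raw = json.dumps(lrouter)           # 4. encode back to JSON
    store[key] = new_raw                    # 5. store the new object
    return len(raw) + len(new_raw)          # characters decoded plus encoded


def bytes_per_batch(router_id, batches=4, batch_size=100):
    """Add ports in batches and return the data handled in each batch."""
    totals = []
    for b in range(batches):
        total = 0
        for i in range(batch_size):
            # Hypothetical port fields, fixed width so every port is the same size.
            port = {"id": f"port-{b * batch_size + i:04d}", "network": "10.0.0.1/24"}
            total += add_lrouter_port(router_id, port)
        totals.append(total)
    return totals


for batch, handled in enumerate(bytes_per_batch("router-1"), start=1):
    print(f"batch {batch}: {handled} characters decoded and encoded")
```

Every port has the same size, yet each batch handles more data than the previous one, because the whole port list is decoded and re-encoded on every call.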

I suggest that we store the ports not in a JSON array but as a NoSQL array.
Currently we have /lrouter/<router-id> with JSON data.

We might have /lrouter/<router-id> and /lrouter/<router-id>-ports

The new version of add_lrouter_port() might be very simple.
Instead of reading the lrouter object, decoding it, etc.,
we will just push the new port to /lrouter/<router-id>-ports
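
A rough sketch of this proposal, again with in-memory structures instead of a real database. It assumes the backend offers some way to append an entry under a key; the `documents` and `arrays` containers and the helper functions other than add_lrouter_port() are hypothetical names used only for this example.

```python
import json
from collections import defaultdict

# In-memory stand-ins for the database: one JSON document per router and a
# separate append-only list per router for its ports (assumed to be available).
documents = {}
arrays = defaultdict(list)


def add_lrouter(router_id, name):
    """Store the router document without any port list inside it."""
    documents[f"/lrouter/{router_id}"] = json.dumps({"name": name})


def add_lrouter_port(router_id, port):
    """Push one encoded port; the lrouter document is not read or rewritten."""
    encoded = json.dumps(port)
    arrays[f"/lrouter/{router_id}-ports"].append(encoded)
    return len(encoded)


def get_lrouter_ports(router_id):
    """Decode all ports of a router when the full list is needed."""
    return [json.loads(p) for p in arrays[f"/lrouter/{router_id}-ports"]]


add_lrouter("router-1", "ROUTER-1")
sizes = [add_lrouter_port("router-1", {"id": f"port-{i:04d}"}) for i in range(400)]
print(f"first push: {sizes[0]} characters, last push: {sizes[-1]} characters")
```

With this layout the work per added port depends only on the size of that one port, not on how many ports the router already has.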

When running the original test without add_interface_router(), there is now no
degradation; in addition, it is much faster.

I created 400 ports in total
time spent to create 100 subnets: 74 seconds
time spent to create +100 subnets: 68 seconds
time spent to create +100 subnets: 65 seconds
time spent to create +100 subnets: 68 seconds

Yuli (stremovsky) wrote :

In the last example I was talking about add_subnet() and not add_interface_router(),
from the dragonflow/db/api_nb.py file.

Yuli (stremovsky) wrote :

The same problem exists with the add_lrouter_port() function in the same file.

Yuli (stremovsky) wrote :

We have the same problem with a default neutron installation.

I reported this bug to the neutron project.

https://bugs.launchpad.net/neutron/+bug/1558101

Yuli

Yuli (stremovsky) wrote :

The Neutron dev community picked up this bug and fixed it.

I tested the patch, and the results are much better now.
There is still a very small degradation.

Here are the results of the test with Dragonflow:
time spent to create 100 subnets: 167 secs
time spent to create 100 subnets: 173 secs
time spent to create 100 subnets: 181 secs
time spent to create 100 subnets: 189 secs
time spent to create 100 subnets: 197 secs

Best regards,
Yuli

Yuli (stremovsky) wrote :

This bug will be closed when the neutron patch is merged:

https://bugs.launchpad.net/neutron/+bug/1558101

Best regards,
Yuli

Changed in dragonflow:
status: New → In Progress
Yuli (stremovsky) wrote :

The code works much better now.

Closing this bug.

Changed in dragonflow:
status: In Progress → Fix Released