neutron-server initiation is time-consuming with a large vxlan/gre range

Bug #1324875 reported by Xurong Yang
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
neutron   Fix Released   Low          Unassigned

Bug Description

When the VXLAN range is configured as [1, 16M], the neutron-server service takes a long time to initialize, and CPU usage is very high (100%) during initialization. One test on PostgreSQL confirmed this: initialization took more than 1 hour with a VXLAN range of [1, 1M].

Tags: ml2
Xurong Yang (idopra)
Changed in neutron:
assignee: nobody → Xurong Yang (idopra)
tags: added: ml2
Changed in neutron:
status: New → Confirmed
importance: Undecided → High
yong sheng gong (gongysh) wrote :

How about this algorithm?
First we populate a small block of resources into the DB, and then have a background thread monitor the allocation status. If the allocation meets a criterion, the thread populates one more block of resources into the DB.

Take the VXLAN range 1..100000: we divide it into 100 blocks of 1000. The first block of 1000 is populated into the DB, and then another 1000 when needed.
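
The block-wise idea above could be sketched roughly as follows. This is only an illustration, not neutron code: the class name `BlockPopulator`, the in-memory `free` queue standing in for the allocation table, and the `low_water` refill criterion are all assumptions made for the example.

```python
import threading
from collections import deque


class BlockPopulator:
    """Hypothetical pool that materializes a large ID range one block at a time."""

    def __init__(self, first_id, last_id, block_size=1000, low_water=100):
        self.last_id = last_id
        self.block_size = block_size
        self.low_water = low_water      # assumed criterion for adding a block
        self.next_id = first_id         # first ID not yet populated
        self.free = deque()             # stands in for the DB allocation table
        self.lock = threading.Lock()
        self._populate_block()          # only the first block at startup

    def _populate_block(self):
        """Add one more block of IDs; returns False once the range is used up."""
        if self.next_id > self.last_id:
            return False
        stop = min(self.next_id + self.block_size - 1, self.last_id)
        self.free.extend(range(self.next_id, stop + 1))
        self.next_id = stop + 1
        return True

    def refill_if_needed(self):
        """What the background thread would call periodically."""
        with self.lock:
            if len(self.free) < self.low_water:
                return self._populate_block()
            return False

    def allocate(self):
        """Hand out the next free ID, populating a block inline if none is left."""
        with self.lock:
            if not self.free and not self._populate_block():
                raise RuntimeError("ID range exhausted")
            return self.free.popleft()
```

With the range 1..100000 and the defaults shown, startup touches only 1000 IDs; a monitoring thread calling `refill_if_needed()` would add the next 1000 once fewer than 100 remain free.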

Xurong Yang (idopra) wrote : Re: [Bug 1324875] Re: neutron-server initiation is time-consuming with a large vxlan/gre range

Hi, yong sheng gong,
   Thanks for your idea. Your algorithm sounds great, and I'm in favor of dividing the resources into blocks, but I still have other concerns:
   We can't predict which segmentation_id will be used immediately and which later, so I think it would be wiser to allocate the required segmentation_id on demand and have a background thread allocate the remaining IDs in the same block, rather than allocating during the initialization phase.
   We also need to consider how to handle the case where multiple neutron-server services coexist in the system.
   What do you think?
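
A rough sketch of this on-demand variant, under stated assumptions: `block_bounds` and `OnDemandPool` are hypothetical names, plain sets stand in for the allocation table, and `backfill` is called directly here where a background thread would call it. It does not address coordination between multiple neutron-server processes.

```python
def block_bounds(seg_id, range_start, range_end, block_size):
    """Return (first, last) of the block that contains seg_id."""
    if not range_start <= seg_id <= range_end:
        raise ValueError("segmentation_id outside configured range")
    index = (seg_id - range_start) // block_size
    first = range_start + index * block_size
    last = min(first + block_size - 1, range_end)
    return first, last


class OnDemandPool:
    """Hypothetical pool: allocate the requested ID first, fill its block later."""

    def __init__(self, range_start, range_end, block_size=1000):
        self.range_start = range_start
        self.range_end = range_end
        self.block_size = block_size
        self.allocated = set()   # IDs handed out
        self.free = set()        # IDs populated but unused
        self.populated = set()   # first ID of every block already filled in

    def allocate(self, seg_id):
        """Allocate exactly the requested ID without populating anything else."""
        block_bounds(seg_id, self.range_start, self.range_end, self.block_size)
        if seg_id in self.allocated:
            raise ValueError("segmentation_id already allocated")
        self.free.discard(seg_id)
        self.allocated.add(seg_id)

    def backfill(self, seg_id):
        """Populate the rest of seg_id's block; returns how many IDs were added."""
        first, last = block_bounds(
            seg_id, self.range_start, self.range_end, self.block_size)
        if first in self.populated:
            return 0
        new_ids = set(range(first, last + 1)) - self.allocated - self.free
        self.free |= new_ids
        self.populated.add(first)
        return len(new_ids)
```

For example, with the range 1..100000 and blocks of 1000, a request for ID 2500 falls in block 2001..3000, so only that block is filled in afterwards.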

-----Original Message-----
From: <email address hidden> [mailto:<email address hidden>] On Behalf Of yong sheng gong
Sent: June 3, 2014 10:28
To: Yangxurong
Subject: [Bug 1324875] Re: neutron-server initiation is time-consuming with a large vxlan/gre range

How about this algorithm?
First we populate a small block of resources into the DB, and then have a background thread monitor the allocation status. If the allocation meets a criterion, the thread populates one more block of resources into the DB.

Take the VXLAN range 1..100000: we divide it into 100 blocks of 1000. The first block of 1000 is populated into the DB, and then another 1000 when needed.

Eugene Nikanorov (enikanorov) wrote :

I'm not sure it's a good option: it increases complexity and can lead to lags in some responses.
Why not just optimize DB performance?

yalei wang (yalei-wang) wrote :

I just ran a unit test with tox, and it takes about 70 s on my PC with the range [1, 1024*1024]. So is most of the time spent on DB update operations?

Xurong Yang (idopra) wrote :

Yes, DB operations are time-consuming. We have also run a series of tests and noticed that SQLAlchemy ultimately accounted for most of the time, so I personally think the room for optimization on the SQLAlchemy side is limited.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/97774

Changed in neutron:
assignee: Xurong Yang (idopra) → Eugene Nikanorov (enikanorov)
status: Confirmed → In Progress
Kyle Mestery (mestery)
Changed in neutron:
milestone: none → juno-2
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/97774
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=6e2fce563ab250d4bfe000dd2a2dbc00f094141b
Submitter: Jenkins
Branch: master

commit 6e2fce563ab250d4bfe000dd2a2dbc00f094141b
Author: Eugene Nikanorov <email address hidden>
Date: Wed Jun 4 14:40:13 2014 +0400

    Improve vxlan type driver initialization performance

    Vxlan type driver may take long time to initialize
    vxlan allocation table. Optimize db performance by issuing
    raw sql inserts coalesced into bulk statements.
    Also optimize deleting logic.

    Proposed patch gives ~2x performance gain in comparison with
    original code on MySQL and PostgreSQL backends

    Change-Id: I801d967e8e3c0260593f289097d17270ef0b391e
    Partial-Bug: #1324875
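
To illustrate what the commit message describes, here is a minimal sketch of coalescing inserts into multi-row statements and deleting only the rows that fall outside the new range. It is not the merged patch: it uses the standard-library `sqlite3` module instead of neutron's database layer, and the table `vxlan_allocations`, its columns, and the function names are assumptions made for the example.

```python
import sqlite3


def create_table(conn):
    """Create the hypothetical allocation table used in this sketch."""
    conn.execute(
        "CREATE TABLE vxlan_allocations ("
        "vni INTEGER PRIMARY KEY, allocated INTEGER NOT NULL DEFAULT 0)")


def sync_allocations(conn, wanted_ids, chunk_size=500):
    """Make the table match wanted_ids; return (inserted, deleted) row counts."""
    wanted = set(wanted_ids)
    existing = {row[0] for row in conn.execute(
        "SELECT vni FROM vxlan_allocations")}
    to_add = sorted(wanted - existing)        # only IDs that are missing
    to_remove = sorted(existing - wanted)     # only IDs no longer configured
    deleted = 0

    with conn:  # a single transaction for the whole sync
        for i in range(0, len(to_remove), chunk_size):
            chunk = to_remove[i:i + chunk_size]
            marks = ",".join("?" for _ in chunk)
            # Allocated rows are left alone; one DELETE covers a whole chunk.
            cur = conn.execute(
                "DELETE FROM vxlan_allocations "
                f"WHERE allocated = 0 AND vni IN ({marks})", chunk)
            deleted += cur.rowcount
        for i in range(0, len(to_add), chunk_size):
            chunk = to_add[i:i + chunk_size]
            rows = ",".join("(?, 0)" for _ in chunk)
            # One multi-row INSERT per chunk instead of one statement per ID.
            conn.execute(
                f"INSERT INTO vxlan_allocations (vni, allocated) VALUES {rows}",
                chunk)
    return len(to_add), deleted
```

The chunk size of 500 is an arbitrary choice that stays below the bound-parameter limit of older SQLite builds; other backends have their own statement-size limits.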

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/101982

Changed in neutron:
assignee: Eugene Nikanorov (enikanorov) → Cedric Brandily (cbrandily)
Kyle Mestery (mestery)
Changed in neutron:
milestone: juno-2 → juno-3
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Cedric Brandily (<email address hidden>) on branch: master
Review: https://review.openstack.org/101982
Reason: Abandoned for the moment

Eugene Nikanorov (enikanorov) wrote :

Lowering the importance. The initial issue with slow initialization was partially fixed, with a decent speed improvement.

Eugene Nikanorov (enikanorov) wrote :

Also, ideally initialization is done just once for the deployment, and not by each neutron-server.

Changed in neutron:
importance: High → Low
Thierry Carrez (ttx)
Changed in neutron:
milestone: juno-3 → juno-rc1
Changed in neutron:
status: In Progress → Fix Committed
assignee: Cedric Brandily (cbrandily) → nobody
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: juno-rc1 → 2014.2