In a scaled environment, control-node is very slow in deleting virtual-machine-interfaces

Bug #1432735 reported by Praveen
This bug affects 1 person
Affects            Status         Importance  Assigned to  Milestone
Juniper Openstack  (Status tracked in Trunk)
  R2.1             Won't Fix      High        Tapan Karwa
  R2.20            Fix Committed  High        Tapan Karwa
  Trunk            Fix Committed  High        Tapan Karwa

Bug Description

Tapan,

Here are my observations from today:

1. I have around 10K interfaces
2. Deleted about 1K interfaces
3. Interface delete from API server was completed at around 10:40 AM
4. The trace at http://nodec36.englab.juniper.net:8083/Snh_SandeshTraceRequest?x=IFMapBigMsgTraceBuf shows its last log message at 10:40:58 AM
5. Control-node has had high CPU utilization (~100%) for more than 5 minutes now
6. The agent still does not have all the interface deletes, even after 5 minutes.

Regards,
Praveen

tags: added: bms scale
Revision history for this message
Vedamurthy Joshi (vedujoshi) wrote :

On my 140+ tor-scale setup with 40K VMIs, we have seen that it sometimes takes more than an hour for CPU utilization to come down and for the config to sync with the tor-agents.

information type: Proprietary → Public
OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/8939
Submitter: Tapan Karwa (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/8939
Committed: http://github.org/Juniper/contrail-controller/commit/8518e3e4b8f8abfda3c06daafd3078788d828b6c
Submitter: Zuul
Branch: master

commit 8518e3e4b8f8abfda3c06daafd3078788d828b6c
Author: Tapan Karwa <email address hidden>
Date: Mon Apr 6 16:11:47 2015 -0700

Graphwalker optimization

Currently, for each link-delete, the walker does one graph-walk. If the
server-table code gets enough time to process a batch of, say, n link-deletes
before the walker gets a chance to run, the walker still performs n
graph-walks. The last (n - 1) graph-walks are unnecessary, since the first walk
already operates on a graph with all n links removed. We fix that by absorbing
the last (n - 1) triggers into the first trigger, i.e. we run the graph-walk
only once instead of n times. Replace the work-queue with a bitmap of
client-ids.

Partial-Bug: 1432735

Change-Id: Iab79a58285aa8a8ebce54e453a1c2215dbe60099
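The trigger-absorbing idea in this commit message can be illustrated with a small Python sketch. It is not the control-node's code, and the names `CoalescingWalker`, `link_deleted` and `run` are hypothetical. The only assumption is that each client has a small integer id, so one bit per client can stand in for a queue entry per link-delete.

```python
class CoalescingWalker:
    """Toy model: one pending bit per client instead of one queue entry per link-delete."""

    def __init__(self, walk_fn):
        self._walk_fn = walk_fn   # called once per client that needs a graph-walk
        self._pending = 0         # bitmap: bit i set => client i needs a graph-walk

    def link_deleted(self, client_id):
        # Setting an already-set bit is a no-op, so n triggers collapse into one.
        self._pending |= 1 << client_id

    def run(self):
        # Snapshot and clear first, so triggers that arrive later are kept for the next run.
        pending, self._pending = self._pending, 0
        client_id = 0
        while pending:
            if pending & 1:
                self._walk_fn(client_id)
            pending >>= 1
            client_id += 1


walks = []
walker = CoalescingWalker(walks.append)
for _ in range(1000):         # 1000 link-deletes for client 3 before the walker runs
    walker.link_deleted(3)
walker.link_deleted(5)
walker.run()
print(walks)                  # [3, 5]
```

With a work-queue, the same input would have produced 1001 walks; with the bitmap, each affected client is walked once per run.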

OpenContrail Admin (ci-admin-f) wrote : R2.1

Review in progress for https://review.opencontrail.org/8967
Submitter: Tapan Karwa (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/8967
Committed: http://github.org/Juniper/contrail-controller/commit/83edcfb7fa7c7f21a2ce5839e343f37f3d4e0bcc
Submitter: Zuul
Branch: R2.1

commit 83edcfb7fa7c7f21a2ce5839e343f37f3d4e0bcc
Author: Tapan Karwa <email address hidden>
Date: Mon Apr 6 16:11:47 2015 -0700

Graphwalker optimization

Currently, for each link-delete, the walker does one graph-walk. If the
server-table code gets enough time to process a batch of, say, n link-deletes
before the walker gets a chance to run, the walker still performs n
graph-walks. The last (n - 1) graph-walks are unnecessary, since the first walk
already operates on a graph with all n links removed. We fix that by absorbing
the last (n - 1) triggers into the first trigger, i.e. we run the graph-walk
only once instead of n times. Replace the work-queue with a bitmap of
client-ids.

Partial-Bug: 1432735

Change-Id: Iab79a58285aa8a8ebce54e453a1c2215dbe60099

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/9259
Submitter: Sachin Bansal (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/9259
Committed: http://github.org/Juniper/contrail-controller/commit/d2ac8e945a271bd20fb971c5e7e75542268054f1
Submitter: Zuul
Branch: master

commit d2ac8e945a271bd20fb971c5e7e75542268054f1
Author: Sachin Bansal <email address hidden>
Date: Fri Apr 17 17:45:02 2015 -0700

Provide a way to bulk delete multiple links for an identifier

When an object is deleted, we currently walk all its references and delete each
of them one by one. Instead, we can delete all of them in one message to ifmap.
This code change provides a method to do it. In a subsequent change, we will add
a call to this method in auto-generated code.

Change-Id: I0dbf5c383cfef34388f97318536e873ea13a388d
Partial-Bug: 1432735
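As a rough illustration of the difference between per-reference deletes and a single bulk message, here is a minimal Python sketch. `FakeIfmapClient`, `delete_refs_one_by_one`, `delete_refs_bulk` and the identifier strings are all hypothetical stand-ins, not the API added by this change.

```python
class FakeIfmapClient:
    """Stand-in for an ifmap connection; it only records what was sent."""

    def __init__(self):
        self.messages = []

    def send(self, operations):
        # One call models one message on the wire.
        self.messages.append(list(operations))


def delete_refs_one_by_one(client, obj_id, refs):
    # Old behaviour as described above: one message per reference.
    for ref in refs:
        client.send([("delete-link", obj_id, ref)])


def delete_refs_bulk(client, obj_id, refs):
    # New behaviour: every link delete for the identifier goes out together.
    ops = [("delete-link", obj_id, ref) for ref in refs]
    if ops:
        client.send(ops)


refs = ["vn:blue", "sg:default", "vm:vm1"]
slow, fast = FakeIfmapClient(), FakeIfmapClient()
delete_refs_one_by_one(slow, "vmi:port1", refs)
delete_refs_bulk(fast, "vmi:port1", refs)
print(len(slow.messages), len(fast.messages))   # 3 1
```

The number of link deletes is the same in both cases; only the number of messages changes.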

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/9401
Submitter: Sachin Bansal (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/9401
Committed: http://github.org/Juniper/contrail-generateDS/commit/eb8f050491d6197778170911d6bba538b2af9311
Submitter: Zuul
Branch: master

commit eb8f050491d6197778170911d6bba538b2af9311
Author: Sachin Bansal <email address hidden>
Date: Wed Apr 22 09:59:44 2015 -0700

When an object is deleted, send delete for all its refs in one message to ifmap

Also do the same when an object is updated for all refs that are being deleted
Partial-Bug: 1432735

Change-Id: I08db1e61768517b0b866e1a1106ee75d7fe7c635
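For the update case, the refs to delete are the ones present before the update and absent after it. The following Python sketch shows one way to compute that set and pack it into a single message; `refs_to_delete`, `build_update_message` and the identifier strings are hypothetical and do not come from the generated code.

```python
def refs_to_delete(old_refs, new_refs):
    """Return refs present before the update but absent after it, in original order."""
    keep = set(new_refs)
    return [ref for ref in old_refs if ref not in keep]


def build_update_message(obj_id, old_refs, new_refs):
    # All removed refs go into one list, which models one message to ifmap.
    removed = refs_to_delete(old_refs, new_refs)
    return [("delete-link", obj_id, ref) for ref in removed]


old = ["vn:blue", "sg:default", "sg:web"]
new = ["vn:blue", "sg:db"]
message = build_update_message("vmi:port1", old, new)
print(message)
```

Here `sg:default` and `sg:web` are deleted in one message, while `vn:blue` is left alone because it is still referenced after the update.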

OpenContrail Admin (ci-admin-f) wrote : R2.20

Review in progress for https://review.opencontrail.org/9474
Submitter: Sachin Bansal (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/9474
Committed: http://github.org/Juniper/contrail-controller/commit/14111e1908e765bcb2a70a5c77f1cfbf8bd28fd1
Submitter: Zuul
Branch: R2.20

commit 14111e1908e765bcb2a70a5c77f1cfbf8bd28fd1
Author: Sachin Bansal <email address hidden>
Date: Fri Apr 17 17:45:02 2015 -0700

Provide a way to bulk delete multiple links for an identifier

When an object is deleted, we currently walk all its references and delete each
of them one by one. Instead, we can delete all of them in one message to ifmap.
This code change provides a method to do it. In a subsequent change, we will add
a call to this method in auto-generated code.

Change-Id: I0dbf5c383cfef34388f97318536e873ea13a388d
Partial-Bug: 1432735
(cherry picked from commit d2ac8e945a271bd20fb971c5e7e75542268054f1)

OpenContrail Admin (ci-admin-f) wrote : R2.20

Review in progress for https://review.opencontrail.org/9504
Submitter: Sachin Bansal (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/9504
Committed: http://github.org/Juniper/contrail-generateDS/commit/7ad98ff4dfcfbf922dd6b569b295d554a7f02f49
Submitter: Zuul
Branch: R2.20

commit 7ad98ff4dfcfbf922dd6b569b295d554a7f02f49
Author: Sachin Bansal <email address hidden>
Date: Wed Apr 22 09:59:44 2015 -0700

When an object is deleted, send delete for all its refs in one message to ifmap

Also do the same when an object is updated for all refs that are being deleted
Partial-Bug: 1432735

Change-Id: I08db1e61768517b0b866e1a1106ee75d7fe7c635
(cherry picked from commit eb8f050491d6197778170911d6bba538b2af9311)

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/10866
Submitter: Sachin Bansal (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : R2.20

Review in progress for https://review.opencontrail.org/11010
Submitter: Sachin Bansal (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/10866
Committed: http://github.org/Juniper/contrail-controller/commit/19dc001f35b1ad2c3843efa566a45bf572cdf2da
Submitter: Zuul
Branch: master

commit 19dc001f35b1ad2c3843efa566a45bf572cdf2da
Author: Sachin Bansal <email address hidden>
Date: Fri May 15 17:04:30 2015 -0700

New greenlet to send messages to ifmap

With this change, we no longer send ifmap messages in the same context that
dequeues them from rabbitmq. We now queue them in a separate internal queue,
and a new greenlet is launched to dequeue from there. This allows faster
dequeuing from rabbitmq, and the dequeue greenlet can also send messages in bulk.

Partial-Bug: 1432735

Change-Id: I57e7c0deee5f883bc959c2928c6f473b7dc042da
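The decoupling described in this commit message can be sketched as a producer that only enqueues and a separate worker that drains the queue in batches. To stay self-contained, this sketch uses a standard-library thread and `queue.Queue` in place of a gevent greenlet. The names `drain`, `sender` and `max_batch`, and the batch limit of 100, are assumptions for illustration only.

```python
import queue
import threading


def drain(q, max_batch):
    """Block for one item, then take whatever else is already queued (up to max_batch)."""
    batch = [q.get()]
    while len(batch) < max_batch:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch


def sender(q, sent, max_batch=100):
    while True:
        batch = drain(q, max_batch)
        if None in batch:                     # None is the shutdown sentinel
            batch = batch[:batch.index(None)]
            if batch:
                sent.append(batch)
            return
        sent.append(batch)                    # stands in for one bulk ifmap message


internal_q = queue.Queue()
sent = []
# The "rabbitmq consumer" side only enqueues, so a slow send never blocks it.
for i in range(250):
    internal_q.put(("delete", i))
internal_q.put(None)

worker = threading.Thread(target=sender, args=(internal_q, sent))
worker.start()
worker.join()
print([len(b) for b in sent])                 # [100, 100, 50]
```

Because the worker takes everything that has accumulated since its last send, batch sizes grow under load, and that is when bulk sending helps most.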

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/11010
Committed: http://github.org/Juniper/contrail-controller/commit/f1ce30a663c34677db67cd399d2d56f4bf91394c
Submitter: Zuul
Branch: R2.20

commit f1ce30a663c34677db67cd399d2d56f4bf91394c
Author: Sachin Bansal <email address hidden>
Date: Fri May 15 17:04:30 2015 -0700

New greenlet to send messages to ifmap

With this change, we no longer send ifmap messages in the same context that
dequeues them from rabbitmq. We now queue them in a separate internal queue,
and a new greenlet is launched to dequeue from there. This allows faster
dequeuing from rabbitmq, and the dequeue greenlet can also send messages in bulk.

Partial-Bug: 1432735

Change-Id: I57e7c0deee5f883bc959c2928c6f473b7dc042da
(cherry picked from commit 19dc001f35b1ad2c3843efa566a45bf572cdf2da)
