BGPaaS heat stack bringup failing when resources are more than 1000

Bug #1682221 reported by Avinash
Affects            Status         Importance  Assigned to       Milestone
Juniper Openstack  Status tracked in Trunk
  R3.1             Fix Committed  Undecided   Praneet Bachheti
  R3.1.1.x         Won't Fix      Undecided   Praneet Bachheti
  R3.2             Fix Committed  Undecided   Praneet Bachheti
  Trunk            Fix Committed  Undecided   Praneet Bachheti

Bug Description

ATT lab : RDM5b
Contrail version : 3.0.3.2-12

ATT is doing BGPaaS scale testing. As part of that testing, ATT cannot spin up a heat stack with around 4000 resources. When the number of VMIs is reduced to fewer than 1000, the stack spins up; beyond that, heat stack creation fails with a connection error.

I have attached a file, local-opsimple, which contains the complete output of the heat CLI.

root@zrdm5bmosc02:/var/tmp# heat stack-show gateway
………………………………
| stack_status | CREATE_FAILED
| stack_status_reason | Resource CREATE failed: ConnectionError:
…………………………….

The following traceback from the heat-engine log points to the VNC API:
heat-engine.log:
============
2017-04-06 20:12:30.942 28071 INFO heat.engine.resource [-] creating ContrailVirtualMachineInterface "RVMI_SF11_GN_10_SUBIF_38" Stack "gateway" [064a5eb6-53bd-4f53-99aa-e42eff10cbe8]
2017-04-06 20:12:31.185 28071 DEBUG heat.engine.scheduler [-] Task resource_action starting start /usr/lib/python2.7/dist-packages/heat/engine/scheduler.py:206
2017-04-06 20:12:31.185 28071 DEBUG heat.engine.scheduler [-] Task resource_action running step /usr/lib/python2.7/dist-packages/heat/engine/scheduler.py:234
2017-04-06 20:12:31.190 28071 INFO heat.engine.resource [-] creating ContrailVirtualMachineInterface "RVMI_SF4_SGI_11_SUBIF_43" Stack "gateway" [064a5eb6-53bd-4f53-99aa-e42eff10cbe8]
2017-04-06 20:12:31.253 28071 INFO heat.engine.resource [-] CREATE: ContrailVirtualMachineInterface "RVMI_SF4_SGI_11_SUBIF_43" Stack "gateway" [064a5eb6-53bd-4f53-99aa-e42eff10cbe8]
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource Traceback (most recent call last):
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/heat/engine/resource.py", line 502, in _action_recorder
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/heat/engine/resource.py", line 572, in _do_action
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/heat/engine/scheduler.py", line 310, in wrapper
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/heat/engine/resource.py", line 543, in action_handler_task
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/contrail_heat/resources/contrail.py", line 19, in wrapper
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/contrail_heat/resources/contrail.py", line 124, in vnc_lib
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/vnc_api/vnc_api.py", line 342, in __init__
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource File "/usr/lib/python2.7/dist-packages/vnc_api/vnc_api.py", line 755, in _request
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource ConnectionError
2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource
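
With thousands of resources in one stack, it can help to pull the failing resource names out of heat-engine.log rather than reading it by eye. The following is a minimal sketch, assuming the log line layout shown above and assuming that a `CREATE:` line followed by `TRACE` lines marks a failed resource. The function name `failed_resources` and the `SAMPLE` list are illustrative and not part of heat or contrail-heat.

```python
import re

# A failure header as it appears in the excerpt above, e.g.
#   ... INFO heat.engine.resource [-] CREATE: <Type> "<name>" Stack "<stack>" [...]
FAILED_RE = re.compile(
    r'INFO heat\.engine\.resource \[-\] CREATE: (?P<type>\S+) "(?P<name>[^"]+)"'
)
# A traceback line that carries text after the logger name.
TRACE_RE = re.compile(r'TRACE heat\.engine\.resource (?P<text>\S.*)$')


def failed_resources(lines):
    """Return (resource_type, resource_name, last_trace_text) per failure.

    Assumption: a "CREATE:" line is only counted as a failure if at least
    one non-empty TRACE line follows it.
    """
    results = []
    current = None
    for line in lines:
        header = FAILED_RE.search(line)
        if header:
            if current and current[2] is not None:
                results.append(tuple(current))
            current = [header.group("type"), header.group("name"), None]
            continue
        trace = TRACE_RE.search(line)
        if trace and current is not None:
            # Keep overwriting so the final value is the exception line.
            current[2] = trace.group("text").strip()
    if current and current[2] is not None:
        results.append(tuple(current))
    return results


# Lines abbreviated from the excerpt above (hypothetical test input).
SAMPLE = [
    '2017-04-06 20:12:31.190 28071 INFO heat.engine.resource [-] creating '
    'ContrailVirtualMachineInterface "RVMI_SF4_SGI_11_SUBIF_43" Stack "gateway" [064a5eb6]',
    '2017-04-06 20:12:31.253 28071 INFO heat.engine.resource [-] CREATE: '
    'ContrailVirtualMachineInterface "RVMI_SF4_SGI_11_SUBIF_43" Stack "gateway" [064a5eb6]',
    '2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource Traceback (most recent call last):',
    '2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource ConnectionError',
    '2017-04-06 20:12:31.253 28071 TRACE heat.engine.resource',
]

for rtype, name, error in failed_resources(SAMPLE):
    print("%s %s failed with %s" % (rtype, name, error))
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the function only iterates over lines.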

I checked contrail-api-0-stdout.log for exceptions around the time the heat stack creation failed and could not find any. However, after the stack creation had failed and ATT was trying to delete the failed stack, I saw the following exception in contrail-api-0-stdout.log:

127.0.0.1 - - [2017-04-06 21:12:27] "GET /virtual-machine-interface/357d7a96-52f6-4df8-ad0e-d591bf185775?fields=logical_router_back_refs%2Cinstance_ip_back_refs%2Cfloating_ip_back_refs HTTP/1.1" 404 171 0.003427
<pre>Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/vnc_openstack/neutron_plugin_interface.py", line 372, in plugin_get_port
    port_info = cfgdb.port_read(port['id'])
  File "/usr/lib/python2.7/dist-packages/vnc_openstack/neutron_plugin_db.py", line 2356, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/vnc_openstack/neutron_plugin_db.py", line 3654, in port_read
    self._raise_contrail_exception('PortNotFound', port_id=port_id)
  File "/usr/lib/python2.7/dist-packages/vnc_openstack/neutron_plugin_db.py", line 214, in _raise_contrail_exception
    bottle.abort(400, json.dumps(exc_info))
  File "/usr/lib/python2.7/dist-packages/bottle.py", line 2310, in abort
    raise HTTPError(code, text)
HTTPError

I have copied the logs to the following location.
Log in to any pool server (for example, svl-jtac-tool02.juniper.net) with your credentials, and then run:
cd /volume/CSdata/avink/2017-0405-0707/tmp/jtac

[avink@svl-jtac-tool02 jtac]$ ls
cntr1/ cntr2/ cntr3/ mos1/ mos2/ mos3/
[avink@svl-jtac-tool02 jtac]$ pwd
/volume/CSdata/avink/2017-0405-0707/tmp/jtac
[avink@svl-jtac-tool02 jtac]$

Changed in juniperopenstack:
assignee: nobody → Praneet Bachheti (praneetb)
Avinash (avink)
Changed in juniperopenstack:
milestone: none → r3.0.3.2
Jim Reilly (jpreilly)
information type: Proprietary → Private
amit surana (asurana-t)
information type: Private → Public
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/30708
Submitter: Praneet Bachheti (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2

Review in progress for https://review.opencontrail.org/30709
Submitter: Praneet Bachheti (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.1

Review in progress for https://review.opencontrail.org/30710
Submitter: Praneet Bachheti (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/30710
Committed: http://github.com/Juniper/contrail-heat/commit/f7043226e036ff679fadcfed39d9d3505717356b
Submitter: Zuul (<email address hidden>)
Branch: R3.1

commit f7043226e036ff679fadcfed39d9d3505717356b
Author: Praneet Bachheti <email address hidden>
Date: Mon Apr 24 10:58:53 2017 -0700

_vnc_lib handle be part of Class instead of resource object
it prevents opening large number of sockets.

Change-Id: I09f201d587622727fd29f2e8d04e5bf0889897e7
Closes-Bug: #1682221
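
The commit message describes moving the `_vnc_lib` handle from each resource object to the class so that fewer sockets are opened. The sketch below illustrates that general pattern only; it is not the code from the commit. `FakeClient`, `PerInstanceResource`, `SharedHandleResource`, and `count_opened` are hypothetical names, and `FakeClient` stands in for a real API client that would open a connection in its constructor.

```python
class FakeClient(object):
    """Stand-in for an API client; counts how many times one is constructed."""

    opened = 0

    def __init__(self):
        # A real client would open a socket here.
        FakeClient.opened += 1


class PerInstanceResource(object):
    """Each resource object lazily creates its own client."""

    def __init__(self):
        self._client = None

    def client(self):
        if self._client is None:
            self._client = FakeClient()
        return self._client


class SharedHandleResource(object):
    """All resource objects share one client stored on the class."""

    _client = None  # class attribute, shared by every instance

    def client(self):
        if SharedHandleResource._client is None:
            SharedHandleResource._client = FakeClient()
        return SharedHandleResource._client


def count_opened(resource_cls, n):
    """Create n resources, touch each client once, return clients opened."""
    FakeClient.opened = 0
    for res in [resource_cls() for _ in range(n)]:
        res.client()
    return FakeClient.opened


per_instance = count_opened(PerInstanceResource, 4000)
shared = count_opened(SharedHandleResource, 4000)
print("per-instance clients opened: %d" % per_instance)
print("shared-handle clients opened: %d" % shared)
```

With a per-instance handle, the number of clients grows with the number of resources in the stack; with a class-level handle it stays at one. A shared handle does mean every resource uses the same connection and credentials, so this sketch assumes that is acceptable.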

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/30709
Committed: http://github.com/Juniper/contrail-heat/commit/b3ab395dbb4acacaa04335c141b4b669c1acb4c7
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit b3ab395dbb4acacaa04335c141b4b669c1acb4c7
Author: Praneet Bachheti <email address hidden>
Date: Mon Apr 24 10:58:43 2017 -0700

_vnc_lib handle be part of Class instead of resource object
it prevents opening large number of sockets.

Change-Id: I88068a374d32f232441dff335f47c633d6ff0c26
Closes-Bug: #1682221

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/30708
Committed: http://github.com/Juniper/contrail-heat/commit/5f949d6377b149490e08940cf0fa25c1853a6c68
Submitter: Zuul (<email address hidden>)
Branch: master

commit 5f949d6377b149490e08940cf0fa25c1853a6c68
Author: Praneet Bachheti <email address hidden>
Date: Mon Apr 24 10:55:48 2017 -0700

_vnc_lib handle be part of Class instead of resource object
it prevents opening large number of sockets.

Change-Id: I688bc22ece84d242ee191c16ab02d669c695fe1e
Closes-Bug: #1682221
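
The commit messages say the change prevents opening a large number of sockets, but this report does not state which limit was actually hit. One thing that may be worth checking on a similar setup, purely as an assumption and not a finding from this bug, is the per-process open file descriptor limit, since every socket consumes a descriptor. The sketch below reads the limit and current usage for the process that runs it; `fd_headroom` is a hypothetical helper, and the `/proc/self/fd` lookup is Linux-specific.

```python
import os
import resource


def fd_headroom():
    """Return (soft_limit, hard_limit, descriptors_in_use) for this process.

    descriptors_in_use is None where /proc/self/fd is unavailable.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Each entry in /proc/self/fd is one open descriptor (Linux only).
        in_use = len(os.listdir("/proc/self/fd"))
    except OSError:
        in_use = None
    return soft, hard, in_use


soft, hard, in_use = fd_headroom()
print("soft limit: %s, hard limit: %s, in use: %s" % (soft, hard, in_use))
```

This reports on the Python process that executes it. For a running service such as heat-engine, the equivalent information is in `/proc/<pid>/limits` and `/proc/<pid>/fd` for that service's PID.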
