Heat stack stuck @ UPDATE_IN_PROGRESS

Bug #1518552 reported by Vijay Anand
This bug affects 1 person
Affects: Juniper Openstack
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Version: 2.21 build 102 + patches (1507404/1464059 & 1507501)

Problem:
======
Heat stack stuck @ UPDATE_IN_PROGRESS

Test:
===
1. Launched 30 instances in parallel.
2. Updated the stack twice (policy add, then policy delete).

Observation:
==========
One stack stayed stuck at UPDATE_IN_PROGRESS during the policy-delete update (the policy itself was deleted).
Waited for more than 30 minutes; the stack status did not change.

The problem is seen often.
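
The 30-minute manual wait above could be turned into a repeatable check. The following is a minimal sketch, not part of the original test: `get_status` is a hypothetical callable that returns the current stack status string (for example, by wrapping whatever client or CLI call the test harness already uses), and the 1800-second timeout and 30-second poll interval are assumptions chosen to mirror the wait described in the observation.

```python
import time


def wait_for_stack(get_status, timeout_s=1800, poll_s=30,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the stack leaves an *_IN_PROGRESS state or the timeout expires.

    get_status is a hypothetical zero-argument callable returning a status
    string such as "UPDATE_IN_PROGRESS" or "UPDATE_COMPLETE".
    Returns a tuple (last_status, timed_out).
    """
    deadline = clock() + timeout_s
    status = get_status()
    while status.endswith("_IN_PROGRESS"):
        if clock() >= deadline:
            # Still in progress after the full wait: treat as stuck.
            return status, True
        sleep(poll_s)
        status = get_status()
    return status, False
```

A result of `("UPDATE_IN_PROGRESS", True)` would correspond to the stuck state reported here. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.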

Logs:
====
root@csp-sol-mustang:~# heat stack-list
+--------------------------------------+------------------------------------------+--------------------+----------------------+
| id | stack_name | stack_status | creation_time |
+--------------------------------------+------------------------------------------+--------------------+----------------------+
| 80f1f842-84f3-4938-92c0-fa83e3546e65 | jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2 | UPDATE_IN_PROGRESS | 2015-11-21T10:39:31Z |
+--------------------------------------+------------------------------------------+--------------------+----------------------+
root@csp-sol-mustang:~# nova list
+--------------------------------------+------------------------------------------------------------------------------------------------+--------+------------+-------------+------------------------------------------------------------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+------------------------------------------------------------------------------------------------+--------+------------+-------------+------------------------------------------------------------------------------------+
| e826fa98-9fc0-492f-8049-3b66b1a090c4 | default-domain__admin__20074f5d-d42a-41b5-ad8c-4ab4e086b8e2-FireFly_Firewall-0-vnf-instance__1 | ACTIVE | - | Running | Management Network=10.213.14.25; Left-Site-12=80.0.12.4; Right Network=56.56.56.26 |
| 4f413ec4-a382-49fe-912f-980144c3f071 | vSPACE | ACTIVE | - | Running | Management Network=10.213.14.3 |
+--------------------------------------+------------------------------------------------------------------------------------------------+--------+------------+-------------+------------------------------------------------------------------------------------+
root@csp-sol-mustang:~#
root@csp-sol-mustang:~#
root@csp-sol-mustang:~#
root@csp-sol-mustang:~# cp /var/log/heat/heat.log heat.log
root@csp-sol-mustang:~# contrail-status
== Contrail vRouter ==
supervisor-vrouter: active
contrail-vrouter-agent active
contrail-vrouter-nodemgr active

== Contrail Control ==
supervisor-control: active
contrail-control active
contrail-control-nodemgr active
contrail-dns active
contrail-named active

== Contrail Analytics ==
supervisor-analytics: active
contrail-analytics-api active
contrail-analytics-nodemgr active
contrail-collector active
contrail-query-engine active
contrail-snmp-collector active
contrail-topology active

== Contrail Config ==
supervisor-config: active
contrail-api:0 active
contrail-config-nodemgr active
contrail-device-manager active
contrail-discovery:0 active
contrail-schema active
contrail-svc-monitor active
ifmap active

== Contrail Web UI ==
supervisor-webui: active
contrail-webui active
contrail-webui-middleware active

== Contrail Database ==
supervisor-database: active
contrail-database active
contrail-database-nodemgr active

== Contrail Support Services ==
supervisor-support-service: active
rabbitmq-server active

========Run time service failures=============
/var/crashes/core.virsh.15896.csp-sol-mustang.1446626116
/var/crashes/core.virsh.19376.csp-sol-mustang.1444976375
/var/crashes/core.virsh.24425.csp-sol-mustang.1446664235
/var/crashes/core.virsh.9922.csp-sol-mustang.1446632927
root@csp-sol-mustang:~# contrail-version
Package Version Build-ID | Repo | Package Name
-------------------------------------- ------------------------------ ----------------------------------
contrail-analytics 2.21-102 102
contrail-config 2.21-102 102
contrail-config-openstack 2.21-102 102
contrail-control 2.21-102 102
contrail-dns 2.21-102 102
contrail-f5 2.21-102 102
contrail-fabric-utils 2.21-102 102
contrail-heat 2.21-102 102
contrail-install-packages 2.21-102~icehouse 102
contrail-lib 2.21-102 102
contrail-nodemgr 2.21-102 102
contrail-nova-networkapi 2.21-102 102
contrail-nova-vif 2.21-102 102
contrail-openstack 2.21-102 102
contrail-openstack-analytics 2.21-102 102
contrail-openstack-config 2.21-102 102
contrail-openstack-control 2.21-102 102
contrail-openstack-dashboard 2.21-102 102
contrail-openstack-database 2.21-102 102
contrail-openstack-vrouter 2.21-102 102
contrail-openstack-webui 2.21-102 102
contrail-setup 2.21-102 102
contrail-utils 2.21-102 102
contrail-vrouter-3.13.0-40-generic 2.21-102 102
contrail-vrouter-agent 2.21-102 102
contrail-vrouter-common 2.21-102 102
contrail-vrouter-dkms 2.21-102 102
contrail-vrouter-init 2.21-102 102
contrail-vrouter-utils 2.21-102 102
contrail-web-controller 2.21-102 102
contrail-web-core 2.21-102 102
ifmap-python-client 0.1-2 102
ifmap-server 0.3.2-1contrail1 102
neutron-plugin-contrail 2.21-102 102
nova-api 1:2014.1.3-0ubuntu1~cloud0.3contrail102
nova-common 1:2014.1.3-0ubuntu1~cloud0.3contrail102
nova-compute 1:2014.1.3-0ubuntu1~cloud0.3contrail102
nova-compute-kvm 1:2014.1.3-0ubuntu1~cloud0.3contrail102
nova-compute-libvirt 1:2014.1.3-0ubuntu1~cloud0.3contrail102
nova-conductor 1:2014.1.3-0ubuntu1~cloud0.3contrail102
nova-console 1:2014.1.3-0ubuntu1~cloud0.3contrail102
nova-consoleauth 1:2014.1.3-0ubuntu1~cloud0.3contrail102
nova-novncproxy 1:2014.1.3-0ubuntu1~cloud0.3contrail102
nova-objectstore 1:2014.1.3-0ubuntu1~cloud0.3contrail102
nova-scheduler 1:2014.1.3-0ubuntu1~cloud0.3contrail102
python-contrail 2.21-102 102
python-contrail-vrouter-api 2.21-102 102
python-neutronclient 2:2.3.4-0ubuntu1.2contrail 102
python-nova 1:2014.1.3-0ubuntu1~cloud0.3contrail102
python-opencontrail-vrouter-netns 2.21-102 102
root@csp-sol-mustang:~#
root@csp-sol-mustang:~# vi heat.log
root@csp-sol-mustang:~# grep jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2 heat.log > stack_update_in_progress_heat-logs
root@csp-sol-mustang:~# vi stack_update_in_progress_heat-logs
root@csp-sol-mustang:~#
root@csp-sol-mustang:~#
root@csp-sol-mustang:~# heat stack-list
+--------------------------------------+------------------------------------------+--------------------+----------------------+
| id | stack_name | stack_status | creation_time |
+--------------------------------------+------------------------------------------+--------------------+----------------------+
| 80f1f842-84f3-4938-92c0-fa83e3546e65 | jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2 | UPDATE_IN_PROGRESS | 2015-11-21T10:39:31Z |
+--------------------------------------+------------------------------------------+--------------------+----------------------+
root@csp-sol-mustang:~# top

top - 17:03:07 up 13 days, 17:19, 3 users, load average: 0.67, 0.87, 3.54
Tasks: 552 total, 2 running, 550 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.3 us, 0.4 sy, 0.0 ni, 98.0 id, 0.2 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 65704008 total, 31510756 used, 34193252 free, 96084 buffers
KiB Swap: 67096572 total, 26006016 used, 41090556 free. 2649820 cached Mem

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 6286 root 20 0 1611380 24224 10808 S 6.3 0.0 823:38.71 libvirtd
20907 root 20 0 601728 68032 15748 S 5.6 0.1 602:47.44 python
 2520 rabbitmq 20 0 2975876 134224 2228 S 3.6 0.2 720:31.72 beam.smp
 2391 root 20 0 22.726g 9.248g 761352 S 2.3 14.8 1594:09 java
 2415 root 20 0 835868 38296 3980 S 2.0 0.1 251:42.02 contrail-discov
 2403 contrail 20 0 2326872 1.425g 5688 S 1.3 2.3 1300:36 contrail-collec
 2410 contrail 20 0 1354016 49812 6020 S 1.3 0.1 111:14.40 contrail-dns
 2390 nova 20 0 160904 22460 4116 S 1.0 0.0 145:52.14 nova-api
 2392 nova 20 0 136076 15928 4068 S 1.0 0.0 142:26.26 nova-conductor
 2395 cinder 20 0 158128 9484 4100 S 0.7 0.0 139:47.54 cinder-api
 3507 nova 20 0 270392 80716 2996 S 0.7 0.1 25:13.69 nova-conductor
15779 root 20 0 25344 2104 1180 R 0.7 0.0 0:00.10 top
    8 root 20 0 0 0 0 R 0.3 0.0 58:00.06 rcu_sched
    9 root 20 0 0 0 0 S 0.3 0.0 6:15.12 rcuos/0
   19 root 20 0 0 0 0 S 0.3 0.0 4:02.79 rcuos/10
 2333 nova 20 0 1760148 61652 6660 S 0.3 0.1 196:49.34 nova-compute
 2389 root 20 0 159976 21504 3924 S 0.3 0.0 21:49.63 python
 2406 contrail 20 0 249028 33632 3992 S 0.3 0.1 51:57.39 contrail-analyt
 2411 nova 20 0 272272 83992 4408 S 0.3 0.1 24:07.66 nova-scheduler
 2417 contrail 20 0 387444 128356 4416 S 0.3 0.2 37:23.02 contrail-device
 2424 contrail 20 0 593736 321084 4440 S 0.3 0.5 64:19.57 contrail-svc-mo
 3449 nova 20 0 270888 81624 2992 S 0.3 0.1 26:42.51 nova-conductor
 3461 nova 20 0 270340 80984 2992 S 0.3 0.1 25:22.31 nova-conductor
 3476 nova 20 0 270176 80972 2992 S 0.3 0.1 25:26.83 nova-conductor
 3481 nova 20 0 270392 81160 2992 S 0.3 0.1 25:34.40 nova-conductor
 3515 nova 20 0 269124 79780 2996 S 0.3 0.1 26:24.99 nova-conductor
 3549 nova 20 0 270204 80972 2992 S 0.3 0.1 26:13.05 nova-conductor
 3587 nova 20 0 270576 81224 2992 S 0.3 0.1 26:06.14 nova-conductor
 3641 nova 20 0 269256 79940 2996 S 0.3 0.1 25:50.33 nova-conductor
 3810 nova 20 0 269568 80284 2992 S 0.3 0.1 25:53.54 nova-conductor
14906 root 20 0 0 0 0 S 0.3 0.0 0:06.75 kworker/3:4
29401 root 20 0 76300 13320 1828 S 0.3 0.0 81:10.48 Xtightvnc
29829 root 20 0 287448 3448 2508 S 0.3 0.0 8:48.88 accounts-daemon
    1 root 20 0 34792 3856 1388 S 0.0 0.0 4:38.10 init
    2 root 20 0 0 0 0 S 0.0 0.0 0:01.87 kthreadd
    3 root 20 0 0 0 0 S 0.0 0.0 0:46.96 ksoftirqd/0
    5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
   10 root 20 0 0 0 0 S 0.0 0.0 13:48.00 rcuos/1
   11 root 20 0 0 0 0 S 0.0 0.0 6:36.98 rcuos/2
   12 root 20 0 0 0 0 S 0.0 0.0 11:46.50 rcuos/3
   13 root 20 0 0 0 0 S 0.0 0.0 6:51.81 rcuos/4
   14 root 20 0 0 0 0 S 0.0 0.0 12:37.40 rcuos/5
   15 root 20 0 0 0 0 S 0.0 0.0 6:52.22 rcuos/6
   16 root 20 0 0 0 0 S 0.0 0.0 12:22.28 rcuos/7
   17 root 20 0 0 0 0 S 0.0 0.0 3:13.62 rcuos/8
   18 root 20 0 0 0 0 S 0.0 0.0 6:35.97 rcuos/9
   20 root 20 0 0 0 0 S 0.0 0.0 6:50.95 rcuos/11
   21 root 20 0 0 0 0 S 0.0 0.0 3:32.93 rcuos/12
   22 root 20 0 0 0 0 S 0.0 0.0 6:13.98 rcuos/13
   23 root 20 0 0 0 0 S 0.0 0.0 4:01.90 rcuos/14
   24 root 20 0 0 0 0 S 0.0 0.0 7:40.37 rcuos/15
   25 root 20 0 0 0 0 S 0.0 0.0 0:00.01 rcuos/16
   26 root 20 0 0 0 0 S 0.0 0.0 0:00.01 rcuos/17
   27 root 20 0 0 0 0 S 0.0 0.0 0:00.01 rcuos/18
   28 root 20 0 0 0 0 S 0.0 0.0 0:00.01 rcuos/19
root@csp-sol-mustang:~#
root@csp-sol-mustang:~#
root@csp-sol-mustang:~#
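
When the test runs many stacks, scanning the `heat stack-list` output by eye becomes error-prone. The sketch below parses the ASCII table printed by the CLI and lists the stacks still in an `*_IN_PROGRESS` state. It assumes the table layout shown in the session above (pipe-separated cells, `+---+` border lines); the function names are illustrative only.

```python
def parse_cli_table(text):
    """Parse a pipe-delimited ASCII table (as printed above) into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines, prompts, and blanks
        # Drop the outer pipes, then split the remaining cells.
        rows.append([cell.strip() for cell in line[1:-1].split("|")])
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def stacks_in_progress(text):
    """Return the names of stacks whose status ends with _IN_PROGRESS."""
    return [row["stack_name"] for row in parse_cli_table(text)
            if row.get("stack_status", "").endswith("_IN_PROGRESS")]


# Table copied from the session above.
SAMPLE = """
+--------------------------------------+------------------------------------------+--------------------+----------------------+
| id | stack_name | stack_status | creation_time |
+--------------------------------------+------------------------------------------+--------------------+----------------------+
| 80f1f842-84f3-4938-92c0-fa83e3546e65 | jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2 | UPDATE_IN_PROGRESS | 2015-11-21T10:39:31Z |
+--------------------------------------+------------------------------------------+--------------------+----------------------+
"""

print(stacks_in_progress(SAMPLE))
# prints ['jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2']
```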

Heat log snippet (last few lines):
===========================
2015-11-21 16:53:52.965 20398 DEBUG root [-] JSON response : {"resources": [{"resource_name": "FireFly_Firewall-0-vnf-instance", "links": [{"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65/resources/FireFly_Firewall-0-vnf-instance", "rel": "self"}, {"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65", "rel": "stack"}], "logical_resource_id": "FireFly_Firewall-0-vnf-instance", "resource_status_reason": "state changed", "updated_time": "2015-11-21T10:39:40Z", "required_by": ["site-access-policy"], "resource_status": "CREATE_COMPLETE", "physical_resource_id": "690cbb10-9271-42a3-a282-a2a3e41fe93b", "resource_type": "OS::Contrail::ServiceInstance"}, {"resource_name": "FireFly_Firewall-0-vnf-flavor", "links": [{"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65/resources/FireFly_Firewall-0-vnf-flavor", "rel": "self"}, {"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65", "rel": "stack"}], "logical_resource_id": "FireFly_Firewall-0-vnf-flavor", "resource_status_reason": "state changed", "updated_time": "2015-11-21T10:39:31Z", "required_by": ["FireFly_Firewall-0-vnf-template"], "resource_status": "CREATE_COMPLETE", "physical_resource_id": "", "resource_type": "OS::JSM::GetFlavor"}, {"resource_name": "ns-exit", "links": [{"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65/resources/ns-exit", "rel": "self"}, {"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65", 
"rel": "stack"}], "logical_resource_id": "ns-exit", "resource_status_reason": "", "updated_time": "2015-11-21T11:23:52Z", "required_by": [], "resource_status": "INIT_COMPLETE", "physical_resource_id": "", "resource_type": "OS::Contrail::AttachPolicy"}, {"resource_name": "site-access-policy", "links": [{"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65/resources/site-access-policy", "rel": "self"}, {"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65", "rel": "stack"}], "logical_resource_id": "site-access-policy", "resource_status_reason": "", "updated_time": "2015-11-21T11:23:52Z", "required_by": ["ns-entry", "ns-exit"], "resource_status": "INIT_COMPLETE", "physical_resource_id": "", "resource_type": "OS::Contrail::NetworkPolicy"}, {"resource_name": "ns-entry", "links": [{"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65/resources/ns-entry", "rel": "self"}, {"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65", "rel": "stack"}], "logical_resource_id": "ns-entry", "resource_status_reason": "", "updated_time": "2015-11-21T11:23:52Z", "required_by": [], "resource_status": "INIT_COMPLETE", "physical_resource_id": "", "resource_type": "OS::Contrail::AttachPolicy"}, {"resource_name": "FireFly_Firewall-0-vnf-template", "links": [{"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65/resources/FireFly_Firewall-0-vnf-template", "rel": "self"}, {"href": 
"http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65", "rel": "stack"}], "logical_resource_id": "FireFly_Firewall-0-vnf-template", "resource_status_reason": "state changed", "updated_time": "2015-11-21T10:39:31Z", "required_by": ["FireFly_Firewall-0-vnf-instance"], "resource_status": "CREATE_COMPLETE", "physical_resource_id": "283501cb-dfb0-427b-9430-eb211a63ecda", "resource_type": "OS::Contrail::ServiceTemplate"}]} to_json /usr/lib/python2.7/dist-packages/heat/common/wsgi.py:577
2015-11-21 16:56:14.252 20398 DEBUG root [-] JSON response : {"stacks": [{"description": "HOT template to create network service\n", "links": [{"href": "http://10.213.2.96:8004/v1/55d45f62d21c4d9889e3bc858c200c26/stacks/jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2/80f1f842-84f3-4938-92c0-fa83e3546e65", "rel": "self"}], "stack_status_reason": "Stack UPDATE started", "stack_name": "jsm-20074f5d-d42a-41b5-ad8c-4ab4e086b8e2", "creation_time": "2015-11-21T10:39:31Z", "updated_time": "2015-11-21T10:46:01Z", "stack_status": "UPDATE_IN_PROGRESS", "id": "80f1f842-84f3-4938-92c0-fa83e3546e65"}]} to_json /usr/lib/python2.7/dist-packages/heat/common/wsgi.py:577
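
The resource listing in the first log entry is hard to read as a single line. A minimal sketch for grouping the resources by status follows. It assumes each log entry has first been joined back into one line and has the shape seen above (`JSON response : {...} to_json <path>`); the helper names are illustrative, and the sample payload is trimmed to the two fields the code reads.

```python
import json
from collections import defaultdict


def extract_json(log_line):
    """Return the JSON text between 'JSON response : ' and the trailing ' to_json ...'."""
    after = log_line.partition("JSON response : ")[2]
    return after.rpartition(" to_json ")[0]


def group_by_status(payload):
    """Map each resource_status to the resource names reporting it."""
    groups = defaultdict(list)
    for res in json.loads(payload).get("resources", []):
        groups[res["resource_status"]].append(res["resource_name"])
    return dict(groups)


# Names and statuses taken from the log entry above; other fields omitted.
SAMPLE_LINE = (
    '2015-11-21 16:53:52.965 20398 DEBUG root [-] JSON response : {"resources": ['
    '{"resource_name": "FireFly_Firewall-0-vnf-instance", "resource_status": "CREATE_COMPLETE"}, '
    '{"resource_name": "FireFly_Firewall-0-vnf-flavor", "resource_status": "CREATE_COMPLETE"}, '
    '{"resource_name": "ns-exit", "resource_status": "INIT_COMPLETE"}, '
    '{"resource_name": "site-access-policy", "resource_status": "INIT_COMPLETE"}, '
    '{"resource_name": "ns-entry", "resource_status": "INIT_COMPLETE"}, '
    '{"resource_name": "FireFly_Firewall-0-vnf-template", "resource_status": "CREATE_COMPLETE"}'
    ']} to_json /usr/lib/python2.7/dist-packages/heat/common/wsgi.py:577'
)

summary = group_by_status(extract_json(SAMPLE_LINE))
for status, names in sorted(summary.items()):
    print(status, names)
```

For the entry above, this shows the three policy-related resources (ns-exit, site-access-policy, ns-entry) at INIT_COMPLETE and the three service-instance resources at CREATE_COMPLETE, while the stack itself still reports UPDATE_IN_PROGRESS.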

Vijay Anand (vijanand)
information type: Proprietary → Public