Occasional "Instance failed to spawn" in contrail 2.21

Bug #1506739 reported by Deepak Jeyaraman
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Juniper Openstack | New | High | Rudra Rugge |

Bug Description

Copying the email exchange to this bug:

Using contrail 2.21 on Ubuntu 14.04

===

From: Praneet Bachheti
Sent: Thursday, October 15, 2015 2:26 PM
To: Ashish Ranjan <email address hidden>
Cc: PrasannaKumar Anandan Gajendran <email address hidden>; Deepak Jayaraman <email address hidden>; Ranjini Rajendran <email address hidden>; Sreelakshmi Sarva <email address hidden>; Arivudainambi A <email address hidden>; Sudhir Kumar <email address hidden>; Hari Prasad Killi <email address hidden>; Rudra Rugge <email address hidden>
Subject: Re: Heat stack-create failure in Contrail 2.21

I think we hit a bug … here is the explanation and a workaround for now until it is fixed …

The VM created by the service-instance is not present in Nova although the VM id is present in the api-server.

Whenever we do a stack-show, heat tries to get the VM from nova and fails with a NOT_FOUND exception.

Talked to Rudra: it is possible that a request to nova to create a VM sometimes does not succeed. In that case, we never relaunch the VM, and a stale vm-id is left in api-server.
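The condition described above (a VM id known to api-server but absent from nova) amounts to a set difference. A minimal sketch, assuming the two id lists have already been collected by whatever means is available on the setup; `find_stale_vm_ids` is a hypothetical helper for illustration and not part of Contrail or nova:

```python
def find_stale_vm_ids(api_server_vm_ids, nova_vm_ids):
    """Return ids that api-server knows about but nova does not, sorted.

    Both arguments are plain iterables of UUID strings (an assumption;
    how they are fetched is outside the scope of this sketch).
    """
    # Compare case-insensitively, since UUIDs may be printed in either case.
    nova_ids = {vm_id.lower() for vm_id in nova_vm_ids}
    return sorted(
        vm_id for vm_id in set(api_server_vm_ids)
        if vm_id.lower() not in nova_ids
    )
```

Any id this returns would be a candidate for the workaround, after confirming by hand that nova really has no such instance.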

Workaround:

Delete the virtual-machine from api-server, and the VM will be relaunched by service-monitor.

Command to do that:

curl -u admin:contrail123 -X DELETE http://127.0.0.1:8095/virtual-machine/774b25e0-41c5-42b0-8eef-f3650cd4afcb

The id at the end is the instance-id, which is returned in the heat error log.
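For anyone scripting the workaround, the same request can be prepared in Python. This is only a sketch under assumptions: the host, port, and credentials are copied from the curl command above, `build_vm_delete_request` is a hypothetical helper name, and the request is built but not sent.

```python
import base64
import urllib.request
import uuid


def build_vm_delete_request(vm_id, user, password,
                            host="127.0.0.1", port=8095):
    """Build (but do not send) a DELETE for a virtual-machine object."""
    # uuid.UUID raises ValueError on a malformed id, which guards
    # against deleting the wrong URL because of a copy/paste slip.
    canonical_id = str(uuid.UUID(vm_id))
    url = "http://%s:%d/virtual-machine/%s" % (host, port, canonical_id)
    request = urllib.request.Request(url, method="DELETE")
    # HTTP basic auth, equivalent to curl's -u user:password.
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    return request


# To actually issue the request on the affected node:
#   urllib.request.urlopen(build_vm_delete_request(vm_id, user, password))
```

Keeping the build step separate from the send step makes it possible to print and review the URL before anything is deleted.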

For now I have applied the workaround and I can do a heat stack-show.

- Praneet

    On Oct 15, 2015, at 1:41 PM, Praneet Bachheti <email address hidden> wrote:

    Looking in to it.

        On Oct 15, 2015, at 1:28 PM, Ashish Ranjan <email address hidden> wrote:

        Praneet could you check.

            On Oct 15, 2015, at 1:24 PM, PrasannaKumar Anandan Gajendran <email address hidden> wrote:

            Ashish,

            Can someone please take a look at this ASAP.

            Thanks

            Prasanna

            From: Deepak Jayaraman
            Sent: Thursday, October 15, 2015 1:22 PM
            To: Ranjini Rajendran <email address hidden>; Sreelakshmi Sarva <email address hidden>; Praneet Bachheti <email address hidden>
            Cc: PrasannaKumar Anandan Gajendran <email address hidden>; Arivudainambi A <email address hidden>; Sudhir Kumar <email address hidden>; Hari Prasad Killi <email address hidden>
            Subject: Re: Heat stack-create failure in Contrail 2.21
            Importance: High

            Ranjini, Sree,

            We re-provisioned the server nethra-01 with 2.21 again and we ran into the same issue again.

            The setup nethra-01 (root/Embe1mpls) is in this state, please debug so that we can quickly identify the root cause.

            Test:

            We were bringing up 20 heat stacks serially at a 10-second interval. The 16th one failed and is in this state:

            The 2 instances after the failed one are fine ...

            root@nethra-01:~# heat stack-list
            +--------------------------------------+------------------------------------------+-----------------+----------------------+
            | id                                   | stack_name                               | stack_status    | creation_time        |
            +--------------------------------------+------------------------------------------+-----------------+----------------------+
            | fbb9c12f-aaf6-4f99-9ab7-ad5575cf8bd2 | jsm-22292fd6-8548-475d-b06a-2ac1f23afe0b | UPDATE_COMPLETE | 2015-10-15T16:33:09Z |
            | 492c13d4-d9c4-4633-9fcc-a77172b3590e | jsm-8c7bef53-7ad9-4cd7-bddc-afcf751a8122 | CREATE_COMPLETE | 2015-10-15T17:45:13Z |
            | 526f6271-7cf6-4240-8c1c-430a176a53b9 | jsm-74c015a1-99d1-4b71-8532-44d4f65ac8b0 | CREATE_COMPLETE | 2015-10-15T17:45:25Z |
            | 2b8fb47f-8fd8-4054-ac92-f7a0407da6bd | jsm-51349dda-7630-4ed6-bbef-a9f265397bf2 | CREATE_COMPLETE | 2015-10-15T17:45:37Z |
            | 7011d204-660d-44ef-8d85-9b1fea7cecc4 | jsm-1f426bcc-48f1-4f08-bbc7-b125bf529f74 | CREATE_COMPLETE | 2015-10-15T17:45:52Z |
            | 149f1d40-9c0d-479d-8114-f191c49a2b43 | jsm-26af175a-4742-40ef-afcc-aedb4429a0a2 | UPDATE_COMPLETE | 2015-10-15T17:46:05Z |
            | 544eca81-a5d1-4853-8b12-e3e5ad66e4b6 | jsm-6433cf1e-d5b9-4124-bbdf-71ba8aa9a2cc | UPDATE_COMPLETE | 2015-10-15T17:46:21Z |
            | a1dee4ac-0c4c-4363-a700-b43148630cb3 | jsm-3c8f8363-4f28-4753-9587-99e088a314e7 | UPDATE_COMPLETE | 2015-10-15T17:46:40Z |
            | 18313a51-5d93-4a00-be8d-ea976f7ac0e2 | jsm-3d9b6544-a86c-429b-8fc3-d59469d91933 | CREATE_COMPLETE | 2015-10-15T17:46:54Z |
            | 09e297bc-dc71-41b7-9453-1e50b94ebb12 | jsm-08c8391a-5ab4-47a7-bfd7-f97a6beb749c | UPDATE_COMPLETE | 2015-10-15T17:47:09Z |
            | 916c387a-acef-4b6d-9f7f-8792b897a41a | jsm-d0d09066-1d5c-4e54-a5ca-2320c162a3ea | CREATE_COMPLETE | 2015-10-15T17:47:24Z |
            | 788dae21-1b70-46cb-80ce-03238f8f02ed | jsm-7963ce08-99e6-4b05-8584-f981fcb179e9 | CREATE_COMPLETE | 2015-10-15T17:47:39Z |
            | 26afd46f-2649-4e90-97d5-22719fdc0874 | jsm-07a5c580-e14c-478f-b217-6511d355e1a0 | CREATE_COMPLETE | 2015-10-15T17:47:56Z |
            | 6f6f99a2-7649-4110-9bd8-f6ddf33492ba | jsm-f52e3599-0ea3-4cee-af9e-ce871cbab7f4 | CREATE_COMPLETE | 2015-10-15T17:48:18Z |
            | a92c8981-70e6-4657-ab2b-6f32af4b5dfc | jsm-708d8fa9-e34c-4e2b-a977-102e8fc9842a | CREATE_COMPLETE | 2015-10-15T17:48:55Z |
            | ac9d8816-1fa1-4799-a7c5-d96785377888 | jsm-56a5ff49-7905-4819-bb47-f35220a24566 | CREATE_COMPLETE | 2015-10-15T17:49:13Z |
            | d14ad529-26a0-4fd0-869f-4a89aa1f951e | jsm-1daa61f8-8141-4b5f-b341-c6749644514d | UPDATE_COMPLETE | 2015-10-15T17:49:25Z |
            | 655dae6b-bc07-47cd-92d1-351b1823f17b | jsm-b5efbf32-6e00-49f3-a779-edde40edfd4f | UPDATE_COMPLETE | 2015-10-15T17:49:42Z |
            +--------------------------------------+------------------------------------------+-----------------+----------------------+

            root@nethra-01:~# heat stack-show ac9d8816-1fa1-4799-a7c5-d96785377888
            ERROR: Remote error: NotFound Instance could not be found (HTTP 404) (Request-ID: req-9b4a1f23-6d44-4a81-803d-1291113b1fea)

            But the other ones are fine:

            root@nethra-01:~# heat stack-show a92c8981-70e6-4657-ab2b-6f32af4b5dfc | grep template
            | description | HOT template to create network service |
            | | "output_value": "708d8fa9-e34c-4e2b-a977-102e8fc9842a-vFF-0-vnf-template", |
            | | "description": "Service template name", |
            | | "output_key": "vFF-0-vnf-template_name" |
            | | "description": "Service template service instances", |
            | | "output_key": "vFF-0-vnf-template_service_instances" |
            | | "output_value": "default-domain:708d8fa9-e34c-4e2b-a977-102e8fc9842a-vFF-0-vnf-template", |
            | | "description": "Service instance service template", |
            | | "output_key": "vFF-0-vnf-instance_service_template" |
            | | "output_value": "default-domain:708d8fa9-e34c-4e2b-a977-102e8fc9842a-vFF-0-vnf-template", |
            | | "description": "Service template FQ name", |
            | | "output_key": "vFF-0-vnf-template_fq_name" |
            | template_description | HOT template to create network service

            Diving into the logs:

            nova log:

            48011:2015-10-15 10:50:59.267 8024 ERROR nova.compute.manager [req-2976a46a-34e3-49ad-a15a-50f598a7c04c 8549a9d31c3b48df96ecb4e408ddf010 107778f07fd94897b871bce0fd4c8ef2] [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] Instance failed to spawn
            48012:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] Traceback (most recent call last):
            48013:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1740, in _spawn
            48014:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] block_device_info)
            48015:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2294, in spawn
            48016:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] write_to_disk=True)
            48017:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3506, in to_xml
            48018:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] disk_info, rescue, block_device_info)
            48019:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3316, in get_guest_config
            48020:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] for vif in network_info:
            48021:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] File "/usr/lib/python2.7/dist-packages/nova/network/model.py", line 420, in __iter__
            48022:2015-10-15 10:50:59.267 8024 TRACE nova.compute.manager [instance: 774b25e0-41c5-42b0-8eef-f3650cd4afcb] return self._sync_wrapper(fn, *args, **kwargs)
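To find every affected instance id in a nova-compute log rather than spotting them by eye, the "Instance failed to spawn" lines can be filtered with a regular expression. A minimal sketch, assuming the log lines have the format shown in the excerpt above; `failed_spawn_instance_ids` is a hypothetical helper name:

```python
import re

# Matches nova-compute ERROR lines of the form shown in the excerpt:
#   ... [instance: <uuid>] Instance failed to spawn
_SPAWN_FAILURE = re.compile(
    r"\[instance: ([0-9a-fA-F-]{36})\] Instance failed to spawn"
)


def failed_spawn_instance_ids(log_lines):
    """Return instance ids with a spawn failure, in first-seen order."""
    seen = []
    for line in log_lines:
        match = _SPAWN_FAILURE.search(line)
        # TRACE lines carry the same id but not the failure message,
        # so only the ERROR line contributes an id.
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

The function accepts any iterable of lines, so an open log file object can be passed directly.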

Tags: config heat
Changed in juniperopenstack:
importance: Undecided → High
assignee: nobody → Praneet Bachheti (praneetb)
tags: added: config heat
Changed in juniperopenstack:
assignee: Praneet Bachheti (praneetb) → Rudra Rugge (rudrarugge)
information type: Proprietary → Public