Deletion of VNF fails when attempted during ACTIVE-->DEAD state transitions

Bug #1503480 reported by Santosh
This bug affects 1 person
Affects: tacker
Status: Invalid
Importance: Medium
Assigned to: Bob Haddleton

Bug Description

When using the template file below to create a VNF, the VNF state keeps cycling ACTIVE-->DEAD-->ACTIVE-->DEAD...
Deleting the VNF then errors out, and the VNF gets stuck in the "PENDING_DELETE" state.

template_name: sample-vnfd-nonparam-respawn
description: demo-example

service_properties:
  Id: sample-vnfd
  vendor: tacker
  version: 1

vdus:
  vdu1:
    id: vdu1
    vm_image: cirros-0.3.4-x86_64-uec
    instance_type: m1.tiny
    user_data_format: RAW
    user_data: |
       #!/bin/sh
       df -h > /home/cirros/diskinfo
       sudo ifdown eth0

    network_interfaces:
      management:
        network: net_mgmt
        management: true
      pkt_in:
        network: net0
      pkt_out:
        network: net1

    placement_policy:
      availability_zone: nova

    auto-scaling: noop
    monitoring_policy: ping
    failure_policy: respawn

    config:
      param0: key0
      param1: key1

Revision history for this message
Santosh (ksantosh-cs) wrote :

Logs

Revision history for this message
Bob Haddleton (bob-haddleton) wrote :

Can you provide some background on why this template turns off network access as soon as the VM is started? I would not expect ping monitoring to ever succeed on this VM, so it would be respawned continuously.
Is there a specific scenario that this template is trying to emulate?

Revision history for this message
Santosh (ksantosh-cs) wrote :

The template is used to force the VM to be unreachable, so that once monitoring starts (using ping) and the VM is unreachable (because of the sudo ifdown eth0 command in the template), Tacker is supposed to declare the VNF DEAD and restart it.
The aim of the test case was to verify the VNF state transition ACTIVE-->DEAD-->ACTIVE. This verifies that something is pinging the VNF and restarting it when it becomes unreachable.

Revision history for this message
Sridhar Ramaswamy (srics-r) wrote :

@Bob -

What is our stance when a VDU with monitoring enabled comes up unreachable in the first place? I'd hope monitoring would still kick in and respawn it.

BTW - we also need a max respawn limit. We can keep respawning a hopeless VDU; instead, give up saying the max respawn limit was reached.

@Santosh -

Bob has a valid point about ping never getting started in the first place. Can we try introducing a sleep of T > boot_wait / monitoring_delay so that ping gets a chance to start first, and then do the 'ifconfig eth0 down'?
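
A minimal variant of the template's user_data along those lines might look like the fragment below. The 60-second value is an assumption for illustration; it just needs to comfortably exceed boot_wait / monitoring_delay so at least one ping succeeds before the interface goes down:

user_data_format: RAW
user_data: |
   #!/bin/sh
   df -h > /home/cirros/diskinfo
   # Assumed delay: let monitoring ping succeed at least once first
   sleep 60
   sudo ifdown eth0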

Revision history for this message
Sridhar Ramaswamy (srics-r) wrote :

Correction... I meant,

We can't keep respawning a hopeless VDU; instead ...

Changed in tacker:
importance: Undecided → High
tags: added: vnf-health-monitoring
Changed in tacker:
importance: High → Critical
tags: added: liberty-critical
Changed in tacker:
assignee: nobody → Bob Haddleton (bob-haddleton)
Revision history for this message
Abdul Rehman (an-abdulrehman) wrote :

Hi Sridhar/Bob,

I have edited the monitoring policy in Santosh's template and added the count parameter:

template_name: sample-vnfd-nonparam-respawn
description: demo-example

service_properties:
  Id: sample-vnfd
  vendor: tacker
  version: 1

vdus:
  vdu1:
    id: vdu1
    vm_image: cirros-0.3.4-x86_64-uec
    instance_type: m1.tiny
    user_data_format: RAW
    user_data: |
       #!/bin/sh
       df -h > /home/cirros/diskinfo
       sudo ifdown eth0

    network_interfaces:
      management:
        network: net_mgmt
        management: true
      pkt_in:
        network: net0
      pkt_out:
        network: net1

    placement_policy:
      availability_zone: nova
    auto-scaling: noop
    monitoring_policy:
      ping:
        monitoring_params:
          monitoring_delay: 15
          count: 3
          interval: .5
          timeout: 2
        actions:
          failure: respawn
    config:
      param0: key0
      param1: key1

I'm expecting that in this case 'count' should restrict the number of re-spawns to 3, but it actually keeps re-spawning forever. What's the purpose of count in the monitoring policy?

Also, is there any documentation where we can get the details of the template specification? I want to deploy my own VNF, but without a proper understanding of all the available parameters I'm finding it difficult.

Thanks.

Abdul

Revision history for this message
Sridhar Ramaswamy (srics-r) wrote :

Abdul -

Tacker docs are available at http://tacker-docs.readthedocs.org/en/latest/

There is a specific document on the monitoring framework:

http://tacker-docs.readthedocs.org/en/latest/devref/monitor-api.html

Unfortunately it doesn't go into the specifics of the ping monitor. Here "count" indicates the maximum number of retries before the VNF is marked for a healing action. If you have some cycles, please consider opening a doc bug and contributing a small .rst text (to the [1] dir) for the ping and http-ping monitoring drivers.

[1] https://github.com/openstack/tacker/tree/master/doc/source/devref
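
The retry semantics described above - "count" consecutive ping failures within one health check before the VNF is marked for healing - can be sketched roughly as follows. This is a simplified illustration, not Tacker's actual driver code; the function names are made up:

```python
import time


def is_vnf_dead(ping_once, count=3, interval=0.5):
    """Return True if `count` consecutive ping attempts all fail.

    `ping_once` is any callable returning True on a successful ping.
    The parameters mirror the template's monitoring_params: count / interval.
    """
    for attempt in range(count):
        if ping_once():
            return False          # a single success ends the check
        if attempt < count - 1:
            time.sleep(interval)  # wait between retries
    return True                   # all retries failed -> mark for healing
```

Note that on this reading, `count` bounds the retries within a single health check, not the number of respawns - which is consistent with the VNF being respawned indefinitely in the test above.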

Changed in tacker:
importance: Critical → Medium
Revision history for this message
Sridhar Ramaswamy (srics-r) wrote :

Abdul - BTW, a patchset for the max respawn limit is available here:

https://review.openstack.org/#/c/240814/

Again, if you are interested, consider picking this up from Bharath.

Revision history for this message
Sridhar Ramaswamy (srics-r) wrote :

Multiple fixes have been merged in the area of VNF monitoring and respawn. Closing this for now. Please reopen if you hit this problem again.

Changed in tacker:
status: New → Invalid