DM: Config removed after I extend VN and try to ping to lo0 address from VM

Bug #1714004 reported by Shashikiran H
Affects             Status                    Importance  Assigned to
Juniper Openstack   Status tracked in Trunk
  R4.0              Fix Committed             High        Suresh Balineni
  R4.1              Fix Committed             High        Suresh Balineni
  Trunk             Fix Committed             High        Suresh Balineni

Bug Description

Version: 4.0.1.0-50

Topo:
host1 = 'root@10.204.216.95'
host2 = 'root@10.204.216.96'
host3 = 'root@10.204.216.97'
host4 = 'root@10.204.216.98'
host5 = 'root@10.204.216.99'
host6 = 'root@10.204.216.103'
host7 = 'root@10.204.216.160'

ext_routers = [('blr-mx1', '10.10.10.101')]
env.roledefs = {
    'all': [host1, host2, host3, host4, host5, host6, host7],
    'contrail-controller': [host6, host2, host1],
    'contrail-analytics': [host6, host2, host1],
    'contrail-analyticsdb': [host6, host2, host1],
    'openstack': [host6, host2, host1],
    'contrail-compute': [host3, host4, host5],
    'contrail-lb': [host7],
    'build': [host_build]
}

I see that the DM config on the MX box is removed after I extend the VN and try to ping the lo0 IP from my VM. The VN does not have the external tag enabled. Also, the mode is set to L3, so I expect a ping from the VM to the lo0 L3 IP to pass.
From the DM logs:
08/30/2017 06:00:13 PM [contrail-device-manager]: could not fetch element data: //software-information/junos-version, ip: 10.204.217.190

08/30/2017 06:05:08 PM [contrail-device-manager]: Router 10.204.217.190: statement not found: groups __contrail__

Shashikiran H (skiranh)
Changed in juniperopenstack:
assignee: nobody → Suresh Balineni (sbalineni)
description: updated
Changed in juniperopenstack:
importance: Undecided → High
Jeba Paulaiyan (jebap)
tags: added: regression
Revision history for this message
Shashikiran H (skiranh) wrote :

Debugged with Suresh yesterday. I am seeing regular group delete messages in the DM logs:
08/31/2017 11:05:46 AM [contrail-device-manager]:
send netconf message: <config xmlns:xc="urn:ietf:params:xml:ns:netconf:base:1.0" xmlns:junos="http://xml.juniper.net/junos">
    <configuration>
        <groups operation="delete">
            <name>__contrail__</name>
        </groups>
        <apply-groups operation="delete">
            <name>__contrail__</name>
        </apply-groups>
    </configuration>
</config>

Suresh to provide further updates.
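
For anyone who wants to replay the delete message logged above against a test device, here is a minimal sketch that rebuilds the same payload with Python's standard library. The function name `build_group_delete` is hypothetical and is not DM code; the sketch mirrors the logged element names and the plain `operation="delete"` attribute, and it omits the `xmlns:xc` and `xmlns:junos` declarations seen in the log.

```python
import xml.etree.ElementTree as ET


def build_group_delete(group_name):
    """Build a <config> payload that deletes a group and its apply-groups entry.

    Hypothetical helper for illustration only; it copies the shape of the
    logged message and leaves out the namespace declarations.
    """
    config = ET.Element("config")
    configuration = ET.SubElement(config, "configuration")
    # Both the group definition and its apply-groups reference are deleted.
    for tag in ("groups", "apply-groups"):
        node = ET.SubElement(configuration, tag, {"operation": "delete"})
        ET.SubElement(node, "name").text = group_name
    return ET.tostring(config, encoding="unicode")


payload = build_group_delete("__contrail__")
print(payload)
```

How the payload is actually sent (session handling, commit, locking) is outside the scope of this sketch.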

Revision history for this message
Suresh Balineni (sbalineni) wrote :

Please provide the setup in the buggy state.

Revision history for this message
Vinoth Kannan Ganapathy (vganapathy) wrote :

setup details:
10.87.66.145
root/c0ntrail123
/root/testbed.py

Jeba Paulaiyan (jebap)
tags: added: sanity
Revision history for this message
Suresh Balineni (sbalineni) wrote :

I patched some changes on this bug, could you please try to reproduce?

Revision history for this message
Suresh Balineni (sbalineni) wrote :

Patched some changes for this bug on this setup, could you please try to reproduce the bug?

setup details:
10.87.66.145
root/c0ntrail123
/root/testbed.py

Revision history for this message
Shashikiran H (skiranh) wrote :

Below is Vinoth's setup which is in buggy state now.
10.87.66.145
root/c0ntrail123
Please use testbed file as /root/1714004_testbed.py and NOT /root/testbed.py

10.204.217.190 is the MX peer to be used. The __contrail__ group is currently absent from 10.204.217.190.

Revision history for this message
Vinoth Kannan Ganapathy (vganapathy) wrote :

Suresh,

It's the same setup, 10.87.66.145, loaded with the latest 4.1 build.

Revision history for this message
Suresh Balineni (sbalineni) wrote :

Thanks Vinoth.

Hi Shashi,

Vinoth provided the setup (not in the problem state). Could you please reproduce the problem on this setup?

Thanks,
Suresh

Revision history for this message
Shashikiran H (skiranh) wrote :

Cluster to be used: 10.204.217.7:/root/testbed.py
MX to be used is 10.204.217.190. It is in buggy state currently.

Revision history for this message
Shashikiran H (skiranh) wrote :

Following the mail asking me to reproduce the issue, I have now reproduced it again. The setup has a 4.1 image.
Cluster to be used: 10.204.217.7:/root/testbed.py
MX to be used is 10.204.217.190.

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/39194
Submitter: Suresh Balineni (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.0

Review in progress for https://review.opencontrail.org/39196
Submitter: Suresh Balineni (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.1

Review in progress for https://review.opencontrail.org/39197
Submitter: Suresh Balineni (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/39196
Committed: http://github.com/Juniper/contrail-controller/commit/33898462f2dc14038dc07945616fd3cec27c038e
Submitter: Zuul (<email address hidden>)
Branch: R4.0

commit 33898462f2dc14038dc07945616fd3cec27c038e
Author: sbalineni <email address hidden>
Date: Wed Jan 24 11:08:07 2018 -0800

[DM]: Handle gracefully when two successive BGP unlink/delete PR and create PR happens

Problem:

T0: DM receives delete "bgp router" link removal request. As a result of this, DM tries to delete config from device in a separate Device Greenlet. Thi

T1: Immediately, DM receives another request “physical-router” delete, in the context of PR object delete, DM deletes the config from device (this happ

T2: Device Greenlet fails to delete config from Device (since config was already deleted (T1)), and hence goes into “RETRY” mode.

T3: DM receives new PR create/BGP router update events. ==> New PR Object gets created locally, and this will create a new Device Greenlet and ultimate

T4: When timer expires, Old Device Greenlet retries to delete the config, and this time delete will be successful. Config will be gone from Device.

Solution:
Terminate Pending Device Greenlet in the context of PR delete

Change-Id: I68a965777169841197973133c68b0a08b25570f8
Closes-Bug: #1714004
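
The T0 to T4 sequence in the commit message above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: it uses `asyncio` tasks as a stand-in for DM's device greenlets, and `FakeDevice`, `delete_with_retry`, and `scenario` are hypothetical names, not DM code. It shows why a stale retrying delete wipes the freshly pushed config, and how cancelling the pending task during PR delete avoids that.

```python
import asyncio


class FakeDevice:
    """Hypothetical stand-in for the router: tracks whether the group exists."""

    def __init__(self):
        self.has_group = True

    def delete_group(self):
        if not self.has_group:
            raise RuntimeError("statement not found: groups __contrail__")
        self.has_group = False

    def push_group(self):
        self.has_group = True


async def delete_with_retry(device, retry_delay):
    """Retry the delete until it succeeds (models the "RETRY" mode)."""
    while True:
        try:
            device.delete_group()
            return
        except RuntimeError:
            await asyncio.sleep(retry_delay)


async def scenario(cancel_on_pr_delete):
    device = FakeDevice()
    # T0: the BGP router unlink schedules an asynchronous delete.
    pending = asyncio.create_task(delete_with_retry(device, retry_delay=0.05))
    # T1: the PR delete removes the config synchronously, before the task runs.
    device.delete_group()
    if cancel_on_pr_delete:
        pending.cancel()  # the fix: terminate the pending task on PR delete
    # T2: give the pending task a turn; if not cancelled, it fails and sleeps.
    await asyncio.sleep(0)
    # T3: a new PR is created and its config is pushed to the device.
    device.push_group()
    # T4: wait past the retry timer; a surviving old task deletes the new config.
    await asyncio.sleep(0.2)
    return device.has_group


print("config present without fix:", asyncio.run(scenario(False)))
print("config present with fix:   ", asyncio.run(scenario(True)))
```

The point the sketch makes is that the retry loop has no notion of which PR object it belongs to, so the only safe moment to stop it is when its owning PR is deleted.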

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/39405
Submitter: Suresh Balineni (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/39194
Submitter: Suresh Balineni (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.1

Review in progress for https://review.opencontrail.org/39406
Submitter: Suresh Balineni (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/39194
Committed: http://github.com/Juniper/contrail-controller/commit/ad98b1bc3a89df5e42e9becc5cec373672b60d81
Submitter: Zuul v3 CI (<email address hidden>)
Branch: master

commit ad98b1bc3a89df5e42e9becc5cec373672b60d81
Author: sbalineni <email address hidden>
Date: Wed Jan 24 11:08:07 2018 -0800

[DM]: Handle gracefully when two successive BGP unlink/delete PR and create PR happens

Problem:

T0: DM receives delete "bgp router" link removal request. As a result of this, DM tries to delete config from device in a separate Device Greenlet. Thi

T1: Immediately, DM receives another request “physical-router” delete, in the context of PR object delete, DM deletes the config from device (this happ

T2: Device Greenlet fails to delete config from Device (since config was already deleted (T1)), and hence goes into “RETRY” mode.

T3: DM receives new PR create/BGP router update events. ==> New PR Object gets created locally, and this will create a new Device Greenlet and ultimate

T4: When timer expires, Old Device Greenlet retries to delete the config, and this time delete will be successful. Config will be gone from Device.

Solution:
Terminate Pending Device Greenlet in the context of PR delete

Change-Id: I0d68d585230c503ed81650190d6e7aa4ba594875
Closes-Bug: #1714004

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/39406
Committed: http://github.com/Juniper/contrail-controller/commit/9618d9abe3672479ed45c9bd8b5ecf14e11908e4
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit 9618d9abe3672479ed45c9bd8b5ecf14e11908e4
Author: sbalineni <email address hidden>
Date: Wed Jan 24 11:08:07 2018 -0800

[DM]: Handle gracefully when two successive BGP unlink/delete PR and create PR happens

Problem:

T0: DM receives delete "bgp router" link removal request. As a result of this, DM tries to delete config from device in a separate Device Greenlet. Thi

T1: Immediately, DM receives another request “physical-router” delete, in the context of PR object delete, DM deletes the config from device (this happ

T2: Device Greenlet fails to delete config from Device (since config was already deleted (T1)), and hence goes into “RETRY” mode.

T3: DM receives new PR create/BGP router update events. ==> New PR Object gets created locally, and this will create a new Device Greenlet and ultimate

T4: When timer expires, Old Device Greenlet retries to delete the config, and this time delete will be successful. Config will be gone from Device.

Solution:
Terminate Pending Device Greenlet in the context of PR delete

Closes-Bug: #1714004

Conflicts:
 src/config/device-manager/test/test_case.py

Change-Id: Id6a35f4141aff0766194c64991ab6011d92ae88b
