VRouterAgent reports state non-functional even though one (1/1) control nodes are connected

Bug #1396164 reported by Stefan Andres
This bug affects 1 person
Affects            Status         Importance  Assigned to  Milestone
Juniper Openstack  Fix Committed  High        Ashok Singh
OpenContrail       Fix Committed  Undecided   Unassigned

Bug Description

Hi,

curl controller_ip:8081/analytics/uves/vrouter/$COMPUTE_NODE?cfilt=NodeStatus|python -m json.tool reports
[...]

"description": "Number of connections:4, Expected: 6",
"instance_id": "0",
"module_id": "VRouterAgent",
"state": "Non-Functional"
}

This happens even though the vrouter is properly connected to the control-node (e.g. IPs are working). The vrouter-agent.conf also says:

max_control_nodes = 1

Pedro mentioned that https://github.com/Juniper/contrail-controller/blob/1481d670e5941c76a446969bba4f912c826fb6af/src/base/connection_info.cc#L79 might be the code that is handling this wrongly.

As a result, the service instance for SNAT cannot be deployed.

Stefan Andres (s-andres)
description: updated
Revision history for this message
Pedro Marques (5-roque) wrote :

Source-NAT functionality depends on the analytics reporting the "Functional" state for the vrouter. The "Expected" number of connections seems to be set to 6, according to the number of TCP sessions expected for the analytics process.

Changed in juniperopenstack:
importance: Undecided → High
assignee: nobody → Raj Reddy (rajreddy)
tags: added: analytics
Revision history for this message
Raj Reddy (rajreddy) wrote :

The state 'Functional' or 'Non-functional' should be decided by the process based on its internal state.
I think we still need the full output of
curl controller_ip:8081/analytics/uves/vrouter/$COMPUTE_NODE?cfilt=NodeStatus
to determine why vrouter thinks some of its connections are not up.

tags: added: vrouter
Changed in juniperopenstack:
assignee: Raj Reddy (rajreddy) → Hari Prasad Killi (haripk)
Revision history for this message
Raj Reddy (rajreddy) wrote : Re: [Bug 1396164] Re: VRouterAgent reports state non-functional even though one (1/1) control nodes are connected

Hi Stefan,

Can you provide the full output of
curl controller_ip:8081/analytics/uves/vrouter/$COMPUTE_NODE?cfilt=NodeStatus|python -m json.tool
in the problematic state?

thanks
-
Raj


Revision history for this message
Stefan Andres (s-andres) wrote :
Revision history for this message
Stefan Andres (s-andres) wrote :

Hey Raj,

yep - I attached the file here.

Regards,
  Stefan

Revision history for this message
Pedro Marques (5-roque) wrote : Re: [Bug 1396164] VRouterAgent reports state non-functional even though one (1/1) control nodes are connected

Raj,
Why does the output include the following:
 "description": "Number of connections:4, Expected: 6"

  Pedro.


Revision history for this message
Raj Reddy (rajreddy) wrote :

Hi Stefan,

What is the version of the software?
Can you provide the /etc/contrail/contrail-vrouter-agent.conf file?
Also, are you using discovery or not?

thanks,
-
Raj


Revision history for this message
Ashok Singh (ashoksr) wrote :

This issue can be seen when both a discovery IP and a service IP are configured in contrail-vrouter-agent.conf. When both are configured, we don't subscribe to the discovery service, because the service IP given by config/cmd-line takes precedence. In this scenario, the expected connections count is not being updated correctly.
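
To illustrate the mismatch described above, here is a toy Python model. It is not the agent's actual code: the service names, the per-service counts, and both function names are hypothetical. It only shows how counting discovery-sized expectations for services the agent never subscribed to inflates the expected total.

```python
# Toy model only: service names and counts are illustrative assumptions,
# not values taken from the contrail-vrouter-agent source.

def expected_connections_buggy(services):
    """Count the discovery-sized expectation for every service, subscribed or not."""
    return sum(svc["discovery_count"] for svc in services)


def expected_connections_fixed(services):
    """Use the discovery count only for services the agent subscribed to.

    Statically configured services contribute their configured count instead.
    """
    total = 0
    for svc in services:
        if svc["subscribed"]:
            total += svc["discovery_count"]
        else:
            total += svc["static_count"]
    return total


services = [
    # Static IP configured, so no discovery subscription happens.
    {"name": "xmpp-server", "subscribed": False, "discovery_count": 2, "static_count": 1},
    {"name": "collector", "subscribed": False, "discovery_count": 1, "static_count": 1},
    # No static IP, so the agent subscribes via discovery.
    {"name": "dns-server", "subscribed": True, "discovery_count": 2, "static_count": 0},
]

print(expected_connections_buggy(services))  # 5
print(expected_connections_fixed(services))  # 4
```

With these made-up inputs the first function expects more connections than can ever come up, which is the kind of gap that shows up as "Number of connections: N, Expected: M" with N smaller than M.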

Changed in juniperopenstack:
assignee: Hari Prasad Killi (haripk) → Ashok Singh (ashoksr)
Revision history for this message
Stefan Andres (s-andres) wrote :

Hey Raj,
this is on a 1.20 release. This is our contrail-vrouter-agent.conf:

/etc/contrail $ grep -v '#' contrail-vrouter-agent.conf |sed '/^$/d'
[CONTROL-NODE]
server = $control_node
[DEFAULT]
collectors = $control_node:8086
debug = 1
http_server_port = 8085
log_level = SYS_DEBUG
syslog_facility = LOG_USER
[DISCOVERY]
server = $control_node
max_control_nodes = 1
[DNS]
server =
[HYPERVISOR]
[FLOWS]
[METADATA]
metadata_proxy_secret = $shared_secret
[NETWORKS]
control_network_ip = $local_external_ip
[VIRTUAL-HOST-INTERFACE]
name = vhost0
ip = $local_external_ip/26
gateway = $gateway
physical_interface = bond0
[GATEWAY-0]
[GATEWAY-1]
/etc/contrail $

Which service IP do you mean there? Do you mean the server= key in the [CONTROL-NODE] section?

Revision history for this message
Ashok Singh (ashoksr) wrote :

Hi Stefan,

The workaround until this issue gets fixed is to obtain all the service IPs from discovery. For this you need to replace

[CONTROL-NODE]
server = $control_node

with

[CONTROL-NODE]
#server = $control_node

and

[DEFAULT]
collectors = $control_node:8086

with

[DEFAULT]
#collectors = $control_node:8086

There is no need to change DNS configuration
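
If the same change has to be applied on several compute nodes, it could be scripted. The following is a minimal Python sketch, assuming the INI-style layout of contrail-vrouter-agent.conf shown earlier in this report; the helper name `comment_out` is hypothetical and not part of any Contrail tooling.

```python
def comment_out(conf_text, targets):
    """Comment out `key = value` lines for the given (section, key) pairs.

    conf_text is the INI-style file content; targets is a set of
    (section, key) tuples. Lines in other sections are left untouched.
    """
    section = None
    out = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Track which section the following keys belong to.
            section = stripped[1:-1]
        elif "=" in stripped and not stripped.startswith("#"):
            key = stripped.split("=", 1)[0].strip()
            if (section, key) in targets:
                line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


conf = (
    "[CONTROL-NODE]\n"
    "server = $control_node\n"
    "[DEFAULT]\n"
    "collectors = $control_node:8086\n"
    "debug = 1\n"
    "[DISCOVERY]\n"
    "server = $control_node\n"
)
result = comment_out(conf, {("CONTROL-NODE", "server"), ("DEFAULT", "collectors")})
print(result)
```

Matching on the section as well as the key matters here, because the `server` key under [DISCOVERY] must stay active for the workaround to make sense.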

Revision history for this message
Stefan Andres (s-andres) wrote :

Hm, when not setting the control-node the vrouter does not find the proper control-node anymore.

http://compute_node:8085/Snh_AgentXmppConnectionStatusReq just says 127.0.0.1.
http://controller_node:5998/services shows the correct IPs, though:

<hr /><a href="/">Home</a> &nbsp;|&nbsp<a href="/services">Publishers</a> &nbsp;|&nbsp<a href="/clients">Subscribers</a> &nbsp;|&nbsp<a href="/stats">Stats</a> &nbsp;|&nbsp<a href="/config">Config</a> &nbsp;|&nbsp<a title="purge inactive publishers" href="/cleanup">Cleanup</a> &nbsp;|&nbsp<hr /> <table border="1" cellpadding="1" cellspacing="0">
    <tr>
        <td>Service Type</td>
        <td>Remote IP</td>
        <td>Service Id</td>
        <td>Provision State</td>
        <td>Admin State</td>
        <td>In Use</td>
        <td>Time since last Heartbeat</td>
    </tr>
    <tr>
        <td>OpServer</td>
        <td>$control_node</td>
        <td><a href="/service/bka-001-01:OpServer/brief">bka-001-01:OpServer</a></td>
        <td>new</td>
        <td>up</td>
        <td><a href="/clients/OpServer/bka-001-01">0</a></td>
        <td bgcolor=#00FF00>0:00:05</td>
    </tr>
    <tr>
        <td>IfmapServer</td>
        <td>$control_node</td>
        <td><a href="/service/bka-001-01:IfmapServer/brief">bka-001-01:IfmapServer</a></td>
        <td>new</td>
        <td>up</td>
        <td><a href="/clients/IfmapServer/bka-001-01">2</a></td>
        <td bgcolor=#00FF00>0:00:05</td>
    </tr>
    <tr>
        <td>dns-server</td>
        <td>$control_node</td>
        <td><a href="/service/bka-001-01:dns-server/brief">bka-001-01:dns-server</a></td>
        <td>new</td>
        <td>up</td>
        <td><a href="/clients/dns-server/bka-001-01">9</a></td>
        <td bgcolor=#00FF00>0:00:04</td>
    </tr>
    <tr>
        <td>xmpp-server</td>
        <td>$control_node</td>
        <td><a href="/service/bka-001-01:xmpp-server/brief">bka-001-01:xmpp-server</a></td>
        <td>new</td>
        <td>up</td>
        <td><a href="/clients/xmpp-server/bka-001-01">0</a></td>
        <td bgcolor=#00FF00>0:00:04</td>
    </tr>
    <tr>
        <td>Collector</td>
        <td>$control_node</td>
        <td><a href="/service/bka-001-01:Collector/brief">bka-001-01:Collector</a></td>
        <td>new</td>
        <td>up</td>
        <td><a href="/clients/Collector/bka-001-01">7</a></td>
        <td bgcolor=#00FF00>0:00:05</td>
    </tr>
    <tr>
        <td>ApiServer</td>
        <td>$control_node</td>
        <td><a href="/service/bka-001-01:ApiServer/brief">bka-001-01:ApiServer</a></td>
        <td>new</td>
        <td>up</td>
        <td><a href="/clients/ApiServer/bka-001-01">0</a></td>
        <td bgcolor=#00FF00>0:00:05</td>
    </tr>
 </table>

Revision history for this message
Ashok Singh (ashoksr) wrote :
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/4997
Committed: http://github.org/Juniper/contrail-controller/commit/3aa2e2ea19d3dea52d1e18346947c152df6c4d6c
Submitter: Zuul
Branch: master

commit 3aa2e2ea19d3dea52d1e18346947c152df6c4d6c
Author: ashoksingh <email address hidden>
Date: Wed Nov 26 18:05:10 2014 +0530

Issue: When both discovery IP and service IP is configured in contrail-vrouter-agent.conf, the expected_connections count (used by NodeStatus UVE to declare whether agent is Functional or not) computed by agent was incorrect.
Fix: While updating expected_connections count we should consider whether agent has subscribed for a given discovery service or not.
Closes-Bug: #1396164

Change-Id: Idbef7de5e63bec52dbeb50547ed61ba37255c288

Changed in juniperopenstack:
status: New → Fix Committed
Changed in opencontrail:
status: New → Fix Committed
Revision history for this message
Stefan Andres (s-andres) wrote :

Thanks, we'll try that fix.

Revision history for this message
Martin Gerhard Loschwitz (martin-loschwitz) wrote :

Can we please get a version of this that is compatible with 1.20, i.e. the latest stable release? Thanks.

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/5038
Committed: http://github.org/Juniper/contrail-controller/commit/f5f99c0e6de71d99b9ffcfbbe874fbf4078c07e2
Submitter: Zuul
Branch: R2.0

commit f5f99c0e6de71d99b9ffcfbbe874fbf4078c07e2
Author: ashoksingh <email address hidden>
Date: Wed Nov 26 18:05:10 2014 +0530

Issue: When both discovery IP and service IP is configured in contrail-vrouter-agent.conf, the expected_connections count (used by NodeStatus UVE to declare whether agent is Functional or not) computed by agent was incorrect.
Fix: While updating expected_connections count we should consider whether agent has subscribed for a given discovery service or not.
Closes-Bug: #1396164

Change-Id: Idbef7de5e63bec52dbeb50547ed61ba37255c288
(cherry picked from commit 3aa2e2ea19d3dea52d1e18346947c152df6c4d6c)

Revision history for this message
Ashok Singh (ashoksr) wrote :

The fix is committed to mainline and R2.0. I have also submitted a patch for the R1.10 branch (from which 1.20 was released). It has already been reviewed and approval is pending (https://review.opencontrail.org/#/c/5042/).

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/5042
Committed: http://github.org/Juniper/contrail-controller/commit/a3c3f08755a835e8c729d132b0167563c8bbb58e
Submitter: Zuul
Branch: R1.10

commit a3c3f08755a835e8c729d132b0167563c8bbb58e
Author: ashoksingh <email address hidden>
Date: Thu Nov 27 22:21:54 2014 +0530

Issue: When both discovery IP and service IP is configured in contrail-vrouter-agent.conf, the expected_connections count (used by NodeStatus UVE to declare whether agent is Functional or not) computed by agent was incorrect.
Fix: While updating expected_connections count we should consider whether agent has subscribed for a given discovery service or not.
Closes-Bug: #1396164

Change-Id: I93837ea036276580c55903f7b2d8e36bbd66ca7d

Revision history for this message
Martin Gerhard Loschwitz (martin-loschwitz) wrote :

Ashok, thanks a bunch!

Revision history for this message
Stefan Andres (s-andres) wrote :

The fix works in our development setup; we'll try it on our real cluster in a few days:

$ curl localhost:8081/analytics/uves/vrouter/compute?cfilt=NodeStatus|python -m json.tool
  % Total % Received % Xferd Average Speed Time Time Time Current
                                 Dload Upload Total Spent Left Speed
100 532 0 532 0 0 47196 0 --:--:-- --:--:-- --:--:-- 48363
{
    "NodeStatus": {
        "process_status": [
            {
                "connection_infos": [
                    {
                        "description": "OpenSent",
                        "name": "control-node:10.0.80.12",
                        "server_addrs": [
                            "10.0.80.12:0"
                        ],
                        "status": "Up",
                        "type": "XMPP"
                    },
                    {
                        "description": "OpenSent",
                        "name": "dns-server:10.0.80.12",
                        "server_addrs": [
                            "10.0.80.12:53"
                        ],
                        "status": "Up",
                        "type": "XMPP"
                    },
                    {
                        "description": "Established",
                        "name": null,
                        "server_addrs": [
                            "10.0.80.12:8086"
                        ],
                        "status": "Up",
                        "type": "Collector"
                    }
                ],
                "description": null,
                "instance_id": "0",
                "module_id": "VRouterAgent",
                "state": "Functional"
            }
        ]
    }
}
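
For scripted checks, the same UVE output can be summarized programmatically. The following is a minimal Python sketch, assuming the JSON layout shown above; the function name `summarize_node_status` is hypothetical and not part of any Contrail tooling.

```python
import json


def summarize_node_status(uve_json, module_id="VRouterAgent"):
    """Return (state, description, connections) for one module in a NodeStatus UVE.

    Assumes the layout shown in this report:
    NodeStatus -> process_status[] -> connection_infos[].
    Returns (None, None, []) when the module is not present.
    """
    uve = json.loads(uve_json)
    for proc in uve.get("NodeStatus", {}).get("process_status", []):
        if proc.get("module_id") != module_id:
            continue
        connections = [
            (conn.get("type"), conn.get("name"), conn.get("status"))
            for conn in proc.get("connection_infos") or []
        ]
        return proc.get("state"), proc.get("description"), connections
    return None, None, []


sample = json.dumps({
    "NodeStatus": {"process_status": [{
        "connection_infos": [
            {"name": "control-node:10.0.80.12", "status": "Up", "type": "XMPP"},
            {"name": None, "status": "Up", "type": "Collector"},
        ],
        "description": None,
        "instance_id": "0",
        "module_id": "VRouterAgent",
        "state": "Functional",
    }]}
})
state, description, connections = summarize_node_status(sample)
print(state, description)
for conn_type, name, status in connections:
    print(f"  {str(conn_type):10} {name} -> {status}")
```

In practice the string passed to the function would be the body returned by the curl command above rather than the inline sample.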

Revision history for this message
Stefan Andres (s-andres) wrote :

That bug is solved (even though SNAT is still not working, but that would be a different report then, I guess).

Changed in juniperopenstack:
status: Fix Committed → Fix Released
Changed in opencontrail:
status: Fix Committed → Fix Released
status: Fix Released → Fix Committed
Changed in juniperopenstack:
status: Fix Released → Fix Committed
Revision history for this message
Stefan Andres (s-andres) wrote :
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/5272
Committed: http://github.org/Juniper/contrail-controller/commit/90021d5f8e212c0de6e82084f93a50691f347b5f
Submitter: Zuul
Branch: master

commit 90021d5f8e212c0de6e82084f93a50691f347b5f
Author: ashoksingh <email address hidden>
Date: Thu Dec 4 18:19:46 2014 +0530

As part of contrail-vrouter-agent NodeStatus UVE, send the status as NON-FUNCTIONAL only when all the agent's connection to control-node is down. In all other cases, agent reports the state as FUNCTIONAL.
Add UT cases.
Closes-Bug: #1396164

Change-Id: I3336bfb9eae9810307619ccb08e52a76322a24bf
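
The rule stated in this commit message can be sketched as follows. This is a minimal Python illustration, not the agent's implementation. It assumes that control-node connections are the `connection_infos` entries whose name starts with "control-node:", as in the UVE output earlier in this report, and it assumes that having no control-node connection at all also counts as Non-Functional. The function name `agent_state` is hypothetical.

```python
def agent_state(connection_infos):
    """Return "Non-Functional" only when every control-node connection is down.

    Assumption: control-node connections are identified by a name starting
    with "control-node:". Other connections (DNS, collector) are ignored.
    Assumption: no control-node connection at all is treated as Non-Functional.
    """
    control = [
        conn for conn in connection_infos
        if (conn.get("name") or "").startswith("control-node:")
    ]
    if any(conn.get("status") == "Up" for conn in control):
        return "Functional"
    return "Non-Functional"


# One control-node connection up, collector down: still Functional.
print(agent_state([
    {"name": "control-node:10.0.80.12", "status": "Up", "type": "XMPP"},
    {"name": None, "status": "Down", "type": "Collector"},
]))
```

Under this rule a missing collector or DNS connection no longer flips the agent state, which is the behavioural change the commit describes.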

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/5370
Committed: http://github.org/Juniper/contrail-controller/commit/09a2d383f04916cb9ca3985c3149b7530a50eb1c
Submitter: Zuul
Branch: R2.0

commit 09a2d383f04916cb9ca3985c3149b7530a50eb1c
Author: ashoksingh <email address hidden>
Date: Thu Dec 4 18:19:46 2014 +0530

As part of contrail-vrouter-agent NodeStatus UVE, send the status as NON-FUNCTIONAL only when all the agent's connection to control-node is down. In all other cases, agent reports the state as FUNCTIONAL.
Add UT cases.
Closes-Bug: #1396164

Change-Id: I3336bfb9eae9810307619ccb08e52a76322a24bf
(cherry picked from commit 90021d5f8e212c0de6e82084f93a50691f347b5f)
