agent/tor-agents did not get any ifmap from control node on tor-scale setup

Bug #1466825 reported by Vedamurthy Joshi
This bug affects 1 person
Affects              Status           Importance   Assigned to   Milestone
Juniper Openstack    (status tracked in Trunk)
  R2.0               Fix Committed    High         Tapan Karwa
  R2.1               Fix Committed    High         Tapan Karwa
  R2.20              Fix Committed    High         Tapan Karwa
  R3.0               Invalid          High         Tapan Karwa
  Trunk              Fix Committed    High         Tapan Karwa

Bug Description

R2.20 Build 54 Ubuntu 14.04 Juno Multi-node HA setup

On the tor-scale setup, supervisor-config was restarted on all the controller nodes. Within a few minutes, all supervisor-vrouters were stopped on the 5 compute nodes (which include the tsn and tor-agent nodes).
The supervisor-vrouters were stopped to avoid delays in the control-node graph walk.

After the entire system had stabilized (all processes were fine), in fact after many hours, the supervisor-vrouters were restarted on nodei37, nodei38, nodei27, nodei30 and nodei28.

It was seen that these agents/tor-agents did not get any ifmap-config from control nodes

XMPP connections to the 2 controllers were fine.
On the control-node which the tor-agents subscribed to, we could see the subscription for config, but no config was sent

gcores of the three control nodes are taken (will be in http://10.204.216.50/Docs/bugs/#)

Prakash will add more details based on his debugging.

env.roledefs = {
    'all': [host2, host3, host4, host5, host6, host7, host8, host9],
    'cfgm': [host2, host3, host4],
    'openstack': [host2, host3, host4],
    'webui': [host3],
    'control': [host2, host3, host4],
    'compute': [host5, host6, host7, host8, host9],
    'collector': [host2, host3, host4],
    'database': [host2, host3, host4],
    'toragent': [host5, host6, host7, host9],
    'tsn': [host5, host6, host7, host9],
    'build': [host_build],
}

env.hostnames = {
    'all': ['nodei34', 'nodei35', 'nodei36', 'nodei37', 'nodei38', 'nodei28', 'nodei27', 'nodei30']
}
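
To make the node names in the description easier to follow, here is a minimal sketch that resolves each role to node names. It assumes that env.hostnames['all'] lines up index by index with env.roledefs['all']; the string labels "host2".."host9" are placeholders for the real host entries, which are not shown in this report, and nodes_for is a hypothetical helper.

```python
# Placeholder labels; the real testbed binds host2..host9 to values not shown here.
all_hosts = ["host2", "host3", "host4", "host5", "host6", "host7", "host8", "host9"]
hostnames = ["nodei34", "nodei35", "nodei36", "nodei37",
             "nodei38", "nodei28", "nodei27", "nodei30"]

# Subset of the roles from the testbed above.
roledefs = {
    "control": ["host2", "host3", "host4"],
    "compute": ["host5", "host6", "host7", "host8", "host9"],
    "toragent": ["host5", "host6", "host7", "host9"],
    "tsn": ["host5", "host6", "host7", "host9"],
}

# Assumption: the two 'all' lists are positionally aligned.
name_of = dict(zip(all_hosts, hostnames))


def nodes_for(role):
    """Return the node names for a role (hypothetical helper)."""
    return [name_of[h] for h in roledefs[role]]


for role in ("control", "compute", "toragent"):
    print(role, nodes_for(role))
```

Under that assumption the compute role resolves to the same five nodes on which the supervisor-vrouters were restarted, and nodei36 is one of the control nodes.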

Revision history for this message
Prakash Bailkeri (prakashmb) wrote :

Taking nodei38-1 as the XmppClient (it is a ToR agent).
From the stats (session stats and ifmap stats), it is clear that config is not sent to the XmppClient. The Subscribe message is received from the client (it is visible in the trace and also in the stats).

(gdb) p *(IFMapClient *) 0x7f9557eacde0
$17 = (IFMapXmppChannel::IFMapSender) {
  <IFMapClient> = {
    _vptr.IFMapClient = 0xae3cd0 <vtable for IFMapXmppChannel::IFMapSender+16>,
    static kIndexInvalid = -1,
    index_ = 4,
    exporter_ = 0x1449f30,
    msgs_sent_ = 0,
    msgs_blocked_ = 0,
    bytes_sent_ = 0,
    nodes_sent_ = 0,
    links_sent_ = 0,
    send_is_blocked_ = false,
    vm_map_ = std::map with 0 elements,
    name_ = "nodei36:192.168.1.6"
  },
  members of IFMapXmppChannel::IFMapSender:
  parent_ = 0x7f94416ac9f0,
  hostname_ = "nodei38-1",
  identifier_ = "default-global-system-config:nodei38-1"
}
(gdb) p *(IFMapXmppChannel*) 0x7f94416ac9f0
$18 = (IFMapXmppChannel) {
  _vptr.IFMapXmppChannel = 0xae3d30 <vtable for IFMapXmppChannel+16>,
  peer_id_ = xmps::CONFIG,
  channel_ = 0x1f2df10,
  ifmap_server_ = 0x7fff29159400,
  ifmap_channel_manager_ = 0x7fff291591a0,
  ifmap_client_ = 0x7f9557eacde0,
  client_added_ = true,
  channel_name_ = "nodei36:192.168.1.6"
}
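
The reading of the two dumps above (client added, sender not blocked, every sent counter at zero) can be expressed as a small check. This is only a sketch for scanning such dumps by hand: the field names are copied from the gdb output, while parse_fields and looks_starved are hypothetical helpers and not part of any Contrail tool.

```python
import re

# Scalar fields copied from the two gdb dumps above.
GDB_DUMP = """
    index_ = 4,
    msgs_sent_ = 0,
    msgs_blocked_ = 0,
    bytes_sent_ = 0,
    nodes_sent_ = 0,
    links_sent_ = 0,
    send_is_blocked_ = false,
  client_added_ = true,
"""

# Matches simple "name = value," lines holding an integer or a boolean.
FIELD = re.compile(r"^[ \t]*(\w+)[ \t]*=[ \t]*(-?\d+|true|false),?[ \t]*$",
                   re.MULTILINE)


def parse_fields(dump):
    """Turn scalar 'name = value' lines of a gdb struct dump into a dict."""
    fields = {}
    for name, raw in FIELD.findall(dump):
        fields[name] = (raw == "true") if raw in ("true", "false") else int(raw)
    return fields


def looks_starved(fields):
    """True if the client is registered and unblocked but nothing was sent."""
    counters = ("msgs_sent_", "bytes_sent_", "nodes_sent_", "links_sent_")
    return (fields.get("client_added_") is True
            and fields.get("send_is_blocked_") is False
            and all(fields.get(c) == 0 for c in counters))


fields = parse_fields(GDB_DUMP)
print(looks_starved(fields))
```

A client that is merely flow-controlled would show send_is_blocked_ = true or a non-zero msgs_blocked_, which is not the case here.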

(gdb) p *(XmppChannel *) 0x1f2df10
$19 = (XmppChannelMux) {
  <XmppChannel> = {
    _vptr.XmppChannel = 0xb6d9f0 <vtable for XmppChannelMux+16>
  },
  members of XmppChannelMux:
  map_ = std::map with 0 elements,
  rxmap_ = std::map with 2 elements = {
    [xmps::CONFIG] = {
      <boost::function2<void, XmppStanza::XmppMessage const*, xmps::PeerState>> = {
        <boost::function_base> = {
          vtable = 0xae3331 <void boost::function2<void, XmppStanza::XmppMessage const*, xmps::PeerState>::assign_to<boost::_bi::bind_t<void, boost::_mfi::mf1<void, IFMapXmppChannel, XmppStanza::XmppMessage const*>, boost::_bi::list2<boost::_bi::value<IFMapXmppChannel*>, boost::arg<1> > > >(boost::_bi::bind_t<void, boost::_mfi::mf1<void, IFMapXmppChannel, XmppStanza::XmppMessage const*>, boost::_bi::list2<boost::_bi::value<IFMapXmppChannel*>, boost::arg<1> > >)::stored_vtable+1>,
          functor = {
            obj_ptr = 0x11,
            type = {
              type = 0x11,
              const_qualified = false,
              volatile_qualified = false
            },
            func_ptr = 0x11,
            bound_memfunc_ptr = {
              memfunc_ptr = &virtual table offset 16,
              obj_ptr = 0x7f94416ac9f0
            },
            obj_ref = {
              obj_ptr = 0x11,
              is_const_qualified = false,
              is_volatile_qualified = false
            },
            data = 17 '\021'
          }
        },
        <std::binary_function<XmppStanza::XmppMessage const*, xmps::PeerState, void>> = {<No data fields>},
        members of boost::function2<void, XmppStanza::XmppMessage const*, xmps::PeerState>:
        static args = <optimized out>,
        static arity = <optimized out>
      }, <No data fields>},
    [xmps::BGP] = {
      <boost::function2<void, XmppStanza::XmppMessage const*, xmps::PeerState>> = {
        <boost::function_base> = {
          vtable = 0xb239d1 <void boost::function2<void, XmppSt...


Changed in juniperopenstack:
assignee: nobody → Tapan Karwa (tkarwa)
tags: added: blocker
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/11940
Submitter: Tapan Karwa (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.1

Review in progress for https://review.opencontrail.org/11945
Submitter: Tapan Karwa (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.0

Review in progress for https://review.opencontrail.org/11946
Submitter: Tapan Karwa (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/11947
Submitter: Tapan Karwa (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/11947
Committed: http://github.org/Juniper/contrail-controller/commit/8db93fec1cbb046076440c51a58cd625bef588f4
Submitter: Zuul
Branch: master

commit 8db93fec1cbb046076440c51a58cd625bef588f4
Author: Tapan Karwa <email address hidden>
Date: Mon Jun 22 12:21:02 2015 -0700

Activate the ifmap update Q for all node/link changes

Closes-Bug: #1467650
Partial-Bug: #1466825

Change-Id: I88e975e685ebc43ad20b473bb62026bb36f6b044
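
For readers unfamiliar with this class of bug, the following toy sketch shows how a work queue that must be explicitly activated can strand items when one producer path skips the activation. It is not the Contrail code: UpdateQueue, produce and the "node"/"link" labels are hypothetical, and which code paths actually missed the activation is not stated in this report.

```python
from collections import deque


class UpdateQueue:
    """Toy work queue that is drained only after it has been activated."""

    def __init__(self):
        self.items = deque()
        self.active = False
        self.sent = []

    def enqueue(self, item, activate):
        self.items.append(item)
        if activate:
            self.active = True

    def run(self):
        """One scheduler pass; returns how many items were sent."""
        if not self.active:
            return 0  # nobody asked for a drain, so queued items just sit
        count = 0
        while self.items:
            self.sent.append(self.items.popleft())
            count += 1
        self.active = False
        return count


def produce(queue, changes, activate_on):
    """Enqueue changes, activating the queue only for the listed kinds."""
    for kind, name in changes:
        queue.enqueue((kind, name), activate=kind in activate_on)


changes = [("link", "a-b")]

stuck = UpdateQueue()
produce(stuck, changes, activate_on={"node"})          # one path forgets to activate
print(stuck.run(), len(stuck.items))

fixed = UpdateQueue()
produce(fixed, changes, activate_on={"node", "link"})  # activate for every change
print(fixed.run(), len(fixed.items))
```

In the first case the item stays queued until some unrelated change happens to activate the queue, which from the outside looks like a client that subscribed but never receives anything.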

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/11945
Committed: http://github.org/Juniper/contrail-controller/commit/2f48d8789a11955bf895b69caacf4e039ca65de7
Submitter: Zuul
Branch: R2.1

commit 2f48d8789a11955bf895b69caacf4e039ca65de7
Author: Tapan Karwa <email address hidden>
Date: Mon Jun 22 12:21:02 2015 -0700

Activate the ifmap update Q for all node/link changes

Closes-Bug: #1467650
Partial-Bug: #1466825

Change-Id: I88e975e685ebc43ad20b473bb62026bb36f6b044

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/11946
Committed: http://github.org/Juniper/contrail-controller/commit/7046092e608bbc767df37e833f16a1e024aa0af9
Submitter: Zuul
Branch: R2.0

commit 7046092e608bbc767df37e833f16a1e024aa0af9
Author: Tapan Karwa <email address hidden>
Date: Mon Jun 22 12:21:02 2015 -0700

Activate the ifmap update Q for all node/link changes

Closes-Bug: #1467650
Partial-Bug: #1466825

Change-Id: I88e975e685ebc43ad20b473bb62026bb36f6b044

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/11940
Committed: http://github.org/Juniper/contrail-controller/commit/659fe8cec5ca4e11c8d99c3b8c64d8109782e670
Submitter: Zuul
Branch: R2.20

commit 659fe8cec5ca4e11c8d99c3b8c64d8109782e670
Author: Tapan Karwa <email address hidden>
Date: Mon Jun 22 12:21:02 2015 -0700

Activate the ifmap update Q for all node/link changes

Closes-Bug: #1467650
Partial-Bug: #1466825

Change-Id: I88e975e685ebc43ad20b473bb62026bb36f6b044

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

The commit for bug 1467650 will help in this case. We need to reproduce this after that fix, as there could be causes other than a stuck queue. We want to tackle a couple of things. Moving the work item to 2.21.

1. Don't reuse xmpp client ID for config.
2. Throttle incoming clients.
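
The two follow-up items above could look roughly like the sketch below. It is only an illustration of the ideas, not a proposed patch: ClientAdmission, its methods and the "hostname#generation" ID format are all hypothetical, and the limit of 2 is an arbitrary value for the example.

```python
import itertools
import threading


class ClientAdmission:
    """Admit at most max_in_progress clients at once, each with a fresh ID."""

    def __init__(self, max_in_progress):
        self._slots = threading.BoundedSemaphore(max_in_progress)
        self._generation = itertools.count(1)

    def try_admit(self, hostname):
        """Return a never-reused client ID, or None if the limit is reached."""
        if not self._slots.acquire(blocking=False):
            return None  # throttled: the caller should retry later
        return "%s#%d" % (hostname, next(self._generation))

    def done(self):
        """Release the slot once the client's initial config sync is over."""
        self._slots.release()


admission = ClientAdmission(max_in_progress=2)
first = admission.try_admit("nodei38-1")
second = admission.try_admit("nodei38-1")
third = admission.try_admit("nodei38-1")   # over the limit
admission.done()
retry = admission.try_admit("nodei38-1")   # admitted again, with a new ID
print(first, second, third, retry)
```

Because the generation counter never repeats, a reconnecting agent gets a different ID from its previous session, so stale per-client state cannot be mistaken for the new client's.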

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Not a blocker anymore (see comment #10 for details).

tags: removed: blocker
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Bug update]

bug update...

Revision history for this message
Nischal Sheth (nsheth) wrote :

Am going to mark this as Fix Committed.
Please open a new bug if this issue is seen again.

Revision history for this message
Nischal Sheth (nsheth) wrote :

Note that the series for 3.0 was created way after the fix
already went into master. Set the milestone for master
to r3.0-fcs and marked the R3.0 series as Invalid.
