R1.20-Build-55- Ubuntu-havane-agent core- Sync(AgentRoute*)

Bug #1381821 reported by shajuvk
This bug affects 2 people
Affects            Status         Importance  Assigned to  Milestone
Juniper Openstack  Fix Committed  Critical    Ashok Singh
R2.0               Fix Committed  Critical    Ashok Singh

Bug Description

The agent dumped core during the test:
test_ecmp_svc_in_network_nat_with_3_instance: ecmp.sanity_with_setup.ECMPSanityFixture.test_ecmp_svc_in_network_nat_with_3_instance

#0 0x00007f6a1a768425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f6a1a76bb8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007f6a1a7610ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007f6a1a761192 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00000000008547c5 in CompositeNHKey::ExpandLocalCompositeNH(Agent*) ()
#5 0x000000000085480f in CompositeNHKey::Reorder(Agent*, unsigned int, NextHop const*) ()
#6 0x00000000007ed550 in AgentPath::ReorderCompositeNH(Agent*, CompositeNHKey*) ()
#7 0x00000000007edbff in AgentPath::Sync(AgentRoute*) ()
#8 0x00000000007f0cd7 in AgentRoute::Sync() ()
#9 0x00000000007f1c78 in AgentRouteTable::Input(DBTablePartition*, DBClient*, DBRequest*) ()
#10 0x000000000080ea60 in Inet4UnicastAgentRouteTable::AddLocalVmRoute(Peer const*, std::string const&, boost::asio::ip::address_v4 const&, unsigned char, boost::uuids::uuid const&, std::string const&, unsigned int, std::vector<int, std::allocator<int> > const&, bool, PathPreference const&, boost::asio::ip::address_v4 const&) ()
#11 0x000000000087506b in VmInterface::AddRoute(std::string const&, boost::asio::ip::address_v4 const&, unsigned int, bool, bool, boost::asio::ip::address_v4 const&) ()
#12 0x0000000000875968 in VmInterface::UpdateL3InterfaceRoute(bool, bool, bool, VrfEntry*, boost::asio::ip::address_v4 const&) ()
#13 0x000000000087aa39 in VmInterface::UpdateL3(bool, VrfEntry*, boost::asio::ip::address_v4 const&, int, bool, bool) ()
#14 0x000000000087b07b in VmInterface::ApplyConfig(bool, bool, bool, VrfEntry*, boost::asio::ip::address_v4 const&, int, bool, bool, bool, bool) ()
#15 0x000000000087b168 in VmInterface::ResyncOsOperState(VmInterfaceOsOperStateData const*) ()
#16 0x000000000087b376 in VmInterface::Resync(VmInterfaceData*) ()
#17 0x00000000008a7c60 in AgentDBTable::Input(DBTablePartition*, DBClient*, DBRequest*) ()
#18 0x0000000000d4256a in DBPartition::QueueRunner::Run() ()
#19 0x0000000000e10cc0 in TaskImpl::execute() ()
#20 0x00007f6a1b7c3ece in ?? () from /usr/lib/libtbb_debug.so.2
#21 0x00007f6a1b7bae0b in ?? () from /usr/lib/libtbb_debug.so.2
#22 0x00007f6a1b7b96f2 in ?? () from /usr/lib/libtbb_debug.so.2
#23 0x00007f6a1b7b43ce in ?? () from /usr/lib/libtbb_debug.so.2
#24 0x00007f6a1b7b4270 in ?? () from /usr/lib/libtbb_debug.so.2
#25 0x00007f6a1b30be9a in start_thread ()
   from /lib/x86_64-linux-gnu/libpthread.so.0
#26 0x00007f6a1a8263fd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#27 0x0000000000000000 in ?? ()

shajuvk (shajuvk) wrote :

logs: /cs-shared/shaju/bugs/bug-1381821

tags: added: sanity
shajuvk (shajuvk)
summary: - R1.02-Build-55- Ubuntu-havane-agent core- Sync(AgentRoute*)
+ R1.20-Build-55- Ubuntu-havane-agent core- Sync(AgentRoute*)
Naveen N (naveenn) wrote :

The agent allocates a new MPLS label when two or more instances with the same IP are on the same compute node. This label is exported to the control node, and the control-node path in the agent holds a reference to it. If the label is deleted and the same label value is reused for another interface before the control node retracts the route, we hit this assert.
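
A minimal sketch of the sequence described in this comment may help when reading the backtrace. It is a toy model, not agent code: `NhType`, `LabelTable`, `ControlNodePath`, `ResolvesToComposite`, and `ReplayLabelReuse` are hypothetical names made up for illustration, and the label value is arbitrary. It only shows how a path that remembers a label value can end up pointing at the wrong kind of next hop once that value is reused.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-ins for illustration; these are not the real agent classes.
enum class NhType { kInterface, kComposite };

// Label table: MPLS label value -> type of next hop the label currently points to.
using LabelTable = std::map<uint32_t, NhType>;

// A path learned back from the control node remembers only the label value.
struct ControlNodePath {
    uint32_t label;
};

// True if the label referenced by the path still resolves to a composite NH.
bool ResolvesToComposite(const LabelTable &table, const ControlNodePath &path) {
    auto it = table.find(path.label);
    return it != table.end() && it->second == NhType::kComposite;
}

// Replays the sequence from the comment and reports whether the
// control-node path still sees a composite NH at the end.
bool ReplayLabelReuse() {
    LabelTable table;
    const uint32_t kLabel = 17;  // arbitrary label value (assumption)

    table[kLabel] = NhType::kComposite;  // ECMP label allocated and exported
    ControlNodePath path{kLabel};        // control-node path references it
    table.erase(kLabel);                 // label freed, route not yet retracted
    table[kLabel] = NhType::kInterface;  // same value reused for one interface

    return ResolvesToComposite(table, path);
}
```

In this model `ReplayLabelReuse()` returns `false`: the path still carries the old label value, but the value now maps to an interface next hop. That is the condition the assert in the backtrace appears to be guarding against.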

Changed in juniperopenstack:
assignee: nobody → Naveen N (naveenn)
tags: removed: sanity
Changed in juniperopenstack:
assignee: Naveen N (naveenn) → Ashok Singh (ashoksr)
information type: Proprietary → Public
Sandip Dey (sandipd)
tags: added: blocker sanity
shajuvk (shajuvk) wrote :

R2.0 build 19 also hits a similar core:

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-vrouter-agent'.
Program terminated with signal 6, Aborted.
#0 0x00007fc61a7b5425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fc61a7b8b8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fc61a7ae0ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fc61a7ae192 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000889465 in CompositeNHKey::ExpandLocalCompositeNH(Agent*) ()
#5 0x00000000008894af in CompositeNHKey::Reorder(Agent*, unsigned int, NextHop const*) ()
#6 0x0000000000822e40 in AgentPath::ReorderCompositeNH(Agent*, CompositeNHKey*) ()
#7 0x0000000000823527 in AgentPath::Sync(AgentRoute*) ()
#8 0x00000000008265f7 in AgentRoute::Sync() ()
#9 0x0000000000827089 in AgentRouteTable::Input(DBTablePartition*, DBClient*, DBRequest*) ()
#10 0x0000000000826abd in AgentRouteTable::Input(DBTablePartition*, DBClient*, DBRequest*) ()
#11 0x0000000000dedd1a in DBPartition::QueueRunner::Run() ()
#12 0x0000000000ec5165 in TaskImpl::execute() ()
#13 0x00007fc61b36fe52 in ?? () from /usr/lib/libtbb.so.2
#14 0x00007fc61b36bc2d in ?? () from /usr/lib/libtbb.so.2
#15 0x00007fc61b36b0db in ?? () from /usr/lib/libtbb.so.2
#16 0x00007fc61b368c1f in ?? () from /usr/lib/libtbb.so.2
#17 0x00007fc61b368e59 in ?? () from /usr/lib/libtbb.so.2
#18 0x00007fc61b586e9a in start_thread ()
   from /lib/x86_64-linux-gnu/libpthread.so.0
#19 0x00007fc61a8733fd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#20 0x0000000000000000 in ?? ()

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/5822
Committed: http://github.org/Juniper/contrail-controller/commit/cd58ed50842abfe3b6b3d3d0699a588a2a42f2d5
Submitter: Zuul
Branch: R2.0

commit cd58ed50842abfe3b6b3d3d0699a588a2a42f2d5
Author: Naveen N <email address hidden>
Date: Sat Dec 20 02:02:31 2014 -0800

* When a local ECMP MPLS label is deallocated, the control-node withdraws
this label from its path. If the label is reused before the control-node
retracts it, because of a quick deactivate and activate of the interface
(oper state change), the MPLS label in the BGP path points to an
interface NH instead of the expected composite NH.
The BGP peer path is re-evaluated because one of the ECMP interfaces
was deactivated and added again.
Add a check to verify that the local ECMP MPLS label is not deleted.
Unit test:
Create ECMP with 2 instances and add the same route with the aggregate
MPLS label via a BGP peer. Delete both interfaces so that a route update
is sent to BGP to withdraw this path. Before the BGP peer withdraws it,
the same instance is activated with the aggregate MPLS label and a
route change is triggered. Verify that no crash is seen and that the
path update succeeds.
Closes-bug:#1381821

Change-Id: I27a0f8ca0f011cc4bab3de79787356d32563c4d6
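
The kind of check the commit message describes can be sketched as follows. This is an illustration under stated assumptions, not the merged patch: `NextHopKind`, `LabelEntry`, `MplsTable`, `IsLiveLocalEcmpLabel`, `SyncResult`, and `SyncPath` are hypothetical names, and the real change may be structured quite differently. The idea is only that a path sync declines to treat a label as a local ECMP label unless the label is present, not deleted, and still points to a composite NH.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical types for illustration; not the real agent data structures.
enum class NextHopKind { kInterface, kComposite };

struct LabelEntry {
    NextHopKind kind;  // what the label currently points to
    bool deleted;      // marked for deletion but still present in the table
};

using MplsTable = std::map<uint32_t, LabelEntry>;

// Guard in the spirit of the commit message: a label counts as a live local
// ECMP label only if it exists, is not deleted, and points to a composite NH.
bool IsLiveLocalEcmpLabel(const MplsTable &table, uint32_t label) {
    auto it = table.find(label);
    if (it == table.end()) {
        return false;
    }
    return !it->second.deleted && it->second.kind == NextHopKind::kComposite;
}

enum class SyncResult { kReordered, kSkipped };

// Toy path sync: skip the composite reorder instead of asserting when the
// label no longer qualifies.
SyncResult SyncPath(const MplsTable &table, uint32_t path_label) {
    if (!IsLiveLocalEcmpLabel(table, path_label)) {
        return SyncResult::kSkipped;
    }
    return SyncResult::kReordered;
}
```

Mirroring the unit test in the commit message: with label 17 present as a composite entry, `SyncPath(table, 17)` reorders; once the entry is marked deleted, or the value is reused for an interface NH, the same call is skipped rather than crashing.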

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/5843
Committed: http://github.org/Juniper/contrail-controller/commit/c4e33530753b9c3a221abb958f81e4164a585f5e
Submitter: Zuul
Branch: master

commit c4e33530753b9c3a221abb958f81e4164a585f5e
Author: Naveen N <email address hidden>
Date: Sat Dec 20 02:02:31 2014 -0800

* When a local ECMP MPLS label is deallocated, the control-node withdraws
this label from its path. If the label is reused before the control-node
retracts it, because of a quick deactivate and activate of the interface
(oper state change), the MPLS label in the BGP path points to an
interface NH instead of the expected composite NH.
The BGP peer path is re-evaluated because one of the ECMP interfaces
was deactivated and added again.
Add a check to verify that the local ECMP MPLS label is not deleted.
Unit test:
Create ECMP with 2 instances and add the same route with the aggregate
MPLS label via a BGP peer. Delete both interfaces so that a route update
is sent to BGP to withdraw this path. Before the BGP peer withdraws it,
the same instance is activated with the aggregate MPLS label and a
route change is triggered. Verify that no crash is seen and that the
path update succeeds.
Closes-bug:#1381821

Change-Id: I27a0f8ca0f011cc4bab3de79787356d32563c4d6

Changed in juniperopenstack:
status: New → Fix Committed