pbb_evpn: vRouter crash is observed @linux_if_del+0x9f/0x180 [vrouter], eth_drv_del+0x38/0x60 [vrouter]

Bug #1677571 reported by Chandra Sekhar Reddy Mallam
This bug affects 1 person
Affects              Status           Importance  Assigned to            Milestone
Juniper Openstack    Status tracked in Trunk
  R2.20              Fix Committed    High        Divakar Dharanalakota
  R2.21.x            Fix Committed    High        Divakar Dharanalakota
  R2.22.x            Fix Committed    High        Divakar Dharanalakota
  R3.0               Fix Committed    High        Divakar Dharanalakota
  R3.1               Fix Committed    High        Divakar Dharanalakota
  R3.2               Fix Committed    High        Divakar Dharanalakota
  Trunk              Fix Committed    High        Divakar Dharanalakota

Bug Description

A vRouter crash is observed at linux_if_del+0x9f/0x180 [vrouter], called from eth_drv_del+0x38/0x60 [vrouter]. The faulting instruction (RIP) is in netif_napi_del and CR2 is 0:

[197577.527214] Call Trace:
[197577.532960] [<ffffffffa0393fdf>] linux_if_del+0x9f/0x180 [vrouter]
[197577.538773] [<ffffffffa03a05a8>] eth_drv_del+0x38/0x60 [vrouter]
[197577.544516] [<ffffffffa03a078f>] vif_drv_delete+0x3f/0x60 [vrouter]
[197577.550155] [<ffffffffa03a54c8>] vr_interface_shut+0x58/0x80 [vrouter]
[197577.555720] [<ffffffffa0398671>] vrouter_exit+0x31/0xa0 [vrouter]
[197577.561177] [<ffffffffa039883c>] vrouter_ops_process+0x4c/0x90 [vrouter]
[197577.566610] [<ffffffffa038dc66>] sandesh_decode+0xe6/0x210 [vrouter]
[197577.571987] [<ffffffffa038daf0>] ? sandesh_hdr_free+0x10/0x10 [vrouter]
[197577.577289] [<ffffffff812dd5b6>] ? security_sock_rcv_skb+0x16/0x20
[197577.582495] [<ffffffffa0397514>] sandesh_proto_decode+0x24/0x30 [vrouter]
[197577.587653] [<ffffffffa0396e5d>] vr_message_request+0x3d/0x70 [vrouter]
[197577.592723] [<ffffffffa03962f4>] netlink_trans_request+0x54/0x1c0 [vrouter]
[197577.597728] [<ffffffff812dde38>] ? security_capable+0x18/0x20
[197577.602628] [<ffffffff8165f2ed>] genl_family_rcv_msg+0x18d/0x370
[197577.607444] [<ffffffff8165f4d0>] ? genl_family_rcv_msg+0x370/0x370
[197577.612169] [<ffffffff8165f561>] genl_rcv_msg+0x91/0xd0
[197577.616770] [<ffffffff8165d609>] netlink_rcv_skb+0xa9/0xc0
[197577.621261] [<ffffffff8165db08>] genl_rcv+0x28/0x40
[197577.625614] [<ffffffff8165ccf5>] netlink_unicast+0xd5/0x1b0
[197577.629919] [<ffffffff8165d0de>] netlink_sendmsg+0x30e/0x680
[197577.634141] [<ffffffffa03afe2b>] ? vr_malloc_stats+0x2b/0x70 [vrouter]
[197577.638310] [<ffffffff8161632b>] sock_sendmsg+0x8b/0xc0
[197577.642503] [<ffffffff811b92a6>] ? mem_cgroup_update_page_stat+0x16/0x50
[197577.646699] [<ffffffff8117a879>] ? __do_fault+0x429/0x530
[197577.650772] [<ffffffff8161601e>] ? move_addr_to_kernel.part.14+0x1e/0x60
[197577.654812] [<ffffffff81616e51>] ? move_addr_to_kernel+0x21/0x30
[197577.658688] [<ffffffff81616739>] ___sys_sendmsg+0x389/0x3a0
[197577.662450] [<ffffffff817393c4>] ? __do_page_fault+0x204/0x560
[197577.666114] [<ffffffff81185a3e>] ? do_mmap_pgoff+0x34e/0x3d0
[197577.669812] [<ffffffff81170319>] ? vm_mmap_pgoff+0x99/0xc0
[197577.673393] [<ffffffff81617522>] __sys_sendmsg+0x42/0x80
[197577.676956] [<ffffffff81617572>] SyS_sendmsg+0x12/0x20
[197577.680495] [<ffffffff8173dd5d>] system_call_fastpath+0x1a/0x1f
[197577.684044] Code: c0 74 05 48 89 df ff d0 5b 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 8d 47 48 48 89 e5 53 48 8b 4f 48 48 8b 57 50 48 89 fb 48 89 51 08 <48> 89 0a 48 89 47 48 48 89 47 50 48 8b 7f 40 e8 ff 48 ff ff 48
[197577.695373] RIP [<ffffffff8162b92d>] netif_napi_del+0x1d/0x60
[197577.699127] RSP <ffff882025de18d0>
[197577.702821] CR2: 0000000000000000

Core/log files are copied to:
bhushana@10.204.216.50:/home/bhushana/Documents/technical/bugs/<bug-ID>

Chandra Sekhar Reddy Mallam (cmallam) wrote :

root@nodei15:~# contrail-version
Package                       Version            Build-ID | Repo | Package Name
----------------------------  -----------------  ------------------------------
contrail-analytics            4.0.0.0-3046       3046
contrail-config               4.0.0.0-3046       3046
contrail-config-openstack     4.0.0.0-3046       3046
contrail-control              4.0.0.0-3046       3046
contrail-database-common      4.0.0.0-3046       3046
contrail-dns                  4.0.0.0-3046       3046
contrail-docs                 4.0.0.0-3046       3046
contrail-f5                   4.0.0.0-3046       3046
contrail-fabric-utils         4.0.0.0-3046       3046
contrail-heat                 4.0.0.0-3046       3046
contrail-install-packages     4.0.0.0-3046~kilo  3046
contrail-lib                  4.0.0.0-3046       3046
contrail-nodemgr              4.0.0.0-3046       3046
contrail-nova-networkapi      4.0.0.0-3046       3046
contrail-openstack            4.0.0.0-3046       3046
contrail-openstack-analytics  4.0.0.0-3046       3046
contrail-openstack-config     4.0.0.0-3046       3046
contrail-openstack-control    4.0.0.0-3046       3046
contrail-openstack-dashboard  4.0.0.0-3046       3046
contrail-openstack-database   4.0.0.0-3046       3046
contrail-openstack-webui      4.0.0.0-3046       3046
contrail-setup                4.0.0.0-3046       3046
contrail-utils                4.0.0.0-3046       3046
contrail-web-controller       4.0.0.0-3046       3046
contrail-web-core             4.0.0.0-3046       3046
neutron-plugin-contrail       4.0.0.0-3046       3046
nova-api                      1:2015.1.2-0ubuntu2~cloud0.1contrail13046
nova-common                   1:2015.1.2-0ubuntu2~cloud0.1contrail13046
nova-conductor                1:2015.1.2-0ubuntu2~cloud0.1contrail13046
nova-console                  1:2015.1.2-0ubuntu2~cloud0.1contrail13046
nova-consoleauth              1:2015.1.2-0ubuntu2~cloud0.1contrail13046
nova-novncproxy               1:2015.1.2-0ubuntu2~cloud0.1contrail13046
nova-objectstore              1:2015.1.2-0ubuntu2~cloud0.1contrail13046
nova-scheduler                1:2015.1.2-0ubuntu2~cloud0.1contrail13046
python-contrail               4.0.0.0-3046       3046
python-neutronclient          1:2.3.11-0ubuntu1~cloud0.3contrail 3046
python-nova                   1:2015...

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/30199
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/30199
Committed: http://github.com/Juniper/contrail-vrouter/commit/07b4e1e4810b4854a1c9a48b05d9e54c0f496dee
Submitter: Zuul (<email address hidden>)
Branch: master

commit 07b4e1e4810b4854a1c9a48b05d9e54c0f496dee
Author: Divakar D <email address hidden>
Date: Thu Apr 6 10:33:18 2017 +0530

Cleanup SubVIF properly if addition fails

On a given physical interface there can be only one subinterface per
VLAN ID. If Agent attempts to add multiple subinterfaces with the same
VLAN ID, vrouter rejects the addition with an -EEXIST error. As part of
this rejection, the temporary subinterface needs to be properly
cleaned up. Right now the cleanup is not correct with respect to memory,
and the refcount of the physical interface is also not handled well. This
leads to either a memory leak or memory corruption.

Also, the ref counting of a VIF is currently not symmetric with respect to
increment and decrement. The decrement is done atomically but the
increment is not. This can lead to issues when RCU callbacks
of an NH (which eventually decrement the VIF ref count atomically) are
executed in parallel with a new NH addition from Agent (which increments
the VIF ref count non-atomically).

As a fix, the sub VIF is cleaned up properly and the VIF ref count is
incremented atomically.

Change-Id: I2a9d41f2b0152f700e2c5c96c9f39825c22ce482
closes-bug: #1677571

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2

Review in progress for https://review.opencontrail.org/30284
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.1

Review in progress for https://review.opencontrail.org/30285
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/30286
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/30289
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/30284
Committed: http://github.com/Juniper/contrail-vrouter/commit/9fbebe12ede8a762296ef21964941347c933c547
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit 9fbebe12ede8a762296ef21964941347c933c547
Author: Divakar D <email address hidden>
Date: Thu Apr 6 10:33:18 2017 +0530

Cleanup SubVIF properly if addition fails

Change-Id: I2a9d41f2b0152f700e2c5c96c9f39825c22ce482
closes-bug: #1677571

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/30285
Committed: http://github.com/Juniper/contrail-vrouter/commit/cbadb36c7312b3c1a3008e553a6fdcf2ecd75c63
Submitter: Zuul (<email address hidden>)
Branch: R3.1

commit cbadb36c7312b3c1a3008e553a6fdcf2ecd75c63
Author: Divakar D <email address hidden>
Date: Thu Apr 6 10:33:18 2017 +0530

Cleanup SubVIF properly if addition fails

Change-Id: I2a9d41f2b0152f700e2c5c96c9f39825c22ce482
closes-bug: #1677571

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/30286
Committed: http://github.com/Juniper/contrail-vrouter/commit/867e3c6cea8806907dc8dc6efc45ec9626352e2f
Submitter: Zuul (<email address hidden>)
Branch: R3.0

commit 867e3c6cea8806907dc8dc6efc45ec9626352e2f
Author: Divakar D <email address hidden>
Date: Thu Apr 6 10:33:18 2017 +0530

Cleanup SubVIF properly if addition fails

Change-Id: I2a9d41f2b0152f700e2c5c96c9f39825c22ce482
closes-bug: #1677571

information type: Proprietary → Public

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/30289
Committed: http://github.com/Juniper/contrail-vrouter/commit/f39f54f2db6a90f9eadb01d8decd46111db99e96
Submitter: Zuul (<email address hidden>)
Branch: R2.21.x

commit f39f54f2db6a90f9eadb01d8decd46111db99e96
Author: Divakar D <email address hidden>
Date: Thu Apr 6 10:33:18 2017 +0530

Cleanup SubVIF properly if addition fails

closes-bug: #1677571

Conflicts:
 dp-core/vr_interface.c

Change-Id: I2a9d41f2b0152f700e2c5c96c9f39825c22ce482

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/30399
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/30400
Submitter: Hari Prasad Killi (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/30400
Committed: http://github.com/Juniper/contrail-vrouter/commit/55ea26c9e9400ad3cc564788dc742975907e2fa3
Submitter: Zuul (<email address hidden>)
Branch: R2.20

commit 55ea26c9e9400ad3cc564788dc742975907e2fa3
Author: Divakar D <email address hidden>
Date: Thu Apr 6 10:33:18 2017 +0530

Cleanup SubVIF properly if addition fails

closes-bug: #1677571

Conflicts:
 dp-core/vr_interface.c

Change-Id: I2a9d41f2b0152f700e2c5c96c9f39825c22ce482

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/30399
Committed: http://github.com/Juniper/contrail-vrouter/commit/b6d4054d29a5c955261ac48016c6746c0b44f31d
Submitter: Zuul (<email address hidden>)
Branch: R2.22.x

commit b6d4054d29a5c955261ac48016c6746c0b44f31d
Author: Divakar D <email address hidden>
Date: Thu Apr 6 10:33:18 2017 +0530

Cleanup SubVIF properly if addition fails

closes-bug: #1677571

Conflicts:
 dp-core/vr_interface.c

Change-Id: I2a9d41f2b0152f700e2c5c96c9f39825c22ce482
