Compute node kernel panic
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenContrail | New | Undecided | Unassigned |
Bug Description
Hi Team,
I ran into an issue when pinging another VM located on a different compute node, or a public host, with a packet size larger than the interface MTU (ping -s 2000 8.8.8.8): the compute node crashes with a kernel panic. I see this behavior only with Linux kernels 3.19 and 4.x; kernel 3.13 is not affected. I've attached the kernel crash dump and the kernel traceback.
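For reference, here is a minimal sketch of how that ping would be expected to fragment. It assumes a 1500-byte interface MTU (the actual MTU is not stated in this report) and an IPv4 header without options. The function name `ipv4_fragments` is made up for illustration.

```python
def ipv4_fragments(icmp_payload, mtu=1500, ip_header=20, icmp_header=8):
    """Return the number of IP payload bytes carried by each fragment.

    Assumptions (not taken from the report): mtu=1500 and a 20-byte
    IPv4 header with no options.
    """
    total = icmp_payload + icmp_header         # IP payload to be sent
    per_fragment = (mtu - ip_header) // 8 * 8  # non-final fragments carry a multiple of 8 bytes
    sizes = []
    while total > per_fragment:
        sizes.append(per_fragment)
        total -= per_fragment
    sizes.append(total)                        # last (or only) fragment
    return sizes


# ping -s 2000 sends 2000 bytes of ICMP data plus an 8-byte ICMP header.
print(ipv4_fragments(2000))  # [1480, 528]
```

Under these assumptions the packet does not fit into a single frame and has to be split into two fragments, which matches the condition that triggers the crash.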
OpenContrail version: 3.1.1
OpenStack release: Mitaka
Linux distribution: Ubuntu 16.04 (HWE)
Linux kernel: 4.8.0-54-generic
vrouter module info
root@cmp001:/# modinfo vrouter
filename: /lib/modules/
version: 1.0
license: GPL
srcversion: 615F009C28F6CDD
depends:
vermagic: 4.8.0-54-generic SMP mod_unload modversions
parm: vr_flow_
parm: vr_oflow_
parm: vr_bridge_
parm: vr_bridge_
parm: vr_mpls_labels:uint
parm: vr_nexthops:uint
parm: vr_vrfs:uint
parm: vr_flow_
parm: vr_interfaces:uint
parm: vrouter_dbg:Set 1 for pkt dumping and 0 to disable, default value is 0 (int)
kernel traceback
[ 144.309486] BUG: unable to handle kernel paging request at ffff94ef80000000
[ 144.309705] IP: [<ffffffffa2a3d
[ 144.309863] PGD faec3c067 PUD 0
[ 144.310062] Oops: 0000 [#1] SMP
[ 144.310155] Modules linked in: veth xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_
[ 144.314736] xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper ablk_helper cryptd megaraid_sas fnic libfcoe igb usbhid libfc hid dca ptp scsi_transport_fc enic pps_core i2c_algo_bit wmi fjes
[ 144.316721] CPU: 7 PID: 10130 Comm: kworker/7:3 Tainted: G OE 4.8.0-54-generic #57~16.04.1-Ubuntu
[ 144.316860] Hardware name: Cisco Systems Inc N1K-1110-
[ 144.317012] Workqueue: events lh_work [vrouter]
[ 144.317161] task: ffff94e749680000 task.stack: ffff94e1fd404000
[ 144.317261] RIP: 0010:[<
[ 144.317446] RSP: 0018:ffff94e1fd
[ 144.317543] RAX: ffff94ed933c7c1c RBX: ffff94e1ff318700 RCX: 000000001973ac4c
[ 144.317646] RDX: 0000000000000006 RSI: ffff94ef7ffffffc RDI: ffff94edc71719fc
[ 144.317750] RBP: ffff94e1fd407a90 R08: 00000000000000c0 R09: ffff94e75f807340
[ 144.317853] R10: 0000000000000062 R11: ffff94ee92b47c00 R12: 0000000000000588
[ 144.317957] R13: 000000000000005e R14: ffff94e7561e8100 R15: ffff94ef4b9d62f0
[ 144.318060] FS: 000000000000000
[ 144.318193] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 144.318292] CR2: ffff94ef80000000 CR3: 00000008587ee000 CR4: 00000000000426e0
[ 144.318395] Stack:
[ 144.318482] ffffffffa2d6ea81 ffffffffa2d7dc6e 0000000000000046 ffffffffffffffac
[ 144.318828] ffff94e7561e8100 fffffff400000000 000000000088000e 00000000ff780046
[ 144.319220] 0001ffff000000b2 000000000000000e 0000000000000054 ffff94e1ff318700
[ 144.319565] Call Trace:
[ 144.319657] [<ffffffffa2d6e
[ 144.319756] [<ffffffffa2d7d
[ 144.319862] [<ffffffffc0716
[ 144.319966] [<ffffffffc0716
[ 144.320072] [<ffffffffc0716
[ 144.320178] [<ffffffffc0730
[ 144.320329] [<ffffffffc0723
[ 144.320437] [<ffffffffc0722
[ 144.320572] [<ffffffffc071c
[ 144.320679] [<ffffffffc071d
[ 144.320783] [<ffffffffc072f
[ 144.320886] [<ffffffffa29fc
[ 144.320990] [<ffffffffa26b8
[ 144.321093] [<ffffffffc0720
[ 144.321199] [<ffffffffc072b
[ 144.321306] [<ffffffffc072b
[ 144.321413] [<ffffffffc072b
[ 144.321518] [<ffffffffc0712
[ 144.321618] [<ffffffffa269d
[ 144.321718] [<ffffffffa269d
[ 144.321816] [<ffffffffa269d
[ 144.321916] [<ffffffffa269d
[ 144.322017] [<ffffffffa26a3
[ 144.322114] [<ffffffffa2e9a
[ 144.322213] [<ffffffffa26a3
[ 144.322314] Code: 21 2f a3 e8 51 1a ca ff 0f 31 48 c1 e2 20 48 09 d0 48 31 c3 e9 6d ff ff ff 66 66 90 66 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3
[ 144.326057] RIP [<ffffffffa2a3d
[ 144.326209] RSP <ffff94e1fd4079a8>
[ 144.326300] CR2: ffff94ef80000000
Link to the Linux kernel crash dump: https://drive.google.com/open?id=0Bw60UyguIymoWHVOSEtPUVFpODQ
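A rough reading of the register dump in the traceback above, as a sketch rather than a confirmed root cause. The Code bytes around the faulting instruction (`48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3`) decode to a quadword `rep movsq` followed by a byte `rep movsb`, which is consistent with a memcpy-style copy where RSI is the source pointer and RCX is the number of quadwords still to copy. The 4 KiB page size and the helper name `describe_fault` are assumptions for illustration.

```python
# Values copied from the register dump in this report.
RSI = 0xFFFF94EF7FFFFFFC  # source pointer of the copy
CR2 = 0xFFFF94EF80000000  # faulting address
RCX = 0x1973AC4C          # quadwords still to copy when the fault hit
RDX = 0x6                 # trailing bytes left for the byte copy

PAGE_SIZE = 4096          # assumption: 4 KiB pages on x86-64


def describe_fault(rsi, cr2, rcx, rdx):
    """Summarize where the copy stood at the moment of the fault."""
    return {
        # Distance from the source pointer to the faulting address.
        "bytes_before_fault_addr": cr2 - rsi,
        # True if the faulting address is the first byte of a page.
        "fault_on_page_boundary": cr2 % PAGE_SIZE == 0,
        # Bytes the copy still intended to move (quadwords plus remainder).
        "bytes_remaining": rcx * 8 + rdx,
    }


info = describe_fault(RSI, CR2, RCX, RDX)
print(info)
```

If this reading is right, the 8-byte read at RSI crossed into an unmapped page while the copy still had more than 3 GB left to move, far more than any packet. That would point to a bad length (for example an underflow) being passed to the copy, but this is a guess and not a verified diagnosis.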