agent crash at pthread_mutex_lock with fast, repeated hping tcp setup/teardown
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
Trunk | Fix Committed | High | Praveen |
Bug Description
R3.0 Build 2711 Ubuntu 14.04 Kilo multinode
With repeated hping3 runs between VMs using fast TCP setup/teardown (inter-packet gap of 100 microseconds), the agent sometimes crashes with the backtrace below.
Example: hping3 -S -p 22 10.1.1.3 -s 10000 -c 1000 -i u100
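Since the crash is intermittent, it may help to script the burst so it can be repeated. The following is a minimal sketch that only wraps the command shown above; the function names and the default of 10 runs are assumptions for illustration, not part of the original report.

```python
import subprocess


def build_hping3_cmd(dst_ip, dst_port=22, src_port=10000, count=1000, interval_us=100):
    """Return the argv list for the hping3 invocation from the report."""
    return [
        "hping3", "-S",            # send TCP SYN packets
        "-p", str(dst_port),       # destination port
        dst_ip,
        "-s", str(src_port),       # base source port
        "-c", str(count),          # number of packets per burst
        "-i", f"u{interval_us}",   # inter-packet gap in microseconds
    ]


def repeat_hping3(dst_ip, runs=10):
    """Run the burst `runs` times (hypothetical helper; hping3 normally needs root)."""
    for _ in range(runs):
        subprocess.run(build_hping3_cmd(dst_ip), check=False)
```

Calling `repeat_hping3("10.1.1.3")` from inside the source VM would replay the burst from the report ten times in a row.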
Core will be in http://
(gdb) bt
#0 0x00007fcbb8b90414 in pthread_mutex_lock () from /lib/x86_
#1 0x0000000000cbe419 in KSyncFlowIndexM
#2 0x0000000000cd1a00 in KSyncSandeshCon
#3 0x00000000010b9640 in Sandesh:
#4 0x0000000000ceeb16 in ?? ()
#5 0x0000000000ceec8b in KSyncBulkSandes
#6 0x0000000000ced690 in KSyncSock:
#7 0x0000000000cf377f in QueueTaskRunner
#8 0x00000000011790ac in TaskImpl::execute() ()
#9 0x00007fcbb8972b3a in ?? () from /usr/lib/
#10 0x00007fcbb896e816 in ?? () from /usr/lib/
#11 0x00007fcbb896df4b in ?? () from /usr/lib/
#12 0x00007fcbb896a0ff in ?? () from /usr/lib/
#13 0x00007fcbb896a2f9 in ?? () from /usr/lib/
#14 0x00007fcbb8b8e182 in start_thread () from /lib/x86_
#15 0x00007fcbb7e6747d in clone () from /lib/x86_
(gdb)
The crash happens with the following sequence:
1. A flow pair is added:
- flow-1 with index i1
- rflow-1 with index i2
2. Vrouter evicts flow-1 and rflow-1:
- flow-1 is evicted by flow-2 (which now uses index i1)
- rflow-1 is evicted by rflow-2 (which now uses index i2)
3. Agent processes flow-2 and starts eviction of flow-1.
4. As part of the flow-1 eviction, its reverse flow rflow-1 is also deleted.
5. Deleting rflow-1 results in a vrouter delete operation on index i2, which inadvertently deletes rflow-2.
6. Later, when the reverse flow for flow-2 is being set up, it points to index i2, which has been deleted, and this results in a vrouter error.
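The sequence above can be modelled with a toy flow table. This sketch assumes the flows evicted in step 2 are flow-1 and rflow-1, replaced at indexes i1 and i2 by flow-2 and rflow-2. All class and function names are hypothetical, and the guarded delete is only one possible way to avoid the stale delete, not necessarily the fix that was committed.

```python
class VRouterTable:
    """Toy flow table keyed by index (not the real vrouter code)."""

    def __init__(self):
        self.slots = {}  # index -> name of the flow occupying it

    def add(self, index, flow):
        # Reusing an index overwrites (evicts) the previous occupant.
        self.slots[index] = flow

    def delete_index(self, index):
        # Unguarded: removes whatever currently occupies the slot.
        return self.slots.pop(index, None)

    def delete_if_owner(self, index, flow):
        # Guarded: removes the slot only if it still holds the expected flow.
        if self.slots.get(index) == flow:
            return self.slots.pop(index)
        return None


def run(guarded):
    vrouter = VRouterTable()
    # Step 1: flow-1 and rflow-1 are added.
    vrouter.add("i1", "flow-1")
    vrouter.add("i2", "rflow-1")
    agent_view = {"flow-1": "i1", "rflow-1": "i2"}  # agent's (soon stale) view

    # Step 2: vrouter evicts both flows and reuses their indexes.
    vrouter.add("i1", "flow-2")
    vrouter.add("i2", "rflow-2")

    # Steps 3-5: agent sees flow-2, evicts flow-1, then deletes rflow-1 by index.
    agent_view.pop("flow-1")
    stale_index = agent_view.pop("rflow-1")
    if guarded:
        removed = vrouter.delete_if_owner(stale_index, "rflow-1")
    else:
        removed = vrouter.delete_index(stale_index)

    # Step 6: report what was removed and what index i2 holds afterwards.
    return removed, vrouter.slots.get("i2")
```

With `run(False)` the stale delete removes rflow-2 and leaves index i2 empty, matching step 5. With `run(True)` nothing is removed and rflow-2 stays in place.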