Concurrency issue in TcpSession implementation

Bug #1441339 reported by Nischal Sheth
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Juniper Openstack
Status tracked in Trunk
R1.1
Fix Committed
High
Nischal Sheth
R2.0
Fix Committed
High
Nischal Sheth
R2.1
Fix Committed
High
Nischal Sheth
Trunk
Fix Committed
High
Nischal Sheth

Bug Description

According to boost asio documentation, accessing a given instance of
a socket object from different threads is unsafe. The implementation
of the TcpSession class does not conform to this requirement.

- Async read for the socket is started from the io::Reader Task
- Handler for the async read is executed in the context of main thread
- Writes to the socket are done from the bgp::Send Task
- Handler for write ready is executed in the context of main thread

Since there's no way to enforce mutual exclusion between Tasks and
the main thread, this clearly is a problem. It results in situations
where a sender application has sent data, but the receiver application
doesn't get it. The data is in kernel socket buffers at the receiver,
but the asio library never reads the data.

Proposed fix is to do everything in the context of the main thread by
posting handlers.

Tags: base
Nischal Sheth (nsheth)
description: updated
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/9001
Submitter: Nischal Sheth (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : R2.1

Review in progress for https://review.opencontrail.org/9021
Submitter: Nischal Sheth (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : R2.0

Review in progress for https://review.opencontrail.org/9024
Submitter: Nischal Sheth (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : R1.10

Review in progress for https://review.opencontrail.org/9025
Submitter: Nischal Sheth (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/9024
Committed: http://github.org/Juniper/contrail-controller/commit/4f83e182922b27ac52ffc189aab20e4e61aa9712
Submitter: Zuul
Branch: R2.0

commit 4f83e182922b27ac52ffc189aab20e4e61aa9712
Author: Nischal Sheth <email address hidden>
Date: Thu Apr 9 10:34:07 2015 -0700

Fix concurrency issue in TcpSession::AsyncReadStart

Arrange for async_read_some on the underlying socket to be called
from the io thread instead of calling it directly from io::Reader
Task.

This commit addresses the read side concurrency issue with socket.
The write side issue will be addressed via another commit. Note
that we haven't seen any problems that can be attributed to write
side concurrency yet.

Make AsyncReadStart a noop for BgpSessionMock as the strand post
operation interferes with EvmManager::RunOnce.

Changes in state_machine_test.cc are needed to fix an existing bug
that got exposed as a result of making AsyncReadStart virtual. The
issue was that StateMachine::PassiveOpen enqueues EvTcpPassiveOpen
and then calls AsyncReadStart on the BgpSession. There's a chance
that the BgpSession gets deleted when processing EvTcpPassiveOpen
i.e. before the call to AsyncReadStart. This is not a problem in
production code because StateMachine::PassiveOpen is called from
bgp::Config Task which is exclusive with bgp::StateMachine Task.
However in the test, StateMachine::PassiveOpen is called from the
main thread.

Change-Id: If78b36c6fc7d45afd320b7e1ca245757c573deec
Partial-Bug: 1441339

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/9021
Committed: http://github.org/Juniper/contrail-controller/commit/757279dfd04f761b58be9226d2dc51776408b660
Submitter: Zuul
Branch: R2.1

commit 757279dfd04f761b58be9226d2dc51776408b660
Author: Nischal Sheth <email address hidden>
Date: Thu Apr 9 10:34:07 2015 -0700

Fix concurrency issue in TcpSession::AsyncReadStart

Arrange for async_read_some on the underlying socket to be called
from the io thread instead of calling it directly from io::Reader
Task.

This commit addresses the read side concurrency issue with socket.
The write side issue will be addressed via another commit. Note
that we haven't seen any problems that can be attributed to write
side concurrency yet.

Make AsyncReadStart a noop for BgpSessionMock as the strand post
operation interferes with EvmManager::RunOnce.

Changes in state_machine_test.cc are needed to fix an existing bug
that got exposed as a result of making AsyncReadStart virtual. The
issue was that StateMachine::PassiveOpen enqueues EvTcpPassiveOpen
and then calls AsyncReadStart on the BgpSession. There's a chance
that the BgpSession gets deleted when processing EvTcpPassiveOpen
i.e. before the call to AsyncReadStart. This is not a problem in
production code because StateMachine::PassiveOpen is called from
bgp::Config Task which is exclusive with bgp::StateMachine Task.
However in the test, StateMachine::PassiveOpen is called from the
main thread.

Change-Id: If78b36c6fc7d45afd320b7e1ca245757c573deec
Partial-Bug: 1441339

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/9001
Committed: http://github.org/Juniper/contrail-controller/commit/3e1c0fc64342fd7f02e6d97b5475731c32b07fac
Submitter: Zuul
Branch: master

commit 3e1c0fc64342fd7f02e6d97b5475731c32b07fac
Author: Nischal Sheth <email address hidden>
Date: Tue Apr 7 11:27:17 2015 -0700

Fix concurrency issue in TcpSession::AsyncReadStart

Arrange for async_read_some on the underlying socket to be called
from the io thread instead of calling it directly from io::Reader
Task.

This commit addresses the read side concurrency issue with socket.
The write side issue will be addressed via another commit. Note
that we haven't seen any problems that can be attributed to write
side concurrency yet.

Make AsyncReadStart a noop for BgpSessionMock as the strand post
operation interferes with EvmManager::RunOnce.

Changes in state_machine_test.cc are needed to fix an existing bug
that got exposed as a result of making AsyncReadStart virtual. The
issue was that StateMachine::PassiveOpen enqueues EvTcpPassiveOpen
and then calls AsyncReadStart on the BgpSession. There's a chance
that the BgpSession gets deleted when processing EvTcpPassiveOpen
i.e. before the call to AsyncReadStart. This is not a problem in
production code because StateMachine::PassiveOpen is called from
bgp::Config Task which is exclusive with bgp::StateMachine Task.
However in the test, StateMachine::PassiveOpen is called from the
main thread.

Change-Id: If78b36c6fc7d45afd320b7e1ca245757c573deec
Partial-Bug: 1441339

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/9025
Committed: http://github.org/Juniper/contrail-controller/commit/b7f6c4f5fdf52de74d6cba749b8f99726826e38b
Submitter: Zuul
Branch: R1.10

commit b7f6c4f5fdf52de74d6cba749b8f99726826e38b
Author: Nischal Sheth <email address hidden>
Date: Thu Apr 9 10:34:07 2015 -0700

Fix concurrency issue in TcpSession::AsyncReadStart

Arrange for async_read_some on the underlying socket to be called
from the io thread instead of calling it directly from io::Reader
Task.

This commit addresses the read side concurrency issue with socket.
The write side issue will be addressed via another commit. Note
that we haven't seen any problems that can be attributed to write
side concurrency yet.

Make AsyncReadStart a noop for BgpSessionMock as the strand post
operation interferes with EvmManager::RunOnce.

Changes in state_machine_test.cc are needed to fix an existing bug
that got exposed as a result of making AsyncReadStart virtual. The
issue was that StateMachine::PassiveOpen enqueues EvTcpPassiveOpen
and then calls AsyncReadStart on the BgpSession. There's a chance
that the BgpSession gets deleted when processing EvTcpPassiveOpen
i.e. before the call to AsyncReadStart. This is not a problem in
production code because StateMachine::PassiveOpen is called from
bgp::Config Task which is exclusive with bgp::StateMachine Task.
However in the test, StateMachine::PassiveOpen is called from the
main thread.

Change-Id: If78b36c6fc7d45afd320b7e1ca245757c573deec
Partial-Bug: 1441339

information type: Proprietary → Public
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.