vrouter drops fragments in some cases if they are received out of order

Bug #1579828 reported by Raja Sivaramakrishnan
This bug affects 2 people
Affects: Juniper Openstack (status tracked in Trunk)

Series   Status         Importance  Assigned to        Milestone
R2.20    Fix Committed  Critical    Anand H. Krishnan
R2.21.x  Fix Committed  Critical    Anand H. Krishnan
R2.22.x  Fix Committed  Critical    Anand H. Krishnan
R3.0     Fix Committed  Critical    Anand H. Krishnan
Trunk    Fix Committed  Critical    Anand H. Krishnan

Bug Description

If the second fragment is received from the wire by vrouter before the first, it is dropped by vrouter in some cases as a result of fragment timeout.

Hampapur Ajay (hajay)
information type: Private → Public
Changed in juniperopenstack:
importance: Undecided → Critical
alok kumar (kalok) wrote :

While trying to reproduce this in our setup, we see similar fragment loss for traffic from the VM to vrouter.

The compute nodes use VLAN tagging and the sender VM has a floating IP (FIP).

Test details:
Sender VM - vn1-vm1, IP- 1.1.1.3, FIP- 10.204.221.181
Receiver VM - public1, IP- 10.204.221.180
Sending a fragmented, out-of-order echo request from vn1-vm1 to public1.
Order of fragments sent from the VM: offset 8, offset 0, offset 16, offset 24.
Sometimes the fragment with offset 8 is not sent out of the sender compute node.
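
For anyone who wants to reproduce the test traffic, here is a minimal Python sketch (not the tool used in the test above) that builds the same four IPv4 fragments and puts them in the order 8, 0, 16, 24. The helper names (`inet_checksum`, `icmp_echo_request`, `fragment`) are made up for illustration, and the 20 zero bytes of ICMP data are an assumption chosen so that the fragment lengths match the capture (28, 28, 28, 24). It only builds the packets; actually sending them would need a raw socket and root privileges.

```python
import socket
import struct

def inet_checksum(data: bytes) -> int:
    """Internet checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def icmp_echo_request(ident: int, seq: int, data: bytes) -> bytes:
    """ICMP echo request (type 8, code 0) with a valid checksum."""
    csum = inet_checksum(struct.pack("!BBHHH", 8, 0, 0, ident, seq) + data)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + data

def fragment(src, dst, ip_id, payload, frag_size=8, ttl=64):
    """Split payload into IPv4 fragments; returns [(byte_offset, packet)]."""
    assert frag_size % 8 == 0, "fragment offsets are encoded in 8-byte units"
    frags = []
    for off in range(0, len(payload), frag_size):
        chunk = payload[off:off + frag_size]
        more = off + frag_size < len(payload)          # MF flag on all but the last
        flags_off = (0x2000 if more else 0) | (off // 8)
        fields = [0x45, 0, 20 + len(chunk), ip_id, flags_off, ttl, 1, 0,
                  socket.inet_aton(src), socket.inet_aton(dst)]
        hdr = struct.pack("!BBHHHBBH4s4s", *fields)
        fields[7] = inet_checksum(hdr)                 # fill in the header checksum
        frags.append((off, struct.pack("!BBHHHBBH4s4s", *fields) + chunk))
    return frags

# 8-byte ICMP header + 20 data bytes = 28 bytes -> fragments of 8, 8, 8, 4 bytes
payload = icmp_echo_request(0, 0, bytes(20))
by_offset = dict(fragment("1.1.1.3", "10.204.221.180", 64422, payload))

# Same send order as in the test: second fragment first, then the HEAD
sent = [(off, by_offset[off]) for off in (8, 0, 16, 24)]
for off, pkt in sent:
    print(f"offset {off}, MF={bool(pkt[6] & 0x20)}, length {len(pkt)}")
```

Note that tcpdump prints the fragment offset in bytes, while the header field stores it in 8-byte units, which is why the sketch divides by 8.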

On the sender compute node (nodei2):
root@nodei2:~# tcpdump -nnvvi any host 10.204.221.180
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes

20:41:26.546050 IP (tos 0x0, ttl 64, id 64422, offset 8, flags [+], proto ICMP (1), length 28)
1.1.1.3 > 10.204.221.180: ip-proto-1
20:41:26.548882 IP (tos 0x0, ttl 64, id 64422, offset 0, flags [+], proto ICMP (1), length 28)
1.1.1.3 > 10.204.221.180: ICMP echo request, id 0, seq 0, length 8
20:41:26.548902 IP (tos 0x0, ttl 63, id 64422, offset 0, flags [+], proto ICMP (1), length 28)
10.204.221.181 > 10.204.221.180: ICMP echo request, id 0, seq 0, length 8
20:41:26.548907 ethertype IPv4, IP (tos 0x0, ttl 63, id 64422, offset 0, flags [+], proto ICMP (1), length 28)
10.204.221.181 > 10.204.221.180: ICMP echo request, id 0, seq 0, length 8
20:41:26.551059 IP (tos 0x0, ttl 64, id 64422, offset 16, flags [+], proto ICMP (1), length 28)
1.1.1.3 > 10.204.221.180: ip-proto-1
20:41:26.551076 IP (tos 0x0, ttl 63, id 64422, offset 16, flags [+], proto ICMP (1), length 28)
10.204.221.181 > 10.204.221.180: ip-proto-1
20:41:26.551079 ethertype IPv4, IP (tos 0x0, ttl 63, id 64422, offset 16, flags [+], proto ICMP (1), length 28)
10.204.221.181 > 10.204.221.180: ip-proto-1
20:41:26.553635 IP (tos 0x0, ttl 64, id 64422, offset 24, flags [none], proto ICMP (1), length 24)
1.1.1.3 > 10.204.221.180: ip-proto-1
20:41:26.553651 IP (tos 0x0, ttl 63, id 64422, offset 24, flags [none], proto ICMP (1), length 24)
10.204.221.181 > 10.204.221.180: ip-proto-1
20:41:26.553653 ethertype IPv4, IP (tos 0x0, ttl 63, id 64422, offset 24, flags [none], proto ICMP (1), length 24)
10.204.221.181 > 10.204.221.180: ip-proto-1

20:41:56.614628 IP (tos 0xc0, ttl 63, id 49537, offset 0, flags [none], proto ICMP (1), length 56)
10.204.221.180 > 1.1.1.3: ICMP ip reassembly time exceeded, length 36
IP (tos 0x0, ttl 63, id 64422, offset 0, flags [+], proto ICMP (1), length 28)
1.1.1.3 > 10.204.221.180: ICMP echo request, id 0, seq 0, length 8
20:41:56.614747 IP (tos 0xc0, ttl 63, id 49537, offset 0, flags [none], proto ICMP (1), length 56)
10.204.221.180 > 1.1.1.3: ICMP ip reassembly time exceeded, length 36
IP (tos 0x0, ttl 63, id 64422, offset 0, flags [+], proto ICMP (1), length 28)
1.1.1.3 > 10.204.221.180: ICMP echo request, id 0, seq 0, length 8

Verified on the receiver compute node too: the offset 8 packet does not appear there either.
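
The missing fragment can also be picked out of the capture mechanically. The following is a rough Python sketch, assuming the two-line `tcpdump -nnvv` layout shown above (a header line with `id`/`offset`, then an address line); the function name `offsets_by_source` and the regular expressions are illustrative only. It compares the offsets seen from the VM address with those seen after NAT to the FIP.

```python
import re
from collections import defaultdict

# Header line carries the IP id and fragment offset; the next line carries src > dst.
HEADER = re.compile(r"id (\d+), offset (\d+), flags \[")
ADDRS = re.compile(r"^\s*(\d+(?:\.\d+){3}) > (\d+(?:\.\d+){3}):")

def offsets_by_source(capture: str):
    """Map (source IP, IP id) -> set of fragment offsets seen in the capture."""
    seen = defaultdict(set)
    pending = None
    for line in capture.splitlines():
        m = HEADER.search(line)
        if m:
            pending = (int(m.group(1)), int(m.group(2)))
            continue
        m = ADDRS.match(line)
        if m and pending is not None:
            seen[(m.group(1), pending[0])].add(pending[1])
            pending = None
    return seen

# Excerpt of the sender-side capture from this comment
CAPTURE = """\
20:41:26.546050 IP (tos 0x0, ttl 64, id 64422, offset 8, flags [+], proto ICMP (1), length 28)
1.1.1.3 > 10.204.221.180: ip-proto-1
20:41:26.548882 IP (tos 0x0, ttl 64, id 64422, offset 0, flags [+], proto ICMP (1), length 28)
1.1.1.3 > 10.204.221.180: ICMP echo request, id 0, seq 0, length 8
20:41:26.548902 IP (tos 0x0, ttl 63, id 64422, offset 0, flags [+], proto ICMP (1), length 28)
10.204.221.181 > 10.204.221.180: ICMP echo request, id 0, seq 0, length 8
20:41:26.551059 IP (tos 0x0, ttl 64, id 64422, offset 16, flags [+], proto ICMP (1), length 28)
1.1.1.3 > 10.204.221.180: ip-proto-1
20:41:26.551076 IP (tos 0x0, ttl 63, id 64422, offset 16, flags [+], proto ICMP (1), length 28)
10.204.221.181 > 10.204.221.180: ip-proto-1
20:41:26.553635 IP (tos 0x0, ttl 64, id 64422, offset 24, flags [none], proto ICMP (1), length 24)
1.1.1.3 > 10.204.221.180: ip-proto-1
20:41:26.553651 IP (tos 0x0, ttl 63, id 64422, offset 24, flags [none], proto ICMP (1), length 24)
10.204.221.181 > 10.204.221.180: ip-proto-1
"""

seen = offsets_by_source(CAPTURE)
from_vm = seen[("1.1.1.3", 64422)]            # fragments as sent by the VM
after_nat = seen[("10.204.221.181", 64422)]   # fragments forwarded with the FIP
missing = from_vm - after_nat
print("offsets never forwarded:", sorted(missing))
```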

Setup details:
env.roledefs = {
    'all': [host1, host2, host3, host4, host5, host6],
    'cfgm': [host1],
    'openstack': [host1],
    'webui': [host2],
    'control': [host1, host2, host3],
    'compute': [host4, host5, host6],
    'collector': [host1, host2, host3],
    'database': [host1],
    'build'...


OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/20146
Submitter: Anand H. Krishnan (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/20240
Submitter: Anand H. Krishnan (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20273
Submitter: Anand H. Krishnan (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/20274
Submitter: Anand H. Krishnan (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/20275
Submitter: Anand H. Krishnan (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/20276
Submitter: Anand H. Krishnan (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/20281
Submitter: Anand H. Krishnan (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/20282
Submitter: Anand H. Krishnan (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20283
Submitter: Anand H. Krishnan (<email address hidden>)

tags: added: releasenote
removed: fragment
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20275
Committed: http://github.org/Juniper/contrail-controller/commit/84949561e75600d72c9f21bdc413ac9a74df51df
Submitter: Zuul
Branch: R2.20

commit 84949561e75600d72c9f21bdc413ac9a74df51df
Author: Anand H. Krishnan <email address hidden>
Date: Mon May 16 08:59:02 2016 +0530

Remove fields from dropstats sandesh

Change-Id: Ib1cb7b9a0d463901a36518b531f18ea29a2fa1a2
Partial-BUG: #1579828

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/20240
Submitter: Ashok Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20273
Committed: http://github.org/Juniper/contrail-controller/commit/17675ad43fd4f3eff13a37dcc4c08311af9206d9
Submitter: Zuul
Branch: R3.0

commit 17675ad43fd4f3eff13a37dcc4c08311af9206d9
Author: Anand H. Krishnan <email address hidden>
Date: Mon May 16 08:59:02 2016 +0530

Remove fields from dropstats sandesh

Change-Id: Ib1cb7b9a0d463901a36518b531f18ea29a2fa1a2
Partial-BUG: #1579828

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20240
Committed: http://github.org/Juniper/contrail-controller/commit/cc7902a34bc3e1dbf2bca177012bfb2b7a338ca5
Submitter: Zuul
Branch: master

commit cc7902a34bc3e1dbf2bca177012bfb2b7a338ca5
Author: ashoksingh <email address hidden>
Date: Fri May 20 20:30:54 2016 +0530

Sync Agent dropstats with that of vrouter.

Removed the following fields from agent
-composite_invalid_interface
-arp_reply_no_route

Also added missing fields in agent

Change-Id: Ib1cb7b9a0d463901a36518b531f18ea29a2fa1a2
Partial-BUG: #1579828

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20506
Submitter: Ashok Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20506
Committed: http://github.org/Juniper/contrail-controller/commit/ed6a909139e8e41af314c335f996a9ce63a9fa72
Submitter: Zuul
Branch: R3.0

commit ed6a909139e8e41af314c335f996a9ce63a9fa72
Author: ashoksingh <email address hidden>
Date: Sun May 22 10:39:53 2016 +0530

Sync Agent dropstats with that of vrouter.

Removed the following fields from agent
-composite_invalid_interface
-arp_reply_no_route

Also added missing fields in agent

Change-Id: I92cbb10625e440e18a91e696dc9a60a53ba54341
Partial-BUG: #1579828

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20274
Committed: http://github.org/Juniper/contrail-controller/commit/55600ca2ca54b0e6ea7ff247b87e2cd418e87196
Submitter: Zuul
Branch: R2.21.x

commit 55600ca2ca54b0e6ea7ff247b87e2cd418e87196
Author: Anand H. Krishnan <email address hidden>
Date: Mon May 16 08:59:02 2016 +0530

Remove fields from dropstats sandesh

Change-Id: Ib1cb7b9a0d463901a36518b531f18ea29a2fa1a2
Partial-BUG: #1579828

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20276
Committed: http://github.org/Juniper/contrail-controller/commit/38c413acacfc07ec701acef053b854c3a1b171c0
Submitter: Zuul
Branch: R2.22.x

commit 38c413acacfc07ec701acef053b854c3a1b171c0
Author: Anand H. Krishnan <email address hidden>
Date: Mon May 16 08:59:02 2016 +0530

Remove fields from dropstats sandesh

Change-Id: Ib1cb7b9a0d463901a36518b531f18ea29a2fa1a2
Partial-BUG: #1579828

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/20660
Submitter: Ashok Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/20661
Submitter: Ashok Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/20662
Submitter: Ashok Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20660
Committed: http://github.org/Juniper/contrail-controller/commit/14f993d2ed1e464cc528aa66f33a7345b137c484
Submitter: Zuul
Branch: R2.20

commit 14f993d2ed1e464cc528aa66f33a7345b137c484
Author: ashoksingh <email address hidden>
Date: Thu May 26 14:03:29 2016 +0530

Sync Agent dropstats with that of vrouter.

Removed invalid fields from agent and added missing fields.

Change-Id: Iaccce2a8319f390f93f166092e9f4a54af34890e
Partial-BUG: #1579828

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20662
Committed: http://github.org/Juniper/contrail-controller/commit/1e5fd2004985b2fabfd0402ebccd1a60ce4229dc
Submitter: Zuul
Branch: R2.22.x

commit 1e5fd2004985b2fabfd0402ebccd1a60ce4229dc
Author: ashoksingh <email address hidden>
Date: Thu May 26 14:03:29 2016 +0530

Sync Agent dropstats with that of vrouter.

Removed invalid fields from agent and added missing fields.

Partial-BUG: #1579828
(cherry picked from commit 14f993d2ed1e464cc528aa66f33a7345b137c484)

Change-Id: I8286a3e8bec65c3a6d818aa08f4fe8c2a24a54f6

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20661
Committed: http://github.org/Juniper/contrail-controller/commit/140a22c56b5977cbbfa2fd1b9d90797b065cc08d
Submitter: Zuul
Branch: R2.21.x

commit 140a22c56b5977cbbfa2fd1b9d90797b065cc08d
Author: ashoksingh <email address hidden>
Date: Thu May 26 14:03:29 2016 +0530

Sync Agent dropstats with that of vrouter.

Removed invalid fields from agent and added missing fields.

Partial-BUG: #1579828
(cherry picked from commit 14f993d2ed1e464cc528aa66f33a7345b137c484)

Change-Id: Ic236923c8c33630216f755539b34864bce7f85df

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20283
Committed: http://github.org/Juniper/contrail-vrouter/commit/cab4c7c4a3a060c93902e4de0f84d470eb4d3146
Submitter: Zuul
Branch: R3.0

commit cab4c7c4a3a060c93902e4de0f84d470eb4d3146
Author: Anand H. Krishnan <email address hidden>
Date: Tue May 17 12:47:20 2016 +0530

Out Of Order Fragment handling fixes

. If any fragment other than the HEAD comes to the assembler, then
it means that datapath was not able to find the flow information
for that fragment, since HEAD had not yet passed through it. Hence,
there is an explicit assumption in the assembler that HEAD will
always come later than at least one fragment. Once the HEAD arrives,
the assembler will look for all the fragments of that packet and
flush the fragments.

Since the assembler is an asynchronous entity with respect to the
datapath, it is possible that by the time it gets the event and
processes the input fragments, HEAD also would have arrived, and
possibly in a different CPU than the other fragments. If the HEAD
then is processed first, the assembler will not find any fragments
that will need the information that is supplied by the HEAD and hence
the fragments that arrived in the system before the HEAD will stay in
the assembler queue till they get timed out.

. Because of the asynchronous nature of the assembler, the IP header
of the cloned HEAD is not a safe access, since the original packet
might have undergone NATing, resulting in wrong IPs being used for
selecting the queues to search for fragments. Hence, store the IPs
in the packet node along with the fragment information so that
assembler will make the right calculations.

. Initialize the packet node flags field to zero. Uninitialized pnode
flags resulted in the label being treated as a VNID and hence an unset
packet nexthop and thus a wrong key nexthop in the flow key, resulting
in HOLD flows.

Change-Id: I5d2c5abcda9c612c9d13378c79ae5d8392fd2c7b
Closes-BUG: #1579828
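
The first two items of the commit message above can be illustrated with a toy model. This is a simplified Python sketch, not vrouter code: the `Assembler` class, the `remember_head` switch, and the flow label `"flow-A"` are all hypothetical, and remembering the HEAD's flow information is only one plausible way to close the race (the actual change may do it differently). Queues are keyed by (source IP, destination IP, IP id).

```python
from dataclasses import dataclass, field

@dataclass
class Assembler:
    remember_head: bool                           # hypothetical switch: keep HEAD info around
    waiting: dict = field(default_factory=dict)   # key -> offsets of queued non-HEAD fragments
    heads: dict = field(default_factory=dict)     # key -> flow info learned from a HEAD
    flushed: list = field(default_factory=list)   # (key, offset, flow) sent on their way

    def on_head(self, key, flow):
        # HEAD arrived: flush every fragment that was waiting for its flow info.
        for off in self.waiting.pop(key, []):
            self.flushed.append((key, off, flow))
        if self.remember_head:
            self.heads[key] = flow

    def on_fragment(self, key, offset):
        flow = self.heads.get(key)
        if flow is not None:
            self.flushed.append((key, offset, flow))
        else:
            # No HEAD known yet: stays queued until a HEAD shows up or it times out.
            self.waiting.setdefault(key, []).append(offset)

def run(events, remember_head):
    asm = Assembler(remember_head)
    for kind, key, arg in events:
        if kind == "head":
            asm.on_head(key, arg)
        else:
            asm.on_fragment(key, arg)
    return asm

PRE_NAT = ("1.1.1.3", "10.204.221.180", 64422)
POST_NAT = ("10.204.221.181", "10.204.221.180", 64422)

# Item 1: fragment 8 hit the wire first, but the assembler processes the HEAD first.
raced = [("head", PRE_NAT, "flow-A"), ("frag", PRE_NAT, 8)]
old = run(raced, remember_head=False)   # fragment 8 is left waiting for a timeout
new = run(raced, remember_head=True)    # fragment 8 is flushed using the stored flow

# Item 2: looking up the queue with the post-NAT header misses the queued fragment;
# using the IPs stored when the HEAD was enqueued finds it.
nat_wrong = run([("frag", PRE_NAT, 8), ("head", POST_NAT, "flow-A")], False)
nat_right = run([("frag", PRE_NAT, 8), ("head", PRE_NAT, "flow-A")], False)

print("race, HEAD not remembered:", old.waiting)
print("race, HEAD remembered:    ", new.flushed)
print("NATed key lookup:         ", nat_wrong.waiting)
print("stored key lookup:        ", nat_right.flushed)
```

A real implementation would also need to expire remembered HEAD entries; the sketch leaves that out.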

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20281
Committed: http://github.org/Juniper/contrail-vrouter/commit/0eb83fcb9060e86bab6be908d13c22410a34f260
Submitter: Zuul
Branch: R2.20

commit 0eb83fcb9060e86bab6be908d13c22410a34f260
Author: Anand H. Krishnan <email address hidden>
Date: Thu May 12 10:58:31 2016 +0530

Out Of Order Fragment handling fixes


Change-Id: I5d2c5abcda9c612c9d13378c79ae5d8392fd2c7b
Closes-BUG: #1579828

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20280
Committed: http://github.org/Juniper/contrail-vrouter/commit/5f950dc27112307c7a7ea57e4858e6ef3c055fd8
Submitter: Zuul
Branch: R2.21.x

commit 5f950dc27112307c7a7ea57e4858e6ef3c055fd8
Author: Anand H. Krishnan <email address hidden>
Date: Thu May 12 10:58:31 2016 +0530

Out Of Order Fragment handling fixes


Change-Id: I5d2c5abcda9c612c9d13378c79ae5d8392fd2c7b
Closes-BUG: #1579828

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20282
Committed: http://github.org/Juniper/contrail-vrouter/commit/647c69b9eeb37e7fea562071a0e1f8a50a70ab08
Submitter: Zuul
Branch: R2.22.x

commit 647c69b9eeb37e7fea562071a0e1f8a50a70ab08
Author: Anand H. Krishnan <email address hidden>
Date: Thu May 12 10:58:31 2016 +0530

Out Of Order Fragment handling fixes


Change-Id: I5d2c5abcda9c612c9d13378c79ae5d8392fd2c7b
Closes-BUG: #1579828

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20146
Committed: http://github.org/Juniper/contrail-vrouter/commit/852890e262d0fc26d7587b8d8be14bc0757f8572
Submitter: Zuul
Branch: master

commit 852890e262d0fc26d7587b8d8be14bc0757f8572
Author: Anand H. Krishnan <email address hidden>
Date: Thu May 12 10:58:31 2016 +0530

Out Of Order Fragment handling fixes


Change-Id: I5d2c5abcda9c612c9d13378c79ae5d8392fd2c7b
Closes-BUG: #1579828
