vRouter: First packet of a flow ends up in a wrong destination
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
R2.20 | Fix Committed | Medium | Anand H. Krishnan |
R2.21.x | Fix Committed | Medium | Anand H. Krishnan |
Trunk | Fix Committed | Medium | Anand H. Krishnan |
Bug Description
The problem first showed up as delays in TCP connection establishment in a customer setup. Further observation showed that the first packet of a flow was being delivered to the wrong VM.
Observations:
Release 2.20
ping -c 1 <address> delivers the packet to the wrong VM but results in the correct flow entry.
ping -c 2 <address> results in the first packet being dropped.
Next hops were observed to be stable, and the problem was intermittent.
After L2 flow support was added, vRouter started using the label field in the forwarding metadata to also store the VXLAN identifier. For MPLS-over-X (GRE/UDP) packets, the label holds the packet's MPLS label. For VXLAN-tunneled packets, it holds the VNID.
We store the label in the forwarding metadata so that various sections of the code can use it to add encapsulation data or look up forwarding data when packets are cached. One such use is to let the first packet of a flow, which is held in the flow entry, find its destination. Logically, the label is not available for parsing at that point, since the flow lookup is done on the inner packet and the label has already been popped. When a packet is cached, we use flags to mark whether the label is a VNID or an MPLS label, since a cached packet has no other way to identify the kind of label.
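The bookkeeping described above can be sketched roughly as follows. This is a simplified Python model, not vRouter code (vRouter itself is written in C); `ForwardingMd`, `LabelKind`, `record_label`, and the tunnel-type strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class LabelKind(Enum):
    """How the number stored in the metadata must be interpreted."""
    MPLS = auto()  # MPLS label from an MPLS-over-GRE/UDP packet
    VNID = auto()  # VXLAN network identifier


@dataclass
class ForwardingMd:
    """Hypothetical stand-in for the per-packet forwarding metadata."""
    label: Optional[int] = None
    label_kind: Optional[LabelKind] = None


def record_label(md: ForwardingMd, tunnel: str, value: int) -> ForwardingMd:
    """Store the outer label and tag its kind while the tunnel type is known."""
    if tunnel in ("mpls-gre", "mpls-udp"):
        md.label_kind = LabelKind.MPLS
    elif tunnel == "vxlan":
        md.label_kind = LabelKind.VNID
    else:
        raise ValueError(f"unknown tunnel type: {tunnel}")
    md.label = value
    return md


# The same number means different things depending on the tunnel type,
# so the tag has to travel with the packet while it is held.
held_vxlan = record_label(ForwardingMd(), "vxlan", 5)
held_mpls = record_label(ForwardingMd(), "mpls-udp", 5)
```

The point of the sketch is that the tag is set at the one place where the tunnel type is still visible; once the outer header is stripped, only the stored tag distinguishes the two cases.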
The logic that identifies this information turned out to be buggy: a VNID could be identified as an MPLS label, which sent the packet to the wrong interface/VM.
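To illustrate the failure mode, here is a toy Python model. It is not the actual vRouter code or the fix under review; the lookup tables, label values, interface names, and `resolve_held_packet` are all invented for the example. It shows how a held packet whose label is really a VNID can reach a different interface when that number is looked up as an MPLS label.

```python
# Toy lookup tables; the numbers and interface names are made up.
MPLS_LABEL_TABLE = {5: "tap-vm-a"}  # MPLS label -> destination interface
VNID_TABLE = {5: "tap-vm-b"}        # VXLAN VNID -> destination interface


def resolve_held_packet(label: int, label_is_vnid: bool) -> str:
    """Pick the destination for a packet released from the flow hold queue.

    The outer header is gone at this point, so the flag is the only
    evidence of which table the label belongs to.
    """
    table = VNID_TABLE if label_is_vnid else MPLS_LABEL_TABLE
    return table[label]


# Correct classification: the VNID is looked up in the VNID table.
correct = resolve_held_packet(5, label_is_vnid=True)

# Misclassification: the same value is treated as an MPLS label.
wrong = resolve_held_packet(5, label_is_vnid=False)

print(correct, wrong)
```

In this model the misclassified lookup still succeeds, which is consistent with the reported symptom: the packet is not necessarily dropped, it is simply delivered somewhere else.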
Review in progress for https://review.opencontrail.org/14635
Submitter: Anand H. Krishnan (<email address hidden>)