SFC fails to route traffic when multiple compute nodes are involved

Bug #1604395 reported by Artem Plakunov
Affects: networking-sfc
Status: Invalid
Importance: Critical
Assigned to: Fred S

Bug Description

The bug: if there is more than one compute node in the environment, SFC traffic will not be routed between them.

--------

Reproduce:
Environment with 1 controller, 2 compute nodes (3 physical machines): C (controller), N1 (compute1), N2 (compute2).
1. Create network and subnet
2. Launch 4 vms in this network:
    VM1 (traffic source) on N1
    VM2 (service 1) on N1
    VM3 (service 2) on N2
    VM4 (destination) on N1
3. Setup sfc:
    create two port pairs for VM2 and VM3
    create two port-pair-groups for VM2 and VM3
    create classifier with logical-source-port = VM1_port, protocol = icmp
    create chain with two port-pair-groups (VM2's group should go first) and classifier
4. On vms, run:
    VM1 runs: ping VM4
    VM2 and VM3 run: tcpdump -n -i eth0 icmp

Result: the pings do not return. VM2's tcpdump shows the ICMP packets passing through, while VM3's shows nothing.
If I recreate the port chain to include only VM2's port pair group, everything works.
It seems the bug occurs when an element of the port chain is located on a different compute node than the previous element in the traffic path.

-----------

Another setup in which the bug occurs:
3 VMs: source and service on N1, destination on N2. Two ICMP classifiers, with logical-source-port set to the source VM's port and the destination VM's port respectively.
In this case the packets successfully reach the destination VM but fail on the way back. If I remove the second classifier for the destination VM (so that returning packets do not go through the service VM), everything works again.

-----------

This bug has been reproduced on both Liberty and Mitaka. For Liberty I used networking-sfc v1.0.0 from the pip package; for Mitaka I installed sfc from the master branch, commit d3235cc3a9dbb602b49c155a457a67676a037ec6 (v1.0.1.dev68).

Environment setup:
Mirantis OpenStack 8.0 (Liberty), 9.0 (Mitaka)
Neutron versions: 7.0.4 (Liberty), 8.1.2 (Mitaka)
The virtual machines run clean Ubuntu 15.10.

Tags: multinode
Revision history for this message
Artem Plakunov (artacc) wrote :

The openvswitch-agent logs report no errors, but here they are anyway.

Changed in networking-sfc:
status: New → Triaged
importance: Undecided → Critical
Fred S (fsbiz) wrote :

Please attach the following outputs. Let us work with setup 2 mentioned below.
3 vms, source and service on N1, destination on N2. Two icmp classifiers with logical-source-port = source vm port and destination vm port.
 In this case the packets will successfully reach destination vm but fail when going back. If I remove the second classifier for destination vm (so returning packets will not go through service vm) everything works again.

 SRC       SF               DST
  |         |                |
  |         |                |
 Compute Host N1        Compute Host N2

Does your SF have a single interface (used for ingress and egress), or
two interfaces (one for ingress and one for egress) ?

Please provide the following outputs:
1. neutron port-list
2. neutron port-pair-list
3. neutron port-pair-group-list
4. neutron flow-classifier-list
5. neutron port-chain-list

On nodes N1 and N2:
ovs-vsctl show
ovs-ofctl dump-groups -O OpenFlow13 br-int
ovs-ofctl dump-flows -O OpenFlow13 br-int

Thanks,
Farhad.

Artem Plakunov (artacc) wrote :

Thanks for the answer. All service VMs have a single interface, since I'm using one network.

Here is the data (the attachment is a duplicate, in case that is more convenient):

Neutron CLI commands

neutron port-list
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| id | name | mac_address | fixed_ips |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| 09fc96eb-ea4d-4f45-a0b1-e098b2217a69 | | fa:16:3e:73:66:db | {"subnet_id": "2779cfcc-793c-4539-9438-6aa84fd65c91", "ip_address": "192.168.111.2"} |
| 2f01899d-4d0a-4172-b218-2adce5f93b94 | | fa:16:3e:e2:18:fb | {"subnet_id": "94e4fc32-e260-4576-a98e-ea2730a2f9b6", "ip_address": "192.168.200.12"} |
| 67a4f9b0-e0ca-4831-8c07-f3c66623b868 | | fa:16:3e:31:12:c6 | {"subnet_id": "2779cfcc-793c-4539-9438-6aa84fd65c91", "ip_address": "192.168.111.1"} |
| 84200887-04b1-4f97-ad09-97e55476016c | | fa:16:3e:16:4d:9d | {"subnet_id": "eaedbcee-ce9e-4609-bc91-c31eb8c84fb1", "ip_address": "172.20.4.127"} |
| 99d03583-4b46-4bfc-bcb7-d62eb86a2c87 | | fa:16:3e:0d:b4:d7 | {"subnet_id": "94e4fc32-e260-4576-a98e-ea2730a2f9b6", "ip_address": "192.168.200.10"} |
| b5163853-c9d4-49c3-ac78-937ae48d817e | | fa:16:3e:a2:4e:a9 | {"subnet_id": "94e4fc32-e260-4576-a98e-ea2730a2f9b6", "ip_address": "192.168.200.11"} |
| ceaf6295-e850-4c38-b882-2606c3f1b00f | | fa:16:3e:05:a2:4b | {"subnet_id": "94e4fc32-e260-4576-a98e-ea2730a2f9b6", "ip_address": "192.168.200.2"} |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+

neutron port-pair-list
+--------------------------------------+-------------+--------------------------------------+--------------------------------------+
| id | name | ingress | egress |
+--------------------------------------+-------------+--------------------------------------+--------------------------------------+
| c3799715-030b-4d60-a682-ee906d1603a4 | servicepair | b5163853-c9d4-49c3-ac78-937ae48d817e | b5163853-c9d4-49c3-ac78-937ae48d817e |
+--------------------------------------+-------------+--------------------------------------+--------------------------------------+

neutron port-pair-group-list
+--------------------------------------+--------------+-------------------------------------------+
| id | name | port_pairs |
+--------------------------------------+--------------+-------------------------------------------+
| db543b8e-4b9b-4b85-9517-4f216ccf2605 | servicegroup | [u'c3799715-030b-4d60-a682-ee906d1603a4'] |
+--------------------------------------+--------------+-------------------------------------------+

neutron flow-classifier-list
+-----------------------------------...

Fred S (fsbiz) wrote :

Thanks. All flows look good.
It looks like DST also responded. The packets hit the correct flow on br-int on the second compute node but never made it to br-int on compute node 1.
See inline for FS:

COMPUTE NODE 1

root@node-4:~# ovs-ofctl dump-groups -O OpenFlow13 br-int
OFPST_GROUP_DESC reply (OF1.3) (xid=0x2):
 group_id=1,type=select,bucket=actions=set_field:fa:16:3e:a2:4e:a9->eth_dst,resubmit(,5)
root@node-4:~# ovs-ofctl dump-flows -O OpenFlow13 br-int

 cookie=0x8638fb0830e30535, duration=699.169s, table=0, n_packets=19, n_bytes=1862, priority=30,icmp,in_port=13 actions=group:1

FS: 19 ICMP packets were sent from SRC 192.168.200.10

 cookie=0x8638fb0830e30535, duration=699.805s, table=5, n_packets=19, n_bytes=1862, priority=0,ip,dl_dst=fa:16:3e:a2:4e:a9 actions=push_mpls:0x8847,set_field:65791->mpls_label,set_mpls_ttl(255),push_vlan:0x8100,set_field:4098->vlan_vid,resubmit(,10)

FS: MPLS tag 65791 was pushed on the 19 packets.

 cookie=0x8638fb0830e30535, duration=698.929s, table=10, n_packets=19, n_bytes=1862, priority=1,mpls,dl_vlan=2,dl_dst=fa:16:3e:a2:4e:a9,mpls_label=65791 actions=pop_vlan,pop_mpls:0x0800,output:14

FS: Since the service function 192.168.200.11 is on the same compute node, the MPLS tag is popped and the 19 packets are redirected to the SF.

Please read: This counter should be 38. The reply packets from DST should also be popped here.
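(Editor's aside, not part of the original report: counter checks like this one are easy to script. The sketch below, with a helper name of my own choosing, pulls the `n_packets` field out of one dump-flows line so that two readings can be compared.)

```python
import re

def n_packets(flow_line):
    """Extract the n_packets counter from one ovs-ofctl dump-flows line."""
    m = re.search(r"n_packets=(\d+)", flow_line)
    return int(m.group(1)) if m else None

# The table=10 flow from compute node 1 above:
line = (" cookie=0x8638fb0830e30535, duration=698.929s, table=10, "
        "n_packets=19, n_bytes=1862, priority=1,mpls,dl_vlan=2,"
        "dl_dst=fa:16:3e:a2:4e:a9,mpls_label=65791 "
        "actions=pop_vlan,pop_mpls:0x0800,output:14")
print(n_packets(line))  # -> 19, where 38 would be expected
```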

_______________________________________________________________________________________________

COMPUTE NODE 2

root@node-1:~# ovs-ofctl dump-groups -O OpenFlow13 br-int
OFPST_GROUP_DESC reply (OF1.3) (xid=0x2):
 group_id=1,type=select,bucket=actions=set_field:fa:16:3e:a2:4e:a9->eth_dst,resubmit(,5)
root@node-1:~#
root@node-1:~# ovs-ofctl dump-flows -O OpenFlow13 br-int

 cookie=0xb64d1a09f55ad233, duration=732.140s, table=0, n_packets=19, n_bytes=1862, priority=30,icmp,in_port=9 actions=group:1

FS: DST 192.168.200.12 has responded to the 19 packets.

 cookie=0xb64d1a09f55ad233, duration=732.609s, table=5, n_packets=19, n_bytes=1862, priority=0,ip,dl_dst=fa:16:3e:a2:4e:a9 actions=push_mpls:0x8847,set_field:65791->mpls_label,set_mpls_ttl(255),push_vlan:0x8100,set_field:4098->vlan_vid,output:1

FS: The 19 packets replied by DST are pushed with MPLS tag 65791 (the same as before, since it is the same chain)
and sent to br-tun to be forwarded to the SF on compute node 1.

Please send the outputs of br-tun also from both compute nodes.
ovs-ofctl dump-flows -O OpenFlow13 br-tun

thanks,
Farhad.

 cookie=0xb64d1a09f55ad233, duration=77852.056s, table=10, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0xb64d1a09f55ad233, duration=77853.854s, table=23, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0xb64d1a09f55ad233, duration=988.582s, table=24, n_packets=0, n_bytes=0, priority=2,icmp6,in_port=9,icmp_type=136,nd_target=fe80::f816:3eff:fee2:18fb actions=NORMAL
 cookie=0xb64d1a09f55ad233, duration=988.452s, table=24, n_packets=63, n_bytes=2646, priority=2,arp,in_port=9,arp_spa=192.168.200.12 actions=resubmit(,25)
 cookie=0xb64d1a09f55ad233, duration=77853.790s, table=24, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0xb64d1a09f55ad233, duration=988.778s, table=25, n_packets=1030, n_bytes=8...


Artem Plakunov (artacc) wrote :

COMPUTE NODE 1

root@node-4:~# ovs-ofctl dump-flows -O OpenFlow13 br-tun
OFPST_FLOW reply (OF1.3) (xid=0x2):
 cookie=0x95a995f49a374bb3, duration=96193.197s, table=0, n_packets=905496, n_bytes=73695260, priority=1,in_port=1 actions=resubmit(,2)
 cookie=0x95a995f49a374bb3, duration=19266.573s, table=0, n_packets=103165, n_bytes=8406415, priority=1,in_port=5 actions=resubmit(,4)
 cookie=0x95a995f49a374bb3, duration=19245.427s, table=0, n_packets=10, n_bytes=1332, priority=1,in_port=6 actions=resubmit(,4)
 cookie=0x95a995f49a374bb3, duration=96193.196s, table=0, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0x95a995f49a374bb3, duration=96193.195s, table=2, n_packets=32, n_bytes=1344, priority=1,arp,dl_dst=ff:ff:ff:ff:ff:ff actions=resubmit(,21)
 cookie=0x95a995f49a374bb3, duration=96193.195s, table=2, n_packets=905357, n_bytes=73681006, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20)
 cookie=0x95a995f49a374bb3, duration=96193.194s, table=2, n_packets=107, n_bytes=12910, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,22)
 cookie=0x95a995f49a374bb3, duration=96193.194s, table=3, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0x95a995f49a374bb3, duration=19268.275s, table=4, n_packets=103175, n_bytes=8407747, priority=1,tun_id=0x5f actions=push_vlan:0x8100,set_field:4098->vlan_vid,resubmit(,10)
 cookie=0x95a995f49a374bb3, duration=96193.193s, table=4, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0x95a995f49a374bb3, duration=96193.193s, table=6, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0x95a995f49a374bb3, duration=96193.192s, table=10, n_packets=905101, n_bytes=73729619, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,cookie=0x95a995f49a374bb3,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1
 cookie=0x95a995f49a374bb3, duration=19266.572s, table=20, n_packets=103171, n_bytes=8397842, priority=2,dl_vlan=2,dl_dst=fa:16:3e:05:a2:4b actions=pop_vlan,set_field:0x5f->tun_id,output:5
 cookie=0x95a995f49a374bb3, duration=19245.426s, table=20, n_packets=19, n_bytes=1862, priority=2,dl_vlan=2,dl_dst=fa:16:3e:e2:18:fb actions=pop_vlan,set_field:0x5f->tun_id,output:6
 cookie=0x95a995f49a374bb3, duration=19253.228s, table=20, n_packets=0, n_bytes=0, hard_timeout=300, priority=1,vlan_tci=0x0002/0x0fff,dl_dst=fa:16:3e:05:a2:4b actions=load:0->NXM_OF_VLAN_TCI[],load:0x5f->NXM_NX_TUN_ID[],output:5
 cookie=0x95a995f49a374bb3, duration=96193.192s, table=20, n_packets=14, n_bytes=1036, priority=0 actions=resubmit(,22)
 cookie=0x95a995f49a374bb3, duration=19266.572s, table=21, n_packets=2, n_bytes=84, priority=1,arp,dl_vlan=2,arp_tpa=192.168.200.2 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],set_field:fa:16:3e:05:a2:4b->eth_src,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xfa163e05a24b->NXM_NX_ARP_SHA[],load:0xc0a8c802->NXM_OF_ARP_SPA[],IN_PORT
 cookie=0x95a995f49a374bb3, duration=19245.426s, table=21, n_packets=2, n_bytes=84, priority=1,arp,dl_vlan=2,arp_tpa=192.168.200.12 actions=move:...


Fred S (fsbiz) wrote :

Thanks.
Can you run "dmesg" and check whether any packets are being dropped?

Fred S (fsbiz) wrote :

On compute node 2:

The output does show the 19 packets being sent over the vxlan tunnel towards compute node 1.

cookie=0xadea759a86ef4f78, duration=19321.653s, table=20, n_packets=19, n_bytes=1862, priority=2,dl_vlan=2,dl_dst=fa:16:3e:a2:4e:a9 actions=pop_vlan,set_field:0x5f->tun_id,output:4

You will need to check whether the corresponding counters increment in table 2, and subsequently table 10, of br-tun on compute node 1.
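(Editor's aside, not part of the original comment: one way to do that check is to capture `ovs-ofctl dump-flows -O OpenFlow13 br-tun` twice, a few seconds apart, and diff the per-flow counters. A minimal sketch with helpers of my own naming, keyed on the table number plus match fields, which are assumed stable between the two snapshots:)

```python
import re

# Table number, packet counter, and the match spec starting at "priority=".
FLOW = re.compile(r"table=(\d+).*?n_packets=(\d+).*?(priority=\S+)")

def counters(dump):
    """Map (table, match) -> n_packets for every flow in a dump-flows output."""
    result = {}
    for line in dump.splitlines():
        m = FLOW.search(line)
        if m:
            result[(int(m.group(1)), m.group(3))] = int(m.group(2))
    return result

def incremented(before, after):
    """Flows whose packet counter grew between two snapshots."""
    b, a = counters(before), counters(after)
    return {k: a[k] - b[k] for k in a if k in b and a[k] > b[k]}

# Synthetic example in the same format as the dumps above:
before = """\
 cookie=0x95a995f49a374bb3, duration=100.0s, table=2, n_packets=100, n_bytes=9000, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20)
 cookie=0x95a995f49a374bb3, duration=100.0s, table=10, n_packets=50, n_bytes=5000, priority=1 actions=output:1"""
after = before.replace("n_packets=100", "n_packets=119")
print(incremented(before, after))  # table 2 gained 19 packets; table 10 gained none
```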

Farhad.

Fred S (fsbiz) wrote :

What version of linux kernel is running on compute1 and compute2?

Artem Plakunov (artacc) wrote :

The Linux kernel version is the same on both compute nodes:
Linux node-4.domain.tld 3.13.0-91-generic #138-Ubuntu SMP Fri Jun 24 17:00:34 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

The dmesg output does not change while packets are flowing.

On compute node 1:

This rule does increment its packet count every second:
 cookie=0x95a995f49a374bb3, duration=104243.892s, table=2, n_packets=948792, n_bytes=77218984, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20)

The rule from table 10 does NOT increment the packet count

Fred S (fsbiz) wrote :

Please upgrade both kernels to 3.19 (preferably the latest).
Older kernels have issues with MPLS over VXLAN support.

Don't upgrade to a Linux 4.x kernel, since in that case even OVS will not work.
I believe OVS 2.5.0 supports only up to Linux kernel 4.3,

and your OVS 2.4.1 will support something lower than that. I know 3.19 (the latest from kernel.org) will work fine.
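(Editor's aside, not part of the original comment: the version check itself is easy to automate against `uname -r` output. A small sketch, with a helper name of my own and 3.19 as the assumed minimum per the advice above:)

```python
def kernel_at_least(uname_r, minimum=(3, 19)):
    """Parse a `uname -r` string such as '3.13.0-91-generic' and compare
    the major.minor version against the given minimum."""
    version = tuple(int(part) for part in uname_r.split("-")[0].split(".")[:2])
    return version >= minimum

print(kernel_at_least("3.13.0-91-generic"))      # -> False (too old for MPLS over VXLAN)
print(kernel_at_least("3.19.8-031908-generic"))  # -> True
```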

Farhad.

Louis Fourie (lfourie) wrote :
Changed in networking-sfc:
status: Triaged → In Progress
assignee: nobody → Farhad Sunavala (fsbiz)
Fred S (fsbiz) wrote :

Upgrading the kernel will fix the issue. Please reopen if you still have issues.

Changed in networking-sfc:
status: In Progress → Invalid
Artem Plakunov (artacc) wrote :

I apologize for the delay.
With kernel version 3.19.8-031908-generic everything works! Thanks a lot for the answers and the flow-rule explanations.
