ovs firewall: mac learning of dest VM mac not working

Bug #1897637 reported by Moshe Levi
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Moshe Levi
Milestone: (none)

Bug Description

I am using neutron master with the OVS firewall driver and OVS 2.13.
I have 2 compute nodes, with one VM on each of them.
Both VMs are configured with security groups that allow ingress and egress TCP traffic.
I am running iperf to test TCP connection tracking.
When traffic starts I see the following datapath rule:

ufid:58ea9ecf-9fe5-4662-ae46-be4b7540d9c5, skb_priority(0/0),skb_mark(0/0),ct_state(0x2/0x2),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x15),dp_hash(0/0),in_port(p4p2_11),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:9e:77:5c,dst=fa:16:3e:35:c0:68),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:7151296, bytes:64459680961, used:0.420s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x1,src=172.16.0.148,dst=172.16.0.147,ttl=64,tp_dst=4789,flags(key))),vxlan_sys_4789

This is the fdb table of br-int, as shown by "ovs-appctl fdb/show br-int":

 port VLAN MAC Age
    5 3 fa:16:3e:35:c0:68 97
    6 3 fa:16:3e:9e:77:5c 0

As you can see, the Age of the remote VM's dest MAC keeps increasing. When it reaches 300s, which is the default aging time in OVS, the MAC disappears from the table and the rule above changes to a flood rule:

ufid:b2967a14-aa26-433a-8df1-1cc00ef662e7, skb_priority(0/0),skb_mark(0/0),ct_state(0x2/0x2),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x12),dp_hash(0/0),in_port(p4p2_11),packet_type(ns=0/0,id=0/0),eth(src=fa:16:3e:9e:77:5c,dst=fa:16:3e:35:c0:68),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:23004560, bytes:204398890734, used:0.000s, dp:tc, actions:push_vlan(vid=1,pcp=0),br-int,set(tunnel(tun_id=0x1,src=172.16.0.148,dst=172.16.0.147,ttl=64,tp_dst=4789,flags(key))),pop_vlan,vxlan_sys_4789

This is the fdb table of br-int at that point, as shown by "ovs-appctl fdb/show br-int":
 port VLAN MAC Age
    9 1 fa:16:3e:9e:77:5c 0

The flood rule breaks the offload.
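
The two symptoms above (a dest MAC whose Age keeps growing, and a datapath flow that loses "offloaded:yes") can be checked mechanically. The following is a minimal sketch, assuming the column layout of the fdb tables pasted above; the helper names (parse_fdb, seconds_until_aged_out, is_offloaded) are hypothetical and are not part of OVS or neutron. The flow strings are shortened copies of the two dumps in this report.

```python
import re

DEFAULT_MAC_AGING_S = 300  # default OVS aging time cited in this report


def parse_fdb(text):
    """Parse `ovs-appctl fdb/show` output into {mac: (port, vlan, age)}."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields: port, VLAN, MAC, age.
        # Assumption: rows with a non-numeric port are simply skipped here.
        if len(fields) == 4 and fields[0].isdigit():
            port, vlan, mac, age = fields
            entries[mac.lower()] = (int(port), int(vlan), int(age))
    return entries


def seconds_until_aged_out(entries, mac, aging_s=DEFAULT_MAC_AGING_S):
    """Seconds left before `mac` ages out, or None if it is not learned."""
    entry = entries.get(mac.lower())
    if entry is None:
        return None
    return max(aging_s - entry[2], 0)


def is_offloaded(flow_line):
    """True if a dumped datapath flow carries the `offloaded:yes` attribute."""
    return re.search(r"\boffloaded:yes\b", flow_line) is not None


# Sample data copied from this report (flow lines shortened).
FDB_BEFORE = """ port VLAN MAC Age
    5 3 fa:16:3e:35:c0:68 97
    6 3 fa:16:3e:9e:77:5c 0
"""
FDB_AFTER = """ port VLAN MAC Age
    9 1 fa:16:3e:9e:77:5c 0
"""
FLOW_BEFORE = "packets:7151296, bytes:64459680961, used:0.420s, offloaded:yes, dp:tc"
FLOW_AFTER = "packets:23004560, bytes:204398890734, used:0.000s, dp:tc"
REMOTE_MAC = "fa:16:3e:35:c0:68"

print(seconds_until_aged_out(parse_fdb(FDB_BEFORE), REMOTE_MAC))
print(seconds_until_aged_out(parse_fdb(FDB_AFTER), REMOTE_MAC))
print(is_offloaded(FLOW_BEFORE), is_offloaded(FLOW_AFTER))
```

A check like this could be run periodically against the real command output to catch the remote MAC before it ages out, but it only observes the problem; it does not prevent it.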

It seems that RULES_INGRESS_TABLE (table 82) outputs directly to the dest port without doing the NORMAL action. If we change the OpenFlow rule of this table from:
table=82, n_packets=147206831, n_bytes=11772233989, priority=50,ct_state=+est-rel+rpl,ct_zone=1,ct_mark=0,reg5=0x9 actions=output:"p4p2_11"
to:
cookie=0x1e1cc3048de6c562, duration=196.708s, table=82, n_packets=145661342, n_bytes=11670250023, priority=50,ct_state=+est-rel+rpl,ct_zone=1,ct_mark=0,reg5=0x9 actions=mod_vlan_vid:1,NORMAL

the problem will be solved.
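
Why switching from a direct output to NORMAL helps can be illustrated with a toy model of a learning bridge. This is only a sketch under stated assumptions, not OVS code: the class ToyLearningSwitch, the port name "tunnel", and the one-packet-per-second traffic pattern are all made up for illustration. NORMAL is modeled as "learn the source MAC, then look up the destination", while a direct output forwards without learning anything.

```python
AGING_S = 300  # default OVS MAC aging time cited in this report
FLOOD = "FLOOD"


class ToyLearningSwitch:
    """Toy learning bridge (hypothetical model, not OVS internals)."""

    def __init__(self):
        self.fdb = {}  # mac -> (port, last_seen)

    def _lookup(self, mac, now):
        entry = self.fdb.get(mac)
        if entry is None or now - entry[1] >= AGING_S:
            self.fdb.pop(mac, None)  # entry is missing or has aged out
            return FLOOD
        return entry[0]

    def normal(self, src, dst, in_port, now):
        """Learn (or refresh) the source MAC, then forward by destination."""
        self.fdb[src] = (in_port, now)
        return self._lookup(dst, now)

    def direct_output(self, out_port):
        """Forward to a fixed port; the source MAC is never learned."""
        return out_port


def first_flood_time(ingress_uses_normal, duration=400):
    """Return the first second at which VM-to-remote traffic is flooded."""
    local_vm, remote_vm = "fa:16:3e:9e:77:5c", "fa:16:3e:35:c0:68"
    vm_port, tunnel_port = "p4p2_11", "tunnel"  # "tunnel" is a made-up name
    sw = ToyLearningSwitch()
    # t=0: the remote MAC is learned once, by a packet that went through NORMAL.
    sw.normal(remote_vm, local_vm, tunnel_port, 0)
    for now in range(1, duration + 1):
        # Established reply traffic arriving for the local VM (table 82).
        if ingress_uses_normal:
            sw.normal(remote_vm, local_vm, tunnel_port, now)
        else:
            sw.direct_output(vm_port)
        # Traffic from the local VM toward the remote VM uses NORMAL.
        if sw.normal(local_vm, remote_vm, vm_port, now) == FLOOD:
            return now
    return None


print(first_flood_time(ingress_uses_normal=False))
print(first_flood_time(ingress_uses_normal=True))
```

In this model the direct-output variant starts flooding once the single learned entry reaches the aging time, while the NORMAL variant keeps refreshing the remote MAC and never floods within the simulated window.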

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/754867

Changed in neutron:
assignee: nobody → Moshe Levi (moshele)
status: New → In Progress
Moshe Levi (moshele)
summary: - ovs firewall: mac learning on dest VM mac not working
+ ovs firewall: mac learning of dest VM mac not working
Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Related bug: https://bugs.launchpad.net/neutron/+bug/1884708 (no firewall involved)

Revision history for this message
Moshe Levi (moshele) wrote :

@Rodolfo I am not sure it is the same bug. What we see is that with the noop firewall driver, MAC learning works fine and traffic is offloaded.
With the OVS firewall driver, on the other hand, we don't see MAC learning of the dest MAC. This causes the datapath rule to change to a flood rule and breaks the offload (as we don't offload flood packets).

Changed in neutron:
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/754867
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=8fc80b7e132031d18c787b5be582c146d262de74
Submitter: Zuul
Branch: master

commit 8fc80b7e132031d18c787b5be582c146d262de74
Author: Moshe Levi <email address hidden>
Date: Tue Sep 29 00:58:54 2020 +0300

    ovs firewall: fix mac learning on the ingress rule table when ovs offload enabled

    In RULES_INGRESS_TABLE (table 82) there is a rule that allows established and
    related connections. The current rule sends the packet directly to the dest
    port without doing MAC learning. This causes OVS to age out the dest MAC
    of the remote VM and changes the datapath rule into a flood rule. In the normal
    (non-offload) case this is fine, as the direct output is meant to avoid high CPU
    usage. OVS hardware offload reduces CPU usage by moving some of the packet
    processing to the NIC, and a flood rule is not offloaded, so with offload it is
    preferable to use the NORMAL action and avoid the flood rule.
    We also keep the same logic as today when using explicitly_egress_direct=True,
    which avoids the NORMAL action in the entire pipeline.

    Closes-Bug: #1897637

    Change-Id: I9b611d62be5d0529e8b35e3d8280baa5be54bc2b
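
The decision described in the commit message can be sketched as a small function. This is an illustration only, not the code from the merged patch: the function name, its parameters, and the exact condition (NORMAL only when hardware offload is enabled and explicitly_egress_direct is not set) are assumptions inferred from the commit message, and the action strings follow the two table 82 rules quoted in the bug description.

```python
def established_ingress_actions(ofport, vlan_tag, hw_offload_enabled,
                                explicitly_egress_direct):
    """Pick the table 82 action for established/related ingress traffic.

    Hypothetical helper: names and condition are assumptions, not neutron's API.
    """
    if hw_offload_enabled and not explicitly_egress_direct:
        # Tag the packet and let NORMAL learn the source MAC, so the remote
        # VM's entry is refreshed and no flood rule replaces the offloaded flow.
        return "mod_vlan_vid:{:d},NORMAL".format(vlan_tag)
    # Previous behaviour: send straight to the VM port, without MAC learning.
    return "output:{:d}".format(ofport)


print(established_ingress_actions(9, 1, hw_offload_enabled=True,
                                  explicitly_egress_direct=False))
print(established_ingress_actions(9, 1, hw_offload_enabled=False,
                                  explicitly_egress_direct=False))
```

Keeping the direct output as the default preserves the behaviour for deployments without offload, which is the trade-off the commit message describes.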

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/759343

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/759344

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/759345

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/759537

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/759538

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/759539

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/759343
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=092b34177964286fd69857c76e581e64afeb007b
Submitter: Zuul
Branch: stable/victoria

commit 092b34177964286fd69857c76e581e64afeb007b
Author: Moshe Levi <email address hidden>
Date: Tue Sep 29 00:58:54 2020 +0300

    Closes-Bug: #1897637

    Change-Id: I9b611d62be5d0529e8b35e3d8280baa5be54bc2b
    (cherry picked from commit 8fc80b7e132031d18c787b5be582c146d262de74)

tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/train)

Reviewed: https://review.opendev.org/759345
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=cbb949379c7acc5ec830c3a7d388aea59e1ab2c2
Submitter: Zuul
Branch: stable/train

commit cbb949379c7acc5ec830c3a7d388aea59e1ab2c2
Author: Moshe Levi <email address hidden>
Date: Tue Sep 29 00:58:54 2020 +0300

    Closes-Bug: #1897637

    Change-Id: I9b611d62be5d0529e8b35e3d8280baa5be54bc2b
    (cherry picked from commit 8fc80b7e132031d18c787b5be582c146d262de74)

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/stein)

Reviewed: https://review.opendev.org/759539
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9a1c830552ab1d762679351c3f182a9c7dc76d50
Submitter: Zuul
Branch: stable/stein

commit 9a1c830552ab1d762679351c3f182a9c7dc76d50
Author: Moshe Levi <email address hidden>
Date: Tue Sep 29 00:58:54 2020 +0300

    Closes-Bug: #1897637

    Change-Id: I9b611d62be5d0529e8b35e3d8280baa5be54bc2b
    (cherry picked from commit 8fc80b7e132031d18c787b5be582c146d262de74)

tags: added: in-stable-stein
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/queens)

Reviewed: https://review.opendev.org/759538
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2c00c37258685174c82b8509e91e82d2a6408477
Submitter: Zuul
Branch: stable/queens

commit 2c00c37258685174c82b8509e91e82d2a6408477
Author: Moshe Levi <email address hidden>
Date: Tue Sep 29 00:58:54 2020 +0300

    Closes-Bug: #1897637

    Change-Id: I9b611d62be5d0529e8b35e3d8280baa5be54bc2b
    (cherry picked from commit 8fc80b7e132031d18c787b5be582c146d262de74)

tags: added: in-stable-queens
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/759344
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d865165cc8cbd50a3e79a25065ef9a310d7c9396
Submitter: Zuul
Branch: stable/ussuri

commit d865165cc8cbd50a3e79a25065ef9a310d7c9396
Author: Moshe Levi <email address hidden>
Date: Tue Sep 29 00:58:54 2020 +0300

    Closes-Bug: #1897637

    Change-Id: I9b611d62be5d0529e8b35e3d8280baa5be54bc2b
    (cherry picked from commit 8fc80b7e132031d18c787b5be582c146d262de74)

tags: added: in-stable-ussuri
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/rocky)

Reviewed: https://review.opendev.org/759537
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=51c7c3cb2e396553eb8835f75919874f18862abb
Submitter: Zuul
Branch: stable/rocky

commit 51c7c3cb2e396553eb8835f75919874f18862abb
Author: Moshe Levi <email address hidden>
Date: Tue Sep 29 00:58:54 2020 +0300

    Closes-Bug: #1897637

    Change-Id: I9b611d62be5d0529e8b35e3d8280baa5be54bc2b
    (cherry picked from commit 8fc80b7e132031d18c787b5be582c146d262de74)

tags: added: in-stable-rocky
tags: added: neutron-proactive-backport-potential
tags: removed: neutron-proactive-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 15.3.1

This issue was fixed in the openstack/neutron 15.3.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 16.3.0

This issue was fixed in the openstack/neutron 16.3.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 17.1.0

This issue was fixed in the openstack/neutron 17.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.0.0.0rc1

This issue was fixed in the openstack/neutron 18.0.0.0rc1 release candidate.

Revision history for this message
Edward Hope-Morley (hopem) wrote :

Hi @moshele, can you please take a look at https://bugs.launchpad.net/neutron/+bug/1931696/comments/8 and tell me what you think? In short, I am finding that the patch that landed here is breaking both offloaded and non-offloaded ports in my environment.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron queens-eol

This issue was fixed in the openstack/neutron queens-eol release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron rocky-eol

This issue was fixed in the openstack/neutron rocky-eol release.
