[OVN] IGMP snooping traps IGMP messages

Bug #1918108 reported by Pedro Guimarães on 2021-03-08
Affects: neutron
Importance: High
Assigned to: Lucas Alvares Gomes

Bug Description

Hi,

Once we enable IGMP snooping on Neutron, IGMP messages are trapped and cannot leave the virtual switch.
That leads to a non-scalable solution: the external network cannot know which compute hosts are subscribed to a given multicast flow, so it is forced to push all multicast traffic to all hosts.

Also, if we resolve the problem above, the provider network interface on the vswitch side becomes an interface that can itself be registered in the IGMP_Group table in OVN. It therefore gets added or removed every time the external network sends an IGMP message to join or leave a flow. That means multicast traffic entering or leaving hosts will be gated by this erratic behavior of the provnet interface.

The solution I see for this is to (1) always allow all interfaces to flood IGMP messages and (2) also allow provnet interfaces to flood multicast traffic.

Tags: ovn
tags: added: ovn

Hi Pedro,

Thanks for reporting this.

Perhaps this is something we need to include in the OVN driver itself. We have this bug downstream at https://bugzilla.redhat.com/show_bug.cgi?id=1933990#c3, and in talks with the core OVN developers it turns out we could enable "mcast_flood_reports" on the Logical_Switch_Ports in the OVN driver to work around this problem (that would prevent ovn-controller from trapping the IGMP packets).

If you can't wait and want to test it in your environment, it's possible to set this option on the LSP by running:

$ ovn-nbctl set logical_switch_port <Neutron port UUID> options:mcast_flood_reports=true

Also, I talked to the core OVN developers and they advised me to always have this enabled. It shouldn't be a problem: if mcast_snoop is disabled on the LS, the "mcast_flood_reports" option on LSPs will just be ignored and not cause any harm.
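
To try the option on several ports at once, the ovn-nbctl command above could be generated per port. What follows is a minimal sketch, not part of the Neutron patch: the helper names and the placeholder UUIDs are hypothetical, and the code only builds the argument lists without running ovn-nbctl.

```python
def flood_reports_cmd(port_uuid):
    """Build the ovn-nbctl argv that sets mcast_flood_reports on one LSP."""
    return [
        "ovn-nbctl", "set", "logical_switch_port", port_uuid,
        "options:mcast_flood_reports=true",
    ]


def flood_reports_cmds(port_uuids):
    """Return one command per Neutron port UUID, in input order."""
    return [flood_reports_cmd(uuid) for uuid in port_uuids]


if __name__ == "__main__":
    # Placeholder UUIDs for illustration only; substitute real port UUIDs.
    for cmd in flood_reports_cmds(["<port-uuid-1>", "<port-uuid-2>"]):
        print(" ".join(cmd))
```

Each list could then be handed to subprocess.run(cmd, check=True) on a host that can reach the OVN northbound database.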

So the patch for the OVN driver in Neutron shouldn't be complicated. I will try to upload a fix soon.

Changed in neutron:
importance: Undecided → High
status: New → Confirmed
assignee: nobody → Lucas Alvares Gomes (lucasagomes)
Pedro Guimarães (pguimaraes) wrote :

Hi @lucasagomes, thanks for taking a look into this. I've done some work on it as well and ended up with the following:

1) Add mcast_flood_reports to all ports created by Neutron
2) Add mcast_flood to provnet ports (if an IGMP Leave comes from the network then the provnet gets unsubscribed and that affects the entire compute host)
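
The two points above could be sketched as a small helper that computes the LSP options per port type. This is only an illustration under assumptions: the function name is hypothetical, and it is not the code from the Neutron patch. The option keys "mcast_flood_reports" and "mcast_flood" and the "localnet" port type are the OVN names discussed in this thread.

```python
def multicast_lsp_options(port_type=""):
    """Return the multicast-related LSP options for a port of this type.

    Every port floods IGMP reports; localnet (provnet) ports additionally
    flood multicast traffic, so an IGMP Leave arriving from the external
    network cannot cut the host off from the group.
    """
    options = {"mcast_flood_reports": "true"}
    if port_type == "localnet":
        options["mcast_flood"] = "true"
    return options


if __name__ == "__main__":
    print(multicast_lsp_options())            # regular VM port
    print(multicast_lsp_options("localnet"))  # provnet port
```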

I've done some work to get it into Neutron and I can push it to opendev if you haven't started yet.

Hi Pedro, I've added the following patch upstream: https://review.opendev.org/c/openstack/neutron/+/779258

Do we need mcast_flood on the provnets as well, or should mcast_flood_reports be enough? In talks with the OVN core team it was suggested to enable mcast_flood_reports only.

Pedro Guimarães (pguimaraes) wrote :

@lucasagomes, yes, I do believe we also need mcast_flood for provnet ports.

The thing is that provnet ports are treated like any other port by OVN. However, if a provnet port receives an IGMP Leave, it will leave the group, and that affects all the VMs connected to that specific provider network and group.
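
The effect can be illustrated with a toy model of snooping-based forwarding. This is a deliberately simplified sketch and not OVN's actual pipeline: the function names, port names, and group address are all made up for the example. A packet for a group goes to the ports that joined the group plus any ports configured to always flood multicast.

```python
def egress_ports(group, igmp_groups, flood_ports, ingress):
    """Ports that receive a packet for `group` in this toy model."""
    members = igmp_groups.get(group, set())
    # Never send the packet back out of the port it arrived on.
    return (members | flood_ports) - {ingress}


def handle_leave(group, port, igmp_groups):
    """Remove `port` from `group`, as a snooped IGMP Leave would."""
    igmp_groups.get(group, set()).discard(port)


if __name__ == "__main__":
    groups = {"239.1.1.1": {"vm1", "provnet"}}
    # The external network sends a Leave that is learned on the provnet port.
    handle_leave("239.1.1.1", "provnet", groups)
    # Without flooding, traffic from vm2 no longer leaves the host.
    print(egress_ports("239.1.1.1", groups, set(), "vm2"))
    # With the provnet port marked as flooding, it still does.
    print(egress_ports("239.1.1.1", groups, {"provnet"}, "vm2"))
```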

The workaround I've found was to add something like:

    def create_provnet_port(self, network_id, segment, txn=None):
        tag = segment.get(segment_def.SEGMENTATION_ID, [])
        physnet = segment.get(segment_def.PHYSICAL_NETWORK)
        cmd = self._nb_idl.create_lswitch_port(
            lport_name=utils.ovn_provnet_port_name(segment['id']),
            lswitch_name=utils.ovn_name(network_id),
            addresses=[ovn_const.UNKNOWN_ADDR],
            external_ids={},
            type=ovn_const.LSP_TYPE_LOCALNET,
            tag=tag,
            options={'network_name': physnet,
                     # Added: always flood multicast traffic on the provnet port.
                     ovn_const.MCAST_FLOOD_PORT: 'true'})

Thanks Pedro for the feedback.

Please take a look at the latest patch-set for https://review.opendev.org/c/openstack/neutron/+/779258, it's addressing that problem.

Pedro Guimarães (pguimaraes) wrote :

Hi @lucasagomes, I've tested your fix and it is working in my lab. I will give you a +1 on the review as well.

Hi Pedro,

Thanks for trying. As soon as the patch merges I will work on backporting it to the stable branches.

This issue was fixed in the openstack/neutron 16.3.1 release.

This issue was fixed in the openstack/neutron 17.1.1 release.

This issue was fixed in the openstack/neutron 18.0.0.0rc1 release candidate.

This issue was fixed in the openstack/networking-ovn 7.4.1 release.
