Remote security groups don't allow traffic from floating IPs

Bug #2015449 reported by Adam Oswick
This bug affects 1 person
Affects  Status   Importance  Assigned to  Milestone
neutron  Expired  Undecided   Unassigned

Bug Description

Description
-----------
When a floating IP is attached to a VM, traffic destined for other nodes appears to come from the floating IP rather than the fixed IP. However, the ipsets created for remote security group rules do not include the floating IP address, so this traffic is blocked.
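
To illustrate the behaviour described above, here is a minimal sketch that models the ipset as a plain set of addresses. It is not Neutron code: the function name and all addresses are assumptions chosen for illustration, and a real ipset lookup happens in the kernel rather than in Python.

```python
import ipaddress

def is_allowed(src_ip: str, ipset_members: set[str]) -> bool:
    """Return True if src_ip equals one of the members of the modeled ipset."""
    src = ipaddress.ip_address(src_ip)
    return any(src == ipaddress.ip_address(member) for member in ipset_members)

# Hypothetical addresses, for illustration only.
FIXED_IP_A = "10.0.6.10"
FIXED_IP_B = "10.0.6.11"
FLOATING_IP_A = "203.0.113.5"

# The set built for the remote-group rule holds the members' fixed IPs only.
remote_group_ipset = {FIXED_IP_A, FIXED_IP_B}

fixed_source_allowed = is_allowed(FIXED_IP_A, remote_group_ipset)
floating_source_allowed = is_allowed(FLOATING_IP_A, remote_group_ipset)
```

In this model, a packet whose source is the fixed IP matches the set, while the same VM's traffic arriving with the floating IP as its source does not.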

Preconditions
-------------
- DVR is enabled

Reproduction steps
------------------
- Create a security group which allows traffic from other members of this security group
- Create two VMs with the aforementioned SG attached
- Ensure traffic from the two VMs can reach each other
- Create a floating IP and attach it to one of the VMs

Expected output
---------------
Traffic from the VM with the FIP attached can reach the other VM

Actual output
-------------
Traffic from the VM with the FIP attached cannot reach the other VM

Version
-------
OpenStack Zed

Revision history for this message
Adam Oswick (adamoswick) wrote :

It looks like the code that determines what is in the ipset is https://opendev.org/openstack/neutron/src/commit/208421910d2bb3c71b0947254d5eca1326c184d0/neutron/api/rpc/handlers/securitygroups_rpc.py#L379 (or this is at least one of the functions that does this).

The first option we considered was updating the _select_ips_for_remote_group function (linked above) to include any floating IP addresses that are associated with the fixed IPs. However, this might not work very well as there's no update triggered for the fixed IP port when a floating IP is attached. This means that the _select_ips_for_remote_group function won't run immediately and so it won't add the floating IP address until a separate update is triggered for the fixed IP port as a result of some other action elsewhere.
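
The first option and its drawback can be sketched roughly as follows. This is a simplified model, not the actual `_select_ips_for_remote_group` implementation: the `Port` class, the `select_ips_with_fips` function and the `floating_ips` mapping are all hypothetical names, and the addresses are made up.

```python
from dataclasses import dataclass, field

@dataclass
class Port:
    """Simplified stand-in for a Neutron port (not the real model)."""
    port_id: str
    fixed_ips: list[str]
    security_groups: list[str] = field(default_factory=list)

def select_ips_with_fips(ports, floating_ips, remote_group):
    """Collect fixed IPs of ports in remote_group, plus any FIPs mapped to them.

    floating_ips maps a fixed IP to its associated floating IP.
    """
    ips = set()
    for port in ports:
        if remote_group not in port.security_groups:
            continue
        for fixed_ip in port.fixed_ips:
            ips.add(fixed_ip)
            fip = floating_ips.get(fixed_ip)
            if fip is not None:
                ips.add(fip)
    return ips

ports = [Port("a", ["10.0.6.10"], ["sg1"]), Port("b", ["10.0.6.11"], ["sg1"])]
floating_ips = {}

# Computed when the ports were last updated: no floating IP exists yet.
cached = select_ips_with_fips(ports, floating_ips, "sg1")

# Associating a floating IP does not re-run the selection in this model,
# so the cached result stays stale until something else triggers an update.
floating_ips["10.0.6.10"] = "203.0.113.5"
stale = "203.0.113.5" not in cached

# Only a later recomputation picks up the floating IP.
refreshed = select_ips_with_fips(ports, floating_ips, "sg1")
```

The point of the sketch is the gap between `cached` and `refreshed`: the selection logic itself is easy to extend, but nothing causes it to run at the moment the floating IP is attached.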

Another option we considered leveraged allowed_address_pairs. The _select_ips_for_remote_group function checks the allowed_address_pairs on the fixed IP port and adds any IPs listed to the ipset. One thing we've tested locally is automatically adding the floating IP address to the allowed_address_pairs in https://opendev.org/openstack/neutron/src/commit/208421910d2bb3c71b0947254d5eca1326c184d0/neutron/db/l3_db.py#L1612 and this seems to work well.
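
A rough sketch of the allowed_address_pairs idea follows. The port dictionary mirrors the general shape of a port in the Neutron API (`fixed_ips`, `allowed_address_pairs`), but `associate_floating_ip` and `ips_for_ipset` are hypothetical helpers written for this example, not functions from `l3_db.py` or `securitygroups_rpc.py`, and the addresses are made up.

```python
def associate_floating_ip(port: dict, floating_ip: str) -> None:
    """Hypothetical hook: record the FIP as an allowed address pair on the port."""
    pairs = port.setdefault("allowed_address_pairs", [])
    # Avoid adding the same address twice if the association is repeated.
    if all(pair["ip_address"] != floating_ip for pair in pairs):
        pairs.append(
            {"ip_address": floating_ip, "mac_address": port["mac_address"]}
        )

def ips_for_ipset(port: dict) -> set[str]:
    """Fixed IPs plus allowed-address-pair IPs, as the comment above describes."""
    ips = {fixed["ip_address"] for fixed in port["fixed_ips"]}
    ips.update(pair["ip_address"] for pair in port.get("allowed_address_pairs", []))
    return ips

port = {
    "id": "example-port",  # hypothetical identifier
    "mac_address": "fa:16:3e:00:00:01",
    "fixed_ips": [{"ip_address": "10.0.6.10"}],
}

before = ips_for_ipset(port)
associate_floating_ip(port, "203.0.113.5")
after = ips_for_ipset(port)
```

Because the selection already reads the allowed address pairs, no change to the selection logic is needed in this model; the floating IP shows up as soon as the port's pairs are updated.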

If this sounds suitable then please let me know and I can put together a PR for that.

EDIT: Technically, this may not be a bug as enabling port security on the port associated with the FIP and adding the relevant security groups does mean that the FIP is added to the ipset list. However, this doesn't seem ideal as:
- The methods used to attach FIPs (e.g. Horizon, OpenStack CLI) don't enable port security on the FIP nor do they attach the security groups used by the fixed IP port
- Enabling port security and attaching security groups to the FIP's port seems like unnecessary overhead

tags: added: l3-dvr-backlog
Dr. Jens Harbott (j-harbott) wrote :

Please add some more detail about your setup. Are the VMs in the same network? Then I don't understand what " - Ensure traffic from the two VMs can reach each other" is. If not, please describe your network setup in a reproducible way.

Changed in neutron:
status: New → Incomplete
Dr. Jens Harbott (j-harbott) wrote :

Oh, also please state the exact version of neutron that you are using. There was a broken patch that got reverted which might be related: https://review.opendev.org/c/openstack/neutron/+/875644

Adam Oswick (adamoswick) wrote :

Apologies for any missing information.

"are the VMs in the same network?"

Yes, they are. I think the subnet is more relevant than the network here. The only time we wouldn't see this problem is if the VMs are in the same subnet AND are on the same host (as in this case the destination VM sees traffic from the fixed IP rather than NATed via the floating IP).

"Then I don't understand what " - Ensure traffic from the two VMs can reach each other" is."

When I say "ensure traffic from the two VMs can reach each other", an example would be whether instance A can `ping` instance B.

"state the exact version of neutron that you are using"

We appear to be running 21.0.1. While I don't think this includes the patch you've linked, I believe this isn't related to the problem. We've had this documented locally for months now (but we haven't had the time to actually look into it), so I think it long pre-dates the original broken patch you've linked.

Just to clarify, we have identified the source of this problem. There simply isn't logic in https://opendev.org/openstack/neutron/src/commit/208421910d2bb3c71b0947254d5eca1326c184d0/neutron/api/rpc/handlers/securitygroups_rpc.py#L379 to return any attached FIPs when creating a list for ipset.

That means that the floating IPs are never added to the ipsets on the hypervisors that contain the IPs allowed by remote security group rules. As a result, when the hypervisor sees traffic from the VM NATed via the floating IP, it is not allowed. If we manually add the attached floating IP to the ipset on the hypervisor (e.g. `ipset add $SET $FIP`), the issue goes away and the remote security group rule works as expected.
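
For anyone wanting to script the manual workaround above, a minimal sketch might look like this. It only wraps the `ipset add $SET $FIP` command quoted in the comment; the helper names, the set name `EXAMPLE_SET` and the address are assumptions, and the real set name has to be looked up on the hypervisor.

```python
import ipaddress
import subprocess

def ipset_add_command(set_name: str, floating_ip: str) -> list[str]:
    """Build the argv for `ipset add <set> <ip>` after basic validation."""
    if not set_name:
        raise ValueError("set_name must not be empty")
    ipaddress.ip_address(floating_ip)  # raises ValueError on a malformed address
    return ["ipset", "add", set_name, floating_ip]

def add_fip_to_ipset(set_name: str, floating_ip: str, dry_run: bool = True) -> list[str]:
    """Return the command and, unless dry_run is set, execute it (needs root)."""
    cmd = ipset_add_command(set_name, floating_ip)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd

# Dry run with hypothetical values: nothing is executed.
planned = add_fip_to_ipset("EXAMPLE_SET", "203.0.113.5")
```

The dry-run default is deliberate, so the command can be inspected before anything on the hypervisor is changed.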

I'm hoping the above explanation is adequate on its own but if you think more detailed replication steps are necessary then I can do that.

Dr. Jens Harbott (j-harbott) wrote :

I still think that the issue is that the traffic does get natted in the first place. The security group then blocking the traffic coming from the FIP is working as designed IMO. Waiting for other neutron ppl to chime in though.

Bence Romsics (bence-romsics) wrote :

Sorry for the very slow reaction, but I finally had time to work with this.

I tried to reproduce this bug but I couldn't. As Jens wrote, the traffic was not NAT-ted.

Please tell us more about how to reproduce this error. Until then, I'm marking this as incomplete.

* What neutron backend do you use?
* In what agent_mode do you use dvr (on each host)?
* Any other relevant neutron configuration?
* The exact commands to create the virtual networks, subnets, routers, floating IPs.

I tried with a two-host devstack environment in which I had:
* 1 all-in-one host: devstack0 (agent_mode=dvr_snat)
* 1 compute host: devstack0a (agent_mode=dvr)
The backend was ml2/ovs.

openstack net create net1 --external --provider-network-type vlan --provider-segment 100 --provider-physical-network physnet1
openstack subnet create subnet1-v4 --ip-version 4 --network net1 --subnet-range 10.0.5.0/24

openstack net create net2
openstack subnet create subnet2-v4 --ip-version 4 --network net2 --subnet-range 10.0.6.0/24

openstack net create net3
openstack subnet create subnet3-v4 --ip-version 4 --network net3 --subnet-range 10.0.7.0/24

openstack router create router0
openstack router set --external-gateway net1 router0
openstack router add subnet router0 subnet2-v4
openstack router add subnet router0 subnet3-v4

openstack server create --flavor cirros256 --image cirros-0.5.2-x86_64-disk --nic net-id=net2 --availability-zone :devstack0 vm0 --wait
openstack server create --flavor cirros256 --image cirros-0.5.2-x86_64-disk --nic net-id=net3 --availability-zone :devstack0a vm0a --wait

openstack server show vm0 -f yaml -c addresses
# login to vm0a
sudo virsh console "$( openstack server show vm0a -f value -c OS-EXT-SRV-ATTR:instance_name )"
# ping vm0 from vm0a: works
ping ...
# cirros does not have tcpdump, so I'm snooping the traffic from outside, from devstack0
sudo tcpdump -ni $( sudo virsh dumpxml "$( openstack server show vm0 -f value -c OS-EXT-SRV-ATTR:instance_name )" | xmlstarlet sel -t -v '//interface[1]/target/@dev' )

# At this point nothing is NAT-ted:
12:55:05.774873 IP 10.0.7.125 > 10.0.6.114: ICMP echo request, id 59649, seq 31, length 64
12:55:05.775681 IP 10.0.6.114 > 10.0.7.125: ICMP echo reply, id 59649, seq 31, length 64

# Then I create the floating IP and restart the ping.
openstack floating ip create net1 --port $( openstack port list --server vm0a -f value -c id )

# And still nothing is NAT-ted.
12:55:54.911921 IP 10.0.7.125 > 10.0.6.114: ICMP echo request, id 59905, seq 6, length 64
12:55:54.912785 IP 10.0.6.114 > 10.0.7.125: ICMP echo reply, id 59905, seq 6, length 64
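
The check done by eye on the captures above could be automated along these lines. This is a small sketch, assuming tcpdump output in the ICMP form shown in this comment (no port suffix on the addresses); the function names are hypothetical, and the subnet ranges are the ones used in the commands above.

```python
import ipaddress
import re

# Matches lines such as "... IP 10.0.7.125 > 10.0.6.114: ICMP echo request ...".
# TCP/UDP lines carry a port suffix on each address and are not handled here.
LINE_RE = re.compile(r"IP (\d+\.\d+\.\d+\.\d+) > (\d+\.\d+\.\d+\.\d+):")

def parse_src_dst(line: str):
    """Return (source, destination) from a tcpdump line, or None if no match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)

def source_in_subnet(line: str, subnet: str) -> bool:
    """True if the packet's source address lies inside the given subnet."""
    parsed = parse_src_dst(line)
    if parsed is None:
        raise ValueError("unrecognised tcpdump line")
    return ipaddress.ip_address(parsed[0]) in ipaddress.ip_network(subnet)

line = ("12:55:54.911921 IP 10.0.7.125 > 10.0.6.114: "
        "ICMP echo request, id 59905, seq 6, length 64")

# Source still in subnet3-v4 (10.0.7.0/24) means the request was not NAT-ted.
not_natted = source_in_subnet(line, "10.0.7.0/24")
# A source inside the external subnet1-v4 range would suggest it was.
from_external = source_in_subnet(line, "10.0.5.0/24")
```

Note that a source outside the tenant subnet is only a hint of NAT; confirming it would still require comparing against the actual floating IP address.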

Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired