DHCP agent scheduler filtering ignored when agent service restarted

Bug #1964765 reported by Andrew Bonney
This bug affects 1 person
Affects: neutron
Status: New
Importance: Wishlist
Assigned to: Unassigned
Milestone: (not set)

Bug Description

We have a Xena deployment based on Linux bridge networking, with a mix of Neutron-managed VXLAN networks and routed/L3 provider networks.

neutron-dhcp-agent services are deployed to a set of network nodes in order to serve the VXLAN networks. These services are also deployed to a subset of compute nodes which are the only hosts with access to the routed provider networks.

We would like the network nodes to be used in preference to the compute nodes when DHCP instances are created for VXLAN tenant networks. We wrote a patch to achieve this by modifying the DHCP scheduler filtering (https://github.com/bbc/neutron/commit/54b135e1412d83acb68995a71ed864118162ab01), and it works correctly when networks are first created. However, if a network has a DHCP agent/port removed at any point and a DHCP service on one of the compute nodes is subsequently restarted, the filters are ignored and the compute node takes over responsibility for that network.

As far as I can tell, when the DHCP agent service starts up it requests details of the active networks. On seeing a network it can reach which doesn't have the correct number of agents deployed, it takes over this responsibility, bypassing the scheduling filters. The existing filters, which ensure a DHCP service has physical access to a given L3 segment, appear to work because there is a second implementation of that filtering in https://github.com/openstack/neutron/blob/master/neutron/api/rpc/handlers/dhcp_rpc.py#L212.
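
To make the reported behaviour concrete, here is a minimal, self-contained sketch of the two code paths as the report describes them. It is a toy model, not Neutron code: the host names, the `network-` prefix pattern, the function names and the target of two agents per network are all assumptions made for illustration, and the real patch's regular expression is not reproduced here.

```python
import re
from dataclasses import dataclass, field

# Hypothetical naming convention for network nodes (assumption, not from the patch).
NETWORK_NODE_PATTERN = re.compile(r"^network-")


@dataclass
class Network:
    name: str
    agents: list = field(default_factory=list)  # hosts currently serving DHCP


def prefer_network_nodes(candidates):
    """Custom filter: keep only network nodes when at least one is available."""
    preferred = [h for h in candidates if NETWORK_NODE_PATTERN.match(h)]
    return preferred or candidates


def schedule_on_create(network, live_hosts, agents_per_network=2):
    """Path 1: the scheduler applies the filter when the network is created."""
    unused = [h for h in live_hosts if h not in network.agents]
    candidates = prefer_network_nodes(unused)
    needed = max(agents_per_network - len(network.agents), 0)
    network.agents.extend(candidates[:needed])


def auto_schedule_on_agent_start(network, host, agents_per_network=2):
    """Path 2: a restarting agent claims an under-served network, with no filter."""
    if host not in network.agents and len(network.agents) < agents_per_network:
        network.agents.append(host)


hosts = ["network-1", "network-2", "compute-1"]
net = Network("tenant-vxlan")

schedule_on_create(net, hosts)
agents_after_create = list(net.agents)   # only network nodes were chosen

net.agents.remove("network-2")           # a DHCP agent/port is removed later
auto_schedule_on_agent_start(net, "compute-1")  # compute node's agent restarts
print(agents_after_create, net.agents)
```

In this model the filter only influences the first path, so the compute node ends up serving the VXLAN network after its agent restarts, which matches the symptom described above.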

I'd appreciate a second opinion: if I change only the DHCP scheduler filtering, is it reasonable to expect its rules to apply after initial network/subnet creation as well? If there is an architectural reason the filters don't get re-used, would my best option be to add further patching in dhcp_rpc to achieve the desired behaviour? Thanks

Tags: rfe
Akihiro Motoki (amotoki)
tags: added: rfe
Changed in neutron:
importance: Undecided → Wishlist
Akihiro Motoki (amotoki) wrote :

This is a kind of enhancement request, so I added the "rfe" tag for discussion.

Oleg Bondarev (obondarev) wrote :

If you set "network_auto_schedule = false" in the config, Neutron will skip network scheduling on DHCP agent startup. Will that work for your case?

Andrew Bonney (andrewbonney) wrote :

I did try this, but it also appeared to stop DHCP agents/ports being scheduled to subnets when new networks are created, which was undesirable.

The other concern I had with changing this parameter was that if we had to replace a host which had previously provided DHCP to a set of networks, I believe the new host would have to be manually instructed to act as a DHCP agent for these networks, rather than being able to automatically assume the role.

Lajos Katona (lajos-katona) wrote :

We discussed this topic at the drivers meeting:
https://meetings.opendev.org/meetings/neutron_drivers/2022/neutron_drivers.2022-03-18-14.00.log.html#l-70

Summary: "allow new flag for filters to execute them on agent restart, and set this to False for already existing filters". This way we keep how the current filters work, but new filters can opt in to being executed at agent startup, which covers your case.
Does that work for you?

We understood that you have no intention to upstream your dhcp scheduler changes, is that correct?
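
As a rough illustration of the flag proposed in the summary above, the sketch below shows filters carrying a per-filter opt-in that defaults to False. This is not Neutron's scheduler API: the class, the attribute name `run_on_agent_restart`, and the example filters are all hypothetical, since the meeting summary does not name the flag.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SchedulerFilter:
    name: str
    apply: Callable[[List[str]], List[str]]
    # Hypothetical flag name; defaults to False so existing filters keep
    # their current behaviour and are skipped on agent restart.
    run_on_agent_restart: bool = False


def run_filters(filters, candidates, on_agent_restart):
    """Apply each filter in order, skipping non-opted-in ones on agent restart."""
    for f in filters:
        if on_agent_restart and not f.run_on_agent_restart:
            continue
        candidates = f.apply(candidates)
    return candidates


# An already existing filter (illustrative only): excludes one host.
existing = SchedulerFilter(
    "existing", lambda hosts: [h for h in hosts if h != "compute-2"]
)
# A new filter that opts in: prefer network nodes, fall back to all hosts.
prefer_network = SchedulerFilter(
    "prefer-network-nodes",
    lambda hosts: [h for h in hosts if h.startswith("network-")] or hosts,
    run_on_agent_restart=True,
)

filters = [existing, prefer_network]
mixed = ["network-1", "compute-1", "compute-2"]
computes_only = ["compute-1", "compute-2"]

on_create = run_filters(filters, mixed, on_agent_restart=False)
on_restart = run_filters(filters, mixed, on_agent_restart=True)
create_fallback = run_filters(filters, computes_only, on_agent_restart=False)
restart_fallback = run_filters(filters, computes_only, on_agent_restart=True)
print(on_create, on_restart, create_fallback, restart_fallback)
```

With a default of False, only filters that explicitly opt in change what happens at agent startup, so the behaviour of filters written before the flag existed is preserved.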

Andrew Bonney (andrewbonney) wrote :

This sounds like a good solution, thank you.

I have no immediate plans to upstream the filter patch (unless it would be useful). I suspect there would be a better solution than using a regular expression, but this might involve further interaction with the database which I was avoiding for simplicity in testing.
