DHCP agent fails to fully configure DHCP namespaces because of duplicate address detected
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | High | Bence Romsics |
Bug Description
After upgrading a Neutron/ML2 OVS deployment from Ussuri to Victoria, updating the host OS from CentOS Linux 8 to CentOS Stream 8, and rebooting, DHCP was not functional on some but not all networks.
DHCP agent logs included the following error multiple times:
2021-11-30 17:05:35.475 7 ERROR neutron.
The tap interface inside each affected qdhcp namespace was in a state like this:
35: tap0f8bb343-c1: <BROADCAST,
link/ether fa:16:3e:ed:6f:60 brd ff:ff:ff:ff:ff:ff
inet 169.254.169.254/32 brd 169.254.169.254 scope global tap0f8bb343-c1
valid_lft forever preferred_lft forever
inet 10.18.0.10/16 brd 10.18.255.255 scope global tap0f8bb343-c1
valid_lft forever preferred_lft forever
inet6 fe80::a9fe:a9fe/64 scope link dadfailed tentative
valid_lft forever preferred_lft forever
inet6 fe80::f816:
valid_lft forever preferred_lft forever
Note the dadfailed status on the fe80::a9fe:a9fe/64 address, which caused Neutron to raise an AddressNotReady exception.
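To check many namespaces for this state, the `ip addr` output can be scanned for the `dadfailed` flag. The following is a minimal sketch that only parses text in the format shown above; the function name `find_dadfailed` and the sample string are illustrative and not part of Neutron.

```python
import re

# Matches an inet6 line of `ip addr` output and captures the address
# plus the trailing words (scope and flags such as "dadfailed").
INET6_RE = re.compile(r"^\s*inet6\s+(?P<addr>\S+)\s+(?P<rest>.*)$")


def find_dadfailed(ip_addr_output):
    """Return the IPv6 addresses flagged `dadfailed` in `ip addr` text output."""
    failed = []
    for line in ip_addr_output.splitlines():
        match = INET6_RE.match(line)
        if match and "dadfailed" in match.group("rest").split():
            failed.append(match.group("addr"))
    return failed


# Sample taken from the interface state quoted in this report.
SAMPLE = """\
    inet 169.254.169.254/32 brd 169.254.169.254 scope global tap0f8bb343-c1
       valid_lft forever preferred_lft forever
    inet6 fe80::a9fe:a9fe/64 scope link dadfailed tentative
       valid_lft forever preferred_lft forever
"""

print(find_dadfailed(SAMPLE))  # prints ['fe80::a9fe:a9fe/64']
```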
I tried restarting dhcp-agent multiple times. Occasionally DHCP for one network would configure correctly, but most of the time the list of affected networks stayed the same.
I found out that removing the fe80::a9fe:a9fe/64 address from the tap interface of each affected namespace followed by restarting dhcp-agent fixed the issue: no more dadfailed status.
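The manual workaround above can be scripted roughly as follows. This is a sketch under assumptions: it relies on the usual `qdhcp-<network id>` namespace naming, the helper name `build_cleanup_command` is hypothetical, and `NETWORK_ID` is a placeholder to be replaced with the real network UUID. It only builds the command and does not run it; dhcp-agent still needs a restart afterwards.

```python
import shlex

# Link-local metadata address reported as `dadfailed` in this bug.
METADATA_V6 = "fe80::a9fe:a9fe/64"


def build_cleanup_command(network_id, tap_device, address=METADATA_V6):
    """Build the `ip` command that deletes `address` from the tap device
    inside the DHCP namespace of the given network (assumed to be named
    qdhcp-<network id>)."""
    namespace = "qdhcp-" + network_id
    return [
        "ip", "netns", "exec", namespace,
        "ip", "-6", "addr", "del", address, "dev", tap_device,
    ]


# NETWORK_ID is a placeholder, not a value from this report.
cmd = build_cleanup_command("NETWORK_ID", "tap0f8bb343-c1")
print(shlex.join(cmd))
```

Printing the command first, instead of executing it, makes it easy to review the list of affected namespaces before changing anything; the list can then be passed to `subprocess.run(cmd, check=True)` as root.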
Version information:
* OpenStack Victoria deployed with Kolla source images
* neutron 17.2.2.dev70 (using stable/victoria from 2021-11-28)
* CentOS Stream release 8
* Linux kernel 4.18.0-
summary:
- DHCP agent fails to configure DHCP namespaces because of duplicate
+ DHCP agent fails to fully configure DHCP namespaces because of duplicate address detected
Changed in neutron:
assignee: nobody → Bence Romsics (bence-romsics)
Changed in neutron:
status: Confirmed → In Progress
Changed in neutron:
status: In Progress → Fix Released
So this happens only when you have more than one DHCP agent, correct? Using isolated subnets?
It looks like an oversight when we added support for metadata over IPv6, since using the same link-local address on multiple nodes will fail in DAD as you show above.
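To illustrate why every agent ends up with the same address: the low 32 bits of fe80::a9fe:a9fe are numerically equal to the IPv4 metadata address 169.254.169.254 (0xa9 = 169, 0xfe = 254), so the address is fixed rather than derived from the interface's MAC. A small sketch using only the Python standard library, with an illustrative helper name `embedded_ipv4`:

```python
import ipaddress


def embedded_ipv4(v6_text):
    """Return the low 32 bits of an IPv6 address rendered as an IPv4 address."""
    v6 = ipaddress.IPv6Address(v6_text)
    return str(ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF))


metadata_v6 = ipaddress.IPv6Address("fe80::a9fe:a9fe")

print(metadata_v6.is_link_local)          # prints True
print(embedded_ipv4("fe80::a9fe:a9fe"))   # prints 169.254.169.254
```

Because the address is identical on every DHCP agent attached to the same network, the second agent to configure it fails duplicate address detection.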
Just thinking out loud there might be a couple of options:
1) Neutron tells only one DHCP agent to configure the IPv6 metadata address. It reduces availability, and there might be some edge cases, but could work.
2) We change to use an Anycast address, in which case only one of the nodes will get the request. But this is more complicated because a) Anycast addresses are only supposed to be configured on routers (which don't exist here); and b) IANA assigns Anycast addresses, see https://www.iana.org/assignments/ipv6-anycast-addresses/ipv6-anycast-addresses.xhtml
A quick fix for you would be to set this in neutron.conf:
dhcp_agents_per_network = 1