Comment 24 for bug 1953165

Bence Romsics (bence-romsics) wrote :

Before I review Brian's new patch I wanted to better understand how IPv4 metadata works. Let me document my findings here:

I created a test environment with 3 hosts and dhcp_agents_per_network=2:

devstack0 - dhcp agent serving net1, dhcp port address: 10.0.4.2
devstack0a - dhcp agent serving net1, dhcp port address: 10.0.4.3
devstack0b - vm0 booted on net1

Each dhcp server pushes a metadata route via its own dhcp port address:

devstack0 $ sudo ip netns exec qdhcp-$( openstack net show net1 -f value -c id ) cat /opt/stack/data/neutron/dhcp/$( openstack net show net1 -f value -c id )/opts
tag:subnet-b4f511be-e5de-46ab-b0fb-d6276797fd6c,option6:domain-search,openstacklocal
tag:subnet-dfa281b1-f0a3-4425-a972-45ce80c5f4d5,option:classless-static-route,169.254.169.254/32,10.0.4.2,0.0.0.0/0,10.0.4.1
tag:subnet-dfa281b1-f0a3-4425-a972-45ce80c5f4d5,249,169.254.169.254/32,10.0.4.2,0.0.0.0/0,10.0.4.1
tag:subnet-dfa281b1-f0a3-4425-a972-45ce80c5f4d5,option:router,10.0.4.1
tag:subnet-b4f511be-e5de-46ab-b0fb-d6276797fd6c,option6:dns-server,[2001:db8::2],[2001:db8::1]

devstack0a $ sudo ip netns exec qdhcp-$( openstack net show net1 -f value -c id ) cat /opt/stack/data/neutron/dhcp/$( openstack net show net1 -f value -c id )/opts
tag:subnet-b4f511be-e5de-46ab-b0fb-d6276797fd6c,option6:domain-search,openstacklocal
tag:subnet-dfa281b1-f0a3-4425-a972-45ce80c5f4d5,option:classless-static-route,169.254.169.254/32,10.0.4.3,0.0.0.0/0,10.0.4.1
tag:subnet-dfa281b1-f0a3-4425-a972-45ce80c5f4d5,249,169.254.169.254/32,10.0.4.3,0.0.0.0/0,10.0.4.1
tag:subnet-dfa281b1-f0a3-4425-a972-45ce80c5f4d5,option:router,10.0.4.1
tag:subnet-dfa281b1-f0a3-4425-a972-45ce80c5f4d5,option:dns-server,10.0.4.2,10.0.4.3
tag:subnet-b4f511be-e5de-46ab-b0fb-d6276797fd6c,option6:dns-server,[2001:db8::1],[2001:db8::2]
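
To compare the two agents quickly, the metadata next hop can be pulled out of an opts file. This is only a minimal sketch, assuming the opts line format shown above; the function name metadata_next_hop is made up for illustration and is not part of neutron or dnsmasq:

```shell
# metadata_next_hop: read dnsmasq opts lines on stdin and print the next hop
# advertised for 169.254.169.254/32 in the classless-static-route option.
# Assumption: fields are comma separated and routes come as
# destination,next-hop pairs starting at field 3, as in the output above.
metadata_next_hop() {
    awk -F, '$2 == "option:classless-static-route" {
        for (i = 3; i < NF; i += 2)
            if ($i == "169.254.169.254/32") print $(i + 1)
    }'
}
```

Fed with the opts files above, this should print 10.0.4.2 on devstack0 and 10.0.4.3 on devstack0a.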

The freshly booted vm0 has this routing table:

$ ip r
default via 10.0.4.1 dev eth0
10.0.4.0/24 dev eth0 scope link src 10.0.4.185
169.254.169.254 via 10.0.4.3 dev eth0
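
The rest of this comment keeps checking which dhcp port the guest uses for metadata. A small helper along these lines can read that from the routing table; it is a sketch that assumes the "ip r" output format shown above, and the name metadata_via is hypothetical:

```shell
# metadata_via: read "ip r" output on stdin and print the gateway used for
# the metadata host route (169.254.169.254 via <gateway> ...).
# Prints nothing when there is no such route.
metadata_via() {
    awk '$1 == "169.254.169.254" && $2 == "via" { print $3 }'
}
```

In the guest, "ip r | metadata_via" would print 10.0.4.3 for the table above.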

The metadata address replies to ping:
$ ping -c3 169.254.169.254
PING 169.254.169.254 (169.254.169.254): 56 data bytes
64 bytes from 169.254.169.254: seq=0 ttl=64 time=1.980 ms
64 bytes from 169.254.169.254: seq=1 ttl=64 time=3.646 ms
64 bytes from 169.254.169.254: seq=2 ttl=64 time=1.778 ms

--- 169.254.169.254 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 1.778/2.468/3.646 ms

tcpdump in the dhcp namespaces on the dhcp port's tap interface confirms that traffic goes to devstack0a only - as expected.

Let's change the route in the guest:
$ ip r del 169.254.169.254
$ ip r add 169.254.169.254 via 10.0.4.2

# ping still works
$ ping -c 3 169.254.169.254
PING 169.254.169.254 (169.254.169.254): 56 data bytes
64 bytes from 169.254.169.254: seq=0 ttl=64 time=2.094 ms
64 bytes from 169.254.169.254: seq=1 ttl=64 time=2.048 ms
64 bytes from 169.254.169.254: seq=2 ttl=64 time=1.815 ms

--- 169.254.169.254 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 1.815/1.985/2.094 ms

tcpdump confirms that all traffic goes to devstack0.

NOTE: Both dhcp namespaces answer for 169.254.169.254 on the same network, which contradicts the idea that Linux performs any duplicate address detection (DAD) for IPv4 link-local (LL) addresses. It is hard to observe anything directly here. My previous experiments were compatible with Linux performing IPv4 LL DAD (as it should for RFC compliance), but now I'm starting to change my opinion and believe there's no IPv4 LL DAD in Linux.

Let's drastically lower the dhcp lease time so we can observe a "failover":
dhcp_agent.ini:
[DEFAULT]
dhcp_lease_duration = 60

# Reboot vm0 to pick up the new lease time. This time we start with this routing table:
vm0 $ ip r
default via 10.0.4.1 dev eth0
10.0.4.0/24 dev eth0 scope link src 10.0.4.205
169.254.169.254 via 10.0.4.2 dev eth0

devstack0 $ sudo systemctl stop devstack@q-dhcp ; sudo killall -r dnsmasq

When vm0 gets a new lease from the other dhcp server, the metadata route changes. I believe the dhcp client should delete the previous route when the new lease does not contain it, and that is what happens here:
$ ip r
default via 10.0.4.1 dev eth0
10.0.4.0/24 dev eth0 scope link src 10.0.4.205
169.254.169.254 via 10.0.4.3 dev eth0

And the metadata address keeps responding to ping.

I would not really call this highly available, or a failover, when the default dhcp lease time is a day, but we have at least some recovery.
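
To put rough numbers on the recovery time: this sketch assumes the client uses the RFC 2131 default timers (renew at T1 = 50% of the lease, rebind by broadcast at T2 = 87.5%) and that the server does not override them. Neither assumption was verified in the test above, and the function name lease_timers is hypothetical:

```shell
# lease_timers LEASE_SECONDS: print the renew (T1) and rebind (T2) times in
# seconds, using the RFC 2131 defaults T1 = lease / 2 and T2 = lease * 7 / 8.
# Shell integer arithmetic rounds the results down.
lease_timers() {
    lease=$1
    echo "T1=$(( lease / 2 )) T2=$(( lease * 7 / 8 ))"
}
```

Under those assumptions "lease_timers 60" gives T1=30 T2=52, while "lease_timers 86400" gives T1=43200 T2=75600. So with the one-day default a client whose dhcp server died would only start asking other servers after about 21 hours.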