Nova assumptions about /32 routes to NS' break name resolution under DHCP

Bug #1944083 reported by Boris Lukashev
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
OpenStack Compute (nova) | Incomplete | Undecided | Unassigned |
neutron | New | Undecided | Unassigned |

Bug Description

We run Designate out of a private VLAN which is accessible via one of the two external networks in our Wallaby cloud. To let instances resolve names via those endpoints, we add a route to the subnet in that private VLAN via the 2nd router added to each network, the external network of which is our OutsidePrivate net (the external network which resides inside the DC, vs. our OutsidePublic, which is a VLAN to the actual WAN).
Unfortunately, despite setting up this 2nd router and explicit route, we see nova instances coming up with an explicit /32 route to each DNS server specified _via the .1 gateway_ in the network which is the router to OutsidePrivate. This happens despite an explicit route to the /24 (I know routing prefers the smallest matching subnet), which should be understood to encapsulate the 3 IPs of the NS' themselves and prevent the /32 routes from being created.
Even setting explicit /32 routes to each NS via the 2nd gateway @ .2 doesn't work - the original /32s via the .1 are still present, and the only fix we've found is to force nodes to static addressing and routing via cloud-init. ICMP redirect from the primary gateway to the secondary is hit-or-miss, and not how this should work anyway.
I've not found anything in the docs about how these default routes via the primary gateway are set up, and have therefore found no way to disable them, so I'm filing this as a bug since it's a major impediment to anyone resolving names via any gateway but the one set as the default gateway for the network.
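The route preference described above can be illustrated with a minimal longest-prefix-match sketch. All addresses here are hypothetical (a tenant network 10.0.0.0/24 with routers at .1 and .2, name servers in 192.168.50.0/24), and `pick_route` is an illustrative helper, not part of nova or neutron. It only models prefix length, not metrics or other kernel tie-breakers.

```python
import ipaddress


def pick_route(routes, destination):
    """Return the (cidr, gateway) pair with the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(cidr), gateway)
        for cidr, gateway in routes
        if dest in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        return None
    # The most specific prefix (largest prefix length) wins.
    net, gateway = max(candidates, key=lambda item: item[0].prefixlen)
    return str(net), gateway


# Hypothetical routing table as seen inside an instance.
routes = [
    ("0.0.0.0/0", "10.0.0.1"),         # default route via the primary gateway
    ("192.168.50.0/24", "10.0.0.2"),   # explicit route via the second router
    ("192.168.50.10/32", "10.0.0.1"),  # unwanted host route to one name server
]

chosen = pick_route(routes, "192.168.50.10")
other = pick_route(routes, "192.168.50.99")
print(chosen)
print(other)
```

In this sketch the name server at 192.168.50.10 is reached via 10.0.0.1 because the /32 is more specific than the /24, while any other address in the /24 goes via 10.0.0.2. That matches the symptom reported here: the /24 route cannot win as long as the host routes exist.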

sean mooney (sean-k-mooney) wrote :

I'm not sure that nova is in control of this.
This seems like an issue with DHCP, most likely?

I don't think nova actually sets /32 routes for the gateways itself.

sean mooney (sean-k-mooney) wrote :

Looking at the metadata code, this is where we generate the DNS info:

https://github.com/openstack/nova/blob/50fdbc752a9ca9c31488140ef2997ed59d861a41/nova/virt/netutils.py#L103-L121

        if subnet_v4:
            if subnet_v4.get_meta('dhcp_server') is not None:
                continue

            if subnet_v4['ips']:
                ip = subnet_v4['ips'][0]
                address = ip['address']
                netmask = model.get_netmask(ip, subnet_v4)
                if subnet_v4['gateway']:
                    gateway = subnet_v4['gateway']['address']
                broadcast = str(subnet_v4.as_netaddr().broadcast)
                dns = ' '.join([i['address'] for i in subnet_v4['dns']])
                for route_ref in subnet_v4['routes']:
                    (net, mask) = get_net_and_mask(route_ref['cidr'])
                    route = {'gateway': str(route_ref['gateway']['address']),
                             'cidr': str(route_ref['cidr']),
                             'network': net,
                             'netmask': mask}
                    routes.append(route)

so we just take the info from neutron and more or less use it directly
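To make the point concrete, here is a simplified, self-contained sketch of the same loop. It is not the nova code: the subnet is a plain dict with hypothetical addresses, and `get_net_and_mask` is a stand-in written with the standard `ipaddress` module that is only assumed to behave like the helper used in the excerpt above.

```python
import ipaddress


def get_net_and_mask(cidr):
    """Stand-in helper (assumption): split a CIDR into network and netmask strings."""
    net = ipaddress.ip_network(cidr, strict=False)
    return str(net.network_address), str(net.netmask)


def build_routes(subnet):
    """Copy the subnet's own routes into the output format, adding nothing else."""
    routes = []
    for route_ref in subnet["routes"]:
        net, mask = get_net_and_mask(route_ref["cidr"])
        routes.append({
            "gateway": str(route_ref["gateway"]["address"]),
            "cidr": str(route_ref["cidr"]),
            "network": net,
            "netmask": mask,
        })
    return routes


# Hypothetical subnet: two DNS servers and one explicit route via a second router.
subnet = {
    "dns": [{"address": "192.168.50.10"}, {"address": "192.168.50.11"}],
    "routes": [{"cidr": "192.168.50.0/24", "gateway": {"address": "10.0.0.2"}}],
}

result = build_routes(subnet)
print(result)
```

In this sketch the DNS entries never feed into the route list: one configured route goes in and one route comes out, with no /32 entries derived from the name servers.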

this results in a JSON blob that looks like this
https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/metadata-service-network-info.html#rest-api-impact

which can be used by cloud-init to configure networking

 { // Standard VM VIF networking
        "id": "private-ipv4",
        "type": "ipv4",
        "link": "interface0",
        "ip_address": "10.184.0.244",
        "netmask": "255.255.240.0",
        "dns_nameservers": [
            "69.20.0.164",
            "69.20.0.196"
        ],
        "routes": [
            {
                "network": "10.0.0.0",
                "netmask": "255.0.0.0",
                "gateway": "11.0.0.1"
            },
            {
                "network": "0.0.0.0",
                "netmask": "0.0.0.0",
                "gateway": "23.253.157.1"
            }
        ],
        "network_id": "da5bb487-5193-4a65-a3df-4a0055a8c0d7"
    },

We are not going to add any /32 routes to the name servers, and will only populate the "routes" section if the subnet had additional routes populated in neutron.

I suspect the issue you are facing is in the neutron DHCP server or possibly cloud-init, as nova is not going to actively push any route information to the VM that would install a /32 directly.
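One way to check this from the guest side is to inspect the network entry that the metadata service hands out and look for host routes pointing at the name servers. The sketch below assumes the entry has the shape shown in the JSON example above; `host_routes_to_nameservers` is a hypothetical helper and the addresses are made up for illustration.

```python
import json


def host_routes_to_nameservers(network):
    """Return routes with a /32 netmask whose destination is a listed name server."""
    dns = set(network.get("dns_nameservers", []))
    return [
        route for route in network.get("routes", [])
        if route.get("netmask") == "255.255.255.255"
        and route.get("network") in dns
    ]


# Hypothetical network entry in the format of the example above.
network = json.loads("""
{
    "id": "private-ipv4",
    "type": "ipv4",
    "ip_address": "10.0.0.15",
    "netmask": "255.255.255.0",
    "dns_nameservers": ["192.168.50.10", "192.168.50.11"],
    "routes": [
        {"network": "192.168.50.0", "netmask": "255.255.255.0", "gateway": "10.0.0.2"},
        {"network": "0.0.0.0", "netmask": "0.0.0.0", "gateway": "10.0.0.1"}
    ]
}
""")

suspicious = host_routes_to_nameservers(network)
print(suspicious)
```

An empty result for the real blob would support the analysis above that the /32 routes are installed by something other than nova's metadata, such as the DHCP exchange or the guest's own network tooling.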

Changed in nova:
status: New → Incomplete
Boris Lukashev (rageltman) wrote :

Thanks for digging into this. If I'm reading that correctly, this bug actually belongs in Neutron then? Is there a way to move bugs between projects so that your analysis is retained to support relocating the ticket? The effect is seen on nova instances and I foolishly assumed that it was nova allocating the DNS routes due to its designate integration, so it might be handy to have some supporting analysis for problem localization.

Dr. Jens Harbott (j-harbott) wrote :

Can you please show some commands that would allow us to reproduce this issue, like those for setting up your networks, subnets and routers? Also, are you using OVS or OVN? On which OpenStack version?

Boris Lukashev (rageltman) wrote :

The configuration can be replicated via the shell using `route add <NS-IP> gw <DEF-GW-IP>` for every DNS server specified.
I'm seeing the same behavior going back to Mitaka at least, possibly even Liberty.
Our clouds have an OutsidePublic net - real WAN, and an OutsidePrivate net - an "external network" inside the datacenter (shared by all 3 openstacks in there). The GW to OutsidePublic is set to be the default, but the DNS servers (a designate set in the Wallaby cloud) are on the OutsidePrivate segment. Adding explicit routes using the OutsidePrivate GW fails because _something_ is setting explicit routes to each NS/32 via the default GW, which overrides the routes being set for the VPC network needing to get DNS resolution.
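For anyone trying to confirm the symptom inside an instance, a small sketch like the following could scan routing-table text for host routes to the name servers and report which gateway each one uses. The sample text is hypothetical and only assumes the usual `<destination> via <gateway> dev <iface>` layout; `gateways_for` is an illustrative helper, not an existing tool.

```python
def gateways_for(route_text, targets):
    """Map each target IP to the gateways of host routes that name it exactly."""
    found = {target: [] for target in targets}
    for line in route_text.splitlines():
        fields = line.split()
        # Only consider lines of the form "<dest> via <gateway> ...".
        if len(fields) >= 3 and fields[1] == "via":
            dest = fields[0]
            if dest.endswith("/32"):
                dest = dest[:-3]
            if dest in found:
                found[dest].append(fields[2])
    return found


# Hypothetical routing table text; the addresses are illustrative only.
sample = """\
default via 10.0.0.1 dev eth0
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.15
192.168.50.0/24 via 10.0.0.2 dev eth0
192.168.50.10 via 10.0.0.1 dev eth0
192.168.50.11 via 10.0.0.1 dev eth0
"""

hits = gateways_for(sample, ["192.168.50.10", "192.168.50.11", "192.168.50.12"])
print(hits)
```

In this made-up table the first two name servers have host routes via the default gateway at .1, while the third has none and would fall through to the /24 via .2.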
