Cannot delete zone from designate if zone's SOA is actually hosted by a forwarder/external dns

Bug #1807464 reported by Drew Freiberger
Affects                      Status    Importance  Assigned to  Milestone
Designate                    New       Undecided   Unassigned
OpenStack Designate Charm    Triaged   Undecided   Unassigned

Bug Description

This is an upstream Designate bug based on packages installed by our bionic/queens cloud charm configs.

If you add a zone to designate that is actually owned by an upstream DNS server reachable via the forwarders configured for bind, you cannot delete the zone from designate.

You will see the logs looping with:

https://pastebin.ubuntu.com/p/vgpzCQVRbb/

The flag "RA" denotes that this is a referred answer, not an authoritative answer.

In the designate code, the check is whether the response from the backend nameserver is authoritative.

With the DNS backend network included in allowed_recursion_nets, designate-bind performs a recursive lookup upstream and returns a valid, external SOA record where the designate code expects none.
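
For illustration only, here is a minimal dnspython sketch of the distinction the designate code relies on. It is not the actual designate implementation, and the zone name and server address are made up:

# Sketch only: shows the AA/RA distinction described above using dnspython.
import dns.flags
import dns.message
import dns.query
import dns.rdatatype

BIND_SERVER = "10.0.0.10"            # a designate-bind unit (hypothetical address)
ZONE = "foo.mydesignatedomain.com."  # hypothetical zone

query = dns.message.make_query(ZONE, dns.rdatatype.SOA)
response = dns.query.udp(query, BIND_SERVER, timeout=5)

if response.flags & dns.flags.AA:
    # bind still serves the zone from its local zone files
    print("authoritative answer (AA set)")
elif response.answer:
    # bind recursed through its forwarders and returned upstream data;
    # RA is set and AA is clear, which is the case that confuses the check
    print("recursed, non-authoritative answer (RA set, AA clear)")
else:
    print("no answer: zone not present locally or upstream")

In the scenario above, the second branch is what the deletion check keeps hitting, so it never concludes the zone is gone.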

Workaround: remove the forwarders from your charm config, let the zone deletion succeed, then re-add your forwarders.

Another workaround is to configure your dns-backend network into allowed_nets instead of allowed_recursion_nets in the charm config, to prevent designate's mdns queries from accidentally being recursed to upstream DNS.

tags: added: cpe-onsite
Revision history for this message
Drew Freiberger (afreiberger) wrote :
description: updated
Revision history for this message
James Page (james-page) wrote :

Drew - is there a bug raised against the designate project for this issue?

Changed in charm-designate:
status: New → Triaged
Revision history for this message
Drew Freiberger (afreiberger) wrote :

James, I think you may be on to something; this should be filed against upstream. If the record being looked up comes back from the bind server controlled by designate as non-authoritative, it should be assumed that the bind server is performing recursive lookups and that the zone is no longer hosted in the local zone files.

We just found a similar issue where it took TTL seconds for an entry to be deleted from the designate database, because the recursive lookup hit corporate DNS, which had a cached entry for this delegated subdomain hosted in Designate.

Imagine the scenario where you have:

End user -> corporate DNS query for a designate-hosted entry -> corporate DNS recurses to the designate-bind service, retrieves the entry, caches it, and returns a non-authoritative answer to the end user. For example, zone foo.mydesignatedomain.com, where corporate DNS delegates mydesignatedomain.com to designate-bind.

You then delete the foo.mydesignatedomain.com zone (a child of mydesignatedomain.com) from designate. Designate's mdns updates the designate-bind service and the zone is dropped from designate-bind, but when designate's mdns service then queries designate-bind, designate-bind forwards foo.mydesignatedomain.com upstream to corporate DNS (the forwarder configured in the charm for anything not hosted locally). Your corporate DNS service still has the NS/SOA records for foo.mydesignatedomain.com cached for as long as the TTL set on that zone, so designate does not acknowledge that the domain was deleted from designate-bind because it still receives a response via recursion. Only once the TTL expires upstream is the record shown as deleted. This can lead to 24-hour DNS woes if someone uses Designate maliciously.
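
As a rough sketch of that behaviour (not designate's actual mdns code; the zone name and nameserver address are hypothetical), any answer at all, even a cached non-authoritative one returned via the forwarder, keeps the zone looking "present":

# Sketch only: models the poll-until-gone behaviour described above.
import time

import dns.message
import dns.query
import dns.rcode
import dns.rdatatype

BIND_SERVER = "10.0.0.10"            # hypothetical designate-bind unit
ZONE = "foo.mydesignatedomain.com."  # hypothetical deleted zone

def zone_still_visible(server, zone):
    query = dns.message.make_query(zone, dns.rdatatype.SOA)
    response = dns.query.udp(query, server, timeout=5)
    # NXDOMAIN means the zone really is gone from this server's point of view.
    if response.rcode() == dns.rcode.NXDOMAIN:
        return False
    # Any SOA answer, authoritative or recursed/cached, keeps the zone "alive".
    return bool(response.answer)

while zone_still_visible(BIND_SERVER, ZONE):
    # With a forwarder and a cached upstream SOA, this loops until the TTL expires.
    time.sleep(30)
print("zone no longer visible; deletion can be acknowledged")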

While we can solve this in the charm, either by excluding the designate relationship's mdns endpoints from upstream recursive lookup or by configuring the recursion nets to cover only the cloud's overlay networks, I believe a check for authoritative vs. non-authoritative responses within the designate code itself would solve this much better. I do worry that there are use cases where the DNS backend (when not using designate-bind) may have a good reason not to provide an authoritative response (for example designate -> bastion DNS server, which then updates an authoritative upstream corporate server), in which case it wouldn't be solvable upstream without adding some additional configuration options.

I'd really like to see the designate-bind service configured to deny recursion specifically for the designate-mdns endpoints to solve this in the charmed deployments, because there are valid cases where the designate units' mdns endpoints sit within the same CIDR as public/overlay network IPs that do need to be able to recurse through the designate-bind service.

Revision history for this message
Drew Freiberger (afreiberger) wrote :

This appears to be described in upstream project bug:
https://bugs.launchpad.net/designate/+bug/1802227

In the comments of that bug, it is noted that the best practice is to not have your authoritative BIND servers for designate also perform recursion.

This seems to suggest that environments using the recursive DNS chain VM -> neutron dnsmasq -> designate-bind -> corporate should move to something more along the lines of VM -> neutron dnsmasq -> a separate recursive bind service that forwards designate zones to the designate-bind servers and everything else to corporate DNS. The other option may be to stop supporting recursive DNS from designate-bind entirely and to require a cloud DNS strategy where the neutron dnsmasq services point at corporate or public DNS servers rather than using the designate-bind servers as the primary external DNS for the cloud.
