[Wishlist] Configurable Service IPs for bind servers

Bug #1804057 reported by Drew Freiberger
This bug affects 6 people
Affects: OpenStack Designate-Bind Charm
Status: Fix Committed
Importance: Wishlist
Assigned to: Martin Kalcok

Bug Description

Given an environment where openstack's designate is configured in a corporate DNS structure (or even registrar NS data) as the conditional forwarder for a given domain/set of domains, the upstream DNS or registrar entries must have static IPs to point to for the canonical reference for those zones.

Example:

zone example.org.
  ns1.example.org A 10.0.0.10
  ns2.example.org A 10.0.0.11

Given the typical situation of having two designate-bind9 units running, and running bind in a container or VM in the juju environment, the addresses assigned to the units' dns-frontend space binding will not necessarily stay the same (you can't preselect LXD IPs in MAAS deploys, and likely can't preselect IPs in VM deploys in clouds). When we need to recover or rebuild the designate-bind servers/service units, we will get new IPs that then would have to be populated back into upstream DNS or registrars.
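
For illustration, the corporate DNS side of such a setup would carry a conditional-forwarder stanza roughly like the following (a hypothetical named.conf fragment; the zone name and addresses are placeholders). This is exactly the data that an outside team has to edit every time the bind units come back with new addresses:

  // Illustrative conditional forwarder on the upstream/corporate DNS server,
  // pointing at the designate-bind units by their (currently dynamic) IPs.
  zone "example.org" {
      type forward;
      forward only;
      forwarders { 10.0.0.10; 10.0.0.11; };
  };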

I'd suggest adding a "service_ips" config option to the designate-bind charm to specify the IPs the designate-bind units should serve on. To illustrate the possible solution:

Spaces:
  internal: 192.168.0.0/24
  external: 10.0.0.0/24 (dhcp range 10.0.0.20-10.0.0.254, reserved range 10.0.0.1 - 10.0.0.19)

designate-bind:
  bindings:
    "": internal
    dns_backend: internal
    dns_frontend: external
  options:
    num_units: 2
    service_ips: 10.0.0.10 10.0.0.11

Deploy the two units; they get 10.0.0.20 and 10.0.0.21. The first unit in the cluster would then take the 10.0.0.10 address from the service_ips list and mark it as used in the cluster relation data, then the second unit would take 10.0.0.11 and mark it as used in the cluster relation data. The units would then configure these IPs as additional addresses on the interface where the dns_frontend space is bound, and advertise and configure any front-end relations and listeners to listen on the service_ips instead of the IPs auto-assigned by the dns_frontend binding.
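
Roughly speaking, each unit would end up doing the equivalent of the following (a hand-written sketch, not charm code; the relation id, key name and interface name are made up for illustration):

  # Claim the first unused service IP by publishing the claim on the
  # cluster relation (relation id and key name are illustrative only).
  relation-set -r cluster:0 service-ip-claim=10.0.0.10

  # Configure the claimed service IP as an additional address on the
  # interface carrying the dns_frontend binding (interface name assumed).
  ip addr add 10.0.0.10/24 dev eth1

  # Front-end relations and bind listeners would then be pointed at
  # 10.0.0.10 instead of the auto-assigned 10.0.0.20.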

This would allow us to recover designate-bind services without having to update upstream forwarding rules or registrar data which both run much more slowly with updates than juju/cloud iterations, depending on SOA, record timeouts, etc.

Revision history for this message
Drew Freiberger (afreiberger) wrote :

FYI, this is semi-related to lp#1773377 where we're looking to pass dns_frontend IPs to n-ovs and n-gw charms via relation. Instead of that, we can work around it by using/configuring these vips/service_ips and just manually configuring the IPs in all three charms, plus squash an upstream no-relation-available issue with this feature.

summary: - [Wishlist] VIP/Overlay IPs for bind servers
+ [Wishlist] Service IPs for bind servers
summary: - [Wishlist] Service IPs for bind servers
+ [Wishlist] Configurable Service IPs for bind servers
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

My 2 cents:

1) during provisioning, juju registers containers in MAAS as generic "devices" whose interfaces have the host nodes set as parent devices;
2) there is no way to specify device placeholders with fixed IPs similar to how static IPs can be assigned to nodes;
3) as a workaround, a VLAN with a small subnet for the "dns-access" space can be used. In that subnet, all extra IPs can be added to a reserved range so that there are only as many usable IPs as there are designate bind units. One additional IP will be used for a VLAN bridge on a host, one IP will be used for an L3 gateway. This seems like a good deployment-time workaround.
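
For example, the reserved range for such a small dns-access subnet could be created with the MAAS CLI roughly as follows (profile name and addresses are placeholders, and exact arguments may vary between MAAS versions):

  # Reserve all addresses in the small dns-access subnet except the few that
  # MAAS should still hand out: one per designate-bind unit, plus the host
  # VLAN bridge and the L3 gateway. Addresses below are placeholders.
  maas $PROFILE ipranges create type=reserved \
      start_ip=10.0.1.8 end_ip=10.0.1.30 \
      comment="dns-access: keep free of auto-assignment"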

Revision history for this message
James Page (james-page) wrote :

I'm not very keen on the approach of adding more out-of-band management of IP addresses via charm config - just like VIPs, specifying IPs via config is brittle and prone to error. How much of a problem is this really? I appreciate that having to update DNS pointers in upstream DNS systems to point to new bind servers in the deployment is *work*.

Changed in charm-designate-bind:
status: New → Incomplete
importance: Undecided → Wishlist
Revision history for this message
Drew Freiberger (afreiberger) wrote :

I think we could debate the work/changes involved for a while. This is a pain point for every cloud handover to Bootstack: the customer tests DNS with one set of bind IPs with Field Engineering, then Bootstack redeploys and gets new bind IPs (without Dmitrii's workaround subnet).

The work required sometimes spans many groups and many services, though. You've got corporate DNS, which is controlled outside the juju model; if it's a public cloud, you've potentially got internic/whois records, which take hours or days to update; you've got firewalls/ACLs to reconfigure; and you have to bounce neutron-gateway/n-ovs for dhcp-agent dns-server updates for dnsmasq. All of these are risks when what may have happened is simply that a box hosting a bind server died and needed to be rebuilt.

I take the point of not wanting to add more ip management in the charm, but right now, DNS is brittle without statically assignable IPs, as it is the one service that can't be referenced by external name records.

Another solution is to deploy KVMs with static IPs on the infra nodes for bind, but that would require those VMs to be deployable across two zones (which maas pods don't do).

I think the issue is less work-effort, and more about time to restore service based on the different things that plug into DNS that have to be reconfigured and aren't under the control of the cloud operator.

Changed in charm-designate-bind:
status: Incomplete → New
James Page (james-page)
Changed in charm-designate-bind:
status: New → Triaged
Revision history for this message
Vern Hart (vern) wrote :

IMO, the most important point to consider: Day-2 operations.

The current workflow is:

1. Deploy the designate-bind units
2. Obtain the IPs of the designate-bind units
3. Some combination of:
   a. Push the two IPs to neutron-gateway and neutron-ovs
   b. Push the two IPs to MAAS dns
   c. Push the two IPs to customer's corporate dns

This isn't too bad and almost all of it can be automated.
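
Step 2, for example, can be scripted; assuming jq is available and the application is named designate-bind, something along these lines should do it:

  # Pull the unit addresses out of juju status (the jq filter is
  # illustrative; adjust if the relevant address differs in your model).
  juju status designate-bind --format=json \
    | jq -r '.applications["designate-bind"].units[]["public-address"]'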

Where it can get complicated is when one of those designate-bind units has to be replaced. It's uncommon for this to happen soon after the deployment so it's easy to forget that these units are special and require extra redeployment steps.

Instead of using a small dns-access space to trick MAAS into giving these lxd containers the same pair of IPs (which can be problematic if MAAS doesn't release the IPs in a timely manner), it would be better if we could specify a pair of VIPs or service IPs.

These vips could be managed by juju through relations as Drew described. Or perhaps we could define hacluster rules which ensure the two IPs live on at least one server but not on the same server.

Revision history for this message
Vern Hart (vern) wrote :

Just to keep this wishlist alive, I was looking at pacemaker resource rules and I believe this could be done if we create two standard VIP (or SIP, service IP?) resources:

# pcs resource create vip1 ocf:heartbeat:IPaddr2 ip=192.168.111.10 \
  cidr_netmask=32 op monitor interval=30s
# pcs resource create vip2 ocf:heartbeat:IPaddr2 ip=192.168.111.11 \
  cidr_netmask=32 op monitor interval=30s

Then, we'd simply add anti-colocation rules to tell pacemaker to place them on different hosts. (https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#s-resource-colocation) If we set the score to something like -500 then it would be possible for the two VIPs to exist on the same host if there are no other options. This seems like a good idea so that both VIPs will continue to function when the designate-bind units are in a degraded state.

# pcs constraint colocation add vip1 with vip2 -500
# pcs constraint colocation add vip2 with vip1 -500

Should this logic exist within the designate-bind charm? Maybe the current haproxy charm would be the right place. Or maybe this should exist in a new subordinate charm (perhaps called multi-vip or service-vips or something)?

Changed in charm-designate-bind:
assignee: nobody → Martin Kalcok (martin-kalcok)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-designate-bind (master)
Revision history for this message
Nobuto Murata (nobuto) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-designate-bind (master)

Reviewed: https://review.opendev.org/c/openstack/charm-designate-bind/+/821729
Committed: https://opendev.org/openstack/charm-designate-bind/commit/4000b6ea72e9ca2350a24af87a22afdef08556af
Submitter: "Zuul (22348)"
Branch: master

commit 4000b6ea72e9ca2350a24af87a22afdef08556af
Author: Martin Kalcok <email address hidden>
Date: Tue Dec 14 16:55:45 2021 +0100

    Add functionality to configure virtual service IPs

    Closes-Bug: #1804057
    func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/685
    Change-Id: I29f2eeca508048fdf8464193368cce5720559b9e

Changed in charm-designate-bind:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-designate-bind (stable/yoga)
Revision history for this message
Alexander Litvinov (alitvinov) wrote (last edit ):

I have proposed this change to stable/yoga as this is a feature that we and our customers have been waiting on for multiple years.

I consider the regression risk minimal as:
- change is relatively small
- it doesn't break or change existing workflow or bundles
- the new feature is explicitly opt-in, so the risk of a potential regression is minimal.
New parts of the code run only when the 'config.changed.service_ips' flag is set (the service_ips config option did not exist before this change) or when the 'ha.connected' flag is present (the hacluster relation likewise did not exist before this change). So the functionality of the charm as it existed before this change is not affected at all, as long as the new relation and the new config option are not used.
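
In other words, nothing changes until an operator explicitly opts in with something like the following (addresses are placeholders):

  # Opt in by setting the new service_ips option (placeholder addresses).
  juju config designate-bind service_ips="10.0.0.10 10.0.0.11"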

I have tested manually and functional tests are present.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-designate-bind (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/charm-designate-bind/+/877100
Committed: https://opendev.org/openstack/charm-designate-bind/commit/2355d603d0d646c1e5b24bdd2d57b7484893724c
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 2355d603d0d646c1e5b24bdd2d57b7484893724c
Author: Martin Kalcok <email address hidden>
Date: Tue Dec 14 16:55:45 2021 +0100

    Add functionality to configure virtual service IPs
    (cherry picked from commit 4000b6ea72e9ca2350a24af87a22afdef08556af)

    Closes-Bug: #1804057
    func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1021
    Change-Id: I29f2eeca508048fdf8464193368cce5720559b9e

tags: added: in-stable-yoga