LXCs assigned IPs by MAAS DHCP lack DNS PTR entries

Bug #1382190 reported by Christian Reis on 2014-10-16
This bug affects 10 people
Affects          Importance   Assigned to
MAAS             Critical     Graham Binns
maas (Ubuntu)    Undecided    Unassigned

Bug Description

As per https://bugs.launchpad.net/ubuntu/+source/maas/+bug/1274947/comments/13 we are not generating DNS PTR entries for LXC deployments on MAAS-driven DHCP; I assume this has regressed with the new static assignment mechanism.

This is a regression from 1.5 behaviour and blocks CTS deployments of 1.7 with LXCs.


Christian Reis (kiko) on 2014-10-16
summary: - LXCs assigned IPs by MAAS DHCP lack PTR entries
+ LXCs assigned IPs by MAAS DHCP lack DNS PTR entries
description: updated
JuanJo Ciarlante (jjo) wrote :

1.7beta generated named PTR file (fyi all entries there correspond to metal hosts):
http://paste.ubuntu.com/8575506/

Christian Reis (kiko) wrote :

Targeting to 1.7 and assigning to Julian to see what can be done about it.

Changed in maas:
milestone: none → 1.7.0
importance: Undecided → Critical
assignee: nobody → Julian Edwards (julian-edwards)
Julian Edwards (julian-edwards) wrote :

So I've looked at this and I think it needs to be a won't-fix as it stands, but I want your opinion first because there's something else MAAS should be doing.

LXCs have never been supported as a first class citizen in MAAS, so the fact that a non-MAAS node obtained an IP with a DNS entry is a little bit of a side effect of how it was previously implemented. The DNS entries from dynamically assigned DHCP IPs are dangerous because they can go out of date at any time and MAAS has no way to track that in a timely fashion, and in fact this is exactly what was happening in OIL which pushed us to doing the static IP solution.

MAAS has an API to request an IP address separately to whatever the nodes may get assigned. It was intended that juju-created LXCs use this API to get an IP and configure the LXC network interface(s) appropriately.

What's missing from MAAS's API is that the IPs obtained on the API don't have the option of being given a DNS entry. If we fix that, and then juju starts using the API, everything will be working perfectly.

Let me know what you think.

Julian Edwards (julian-edwards) wrote :

Also FTR it's not the static IP work that left LXCs without PTRs, it's the fact that we used to pre-generate the whole zone with fake host names and used CNAMEs for any nodes that MAAS knew about. That came with a raft of its own problems (some charms failed because CNAME -> IP -> reverse lookup -> different A record) and IP ranges were limited to /16 because otherwise it took a long time to write the zone every time.

There is no doubt that juju needs to move to use the new API.

Jeroen T. Vermeulen (jtv) wrote :

I agree: this is a clear-cut case for the request-an-address API, but we still lack a request-a-hostname API.

Out of curiosity, how did the existing setup(s) obtain these generated hostnames in the first place? To my knowledge they were neither exposed nor documented. MAAS never had much control over which non-node client got which address from DHCP, and that's why we had to re-work DHCP management. These old generated hostnames were no more stable than the IP addresses were: if the container moved to a different IP address, it would also move to a different hostname.

So if at all possible, I would address the problem in two discrete steps:

1. Use the container's IP address, not its hostname. It should be about as stable as the hostname was.

2. Request a static IP through the MAAS API, and configure it statically on the container.

Jeroen T. Vermeulen (jtv) wrote :

Ah, I see that using the IP address is not an option in this case. So we'll have to add a way to manage DNS entries.

Raphaël Badin (rvb) wrote :

> Out of curiosity, how did the existing setup(s) obtain these generated hostnames in the first place? To my knowledge they were
> neither exposed nor documented.

As Jeroen said, we need to understand this in order to suggest a solution that fits CTS' needs. My guess is that they were deriving the hostname from the IP address "manually", having observed the pattern that MAAS uses.

Agreed that the path of least resistance at this stage is to improve the new "reserve IP" API to include a hostname and use that in Juju.

JuanJo Ciarlante (jjo) wrote :

To clarify what's happening with the rabbitmq charm: for its units to be
able to cluster together, they need to refer to each other by hostname, see
[0], which was done based on the observed pattern as per comments #4 and #7 above.

[0] https://code.launchpad.net/~jjo/charms/trusty/rabbitmq-server/fix-nodename-to-host-dns-PTR

Gavin Panella (allenap) wrote :

As a workaround, can the charm add hostnames for each node in the cluster to /etc/hosts?

Changed in maas:
status: New → Incomplete
Graham Binns (gmb) wrote :

We've agreed that as a fix for the problem — since it's going to potentially break a lot of charms on customer deployments — we're going to put PTR records back in for dynamic-pool IP addresses. I'm going to investigate using the bind GENERATE directive to make this simpler [1].

 [1] http://www.thedumbterminal.co.uk/?action=showArticle&articleId=116

Changed in maas:
status: Incomplete → Triaged
assignee: Julian Edwards (julian-edwards) → Graham Binns (gmb)
Graham Binns (gmb) wrote :

I started out by poking the $GENERATE directive into my reverse zone file manually:

; Zone file modified: 2014-10-21 18:24:59.654931.
; Note that the modification time of this file doesn't reflect
; the actual modification time. MAAS controls the modification time
; of this file to be able to force the zone to be reloaded by BIND.
$TTL 300
@ IN SOA maas. nobody.example.com. (
              0000000260 ; serial
              600 ; Refresh
              1800 ; Retry
              604800 ; Expire
              300 ; TTL
              )

    IN NS maas.
$GENERATE 1-100 $.0.0.10.in-addr.arpa. IN PTR no-name-yet-$.maas.
129.0.0.10.in-addr.arpa. IN PTR devoted-minute.maas.
128.0.0.10.in-addr.arpa. IN PTR dismayed-clown.maas.

This works perfectly:

graham@dodger:~⟫ dig +noall +answer -x 10.0.0.128 @10.0.0.1
128.0.0.10.in-addr.arpa. 300 IN PTR dismayed-clown.maas.
graham@dodger:~⟫ dig +noall +answer -x 10.0.0.129 @10.0.0.1
129.0.0.10.in-addr.arpa. 300 IN PTR devoted-minute.maas.
graham@dodger:~⟫ dig +noall +answer -x 10.0.0.100 @10.0.0.1
100.0.0.10.in-addr.arpa. 300 IN PTR no-name-yet-100.maas.

So, now all we need to do is have the DNSReverseZoneConfig add this for the dynamic range (I say "all we need to do"; obviously that's > 1 task, but I'll divide it up in the morning).

Graham Binns (gmb) wrote :

I think this will work for IPv6 too, but as Kiko pointed out on IRC so will wildcards — the main reason I'd like to stick with $GENERATE for both is that it's one less bit of code to write ;)

Gavin Panella (allenap) wrote :

On 22 October 2014 22:03, Graham Binns <email address hidden> wrote:
> This works perfectly:

Awesome, that's great. Sort of brain-dump follows:

How does this work for networks that don't line up with IPv4 octets? For
example, for a dynamic range that covers 192.168.0.0/23, would we need
to write the following?

  $GENERATE 1-254 $.0.168.192.in-addr.arpa. IN PTR no-name-yet-$.maas.
  $GENERATE 1-254 $.1.168.192.in-addr.arpa. IN PTR no-name-yet-$.maas.

A /16 would need 256 lines.
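
The one-line-per-/24 expansion asked about here can be sketched in Python. This is only an illustration, not MAAS code: `generate_lines_for_network` is a hypothetical helper built on the standard-library `ipaddress` module, and it reuses the `1-254` range and the `no-name-yet-$` name from the lines above.

```python
import ipaddress


def generate_lines_for_network(cidr, domain="maas", prefix="no-name-yet"):
    """Return one $GENERATE line per /24 inside `cidr` (hypothetical helper)."""
    network = ipaddress.IPv4Network(cidr)
    if network.prefixlen > 24:
        raise ValueError("this sketch only handles networks of /24 or larger")
    lines = []
    for subnet in network.subnets(new_prefix=24):
        # 192.168.1.0/24 becomes the reverse label "1.168.192".
        o1, o2, o3, _ = str(subnet.network_address).split(".")
        lines.append(
            "$GENERATE 1-254 $.%s.%s.%s.in-addr.arpa. IN PTR %s-$.%s."
            % (o3, o2, o1, prefix, domain)
        )
    return lines


for line in generate_lines_for_network("192.168.0.0/23"):
    print(line)

# A /16 contains 256 /24s, hence 256 lines.
print(len(generate_lines_for_network("10.0.0.0/16")))
```

As in the two lines above, every /24 maps onto the same set of names, so a real implementation would probably need to fold the third octet into the generated name.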

Is this expanded macro-like within BIND, or is it compiled into a rule?
It may make a difference to performance.

I assume we'll do this only for IPv4? If it's compiled to a rule, we
/could/ do it for IPv6 too, but that may be a slope down which we don't
want to slip. Ultimately we want to steer people towards reserving IP
addresses via MAAS's API.

Is this enough to satisfy RabbitMQ? Some services like to make sure that
they can round-trip through DNS, i.e. that the PTR record resolves to a
name that then resolves back to the starting address.
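
The round-trip being described (address to PTR name, then name back to address) could be checked with something like the following sketch. The function name `round_trips` and the injectable `reverse`/`forward` lookups are assumptions made for illustration; by default it falls back to the standard `socket` resolver calls.

```python
import socket


def round_trips(ip, reverse=None, forward=None):
    """Return True if ip -> PTR name -> addresses leads back to ip."""
    # Default to the system resolver; callers may inject fakes for testing.
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda name: socket.gethostbyname_ex(name)[2])
    try:
        name = reverse(ip)
        addresses = forward(name)
    except (socket.herror, socket.gaierror, KeyError):
        # No PTR record, or the PTR name has no forward record.
        return False
    return ip in addresses


# Fake lookup tables standing in for the reverse and forward zones.
ptr_zone = {"10.0.0.100": "no-name-yet-100.maas"}
empty_forward_zone = {}
full_forward_zone = {"no-name-yet-100.maas": ["10.0.0.100"]}

print(round_trips("10.0.0.100", ptr_zone.__getitem__, empty_forward_zone.__getitem__))
print(round_trips("10.0.0.100", ptr_zone.__getitem__, full_forward_zone.__getitem__))
```

With only a reverse zone the check fails; it passes once a matching forward record exists.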

We could choose the prefix to encourage people to reserve addresses. For
example, "dynamic-$", "anon-$", "temporary-$", or "do-not-use-$", in
order of severity. Unfortunately this is bikeshed territory, so I'm
going to put myself out of the running by voting for
"do-not-rely-on-this-you-get-to-keep-the-pieces-and-give-me-one-million-dollars-$".

Graham Binns (gmb) wrote :

On 22 October 2014 23:19, Gavin Panella <email address hidden> wrote:
> On 22 October 2014 22:03, Graham Binns <email address hidden> wrote:
>> This works perfectly:
>
> Awesome, that's great. Sort of brain-dump follows:
>
> How does this work for networks that don't line up with IPv4 octets? For
> example, for a dynamic range that covers 192.168.0.0/23, would we need
> to write the following?
>
> $GENERATE 1-254 $.0.168.192.in-addr.arpa. IN PTR no-name-yet-$.maas.
> $GENERATE 1-254 $.1.168.192.in-addr.arpa. IN PTR no-name-yet-$.maas.
>
> A /16 would need 256 lines.

If I'm reading [1] right (and I may not be; it's late), we could do
something like this:

$GENERATE 1-254 $.$.0.10.in-addr.arpa. IN PTR no-name-yet-$.$.maas.

But a quick check of that on my local MAAS suggests it doesn't work.
I'll dig further tomorrow.

> Is this expanded macro-like within BIND, or is it compiled into a rule?
> It may make a difference to performance.

AFAICT, it's expanded rather than compiled into a rule.

> I assume we'll do this only for IPv4? If it's compiled to a rule, we
> /could/ do it for IPv6 too, but that may be a slope down which we don't
> want to slip. Ultimately we want to steer people towards reserving IP
> addresses via MAAS's API.

True. That said, if we fix this for v6 too it means that some very
nasty hacks in charms can be canned, and that's not a terrible thing
for our users. If it's easy to do, I say we might as well be
consistent. If v6 causes (as I suspect it will) considerable headaches
here, then we'll stick with v4 and push for everyone to use APIs as
they should be doing.

> Is this enough to satisfy RabbitMQ? Some services like to make sure that
> they can round-trip through DNS, i.e. that the PTR record resolves to a
> name that then resolves back to the starting address.

I haven't checked that out yet, but it's a good point. We could always
use $GENERATE to generate the forward zone too, though, if your first
point doesn't make that entirely onerous.

> We could choose the prefix to encourage people to reserve addresses. For
> example, "dynamic-$", "anon-$", "temporary-$", or "do-not-use-$", in
> order of severity. Unfortunately this is bikeshed territory, so I'm
> going to put myself out of the running by voting for
> "do-not-rely-on-this-you-get-to-keep-the-pieces-and-give-me-one-million-dollars-$".

"use-this-at-your-peril-$.maas".

Graham Binns (gmb) wrote :

On 22 October 2014 23:56, Graham Binns <email address hidden> wrote:
> $GENERATE 1-254 $.$.0.10.in-addr.arpa. IN PTR no-name-yet-$.$.maas.
>
> But a quick check of that on my local MAAS suggests it doesn't work.
> I'll dig further tomorrow.

Nope, scratch that, it works fine. Well, named-checkzone generates a
perfectly valid set of rules, but strangely the reverse lookup itself
doesn't work. Definitely more poking to be done on the morrow.

For a /23 (or some other such fun thing), yes, I think we'd have to write

$GENERATE 1-254 $.0.168.192.in-addr.arpa. IN PTR no-name-yet-$.maas.
$GENERATE 1-254 $.1.168.192.in-addr.arpa. IN PTR no-name-yet-$.maas.

Again. More `dig`ing (geddit?) tomorrow.

On Wednesday 22 Oct 2014 23:18:24 you wrote:
> On 22 October 2014 23:56, Graham Binns <email address hidden> wrote:
> > $GENERATE 1-254 $.$.0.10.in-addr.arpa. IN PTR no-name-yet-$.$.maas.
> >
> > But a quick check of that on my local MAAS suggests it doesn't work.
> > I'll dig further tomorrow.
>
> Nope, scratch that, it works fine. Well, named-checkzone generates a
> perfectly valid set of rules, but strangely the reverse lookup itself
> doesn't work. Definitely more poking to be done on the morrow.
>
> For a /23 (or some other such fun thing), yes, I think we'd have to
> write
>
> $GENERATE 1-254 $.0.168.192.in-addr.arpa. IN PTR no-name-yet-$.maas.
> $GENERATE 1-254 $.1.168.192.in-addr.arpa. IN PTR no-name-yet-$.maas.
>
> Again. More `dig`ing (geddit?) tomorrow.

See the related bug 1356012 about that.

I think this is a dupe of bug 1274947.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in maas (Ubuntu):
status: New → Confirmed
Graham Binns (gmb) on 2014-10-23
Changed in maas:
status: Triaged → In Progress
Graham Binns (gmb) wrote :

Ah, so, the light of day: $GENERATE is only useful for networks no larger than a /24, i.e. up to 256 addresses. After that it breaks down, because:

$GENERATE 1-254 $.$.0.10.in-addr.arpa. IN PTR no-name-yet-$.$.maas.

Only generates 254 records; it's not recursive (every $ takes the same value on each step). You just get

10.10.0.10.in-addr.arpa. 300 IN PTR no-name-yet-10-10.maas.
100.100.0.10.in-addr.arpa. 300 IN PTR no-name-yet-100-100.maas.
101.101.0.10.in-addr.arpa. 300 IN PTR no-name-yet-101-101.maas.
102.102.0.10.in-addr.arpa. 300 IN PTR no-name-yet-102-102.maas.
103.103.0.10.in-addr.arpa. 300 IN PTR no-name-yet-103-103.maas.
104.104.0.10.in-addr.arpa. 300 IN PTR no-name-yet-104-104.maas.
105.105.0.10.in-addr.arpa. 300 IN PTR no-name-yet-105-105.maas.

Etc.

Which means that for a /16 we'd need 256 $GENERATEs; for a /8, 65536 of them. That's… suboptimal.
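
The behaviour described above can be mimicked with a deliberately simplified expansion in Python. This is an assumption-laden sketch, not BIND's parser: `expand_generate` is a hypothetical function that handles only a plain `$` (no `${offset,width,base}` modifiers, no step, no escaping).

```python
def expand_generate(start, stop, lhs, rhs):
    """Expand a simplified $GENERATE into (owner, target) pairs."""
    records = []
    for i in range(start, stop + 1):
        # Every `$` on the line is replaced by the same iterator value,
        # so "$.$" never walks the two octets independently.
        records.append((lhs.replace("$", str(i)), rhs.replace("$", str(i))))
    return records


records = expand_generate(
    1, 254, "$.$.0.10.in-addr.arpa.", "no-name-yet-$.$.maas.")
print(len(records))                     # 254
print([owner for owner, _ in records[9:12]])

# Number of /24-sized $GENERATE lines needed for a /16 and for a /8.
print(2 ** (24 - 16), 2 ** (24 - 8))    # 256 65536
```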

Options:

 1. We could say "We'll generate PTR records for /24 dynamic ranges only."
     Advantages: Pretty easy to do
     Disadvantages: We're forcing users into a particular configuration.
 2. We limit the dynamic range to /16 for IPv4 (similar to before when we enforced a /16 network).
     Advantages: 256 lines isn't the *worst*, performance wise, so we could probably get away with it.
     Disadvantages: We still have to write those 256 lines on every deployment.
 3. We add the PTR record when we parse the DHCP lease
     Advantages: No bloat; pretty much as we already do now for static IPs.
     Disadvantages: Noticeable lag between the node acquiring a lease and getting a PTR.

None of these options is lovely.

Graham Binns (gmb) wrote :

(I'd say that option 3 is the most tasteful, as long as we can solve the lag problem).

Gavin Panella (allenap) wrote :

I think #1 is too limited. This will only satisfy a subset of users, and
it's confusing to an outsider trying to deploy their charms, "why won't
RabbitMQ deploy in cluster $x when it deploys fine in cluster $y?". The
answer is not easily discoverable, and the MAAS administrator may not
realise that the size of the cluster's network is relevant to the
frustrated user unless he/she has fully understood the release notes.

The lag in #3 may cause hard to diagnose problems. There are also a lot
more moving parts to line up.

I say go for #2. Rendering 256 lines is a tiny blip in performance,
especially compared to rendering ~65k lines without $GENERATE.

Julian Edwards (julian-edwards) wrote :

I don't think #3 is that bad actually; remember that's how we used to do it before static IP days so it's tried and tested (although admittedly the DNS was in place by the time commissioning ended).

#2 is also good, but you have to calculate which lines to generate based on the netmask.

I'd go for #2 on balance.

Christian Reis (kiko) wrote :

Let's try #2, though we will need to limit the size of the dynamic range as part of that.

Graham Binns (gmb) wrote :

On 23 October 2014 11:04, Julian Edwards <email address hidden> wrote:
> I don't think #3 is that bad actually; remember that's how we used to do
> it before static IP days so it's tried and tested (although admittedly
> the DNS was in place by the time commissioning ended).

Yeah, #3 doesn't work for the use case of LXC containers that come up
relatively quickly, aren't going through commissioning, and just want
to Get Stuff Done.

Graham Binns (gmb) wrote :

Andres and I have discussed this further, and he pointed out that the LXCs can take a *long* time to start. That would be sufficient time for us to parse the leases file and update the DNS with a PTR for the new node (option #3).

I've tested this out locally. The LXC container took ~30 minutes to start from `juju deploy mysql --to lxc:0`, and it acquired an IP address ~5 minutes before the juju agent showed as started. It was another 5 minutes until the mysql unit showed as started.

This is more than enough time for us to create DNS entries for the container, with a hostname either based on its IP address (e.g. maas-dynamic-1.maas) or based on the client-hostname field in the lease (e.g. juju-machine-0-lxc-0-dynamic.maas). The second is more meaningful, but it's also more code.
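
A rough sketch of the lease-driven approach, in Python, might look like the following. It is not MAAS's lease parser: `ptr_records_from_leases` is a hypothetical helper, the sample lease data is made up, the IP-based fallback name is an assumed scheme, and the regular expressions ignore lease expiry, binding state, and nested blocks.

```python
import re

# Matches "lease <ip> { ... }" blocks in an ISC dhcpd.leases file.
LEASE_RE = re.compile(r"lease\s+(\d+\.\d+\.\d+\.\d+)\s*\{(.*?)\}", re.DOTALL)
HOSTNAME_RE = re.compile(r'client-hostname\s+"([^"]+)"\s*;')


def ptr_records_from_leases(leases_text, domain="maas"):
    """Map reverse-zone owner names to PTR targets for each lease."""
    records = {}
    for ip, body in LEASE_RE.findall(leases_text):
        match = HOSTNAME_RE.search(body)
        if match:
            # Name based on the client-hostname field in the lease.
            host = "%s-dynamic" % match.group(1)
        else:
            # Fallback name based on the IP address (assumed scheme).
            host = "maas-dynamic-%s" % ip.replace(".", "-")
        owner = ".".join(reversed(ip.split("."))) + ".in-addr.arpa."
        # Later entries for the same IP overwrite earlier ones.
        records[owner] = "%s.%s." % (host, domain)
    return records


SAMPLE_LEASES = """
lease 10.0.0.101 {
  starts 4 2014/10/23 14:00:00;
  ends 5 2014/10/24 02:00:00;
  hardware ethernet 00:16:3e:00:00:01;
  client-hostname "juju-machine-0-lxc-0";
}
lease 10.0.0.102 {
  starts 4 2014/10/23 14:05:00;
  ends 5 2014/10/24 02:05:00;
  hardware ethernet 00:16:3e:00:00:02;
}
"""

for owner, target in sorted(ptr_records_from_leases(SAMPLE_LEASES).items()):
    print(owner, "IN PTR", target)
```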

Andres and I agreed that this was a better solution than #2 above.

On Thursday 23 Oct 2014 15:40:51 you wrote:
> Andres and I have discussed this further, and he pointed out that the
> LXCs can take a *long* time to start. That would be sufficient time for
> us to parse the leases file and update the DNS with a PTR for the new
> node (option #3).
>
> I've tested this out locally. The LXC container took ~30minutes to start
> from `juju deploy mysql --to lxc:0`, and it acquired an IP address ~5
> minutes before the juju agent showed as started. It was another 5
> minutes until the mysql unit showed as started.

I bet it was quicker for the next one, right? The first one that is done ends
up downloading the lxc images.

> This is more than enough time for us to create DNS entries for the
> container, with a hostname either based on its IP address (e.g. maas-
> dynamic-1.maas) or based on the client-hostname field in the lease (e.g.
> juju-machine-0-lxc-0-dynamic.maas). The second is more meaningful, but
> it's also more code.
>
> Andres and I agreed that this was a better solution than #2 above.

Assuming that the juju seeding part is taking up all the time after the LXC
comes up, this is indeed a better solution.

Christian Reis (kiko) wrote :

Well, whatever we do let's please not race with the lxc creation. That would be worse than fixing this bug. :)

Graham Binns (gmb) wrote :

Using $GENERATE statements, I get the following results:

/16 dynamic range: 10.0.0.1 - 10.0.255.255
 - Reverse Zone: http://paste.ubuntu.com/8721352/
 - Forward Zone: http://paste.ubuntu.com/8721358/

/24 dynamic range: 10.0.0.1 - 10.0.0.255
 - Reverse zone: http://paste.ubuntu.com/8721363/
 - Forward zone: http://paste.ubuntu.com/8721366/

/25 dynamic range: 10.0.0.128 - 10.0.0.255
 - Reverse zone: http://paste.ubuntu.com/8721391/
 - Forward zone: http://paste.ubuntu.com/8721403/

Weird dynamic range: 10.0.0.1 - 10.0.1.33:
 - Reverse zone: http://paste.ubuntu.com/8721501/
 - Forward zone: http://paste.ubuntu.com/8721503/

Note that there's what looks like a weird inefficiency in the /24 (1-255) and others. That's because I'm using netaddr.IPRange() internally to create an IP range to work from, and then converting that into a set of IPNetworks to ensure we get full coverage of weird corner-case networks (and so that I don't have to do all the maths myself). Thing is, .1-.255, although we identify it as a /24 by sight, isn't actually a /24; it's only 255 addresses, so IPRange breaks it down into individual CIDRs rather than just covering it with a /24. It's a weirdness we can live with, I think. For a /16 we still see only 260-odd $GENERATEs as opposed to ~65,000 individual PTR lines, which I'd call an improvement :).
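
The range-to-CIDR breakdown can be reproduced with a short Python sketch. The comment above uses netaddr; this version swaps in the standard-library `ipaddress.summarize_address_range()` so that it is self-contained. Both helper names are hypothetical, and the counting rule (one directive per /24 for large CIDRs, one per smaller CIDR) is an assumption about how the directives are emitted.

```python
import ipaddress


def cidrs_for_range(first, last):
    """Smallest list of CIDR blocks that exactly covers first..last."""
    return list(ipaddress.summarize_address_range(
        ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)))


def count_generate_directives(first, last):
    """Assumed rule: one directive per /24, or one per smaller CIDR."""
    total = 0
    for net in cidrs_for_range(first, last):
        total += 2 ** (24 - net.prefixlen) if net.prefixlen < 24 else 1
    return total


# .1-.255 is not a whole /24, so it splits into several smaller blocks.
for net in cidrs_for_range("10.0.0.1", "10.0.0.255"):
    print(net)

print(count_generate_directives("10.0.0.0", "10.0.0.255"))    # 1
print(count_generate_directives("10.0.0.1", "10.0.255.255"))  # 263
```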

Christian Reis (kiko) wrote :

I like this! I would like it even more if we found a way to avoid multiple GENERATEs for /24s.
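
One possible way to get a single $GENERATE per /24, sketched in Python: walk the /24 blocks that the range touches and clip the start and stop of each to the range. Nothing in this thread says MAAS does it this way; `generate_per_slash24` and the `dynamic-<octets>-$` naming are hypothetical.

```python
import ipaddress


def generate_per_slash24(first, last, domain="maas", prefix="dynamic"):
    """Emit one $GENERATE line for each /24 block the range touches."""
    lo = int(ipaddress.IPv4Address(first))
    hi = int(ipaddress.IPv4Address(last))
    if lo > hi:
        raise ValueError("first must not be greater than last")
    lines = []
    for block in range(lo >> 8, (hi >> 8) + 1):
        base = block << 8
        # Clip the last-octet range to the part of this /24 in the range.
        start = max(lo, base) - base
        stop = min(hi, base + 255) - base
        o1, o2, o3, _ = str(ipaddress.IPv4Address(base)).split(".")
        lines.append(
            "$GENERATE %d-%d $.%s.%s.%s.in-addr.arpa. IN PTR %s-%s-%s-%s-$.%s."
            % (start, stop, o3, o2, o1, prefix, o1, o2, o3, domain)
        )
    return lines


for line in generate_per_slash24("10.0.0.1", "10.0.1.33"):
    print(line)
```

With this approach the 10.0.0.1 - 10.0.0.255 range needs one line rather than one per CIDR, at the cost of doing the octet arithmetic by hand.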

Graham Binns (gmb) on 2014-10-29
Changed in maas:
status: In Progress → Fix Committed
Christian Reis (kiko) wrote :

This was confirmed fixed today by Juan Ciarlante and will go out with 1.7.0. Fantastic!

Changed in maas:
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package maas - 1.7.0+bzr3299-0ubuntu2

---------------
maas (1.7.0+bzr3299-0ubuntu2) vivid; urgency=medium

  * debian/maas-common.postinst: Ensure that the maas user gets its home
    directory created in /var/lib/maas. (LP: #1399016)
 -- Andres Rodriguez <email address hidden> Thu, 04 Dec 2014 12:24:18 -0500

Changed in maas (Ubuntu):
status: Confirmed → Fix Released

Hello Christian, or anyone else affected,

Accepted maas into utopic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/maas/1.7.5+bzr3369-0ubuntu1~14.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-needed
Andres Rodriguez (andreserl) wrote :

This issue has been verified to work both on upgrade and fresh install, and has been QA'd. Marking verification-done.

tags: added: verification-done
removed: verification-needed