MAAS provider: terminate-machine --force or destroy-environment don't DHCP release container IPs

Bug #1483879 reported by Andreas Hasenack
This bug affects 2 people
Affects    Status        Importance  Assigned to       Milestone
juju-core  Fix Released  High        Dimiter Naydenov
1.24       Won't Fix     High        Unassigned
1.25       Fix Released  High        Dimiter Naydenov

Bug Description

juju-core 1.24.4

Related to bug #1348663

When using a MAAS provider, juju "leaks" container IP addresses by not DHCP releasing them in the following scenarios:

 * terminate-machine --force
Any containers on that machine will not release their leases. (The case without --force does not apply, because juju does not allow you to terminate a machine that still has units on it.) The IP of the machine itself is correctly released.

 * destroy-environment with or without --force
Only the IPs of the actual machines are released. Container IPs "leak".

One use case is the Autopilot: when removing a deployed region, it issues a destroy-environment. If that exits non-zero, landscape then does a destroy-environment --force. Doing it any other way, like issuing destroy-service, is time consuming and exposes Landscape to hook errors.

To give an idea, a cloud deployed on 6 nodes with the autopilot uses 6 IPs for the nodes from MAAS's static range, 37 IPs for the containers from the dynamic range and a few more static ones for virtual IPs for some openstack services. Each time a region is removed, 37 IPs leak in this example.
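
To put rough numbers on that, here is a small sketch of the arithmetic. The 37 container IPs per region come from the example above; the size of the dynamic range (254) is only an assumed figure for illustration, not something measured on a real MAAS.

```python
def leaked_ips(region_removals, container_ips_per_region=37):
    """Container IPs left allocated after a number of region removals."""
    return region_removals * container_ips_per_region


def deployments_before_exhaustion(dynamic_range_size, container_ips_per_region=37):
    """How many deploy/remove cycles fit before a new deployment can no
    longer get enough container IPs, assuming every removal leaks all of them."""
    return dynamic_range_size // container_ips_per_region


if __name__ == "__main__":
    # 254 is a hypothetical dynamic range size, used only as an example.
    print(leaked_ips(3))
    print(deployments_before_exhaustion(254))
```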

The cases that are working are:
 * terminate-machine with no services: the host IP, taken from the static range, is released
 * destroy-service: all container IPs from the service are released. The host IP (from static range) is left untouched because the machine is still up, even though it has no services anymore. It needs a terminate-machine call.
 * destroy-unit
 * destroy-environment: with or without --force, releases only the host IPs, i.e., the ones acquired from the static range.

tags: added: landscape
description: updated
summary: - MAAS provider: terminate-machine --force or destroy-environment (without
- --force) don't DHCP release container IPs
+ MAAS provider: terminate-machine --force or destroy-environment doesn't
+ DHCP release container IPs
summary: - MAAS provider: terminate-machine --force or destroy-environment doesn't
+ MAAS provider: terminate-machine --force or destroy-environment don't
DHCP release container IPs
Changed in juju-core:
milestone: none → 1.24.6
Dimiter Naydenov (dimitern) wrote :

This scenario is addressed by the addressable containers feature on MAAS, which allocates static IPs for containers and so does not leak DHCP leases. In 1.25 (perhaps even 1.24.x - will confirm later), when using MAAS 1.8+, Juju registers its containers as MAAS Devices, providing IPs based on the container MAC. Destroying the host instance deallocates its devices and their resources, and we have confirmed that even juju destroy-environment --force does not leak DHCP leases. The downside is that this behavior is supported only with the "address-allocation" feature flag set when bootstrapping. Although the flag has been available for a while now, it's badly documented; we're actively soliciting feedback and hope to improve this, making it the default container address management process.

Dimiter Naydenov (dimitern) wrote :

Note that destroy-environment might exit with a non-zero code in other cases too, not just when the environment can't be destroyed without --force. A transient error, which could easily be retried to get a graceful destruction, instead triggers unconditional destruction via --force. I agree it's much faster, but unless we rely solely on the provider to clean up properly, resource leaks like this can happen.
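
One way a caller could act on this is to retry the graceful path a few times before falling back to --force. The following is only a sketch of that idea, not what Landscape or juju actually do: the function name, the retry count, the delay, and the use of the --yes flag to skip the confirmation prompt are assumptions made for illustration. The command runner is injectable so the logic can be exercised without a real environment.

```python
import subprocess
import time


def destroy_environment(env_name, run=subprocess.call, graceful_attempts=3,
                        delay=10.0, sleep=time.sleep):
    """Try a graceful destroy a few times before resorting to --force.

    Returns "graceful" or "forced"; raises RuntimeError if even --force fails.
    """
    # Assumed command line; adjust to the juju version in use.
    graceful = ["juju", "destroy-environment", env_name, "--yes"]
    for attempt in range(graceful_attempts):
        if run(graceful) == 0:
            return "graceful"
        if attempt + 1 < graceful_attempts:
            sleep(delay)  # give a transient error some time to clear
    # Only now give up on the normal life cycle and force the destruction.
    if run(graceful + ["--force"]) == 0:
        return "forced"
    raise RuntimeError("destroy-environment --force failed for %s" % env_name)
```

The trade-off is the one mentioned above: retries cost time, but they give the containers a chance to go through a clean shutdown.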

Curtis Hovey (sinzui) wrote :

Was the env bootstrapped with the "address-allocation" feature flag and MAAS 1.8?

tags: added: destroy-machine maas-provider
Changed in juju-core:
status: New → Incomplete
Andreas Hasenack (ahasenack) wrote : Re: [Bug 1483879] Re: MAAS provider: terminate-machine --force or destroy-environment don't DHCP release container IPs

On Wed, Aug 12, 2015 at 10:41 AM, Curtis Hovey <email address hidden> wrote:

> Was the env bootstrapped with the "address-allocation" feature flag and
> the maas 1.8?
>
>
No. Is that all it takes? Enable that feature flag and then these use cases
will work?

Andreas Hasenack (ahasenack) wrote :

Trying with that feature flag. Juju status:
[Services]
NAME    STATUS   EXPOSED  CHARM
ubuntu  unknown  false    cs:trusty/ubuntu-4

[Units]
ID        WORKLOAD-STATE  AGENT-STATE  VERSION  MACHINE  PORTS  PUBLIC-ADDRESS      MESSAGE
ubuntu/0  unknown         idle         1.24.5   0/lxc/0         10.96.14.34
ubuntu/1  unknown         idle         1.24.5   1               correja.scapestack
ubuntu/2  unknown         idle         1.24.5   1/lxc/0         10.96.17.36

[Machines]
ID  STATE    VERSION  DNS                 INS-ID                                                          SERIES  HARDWARE
0   started  1.24.5   darby.scapestack    /MAAS/api/1.0/nodes/node-2a69c69e-39eb-11e5-ab72-2c59e54ace74/  trusty  arch=amd64 cpu-cores=4 mem=16384M tags=openstack-storage
1   started  1.24.5   correja.scapestack  /MAAS/api/1.0/nodes/node-9e9afb06-39ea-11e5-ab72-2c59e54ace74/  trusty  arch=amd64 cpu-cores=4 mem=16384M tags=openstack-admin

Doing terminate-machine 1 --force releases the node IP and the container IP:
Aug 12 19:24:53 atlas maas.ip_addresses: [INFO] User andreas released IP 10.96.17.36
Aug 12 19:24:55 atlas dhcpd: DHCPRELEASE of 10.96.11.19 from 2c:59:e5:3b:a1:e4 via eth0 (not found)

A destroy-environment (without --force) released the bootstrap node IP, but not the container IP that was running there:
Aug 12 19:26:25 atlas dhcpd: DHCPRELEASE of 10.96.11.18 from 2c:59:e5:3b:01:40 via eth0 (not found)
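
For anyone repeating this test, a small script can pull the released addresses out of the syslog instead of reading it by eye. This is a sketch that assumes the two message formats shown in the log lines above (dhcpd's DHCPRELEASE and the maas.ip_addresses "released IP" entry); other MAAS versions may log differently.

```python
import re

# dhcpd line: "DHCPRELEASE of <ip> from <mac> via <iface> ..."
DHCPRELEASE_RE = re.compile(
    r"DHCPRELEASE of (?P<ip>\d+\.\d+\.\d+\.\d+) "
    r"from (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)
# MAAS line: "User <name> released IP <ip>"
API_RELEASE_RE = re.compile(r"released IP (?P<ip>\d+\.\d+\.\d+\.\d+)")


def released_ips(log_lines):
    """Return (dhcp, api): a dict of IP -> MAC for DHCP releases and a
    set of IPs released through the MAAS API."""
    dhcp, api = {}, set()
    for line in log_lines:
        match = DHCPRELEASE_RE.search(line)
        if match:
            dhcp[match.group("ip")] = match.group("mac").lower()
            continue
        match = API_RELEASE_RE.search(line)
        if match:
            api.add(match.group("ip"))
    return dhcp, api
```

Comparing the result against the addresses shown in juju status makes it easy to see which container IPs never show up in either set.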

Andreas Hasenack (ahasenack) wrote :

10.96.14.34 is still allocated to my user in maas:
$ maas andreas-atlas ipaddresses read
Success.
Machine-readable output follows:
[
...
    {
        "alloc_type": 4,
        "ip": "10.96.14.34",
        "resource_uri": "/MAAS/api/1.0/ipaddresses/",
        "created": "2015-08-12T19:08:48.149"
    }
]
$
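
The same check can be scripted. The sketch below assumes the CLI output has the shape shown above: a short text preamble followed by a JSON list whose entries have an "ip" field. The function names are made up for illustration.

```python
import json


def parse_ipaddresses_output(text):
    """Extract the JSON list from 'maas <profile> ipaddresses read' output."""
    start = text.find("[")  # skip the "Success." preamble
    if start == -1:
        return []
    return json.loads(text[start:])


def still_allocated(text, expected_released):
    """Return the IPs that should have been released but are still listed."""
    allocated = {entry["ip"] for entry in parse_ipaddresses_output(text)}
    return sorted(allocated & set(expected_released))
```

Feeding it the container addresses from juju status after a destroy-environment would list any that leaked.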

Curtis Hovey (sinzui)
Changed in juju-core:
status: Incomplete → Triaged
milestone: 1.24.6 → 1.25.0
importance: Undecided → High
Dimiter Naydenov (dimitern) wrote :

Yes, because in 1.24.5 we didn't support the MAAS 1.8 Devices API yet. It is supported in a feature branch, so if you're willing to try a tarball with binaries and "juju upgrade-juju --upload-tools", or bootstrap directly with --upload-tools, you can give us some early feedback.

In 1.24.5 address allocation is still useful, but not if the host machine is shot in the head. Since bug 1348663 was fixed in 1.24.1, Juju injects a clean shutdown job (for upstart or systemd) that brings down all NICs on shutdown, so a graceful shutdown or reboot, as well as killing the host of a container, triggers DHCPRELEASE. Unfortunately, from what I've investigated, it seems the fix was unknowingly but effectively reverted by some cloudinit changes in 1.24.5.

I'm assigning this to myself and will try doing some live testing to verify the fix. The original fix was https://github.com/juju/juju/pull/2548 and contains more details.

Changed in juju-core:
assignee: nobody → Dimiter Naydenov (dimitern)
status: Triaged → In Progress
Dimiter Naydenov (dimitern) wrote :

If your problem is only the bootstrap node, can you confirm there's no /etc/init/juju-clean-shutdown.conf with upstart or /etc/systemd/system/juju-clean-shutdown.service with systemd on the bootstrap node and/or its containers' file systems?

If the job is there in both host and containers, and the leases still leak, please attach machine-0*.log at DEBUG level at least.

What are the bootstrap arguments and --debug log?
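
To check for the job quickly on a host and inside its containers' file systems, something like the sketch below could be run on the node. The two job paths are the ones named above; pointing the root argument at a container rootfs (for example under /var/lib/lxc/<name>/rootfs) is an assumption about where LXC keeps it on that host.

```python
import os

# Job locations for upstart and systemd, relative to the file system root.
JOB_PATHS = (
    "etc/init/juju-clean-shutdown.conf",
    "etc/systemd/system/juju-clean-shutdown.service",
)


def find_shutdown_jobs(root="/"):
    """Return the juju-clean-shutdown job files that exist under root."""
    candidates = [os.path.join(root, path) for path in JOB_PATHS]
    return [path for path in candidates if os.path.isfile(path)]


if __name__ == "__main__":
    found = find_shutdown_jobs("/")
    print(found if found else "no juju-clean-shutdown job found")
```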

Andreas Hasenack (ahasenack) wrote :

It happens with all nodes, not just bootstrap. destroy-environment does not
deallocate the container IPs.
On Aug 12, 2015 21:50, "Dimiter Naydenov" <email address hidden>
wrote:

> If you problem is only the bootstrap node, can you confirm there's no
> /etc/init/juju-clean-shutdown.conf with upstart or /etc/systemd/system
> /juju-clean-shutdown.service with systemd on the bootstrap node and/or
> its containers file systems' ?
>
> If the job is there in both host and containers, and the leases still
> leak, please attach machine-0*.log at DEBUG level at least.
>
> What are the bootstrap arguments and --debug log?

Dimiter Naydenov (dimitern) wrote :

Andreas confirmed that with the feature flag the shutdown job is present. In summary:

<andreas> bootstrap, deploy to one extra node (so you have bootstrap plus one), deploy containers to both
<andreas> destroy-environment
<andreas> node IPs are dhcp-released
<andreas> container IPs are not released (maas api)

So with the feature flag on, in the terminate-machine (w/ or w/o --force) case both static container IPs and DHCP-based node IPs are released. Destroy-environment without --force did not succeed multiple times (need to get logs about this!), and when it did node IPs were released but container IPs were not (no chance for the provisioner to call maas api, so it's expected). The latter, as discussed, should be solved by the 1.25 devices maas api support (need to confirm if only with the feature flag on).

Without the feature flag the shutdown cleanup job introduced to ensure leases are released seems to be missing. Most likely due to https://github.com/juju/juju/pull/2628 - will test independently.

Andreas: Please provide some destroy-environment --debug logs (and MAAS and machine-0.log at DEBUG level), both when it fails and when it succeeds with --force.

Andreas Hasenack (ahasenack) wrote :

Sorry, I don't have bandwidth to do that now I'm afraid.

Changed in juju-core:
status: In Progress → Triaged
assignee: Dimiter Naydenov (dimitern) → nobody
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.25-alpha1 → 1.25-beta1
Alexis Bruemmer (alexis-bruemmer) wrote :

More logs are needed for further analysis (see comment #10)

Changed in juju-core:
status: Triaged → Incomplete
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.25-beta1 → 1.25-beta2
David Britton (dpb)
tags: added: kanban-cross-team
tags: removed: kanban-cross-team
Dean Henrichsmeyer (dean) wrote :

We've articulated how to reproduce this behavior. It happens every single time. If you need extra logs, please reproduce the situation yourself and introspect as necessary.

Changed in juju-core:
status: Incomplete → Confirmed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Confirmed → Triaged
Dimiter Naydenov (dimitern) wrote :

I can't reproduce this with 1.8 MAAS, as the "juju-clean-shutdown" upstart/systemd job stops all NICs on shutdown, both in the machine itself and every container juju provisions. I can see with dhcpdump DHCPRELEASE is being sent from nodes and containers. Since the address-allocation feature flag was off, I don't expect to see anything in staticipaddresses using MAAS CLI (and it was empty).

Andreas Hasenack (ahasenack) wrote :

My attempts in the last comments were with address-allocation enabled. See comments #5 and #6.

Cheryl Jennings (cherylj) wrote :

Walking through the destroy-environment code, I can see that we don't destroy containers. This was in response to bug: https://bugs.launchpad.net/juju-core/+bug/1325830

Looking at possible solutions.

Cheryl Jennings (cherylj) wrote :

Discussed some with Tim today. Going to email William for input on how to properly clean up container resources as the solution will not be trivial.

Dimiter Naydenov (dimitern) wrote :

Please, do NOT use the experimental addressable containers feature flag for anything important (prod/staging/etc.), as it has known issues and is going to be completely changed in the upcoming months.

That said, I think there are (possibly) several separate issues discussed here, and we need separate bugs for them:
1. DHCPRELEASE not being sent from containers on shutdown, on destroy-machine */lxc/* [--force], *without* the feature flag.
2. DHCPRELEASE not being sent from containers on shutdown, on destroy-environment (without --force), *without* the feature flag.
3. DHCPRELEASE not being sent from containers and/or nodes on shutdown, on destroy-environment --force, *without* the feature flag.
4. Statically allocated container IPs are NOT released on container/host destruction, *with* the feature flag.
5. Same as above, except container IPs ARE deallocated (no longer linked to the host/container MAC address), but not RELEASED (still reserved for the user). I suspect there are differences in behavior here between "destroy-environment --force" and destroy-environment without --force, destroy-machine (w/ or w/o --force).

Changed in juju-core:
milestone: 1.25-beta2 → 1.25.1
Dimiter Naydenov (dimitern) wrote :

To handle the case where the container is not sending DHCPRELEASE reliably, even with the shutdown job, we have a few options:

1. Delay the shutdown of the host to give time to all containers to shut down cleanly.
This however is also not reliable, as it depends on how the node was shut down (ACPI: poweroff, etc. - depending on the power type MAAS uses I guess)

2. If we assume MAAS 1.8+ with Devices API support, which we already support when the feature flag is on, we can extract a few steps from the ones below. What we do during container provisioning on MAAS 1.8+ with the FF on is roughly this:
- while preparing both the userdata and lxc.conf for the new container, juju generates a MAC address for it (using a template)
- juju creates a device in MAAS, specifying the hostname, parent node id (the host), and that MAC address
- at this point MAAS knows the parent-child relationship independently of juju, and also what MAC the device has.
- juju calls claim_sticky_ip_address API for the device (explicitly specifying which address to reserve)
- if the above fails (i.e. the requested IP is not available), juju retries with another IP
- once we successfully get the IP, we finish generating the userdata, incl. a script rendering /e/n/i in the LXC with a static config, finish the lxc.conf as well, populating the IP and other info.
- finally we start the container, and it gets the address
- when removing the container only (not as a consequence of its host's destruction), we call device remove, and MAAS takes care of releasing the device and its allocated IPs.
- when calling destroy-machine (w/ or w/o --force) or destroy-environment (no --force), the above happens as well, as we shut things down gracefully (cleaning up units, services, machines, containers - normal life cycle loop)
- when calling destroy-env --force, juju just calls nodes release API, passing all env nodes' IDs (and since maas knows node->device->mac->ip link, it can also clean up)
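
The "claim, and retry with another IP if that fails" step from the list above can be sketched as follows. This is not juju's code: the claim callable stands in for whatever wraps the claim_sticky_ip_address call, and AddressUnavailable is a hypothetical exception used here to signal that the requested IP was not available.

```python
class AddressUnavailable(Exception):
    """Hypothetical error: the requested IP could not be claimed."""


def claim_first_available(claim, candidates):
    """Try each candidate IP in turn and return the first one claimed.

    claim(ip) is expected to raise AddressUnavailable when the IP is taken.
    """
    for ip in candidates:
        try:
            claim(ip)
        except AddressUnavailable:
            continue  # requested IP not available, retry with another one
        return ip
    raise AddressUnavailable("no candidate address could be claimed")
```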

Now, a lot of the above steps are not necessary to ensure all node and container IPs are cleaned up properly. We only need to let MAAS know the relationship chain (node->device->mac->ip), and the clean up phase will work even with destroy-env --force.
I think what we need is only this:
- generate a MAC during userdata generation for a container, set it to the lxc.conf as well (so we know what MAC the container comes up with)
- when we have it, call device create passing the hostname, mac_address, and parent_id (store the device id for the container)
- finish userdata generation, start the container, etc.
- (ideally) remove the device when the container is removed (not needed just to make destroy-env --force work as expected)
- profit!

Since MAAS knows the MAC, the LXC can happily use the default /e/n/i with DHCP (no special lxc.conf or /e/n/i changes needed - apart from lxc.network.hwaddr for the MAC), and calling claim_sticky_ip_address should not be necessary.
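
To illustrate why knowing the node->device->mac->ip chain is enough, here is a toy in-memory model. It does not talk to MAAS and is not the actual fix: the ToyMAAS class and its method names are invented for this example, and the MAC template value is an assumption. The point is only that once every container is registered as a device under its parent node, releasing the node can clean up the container IPs without juju being involved.

```python
import random


def random_mac(template="00:16:3e:xx:xx:xx", rng=random):
    """Fill each 'x' in the (assumed) template with a random hex digit."""
    return "".join(rng.choice("0123456789abcdef") if char == "x" else char
                   for char in template)


class ToyMAAS:
    """Toy stand-in that only tracks node -> device -> MAC -> IP links."""

    def __init__(self):
        self.devices = {}  # device id -> {"hostname", "mac", "parent"}
        self.ips = {}      # MAC -> IP handed out for that MAC
        self._next_id = 0

    def create_device(self, hostname, mac_address, parent_id):
        self._next_id += 1
        device_id = "device-%d" % self._next_id
        self.devices[device_id] = {
            "hostname": hostname, "mac": mac_address, "parent": parent_id}
        return device_id

    def assign_ip(self, mac_address, ip):
        self.ips[mac_address] = ip

    def remove_device(self, device_id):
        """Container removed on its own: free the device and its IP."""
        device = self.devices.pop(device_id)
        return self.ips.pop(device["mac"], None)

    def release_node(self, node_id):
        """Host released (e.g. destroy-env --force): free child devices too."""
        children = [dev_id for dev_id, info in self.devices.items()
                    if info["parent"] == node_id]
        freed = []
        for dev_id in children:
            ip = self.remove_device(dev_id)
            if ip is not None:
                freed.append(ip)
        return freed
```

In this model release_node returns the container IPs that were freed along with the host, which is the behavior the devices-based approach relies on.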

tags: added: bug-squad
Dimiter Naydenov (dimitern) wrote :

I'm starting to work on a fix using the devices API (but by default - no feature flag will be needed), as described in comment #19, starting with the 1.24 release first.

Changed in juju-core:
milestone: 1.25.1 → 1.26-alpha1
assignee: nobody → Dimiter Naydenov (dimitern)
assignee: Dimiter Naydenov (dimitern) → nobody
Dimiter Naydenov (dimitern) wrote :

Since 1.24 does not yet have the addressable containers I've switched to 1.25, so I'll fix it there first, then backport it to 1.24 and forward port it to master.

Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.26-alpha1 → 1.26-alpha2
tags: added: sts
Cheryl Jennings (cherylj) wrote :

Dimiter is actively working on the fix, and should have a branch up for review soon.

Dimiter Naydenov (dimitern) wrote :

Fix for 1.24 proposed: https://github.com/juju/juju/pull/3684
Still have to finish a few more manual tests, but so far it looks good!

Dimiter Naydenov (dimitern) wrote :

It was decided not to make a 1.24.8 release, so I'll forward port my proposed fix for 1.25 - it should simplify a few things in fact.

Dimiter Naydenov (dimitern) wrote :

Slightly simpler fix forward ported from the 1.24 branch and proposed as: https://github.com/juju/juju/pull/3730

Dimiter Naydenov (dimitern) wrote :

Fix for 1.25 has landed and has been forward ported to master, and tested the same way. PR: https://github.com/juju/juju/pull/3746

Changed in juju-core:
status: Triaged → In Progress
assignee: nobody → Dimiter Naydenov (dimitern)
Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released