destroy-environment on an unbootstrapped MAAS environment can release all my nodes

Bug #1490865 reported by Mike McCracken on 2015-09-01
This bug affects 2 people
Affects      Importance   Assigned to
MAAS         High         Unassigned
juju         High         Cheryl Jennings
juju-core    High         Tim Penhey
1.25         High         Tim Penhey

Bug Description

sabdfl reported a bug ( https://github.com/Ubuntu-Solutions-Engineering/openstack-installer/issues/659 ) against the openstack-installer: the installer failed to bootstrap a maas env (juju wasn't installed) and then, after juju was installed manually, ran "destroy-environment --force --yes maas".

This resulted in the installer host itself being released.

Expected behavior would be for nothing to happen.

I haven't tested this with a sample environment yet, but I believe juju sees a valid maas environment and calls provider/common/destroy.go's destroyInstances, which calls env.AllInstances ( https://github.com/juju/juju/blob/master/provider/common/destroy.go#L34 ), which in turn calls environ.acquiredInstances. That adds an agent_name filter to the MAAS API GET call here: https://github.com/juju/juju/blob/master/provider/maas/environ.go#L1298

However, if maas-agent-name is not set in the config (because it was never bootstrapped), maasAgentName() will return "": https://github.com/juju/juju/blob/master/provider/maas/config.go#L49

It appears that passing the empty string as agent_name to MAAS will return all instances owned by the user.
MAAS simply reads the value from the query string here: http://bazaar.launchpad.net/~maas-committers/maas/trunk/view/head:/src/maasserver/api/nodes.py#L196
It then filters the Django QuerySet of Nodes on that value, and Nodes have a default agent_name of '': http://bazaar.launchpad.net/~maas-committers/maas/trunk/view/head:/src/maasserver/models/node.py#L466

So if a system was acquired by something that didn't change the default agent_name, then it looks like that filter will leave it in, and thus juju will end up releasing it because it thinks its agent_name is '' also.
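
To illustrate the suspected mechanism, here is a minimal Go sketch of an exact-match filter on agent_name. The node type, the filterByAgentName helper and the sample system IDs are all hypothetical; this is not MAAS or juju code, only a model of the behaviour described above.

```go
package main

import "fmt"

// node is a hypothetical, simplified stand-in for a MAAS node record.
type node struct {
	SystemID  string
	AgentName string // stays "" unless whoever acquired the node set it
}

// filterByAgentName mimics an exact-match filter on agent_name.
// An empty agentName matches every node that kept the default "".
func filterByAgentName(nodes []node, agentName string) []node {
	var matched []node
	for _, n := range nodes {
		if n.AgentName == agentName {
			matched = append(matched, n)
		}
	}
	return matched
}

func main() {
	// Hypothetical nodes owned by one user; only the last was acquired by juju.
	owned := []node{
		{SystemID: "installer-host", AgentName: ""},
		{SystemID: "manual-deploy", AgentName: ""},
		{SystemID: "juju-machine-0", AgentName: "some-env-uuid"},
	}
	for _, n := range filterByAgentName(owned, "") {
		fmt.Println("would be released:", n.SystemID)
	}
}
```

With an empty filter value, the two nodes that juju never touched are the ones selected, which matches the reported symptom.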

Mike McCracken (mikemc) on 2015-09-01
tags: added: openstack-installer
Mike McCracken (mikemc) wrote :

I've reproduced this on trusty with maas 1.8.

Here's what I did:

1. spin up two hosts on maas
2. sign into one of them
3. write out a valid environments.yaml with a 'maas' env with valid maas api key and ip. (via the openstack installer packaging)
4. attempt a bootstrap that fails (in the bug it failed because juju wasn't installed, but it's also sufficient to bootstrap --to nonexistent-system.isnt.in.maas)
5. 'juju destroy-environment maas' fails reasonably with an error message that it can't connect to the (juju) API because env isn't bootstrapped.
6. 'juju destroy-environment --force maas ' issues these requests to the MAAS API:

- 172.16.0.61 - - [01/Sep/2015:13:40:37 -0400] "GET /MAAS/api/1.0/nodes/?agent_name=&op=list HTTP/1.1" 200 1563 "-" "Go 1.1 package http"
- 172.16.0.61 - - [01/Sep/2015:13:40:37 -0400] "POST /MAAS/api/1.0/nodes/?op=release HTTP/1.1" 200 355 "-" "Go 1.1 package http"

7. everything I owned in maas is now released, including the system I was running the juju client on and the other system I deployed, neither of which was created by juju.
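
The empty "agent_name=" in the first logged request is what you get when an empty string is encoded as a query parameter. The sketch below uses Go's standard net/url package to show this; listNodesQuery is a hypothetical helper, and I'm not claiming this is how juju's MAAS client actually builds the request.

```go
package main

import (
	"fmt"
	"net/url"
)

// listNodesQuery is a hypothetical helper that builds the query string
// for a node list call filtered by agent name.
func listNodesQuery(agentName string) string {
	params := url.Values{}
	params.Set("agent_name", agentName) // "" is still sent, as "agent_name="
	params.Set("op", "list")
	return params.Encode() // Encode sorts parameters by key
}

func main() {
	fmt.Println(listNodesQuery(""))
}
```

So an unset maas-agent-name is not dropped from the request; it is sent to the server as an explicit empty filter value.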

I think that juju should check if agent_name is set to "" and avoid releasing in that case. Since it is set when bootstrap succeeds, it appears reasonable to assume that if agent_name isn't set, there's no safe way to destroy only the things juju started, so it should bail.
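
A rough sketch of that guard in Go, under stated assumptions: safeDestroy, errNoAgentName and the listIDs/release callbacks are hypothetical names standing in for whatever juju's provider code really uses. The sketch also skips the release call when nothing was listed, since the logged op=release request carried no node list.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoAgentName is a hypothetical error for an environment that was never bootstrapped.
var errNoAgentName = errors.New("maas-agent-name is empty; refusing to release nodes")

// safeDestroy is a hypothetical wrapper around the list-then-release sequence.
// listIDs and release stand in for the real MAAS API calls.
func safeDestroy(agentName string, listIDs func(string) ([]string, error), release func([]string) error) error {
	if agentName == "" {
		// Without an agent name there is no way to tell juju's nodes from the user's own.
		return errNoAgentName
	}
	ids, err := listIDs(agentName)
	if err != nil {
		return err
	}
	if len(ids) == 0 {
		// Nothing to release, so never send a release call without a node list.
		return nil
	}
	return release(ids)
}

func main() {
	list := func(string) ([]string, error) { return []string{"node-1"}, nil }
	release := func(ids []string) error { fmt.Println("releasing", ids); return nil }
	fmt.Println(safeDestroy("", list, release))
}
```

Bailing out before any API call is the conservative choice here: the cost is a manual cleanup for very old environments, while the alternative is releasing machines juju never started.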

tags: added: cloud-installer
removed: openstack-installer
Mike McCracken (mikemc) wrote :

Added affecting MAAS as well because on second look I don't understand why MAAS is releasing all nodes. It looks like the op=release request should not do anything.

description: updated
Jason Hobbs (jason-hobbs) wrote :

We saw something similar happen in OIL yesterday - all nodes were released around the time we received a release request from juju.

tags: added: oil
Andres Rodriguez (andreserl) wrote :

Seems like a Juju bug.

Blake Rouse (blake-rouse) wrote :

If Juju calls /MAAS/api/1.0/nodes/?op=release without a list of nodes MAAS will release all nodes that the user has allocated. This is potentially bad behavior for all nodes.

I still think it should be possible for a user to release all nodes but should not be the default behavior. Adding a force_all=True parameter would be better behavior to make it explicit for the user.
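
As a sketch of what an explicit opt-in could look like on the server side (written in Go for illustration only; MAAS itself is Python, and nodesToRelease, errNoNodesSpecified and the forceAll flag are hypothetical names for the proposed force_all parameter):

```go
package main

import (
	"errors"
	"fmt"
)

// errNoNodesSpecified is a hypothetical error for a release request with no node list.
var errNoNodesSpecified = errors.New("no nodes specified; pass force_all to release everything")

// nodesToRelease decides which of the user's allocated nodes a release request covers.
// An empty request releases everything only when forceAll is set explicitly.
func nodesToRelease(allocated, requested []string, forceAll bool) ([]string, error) {
	if len(requested) == 0 {
		if !forceAll {
			return nil, errNoNodesSpecified
		}
		return append([]string(nil), allocated...), nil
	}
	owned := make(map[string]bool, len(allocated))
	for _, id := range allocated {
		owned[id] = true
	}
	var out []string
	for _, id := range requested {
		if owned[id] { // in this sketch, nodes the user does not hold are silently skipped
			out = append(out, id)
		}
	}
	return out, nil
}

func main() {
	allocated := []string{"node-a", "node-b"}
	_, err := nodesToRelease(allocated, nil, false)
	fmt.Println(err)
}
```

Releasing everything stays possible, but a request without a node list is rejected unless the caller asks for it explicitly.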

Changed in maas:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.9.0
Blake Rouse (blake-rouse) wrote :

I still think it would be good for Juju to not make that call if it knows it doesn't have any allocated nodes.

Curtis Hovey (sinzui) on 2015-09-08
tags: added: destroy-environment maas-provider
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.25.1
Changed in juju-core:
milestone: 1.25.1 → 1.25.2
Tim Penhey (thumper) wrote :

It seems that the maas-agent-name started to be used with Juju 1.18.

Any environment created since then will have a valid maas-agent-name.

Destroy now effectively errors out if the agent name is empty. A warning message explains why and says that, if this is a running environment, the machines must be decommissioned manually using maas.

Changed in juju-core:
assignee: nobody → Tim Penhey (thumper)
status: Triaged → In Progress
Tim Penhey (thumper) on 2015-11-30
Changed in juju-core:
status: In Progress → Fix Committed
Changed in juju-core:
milestone: 1.25.2 → 1.26-alpha3
status: Fix Committed → Triaged
Changed in juju-core:
milestone: 1.26-alpha3 → 2.0-alpha1
Changed in juju-core:
milestone: 2.0-alpha1 → 2.0-alpha2
Changed in juju-core:
milestone: 2.0-alpha2 → 2.0-alpha3
Changed in maas:
status: Triaged → Invalid
milestone: 1.9.0 → none
Changed in juju-core:
milestone: 2.0-alpha3 → 2.0-beta3
Changed in juju-core:
assignee: Tim Penhey (thumper) → Cheryl Jennings (cherylj)
Changed in juju-core:
status: Triaged → Fix Committed
Curtis Hovey (sinzui) on 2016-03-25
Changed in juju-core:
status: Fix Committed → Fix Released
affects: juju-core → juju
Changed in juju:
milestone: 2.0-beta3 → none
milestone: none → 2.0-beta3
Changed in juju-core:
assignee: nobody → Tim Penhey (thumper)
importance: Undecided → High
status: New → Fix Released