Maas creates a pod machine even when a machine tag is set as a constraint

Bug #1706196 reported by Jason Hobbs
Affects   Status         Importance   Assigned to      Milestone
MAAS      Fix Released   High         Newell Jensen
2.2       Fix Released   High         Newell Jensen

Bug Description

I am using pods, and have VMs created from those pods in three zones that I've created (zone1, zone2, zone3). There is also a 'default' zone in MAAS, which I'm not allowed to edit or delete, and has no nodes in it.

When I juju bootstrap without a 'zone' placement directive, juju tries first to allocate a vm from the 'default' zone in MAAS. MAAS sees that there is no matching node, then that I'm using pods, and creates a new VM to respond to juju's request for a node.

My expected behavior here is for juju to end up bootstrapping onto one of the existing VMs in one of the zones with nodes in it. I can get that by using a '--to zone=zone1' placement directive, but I shouldn't have to - it should just work and use one of the existing VMs.
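(For the record, the working invocation is roughly the following; 'foundations-maas' is the cloud name that appears in the debug log below, and the zone name is just an example:)

juju bootstrap foundations-maas --to zone=zone1   # lands on an existing VM in zone1 instead of letting MAAS compose a new one in 'default'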

This also affects juju's "enable-ha" command, where there is no ability to specify a zone. In that case there is no workaround - it always tries to use the default zone rather than the zones I'm using that already have machines available. The only solution here is to put nodes in the default zone, even though I don't want a zone named 'default'.
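(As far as I can tell there is no placement hook at all for this in 2.2, e.g.:)

juju enable-ha -n 3   # no zone or placement option here, so the extra controllers always go through the 'default' zone path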

Here's part of the maas.log where the node was created:

Jul 24 20:09:01 infra1 maas.api: [info] Request from user root to acquire a machine with constraints: [('agent_name', ['4fc971aa-41fc-4bae-8ac8-724fd407c338']), ('zone', ['default']), ('mem', ['3584']), ('tags', ['vm'])]
Jul 24 20:09:03 infra1 maas.drivers.pod.virsh: [info] living-deer: Successfully set network boot order
Jul 24 20:09:04 infra1 maas.node: [info] living-deer: Storage layout was set to flat.
Jul 24 20:09:04 infra1 maas.node: [info] living-deer: Status transition from READY to ALLOCATED
Jul 24 20:09:04 infra1 maas.node: [info] living-deer: allocated to user root
Jul 24 20:09:05 infra1 maas.interface: [info] Allocated automatic IP address 10.245.222.20 for eth0 (physical) on living-deer.
Jul 24 20:09:05 infra1 maas.node: [info] living-deer: Status transition from ALLOCATED to DEPLOYING
Jul 24 20:09:06 infra1 maas.power: [info] Changing power state (on) of node: living-deer (pmmn6n)

Here's the juju debug log showing it trying to acquire a node:
20:09:00 INFO cmd bootstrap.go:357 Starting new instance for initial controller
Launching controller instance(s) on foundations-maas...
20:09:04 DEBUG juju.cloudconfig.instancecfg instancecfg.go:832 Setting numa ctl preference to false
20:09:05 DEBUG juju.service discovery.go:63 discovered init system "systemd" from series "xenial"
20:09:05 DEBUG juju.provider.maas environ.go:1018 maas user data; 3836 bytes
20:09:06 DEBUG juju.provider.maas environ.go:1050 started instance "pmmn6n"
 - pmmn6n (arch=amd64 mem=3.5G cores=1)
20:09:06 INFO juju.environs.bootstrap bootstrap.go:606 newest version: 2.2.2
20:09:06 INFO juju.environs.bootstrap bootstrap.go:621 picked bootstrap agent binary version: 2.2.2
20:09:06 INFO juju.environs.bootstrap bootstrap.go:393 Installing Juju agent on bootstrap instance
20:09:08 INFO cmd bootstrap.go:485 Fetching Juju GUI 2.7.5

This is with juju 2.2.2 and maas 2.2.2 (6094-g78d97d0-0ubuntu1~16.04.1).

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

IMO pods should have zones associated with them - at least for virsh pods it makes sense, since a machine hosting a virsh pod sits in a specific zone. If I could mark all of my pods with zones, then in this case MAAS would see there are no pods in the default zone and would not try to create a node there, and juju would move on to the zones with nodes, just as it normally does without pods.

description: updated
tags: added: cdo-qa
description: updated
Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

I added juju to the bug because it's possible part of this could be resolved there - enable-ha could take a list of zones, for example.

description: updated
Revision history for this message
Andres Rodriguez (andreserl) wrote :

Based on my testing, this doesn't seem to be a bug in MAAS.

I have:

1 pod
0 machines in 'default' zone
2 machines in 'zone1' zone
4 machines in 'zone2' zone

I did the following commands:

maas admin machines allocate # allocated machine in zone1
maas admin machines allocate mem=1024 # allocated machine in zone2
maas admin machines allocate mem=1024 # allocated machine in zone1
maas admin machines allocate zone=default # created and allocated a machine from the pod.

As such, MAAS is working as you would expect. The issue may be that juju always sends zone=default when the user does not specify a specific zone.

Changed in maas:
status: New → Won't Fix
status: Won't Fix → Invalid
Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

juju will always try to balance units across availability zones. Since we can't get rid of the default zone, and pods don't have zones associated with them, this is still an issue. How is juju supposed to know not to use the default zone, but to use the other ones instead?

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

If bug 1381125 were fixed, this wouldn't be an issue.

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

Or bug 1706438 - which is the 'pods don't have zones' part of this bug.

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

roaksoax - the difference between your test and our use case is that ours includes a tag constraint - it asks for a machine with the tag 'vm'. To me, that implies an existing machine - not one that doesn't exist yet. If a user wanted a brand new machine, they would not supply a tag constraint.

Changed in maas:
status: Invalid → New
Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

In this case, when there are tags supplied in the constraint, I do not believe MAAS should be allocating a new node from a pod.
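(To illustrate with the constraints from the bootstrap request above - the comment below describes what I would expect, not what 2.2.2 currently does:)

maas admin machines allocate zone=default tags=vm   # expected: allocation failure, since nothing in 'default' carries the tag - not a freshly composed pod VM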

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

One of the most important use cases for tags is acting as constraints for things maas doesn't model. For example, if I have a tag that is generated based on an XPath query against lshw output, and I use that tag as a constraint when allocating a node, I want to either get a machine whose lshw matches the query or get an allocation failure - I do not want a VM generated to match the other constraints. The current behavior breaks that use case.
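(A concrete sketch of that workflow - the tag name and XPath expression below are made up for illustration:)

maas admin tags create name=gpu definition='//node[@id="display"]/vendor[contains(., "NVIDIA")]'   # tag applied automatically based on lshw output
maas admin machines allocate tags=gpu   # should return a machine whose hardware matches, or fail - never a composed VM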

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

The same thing affects bundle deployment. We have 3 zones - default, zone2, zone3. juju is asking for a node in zone2, and MAAS, instead of saying it has nothing there, is allocating a VM on pod1. All of the allocations are going to pod1. We have available machines in zone3, but juju never gets around to asking for them - MAAS always responds to the zone2 request with a newly created VM.

Here's an example request:
Jul 25 22:01:44 infra3 maas.api: [info] Request from user root to acquire a machine with constraints: [('agent_name', ['775df0f6-e0f0-48be-8fa9-7110eece5c5e']), ('zone', ['zone2']), ('tags', ['vm']), ('interfaces', ['peer:space=1;0:space=1;client:space=1;data:space=1;logs:space=1;nrpe-external-master:space=1'])]

In response to this, MAAS is creating a VM on pod1 (always pod1 - even though we have 3 pods).

IMO MAAS should see that there is a tag constraint included and say nothing available matches.
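(In other words, something like this - the zone names are the ones from this deploy, and the comments describe the behaviour I would expect:)

maas admin machines allocate zone=zone2 tags=vm   # expected: "nothing available matches", so juju can move on and try zone3
maas admin machines allocate zone=zone2           # with no tag constraint, composing a new pod VM would be fair game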

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

Here is the output from pods read on our setup:

http://paste.ubuntu.com/25172845/

Here is our bundle:
https://pastebin.canonical.com/194396/

summary: - when using pods, during juju bootstrap maas creates vm in zone default
- when a vm already exists in another zone
+ when using pods, maas creates vm's in response to allocation requests
+ that include tag and zone constraints
summary: - when using pods, maas creates vm's in response to allocation requests
+ when using pods, maas creates VMs in response to allocation requests
that include tag and zone constraints
Revision history for this message
Andres Rodriguez (andreserl) wrote : Re: when using pods, maas creates VMs in response to allocation requests that include tag and zone constraints

@Jason,

I agree, MAAS should return "no machine available" when both a zone and a tag are provided. If no tag were provided, it should create a new VM.

That said, MAAS does support storage and network tags for pods, which is where the issue lies.

Finally, do you mean for this bug to address a whole bunch of things, or can I drop Juju from it and rename it to "Maas creates a pod machine even when a machine tag is set as a constraint"?

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

roaksoax +1 for the rename - I have been filing other bugs to cover the other issues I've come across.

summary: - when using pods, maas creates VMs in response to allocation requests
- that include tag and zone constraints
+ Maas creates a pod machine even when a machine tag is set as a
+ constraint
Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

I created bug 1706462 against juju to cover the way juju asks for a machine in a specific zone, even when no zone constraint was supplied by the user.

no longer affects: juju
Changed in maas:
milestone: none → 2.3.0
importance: Undecided → High
status: New → Triaged
assignee: nobody → Newell Jensen (newell-jensen)
tags: added: pod
Changed in maas:
status: Triaged → In Progress
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
milestone: 2.3.0 → 2.3.0alpha1
Changed in maas:
status: Fix Committed → Fix Released
tags: added: foundation-engine
tags: added: foundations-engine
removed: foundation-engine
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Another workaround for 2.2.2 is using hostnames defined in MAAS:

juju add-unit kibana --to kibana-3.maas

If you make the mistake of using add-unit with virsh pods on MAAS 2.2.2, beware that Juju will create a large number of VMs until it runs out of retries (several VMs per attempt).
