juju 2.3 incorrect unit placement

Bug #1765719 reported by Xavier Esteve on 2018-04-20
This bug affects 3 people
Affects: juju
Importance: High
Assigned to: Tim Penhey

Bug Description

After deploying an OpenStack bundle with "juju deploy bundle.yaml", if we remove a unit with "juju remove-unit ceilometer/0" and rerun "juju deploy bundle.yaml", the replacement unit is placed on a machine that already hosts a ceilometer unit instead of on the machine the removed unit came from.

Output from the first deployment, where ceilometer was deployed to machines 0, 1 and 12:
ceilometer/0* waiting idle 0/lxd/0 100.84.4.14 8777/tcp
ceilometer/1 waiting idle 1/lxd/0 100.84.5.22 8777/tcp
ceilometer/2 waiting idle 12/lxd/0 100.84.6.25 8777/tcp

After removing the unit and rerunning the deploy, the placement is 1, 12 and 1:
ceilometer/1* waiting idle 1/lxd/0 100.84.5.22 8777/tcp
ceilometer/2 waiting idle 12/lxd/0 100.84.6.25 8777/tcp
ceilometer/3 maintenance executing 1/lxd/14 100.84.5.23

We then removed unit ceilometer/3 and deployed again, and this time the unit was deployed to machine 12. Here are the logs from --debug:

10:38:19 DEBUG juju.cmd.juju.application bundle.go:1019 resolve "ceilometer" from map[string]string{}
10:38:19 DEBUG juju.cmd.juju.application bundle.go:700 addUnit: placement "lxd:12"
10:38:19 DEBUG juju.cmd.juju.application bundle.go:974 resolveMachine("12")
10:38:19 DEBUG juju.cmd.juju.application bundle.go:1019 resolve "12" from map[string]string{}
10:38:19 DEBUG juju.cmd.juju.application bundle.go:720 resolved: placement "lxd:12"
10:38:19 DEBUG juju.cmd.juju.application bundle.go:739 added ceilometer/4 unit to new machine
10:38:19 INFO cmd bundle.go:382 Deploy of bundle completed.
10:38:19 DEBUG juju.api monitor.go:35 RPC connection died
10:38:19 INFO cmd supercommand.go:465 command finished

We've also tried to run the deployment with: "juju deploy bundle.yaml --map-machines=existing" but the behaviour was the same.

Example of the ceilometer part of the bundle:
machines:
  "0":
    constraints: tags=4-management
    series: xenial
  "1":
    constraints: tags=5-management
    series: xenial
  "2":
    constraints: tags=6-management
    series: xenial

...
...

  ceilometer:
    charm: ../charms/ceilometer
    num_units: 3
    bindings:
      "": *oam-space
      public: *public-space
      admin: *admin-space
      internal: *internal-space
    options:
      openstack-origin: *openstack-origin
      ...
    to:
    - lxd:0
    - lxd:1
    - lxd:2

juju version is 2.3.6

tags: added: cpe-onsite

I would guess that we count how many containers exist and then likely take
the last one from the list if the count didn't match.

I do wonder if we are just treating ids as logical, but aren't trying to
match up the machines after the fact.

We certainly could notice the count doesn't match and then try to see which
one is missing. We've never needed to do that for machines because if one
is missing that means we have to remap the ID to a newly created machine.

John
=:->

This is due to how the bundle changes are calculated.

The calculation isn't too smart. It sees that it needs three ceilometer
units and only has two, so it uses the third placement directive.

The code does not yet do deeper analysis to see which of the placements
have been satisfied.
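
To illustrate the difference described above, here is a minimal sketch in Python. It is not the juju code (juju is written in Go); the function names and the list-based model of units are hypothetical, and it assumes the surviving units sit on the machines that the bundle calls "1" and "2". The first function mimics the count-based choice, the second checks which placement directives are still unsatisfied.

```python
def host_of(placement):
    """Return the machine part of a directive, e.g. "lxd:2" -> "2"."""
    return placement.split(":", 1)[-1]


def next_placement_by_count(placements, existing_hosts):
    """Count-based choice: index the directives by how many units exist."""
    index = len(existing_hosts)
    return placements[index] if index < len(placements) else None


def next_placement_by_gap(placements, existing_hosts):
    """Gap-based choice: first directive with no unit on its machine."""
    remaining = list(existing_hosts)
    for placement in placements:
        host = host_of(placement)
        if host in remaining:
            remaining.remove(host)  # this directive is already satisfied
        else:
            return placement
    return None


# Hypothetical state after removing the unit placed with "lxd:0".
placements = ["lxd:0", "lxd:1", "lxd:2"]
existing_hosts = ["1", "2"]  # bundle machine ids of the surviving units

print(next_placement_by_count(placements, existing_hosts))  # lxd:2
print(next_placement_by_gap(placements, existing_hosts))    # lxd:0
```

With two units left, the count-based choice lands on the third directive even though that machine already hosts a unit, while the gap-based choice returns the directive whose machine is empty.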

Tim Penhey (thumper) on 2018-04-24
Changed in juju:
status: New → Triaged
importance: Undecided → Low
tags: added: bundles
David Douglas (ddouglas-austin) wrote :

This is a high priority issue for the customer, so we need to bump this up.

The workaround is to just not mix explicit placement and implicit
placement (update the bundle definition).

John
=:->

John A Meinel (jameinel) wrote :

Sorry, I was thinking of the other bug.
Can you work around this with --map-machines, so that juju thinks the
missing container machine is a specific machine in the model?

John
=:->

Ondrej Kuchar (ondrej-kuchar) wrote :

Hello, we already ran "juju deploy bundle.yaml --map-machines=existing", but it did not help.

Tim Penhey (thumper) on 2018-05-02
Changed in juju:
importance: Low → High
assignee: nobody → Tim Penhey (thumper)
status: Triaged → In Progress
Tim Penhey (thumper) wrote :

Fix proposed in https://github.com/juju/juju/pull/8693. There were many edge cases to be worked through.

Tim Penhey (thumper) on 2018-05-14
Changed in juju:
milestone: none → 2.3.8
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released