lxd containers fail to find agent binaries for arch. mixed architecture controller & machines
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Triaged | Low | Unassigned |
Bug Description
Juju Version: 2.3.3-xenial-amd64
Problem: We are testing ppc64el bare metal hosts in our lab. We do not have enough available hosts to bootstrap the controller on ppc64el, so we are using an amd64 machine for the controller.
I am deploying via a bundle in which I added constraints to the machines, so that Juju would use our ppc64el hosts in MAAS.
machines:
  '0':
    series: xenial
    constraints: arch=ppc64el
  '1':
    series: xenial
    constraints: arch=ppc64el
  '2':
    series: xenial
    constraints: arch=ppc64el
  '3':
    series: xenial
    constraints: arch=ppc64el
Once deployed, the bare metal ppc64el hosts booted up, but the LXD containers failed to start. Apparently they could not find the agent binaries:
Machine State DNS Inst id Series AZ Message
0 started 10.245.168.40 gxy47x xenial default Deployed
0/lxd/0 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
0/lxd/1 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
0/lxd/2 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
1 started 10.245.168.45 xpwspa xenial default Deployed
1/lxd/0 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
1/lxd/1 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
1/lxd/2 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
1/lxd/3 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
1/lxd/4 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
2 started 10.245.168.56 877gbf xenial default Deployed
2/lxd/0 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
2/lxd/1 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
2/lxd/2 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
3 started 10.245.168.47 fa6tef xenial default Deployed
3/lxd/0 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
3/lxd/1 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
3/lxd/2 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
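For anyone triaging a larger deployment, the affected containers can be picked out of the status text mechanically. The following is only a minimal sketch: it assumes the message wording shown above ("need agent binaries for arch ..., only found [...]") and the `N/lxd/M` container naming; the function name `stuck_containers` is made up for illustration and is not part of Juju.

```python
import re

# Matches container rows such as:
#   0/lxd/0 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
# The wording is copied from the status output in this report (an assumption
# about other Juju versions).
ROW = re.compile(
    r"^(?P<container>\d+/lxd/\d+)\s.*"
    r"need agent binaries for arch (?P<wanted>[^,\s]+), "
    r"only found \[(?P<found>[^\]]*)\]"
)


def stuck_containers(status_text):
    """Return (container_id, wanted_arch, found_arches) for each affected row."""
    hits = []
    for line in status_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            found = [a.strip() for a in match.group("found").split(",") if a.strip()]
            hits.append((match.group("container"), match.group("wanted"), found))
    return hits


sample = """\
Machine State DNS Inst id Series AZ Message
0 started 10.245.168.40 gxy47x xenial default Deployed
0/lxd/0 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
0/lxd/1 down pending xenial need agent binaries for arch ppc64el, only found [amd64]
"""

for container, wanted, found in stuck_containers(sample):
    print(f"{container}: needs {wanted}, only found {found}")
```

Host rows such as machine `0` are skipped because they do not match the `N/lxd/M` pattern.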
After some triage, we have discovered that if the constraints are specified explicitly, then the container does start.
juju add-machine lxd:0 --constraints arch=ppc64el
0/lxd/4 started 10.245.168.25 juju-c89fe4-0-lxd-4 xenial default Container started
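If the same manual workaround is needed on every host, the command lines can be generated rather than typed. This is a rough sketch that only builds the strings and does not run anything; the command form is the one shown above, while the helper name `add_machine_commands` and the host list are assumptions for illustration.

```python
import shlex


def add_machine_commands(hosts, arch="ppc64el"):
    """Build one `juju add-machine` command line per host machine."""
    return [
        shlex.join(
            ["juju", "add-machine", f"lxd:{host}", "--constraints", f"arch={arch}"]
        )
        for host in hosts
    ]


# Machines 0-3 are the four ppc64el hosts from the bundle in this report.
for cmd in add_machine_commands(["0", "1", "2", "3"]):
    print(cmd)
```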
According to the juju documentation, specifying the constraints in the charm will work:
https:/
The bundle was modified to reflect these directions, for example:
openstack-
  annotations:
    gui-x: '500'
    gui-y: '-250'
  charm: cs:~openstack-
  constraints: "arch=ppc64el"
  num_units: 1
  options:
    openstack
  to:
  - lxd:3
This appeared to fix the issue once redeployed, as the status below shows.
Machine State DNS Inst id Series AZ Message
0 started 10.245.168.40 gxy47x xenial default Deployed
0/lxd/7 started 10.245.168.41 juju-c89fe4-0-lxd-7 xenial default Container started
0/lxd/8 started 10.245.168.44 juju-c89fe4-0-lxd-8 xenial default Container started
0/lxd/9 started 10.245.168.52 juju-c89fe4-0-lxd-9 xenial default Container started
1 started 10.245.168.45 xpwspa xenial default Deployed
1/lxd/5 started 10.245.168.46 juju-c89fe4-1-lxd-5 xenial default Container started
1/lxd/6 started 10.245.168.43 juju-c89fe4-1-lxd-6 xenial default Container started
1/lxd/7 started 10.245.168.51 juju-c89fe4-1-lxd-7 xenial default Container started
2 started 10.245.168.56 877gbf xenial default Deployed
2/lxd/3 started 10.245.168.39 juju-c89fe4-2-lxd-3 xenial default Container started
2/lxd/4 started 10.245.168.48 juju-c89fe4-2-lxd-4 xenial default Container started
2/lxd/5 pending juju-c89fe4-2-lxd-5 xenial default Container started
3 started 10.245.168.47 fa6tef xenial default Deployed
3/lxd/3 started 10.245.168.37 juju-c89fe4-3-lxd-3 xenial default Container started
3/lxd/4 started 10.245.168.49 juju-c89fe4-3-lxd-4 xenial default Container started
3/lxd/5 pending juju-c89fe4-3-lxd-5 xenial default Container started
Even though the final solution was a success, the LXD containers should have automatically inherited this constraint, because the machines in the bundle were explicitly told to use arch=ppc64el.
You should not have to add constraints to every charm in the bundle.
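To make the expected behaviour concrete, here is a minimal sketch of the inheritance I would expect, which can also serve as a way to patch a bundle until it happens automatically. It is not how Juju implements constraints. It assumes the bundle has been loaded into a plain dict, that applications live under a top-level `applications` key, and that constraints are space-separated strings; `inherit_arch`, `example-app` and `cs:example` are hypothetical names.

```python
import copy


def inherit_arch(bundle):
    """Return a copy of `bundle` in which every application placed in an LXD
    container on a known machine inherits that machine's arch constraint,
    unless the application already sets an arch itself."""
    result = copy.deepcopy(bundle)
    machines = result.get("machines", {})
    for app in result.get("applications", {}).values():
        if "arch=" in app.get("constraints", ""):
            continue  # an explicit arch on the application wins
        for placement in app.get("to", []):
            kind, _, host = str(placement).partition(":")
            if kind != "lxd":
                continue  # only container placements such as "lxd:3"
            host_constraints = machines.get(host, {}).get("constraints", "")
            arch = next(
                (c for c in host_constraints.split() if c.startswith("arch=")),
                None,
            )
            if arch:
                existing = app.get("constraints", "")
                app["constraints"] = " ".join(filter(None, [existing, arch]))
                break
    return result


bundle = {
    "machines": {"3": {"series": "xenial", "constraints": "arch=ppc64el"}},
    "applications": {
        # Hypothetical application; the real names in my bundle differ.
        "example-app": {"charm": "cs:example", "num_units": 1, "to": ["lxd:3"]},
    },
}

patched = inherit_arch(bundle)
print(patched["applications"]["example-app"]["constraints"])
```

The input bundle is left unmodified, so the patched copy can be compared against the original before redeploying.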
I have the feeling we might be displaying an error that is confusing the
issue. If I were to guess, I would say we're actually trying to create amd64
instances on a ppc machine. So it isn't that "we can't find tools" (a red
herring) but that we can't launch the instance.
I could be completely wrong, but the fact that specifying the arch you
want makes us "find the tools" makes me think the issue is something else.
On Wed, Mar 7, 2018 at 10:12 AM, Sean Feole <email address hidden>
wrote:
> Public bug reported:
> [...]