LXD containers stuck pending using MaaS

Bug #1677265 reported by Michael Skalka
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
Canonical Juju  Expired  Undecided   Unassigned

Bug Description

Juju 2.1.2 on Trusty hangs when creating LXD containers on top of MaaS 2.2.0. I've run into two variants of this in the past few days. In the first, Juju spins up the container but fails to recognize that it has started, and never re-checks whether it is available. In the second, Juju attempts to spin up the container, the LXD daemon on the machine never starts it, and Juju again never retries.

In the first case, ssh-ing into the machine and running 'lxc list' reports the containers as running. In the second the same 'lxc list' reports nothing.

jenkins@host:~$ juju status
Model    Controller  Cloud/Region  Version
default  maas        maas          2.1.2

App                Version  Status       Scale  Charm              Store       Rev  OS      Notes
dummy-sink-a                waiting      0/1    dummy-sink         jujucharms  0    ubuntu
dummy-sink-b                active       1      dummy-sink         jujucharms  0    ubuntu
dummy-sink-c                active       1      dummy-sink         jujucharms  0    ubuntu
dummy-source-a              active       1      dummy-source       jujucharms  0    ubuntu
dummy-source-b              waiting      0/1    dummy-source       jujucharms  0    ubuntu
dummy-source-c              maintenance  1      dummy-source       jujucharms  0    ubuntu
dummy-subordinate           active       2      dummy-subordinate  jujucharms  0    ubuntu

Unit                    Workload     Agent       Machine  Public address  Ports  Message
dummy-sink-a/0          waiting      allocating  0/lxd/0                         waiting for machine
dummy-sink-b/0*         active       idle        1        10.246.56.103          Token is true
  dummy-subordinate/0*  active       idle                 10.246.56.103          Token is true
dummy-sink-c/0*         active       idle        1/lxd/0  10.246.67.54           Token is true
  dummy-subordinate/1   active       idle                 10.246.67.54           Token is true
dummy-source-a/0*       active       idle        0        10.246.56.105          Token is true
dummy-source-b/0        waiting      allocating  0/lxd/1                         waiting for machine
dummy-source-c/0*       maintenance  idle        1/lxd/1  10.246.67.55           Started

Machine  State    DNS       Inst id              Series  AZ
0        started  10.x.x.x  4y3h87               xenial  default
0/lxd/0  pending            juju-c1c15a-0-lxd-0  xenial
0/lxd/1  pending            juju-c1c15a-0-lxd-1  xenial
1        started  10.x.x.x  4y3h83               xenial  default
1/lxd/0  started  10.x.x.x  juju-c1c15a-1-lxd-0  xenial
1/lxd/1  started  10.x.x.x  juju-c1c15a-1-lxd-1  xenial

Relation  Provides      Consumes           Type
sink      dummy-sink-a  dummy-source-b     regular
sink      dummy-sink-a  dummy-source-c     regular
sink      dummy-sink-b  dummy-source-a     regular
sink      dummy-sink-b  dummy-subordinate  subordinate
sink      dummy-sink-c  dummy-source-b     regular
sink      dummy-sink-c  dummy-subordinate  subordinate

jenkins@host:~$ juju ssh 0

ubuntu@machine-0:~$ sudo lxc list
+---------------------+---------+----------------------+------+------------+-----------+
| NAME                | STATE   | IPV4                 | IPV6 | TYPE       | SNAPSHOTS |
+---------------------+---------+----------------------+------+------------+-----------+
| juju-c1c15a-0-lxd-0 | RUNNING | 10.x.x.x (eth0)      |      | PERSISTENT | 0         |
+---------------------+---------+----------------------+------+------------+-----------+
| juju-c1c15a-0-lxd-1 | RUNNING | 10.x.x.x (eth0)      |      | PERSISTENT | 0         |
+---------------------+---------+----------------------+------+------------+-----------+
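
The first failure mode (Juju reports a container as pending while LXD reports it as RUNNING) can be spotted by cross-checking the two outputs above. What follows is only a minimal sketch and not part of the original report: the function names are hypothetical, and it assumes the tabular formats shown here, namely that container machine ids contain /lxd/ and that their instance ids start with juju-.

```shell
#!/bin/sh
# Hypothetical helpers; they only parse saved text, they do not call juju or lxc.
# Example of capturing the inputs (commands as used in this report):
#   juju status > status.txt
#   juju ssh 0 sudo lxc list > lxc.txt

# pending_containers: read tabular `juju status` output on stdin and print the
# instance id of every LXD container that Juju reports as pending.
pending_containers() {
    awk 'index($1, "/lxd/") && $2 == "pending" {
        for (i = 3; i <= NF; i++) if ($i ~ /^juju-/) print $i
    }'
}

# running_containers: read `lxc list` table output on stdin and print the
# names of containers whose STATE column is RUNNING.
running_containers() {
    awk -F'|' '$3 ~ /RUNNING/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# stuck_containers STATUS_FILE LXC_FILE: print containers that LXD reports as
# running but Juju still reports as pending.
stuck_containers() {
    running_containers < "$2" | while read -r name; do
        if pending_containers < "$1" | grep -Fqx "$name"; then
            echo "$name"
        fi
    done
}
```

Running `stuck_containers status.txt lxc.txt` against the outputs in this report would be expected to list both containers on machine 0. An empty result alongside pending machines would point at the second failure mode, where LXD never started the containers.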

Michael Skalka (mskalka)
description: updated
Anastasia (anastasia-macmood) wrote :

@Michael Skalka (mskalka),
What version of LXD are you using?

Also in the report description, you say "trusty" but I am seeing "xenial" in the pasted status...

Could you please attach logs, including cloud-init log and output as well as juju logs?

Changed in juju:
status: New → Incomplete
Michael Skalka (mskalka) wrote :

Apologies for the miscommunication. The system running Juju was trusty, it was spinning up xenial containers. Happy to post the logs, but could you explain how to pull the cloud-init log?

John A Meinel (jameinel) wrote : Re: [Bug 1677265] Re: LXD containers stuck pending using MaaS

For the LXD machines it will be /var/log/cloud-init-output.log. You can do
something like:
 juju ssh host-machine
 lxc exec LXD-ID bash
 cat /var/log/cloud-init-output.log

It might be possible to SSH directly into the LXD machines, but given the
other information about pending status, it sounds like that might not work.
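
The manual steps for pulling the cloud-init output can be scripted when several containers are affected. This is a rough sketch rather than an official procedure: the function name and output file naming are made up for illustration, it is meant to be run on the host machine after `juju ssh`, and it assumes `lxc` can be run by the current user (the report used `sudo lxc`, so a root shell may be needed).

```shell
#!/bin/sh
# collect_cloud_init_logs OUTDIR CONTAINER...
# Hypothetical helper: for each named container, copy
# /var/log/cloud-init-output.log into OUTDIR on the host machine.
collect_cloud_init_logs() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for c in "$@"; do
        # `lxc exec NAME -- CMD` runs CMD inside the container. This fails if
        # the container was never created or is not running.
        if ! lxc exec "$c" -- cat /var/log/cloud-init-output.log \
                > "$outdir/$c-cloud-init-output.log" 2> "$outdir/$c.err"; then
            echo "could not read cloud-init log from $c" >&2
        fi
    done
}

# Example, using the container names from this report:
#   collect_cloud_init_logs /tmp/bug-logs juju-c1c15a-0-lxd-0 juju-c1c15a-0-lxd-1
```

For a container that LXD never started, the read fails and the error text lands in the matching .err file, which is itself useful to attach to the bug.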

Other things to look for:
  juju status --format=yaml

Are you using multiple network spaces in this deployment? Are you deploying
things into containers on spaces that do not have routes to the spaces
where things like the Juju Controller live? (Some of that might be clear
from: juju status --format=yaml)

On Thu, Mar 30, 2017 at 5:39 PM, Michael Skalka <email address hidden>
wrote:

> Apologies for the miscommunication. The system running Juju was trusty,
> it was spinning up xenial containers. Happy to post the logs, but could
> you explain how to pull the cloud-init log?

Michael Skalka (mskalka) wrote :

So the machines I initially ran this on got pulled out of OIL's lab into CDO-QA's, so the logs are lost :/ I'll try to recreate the issue on a set of similar machines when possible.

Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired