8+ containers makes one get stuck in "pending" on joyent
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Canonical Juju | Fix Released | High | James Tunnicliffe | | |
Bug Description
When we add eight containers to a Joyent machine, one gets stuck in pending. Eventually, the test script raises AgentsNotStarted.
We are seeing this in our long-running industrial/
e.g. http://
It happens almost every time, but not every time. It is usually the last container (e.g. 3/lxd/7), but not always. Sometimes it's the seventh or even the first.
It does not happen on AWS, even though AWS machines are no better than Joyent machines (and in some regards worse) in terms of CPU, memory, and storage.
I reproduced this using our juju-ci-tools industrial_test script.
./industrial_
An example run is attached.
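For anyone trying to spot the stuck container without reading the full status output, here is a minimal sketch that lists containers whose agent has not reached "started". It assumes the status has been saved as JSON and that each machine carries a "containers" map whose entries have a "juju-status" object with a "current" field. That layout is an assumption modelled on `juju status --format=json` and may differ between Juju versions. The function name `pending_containers` and the sample data are hypothetical and are not taken from juju-ci-tools.

```python
import json


def pending_containers(status):
    """Return sorted (container_id, state) pairs for containers not yet started.

    `status` is a dict in the assumed layout:
    machines -> <id> -> containers -> <container id> -> juju-status -> current
    """
    stuck = []
    for machine in status.get("machines", {}).values():
        for name, container in machine.get("containers", {}).items():
            # Treat a missing state as "unknown" rather than assuming it started.
            state = container.get("juju-status", {}).get("current", "unknown")
            if state != "started":
                stuck.append((name, state))
    return sorted(stuck)


# Hand-written sample in the assumed layout; not output from a real run.
SAMPLE = json.loads("""
{
  "machines": {
    "3": {
      "juju-status": {"current": "started"},
      "containers": {
        "3/lxd/0": {"juju-status": {"current": "started"}},
        "3/lxd/6": {"juju-status": {"current": "started"}},
        "3/lxd/7": {"juju-status": {"current": "pending"}}
      }
    }
  }
}
""")

print(pending_containers(SAMPLE))  # [('3/lxd/7', 'pending')]
```

Against a live model, the same function could be fed the parsed output of `juju status --format=json`, provided the field names match the version in use.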
Changed in juju: | |
milestone: | 2.0-rc2 → 2.0.0 |
assignee: | nobody → Richard Harding (rharding) |
Changed in juju: | |
milestone: | 2.0.0 → 2.1.0 |
Changed in juju: | |
status: | Triaged → In Progress |
Changed in juju: | |
status: | In Progress → Fix Committed |
Changed in juju: | |
milestone: | 2.1.0 → 2.1-beta1 |
status: | Fix Committed → Fix Released |
This job has gone, but the problem remains. I just ran a bunch of add-machine commands and eventually machine 0 (the host) stopped responding. I can ping it but cannot SSH to it, and the agent state shows as down.
On MAAS I added 50 LXD containers and got bored of waiting for something bad to happen.
Nothing was crying out to me from the logs, but that isn't much of a surprise at this stage in the investigation.