Juju doesn't retry or timeout container failures
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Triaged | Low | Unassigned |
Bug Description
OS: Ubuntu 16.04.6
LXD: 3.12, installed via snap
Juju: 2.5.7-xenial-amd64, installed via snap
Cloud: lxd
Summary: If an lxd cloud fails to provide IP addresses for new containers, Juju remains stuck in an 'allocating' state when new charms are deployed.
I was encountering random dnsmasq failures. I was able to bootstrap a Juju controller to lxd. dnsmasq later failed, so when I tried to deploy a charm the unit was stuck in an 'allocating' state.
When I realised the problem, I restarted lxd, which restarted dnsmasq. The containers returned, this time with IP addresses, but the units remained in 'allocating' for quite some time. In the end I had to remove the applications or destroy the model.
Steps to reproduce.
1. Install LXD and Juju from snap
2. Configure LXD
3. Bootstrap a Juju controller to LXD
4. Kill the dnsmasq process
5. Deploy a charm
6. Wait for charm to enter 'allocating' state
7. `snap restart lxd`
What I expect to happen: Either the charm enters an error state because it can't be allocated, or there is a retry that continues once the network is available.
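To make the expected behaviour concrete, here is a minimal Python sketch of "retry until a deadline, then fail with an error". It is not Juju's code and does not use any Juju or LXD API. The names `wait_for_address`, `AllocationError`, and the `get_address` callback are hypothetical, and the timeout and interval defaults are arbitrary assumptions.

```python
import time
from typing import Callable, Optional


class AllocationError(Exception):
    """Hypothetical error: no address appeared before the deadline."""


def wait_for_address(
    get_address: Callable[[], Optional[str]],
    timeout: float = 300.0,
    interval: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_address until it returns an address or the timeout expires.

    get_address is a stand-in for whatever queries the container's IP;
    it should return None (or an empty string) while no address is assigned.
    """
    deadline = clock() + timeout
    while True:
        address = get_address()
        if address:
            # The network recovered (for example, dnsmasq was restarted).
            return address
        if clock() >= deadline:
            # Surface an error instead of staying in 'allocating' forever.
            raise AllocationError(f"no address after {timeout} seconds")
        sleep(interval)
```

With something like this, a unit would either pick up its address once the network comes back, or move to an error state after a bounded wait, which are the two outcomes described above.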
Changed in juju:
status: New → Triaged
importance: Undecided → Medium
This bug has not been updated in 2 years, so we're marking it Low importance. If you believe this is incorrect, please update the importance.