juju-core

Bug #1190715

Comment 1 for bug 1190715

William Reade (fwereade) wrote on 2013-06-15: Re: Unit in error, yet juju resolved claims it's fixed

This is partly a communication issue -- juju resolved is trying to say something like "I didn't do anything, because the flag I'd be setting is already set". The underlying problem is that the unit agent, because it's not running, can't respond to that flag and advance the lifecycle.
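To make that interaction concrete, here is a rough toy model in Python. It is not juju's code: the Unit class, its field names, the return strings, and the mark_resolved and agent_step helpers are all made up for illustration. The only thing it is meant to show is that setting the flag and acting on the flag are separate steps, and only the unit agent performs the second one.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """Hypothetical stand-in for a unit's stored state (not juju's schema)."""
    name: str
    in_error: bool = True
    resolved_flag: bool = False  # assumed flag written by the "resolved" command
    agent_running: bool = True


def mark_resolved(unit: Unit) -> str:
    """Client side: set the flag, or report that there was nothing to do."""
    if unit.resolved_flag:
        # The flag is already set, so the command changes nothing.
        return "already set"
    unit.resolved_flag = True
    return "flag set"


def agent_step(unit: Unit) -> bool:
    """Agent side: consume the flag and leave the error state.

    Returns True if the unit advanced. A stopped agent never gets here,
    so the flag just sits in state and the unit stays in error.
    """
    if not unit.agent_running:
        return False
    if unit.in_error and unit.resolved_flag:
        unit.in_error = False
        unit.resolved_flag = False
        return True
    return False


# A unit whose instance is gone: the agent will never run again.
dead = Unit("example/0", agent_running=False)
first = mark_resolved(dead)    # sets the flag
advanced = agent_step(dead)    # no agent, so nothing happens
second = mark_resolved(dead)   # flag already set, so this is a no-op
```

In this model the second call is a no-op while the unit is still in error, which matches the confusing behaviour described in the report: the command is accurate about the flag, but says nothing about whether anything is around to act on it.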

So, that's definitely a problem, and we need --force flags on destroy-machine and destroy-unit (lp:1089291 and lp:1089289) that will cause some other part of the system to take over the appropriate responsibilities and tidy up the entities correctly.

Longer-term, this issue emphasizes the value of a storage management system that could let us migrate unit and machine state onto fresh hardware; but that's not on the cards in the immediate future.

It is correct that, once the instance is unrecoverable (what happened to it, btw?), the only way to remove that machine and unit (and the unit's service, and any relations the unit had joined...) is to destroy the whole environment. But in practice the *environment* itself should not be in trouble -- unless you lose the bootstrap instance, of course -- and you should be able to continue to interact with other entities without difficulty. I presume the biggest problem is being unable to reuse service names, but I may be misunderstanding your use case... or unaware of additional problems triggered by this situation?