Failure to set root password leaves instance in ERROR
Bug #1061045 reported by Johannes Erdfelt
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | High | Amir Sadoughi |
Bug Description
If the agent isn't running on an instance, then setting the root password will time out.
The API server returns a 500 error because of an RPC timeout. It should return something other than 500.
Eventually the compute server will time out as well and leave the instance in ERROR. The instance is still running fine, so ERROR seems like an incorrect state to leave it in.
Changed in nova: | |
status: | New → Confirmed |
importance: | Undecided → High |
tags: | added: xenserver |
tags: | added: folsom-backport-potential |
Changed in nova: | |
milestone: | none → grizzly-3 |
status: | Fix Committed → Fix Released |
Changed in nova: | |
milestone: | grizzly-3 → 2013.1 |
I think both problems are complicated by the retries that happen in the compute layer. The xenapi driver makes 10 retries with a 30-second timeout each, so a single request could take 300 seconds in total. This is longer than the RPC timeout.
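To make the arithmetic concrete, here is a minimal sketch of the worst-case wait. The retry count and per-call timeout come from the comment above; the 60-second RPC timeout is an assumption for illustration only, since the bug report does not state the actual value.

```python
def worst_case_wait(retries: int, per_call_timeout: float) -> float:
    """Upper bound on the wait if every attempt runs to its full timeout."""
    return retries * per_call_timeout


AGENT_RETRIES = 10        # from the comment above
AGENT_CALL_TIMEOUT = 30   # seconds, from the comment above
ASSUMED_RPC_TIMEOUT = 60  # seconds; assumed value, not stated in this report

total = worst_case_wait(AGENT_RETRIES, AGENT_CALL_TIMEOUT)
exceeds_rpc_timeout = total > ASSUMED_RPC_TIMEOUT

print(f"worst case: {total}s, exceeds RPC timeout: {exceeds_rpc_timeout}")
```

Under that assumption the API side gives up long before the compute side has finished retrying, which matches the two separate failures described in the bug.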
The retry logic seems unnecessary and appears to be a result of legacy code.
If the overall timeout were something reasonable, then an error could be returned synchronously to the client, instead of moving the instance to ERROR just so that an asynchronous error can be made available.
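One way to picture this is the sketch below: a single overall deadline around the agent call, with a timeout reported back to the caller while the instance state is left untouched. All names here (`AgentTimeout`, `call_agent_with_deadline`, `set_admin_password`, the `instance` dict) are hypothetical and are not nova's actual code or the fix that was released.

```python
import time


class AgentTimeout(Exception):
    """Hypothetical error: the guest agent did not answer before the deadline."""


def call_agent_with_deadline(call, deadline_seconds, poll_interval=0.001):
    """Poll `call` until it returns a non-None result or the deadline passes."""
    start = time.monotonic()
    while True:
        result = call()
        if result is not None:
            return result
        if time.monotonic() - start >= deadline_seconds:
            raise AgentTimeout(f"no agent response within {deadline_seconds}s")
        time.sleep(poll_interval)


def set_admin_password(instance, call, deadline_seconds):
    """Report a timeout to the caller without changing the instance state."""
    try:
        call_agent_with_deadline(call, deadline_seconds)
    except AgentTimeout as exc:
        # The instance is still running, so vm_state is deliberately not touched.
        return {"ok": False, "error": str(exc), "vm_state": instance["vm_state"]}
    return {"ok": True, "error": None, "vm_state": instance["vm_state"]}


instance = {"vm_state": "active"}
failed = set_admin_password(instance, lambda: None, deadline_seconds=0.01)
succeeded = set_admin_password(instance, lambda: "done", deadline_seconds=0.01)
```

Because the deadline is kept short enough to fit inside the RPC timeout, the failure can travel back on the same request, and the API layer is free to map it to a more specific status than 500.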