Volume goes into 'error' state after scheduler starts
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Cinder | Fix Released | High | Michal Dulko | 2015.1.0
Bug Description
Steps to reproduce:
1. Deploy a working OpenStack (e.g. with DevStack)
2. Stop cinder-scheduler
3. Try to create a volume
3.1. Ensure that the volume is in the 'creating' state
4. Start cinder-scheduler
Actual results:
The volume ends up in the 'error' state. cinder-scheduler reports:
2015-01-09 15:30:48.301 WARNING cinder.
2015-01-09 15:30:48.305 ERROR cinder.
2015-01-09 15:30:48.306 DEBUG cinder.
Expected results:
The volume is created successfully and ends up in the 'available' state.
This is a synthetic test case, but it corresponds to a real failover scenario in a high-availability (HA) production deployment.
Changed in cinder:
assignee: nobody → Ivan Kolodyazhny (e0ne)

Changed in cinder:
status: Confirmed → In Progress

Changed in cinder:
assignee: Huang Zhiteng (zhiteng-huang) → Michal Dulko (michal-dulko-f)

Changed in cinder:
milestone: none → kilo-3
status: Fix Committed → Fix Released

Changed in cinder:
milestone: kilo-3 → 2015.1.0
This is caused by the fact that a newly started scheduler needs to receive stats-update broadcasts from the volume backends before it knows which backends exist.
It could possibly be fixed by modifying the scheduler not to consume incoming create (etc.) requests until one stats broadcast period (30 seconds by default) has passed.
Note that the same problem occurs if you spin up a second (or more) scheduler, e.g. for load balancing or active/active HA: the new scheduler will fail every request it receives until it hears from the backends.
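The warm-up idea above can be sketched in a few lines. This is a hypothetical illustration, not actual Cinder code: the class and method names (`WarmupScheduler`, `RetryLater`, `update_service_capabilities`, `schedule_create_volume`) are made up for the sketch, and it assumes the scheduler can signal "retry later" instead of erroring the volume when it has not yet heard from any backend and a full broadcast period has not elapsed since startup.

```python
import time


class RetryLater(Exception):
    """Hypothetical signal: the request should be re-queued, not failed."""


class WarmupScheduler:
    """Minimal sketch of the proposed fix: refuse to act on create
    requests until either a backend has reported stats or one full
    broadcast period (30 s by default) has passed since startup."""

    def __init__(self, broadcast_period=30.0, clock=time.monotonic):
        self._clock = clock
        self._period = broadcast_period
        self._started_at = clock()
        self._backends = {}  # host -> last reported stats

    def update_service_capabilities(self, host, stats):
        # Called when a periodic stats broadcast arrives from a backend.
        self._backends[host] = stats

    def _warmed_up(self):
        return self._clock() - self._started_at >= self._period

    def schedule_create_volume(self, request):
        if not self._backends and not self._warmed_up():
            # Still warming up: defer instead of putting the volume
            # into the 'error' state.
            raise RetryLater("scheduler still warming up")
        if not self._backends:
            # Warmed up but genuinely no backends: fail as today.
            raise RuntimeError("No valid host found")
        # Trivial placeholder host choice: pick any known backend.
        return next(iter(self._backends))
```

With a scheme like this, a request arriving right after scheduler startup is deferred rather than failed, which also covers the active/active case where a freshly added scheduler has an empty view of the backends.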