Need some kind of 'auto' boolean column in the Service table
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Opinion | Wishlist | jichenjc |
Bug Description
Bug 1250049 reported a problem with automatically disabling/enabling a host via the libvirt driver. Rather than fixing it the right way, i.e. adding a new column to the Service table that indicates whether an admin intentionally disabled the host or nova detected a failure and disabled it automatically, a hack was done instead: the 'disabled_reason' is prefixed with "AUTO:" and some logic is built in the driver around that.
The problem with that approach is that the ComputeFilter in the scheduler can't perform any kind of retry logic around it if needed, i.e. bug 1257644.
Right now, if the ComputeFilter encounters a disabled host, it just logs that at debug level and skips the host. If the host was automatically disabled because of a connection failure, we should at least log that as a warning in the scheduler (like we do now for hosts that haven't checked in for a while), or possibly build some retry logic around it to make scheduling more robust in case the connection failure is just a hiccup that quickly resolves itself.
One could maybe argue that some kind of connection retry logic could be built into the libvirt driver instead; I wouldn't be against that.
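To make the proposal concrete, here is a rough sketch of the difference between inferring the cause from the "AUTO:" prefix and reading an explicit flag. It is not nova code: the `Service` dataclass, the `auto_disabled` field, and both function names are hypothetical stand-ins, and the real model and ComputeFilter would look different.

```python
import logging
from dataclasses import dataclass
from typing import Optional

LOG = logging.getLogger(__name__)

AUTO_PREFIX = "AUTO:"  # prefix used by the workaround described above


@dataclass
class Service:
    """Simplified stand-in for a Service table row (not nova's model)."""
    host: str
    disabled: bool = False
    disabled_reason: Optional[str] = None
    auto_disabled: bool = False  # hypothetical column proposed in this bug


def is_auto_disabled_via_prefix(service: Service) -> bool:
    """Workaround: guess the cause by parsing the free-text reason."""
    reason = service.disabled_reason or ""
    return service.disabled and reason.startswith(AUTO_PREFIX)


def host_passes(service: Service) -> bool:
    """Filter sketch: skip disabled hosts, but warn on automatic disables."""
    if not service.disabled:
        return True
    if service.auto_disabled:
        # Automatic disable (e.g. connection failure): worth a warning,
        # and a natural place to hook in retry logic later.
        LOG.warning("Host %s was disabled automatically: %s",
                    service.host, service.disabled_reason)
    else:
        # Admin intentionally disabled the host: debug is enough.
        LOG.debug("Host %s was disabled by an admin: %s",
                  service.host, service.disabled_reason)
    return False
```

With an explicit flag the scheduler side no longer depends on the wording of 'disabled_reason', so an admin could write any reason text without affecting the logic.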
Changed in nova:
importance: Undecided → Wishlist
status: New → Triaged
Changed in nova:
assignee: nobody → jichencom (jichenjc)
Changed in nova:
status: In Progress → Opinion
Fix proposed to branch: master
Review: https://review.openstack.org/80885