Do you have any ERRORs in the log files of the agents that went down? I've traced the issue to a race condition between the stats run and a volume disappearing, which can crash a thread:
https://review.openstack.org/#/q/I7d3d006b023ca4b7963c4c684e4c036399d1295c
I believe applying this change locally will stop the threads from crashing and the agents from going into the down state.
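
As a rough illustration of the failure mode (this is not the code from the review above; `VolumeGone`, `collect_stats`, and `fake_read` are hypothetical names made up for this sketch), a stats loop can survive a volume vanishing mid-run if it handles the error per volume instead of letting it propagate and kill the thread:

```python
class VolumeGone(Exception):
    """Hypothetical error raised when a volume vanishes mid-collection."""


def collect_stats(volumes, read_stats):
    """Collect stats per volume, skipping any that disappear during the run."""
    stats = {}
    skipped = []
    # Iterate over a copy so changes to the caller's collection cannot
    # break the loop itself.
    for name in list(volumes):
        try:
            stats[name] = read_stats(name)
        except VolumeGone:
            # The volume was removed between listing and reading.
            # Record it and carry on rather than crashing the thread.
            skipped.append(name)
    return stats, skipped


def fake_read(name):
    # Stand-in for a real stats call; "vol-b" simulates a deleted volume.
    if name == "vol-b":
        raise VolumeGone(name)
    return {"size_gb": 1}


stats, skipped = collect_stats(["vol-a", "vol-b", "vol-c"], fake_read)
print(sorted(stats), skipped)
```

Without the `try`/`except`, the exception for the missing volume would end the whole run, which is the kind of crash that can leave an agent reported as down.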