Instance evacuation is stuck and completes after a long time
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Compute (nova) | New | Undecided | Unassigned | | |
Bug Description
Stack:
3-node cluster on CentOS 8 Stream, Libvirt+KVM
OpenStack Xena
Nova package version: 24.1.1-1
SAN-based shared storage connected through iSCSI
MariaDB Galera Cluster configured in active-backup mode in HAProxy
We are observing an issue with instance evacuation when the host running the active database node is powered off.
The evacuation task is triggered, but it takes more than 15 minutes to complete. During that time, placement and other OpenStack services report 504 Gateway Timeout errors when talking to Keystone. Most of the time is spent in the nova scheduler, as shown by the following log line:
Request filter 'map_az_
Sometimes the rebuild task stays stuck indefinitely until we restart the nova services and force the VM into an error state.
Please let me know if any other configuration details or logs are needed to understand this issue.