Activity log for bug #1974070

Date Who What changed Old value New value Message
2022-05-18 17:28:16 John Garbutt bug added bug
2022-05-18 17:28:28 John Garbutt nova: status New In Progress
2022-05-18 17:28:30 John Garbutt nova: assignee John Garbutt (johngarbutt)
2022-05-18 17:28:35 John Garbutt nova: importance Undecided Low
2022-05-19 10:17:52 John Garbutt tags ironic
2022-05-19 10:18:27 John Garbutt description
Old value: In a happy world, placement reserved gets updated when a node is not available any more, so the scheduler doesn't pick that one, everyone it happy. However, as is fairly well known, it takes a while for Nova to notice if a node has been marked as in maintenance or if it has started cleaning due to the instance now having been deleted, and you can still reach a node in a bad state. This actually fails hard when setting the instance uuid, as expected here: https://github.com/openstack/nova/blob/4939318649650b60dd07d161b80909e70d0e093e/nova/virt/ironic/driver.py#L378 You get a conflict error, as the ironic node is in a transitioning state (i.e. it's not actually available any more). When people are busy rebuilding large numbers of nodes, they tend to hit this problem: even when only building when you know there are available nodes, you sometimes pick the ones you just deleted. In an ideal world this would trigger a re-schedule, a bit like when you hit errors in the resource tracker such as ComputeResourcesUnavailable
New value: In a happy world, placement reserved gets updated when a node is not available any more, so the scheduler doesn't pick that one, everyone is happy. However, as is fairly well known, it takes a while for Nova to notice if a node has been marked as in maintenance or if it has started cleaning due to the instance now having been deleted, and you can still reach a node in a bad state. This actually fails hard when setting the instance uuid, as expected here: https://github.com/openstack/nova/blob/4939318649650b60dd07d161b80909e70d0e093e/nova/virt/ironic/driver.py#L378 You get a conflict error, as the ironic node is in a transitioning state (i.e. it's not actually available any more). When people are busy rebuilding large numbers of nodes, they tend to hit this problem: even when only building when you know there are available nodes, you sometimes pick the ones you just deleted.
In an ideal world this would trigger a re-schedule, a bit like when you hit errors in the resource tracker such as ComputeResourcesUnavailable
2022-12-14 14:11:44 OpenStack Infra nova: status In Progress Fix Released
2022-12-17 17:01:33 OpenStack Infra tags ironic in-stable-zed ironic
2023-05-24 14:59:04 OpenStack Infra tags in-stable-zed ironic in-stable-yoga in-stable-zed ironic
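
The behaviour the description asks for can be sketched roughly as follows. This is only an illustration under stated assumptions, not the change that was merged into nova: FakeIronicClient, NodeConflict, claim_node and build_with_reschedule are hypothetical names, and ComputeResourcesUnavailable here is a local stand-in class named after the nova exception mentioned in the description. The idea is that a conflict while setting the instance uuid is converted into the "resources unavailable" error, so the caller can move on to another candidate node instead of failing hard.

```python
class NodeConflict(Exception):
    """Hypothetical stand-in for the conflict error returned by Ironic."""


class ComputeResourcesUnavailable(Exception):
    """Local stand-in named after the nova exception in the description."""


class FakeIronicClient:
    """Hypothetical in-memory client; not the real Ironic API."""

    def __init__(self, transitioning_nodes):
        # Nodes that are cleaning or in maintenance, i.e. not really available.
        self.transitioning = set(transitioning_nodes)
        self.instance_by_node = {}

    def set_instance_uuid(self, node_uuid, instance_uuid):
        if node_uuid in self.transitioning:
            raise NodeConflict(f"node {node_uuid} is in a transitioning state")
        self.instance_by_node[node_uuid] = instance_uuid


def claim_node(client, node_uuid, instance_uuid):
    """Set the instance uuid, mapping a conflict to a reschedulable error."""
    try:
        client.set_instance_uuid(node_uuid, instance_uuid)
    except NodeConflict as exc:
        raise ComputeResourcesUnavailable(str(exc)) from exc


def build_with_reschedule(client, candidate_nodes, instance_uuid):
    """Try each candidate in order, skipping nodes that cannot be claimed."""
    for node_uuid in candidate_nodes:
        try:
            claim_node(client, node_uuid, instance_uuid)
            return node_uuid
        except ComputeResourcesUnavailable:
            continue  # behaves like a re-schedule onto the next candidate
    raise ComputeResourcesUnavailable("no candidate node could be claimed")


if __name__ == "__main__":
    client = FakeIronicClient(transitioning_nodes={"node-a"})
    print(build_with_reschedule(client, ["node-a", "node-b"], "inst-1"))
```

In this sketch the just-deleted node ("node-a") is skipped and the build lands on the next candidate, which is the outcome the reporter describes as the ideal one.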