Activity log for bug #1864665

Date Who What changed Old value New value Message
2020-02-25 15:51:48 Balazs Gibizer bug added bug
2020-02-25 15:52:00 Balazs Gibizer tags stable-only
2020-02-25 15:52:10 Balazs Gibizer nova: assignee Balazs Gibizer (balazs-gibizer)
2020-02-25 15:52:16 Balazs Gibizer nova: status New Triaged
2020-02-25 15:52:22 Balazs Gibizer nova: importance Undecided Medium
2020-02-25 15:54:16 Balazs Gibizer nominated for series nova/pike
2020-02-25 15:54:16 Balazs Gibizer bug task added nova/pike
2020-02-25 15:54:16 Balazs Gibizer nominated for series nova/rocky
2020-02-25 15:54:16 Balazs Gibizer bug task added nova/rocky
2020-02-25 15:54:16 Balazs Gibizer nominated for series nova/queens
2020-02-25 15:54:16 Balazs Gibizer bug task added nova/queens
2020-02-25 15:54:28 Balazs Gibizer nominated for series nova/ocata
2020-02-25 15:54:28 Balazs Gibizer bug task added nova/ocata
2020-02-25 15:54:41 Balazs Gibizer nova: status Triaged Invalid
2020-02-25 15:54:49 Balazs Gibizer nova/pike: status New Triaged
2020-02-25 15:54:53 Balazs Gibizer nova/pike: importance Undecided Medium
2020-02-25 15:55:02 Balazs Gibizer nova/pike: assignee Balazs Gibizer (balazs-gibizer)
2020-02-25 15:55:24 Balazs Gibizer description (old and new values differ only in one reproduction step: "boot a server with the flavor and ensure that the server. Check which compute ..." was changed to "boot a server with the flavor. Check which compute ..."; the new value follows)

Description
===========
Server cold migration fails after a re-schedule.

Steps to reproduce
==================
* create a devstack with two compute hosts using the libvirt driver
* set allow_resize_to_same_host=True on both computes
* set up cells v2 without cell conductor and rabbit separation so the re-schedule logic can call back to the super conductor / scheduler
* enable NUMATopologyFilter and make sure both computes have NUMA resources
* create a flavor with the hw:cpu_policy='dedicated' extra spec
* boot a server with the flavor; check which compute the server is placed on (let's call it host1)
* boot enough servers on host2 so that the next scheduling request can still be fulfilled by both computes but host1 is preferred by the weighers
* cold migrate the pinned server

Expected result
===============
* the scheduler selects host1 first, but that host fails with an UnableToMigrateToSelf exception as libvirt does not have the capability
* a re-schedule happens
* the scheduler selects host2, where the server spawns successfully

Actual result
=============
* during the re-schedule, when the conductor sends the prep_resize RPC to host2, the JSON serialization of the request spec fails with a circular reference error

Environment
===========
* two-node devstack with the libvirt driver
* stable/pike nova; expected to be reproducible on newer branches, but not since Stein (see the Triage section)

Triage
======
The JSON serialization blows up in the migrate conductor task. [1] After debugging, the infinite loop turns out to happen when jsonutils.to_primitive tries to serialize a VirtCPUTopology instance.

The problematic piece of code has been removed by I4244f7dd8fe74565180f73684678027067b4506e in Stein.

[1] https://github.com/openstack/nova/blob/4224a61b4f3a8b910dcaa498f9663479d61a6060/nova/conductor/tasks/migrate.py#L87
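For reference, the class of failure named in the triage above can be reproduced with plain Python. This is a minimal standalone sketch, not nova or oslo.serialization code, and the payload below is a hypothetical stand-in for the request spec: the standard json encoder refuses to serialize an object graph that refers back to itself::

    import json

    # Hypothetical payload standing in for the request spec: a mapping that
    # (directly or indirectly) contains a reference back to itself.
    payload = {"name": "request_spec"}
    payload["self"] = payload  # introduce the cycle

    try:
        json.dumps(payload)
    except ValueError as exc:
        # CPython reports: "Circular reference detected"
        print(exc)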
2020-02-25 17:28:42 OpenStack Infra nova/rocky: status New In Progress
2020-02-25 17:28:42 OpenStack Infra nova/rocky: assignee Balazs Gibizer (balazs-gibizer)
2020-02-25 17:29:00 Balazs Gibizer nova/rocky: importance Undecided Medium
2020-03-14 19:01:19 OpenStack Infra nova/rocky: status In Progress Fix Committed
2020-03-15 11:45:14 OpenStack Infra nova/queens: status New In Progress
2020-03-15 11:45:14 OpenStack Infra nova/queens: assignee s10 (vlad-esten)
2020-03-16 07:42:40 Alexander Rubtsov bug added subscriber Alexander Rubtsov
2020-03-19 21:09:36 OpenStack Infra nova/queens: status In Progress Fix Committed
2020-03-20 15:33:16 OpenStack Infra nova/pike: status Triaged In Progress
2020-03-25 06:40:44 OpenStack Infra nova/pike: assignee Balazs Gibizer (balazs-gibizer) Elod Illes (elod-illes)
2020-03-25 11:40:31 OpenStack Infra nova/pike: status In Progress Fix Committed
2022-08-01 11:04:42 OpenStack Infra nova/pike: status Fix Committed Fix Released
2022-11-11 18:11:58 OpenStack Infra nova/queens: status Fix Committed Fix Released
2022-11-11 18:20:35 OpenStack Infra nova/rocky: status Fix Committed Fix Released