NovaObjectSerializer cannot handle backporting a nested object

Bug #1475254 reported by Nikola Đipanov
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: In Progress
Importance: High
Assigned to: Nikola Đipanov
Nominated for Kilo by Nikola Đipanov

Bug Description

NovaObjectSerializer calls obj_from_primitive and guards against IncompatibleObjectVersion; when that exception is raised, it asks the conductor to backport the object to the highest version it knows about. See:

https://github.com/openstack/nova/blob/35375133398d862a61334783c1e7a90b95f34cdb/nova/objects/base.py#L634

The problem arises when the top-level object can be deserialized but one of its nested objects raises IncompatibleObjectVersion. Because all exceptions from the recursion are handled at the top level, the conductor gets asked to backport the top-level object to the nested object's latest known version, which is completely wrong!

https://github.com/openstack/nova/blob/35375133398d862a61334783c1e7a90b95f34cdb/nova/objects/base.py#L643
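
To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the Nova code: the primitive layout, the KNOWN_VERSIONS registry, the deserialize helper and the version numbers other than PciDeviceList 1.1 are assumptions made up for illustration. It only mimics the pattern described above, where a single top-level handler catches an exception raised by a nested object.

```python
class IncompatibleObjectVersion(Exception):
    """Toy stand-in for the Nova exception of the same name."""

    def __init__(self, objname, objver, supported):
        super().__init__(f"{objname} {objver} unsupported (max {supported})")
        self.objname = objname
        self.objver = objver
        self.supported = supported


# Hypothetical registry: highest version of each object this node knows.
# Only the PciDeviceList value (1.1) comes from the report.
KNOWN_VERSIONS = {"Instance": "1.20", "PciDeviceList": "1.1"}


def _ver(version):
    """Turn '1.20' into (1, 20) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def obj_from_primitive(primitive):
    """Recursively validate a toy primitive, raising on a too-new version."""
    name, version = primitive["name"], primitive["version"]
    supported = KNOWN_VERSIONS[name]
    if _ver(version) > _ver(supported):
        raise IncompatibleObjectVersion(name, version, supported)
    for value in primitive["data"].values():
        if isinstance(value, dict) and "name" in value:
            # A failure in a nested object propagates up unchanged.
            obj_from_primitive(value)
    return primitive


def deserialize(primitive, backport_calls):
    """Top-level handler that reproduces the reported mistake."""
    try:
        return obj_from_primitive(primitive)
    except IncompatibleObjectVersion as exc:
        # Bug pattern: the backport target is always the top-level object,
        # but the version comes from whichever object raised.
        backport_calls.append((primitive["name"], exc.supported))
        return None


# An Instance this node understands, carrying a PciDeviceList that is
# newer (assumed 1.2) than the 1.1 this node knows about.
instance = {
    "name": "Instance",
    "version": "1.20",
    "data": {
        "pci_devices": {"name": "PciDeviceList", "version": "1.2", "data": {}},
    },
}

calls = []
deserialize(instance, calls)
print(calls)  # [('Instance', '1.1')]
```

The recorded request pairs the Instance with the PciDeviceList version, which is the mismatch this bug describes.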

We hit this while trying to fix https://bugs.launchpad.net/nova/+bug/1474074 and running upgrade tests with unpatched Kilo code. We bumped the PciDeviceList version on master and need to do the same on Kilo, but the stable/kilo patch cannot be landed first, so the highest PciDeviceList version a Kilo node knows about is 1.1. As a result we end up asking the conductor to backport the Instance to 1.1, which drops a whole bunch of things we need and then causes a lazy-loading exception (copied from the gate logs of https://review.openstack.org/#/c/201280/ PS 6):

2015-07-15 16:55:15.377 ERROR nova.compute.manager [req-fb91e079-1eef-4768-b315-9233c6b9946d tempest-ServerAddressesTestJSON-1642250859 tempest-ServerAddressesTestJSON-713705678] [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] Instance failed to spawn
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] Traceback (most recent call last):
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/compute/manager.py", line 2461, in _build_resources
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] yield resources
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/compute/manager.py", line 2333, in _build_and_run_instance
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] block_device_info=block_device_info)
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 2378, in spawn
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] write_to_disk=True)
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 4179, in _get_guest_xml
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] context)
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 3989, in _get_guest_config
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] pci_devs = pci_manager.get_instance_pci_devs(instance, 'all')
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/pci/manager.py", line 279, in get_instance_pci_devs
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] pci_devices = inst.pci_devices
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/objects/base.py", line 72, in getter
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] self.obj_load_attr(name)
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/objects/instance.py", line 1018, in obj_load_attr
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] self._load_generic(attrname)
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/objects/instance.py", line 908, in _load_generic
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] reason='loading %s requires recursion' % attrname)
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] ObjectActionError: Object action obj_load_attr failed because: loading pci_devices requires recursion
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23]

Paul Murray (pmurray) wrote :

Is this a duplicate of https://bugs.launchpad.net/nova/+bug/1275675 - or is there more to it?

Note there is an abandoned patch for bug/1275675: https://review.openstack.org/#/c/78605/

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/202554

Changed in nova:
assignee: nobody → Nikola Đipanov (ndipanov)
status: Confirmed → In Progress

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/202560

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Nikola Dipanov (<email address hidden>) on branch: master
Review: https://review.openstack.org/202554
Reason: Will abandon in favor of a better fix that does not potentially send more than one backleveling request for a highly nested structure

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/kilo)

Change abandoned by Nikola Dipanov (<email address hidden>) on branch: stable/kilo
Review: https://review.openstack.org/202560
