Ironic driver tries to update the compute_node's UUID, which fails for existing compute_node records

Bug #1798172 reported by Surya Seetharaman
Affects                         Status         Importance  Assigned to        Milestone
OpenStack Compute (nova)        Fix Released   High        Matt Riedemann
OpenStack Compute (nova) rocky  Fix Committed  High        Surya Seetharaman

Bug Description

The patch https://review.openstack.org/#/c/571535 was introduced to keep the same uuid value for ironic nodes and their corresponding compute_node records. This works fine when new nodes are created. However, on restart and during periodic resource updates, the resource tracker tries to apply the uuid reported by the ironic driver to an existing compute_node record, and this fails because ComputeNode.uuid is a read-only field: once a uuid is set it cannot be changed.
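
For illustration, a minimal, hypothetical sketch of the failure mode (assumes oslo.versionedobjects is installed; FakeComputeNode is a stand-in for nova.objects.ComputeNode, whose uuid field is declared read_only):

    from oslo_versionedobjects import base, exception, fields

    @base.VersionedObjectRegistry.register
    class FakeComputeNode(base.VersionedObject):
        fields = {
            # Declared the same way as ComputeNode.uuid: writable once.
            'uuid': fields.UUIDField(read_only=True),
        }

    node = FakeComputeNode()
    node.uuid = 'e3ef5531-3c39-458a-99f0-fd44592ae1ae'  # first set succeeds
    try:
        # Assigning a different value to a read_only field is rejected,
        # which is what happens when the ironic driver reports a uuid
        # that differs from the record's pre-existing random uuid.
        node.uuid = '04adb5c0-27f0-4ebd-a1d5-859d6efb769c'
    except exception.ReadOnlyFieldError as err:
        print(err)  # Cannot modify readonly field uuid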

Error traceback:

2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager [req-62cf93be-023b-41ca-8971-d3dbab4324f8 - - - - -] Error updating resources for node e3ef5531-3c39-458a-99f0-fd44592ae1ae.: ReadOnlyFieldError: Cannot modify readonly field uuid
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager Traceback (most recent call last):
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7337, in update_available_resource_for_node
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager rt.update_available_resource(context, nodename)
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 680, in update_available_resource
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager self._update_available_resource(context, resources)
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager return f(*args, **kwargs)
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 704, in _update_available_resource
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager self._init_compute_node(context, resources)
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 559, in _init_compute_node
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager self._copy_resources(cn, resources)
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 617, in _copy_resources
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager compute_node.update_from_virt_driver(resources)
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/objects/compute_node.py", line 353, in update_from_virt_driver
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager setattr(self, key, resources[key])
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 77, in setter
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager raise exception.ReadOnlyFieldError(field=name)
2018-10-11 02:30:34.142 21850 ERROR nova.compute.manager ReadOnlyFieldError: Cannot modify readonly field uuid

Matt Riedemann (mriedem)
Changed in nova:
status: New → Triaged
importance: Undecided → High
tags: added: upgrade
Revision history for this message
Matt Riedemann (mriedem) wrote :

Confirmed in the ironic grenade job in a stable/rocky change:

http://logs.openstack.org/00/607600/1/check/ironic-grenade-dsvm/4d493b1/logs/screen-n-cpu.txt.gz#_Oct_03_18_33_59_072341

Oct 03 18:33:59.072341 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager [None req-e6c511df-45eb-4b7d-b3a5-032f597d1d01 None None] Error updating resources for node 04adb5c0-27f0-4ebd-a1d5-859d6efb769c.: ReadOnlyFieldError: Cannot modify readonly field uuid
Oct 03 18:33:59.072520 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager Traceback (most recent call last):
Oct 03 18:33:59.072688 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/manager.py", line 7746, in _update_available_resource_for_node
Oct 03 18:33:59.072845 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager rt.update_available_resource(context, nodename)
Oct 03 18:33:59.072987 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 724, in update_available_resource
Oct 03 18:33:59.073143 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager self._update_available_resource(context, resources)
Oct 03 18:33:59.073300 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner
Oct 03 18:33:59.073464 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager return f(*args, **kwargs)
Oct 03 18:33:59.073621 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 747, in _update_available_resource
Oct 03 18:33:59.073786 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager self._init_compute_node(context, resources)
Oct 03 18:33:59.073922 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 572, in _init_compute_node
Oct 03 18:33:59.074065 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager self._copy_resources(cn, resources)
Oct 03 18:33:59.074215 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 649, in _copy_resources
Oct 03 18:33:59.074384 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager compute_node.update_from_virt_driver(resources)
Oct 03 18:33:59.074548 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/objects/compute_node.py", line 354, in update_from_virt_driver
Oct 03 18:33:59.074690 ubuntu-xenial-vexxhost-sjc1-0002567935 nova-compute[14949]: ERROR nova.compute.manager setattr(self, key, resources[key])
Oct 03 18:33:59.074830 ubunt...


Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
Revision history for this message
Matt Riedemann (mriedem) wrote :

It doesn't blow up on startup because the exception is caught and logged in ComputeManager._update_available_resource_for_node.
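
The catch-and-log pattern looks roughly like this (a condensed, hypothetical sketch; the method body and helper names approximate nova's ComputeManager rather than quoting it):

    import logging

    LOG = logging.getLogger(__name__)

    class ComputeManager(object):  # condensed stand-in for nova's manager

        def _update_available_resource_for_node(self, context, nodename):
            rt = self._get_resource_tracker()
            try:
                rt.update_available_resource(context, nodename)
            except Exception:
                # The ReadOnlyFieldError lands here: it is logged on every
                # periodic run (the ERROR lines above) but never re-raised,
                # so nova-compute keeps running with stale resource data.
                LOG.exception('Error updating resources for node %s.',
                              nodename)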

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/611162

Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/611337

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/611162
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=498413074d1f11688123b6b592d5c204dc7b5ef2
Submitter: Zuul
Branch: master

commit 498413074d1f11688123b6b592d5c204dc7b5ef2
Author: Matt Riedemann <email address hidden>
Date: Tue Oct 16 16:23:54 2018 -0400

    Ignore uuid if already set in ComputeNode.update_from_virt_driver

    Change Ia69fabce8e7fd7de101e291fe133c6f5f5f7056a sets the
    ComputeNode.uuid to whatever the virt driver reports if the
    virt driver reports a uuid, like in the case of ironic.

    However, that breaks upgrades for any pre-existing compute
    node records which have a random uuid since ComputeNode.uuid
    is a read-only field once set.

    This change simply ignores the uuid from the virt driver
    resources dict if the ComputeNode.uuid is already set.

    The bug actually shows up in the ironic grenade CI job
    logs in stable/rocky but didn't fail the nova-compute startup
    because ComputeManager._update_available_resource_for_node()
    catches and just logs the error, but it doesn't kill the service.

    Change-Id: Id02f501feefca358d36f39b24d426537685e425c
    Closes-Bug: #1798172
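
A sketch of the approach the commit message describes (illustrative only, with an abridged key list; not the verbatim merged diff):

    def update_from_virt_driver(self, resources):
        # Abridged key list for illustration; the real method copies many
        # more resource fields from the virt driver's dict.
        keys = ['vcpus', 'memory_mb', 'local_gb', 'uuid']
        for key in keys:
            if key not in resources:
                continue
            if key == 'uuid' and self.obj_attr_is_set('uuid'):
                # uuid is read-only once set, so a pre-existing (random)
                # compute_node uuid wins over whatever ironic reports.
                continue
            setattr(self, key, resources[key])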

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.openstack.org/611337
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=57566f4c8d2929c25e76564883369a7c6eda720a
Submitter: Zuul
Branch: stable/rocky

commit 57566f4c8d2929c25e76564883369a7c6eda720a
Author: Matt Riedemann <email address hidden>
Date: Tue Oct 16 16:23:54 2018 -0400

    Ignore uuid if already set in ComputeNode.update_from_virt_driver

    Change Ia69fabce8e7fd7de101e291fe133c6f5f5f7056a sets the
    ComputeNode.uuid to whatever the virt driver reports if the
    virt driver reports a uuid, like in the case of ironic.

    However, that breaks upgrades for any pre-existing compute
    node records which have a random uuid since ComputeNode.uuid
    is a read-only field once set.

    This change simply ignores the uuid from the virt driver
    resources dict if the ComputeNode.uuid is already set.

    The bug actually shows up in the ironic grenade CI job
    logs in stable/rocky but didn't fail the nova-compute startup
    because ComputeManager._update_available_resource_for_node()
    catches and just logs the error, but it doesn't kill the service.

    Change-Id: Id02f501feefca358d36f39b24d426537685e425c
    Closes-Bug: #1798172
    (cherry picked from commit 498413074d1f11688123b6b592d5c204dc7b5ef2)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.3

This issue was fixed in the openstack/nova 18.0.3 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.0.0rc1

This issue was fixed in the openstack/nova 19.0.0.0rc1 release candidate.
