shelve doesn't handle UnexpectedTaskStateError

Bug #1850694 reported by Artom Lifshitz
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Low
Artom Lifshitz
Queens
Fix Released
Low
Artom Lifshitz
Rocky
Fix Committed
Low
Artom Lifshitz
Stein
Fix Committed
Low
Artom Lifshitz
Train
Fix Committed
Low
Artom Lifshitz

Bug Description

Description
===========

Shelving a server expects its task state to be None. If it's not None (for example, if attempting to shelve a server that's already being shelved), we get a UnexpectedTaskStateError from the database layer and return a 500 to the user. A 409 would be more appropriate.

Steps to reproduce
==================

1. Send multiple shelve requests in quick succession.

Expected result
===============

The initial request should be accepted, the rest should return 409.

Actual result
=============

Error 500 for all requests except the first.

Environment
===========

This was reported on OSP13 (Queens) originally [1].

Logs & Configs
==============

2019-05-28 03:18:48.530 26 INFO nova.osapi_compute.wsgi.server [req-1437e513-3e32-4243-8f5d-1a7e17c111df 3ff59a48497842e7a716a03a17e5bf8b 493b17f3b02b4f9ea6e71b1ae4c5ac5d - e4c6faf4dfb04f2da40c0595f1a424c7 e4c6faf4dfb04f2da40c0595f1a424c7] 10.101.4.137,10.101.4.1 "POST /v2.1/493b17f3b02b4f9ea6e71b1ae4c5ac5d/servers/f905b880-9caa-465e-93c5-fffe9192c825/action HTTP/1.1" status: 500 len: 622 time: 0.1237578
2019-05-28 03:18:48.529 26 INFO nova.api.openstack.wsgi [req-1437e513-3e32-4243-8f5d-1a7e17c111df 3ff59a48497842e7a716a03a17e5bf8b 493b17f3b02b4f9ea6e71b1ae4c5ac5d - e4c6faf4dfb04f2da40c0595f1a424c7 e4c6faf4dfb04f2da40c0595f1a424c7] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.UnexpectedTaskStateError'>
2019-05-28 03:18:48.529 26 DEBUG nova.api.openstack.wsgi [req-1437e513-3e32-4243-8f5d-1a7e17c111df 3ff59a48497842e7a716a03a17e5bf8b 493b17f3b02b4f9ea6e71b1ae4c5ac5d - e4c6faf4dfb04f2da40c0595f1a424c7 e4c6faf4dfb04f2da40c0595f1a424c7] Returning 500 to user: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.UnexpectedTaskStateError'> __call__ /usr/lib/python2.7/site-packages/nova/api/openstack/wsgi.py:1064
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi [req-1437e513-3e32-4243-8f5d-1a7e17c111df 3ff59a48497842e7a716a03a17e5bf8b 493b17f3b02b4f9ea6e71b1ae4c5ac5d - e4c6faf4dfb04f2da40c0595f1a424c7 e4c6faf4dfb04f2da40c0595f1a424c7] Unexpected exception in API method: UnexpectedTaskStateError: Conflict updating instance f905b880-9caa-465e-93c5-fffe9192c825. Expected: {'task_state': [None]}. Actual: {'task_state': u'shelving'}
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi Traceback (most recent call last):
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/api/openstack/wsgi.py", line 788, in wrapped
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi return f(*args, **kwargs)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/api/openstack/compute/shelve.py", line 43, in _shelve
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi self.compute_api.shelve(context, instance)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 204, in inner
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi return function(self, context, instance, *args, **kwargs)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 152, in inner
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi return f(self, context, instance, *args, **kw)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 3518, in shelve
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi instance.save(expected_task_state=[None])
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi return fn(self, *args, **kwargs)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 826, in save
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi columns_to_join=_expected_cols(expected_attrs))
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 890, in instance_update_and_get_original
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi expected=expected)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 169, in wrapper
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi return f(*args, **kwargs)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 147, in wrapper
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi ectxt.value = e.inner_exc
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi self.force_reraise()
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi six.reraise(self.type_, self.value, self.tb)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 135, in wrapper
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi return f(*args, **kwargs)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 240, in wrapped
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi return f(context, *args, **kwargs)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 2737, in instance_update_and_get_original
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi context, instance_uuid, values, expected, original=instance_ref))
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 2878, in _instance_update
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi raise exc(**exc_props)
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi UnexpectedTaskStateError: Conflict updating instance f905b880-9caa-465e-93c5-fffe9192c825. Expected: {'task_state': [None]}. Actual: {'task_state': u'shelving'}
2019-05-28 03:18:48.523 26 ERROR nova.api.openstack.wsgi

Additional Info
===============

This is obviously minor, as the difference between a 500 and a 409 is purely semantic, but we're being told this is an SLA thing. An SLA defines 5xx as being "down", while 4xx is user error and therefore "up".

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1718171

Tags: api shelve
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/692206

Changed in nova:
assignee: nobody → Artom Lifshitz (notartom)
status: New → In Progress
Changed in nova:
assignee: Artom Lifshitz (notartom) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → Low
tags: added: api
Changed in nova:
assignee: Matt Riedemann (mriedem) → Artom Lifshitz (notartom)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/692206
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a423d8b27785623550f401a200b68ca926c0a119
Submitter: Zuul
Branch: master

commit a423d8b27785623550f401a200b68ca926c0a119
Author: Artom Lifshitz <email address hidden>
Date: Wed Oct 30 14:03:49 2019 -0400

    Avoid error 500 on shelve task_state race

    When shelving a server and saving its new SHELVING task state, we
    expect the current task state to be None and do not handle any
    UnexpectedTaskStateError exceptions that the database layer may throw
    at us. In those cases, we return an error 500 to the user, when in
    actuality an error 409 conflict would be more appropriate. This
    patch makes the API layer handle UnexpectedTaskStateError exceptions
    and return a 409 to the user.

    Closes-bug: 1850694
    Change-Id: Ie0b421cd5d3a5781c1dd09fab4ed013ece0f939d

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/692628

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/692630

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/692631

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/692632

Matt Riedemann (mriedem)
tags: added: shelve
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/train)

Reviewed: https://review.opendev.org/692628
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=32dbd2f585ef1902478170f3a1153b1f71e81db3
Submitter: Zuul
Branch: stable/train

commit 32dbd2f585ef1902478170f3a1153b1f71e81db3
Author: Artom Lifshitz <email address hidden>
Date: Wed Oct 30 14:03:49 2019 -0400

    Avoid error 500 on shelve task_state race

    When shelving a server and saving its new SHELVING task state, we
    expect the current task state to be None and do not handle any
    UnexpectedTaskStateError exceptions that the database layer may throw
    at us. In those cases, we return an error 500 to the user, when in
    actuality an error 409 conflict would be more appropriate. This
    patch makes the API layer handle UnexpectedTaskStateError exceptions
    and return a 409 to the user.

    Closes-bug: 1850694
    Change-Id: Ie0b421cd5d3a5781c1dd09fab4ed013ece0f939d
    (cherry picked from commit a423d8b27785623550f401a200b68ca926c0a119)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/692630
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8ee59604dcd9f770286f54e3a39c80f54413c358
Submitter: Zuul
Branch: stable/stein

commit 8ee59604dcd9f770286f54e3a39c80f54413c358
Author: Artom Lifshitz <email address hidden>
Date: Wed Oct 30 14:03:49 2019 -0400

    Avoid error 500 on shelve task_state race

    When shelving a server and saving its new SHELVING task state, we
    expect the current task state to be None and do not handle any
    UnexpectedTaskStateError exceptions that the database layer may throw
    at us. In those cases, we return an error 500 to the user, when in
    actuality an error 409 conflict would be more appropriate. This
    patch makes the API layer handle UnexpectedTaskStateError exceptions
    and return a 409 to the user.

    NOTE(artom) Conflicts:
     * nova/api/openstack/compute/shelve.py because
       InstanceUnknownCell stopped being handled as of
       95dc5d0ab558604df7d8ef31eba8f9efaee9c474 in train
     * nova/tests/unit/api/openstack/compute/test_shelve.py due to the
       vm_states import that was present in train as of
       27b6c18c666389ee68935f28cf340b7673879d6f but is missing from stein.

    Closes-bug: 1850694
    Change-Id: Ie0b421cd5d3a5781c1dd09fab4ed013ece0f939d
    (cherry picked from commit a423d8b27785623550f401a200b68ca926c0a119)
    (cherry picked from commit 32dbd2f585ef1902478170f3a1153b1f71e81db3)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.opendev.org/692631
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a90fe1951200ebd27fe74788c0a96c01104ac2cf
Submitter: Zuul
Branch: stable/rocky

commit a90fe1951200ebd27fe74788c0a96c01104ac2cf
Author: Artom Lifshitz <email address hidden>
Date: Wed Oct 30 14:03:49 2019 -0400

    Avoid error 500 on shelve task_state race

    When shelving a server and saving its new SHELVING task state, we
    expect the current task state to be None and do not handle any
    UnexpectedTaskStateError exceptions that the database layer may throw
    at us. In those cases, we return an error 500 to the user, when in
    actuality an error 409 conflict would be more appropriate. This
    patch makes the API layer handle UnexpectedTaskStateError exceptions
    and return a 409 to the user.

    NOTE(artom) Conflict in imports in
    nova/tests/unit/api/openstack/compute/test_shelve.py due to using
    oslo's uuidsentinel as of 8e1ca5bf34f924b9a23ddbcf31ae727a7dcacb20 in
    stein.

    Closes-bug: 1850694
    Change-Id: Ie0b421cd5d3a5781c1dd09fab4ed013ece0f939d
    (cherry picked from commit a423d8b27785623550f401a200b68ca926c0a119)
    (cherry picked from commit 32dbd2f585ef1902478170f3a1153b1f71e81db3)
    (cherry picked from commit 8ee59604dcd9f770286f54e3a39c80f54413c358)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.opendev.org/692632
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e72c6a3a491b141f7e74b6a34692c1bddec40a1a
Submitter: Zuul
Branch: stable/queens

commit e72c6a3a491b141f7e74b6a34692c1bddec40a1a
Author: Artom Lifshitz <email address hidden>
Date: Wed Oct 30 14:03:49 2019 -0400

    Avoid error 500 on shelve task_state race

    When shelving a server and saving its new SHELVING task state, we
    expect the current task state to be None and do not handle any
    UnexpectedTaskStateError exceptions that the database layer may throw
    at us. In those cases, we return an error 500 to the user, when in
    actuality an error 409 conflict would be more appropriate. This
    patch makes the API layer handle UnexpectedTaskStateError exceptions
    and return a 409 to the user.

    Closes-bug: 1850694
    Change-Id: Ie0b421cd5d3a5781c1dd09fab4ed013ece0f939d
    (cherry picked from commit a423d8b27785623550f401a200b68ca926c0a119)
    (cherry picked from commit 32dbd2f585ef1902478170f3a1153b1f71e81db3)
    (cherry picked from commit 8ee59604dcd9f770286f54e3a39c80f54413c358)
    (cherry picked from commit a90fe1951200ebd27fe74788c0a96c01104ac2cf)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.0.1

This issue was fixed in the openstack/nova 20.0.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.1.0

This issue was fixed in the openstack/nova 19.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.3.0

This issue was fixed in the openstack/nova 18.3.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova queens-eol

This issue was fixed in the openstack/nova queens-eol release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.