Failed to abort overcloud update

Bug #1668269 reported by Yurii Prokulevych
This bug affects 1 person
Affects   Status         Importance   Assigned to       Milestone
tripleo   Fix Released   Critical     Brad P. Crochet

Bug Description

An attempt to abort an overcloud update hangs:

overcloud update abort overcloud
Started Mistral Workflow tripleo.package_update.v1.cancel_stack_update. Execution ID: d732cefd-9ade-48d2-9c39-0a5c60d092f0
Waiting for messages on queue '66e08597-312c-4574-ac93-3f3bac79c778' with no timeout.

It looks like the Mistral tasks fail:
------------------------------
mistral task-list d732cefd-9ade-48d2-9c39-0a5c60d092f0
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
+------------------------------+------------------------------+------------------------------+------------------------------+---------+------------------------------+---------------------+---------------------+
| ID | Name | Workflow name | Execution ID | State | State info | Created at | Updated at |
+------------------------------+------------------------------+------------------------------+------------------------------+---------+------------------------------+---------------------+---------------------+
| 416e7e3c-5502-4eb3-bf31-f6d49b61c298 | set_cancel_stack_update_failed | tripleo.package_update.v1.cancel_stack_update | d732cefd-9ade-48d2-9c39-0a5c60d092f0 | SUCCESS | None | 2017-02-27 13:04:46 | 2017-02-27 13:04:46 |
| d6c3ea0e-eaf7-4924-8233-18e4030770ab | send_message | tripleo.package_update.v1.cancel_stack_update | d732cefd-9ade-48d2-9c39-0a5c60d092f0 | ERROR | Failed to run action [act... | 2017-02-27 13:04:46 | 2017-02-27 13:04:47 |
| fa41b255-382b-44ee-bde2-31b197fb2789 | cancel_stack_update | tripleo.package_update.v1.cancel_stack_update | d732cefd-9ade-48d2-9c39-0a5c60d092f0 | ERROR | Failed to run action [act... | 2017-02-27 13:04:46 | 2017-02-27 13:04:46 |
+------------------------------+------------------------------+------------------------------+------------------------------+---------+------------------------------+---------------------+---------------------+

mistral task-get 416e7e3c-5502-4eb3-bf31-f6d49b61c298
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
+---------------+-----------------------------------------------+
| Field | Value |
+---------------+-----------------------------------------------+
| ID | 416e7e3c-5502-4eb3-bf31-f6d49b61c298 |
| Name | set_cancel_stack_update_failed |
| Workflow name | tripleo.package_update.v1.cancel_stack_update |
| Execution ID | d732cefd-9ade-48d2-9c39-0a5c60d092f0 |
| State | SUCCESS |
| State info | None |
| Created at | 2017-02-27 13:04:46 |
| Updated at | 2017-02-27 13:04:46 |
+---------------+-----------------------------------------------+

mistral task-get d6c3ea0e-eaf7-4924-8233-18e4030770ab
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ID | d6c3ea0e-eaf7-4924-8233-18e4030770ab |
| Name | send_message |
| Workflow name | tripleo.package_update.v1.cancel_stack_update |
| Execution ID | d732cefd-9ade-48d2-9c39-0a5c60d092f0 |
| State | ERROR |
| State info | Failed to run action [action_ex_id=394def26-17d3-44e6-b99f-65751adc9813, action_cls='<class 'mistral.actions.action_factory.ZaqarAction'>', attributes='{u'client_method_name': |
| | u'queue_post'}', params='{u'queue_name': u'<% $.queue_name', u'messages': {u'body': {u'type': u'tripleo.package_update.v1.cancel_stack_update', u'payload': {u'status': u'FAILED', u'message': |
| | u"Failed to run action [action_ex_id=09e4ea8b-54b1-4944-8e89-1db3de529512, action_cls='<class 'mistral.actions.action_factory.CancelStackUpdateAction'>', attributes='{}', |
| | params='{u'stack_id': u'5f54018a-938d-4503-ac21-634b9b7ff37a'}']\n __init__() takes exactly 5 arguments (4 given)", u'execution': {u'input': {u'stack_id': u'5f54018a- |
| | 938d-4503-ac21-634b9b7ff37a', u'queue_name': u'66e08597-312c-4574-ac93-3f3bac79c778'}, u'params': {}, u'id': u'd732cefd-9ade-48d2-9c39-0a5c60d092f0', u'name': |
| | u'tripleo.package_update.v1.cancel_stack_update', u'spec': {u'input': [u'stack_id', {u'queue_name': u'tripleo'}], u'tasks': {u'cancel_stack_update': {u'name': u'cancel_stack_update', u'on- |
| | error': u'set_cancel_stack_update_failed', u'on-success': u'send_message', u'version': u'2.0', u'action': u'tripleo.package_update.cancel_stack_update stack_id=<% $.stack_id %>', u'type': |
| | u'direct'}, u'send_message': {u'action': u'zaqar.queue_post', u'input': {u'queue_name': u'<% $.queue_name', u'messages': {u'body': {u'type': u'tripleo.package_update.v1.cancel_stack_update', |
| | u'payload': {u'status': u"<% $.get('status', 'SUCCESS') %>", u'message': u"<% $.get('message', '') %>", u'execution': u'<% execution() %>'}}}}, u'version': u'2.0', u'type': u'direct', |
| | u'name': u'send_message'}, u'set_cancel_stack_update_failed': {u'version': u'2.0', u'type': u'direct', u'name': u'set_cancel_stack_update_failed', u'publish': {u'status': u'FAILED', |
| | u'message': u'<% task(cancel_stack_update).result %>'}, u'on-success': u'send_message'}}, u'description': u'Cancel a currently running stack update', u'version': u'2.0', u'name': |
| | u'cancel_stack_update'}}}}}}'] |
| | ZaqarAction.queue_post failed: <class 'zaqarclient.transport.errors.MalformedRequest'>: Error response from Zaqar. Code: 400. Title: Invalid queue identification. Description: The format of |
| | the submitted queue name or project id is not valid.. |
| Created at | 2017-02-27 13:04:46 |
| Updated at | 2017-02-27 13:04:47 |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
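
Note on the Zaqar error above: in the state info, the queue_name parameter appears as the literal text '<% $.queue_name' (no closing '%>'), so it may have been posted to Zaqar unevaluated rather than as the queue ID. This report does not confirm that reading. The sketch below only illustrates why such a string would be rejected, assuming Zaqar restricts queue names to at most 64 ASCII letters, digits, underscores, and hyphens; the regex and the helper name are illustrative, not Zaqar code.

```python
import re

# Assumed rule (not taken from this report): 1 to 64 characters drawn from
# ASCII letters, digits, underscores, and hyphens.
QUEUE_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def looks_like_valid_queue_name(name):
    """Hypothetical helper: True if name satisfies the assumed rule."""
    return QUEUE_NAME_RE.fullmatch(name) is not None


# Literal text seen in the params of the failed send_message task.
unevaluated = "<% $.queue_name"
# Queue ID the client reported it was waiting on.
expected = "66e08597-312c-4574-ac93-3f3bac79c778"

print(looks_like_valid_queue_name(unevaluated))
print(looks_like_valid_queue_name(expected))
```

Under that assumption, the unevaluated expression fails the check because of the '<', '%', '$', '.', and space characters, while the real queue ID passes.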

mistral task-get fa41b255-382b-44ee-bde2-31b197fb2789
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ID | fa41b255-382b-44ee-bde2-31b197fb2789 |
| Name | cancel_stack_update |
| Workflow name | tripleo.package_update.v1.cancel_stack_update |
| Execution ID | d732cefd-9ade-48d2-9c39-0a5c60d092f0 |
| State | ERROR |
| State info | Failed to run action [action_ex_id=09e4ea8b-54b1-4944-8e89-1db3de529512, action_cls='<class 'mistral.actions.action_factory.CancelStackUpdateAction'>', attributes='{}', params='{u'stack_id': |
| | u'5f54018a-938d-4503-ac21-634b9b7ff37a'}'] |
| | __init__() takes exactly 5 arguments (4 given) |
| Created at | 2017-02-27 13:04:46 |
| Updated at | 2017-02-27 13:04:46 |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Brad P. Crochet (brad-9)
Changed in tripleo:
assignee: nobody → Brad P. Crochet (brad-9)
importance: Undecided → Critical
status: New → In Progress
milestone: none → ocata-rc2
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.openstack.org/438583

Brad P. Crochet (brad-9)
tags: added: ocata-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/438583
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=d2ee6d070e755cb77ec8fcddeec56bc97df10b3d
Submitter: Jenkins
Branch: master

commit d2ee6d070e755cb77ec8fcddeec56bc97df10b3d
Author: Brad P. Crochet <email address hidden>
Date: Mon Feb 27 09:17:31 2017 -0500

    Fix wrong args to update manager

    The abort update and clear breakpoints actions were using the wrong
    number of arguments to the update manager. This fixes that, and adds
    to the unit tests to detect the problem.

    Change-Id: I198a33c8608648c7abcafc2cfb1aefb0fb8417e6
    Closes-Bug: #1668269
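
The diff is not reproduced in this report, so the following is only a sketch of the kind of failure the commit message describes and of one way a unit test can detect it. UpdateManager, its parameters, and the two cancel functions are hypothetical stand-ins, not the tripleo-common code. The point is that a plain Mock accepts any call, whereas a mock built with create_autospec enforces the constructor signature.

```python
from unittest import mock


class UpdateManager:
    """Hypothetical stand-in; the real signature is not shown in this report."""

    def __init__(self, heat_client, nova_client, stack_id, stack_fields):
        self.stack_id = stack_id


def cancel_update_buggy(manager_cls, heat_client, nova_client, stack_id):
    # Passes three arguments where the constructor expects four.
    return manager_cls(heat_client, nova_client, stack_id)


def cancel_update_fixed(manager_cls, heat_client, nova_client, stack_id):
    # Supplies the fourth argument (an empty dict, purely illustrative).
    return manager_cls(heat_client, nova_client, stack_id, {})


def raises_type_error(func, manager_cls):
    """Return True if calling func with manager_cls raises TypeError."""
    try:
        func(manager_cls, "heat", "nova", "5f54018a-938d-4503-ac21-634b9b7ff37a")
    except TypeError:
        return True
    return False


# A plain Mock accepts any call signature, so the buggy call goes unnoticed.
plain_mock_detects_bug = raises_type_error(cancel_update_buggy, mock.Mock())

# An autospecced mock checks calls against __init__ and rejects the buggy call.
autospec_detects_bug = raises_type_error(
    cancel_update_buggy, mock.create_autospec(UpdateManager)
)

# The corrected call passes the same signature check.
autospec_rejects_fix = raises_type_error(
    cancel_update_fixed, mock.create_autospec(UpdateManager)
)
```

On Python 3 the wording of the TypeError differs from the Python 2.7 message quoted in the bug description, but the signature mismatch is the same.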

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/438725

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/ocata)

Reviewed: https://review.openstack.org/438725
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=944c16214ffb8259b7a1cd6551169e736a129dfb
Submitter: Jenkins
Branch: stable/ocata

commit 944c16214ffb8259b7a1cd6551169e736a129dfb
Author: Brad P. Crochet <email address hidden>
Date: Mon Feb 27 09:17:31 2017 -0500

    Fix wrong args to update manager

    The abort update and clear breakpoints actions were using the wrong
    number of arguments to the update manager. This fixes that, and adds
    to the unit tests to detect the problem.

    Change-Id: I198a33c8608648c7abcafc2cfb1aefb0fb8417e6
    Closes-Bug: #1668269
    (cherry picked from commit d2ee6d070e755cb77ec8fcddeec56bc97df10b3d)

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 6.0.0

This issue was fixed in the openstack/tripleo-common 6.0.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 7.0.0.0b1

This issue was fixed in the openstack/tripleo-common 7.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (stable/ocata)

Reviewed: https://review.openstack.org/481673
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=879d73c5a4c4fda99412a16229931ab91232796d
Submitter: Jenkins
Branch: stable/ocata

commit 879d73c5a4c4fda99412a16229931ab91232796d
Author: Brad P. Crochet <email address hidden>
Date: Thu Mar 23 13:18:04 2017 -0400

    Remove update abort

    The update abort is not reliable, and could mess up a TripleO stack
    beyond repair. Since this has a potential for data loss, I would
    suggest this be removed without a deprecation period.

    Change-Id: Ieec4f01e38768eafb3df1f06340bdd3e220d30bd
    Related-Bug: #1668269
    (cherry picked from commit 1d3231de5a74d5ddf89cd96ccb598e2293a36b2f)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (stable/ocata)

Reviewed: https://review.openstack.org/481679
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=3beb3402c522a552542e6f6ecfa7089a35b12480
Submitter: Jenkins
Branch: stable/ocata

commit 3beb3402c522a552542e6f6ecfa7089a35b12480
Author: Brad P. Crochet <email address hidden>
Date: Thu Mar 23 13:35:41 2017 -0400

    Remove update abort

    The update abort is not reliable, and could mess up a TripleO stack
    beyond repair. Since this has a potential for data loss, I would
    suggest this be removed without a deprecation period.

    Depends-On: Ieec4f01e38768eafb3df1f06340bdd3e220d30bd

    Change-Id: I752e061979d667c1fb2b115c1a7339002e1824d5
    Related-Bug: #1668269
    (cherry picked from commit b14c0f727e65de9dbc76160730b413ddac4a54e9)
