FFWD upgrade prepare from queens to train fails with ConcurrentTransaction error

Bug #1869335 reported by Rabi Mishra
This bug affects 2 people
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Rabi Mishra
Milestone: ussuri-3

Bug Description

After the undercloud has been upgraded, FFWD upgrade prepare fails with a ConcurrentTransaction error when deleting a number of software deployments, which are no longer needed now that we've moved to config-download.

We've switched from POLL_TEMP_URL to POLL_SERVER_HEAT as the config transport in stable/train. It looks like that's causing the issue.

Though I've not fully understood what's going on with the switch, it seems reverting https://review.opendev.org/671980/ fixes the issue.

We can probably make the switch at a later point in time.

| 1503 | 64c97190-81fc-46e4-9da0-4d92fdd5cc2f | 8dcdfe51-95da-420a-9b52-cb0572d13d90 | SshHostPubKeyDeployment | 2020-03-19 14:28:33 | NULL | DELETE | FAILED | ConcurrentTransaction_Remote: resources.SshHostPubKeyDeployment: Concurrent transaction for deployments of server 9b6a501b-8f0a-460d-92a8-c9df70941b3e
Traceback (most recent call last):

  File "/usr/lib/python3.6/site-packages/heat/common/context.py", line 423, in wrapped
    return func(self, ctx, *args, **kwargs)

  File "/usr/lib/python3.6/site-packages/heat/engine/service.py", line 2278, in delete_software_deployment
    cnxt, deployment_id)

  File "/usr/lib/python3.6/site-packages/heat/engine/service_software_config.py", line 389, in delete_software_deployment
    cnxt, sd.server_id, sd.stack_user_project_id)

  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 292, in wrapped_f
    return self.call(f, *args, **kw)

  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 358, in call
    do = self.iter(retry_state=retry_state)

  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 331, in iter
    raise retry_exc.reraise()

  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 167, in reraise
    raise self.last_attempt.result()

  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()

  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception

  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 361, in call
    result = fn(*args, **kwargs)

  File "/usr/lib/python3.6/site-packages/heat/engine/service_software_config.py", line 127, in _push_metadata_software_deployments
    raise exception.ConcurrentTransaction(action=action)

heat.common.exception.ConcurrentTransaction: Concurrent transaction for deployments of server 9b6a501b-8f0a-460d-92a8-c9df70941b3e
 | 40286a1c-322c-41c0-95fd-23e2953ad410 | {} | null | NULL | 5 | [] | [1504] | NULL | NULL | 617 | NULL | aaa17b7d-2cd0-4849-b393-d0c1dd5458de | 2163 | NULL |
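
The traceback above ends in a retry loop (tenacity) re-raising ConcurrentTransaction from _push_metadata_software_deployments. A minimal sketch of that general pattern, an optimistic compare-and-swap update with bounded retries, is shown below. It is not heat's code: the ServerMetadata class, the push_metadata function, the version counter and the interfere hook are all hypothetical names introduced for illustration, and the retry loop is hand-written rather than using tenacity.

```python
import threading


class ConcurrentTransaction(Exception):
    """Stand-in for heat.common.exception.ConcurrentTransaction."""


class ServerMetadata:
    """Hypothetical in-memory row guarded by a version counter."""

    def __init__(self):
        self.version = 0
        self.deployments = []
        self._lock = threading.Lock()

    def read(self):
        # Return a snapshot of the version and a copy of the deployments.
        with self._lock:
            return self.version, list(self.deployments)

    def compare_and_swap(self, expected_version, deployments):
        # Write only if nobody else has written since expected_version.
        with self._lock:
            if self.version != expected_version:
                return False
            self.deployments = deployments
            self.version += 1
            return True


def push_metadata(row, deployment_id, max_attempts=3, interfere=None):
    """Remove deployment_id from the row, retrying when a race is lost.

    Returns the number of attempts used. Raises ConcurrentTransaction
    when every attempt loses the race, which is the situation the
    traceback reports.
    """
    for attempt in range(max_attempts):
        version, deployments = row.read()
        if interfere is not None:
            # Simulate another writer sneaking in between read and write.
            interfere(row)
        remaining = [d for d in deployments if d != deployment_id]
        if row.compare_and_swap(version, remaining):
            return attempt + 1
    raise ConcurrentTransaction(
        "Concurrent transaction for deployments of server")
```

With no competing writer, push_metadata succeeds on the first attempt. If another writer bumps the version on every attempt (for example, many deployments of the same server being deleted at once while its config transport is also being updated), the retries are exhausted and the exception propagates, as in the DELETE FAILED row above.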

Rabi Mishra (rabi)
Changed in tripleo:
status: New → Triaged
importance: Undecided → High
milestone: none → ussuri-3
Changed in tripleo:
assignee: nobody → Rabi Mishra (rabi)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/715360
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=6ff119ddacdeb55d1637ad8c6557225732c31d22
Submitter: Zuul
Branch: master

commit 6ff119ddacdeb55d1637ad8c6557225732c31d22
Author: Rabi Mishra <email address hidden>
Date: Fri Mar 27 16:00:10 2020 +0530

    Revert "Stop using swift temp url for config transport"

    This switch seems to be creating issues with upgrades, where a number of
    software deployments are deleted concurrently while updating the config
    transport for the server. Switching the config transport does not work
    with convergence heat and should be fixed in heat. We can revert this
    now, as we still use swift for other stuff in the undercloud. Can be
    changed once the issue is fixed in heat.

    It also reverts the following dependant commit.

    Revert "Cleanup SoftwareConfigTransport"

    This reverts commit (1821c01846a20da331959ff49fe8536f1e1bf86a and
    3ea9dd4040686b7c2ec82f7e7e467e0b3c3bd2a7)

    Closes-Bug: #1869335
    Change-Id: I835c8be3eecce91f8a370d036bf1085bc445e01d

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/716123

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/716123
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=f6b5ecde9e93b0f3663639e4cc96fdc925cecc76
Submitter: Zuul
Branch: stable/train

commit f6b5ecde9e93b0f3663639e4cc96fdc925cecc76
Author: Rabi Mishra <email address hidden>
Date: Fri Mar 27 16:00:10 2020 +0530

    Revert "Stop using swift temp url for config transport"

    This switch seems to be creating issues with upgrades, where a number of
    software deployments are deleted concurrently while updating the config
    transport for the server. Switching the config transport does not work
    with convergence heat and should be fixed in heat. We can revert this
    now, as we still use swift for other stuff in the undercloud. Can be
    changed once the issue is fixed in heat.

    It also reverts the following dependant commit.

    Revert "Cleanup SoftwareConfigTransport"

    This reverts commit (1821c01846a20da331959ff49fe8536f1e1bf86a and
    3ea9dd4040686b7c2ec82f7e7e467e0b3c3bd2a7)

    Closes-Bug: #1869335
    Change-Id: I835c8be3eecce91f8a370d036bf1085bc445e01d
    (cherry picked from commit 6ff119ddacdeb55d1637ad8c6557225732c31d22)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 12.2.0

This issue was fixed in the openstack/tripleo-heat-templates 12.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.4.0

This issue was fixed in the openstack/tripleo-heat-templates 11.4.0 release.
