undercloud-upgrade fails upgrade_tasks step3 'migrate existing introspection data' -> 'Lost connection to MySQL server during query'

Bug #1934658 reported by Marios Andreou
Affects   Status         Importance   Assigned to   Milestone
tripleo   Fix Released   Critical     Unassigned

Bug Description

On master [1] and wallaby [2][3], the tripleo-ci-centos-8-undercloud-upgrade (-wallaby) job is failing during "Upgrade tasks for step 3" with a trace like:

 2021-07-01 17:06:51 | 2021-07-01 17:06:51.747653 | fa163e50-07a3-a33e-56b9-0000000002ab | FATAL | migrate existing introspection data | undercloud | error={"changed": true, "cmd": "podman exec -u root ironic_inspector ironic-inspector-migrate-data --from swift --to database --config-file /etc/ironic-inspector/inspector.conf\n", "delta": "0:01:42.854225", "end": "2021-07-01 17:06:51.705402", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2021-07-01 17:05:08.851177", "stderr": "b\"(pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')\\n(Background on this error at: http://sqlalche.me/e/e3q8)\"", "stderr_lines": ["b\"(pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')\\n(Background on this error at: http://sqlalche.me/e/e3q8)\""], "stdout": "", "stdout_lines": []}
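
For triage, the error= payload in a line like the one above is JSON and can be decoded to confirm the failure is MySQL client error 2013 rather than something else. The following is only a rough sketch: the helper names parse_task_error and is_lost_connection are hypothetical, and the sample line is a shortened stand-in for the real trace with most fields trimmed.

```python
import json


def parse_task_error(log_line):
    """Decode the JSON payload that follows 'error=' in an Ansible log line."""
    marker = "error="
    idx = log_line.find(marker)
    if idx == -1:
        return None
    return json.loads(log_line[idx + len(marker):])


def is_lost_connection(payload):
    """True when the task failed and stderr carries MySQL client error 2013."""
    return payload.get("rc", 0) != 0 and "(2013," in payload.get("stderr", "")


# Shortened sample with the same shape as the trace above (assumption: fields trimmed).
sample = (
    '2021-07-01 17:06:51 | FATAL | migrate existing introspection data | undercloud | '
    'error={"cmd": "podman exec -u root ironic_inspector ironic-inspector-migrate-data", '
    '"rc": 1, "stderr": "(pymysql.err.OperationalError) '
    '(2013, \'Lost connection to MySQL server during query\')"}'
)

payload = parse_task_error(sample)
print(payload["rc"], is_lost_connection(payload))
```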

The job is not yet running in check/gate; it is being added/reintroduced with [4], so this is not a gate blocker.

[1] https://eb2a2511515e6460e8f6-cbeb9c893e713fd8fef388929b1d4977.ssl.cf2.rackcdn.com/793393/4/check/tripleo-ci-centos-8-undercloud-upgrade/0eb23a7/logs/undercloud/home/zuul/undercloud_upgrade.log
[2] https://ad94df435a366c083a4e-893f9315e5179a0462ce52bc0ed28dd9.ssl.cf5.rackcdn.com/793123/3/check/tripleo-ci-centos-8-undercloud-upgrade-wallaby/5006e9c/logs/undercloud/home/zuul/undercloud_upgrade.log
[3] https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_609/793135/2/check/tripleo-ci-centos-8-undercloud-upgrade-wallaby/609235d/logs/undercloud/home/zuul/undercloud_upgrade.log
[4] https://review.opendev.org/q/topic:wallaby-upgrade-jobs

Changed in tripleo:
importance: Undecided → Critical
Marios Andreou (marios-b) wrote :
Marios Andreou (marios-b) wrote :

Even though it isn't a promotion blocker per comment #1, I'm adding the promotion-blocker tag here so it gets tracked as CIX, hoping it may ring some bells with the DF or Upgrades squads.

tags: added: promotion-blocker
Marios Andreou (marios-b) wrote :
Marios Andreou (marios-b) wrote :

Testing to see if we can move that migration to step1 with https://review.opendev.org/c/openstack/tripleo-heat-templates/+/799832. @ramishra, is that sane, or does it have to be in step 3?

Otherwise, grateful for any thoughts on the issue here. Thanks.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/799974

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by "Marios Andreou <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/799974
Reason: was meant to update https://review.opendev.org/c/openstack/tripleo-heat-templates/+/799832

Marios Andreou (marios-b) wrote :

fix for this is at

https://review.opendev.org/c/openstack/tripleo-heat-templates/+/799832 "Moves undercloud upgrade introspection data migration to step 1"

Marios Andreou (marios-b) wrote :

Adding a comment here to point Upgrade squad to

For this bug, moving the introspection migration to step1 instead of step3 is OK, i.e. there is no issue running that task during step1.

However, the issue/question remains, as ramishra commented at https://review.opendev.org/c/openstack/tripleo-heat-templates/+/799832/2#message-2ba36424970cebb5b704583310934d4f8fe21b41

So the mysql container is stopped in step2 of the upgrade tasks. Is it expected that it remains down until the deployment steps are executed again (as part of the upgrade)? i.e. do we really have no db migration steps during the upgrade tasks that would need the DB for example?
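
The ordering problem can be illustrated with a small model of the upgrade task steps. This is a sketch under stated assumptions, not how tripleo-heat-templates evaluates upgrade_tasks: the UpgradeTask class, the find_broken_tasks helper and the "stop mysql container" task name are all hypothetical. Only the step numbers and the migration task name come from this report.

```python
from dataclasses import dataclass, field


@dataclass
class UpgradeTask:
    name: str
    step: int
    needs: set = field(default_factory=set)  # services that must be running
    stops: set = field(default_factory=set)  # services this task stops


def find_broken_tasks(tasks, running):
    """Return names of tasks that need a service stopped by an earlier step."""
    running = set(running)
    broken = []
    for step in sorted({t.step for t in tasks}):
        in_step = [t for t in tasks if t.step == step]
        # Simplifying assumption: within one step, needs are checked
        # before any stops from that same step take effect.
        for t in in_step:
            if not t.needs <= running:
                broken.append(t.name)
        for t in in_step:
            running -= t.stops
    return broken


def plan(migration_step):
    """Two-task plan: mysql is stopped in step 2, migration runs in the given step."""
    return [
        UpgradeTask("stop mysql container", step=2, stops={"mysql"}),
        UpgradeTask("migrate existing introspection data",
                    step=migration_step, needs={"mysql"}),
    ]


print(find_broken_tasks(plan(3), running={"mysql"}))  # migration after mysql stops
print(find_broken_tasks(plan(1), running={"mysql"}))  # migration before mysql stops
```

With the migration in step 3 the model flags it as broken, and with the migration in step 1 it reports nothing. The same check would flag any other upgrade task that needs the database after step 2, which is the open question above.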

Jose Luis Franco (jfrancoa) wrote :

So, this task was added in https://github.com/openstack/tripleo-heat-templates/commit/0015cc74416445c3cf4e18e566a954f609f7cf0f because an issue with the redo log format caused problems during the upgrade from mysql 10.1 to 10.3. I'll try to confirm with dciabrin whether this is still needed, or whether we could get rid of it in wallaby. I'll post back with the info.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/800007

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/799832
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/dce6bee2f9e477fd53f01b76b8fad4c7ef79cf8b
Submitter: "Zuul (22348)"
Branch: master

commit dce6bee2f9e477fd53f01b76b8fad4c7ef79cf8b
Author: Marios Andreou <email address hidden>
Date: Thu Jul 8 12:03:27 2021 +0300

    Moves undercloud upgrade introspection data migration to step 1

    This moves the ironic introspection data migration from swift
    to dbase to step1 of upgrade_tasks. It was previously in step3 but
    that caused related-bug since mysql is down at that point. Needed
    by [1] to fix the failing undercloud-upgrade job.

    [1] https://review.opendev.org/c/openstack/tripleo-ci/+/793393
    Related-Bug: 1934658

    Change-Id: Ic68fbc91e538ebb101b67b2abb3e285a79d770ab

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/800007
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/81373cb67777c44f1732552e3708f44d939d5d1c
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 81373cb67777c44f1732552e3708f44d939d5d1c
Author: Marios Andreou <email address hidden>
Date: Thu Jul 8 12:03:27 2021 +0300

    Moves undercloud upgrade introspection data migration to step 1

    This moves the ironic introspection data migration from swift
    to dbase to step1 of upgrade_tasks. It was previously in step3 but
    that caused related-bug since mysql is down at that point. Needed
    by [1] to fix the failing undercloud-upgrade job.

    [1] https://review.opendev.org/c/openstack/tripleo-ci/+/793393
    Related-Bug: 1934658

    Change-Id: Ic68fbc91e538ebb101b67b2abb3e285a79d770ab
    (cherry picked from commit dce6bee2f9e477fd53f01b76b8fad4c7ef79cf8b)

tags: added: in-stable-wallaby
Marios Andreou (marios-b) wrote :

moving this to fix-released for now - the job is green at https://review.opendev.org/c/openstack/tripleo-ci/+/793393 (which depends-on the step1 fix from comment #11 above).

If there is further work to do related to comment #9, please move this back to In Progress and update it with the relevant information.

Changed in tripleo:
status: Triaged → Fix Released