k8s network upgrade fails while upgrading k8s 1.21.8 to 1.22.5

Bug #1979129 reported by Kaustubh Dhokte
Affects    Status        Importance  Assigned to      Milestone
StarlingX  Fix Released  High        Kaustubh Dhokte

Bug Description

Brief Description
-----------------
The k8s upgrade from 1.21.8 to 1.22.5 fails because the kube-upgrade-networking step fails.

Severity
--------

Critical

Steps to Reproduce
------------------
Upgrade k8s from 1.21.8 to 1.22.5

Expected Behavior
------------------
The k8s upgrade should complete successfully.

Actual Behavior
----------------
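The k8s upgrade fails at the kube-upgrade-networking step: the Ansible task "Check if SRIOV config file is present in the backup tarball" aborts because 'previous_mode' is undefined (see Timestamp/Logs).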

Reproducibility
---------------
Reproducible

System Configuration
--------------------
AIO-DX

Branch/Pull Time/Commit
-----------------------

Last Pass
---------

Timestamp/Logs
--------------
TASK [common/upgrade-k8s-networking : Create Multus config file] ***************
Sunday 19 June 2022 04:39:11 +0000 (0:00:02.415) 0:00:31.659 ***********
ok: [localhost]

TASK [common/upgrade-k8s-networking : Update Multus Networking] ****************
Sunday 19 June 2022 04:39:11 +0000 (0:00:00.259) 0:00:31.919 ***********
changed: [localhost]

TASK [common/upgrade-k8s-networking : Check if SRIOV config file is present in the backup tarball] ***
Sunday 19 June 2022 04:39:12 +0000 (0:00:01.199) 0:00:33.118 ***********
fatal: [localhost]: FAILED! =>
  msg: |-
    The conditional check 'previous_mode == 'restore'' failed. The error was: error while evaluating conditional (previous_mode == 'restore'): 'previous_mode' is undefined

    The error appears to have been in '/usr/share/ansible/stx-ansible/playbooks/roles/common/upgrade-k8s-networking/tasks/main.yml': line 90, column 5, but may
    be elsewhere in the file depending on the exact syntax problem.

    The offending line appears to be:

    - block:
      - name: Check if SRIOV config file is present in the backup tarball
        ^ here

PLAY RECAP *********************************************************************
localhost : ok=64 changed=12 unreachable=0 failed=1

Sunday 19 June 2022 04:39:12 +0000 (0:00:00.016) 0:00:33.135 ***********
===============================================================================
Gathering Facts --------------------------------------------------------- 5.55s
common/upgrade-k8s-networking : Create the upgrade overrides file ------- 2.67s
common/upgrade-k8s-networking : Update Calico Networking ---------------- 2.42s
common/push-docker-images : Download images and push to local registry --- 2.36s
common/push-docker-images : Query the gcr_registry ---------------------- 1.54s
common/push-docker-images : Query the k8s_registry ---------------------- 1.42s
common/push-docker-images : Query the docker_registry ------------------- 1.41s
common/push-docker-images : Query the quay_registry --------------------- 1.36s
common/push-docker-images : Query the ghcr_registry --------------------- 1.35s
common/push-docker-images : Query the elastic_registry ------------------ 1.31s
common/upgrade-k8s-networking : Update Multus Networking ---------------- 1.20s
common/push-docker-images : Get local registry credentials -------------- 0.54s
common/upgrade-k8s-networking : Create Calico config file --------------- 0.50s
common/load-images-information : Check if additional image config file exists --- 0.33s
common/load-images-information : Get the list of kubernetes images ------ 0.32s
common/push-docker-images : Set secure to bool -------------------------- 0.32s
common/push-docker-images : set_fact ------------------------------------ 0.29s
common/push-docker-images : set_fact ------------------------------------ 0.29s
common/push-docker-images : set_fact ------------------------------------ 0.28s
common/upgrade-k8s-networking : Create Multus config file --------------- 0.26s
.
sysinv 2022-06-19 04:39:12.765 487575 WARNING sysinv.conductor.manager [-] ansible-playbook returned an error: 2
sysinv 2022-06-19 04:39:57.760 487575 INFO sysinv.common.rest_api [-] GET cmd:http://localhost:30001/nfvi-plugins/v1/sw-update hdr:{'Content-type': 'application/json', 'User-Agent': 'sysinv/1.0'} payload:None
sysinv 2022-06-19 04:39:57.762 487575 INFO sysinv.common.rest_api [-] Response={u'status': u'success', u'in-progress': False, u'sw-update-type': u'kube-upgrade'}
sysinv 2022-06-19 04:39:57.810 487575 INFO sysinv.conductor.manager [-] Upgrade in progress - defer platform managed application activity

Test Activity
-------------
Developer Testing

Workaround
----------

Changed in starlingx:
status: New → In Progress
Ghada Khalil (gkhalil) wrote :

Looks like this issue was introduced by this recent change: https://review.opendev.org/c/starlingx/ansible-playbooks/+/845612

Changed in starlingx:
importance: Undecided → High
Ghada Khalil (gkhalil) wrote :
Changed in starlingx:
assignee: nobody → Kaustubh Dhokte (kdhokte)
tags: added: stx.7.0 stx.containers stx.update
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/846476
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/eb17492b21125a567cfd01ff2425ccb26e50f68a
Submitter: "Zuul (22348)"
Branch: master

commit eb17492b21125a567cfd01ff2425ccb26e50f68a
Author: Kaustubh Dhokte <email address hidden>
Date: Sun Jun 19 02:07:44 2022 -0400

    Improve the condition for SRIOV config file presence check

    This change https://review.opendev.org/c/starlingx/ansible-playbooks/+/845612
    missed a check that if previous_mode is defined.
    Without this, k8s upgrade fails from 1.21.8 to 1.22.5.

    Test Plan (On CentOS):
    On AIO-DX
    PASS: 'system kube-upgrade-networking' successful
           when upgrading k8s 1.21.8 -> 1.22.5.

    Closes-Bug: 1979129

    Signed-off-by: Kaustubh Dhokte <email address hidden>
    Change-Id: I4313f646bcf9a208c9c49ee5676fa731a025b4cf
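
The guard described in the commit message can be sketched as follows. This is a minimal illustration and not the merged change itself: only the block structure and the task name come from the error output in this report. The `debug` task body is a hypothetical placeholder, and the exact wording of the `when` expression is an assumption.

```yaml
# Sketch only: the real task body in upgrade-k8s-networking/tasks/main.yml
# is not shown in this report, so a placeholder debug task stands in for it.
- block:
  - name: Check if SRIOV config file is present in the backup tarball
    debug:
      msg: "placeholder for the actual SRIOV config file check"

  # Test that the variable is defined before comparing it. Jinja2 evaluates
  # 'and' left to right and stops at the first false operand, so the
  # comparison is never reached when previous_mode is undefined.
  when: previous_mode is defined and previous_mode == 'restore'
```

An equivalent form, under the same assumptions, is `when: previous_mode | default('') == 'restore'`, which substitutes an empty string when the variable is not set. In either form the block is skipped during a plain k8s upgrade, where `previous_mode` is never defined, instead of failing the play.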

Changed in starlingx:
status: In Progress → Fix Released