AIO-SX upgrade: Ansible execution failure

Bug #1978721 reported by Lucas
This bug affects 1 person

Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Lucas

Bug Description

Brief Description

Ansible execution fails during an AIO-SX upgrade.


Severity

standard

Steps to Reproduce

1. Install a simplex lab with stx 6.0.
2. Follow the upgrade procedure for AIO-SX to stx/master. Ansible execution fails.

Expected Behavior

Ansible completes the upgrade without failure.

Actual Behavior

Ansible execution fails while waiting for the FluxCD controller deployments to become ready.

Reproducibility

100% reproducible; seen in 2 labs.

System Configuration

WCP-13, AIO-SX

Load info

stx 6.0


Timestamp/Logs

2022-06-10 18:43:00,023 p=29480 u=sysadmin | changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':

{u'namespace': u'flux-helm', u'deployment': u'helm-controller'}
, u'ansible_job_id': u'332081948425.153745', 'failed': False, u'started': 1, 'changed': True, 'item': {u'namespace': u'flux-helm', u'deployment': u'helm-controller'}, u'finished': 0, u'results_file': u'/root/.ansible_async/332081948425.153745', '_ansible_ignore_errors': None, '_ansible_no_log': False})

2022-06-10 18:43:00,147 p=29480 u=sysadmin | changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':

{u'namespace': u'flux-helm', u'deployment': u'source-controller'}
, u'ansible_job_id': u'291623480296.153882', 'failed': False, u'started': 1, 'changed': True, 'item': {u'namespace': u'flux-helm', u'deployment': u'source-controller'}, u'finished': 0, u'results_file': u'/root/.ansible_async/291623480296.153882', '_ansible_ignore_errors': None, '_ansible_no_log': False})

2022-06-10 18:43:00,156 p=29480 u=sysadmin | TASK [common/fluxcd-controllers : Fail if the helm and source controllers are not ready by this time] ****************************************************************************************************************************************************

2022-06-10 18:43:00,156 p=29480 u=sysadmin | Friday 10 June 2022 18:43:00 +0000 (0:03:59.484) 0:23:35.492 ***********

2022-06-10 18:43:00,207 p=29480 u=sysadmin | failed: [localhost] (item={'_ansible_parsed': True, 'stderr_lines': [u'error: timed out waiting for the condition on deployments/helm-controller'], u'changed': True, u'stderr': u'error: timed out waiting for the condition on deployments/helm-controller', u'ansible_job_id': u'332081948425.153745', u'stdout': u'', '_ansible_item_result': True, u'invocation': {u'module_args': {u'creates': None, u'executable': None, u'_uses_shell': False, u'_raw_params': u'kubectl --kubeconfig=/etc/kubernetes/admin.conf wait --namespace=flux-helm --for=condition=Available deployment helm-controller --timeout=240s', u'removes': None, u'argv': None, u'warn': True, u'chdir': None, u'stdin': None}}, 'attempts': 40, u'delta': u'0:04:00.064085', 'stdout_lines': [], 'failed_when_result': False, '_ansible_no_log': False, u'end': u'2022-06-10 18:42:58.682014', '_ansible_item_label': {'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':

{u'namespace': u'flux-helm', u'deployment': u'helm-controller'}
, u'ansible_job_id': u'332081948425.153745', 'item': {u'namespace': u'flux-helm', u'deployment': u'helm-controller'}, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/root/.ansible_async/332081948425.153745', '_ansible_ignore_errors': None, '_ansible_no_log': False}, u'start': u'2022-06-10 18:38:58.617929', u'cmd': [u'kubectl', u'--kubeconfig=/etc/kubernetes/admin.conf', u'wait', u'--namespace=flux-helm', u'--for=condition=Available', u'deployment', u'helm-controller', u'--timeout=240s'], u'finished': 1, u'failed': False, 'item': {'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_no_log': False, u'ansible_job_id': u'332081948425.153745', 'item':

{u'namespace': u'flux-helm', u'deployment': u'helm-controller'}
, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/root/.ansible_async/332081948425.153745', '_ansible_ignore_errors': None, '_ansible_item_label': {u'namespace': u'flux-helm', u'deployment': u'helm-controller'}}, u'rc': 1, u'msg': u'non-zero return code', '_ansible_ignore_errors': None}) => changed=false

  item:

    ansible_job_id: '332081948425.153745'

    attempts: 40

    changed: true

    cmd:

    - kubectl

    - --kubeconfig=/etc/kubernetes/admin.conf

    - wait

    - --namespace=flux-helm

    - --for=condition=Available

    - deployment

    - helm-controller

    - --timeout=240s

    delta: '0:04:00.064085'

    end: '2022-06-10 18:42:58.682014'

    failed: false

    failed_when_result: false

    finished: 1

    invocation:

      module_args:

        _raw_params: kubectl --kubeconfig=/etc/kubernetes/admin.conf wait --namespace=flux-helm --for=condition=Available deployment source-controller --timeout=240s

        _uses_shell: false

        argv: null

        chdir: null

        creates: null

        executable: null

        removes: null

        stdin: null

        warn: true

    item:

      ansible_job_id: '291623480296.153882'

      changed: true

      failed: false

      finished: 0

      item:

        deployment: source-controller

        namespace: flux-helm

      results_file: /root/.ansible_async/291623480296.153882

      started: 1

    msg: non-zero return code

    rc: 1

    start: '2022-06-10 18:38:59.710057'

    stderr: 'error: timed out waiting for the condition on deployments/source-controller'

    stderr_lines:

    - 'error: timed out waiting for the condition on deployments/source-controller'

    stdout: ''

    stdout_lines: []

  msg: 'Pod {u''namespace'': u''flux-helm'', u''deployment'': u''source-controller''} is still not ready.'

2022-06-10 18:43:00,225 p=29480 u=sysadmin | PLAY RECAP ***********************************************************************************************************************************************************************************************************************************************

2022-06-10 18:43:00,225 p=29480 u=sysadmin | localhost : ok=416 changed=219 unreachable=0 failed=1

2022-06-10 18:43:00,225 p=29480 u=sysadmin | Friday 10 June 2022 18:43:00 +0000 (0:00:00.068) 0:23:35.560 ***********

2022-06-10 18:43:00,225 p=29480 u=sysadmin | ===============================================================================
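The failing task is a `kubectl wait` on the FluxCD controller deployments, which times out after 240s. A minimal shell sketch of the same readiness check, useful for diagnosing this state on a live controller (namespace, deployment names, and timeout are taken from the log above; the loop and the diagnostic `get pods` step are illustrative additions, not part of the playbook):

```shell
#!/bin/sh
# Mirror the failing Ansible task: wait for the FluxCD controllers to
# become Available, and dump pod state if they do not.
KUBECONFIG=/etc/kubernetes/admin.conf

for deploy in helm-controller source-controller; do
    if ! kubectl --kubeconfig="$KUBECONFIG" wait \
            --namespace=flux-helm \
            --for=condition=Available \
            "deployment/$deploy" --timeout=240s; then
        echo "deployment/$deploy is still not ready" >&2
        # Show why the pods are stuck (in this bug: ContainerCreating,
        # waiting on calico/coredns).
        kubectl --kubeconfig="$KUBECONFIG" -n flux-helm get pods -o wide
        exit 1
    fi
done
```

In the failure above the pods never leave ContainerCreating, so both waits exhaust their timeout and the play fails.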


Test Activity

Regression testing

Workaround


Lucas (lcavalca)
Changed in starlingx:
assignee: nobody → Lucas (lcavalca)
OpenStack Infra (hudson-openstack) wrote: Fix proposed to ansible-playbooks (master)
Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote: Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/845811
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/3bc0450ab3e86ca95741dc0da7ea7297131351cd
Submitter: "Zuul (22348)"
Branch: master

commit 3bc0450ab3e86ca95741dc0da7ea7297131351cd
Author: Lucas Cavalcante <email address hidden>
Date: Tue Jun 14 15:41:16 2022 -0300

    Fix FluxCD controller install when running restore

    After the refactoring of FluxCD tasks into a new role, some of its
    tasks ran before a block of tasks that only runs during restore and
    is responsible for, among other things, restarting calico and
    coredns. This left the FluxCD pods unable to start, stuck in
    ContainerCreating waiting for calico/coredns, and failed bootstrap.

    TEST PLAN:

    PASS: Upgrade Simplex subcloud
    PASS: Upgrade Duplex
    PASS: Bootstrap Simplex

    Closes-bug: 1978721
    Signed-off-by: Lucas Cavalcante <email address hidden>
    Change-Id: I5ef17c36453b13dcf34386208a95e78a55252510
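
The fix reorders the playbook so the restore-only block runs before the FluxCD controllers are brought up. A hypothetical Ansible sketch of that ordering (role, task, and variable names here are illustrative, not the actual playbook contents):

```yaml
# Illustrative ordering only - names do not match the real playbook.
- name: Restore-only tasks that must precede FluxCD bring-up
  block:
    - name: Restart calico and coredns so pod networking and DNS work
      command: >-
        kubectl --kubeconfig=/etc/kubernetes/admin.conf
        -n kube-system rollout restart
        daemonset/calico-node deployment/coredns
  when: mode == 'restore'

# The FluxCD controllers are installed only after networking is healthy,
# so their pods are not left stuck in ContainerCreating.
- name: Bring up the FluxCD helm and source controllers
  include_role:
    name: common/fluxcd-controllers
```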

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.update
tags: added: stx.7.0