Backup & Restore: Backup fail running ansible playbook - fail on pre-backup action for nginx-ingress-controller

Bug #1978346 reported by Thiago Paiva Brito
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Thiago Paiva Brito

Bug Description

Brief Description
-----------------

The Ansible backup playbook fails during the pre-backup action for the nginx-ingress-controller application.

Severity
--------

Critical

Steps to Reproduce
------------------

Install StarlingX

Run the ansible backup playbook

Expected Behavior
------------------

Backup should proceed successfully.

Actual Behavior
----------------

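The Ansible backup playbook fails at the task "Fail if some application cannot handle the pre-backup action", reporting nginx-ingress-controller as the failing application.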

Reproducibility
---------------

5/5 (100%)

System Configuration
--------------------

SX, DX, STD

Timestamp/Logs
--------------
{code:java}
TASK [backup/backup-system : Fail if some application cannot handle the pre-backup action] ************************************************************************************************************************
Tuesday 07 June 2022  01:49:33 +0000 (0:00:02.176)       0:00:13.643 **********
fatal: [localhost]: FAILED! => changed=false
  msg: |-
    Pre-backup action for application nginx-ingress-controller.

TASK [backup/backup-system : Notify applications that backup failed.] *********************************************************************************************************************************************
Tuesday 07 June 2022  01:49:33 +0000 (0:00:00.046)       0:00:13.689 **********
fatal: [localhost]: FAILED! => changed=true
  cmd:
  - /usr/bin/sysinv-utils
  - notify
  - post-backup-action
  - failure
  delta: '0:00:03.032648'
  end: '2022-06-07 01:49:36.311321'
  msg: non-zero return code
  rc: 1
  start: '2022-06-07 01:49:33.278673'
  stderr: nginx-ingress-controller
  stderr_lines:
  - nginx-ingress-controller
  stdout: |-
    sysinv 2022-06-07 01:49:35.128 281226 INFO sysinv.openstack.common.rpc.common [-] Connected to AMQP server on face::1:5672
    sysinv 2022-06-07 01:49:35.205 281226 ERROR sysinv.cmd.utils [-] Operation 'post-backup-action' was aborted by 'nginx-ingress-controller' appliction.
  stdout_lines: <omitted>
{code}

Test Activity
-------------

Regression Testing

Workaround
----------

Run the following commands:

{code:java}
sudo sed -i 's/def _get_webhook_configuration(self, app_op):/def _get_webhook_configuration(self, app_op):\n        return None/g' /opt/platform/helm/22.06/nginx-ingress-controller/1.1-33/plugins/k8sapp_nginx_ingress_controller/lifecycle/lifecycle_nginx_ingress_controller.py
sudo sm-restart service sysinv-conductor
sudo sm-restart service sysinv-inv
{code}

Then run the playbook again. Note that this workaround skips the backup of webhooks for nginx.
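To make the effect of the sed substitution concrete, here is a minimal Python sketch that applies the same replacement to a made-up stand-in for the plugin source. The `Lifecycle` class and the `fetch_webhooks()` call are hypothetical, since the real method body is not shown in this report. The eight-space indent assumes the method is defined inside a class.

```python
# Hypothetical stand-in for the plugin file; only the method signature
# comes from the workaround above, the rest is invented for illustration.
SAMPLE = (
    "class Lifecycle(object):\n"
    "    def _get_webhook_configuration(self, app_op):\n"
    "        return app_op.fetch_webhooks()\n"
)

DEF_LINE = "def _get_webhook_configuration(self, app_op):"


def apply_workaround(source):
    """Insert an early `return None` right after the method signature."""
    # Assumption: the method body is indented eight spaces.
    return source.replace(DEF_LINE, DEF_LINE + "\n        return None")


patched = apply_workaround(SAMPLE)
print(patched)
```

After the substitution the method returns before reaching its original body, which is why no webhook configuration is collected for the backup.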

Changed in starlingx:
status: New → In Progress
Changed in starlingx:
assignee: nobody → Thiago Paiva Brito (outbrito)
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/845372
Committed: https://opendev.org/starlingx/config/commit/d68f2d641a730b61a1113e06eb5cae5a21685170
Submitter: "Zuul (22348)"
Branch: master

commit d68f2d641a730b61a1113e06eb5cae5a21685170
Author: Thiago Brito <email address hidden>
Date: Fri Jun 10 14:32:07 2022 -0300

    Improve KubeOperator.list_custom_resources()

    This commit extends the capabilities of list_custom_resources() to send
    out other parameters that may be used with the CustomResource API such
    as:
    - pretty
    - label_selector
    - resource_version
    - watch

    Partial-Bug: 1978346

    Signed-off-by: Thiago Brito <email address hidden>
    Change-Id: I8cb9fbb42865bfdb0718a0fea5300b54b1135ef3
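As a rough illustration of what forwarding these optional parameters could look like, here is a sketch and not the actual KubeOperator code. It assumes `api` behaves like the Kubernetes Python client's `CustomObjectsApi`, whose `list_namespaced_custom_object()` accepts `pretty`, `label_selector`, `resource_version` and `watch` as keyword arguments. The standalone function shape and its argument order are assumptions made for the example.

```python
def list_custom_resources(api, group, version, namespace, plural,
                          pretty=None, label_selector=None,
                          resource_version=None, watch=None):
    """List custom resources, forwarding only the options the caller set."""
    optional = {
        "pretty": pretty,
        "label_selector": label_selector,
        "resource_version": resource_version,
        "watch": watch,
    }
    # Drop unset options so the client library applies its own defaults.
    kwargs = {key: value for key, value in optional.items() if value is not None}
    response = api.list_namespaced_custom_object(
        group, version, namespace, plural, **kwargs)
    # Assumes a plain (non-watch) list response, which is a dict with "items".
    return response.get("items", [])
```

Leaving unset options out of the call keeps existing callers unaffected, since they never pass the new arguments.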

OpenStack Infra (hudson-openstack) wrote : Fix merged to nginx-ingress-controller-armada-app (master)

Reviewed: https://review.opendev.org/c/starlingx/nginx-ingress-controller-armada-app/+/845373
Committed: https://opendev.org/starlingx/nginx-ingress-controller-armada-app/commit/99ab7ba8b560f35bcf0d6a95e2f6390e679cd1f5
Submitter: "Zuul (22348)"
Branch: master

commit 99ab7ba8b560f35bcf0d6a95e2f6390e679cd1f5
Author: Thiago Brito <email address hidden>
Date: Fri Jun 10 14:36:40 2022 -0300

    Fix get admission-webhooks w/ 1.23.1

    With the upversion of k8s on the platform to 1.23.1, the
    kubernetes-client we are using doesn't support getting the
    admission-webhooks with the older v1beta1 version. This is a temporary
    workaround to get backups working while we evaluate the upversion of the
    kubernetes-client library for Stx.8.0.

    TEST PLAN
    PASS Run backup playbook, no errors

    LOGS: https://paste.opendev.org/show/bJaMTRrEBdjwK4XwWm8l/

    Closes-Bug: 1978346
    Depends-On: https://review.opendev.org/c/starlingx/config/+/845372
    Signed-off-by: Thiago Brito <email address hidden>
    Change-Id: Ic57a05d8151a5d498e2422ca53fc0306158d28dc
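For context, the `admissionregistration.k8s.io/v1beta1` API was removed in Kubernetes 1.22, so a 1.23.1 cluster only serves webhook configurations under `v1`. One plausible way to read them without a newer typed client is the generic custom-objects endpoint. The sketch below assumes `api` behaves like the Kubernetes Python client's `CustomObjectsApi`; the function name is hypothetical, and this is not necessarily how the merged fix is written.

```python
GROUP = "admissionregistration.k8s.io"
VERSION = "v1"  # v1beta1 is no longer served by Kubernetes 1.22 and later
PLURAL = "validatingwebhookconfigurations"


def list_validating_webhook_names(api, label_selector=None):
    """Return the names of validating webhook configurations via the v1 API."""
    kwargs = {}
    if label_selector is not None:
        kwargs["label_selector"] = label_selector
    # Webhook configurations are cluster-scoped, so no namespace is passed.
    response = api.list_cluster_custom_object(GROUP, VERSION, PLURAL, **kwargs)
    return [item["metadata"]["name"] for item in response.get("items", [])]
```

Against a live cluster this would be called with `kubernetes.client.CustomObjectsApi()` after loading the kube config.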

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → High
tags: added: stx.7.0 stx.update