SX after BnR, platform-integ-apps apply-failed

Bug #1866704 reported by Peng Peng
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Mihnea Saracin

Bug Description

Brief Description
-----------------
platform-integ-apps was applied on the system. After backup and restore, the platform-integ-apps app went to apply-failed.
/var/log/armada/platform-integ-apps-apply_2020-03-09-09-28-48.log shows:
Unable to extract /tmp/armadacwbivapw

Severity
--------
Major

Steps to Reproduce
------------------
Apply platform-integ-apps on the system.
Execute backup and restore.
After BnR, check the platform-integ-apps app status.

TC-name: Sanity after BnR

Expected Behavior
------------------
App remains applied after backup and restore.

Actual Behavior
----------------
apply-failed

Reproducibility
---------------
Unknown - first time this is seen in sanity, will monitor

System Configuration
--------------------
One node system

Lab-name: WCP_112

Branch/Pull Time/Commit
-----------------------
2020-03-08_04-10-00

Last Pass
---------
First time run

Timestamp/Logs
--------------
[2020-03-09 08:31:57,752] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[abcd:204::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-list'
[2020-03-09 08:31:58,815] 436 DEBUG MainThread ssh.expect :: Output:
+---------------------+---------+-------------------------------+---------------+----------+-----------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+---------+-------------------------------+---------------+----------+-----------+
| oidc-auth-apps | 1.0-0 | oidc-auth-manifest | manifest.yaml | uploaded | completed |
| platform-integ-apps | 1.0-8 | platform-integration-manifest | manifest.yaml | applied | completed |
+---------------------+---------+-------------------------------+---------------+----------+-----------+

Backup and Restore ...

[2020-03-09 17:28:42,296] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[abcd:204::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-list'
[2020-03-09 17:28:43,399] 436 DEBUG MainThread ssh.expect :: Output:
+---------------------+---------+-------------------------------+---------------+--------------+------------------------------------------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+---------+-------------------------------+---------------+--------------+------------------------------------------+
| oidc-auth-apps | 1.0-0 | oidc-auth-manifest | manifest.yaml | apply-failed | operation aborted, check logs for detail |
| platform-integ-apps | 1.0-8 | platform-integration-manifest | manifest.yaml | apply-failed | operation aborted, check logs for detail |
+---------------------+---------+-------------------------------+---------------+--------------+------------------------------------------+

From /var/log/armada/platform-integ-apps-apply_2020-03-09-09-28-48.log
2020-03-09 09:28:49.743 11 WARNING armada.handlers.armada [-] Disabling server validation certs to extract charts
2020-03-09 09:28:50.731 11 INFO armada.handlers.lock [-] Releasing lock
2020-03-09 09:28:50.737 11 ERROR armada.cli [-] Caught internal exception: armada.exceptions.source_exceptions.TarballExtractException: Unable to extract /tmp/armadacwbivapw
2020-03-09 09:28:50.737 11 ERROR armada.cli Traceback (most recent call last):
2020-03-09 09:28:50.737 11 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/utils/source.py", line 150, in extract_tarball
2020-03-09 09:28:50.737 11 ERROR armada.cli file = tarfile.open(tarball_path)
2020-03-09 09:28:50.737 11 ERROR armada.cli File "/usr/lib/python3.6/tarfile.py", line 1576, in open
2020-03-09 09:28:50.737 11 ERROR armada.cli raise ReadError("file could not be opened successfully")
2020-03-09 09:28:50.737 11 ERROR armada.cli tarfile.ReadError: file could not be opened successfully
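The traceback's failure mode can be reproduced in isolation: `tarfile.open()` raises `tarfile.ReadError` when the target file is not a valid archive, e.g. an empty or truncated file standing in for a chart tarball that was never restored. This is a minimal sketch; the temp-file path is illustrative, not the actual `/tmp/armadacwbivapw`:

```python
# Minimal reproduction of the Armada failure mode: opening a file that is
# not a valid tarball raises tarfile.ReadError, matching the log line
# "file could not be opened successfully".
import tarfile
import tempfile

# An empty file stands in for the missing/corrupt chart tarball.
with tempfile.NamedTemporaryFile(suffix=".tgz", delete=False) as f:
    bad_tarball = f.name  # zero bytes written, so not a valid archive

try:
    tarfile.open(bad_tarball)
except tarfile.ReadError as exc:
    print(f"ReadError: {exc}")
```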

Test Activity
-------------
Sanity

Peng Peng (ppeng)
tags: added: stx.retestneeded
Frank Miller (sensfan22)
Changed in starlingx:
assignee: nobody → Mihnea Saracin (msaracin)
Mihnea Saracin (msaracin) wrote :

It seems that the helm charts are not correctly restored when the wipe_ceph_osds flag is set to true.

Ghada Khalil (gkhalil) wrote :

stx.4.0 / high priority - this issue was introduced by recent code changes related to B&R

tags: added: stx.4.0 stx.update
Changed in starlingx:
importance: Undecided → High
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)

Fix proposed to branch: master
Review: https://review.opendev.org/713332

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/713332
Committed: https://git.openstack.org/cgit/starlingx/ansible-playbooks/commit/?id=3441d046eaafcf20a463aaf4bd8c11172014f1e9
Submitter: Zuul
Branch: master

commit 3441d046eaafcf20a463aaf4bd8c11172014f1e9
Author: Mihnea Saracin <email address hidden>
Date: Mon Mar 16 23:11:27 2020 +0200

    Fix B&R when wipe_ceph_osds is set to true

    The helm charts are restored only when the ceph osds are not wiped,
    but they also need to be restored when the ceph osds are wiped.

    Closes-Bug: 1866704
    Change-Id: I7d9bf8cbfbf26a0edd44c6af6f0d98d95abfb682
    Signed-off-by: Mihnea Saracin <email address hidden>
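The logic change the commit message describes can be sketched as a guard fix (hypothetical pseudologic in Python; the actual change lives in the ansible-playbooks restore tasks, and these function names are illustrative):

```python
# Hypothetical sketch of the fix described in the commit message.
# Before: helm charts were restored only when Ceph OSDs were NOT wiped,
# so a restore with wipe_ceph_osds=true left the chart tarballs missing
# and Armada's extraction failed. After: charts are restored either way.

def should_restore_helm_charts_before(wipe_ceph_osds: bool) -> bool:
    # Buggy behavior: restore skipped whenever OSDs are wiped.
    return not wipe_ceph_osds

def should_restore_helm_charts_after(wipe_ceph_osds: bool) -> bool:
    # Fixed behavior: charts restored regardless of the flag.
    return True
```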

Changed in starlingx:
status: In Progress → Fix Released
Yang Liu (yliu12) wrote :

This issue is no longer seen since following load: 2020-03-19_04-10-00

tags: removed: stx.retestneeded
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/716133

OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (f/centos8)

Reviewed: https://review.opendev.org/716133
Committed: https://git.openstack.org/cgit/starlingx/ansible-playbooks/commit/?id=ddcb11f4b773f4b3190663defe3ba0f3ec4201c8
Submitter: Zuul
Branch: f/centos8

commit bf103f3c54eb45c26d52a43c35339d1d863a42de
Author: Mihnea Saracin <email address hidden>
Date: Fri Mar 27 18:19:02 2020 +0200

    Fix B&R when the controller needs to be unlocked

    After running the restore playbook, all the applications
    should be in an uploaded state. But they are in an
    applied state instead, making the controller-0
    unable to unlock.

    Closes-Bug: 1869403
    Change-Id: I8bd9c51e250969cc334d52b78c616f9ad082afd8
    Signed-off-by: Mihnea Saracin <email address hidden>

commit 6e875971afeaf1378c2c8aeb845359459838ce30
Author: Stefan Dinescu <email address hidden>
Date: Sat Mar 21 16:57:57 2020 +0200

    Fix Netapp port conflict

    By default, the Trident Netapp service opens port 8443 for
    HTTPS REST api usage. This conflicts with the port the
    Horizon dashboard uses on an HTTPS enabled setup (the port
    is also 8443).

    In order to fix this, we change the default port from 8443
    to 8678, but also make it configurable through ansible
    overrides.

    The Trident service also opens port 8001 for metrics usage.
    While that doesn't currently conflict with any other service
    on the system, I also made that configurable through
    ansible overrides, in case such a conflict appears in the
    future.

    Change-Id: I08db939acac6082f82b9e12e932d8289c7cecdeb
    Closes-bug: 1868382
    Signed-off-by: Stefan Dinescu <email address hidden>

commit 5a9ba6786e393f2cd93bfae8c3a8f09f0cf9eb26
Author: Robert Church <email address hidden>
Date: Thu Mar 19 19:08:17 2020 -0400

    Upversion Multus to 3.4

    Updates the Multus configuration to align with version 3.4

    Change-Id: Ifc236ccbbe4e559987d7ef522902f638062348ca
    Depends-On: https://review.opendev.org/#/c/714024/
    Story: 2006999
    Task: 39110
    Signed-off-by: Robert Church <email address hidden>

commit 6a261463f9ac0f81d9c7f054dd3cb10a51934d4a
Author: Robert Church <email address hidden>
Date: Wed Mar 18 22:01:03 2020 -0400

    Upversion Calico from 3.6 to 3.12

    Updates the Calico configuration to align with version 3.12. This
    introduces support for a Flex Volume Driver which requires enabling the
    --volume-plugin-dir option for kubelet, the --flex-volume-plugin-dir
    option for kube-controller-manager, and pulling the pod2daemon-flexvol
    image used by calico-node pods.

    Change-Id: I74bc5c53ffcb16c8e3c06cebf20eac296b9ccc65
    Story: 2006999
    Task: 39109
    Depends-On: https://review.opendev.org/#/c/714023
    Signed-off-by: Robert Church <email address hidden>

commit b35387f8bc40714e9633e6191267284b8af8ccee
Author: Stefan Dinescu <email address hidden>
Date: Thu Mar 19 18:13:26 2020 +0200

    Netapp: Fix handling of IPv6 addresses

    Using bash process subtitution to pass the file parameter
    to the "create backend" command doesn't work as the bash
    variable expansion...

tags: added: in-f-centos8