Backup & Restore: platform-integ-apps not auto applying after restore in DC system with upgraded apps

Bug #1893309 reported by Dan Voiculeasa
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
StarlingX | Fix Released | Medium | Dan Voiculeasa |

Bug Description

Brief Description
-----------------
During the restore procedure, platform-integ-apps should be auto-applied.
Instead, the app is not applied and the restore procedure requires manual intervention to complete.

Severity
--------
Critical: The Upgrades feature, which is based on B&R, is not usable because of this defect.

Steps to Reproduce
------------------
Bring up a DC systemcontroller.
Upgrade platform-integ-apps from 1.0-x to 1.0-(x+1) or another version.
Perform a backup and restore (B&R).

Expected Behavior
------------------
During the restore procedure, after controller-0 is unlocked, controller-1 is booted over PXE, and ceph health becomes HEALTH_OK, platform-integ-apps should be auto-applied.

Actual Behavior
----------------
App platform-integ-apps stays in uploaded state.

Reproducibility
---------------
100%

System Configuration
--------------------
DC systemcontroller

Branch/Pull Time/Commit
-----------------------
Branch and the time when code was pulled or git commit or cengn load info

Last Pass
---------
Did this test scenario pass previously? If so, please indicate the load/pull time info of the last pass.
Use this section to also indicate if this is a new test scenario.

Timestamp/Logs
--------------
The sysinv periodic thread responsible for auto-applying is failing:

  File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/periodic_task.py", line 180, in run_periodic_tasks
    task(self, context)
  File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 5503, in _k8s_application_audit
    app = kubeapp_obj.get_by_name(context, app_name)
  File "/usr/lib64/python2.7/site-packages/sysinv/objects/base.py", line 103, in wrapper
    result = fn(cls, context, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/sysinv/objects/kube_app.py", line 33, in get_by_name
    return cls.dbapi.kube_app_get(name)
  File "/usr/lib64/python2.7/site-packages/sysinv/objects/__init__.py", line 111, in wrapper
    result = fn(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/sysinv/db/sqlalchemy/api.py", line 7799, in kube_app_get
    return self._kube_app_get(name)
  File "/usr/lib64/python2.7/site-packages/sysinv/db/sqlalchemy/api.py", line 7743, in _kube_app_get
    result = query.one()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2817, in one
    "Multiple rows were found for one()")
MultipleResultsFound: Multiple rows were found for one()

Querying the database shows two versions of platform-integ-apps in the uploaded state. The older version (1.0-x) should be in the inactive state.

 2020-08-26 15:26:10.086176 | 2020-08-26 21:47:55.369676 | 3 | platform-integ-apps | 1.0-x | platform-integration-manifest | manifest.yaml | uploaded | | t | 0
 2020-08-26 21:47:55.376151 | 2020-08-26 21:49:03.491622 | 8 | platform-integ-apps | 1.0-(x+1) | platform-integration-manifest | manifest.yaml | uploaded | Application update from version 1.0-x to version 1.0-(x+1) completed. | t | 0
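
The failure can be illustrated with a minimal sketch. It uses an in-memory SQLite table instead of the real sysinv database, and the table name `kube_app`, its reduced column set, and the helpers `make_db`, `one` and `kube_app_get` are assumptions made for illustration only, not the sysinv code. The local `MultipleResultsFound` class is a stand-in for the SQLAlchemy exception seen in the traceback.

```python
import sqlite3


class NoResultFound(Exception):
    """Stand-in for the 'no row matched' case of a one() style lookup."""


class MultipleResultsFound(Exception):
    """Stand-in for the SQLAlchemy exception raised in the traceback."""


def make_db(rows):
    """Create a simplified, hypothetical kube_app table holding `rows`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE kube_app ("
        "id INTEGER PRIMARY KEY, name TEXT, app_version TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO kube_app VALUES (?, ?, ?, ?)", rows)
    return conn


def one(rows):
    """Return the single row, or raise if zero or several rows matched."""
    if not rows:
        raise NoResultFound("No row was found for one()")
    if len(rows) > 1:
        raise MultipleResultsFound("Multiple rows were found for one()")
    return rows[0]


def kube_app_get(conn, name):
    """Look up the app by name, ignoring rows in the inactive state."""
    rows = conn.execute(
        "SELECT id, name, app_version, status FROM kube_app "
        "WHERE name = ? AND status != 'inactive'",
        (name,),
    ).fetchall()
    return one(rows)


# State after the faulty restore: both versions are 'uploaded'.
broken = make_db([
    (3, "platform-integ-apps", "1.0-x", "uploaded"),
    (8, "platform-integ-apps", "1.0-(x+1)", "uploaded"),
])
try:
    kube_app_get(broken, "platform-integ-apps")
except MultipleResultsFound as exc:
    print("audit would fail:", exc)
```

With the older row in the inactive state, the same lookup matches exactly one row and returns it.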

Test Activity
-------------
Testing

Workaround
----------
Manually update the database to change the status of the older version from uploaded to inactive.
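
A rough sketch of that manual update is shown here against an in-memory SQLite table, not the real sysinv database. The table name `kube_app`, its columns, the helper names, and the rule that the row with the highest id is the current version are all assumptions for illustration. On a real system, inspect the rows first and adapt the statement accordingly.

```python
import sqlite3


def make_demo_db():
    """Build a simplified, hypothetical kube_app table in the broken state."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE kube_app ("
        "id INTEGER PRIMARY KEY, name TEXT, app_version TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO kube_app VALUES (?, ?, ?, ?)",
        [
            (3, "platform-integ-apps", "1.0-x", "uploaded"),
            (8, "platform-integ-apps", "1.0-(x+1)", "uploaded"),
        ],
    )
    return conn


def deactivate_older_versions(conn, name):
    """Mark every 'uploaded' row except the newest one as 'inactive'.

    Assumption: the newest version is the row with the highest id.
    Returns the number of rows that were changed.
    """
    cur = conn.execute(
        "UPDATE kube_app SET status = 'inactive' "
        "WHERE name = ? AND status = 'uploaded' "
        "AND id < (SELECT MAX(id) FROM kube_app WHERE name = ?)",
        (name, name),
    )
    conn.commit()
    return cur.rowcount


conn = make_demo_db()
changed = deactivate_older_versions(conn, "platform-integ-apps")
print("rows changed:", changed)
for row in conn.execute("SELECT id, app_version, status FROM kube_app ORDER BY id"):
    print(row)
```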

Changed in starlingx:
assignee: nobody → Dan Voiculeasa (dvoicule)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)

Fix proposed to branch: master
Review: https://review.opendev.org/748601

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/748601
Committed: https://git.openstack.org/cgit/starlingx/ansible-playbooks/commit/?id=a484917fef38d0c784e6dd0b100b1e29826b6faa
Submitter: Zuul
Branch: master

commit a484917fef38d0c784e6dd0b100b1e29826b6faa
Author: Dan Voiculeasa <email address hidden>
Date: Fri Aug 28 13:38:59 2020 +0300

    B&R: Fix restore with upgraded platform-integ-apps

    In case multiple platform-integ-apps are found in the database, the
    restore code modifies for each of them the status to uploaded. The old
    ones are in inactive status, but this is modified to uploaded.
    This is not the intended behavior.

    This crashes the periodic thread responsible for auto-applying the app.
    It uses a query that expects only one entry in the database to match.
    More entries match, thus the query throws an exception.
    Note: The query already filters out the inactive apps.

    The fix is to change the restore code to modify the status of
    platform-integ-apps to uploaded but only for the one app which should
    be active(the app status is different than inactive).

    Closes-Bug: 1893309
    Change-Id: I1492d417c2cdc9ac85858bef28663852dbe2b2fc
    Signed-off-by: Dan Voiculeasa <email address hidden>
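
The rule described in the commit message can be sketched in a few lines of Python. This is not the ansible-playbooks change itself. The function name `reset_status_for_restore`, the dict layout, and the `applied` status used for the active app in the example are assumptions for illustration.

```python
def reset_status_for_restore(apps, name="platform-integ-apps"):
    """Return copies of `apps` with the restore status rule applied.

    Only entries for `name` whose status is not 'inactive' are moved to
    'uploaded'. Inactive entries (older versions) are left untouched, so
    at most the one app that should be active ends up 'uploaded'.
    """
    restored = []
    for app in apps:
        app = dict(app)  # copy so the caller's data is not mutated
        if app["name"] == name and app["status"] != "inactive":
            app["status"] = "uploaded"
        restored.append(app)
    return restored


# Hypothetical backed-up state: old version inactive, new version applied.
backup = [
    {"name": "platform-integ-apps", "app_version": "1.0-x", "status": "inactive"},
    {"name": "platform-integ-apps", "app_version": "1.0-(x+1)", "status": "applied"},
]
for app in reset_status_for_restore(backup):
    print(app["app_version"], app["status"])
```

Under this rule, a lookup that filters out inactive rows matches only one entry for the app after the restore.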

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.5.0 stx.update