Debian: SX Backup/Restore - restore failed during puppet-manifest-apply.sh

Bug #2001715 reported by Jorge Saffe
This bug affects 1 person
Affects    Status        Importance  Assigned to  Milestone
StarlingX  Fix Released  Medium      Jorge Saffe

Bug Description

Brief Description
-----------------
Restore failed using a backup file created after a fresh install.

Severity
-----------------
Major: System/Feature is usable but degraded

Steps to Reproduce
-----------------
* Install a controller, configuring the OAM interface with IPv6
* Back up the system
* Reinstall the controller, configuring the OAM interface with IPv4
* Restore the system

Backup
-----------------
ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml -e "ansible_become_pass=Li69nux* admin_password=Li69nux*" -e "backup_user_local_registry=true"
Using Jenkins, reinstall with the same build and stop at "install_controller".

Restore
-----------------
ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "backup_filename=localhost_platform_backup_2022_11_18_10_58_36.tgz admin_password=Li69nux* ansible_become_pass=Li69nux* initial_backup_dir=/home/sysadmin"

Expected Behavior
-----------------
Restore completes without errors.

Actual Behavior
-----------------
Restore fails during puppet-manifest-apply.sh.

Reproducibility
-----------------
Reproduced 2x

Ghada Khalil (gkhalil)
tags: added: stx.update
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)
Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/stx-puppet/+/881407

OpenStack Infra (hudson-openstack) wrote : Change abandoned on ansible-playbooks (master)

Change abandoned by "Jorge Saffe <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/ansible-playbooks/+/880896
Reason: Another solution has been proposed: https://review.opendev.org/c/starlingx/stx-puppet/+/881407

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/881407
Committed: https://opendev.org/starlingx/stx-puppet/commit/2ceceb29f605dbe0e7bd34721b3075ad471d193c
Submitter: "Zuul (22348)"
Branch: master

commit 2ceceb29f605dbe0e7bd34721b3075ad471d193c
Author: Jorge Saffe <email address hidden>
Date: Mon Apr 24 17:17:17 2023 -0400

    Fix restore failure during puppet-manifest-apply

    When K8s custom config puppet script is executed during restore
    playbook, K8s updates fail when trying to validate cluster network
    data. This happens whenever the OAM IP address is reconfigured (after
    reinstall) with a different protocol version than the one used for the
    K8s cluster host subnet.

    The issue is related to "advertise-address" parameter. It is not
    predefined in the api-server extra-args during bootstrap, so k8s gets
    the host's default interface as default value. In this case, the host’s
    default value is an IPv4 (IPv6) address while all the other K8s cluster
    subnets are configured with IPv6 (IPv4) addresses.

    K8s validation fails because STX defaults to a SingleStack mode. Only
    dual-stack networks allow the assignment of IPv4 and IPv6 addresses to
    pods and services.

    Test Plan:
      PASS: Fresh Install AIO-SX.
      PASS: Create a backup and reinstall server.
      PASS: Reconfigure network OAM IF with a different IP family.
      PASS: Restore system.
      PASS: Verify advertise-address parameter.
      PASS: Modify and Apply K8s service-parameter.
      PASS: Fresh Install STD/DX
      PASS: Modify and Apply K8s service-parameter.
      PASS: Verify advertise-address parameter in both controllers.

    Closes-bug: 2001715

    Signed-off-by: Jorge Saffe <email address hidden>
    Change-Id: I6f75f171d0a45abe2d5e047a31308dc97ce19eed
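
For illustration only, the mismatch described in the commit message can be sketched as a small IP-family check. This is not the code from the stx-puppet change; the function names and the sample addresses below are hypothetical, and the sketch assumes the advertise address and the cluster subnets are available as plain strings.

```python
import ipaddress
from typing import Iterable


def ip_family(value: str) -> int:
    """Return 4 or 6 for an address ("10.0.0.1") or a subnet ("fd04::/112")."""
    # strict=False lets ip_network accept bare addresses as well as CIDR subnets.
    return ipaddress.ip_network(value, strict=False).version


def advertise_address_matches(advertise_address: str,
                              cluster_subnets: Iterable[str]) -> bool:
    """True when every cluster subnet uses the same IP family as the address."""
    family = ip_family(advertise_address)
    return all(ip_family(subnet) == family for subnet in cluster_subnets)


if __name__ == "__main__":
    # Hypothetical values: an IPv6 cluster restored onto a host whose
    # default interface (the reconfigured OAM) now has an IPv4 address.
    subnets = ["fd04::/112", "aefd:100::/64"]
    print(advertise_address_matches("10.20.1.3", subnets))
    print(advertise_address_matches("fd04::2", subnets))
```

In a single-stack cluster a check like this would fail for the first call, which corresponds to the case where the API server picks up the host's default interface address after the OAM interface was reconfigured with a different IP family.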

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
assignee: nobody → Jorge Saffe (jsaffe)
Ghada Khalil (gkhalil)
tags: added: stx.9.0