StarlingX

Bug #1955162
Comment #3

OpenStack Infra (hudson-openstack) wrote on 2022-01-20: Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/822130
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/9a790282e588a066f28c6a42f7683146f137b558
Submitter: "Zuul (22348)"
Branch: master

commit 9a790282e588a066f28c6a42f7683146f137b558
Author: Mihnea Saracin <email address hidden>
Date: Fri Dec 17 17:36:01 2021 +0200

Fix Backup&Restore when backup is taken on controller-1

There are 2 main problems when restoring a backup from controller-1:

    - The certificates that are generated by k8s can only be used on
      controller-1. The fix for this is to let k8s regenerate those
      when restoring a backup taken from controller-1.

      In kube-controller-manager and kube-scheduler I've seen logs like:
      error retrieving resource lock kube-system/kube-controller-manager:
      Get
      "https://192.168.205.2:6443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s":
      x509: certificate is valid for 10.96.0.1, 192.168.205.3, 192.168.205.1,
      127.0.0.1, 128.224.49.105, 128.224.48.105, 128.224.48.106, not
      192.168.205.2

      Here 192.168.205.2 is the IP address of controller-0-cluster-host.

    - The ceph.conf from controller-1 can no longer be used on
      controller-0 when restoring (due to recent ceph changes).
      To fix this, when a backup is taken on controller-1 we also
      back up ceph.conf from controller-0 and use it at restore.
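
To recognise the certificate problem from the first item, the x509 log line
can be checked mechanically. The following is a minimal sketch, not part of
the fix itself: the function name parse_x509_ip_mismatch is hypothetical, and
it assumes the error text has the same shape as the log quoted above.

```python
import re


def parse_x509_ip_mismatch(message):
    """Extract (valid_names, requested) from an x509 SAN mismatch message.

    Hypothetical helper: assumes the message contains text shaped like
    "certificate is valid for A, B, C, not D", as in the quoted log.
    Returns None when the message does not match that shape.
    """
    # The log may be wrapped across lines, so collapse all whitespace first.
    flat = " ".join(message.split())
    match = re.search(r"certificate is valid for (.+), not (\S+)", flat)
    if match is None:
        return None
    valid = [entry.strip() for entry in match.group(1).split(",")]
    requested = match.group(2).rstrip(".,")
    return valid, requested


log = """x509: certificate is valid for 10.96.0.1, 192.168.205.3, 192.168.205.1,
      127.0.0.1, 128.224.49.105, 128.224.48.105, 128.224.48.106, not
      192.168.205.2"""

result = parse_x509_ip_mismatch(log)
if result is not None:
    valid, requested = result
    # The requested address is absent from the certificate's list, which is
    # why the restored services cannot reach the API server on that address.
    print(requested, "covered by certificate:", requested in valid)
```

If the requested address is missing from the list, the certificates were
generated for a different host and need to be regenerated, as the fix does.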

Test Plan:

     PASS: AIO-SX bootstrap
     PASS: AIO-DX bootstrap
     PASS: STANDARD bootstrap
     PASS: B&R on AIO-SX
     PASS: B&R on AIO-DX with backup taken from both controllers
     PASS: B&R on STANDARD with backup taken from both controllers

    Closes-Bug: 1955162
    Change-Id: I2e9c7d81113d04782d91efaaa568d9b2bdd20672
    Signed-off-by: Mihnea Saracin <email address hidden>