Bug #1870389 “Backup & Restore: Subcloud AIO-DX active controlle...” : Bugs : StarlingX

Revision history for this message

Senthil Mukundakumar (smukunda) wrote on 2020-04-02:

#1

collect logs (29.1 MiB, application/x-tar)

Ovidiu Poncea (ovidiuponcea) wrote on 2020-04-02:

#2

This is expected behavior if the setup had patches installed, which is what the logs show. Senthil, please confirm. Thanks!

Senthil Mukundakumar (smukunda) wrote on 2020-04-03:

#3

Yes, the system was pre-installed with patches. The restore failed again after the controller reboot, as I mentioned in the description above. Once I reproduce it again, I will attach more logs.

Ghada Khalil (gkhalil) wrote on 2020-04-03:

#4

stx.4.0 / medium priority - B&R failure with a patch applied

tags:	added: stx.4.0 stx.update
Changed in starlingx:
status:	New → Triaged
importance:	Undecided → Medium
assignee:	nobody → Dan Voiculeasa (dvoicule)

Senthil Mukundakumar (smukunda) wrote on 2020-04-03:

#5

The issue was reproduced on a different DC system: an AIO-SX subcloud with no patches involved.

description:	updated

Senthil Mukundakumar (smukunda) on 2020-04-03

description:	updated

Senthil Mukundakumar (smukunda) on 2020-04-05

description:	updated

Senthil Mukundakumar (smukunda) on 2020-04-05

description:	updated
description:	updated

Dan Voiculeasa (dvoicule) wrote on 2020-04-08:

#6

From the logs:
etc/resolv.conf on the failed subcloud controller shows
nameserver 2620:10a:a001:a103::2

The restore fails to do a docker login to registry.central:9001 because the name can't be resolved.

This is what a lookup against that nameserver would return:
nslookup registry.central 2620:10a:a001:a103::2
Server: 2620:10a:a001:a103::2
Address: 2620:10a:a001:a103::2#53

** server can't find registry.central: NXDOMAIN
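The same check can be scripted before attempting the docker login. The following is a minimal sketch, not part of the restore playbook: the helper name `can_resolve` and its `resolver` parameter are assumptions made for illustration. Note that `socket.getaddrinfo` also consults /etc/hosts, so unlike `nslookup` it reflects what the docker client would actually see.

```python
import socket


def can_resolve(hostname, port=9001, resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves on this host.

    `resolver` is injectable (hypothetical parameter) so the check can be
    exercised without network access; it defaults to the system resolver.
    """
    try:
        resolver(hostname, port)
        return True
    except socket.gaierror:
        # Raised for lookup failures such as NXDOMAIN.
        return False
```

Calling `can_resolve("registry.central")` on the failed subcloud controller would be expected to return False until the name is made resolvable.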

Dan Voiculeasa (dvoicule) wrote on 2020-04-09:

#7

The restore playbook needs a task that creates an /etc/hosts entry pointing registry.central at the OAM address of the central cloud:

`IP_OAM_CENTRAL registry.central`

This is not the only issue with restoring subclouds. Several other tasks that would fail are masked because the playbook already fails at the 'Log in k8s' task.

Also, I think "-e distributed_cloud_role=subcloud" should be added to the ansible-playbook command responsible for the restore, at least until the restore playbook detects and sets that variable correctly on its own.
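The /etc/hosts change proposed above can be sketched as a pure text transformation. This is only an illustration of the idea, not the task that was merged: the function name `ensure_hosts_entry` is hypothetical, and the address used in the usage note is a documentation address, not the real central OAM IP.

```python
def ensure_hosts_entry(hosts_text, ip, name):
    """Return hosts_text with exactly one line mapping `name` to `ip`.

    Simplification (assumption): a line that lists `name` among its
    hostnames is dropped as a whole, including any other aliases on it.
    """
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments; fields are whitespace-separated.
        fields = line.split("#", 1)[0].split()
        if name in fields[1:]:
            continue  # drop stale mappings for this name
        kept.append(line)
    kept.append(f"{ip} {name}")
    return "\n".join(kept) + "\n"
```

For example, `ensure_hosts_entry(text, "2001:db8::10", "registry.central")` appends the mapping once; running it again on its own output leaves the text unchanged, which is the property a restore task would want.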

OpenStack Infra (hudson-openstack) wrote on 2020-04-09: Fix proposed to ansible-playbooks (master)

#8

Fix proposed to branch: master
Review: https://review.opendev.org/718749

Changed in starlingx:
status:	Triaged → In Progress

OpenStack Infra (hudson-openstack) wrote on 2020-04-10: Fix proposed to config (master)

#9

Fix proposed to branch: master
Review: https://review.opendev.org/718957

Frank Miller (sensfan22) on 2020-04-13

summary:

- Backup & Restore: Subcloud AIO-DX active controller restore fails
+ Backup & Restore: Subcloud AIO-DX active controller restore fails to
+ connect to central registry

OpenStack Infra (hudson-openstack) wrote on 2020-04-13: Fix merged to ansible-playbooks (master)

#10

Reviewed: https://review.opendev.org/718749
Committed: https://git.openstack.org/cgit/starlingx/ansible-playbooks/commit/?id=9080db419d559d3d5d33c0a6459e9f5e8b7700e5
Submitter: Zuul
Branch: master

commit 9080db419d559d3d5d33c0a6459e9f5e8b7700e5
Author: Dan Voiculeasa <email address hidden>
Date: Thu Apr 9 16:07:30 2020 +0300

Add registry.central host for DC subcloud restore

    During bootstrap management network is temporarly assigned on lo
    interface. Backup archive contains /etc/resolv.conf and /etc/hosts
    of an already unlocked controller. Before backup registry.central is
    resolved through dns (nameserver `floating central management`).

    During restore a temporary host for registry.central must be created.
    Since there is no reference of a backup/shadow management network that
    provides connectivity for such use cases the `floating central oam`
    can be used.

    Partial-Bug: 1870389

    Change-Id: I86166da31491736d6695e04fa287f79871975b55
    Signed-off-by: Dan Voiculeasa <email address hidden>

OpenStack Infra (hudson-openstack) wrote on 2020-04-13: Fix merged to config (master)

#11

Reviewed: https://review.opendev.org/718957
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=362d905dad25369bf116bb1e34a659f33b7260af
Submitter: Zuul
Branch: master

commit 362d905dad25369bf116bb1e34a659f33b7260af
Author: Dan Voiculeasa <email address hidden>
Date: Fri Apr 10 11:31:06 2020 +0300

Improve host-overrides

    Add distributed cloud role information in the host overrides.
    The restore playbook needs this information.

    Partial-Bug: 1870389
    Change-Id: I278f19be32d1fe87687feb75e26b2898237de86f
    Signed-off-by: Dan Voiculeasa <email address hidden>

OpenStack Infra (hudson-openstack) wrote on 2020-04-14: Fix proposed to ansible-playbooks (master)

#12

Fix proposed to branch: master
Review: https://review.opendev.org/719924

OpenStack Infra (hudson-openstack) wrote on 2020-04-14: Fix merged to ansible-playbooks (master)

#13

Reviewed: https://review.opendev.org/719924
Committed: https://git.openstack.org/cgit/starlingx/ansible-playbooks/commit/?id=40cfef7c417709c234e50a1a034fb4a11dbf180a
Submitter: Zuul
Branch: master

commit 40cfef7c417709c234e50a1a034fb4a11dbf180a
Author: Dan Voiculeasa <email address hidden>
Date: Tue Apr 14 14:18:29 2020 +0300

Remove subcloud task from restore mode

    A task supposed to run only during bootstrap is running during restore.

    Keystone dc variables (dc_admin_user_id and dc_admin_project_id) are
    added during bootstrap to hieradata static.yaml file.
    When doing the restore the information is already present in the file in
    the backup archive.

    Partial-Bug: 1870389
    Change-Id: Iebab8dc059435c7e2b0f19947fedce88bd71bb65
    Signed-off-by: Dan Voiculeasa <email address hidden>

OpenStack Infra (hudson-openstack) wrote on 2020-04-15: Fix proposed to ansible-playbooks (master)

#14

Fix proposed to branch: master
Review: https://review.opendev.org/720229

OpenStack Infra (hudson-openstack) wrote on 2020-04-16:

#15

Fix proposed to branch: master
Review: https://review.opendev.org/720579

OpenStack Infra (hudson-openstack) wrote on 2020-04-21: Fix proposed to config (master)

#16

Fix proposed to branch: master
Review: https://review.opendev.org/721611

OpenStack Infra (hudson-openstack) wrote on 2020-04-22: Fix merged to config (master)

#17

Reviewed: https://review.opendev.org/721611
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=4ccb11cb4019734e424362d677afb00dd6ecc4b6
Submitter: Zuul
Branch: master

commit 4ccb11cb4019734e424362d677afb00dd6ecc4b6
Author: Dan Voiculeasa <email address hidden>
Date: Tue Apr 21 11:47:37 2020 +0300

Improve host-overrides

    Add missing variables for DC.

    Central+Subclod:
    system_mode
    location
    description

    Subcloud:
    region_config
    region_name
    system_controller_oam_subnet
    system_controller_oam_floating_address
    system_controller_subnet
    system_controller_floating_address

    Partial-Bug: 1870389
    Closes-Bug: 1873617
    Change-Id: Ieb12ffc0ad769dd6ca22eb4c15f9d6d55778fd4b
    Signed-off-by: Dan Voiculeasa <email address hidden>

OpenStack Infra (hudson-openstack) wrote on 2020-04-28: Fix merged to ansible-playbooks (master)

#18

Reviewed: https://review.opendev.org/720579
Committed: https://git.openstack.org/cgit/starlingx/ansible-playbooks/commit/?id=36a01e8ba38f3e0d1e2ea7a2bce31edbedfde04e
Submitter: Zuul
Branch: master

commit 36a01e8ba38f3e0d1e2ea7a2bce31edbedfde04e
Author: Dan Voiculeasa <email address hidden>
Date: Tue Apr 21 17:54:53 2020 +0300

B&R: Do keystone db backup for subcloud

    Keystone db backup file is missing for subclouds.
    Create the keystone db backup file when running the backup playbook on
    subcloud.

    Partial-Bug: 1870389
    Change-Id: I64c8b38a51bf04714931d70e126e0f63782deb20
    Signed-off-by: Dan Voiculeasa <email address hidden>

OpenStack Infra (hudson-openstack) wrote on 2020-04-30:

#19

Reviewed: https://review.opendev.org/720229
Committed: https://git.openstack.org/cgit/starlingx/ansible-playbooks/commit/?id=204641a5b3082c9873109169f93ae1845eb79813
Submitter: Zuul
Branch: master

commit 204641a5b3082c9873109169f93ae1845eb79813
Author: Dan Voiculeasa <email address hidden>
Date: Wed Apr 15 15:54:58 2020 +0300

DC subcloud restore registry.central certs

    During restore a certificate is missing.
    Docker needs the certificate to connect to registry.central.
    Extract it from backup archive.

    Closes-Bug: 1870389

    Depends-On: I64c8b38a51bf04714931d70e126e0f63782deb20
    Depends-On: Ieb12ffc0ad769dd6ca22eb4c15f9d6d55778fd4b
    Depends-On: I86166da31491736d6695e04fa287f79871975b55
    Depends-On: Iebab8dc059435c7e2b0f19947fedce88bd71bb65
    Depends-On: I278f19be32d1fe87687feb75e26b2898237de86f

    Change-Id: Ief65a8963b81ef489171c264964d472a66fec282
    Signed-off-by: Dan Voiculeasa <email address hidden>

Changed in starlingx:
status:	In Progress → Fix Released
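
The final fix extracts the registry.central certificate from the backup archive. A minimal sketch of that kind of extraction is shown below, assuming the backup is a tar archive; the helper name `extract_member` and any member path passed to it are hypothetical, since the thread does not state where the certificate sits inside the archive.

```python
import io
import tarfile


def extract_member(archive_bytes, member_name):
    """Return the bytes of one regular file stored in a tar archive.

    "r:*" lets tarfile detect the compression transparently.
    tarfile raises KeyError when the member does not exist.
    """
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        member = tar.extractfile(member_name)
        if member is None:
            # extractfile() returns None for non-file members (e.g. directories).
            raise FileNotFoundError(member_name)
        return member.read()
```

The returned bytes would then be written to wherever the docker client looks for the registry certificate before the login is retried.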

OpenStack Infra (hudson-openstack) wrote on 2020-05-21: Fix proposed to ansible-playbooks (f/centos8)

#20

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/729809

OpenStack Infra (hudson-openstack) wrote on 2020-05-21: Fix proposed to config (f/centos8)

#21

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/729812

Yang Liu (yliu12) on 2020-05-21

tags:	added: stx.retestneeded

OpenStack Infra (hudson-openstack) wrote on 2020-05-22: Fix merged to config (f/centos8)

#22

Reviewed:  https://review.opendev.org/729812
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=539d476456277c22d0dcbc3cbbc832e623242264
Submitter: Zuul
Branch:    f/centos8

commit 320cc40de8518787c2be234d7fdf88ec0a462df2
Author: Don Penney <don.penney@windriver.com>
Date:   Wed May 13 13:06:11 2020 -0400

Add auto-versioning to starlingx/config packages
    
    This update makes use of the PKG_GITREVCOUNT variable to auto-version
    the packages in this repo.
    
    Change-Id: I3a2c8caeb4b4647608978b1f2ccfcf0661508803
    Depends-On: https://review.opendev.org/727837
    Story: 2006166
    Task: 39766
    Signed-off-by: Don Penney <don.penney@windriver.com>

commit d9f2aea0fb228ed69eb9c9262e29041eedabc15d
Author: Sharath Kumar K <sharath.kumar@intel.com>
Date:   Wed Apr 22 16:22:22 2020 +0200

De-branding in starlingx/config: CGCS -> StarlingX
    
    1. Rename CGCS to StarlingX for .spec files
    
    Test:
    After the de-brand change, bootimage.iso has been built in the flock
    Layer and installed on the dev machine to validate the changes.
    
    Please note, doing de-brand changes in batches, this is batch9 changes.
    
    Story: 2006387
    Task: 39524
    
    Change-Id: Ia1fe0f2baafb78c974551100f16e6a7d99882f15
    Signed-off-by: Sharath Kumar K <sharath.kumar@intel.com>
    
    De-branding in starlingx/config: CGCS -> StarlingX
    
    1. Rename CGCS to StarlingX for .spec file
    2. Rename TIS to StarlingX for .service files
    
    Test:
    After the de-brand change, bootimage.iso has been built in the flock
    Layer and installed on the dev machine to validate the changes.
    
    Please note, doing de-brand changes in batches, this is batch10 changes.
    
    Story: 2006387
    Task: 36202
    
    Change-Id: I404ce0da2621495175ad31489e9ad6f7b0211e26
    Signed-off-by: Sharath Kumar K <sharath.kumar@intel.com>

commit d141e954fa6bbf688929ec90d1b6604a97792c43
Author: Teresa Ho <teresa.ho@windriver.com>
Date:   Tue Mar 31 10:08:57 2020 -0400

Sysinv extensions for FPGA support
    
    This update adds cli and restapi to support FPGA device
    programming.
    
    CLI commands:
    system device-image-apply
    system device-image-create
    system device-image-delete
    system device-image-list
    system device-image-remove
    system device-image-show
    system device-image-state-list
    system device-label-list
    system host-device-image-update
    system host-device-image-update-abort
    system host-device-label-assign
    system host-device-label-list
    system host-device-label-remove
    
    Story: 2006740
    Task: 39498
    
    Change-Id: I556c2e7a51b3931b5a66ab27b67f51e3a8aebd9f
    Signed-off-by: Teresa Ho <teresa.ho@windriver.com>

commit 491cca42ed854d2cb3ee3646b93c56a4f45f563c
Author: Elena Taivan <elena.taivan@windriver.com>
Date:   Wed Apr 29 11:25:26 2020 +0000

Qcow2 conversion to raw can be done using 'image-conversion' filesystem
    
    1. Conversion filesystem can be added before/after
       stx-openstack is applied
    2. If conversion filesystem is added after stx-openstack
       is applied, changes to stx-openstack will only take effect
       once the application is re-applied
    
    3. It is not allowed to delete image-conversion filesystem
       when stx-openstack is in applying/applied/removing state
    4. Raise alarms for image-conversion
    
    Change-Id: Ie205329b694525509b0820497186fcd9ec2e45c9
    Closes-bug: 1819688
    Depends-On: https://review.opendev.org/#/c/724270/
    Depends-On: https://review.opendev.org/724288/
    Signed-off-by: Elena Taivan <elena.taivan@windriver.com>

commit bc9cde71a0bbcd099427b8808e0bdb1b78cb9725
Author: albailey <Al.Bailey@windriver.com>
Date:   Tue May 12 14:24:17 2020 -0500

Specify an upper limit for flake8 and pycodestyle
    
    Both flake8 and pycodestyle were updated on May 11
    which caused zuul jobs to start failing.
    
    The copyrights were updated as a way of triggering
    zuul to run the flake8 jobs associated with the
    test-requirements.txt
    
    Similar solution as:
    https://review.opendev.org/#/c/727133/
    
    Change-Id: Ia2b97203e7ab767586ee7393ac08fcf781af7609
    Closes-Bug: 1878276
    Signed-off-by: albailey <Al.Bailey@windriver.com>

commit c317fb0324c93cbaeab1b635c745b806c04dc613
Author: Don Penney <don.penney@windriver.com>
Date:   Fri May 8 11:40:03 2020 -0400

Add support to sysinv-conductor to update static images
    
    As part of the sysinv-conductor init, apply the
    upgrade-static-images.yml playbook to download updated images to the
    local registry as needed.
    
    Change-Id: I726a244ae226588327ebe2f69d4131b57cebab85
    Depends-On: https://review.opendev.org/726420
    Story: 2006781
    Task: 39705
    Signed-off-by: Don Penney <don.penney@windriver.com>

commit dac06a7a57efca8c6eeeb1021a768df4842eecb2
Author: John Kung <john.kung@windriver.com>
Date:   Wed May 6 17:56:19 2020 +0000

Revert "Update conditions for oam config change and manifest apply"
    
    Investigation into requirement for
      openstack::keystone::endpoint::runtime
    for configuring admin-ep is required.
    
    This reverts commit c1112ad2c5d6a6ee3a34bb345055d21fcd08a6d9.
    
    Change-Id: Icfe6bbcd0c0a0489aede56552ec15712f314c1c5

commit 09dc3cbcded99900feb0fca5f65542c3fa673231
Author: Robert Church <robert.church@windriver.com>
Date:   Tue May 5 15:29:33 2020 -0400

Provide an update strategy for Tiller deployment
    
    In the case of a simplex controller configuration the current patching
    strategy for the Tiller environment will fail as the tiller ports will
    be in use when the new deployment is attempted to be applied. The
    resulting tiller pod will be stuck in a Pending state.
    
    The deployment strategy provided by 'helm init' is unspecified. This
    change will allow one additional pod (current + new) and one unavailable
    pod (current) during an update. The maxUnavailable setting allows the
    tiller pod to be deleted which will release its ports, thus allowing the
    patch deployment to spin up an new pod to a Running state.
    
    This patching ensures that on an installed system where tiller has been
    manually removed and re-applied via 'helm init', it is patched
    appropriately.
    
    Change-Id: I356545d05a585f7cbbbd5ca5071aa834fb086c31
    Depends-On: https://review.opendev.org/#/c/725705/
    Closes-Bug: #1876396
    Signed-off-by: Robert Church <robert.church@windriver.com>
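
The update strategy described in this commit (one additional pod and one unavailable pod during an update) corresponds to the standard Kubernetes Deployment fields `maxSurge` and `maxUnavailable`. The sketch below builds such a patch body; the constant and function names are hypothetical, and the exact patch applied by the commit is not shown in this thread.

```python
import json

# Hypothetical patch body: allow one extra pod and one unavailable pod
# while a Deployment is being updated.
TILLER_STRATEGY_PATCH = {
    "spec": {
        "strategy": {
            "type": "RollingUpdate",
            "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 1},
        }
    }
}


def strategy_patch_json():
    """Serialize the patch so it can be handed to a Kubernetes client."""
    return json.dumps(TILLER_STRATEGY_PATCH)
```

With `maxUnavailable` set to 1, the current pod may be deleted first, which frees its ports for the replacement pod on a simplex controller.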

commit 40463adf9aba476ba44b0dd89d4c30c9343b43b4
Author: John Kung <john.kung@windriver.com>
Date:   Mon May 4 17:13:59 2020 -0400

Fix certificate-key to 64 characters
    
    Update get_secure_static_config() to fix the
    kubernetes::kubeadm::certificate-key,  to the 64 characters
    expected by kubernetes.
    
    Change-Id: I366e6eb1dc4e764425ef2a82a493db47a080f49a
    Closes-bug: 1876755
    Signed-off-by: John Kung <john.kung@windriver.com>
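
As an illustration of the 64-character requirement: kubeadm treats the certificate key as a hex-encoded 32-byte key, which yields 64 hexadecimal characters. The sketch below is not the `get_secure_static_config()` code; both function names are hypothetical.

```python
import secrets
import string


def new_certificate_key():
    """Generate 32 random bytes, hex-encoded to 64 characters."""
    return secrets.token_hex(32)


def is_valid_certificate_key(key):
    """Check the length and that every character is a hex digit."""
    return len(key) == 64 and all(c in string.hexdigits for c in key)
```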

commit ee72ac30762d5182ff5fa8051cd0f86a1a18efba
Author: Ovidiu Poncea <ovidiu.poncea@windriver.com>
Date:   Thu Apr 30 20:10:53 2020 +0300

Copy RBD provisioner secret to k8s namespaces only when Ceph is enabled
    
    When an application is started the secret to access kube RBD pool is
    needed in the application namespace to allow PVC creation.
    
    This commit adds a semantic check to verify that Ceph is enabled before
    attempting the copy operation.
    
    Change-Id: If890e53414df183337b563902d3566285ab27213
    Story: 2007391
    Task: 39604
    Signed-off-by: Ovidiu Poncea <ovidiu.poncea@windriver.com>

commit fbcdbf63ea3ac192a8e6dbd8588ca34399444008
Author: David Sullivan <david.sullivan@windriver.com>
Date:   Thu Apr 30 18:19:29 2020 -0400

Use persistent backup during upgrade
    
    Use the persistent backup to store the upgrade data during simplex
    upgrades.
    
    Change-Id: I83280fdc5b2c702045a6a51b1c379758dd50baa2
    Story: 2007403
    Task: 39606
    Signed-off-by: David Sullivan <david.sullivan@windriver.com>

commit c1112ad2c5d6a6ee3a34bb345055d21fcd08a6d9
Author: John Kung <john.kung@windriver.com>
Date:   Thu Apr 30 11:33:52 2020 -0400

Update conditions for oam config change and manifest apply
    
    The runtime manifest apply for an oam config change was being
    triggered on host-swact to the target controller after startup.
    Thus, the config runtime manifest was being triggered even when
    there was not an oam config change.
    
    Update the runtime manifest apply for oam config to be triggered
    on active controller startup after an oam configuration change.
    
    During upgrades, disallow oam network changes as the configuration
    affects the platform and kubernetes components dependent on the
    OAM network.
    
    Tests Performed:
    bootstrap and enable duplex controllers
    bootstrap and enable AIO-SX
    host-swact after initial install and reinstall
    oam-modify and host-swact and verify oam access
    
    Change-Id: I4777891eaec05a6a39322325cec3c2ed006446da
    Story: 2007403
    Task: 39605
    Partial-Bug: 1874136
    Signed-off-by: John Kung <john.kung@windriver.com>

commit 88f2f7dc1a12327e12b006ef437b919bcef29108
Author: Paul Vaduva <Paul.Vaduva@windriver.com>
Date:   Wed Apr 22 03:19:29 2020 +0300

Fix race condition during certificate key regeneration
    
    When monitor is created on compute-1 hiera data is regenerated including
    certificate-key during controller-1 reboot as part of the unlock. When
    controller-1 boots up the join command fails as certificate key is no
    longer valid
    
    Change-Id: I99057fa1afc3648c7aa3910f95067bde7b51b033
    Closes-bug: 1873916
    Signed-off-by: Paul Vaduva <Paul.Vaduva@windriver.com>

commit 426f034c14ea6f3d292c8a3a8b8de50efe0a2171
Author: Mihnea Saracin <Mihnea.Saracin@windriver.com>
Date:   Thu Apr 23 19:31:34 2020 +0300

Persistent backup partition comments
    
    Add some information about the persistent partition
    in the sysinv where the partitions sizes are computed
    
    Depends-On: https://review.opendev.org/#/c/720256/
    Change-Id: Id07e38c1c8cf68c83ba393bf3e809bf892f430f5
    Signed-off-by: Mihnea Saracin <Mihnea.Saracin@windriver.com>

commit 8099bbbbcf6e67190dc2ede949c47da081317e2d
Author: Elena Taivan <elena.taivan@windriver.com>
Date:   Wed Mar 25 12:33:42 2020 +0000

Add a new filesystem for image conversion
    
    Create the new host_fs CLI commands and the APIs
        system host-fs-add
        system host-fs-delete
    
    These commands will be used only for adding/removing 'image-conversion'
    filesystem dedicated only for qcow2 image conversion.
    'image-conversion' filesystem is optional.
    It is not allowed to add/remove any other filesystem.
    
    Change-Id: I87c876371e123ec1ba946170258401d220260e31
    Partial-bug: 1819688
    Depends-On: https://review.opendev.org/#/c/714936/
    Signed-off-by: Elena Taivan <stefan.dinescu@windriver.com>

commit e0d751f79060c788526ad4f3af56abe1e2308f8f
Author: Matt Peters <matt.peters@windriver.com>
Date:   Tue Apr 28 12:38:19 2020 -0500

Remove storage class backend from helm overrides
    
    Remove the storageClass parameter from the stx-monitor
    helm system overrides.  With support for different
    storage classes, the specific request for the storage
    class of "general" should not be configured so that the
    default storage class is used when not specified.
    
    NOTE: The old parameter had no effect since it should
    have been storageClassName.  However, it is being removed
    since it is confusing to the end user.
    
    Story: 2007391
    Task: 39589
    
    Change-Id: Ie690e53404df183337b563902d3566285db27313
    Signed-off-by: Matt Peters <matt.peters@windriver.com>

commit dbc41d03626d4f963f24cd83b7b417bba361a969
Author: Teresa Ho <teresa.ho@windriver.com>
Date:   Mon Apr 27 22:32:49 2020 -0400

Fix db error in creating route for dc host
    
    In creating a host route for DC, the interface id is
    required instead of the interface uuid.
    This update fixed the database error.
    
    Tested in vbox with system controller and subcloud.
    
    Closes-Bug: 1875461
    
    Change-Id: Ica81d0cd237ada1232f3fb3b3518a8d74df9ba99
    Signed-off-by: Teresa Ho <teresa.ho@windriver.com>

commit f0b1f8b604f9cb908213113648dabc63e268aaa8
Author: Jessica Castelino <jessica.castelino@windriver.com>
Date:   Fri Apr 24 14:12:58 2020 -0400

Rename the existing /opt/patch-vault filesystem to /opt/dc-vault
    
    The filesystem /opt/patch-vault is created on the system controller.
    In order to re-use this filesystem to store FPGA images and software
    loads, it is renamed to /opt/dc-vault. Additionally, the default size
    of the dc-vault-lv is increased from 8G to 15G.
    
    Story: 2006740
    Task: 39550
    Change-Id: Id8cda76759da6e6c73fd24357f79658894c95a64
    Signed-off-by: Jessica Castelino <jessica.castelino@windriver.com>

commit 5d04b37e9074c2beedc678a98e54a6d27e5d35c7
Author: Simon Cousineau <Simon.Cousineau@windriver.com>
Date:   Mon Apr 27 12:49:02 2020 -0400

Data Collection Reduction
    
    This changeset aims to reduce the amount of data collected by
    stx-monitor. This is achieved by:
    - Dropping the load, process_summary and fsstat metricsets from the
      system module
    - Dropping the system metricset from the kubernetes module
    - Dropping percentage metrics from the cpu metricset
    - Increasing daemonset kubernetes module period from 10s to 60s
    
    Story: 2007221
    Task: 39567
    
    Change-Id: I01899ac5af8dc48313d801c3d16bff209286000b
    Signed-off-by: Simon Cousineau <Simon.Cousineau@windriver.com>

commit 382491ffde5bbafd43154fcd69f8345df9ea9bc7
Author: John Kung <john.kung@windriver.com>
Date:   Mon Apr 27 17:45:40 2020 -0400

Disallow host-lock controller-1 during upgrade-starting
    
    Add a semantic check to prevent host-lock controller-1
    when the upgrade state is 'starting'.  This is to ensure the
    database is not snapshot with N+1 controller administratively locked,
    as that is to become the N+1 active controller.
    
    Change-Id: Ia34cbe40d58920fb26be0901bce6a6966a3ec27c
    Story: 2007403
    Task: 39574
    Signed-off-by: John Kung <john.kung@windriver.com>

commit 600f0a678541368d8c973850fbabcd6b55eacf3f
Author: Robert Church <robert.church@windriver.com>
Date:   Mon Apr 27 15:04:15 2020 -0400

Include app isolated CPUs when checking for minimum app cores
    
    Add total_isolated_cores when computing the total number of reserved
    cores. This will ensure that at least one unassigned core is available
    for general applications and all CPUs will not be consumed by all
    reservations.
    
    Change-Id: Ic5b493741dbd5d626906f686c002eb4e6f5775a4
    Story: 2006999
    Task: 39573
    Signed-off-by: Robert Church <robert.church@windriver.com>
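
The check described in this commit can be sketched as follows. This is a simplified model, not the sysinv implementation: the function name and the keys of `reserved_by_function` are assumptions chosen for the example.

```python
def has_unassigned_core(total_cores, reserved_by_function):
    """Return True if at least one core remains after all reservations.

    `reserved_by_function` maps a CPU function name (hypothetical keys)
    to the number of cores reserved for it; isolated cores must be
    included for the result to be meaningful.
    """
    total_reserved = sum(reserved_by_function.values())
    return total_cores - total_reserved >= 1
```

On an 8-core host with 2 platform, 2 vswitch, and 1 shared core, leaving out 3 isolated cores makes the check pass even though no core is actually left for general applications.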

commit 2d30ca7673acddefd22baf1d25641f3ebbf1a42a
Author: Matt Peters <matt.peters@windriver.com>
Date:   Fri Apr 24 15:01:09 2020 -0500

Remove helm plugin version checks
    
    Until the full application decoupling is completed, the helm
    plugin version enforcement is being removed since applications
    may still want to upversion the application without a change
    to the platform plugins.
    
    Full platform application compatibility will be enforced once
    the application decoupling story is completed.
    
    Story: 2006537
    Task: 39551
    
    Change-Id: Ia86fcfc2d100bad6fce5763bd2ab21a6bc3611b2
    Signed-off-by: Matt Peters <matt.peters@windriver.com>

commit ff66f652d5b5108e19030852cb30c7f395517779
Author: Matt Peters <matt.peters@windriver.com>
Date:   Fri Apr 24 11:20:01 2020 -0500

Update Logstash to use NodePort Ingress
    
    Logstash should not be using a custom port for collectd
    input from the K8s NodePort range since it might cause a
    conflict if the port is allocated to another service.
    Therefore, logstash will use a proper NodePort value
    reserved by the nginx-ingress service.
    
    Do not disable the nginx-ingress on the subcloud since it
    is required for collectd to send events to logstash.
    
    Story: 2007221
    Task: 39549
    
    Change-Id: Ibdcbcf1b217ddd17197c0e8fb6cc069a573d10a5
    Depends-On: https://review.opendev.org/#/c/722674
    Signed-off-by: Matt Peters <matt.peters@windriver.com>

commit 92828038b4cfa720c6dfc74fbdcb2e463ac5996d
Author: Robert Church <robert.church@windriver.com>
Date:   Wed Apr 22 02:50:11 2020 -0400

Enable --reserved-cpus option in k8s v1.18.1
    
    The option was introduced in k8s v1.17 and will now be used to define
    the explicit set of CPUs that are reserved for specific cpu functions in
    StarlingX.
    
    This retires setting the number of CPUs reserved in the --kube-reserved
    and --system-reserved options.
    
    Instead of calculating the number of CPUs related to reservations,
    provide the specific list of CPUs in a comma separated range format.
    This will be used by puppet to set the --reserved-cpus option based on
    cpu manager policy.
    
    Remove restrictions around CPU assignments:
    - Allow platform cores to be reserved on any processor
    - Allow application isolated cores to be reserved on any processor
    
    Change-Id: I1a3d4e4cca7b6940682a787c2e7348e56a047a06
    Depends-On: https://review.opendev.org/#/c/722189
    Story: 2006999
    Task: 39528
    Signed-off-by: Robert Church <robert.church@windriver.com>
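
The "comma separated range format" mentioned in this commit can be produced from a set of CPU ids as sketched below. The function name `format_cpu_ranges` is hypothetical, and this is an illustration rather than the sysinv or puppet code.

```python
def format_cpu_ranges(cpus):
    """Collapse CPU ids into a string such as "0-2,5,7-8"."""
    ids = sorted(set(cpus))
    ranges = []
    start = prev = None
    for cpu in ids:
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu  # extend the current run
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    # A run of length one is written as a single id, longer runs as "a-b".
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)
```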

commit fd3a279c83de163face3cc69f551cc6f65d1cace
Author: Kevin Smith <kevin.smith@windriver.com>
Date:   Thu Apr 23 17:32:33 2020 -0400

Fix application-update reuse-user-overrides
    
    The 'maintain_user_overrides' flag in the application tarball
    metadata.yaml file is meant to indicate whether to preserve
    user overrides over application update.  The --reuse-user-overrides
    flag of the application-update command can override the setting
    in the metadata.yaml file, but the current logic means the
    'maintain_user_overrides' flag will never be checked even if the
    --reuse-user-overrides flag is not set.
    
    This update allows the maintain_user_overrides to be checked when
    the --reuse-user-overrides flag is not set.
    
    Closes-Bug: 1874552
    Change-Id: I38e009f72c432f43b1ad8744771ce32de1269736
    Signed-off-by: Kevin Smith <kevin.smith@windriver.com>
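
The corrected precedence can be sketched like this. It is a simplified model of the behaviour described in the commit message, not the sysinv code; the function name and the use of `None` to mean "flag not given" are assumptions.

```python
def should_reuse_user_overrides(cli_flag, metadata):
    """Decide whether user overrides survive an application update.

    cli_flag: True/False when --reuse-user-overrides was given,
              None when it was not (assumed convention).
    metadata: parsed metadata.yaml of the application tarball.
    """
    if cli_flag is not None:
        # An explicit command-line choice overrides the metadata.
        return cli_flag
    # Otherwise fall back to the metadata flag.
    return bool(metadata.get("maintain_user_overrides", False))
```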

commit 1c77d6664264814e37ccf998fd1aea896235e7e6
Author: Bin Qian <bin.qian@windriver.com>
Date:   Tue Apr 7 23:58:08 2020 -0400

Set dc adminep cert and root ca cert to secure system config
    
    Extract admin endpoint cert and key pair from cert-manager to secure
    system config, for puppet to pick up and install.
    The cert and key are used by haproxy to provide ssl termination
    on admin endpoints.
    
    Performed tests:
    Install DC, unlocked system controller 0 and 1
    Unlocked SX subcloud controller 0.
    
    Story: 2007347
    Task: 39429
    
    Depends-on: https://review.opendev.org/#/c/720270
    Depends-on: https://review.opendev.org/#/c/720224
    
    Change-Id: Idb302fffe2b4c4ae36a901377d5089a91d26a3ba
    Signed-off-by: Bin Qian <bin.qian@windriver.com>

commit 0333ccbb4216300eb451004790ce8b4c7e492e6f
Author: Simon Cousineau <Simon.Cousineau@windriver.com>
Date:   Thu Apr 23 09:56:38 2020 -0400

Fix Filebeat readiness probe exceeding timeout
    
    The 7.6.0 chart upgrade added a readiness probe to the beats. The
    Filebeat readiness probe will occasionally fail, causing
    application-apply to fail. This fix addresses this issue by increasing
    Filebeat's resource limits to match those allotted to Metricbeat.
    
    Closes-Bug: 1874328
    
    Change-Id: Ie2e23bbe063fd837999ceb48cc97071034526f35
    Signed-off-by: Simon Cousineau <Simon.Cousineau@windriver.com>

commit f20970adcff43bfc1f410fd7efa211920ac33e2e
Author: John Kung <john.kung@windriver.com>
Date:   Wed Apr 22 17:02:59 2020 -0400

Fix application-update to reference inst_path
    
    The directory path to metadata_file was set incorrectly;
    this is fixed by setting it to path.inst_path.
    
    Tests Performed:
    - Verified application-update passes
    - Verified updated application stx-monitor metadata
    - Verified updated application stx-openstack metadata
    
    Change-Id: I084bf34c6e19d9c05766639160af5dbe39aa4499
    Closes-Bug: 1874284
    Signed-off-by: John Kung <john.kung@windriver.com>

commit 24a0284e3d182faac2b613ddb9f9f36c5ba3995a
Author: Robert Church <robert.church@windriver.com>
Date:   Sun Apr 19 06:22:50 2020 -0400

Patch Tiller deployment to ensure self-recovery
    
    On node startup, there appears to be a race condition between when
    kubelet sees a pod and when kubelet sees a service. Due to this race,
    required environment variables are missing, which prevents tiller from
    functioning properly.
    
    See the comment at
    https://github.com/kubernetes/kubernetes/blob/v1.18.1/pkg/kubelet/kubelet_pods.go#L566
    
    This change patches the tiller deployment to make sure the four classes
    of environment variables are present prior to starting tiller. If any
    class of variables is not present in the environment, then exit. This
    will recreate the pod and will populate the correct environment for
    tiller to function.
    
    Since the upgrade to v1.18.1, this has been seen in simplex and duplex
    controller configurations.
    
    Review https://review.opendev.org/#/c/699307/ will cover patching during
    initial provisioning via ansible. This change will check that tiller is
    patched every time the conductor starts as part of the tiller upgrade
    logic. This will cover scenarios where tiller is manually removed from
    the cluster and reinstalled via helm.
    
    This change should be reverted once StarlingX moves to helm v3.
    
    Also removed dead code: get_k8s_secret()
    
    Change-Id: Icd199ec1b1e10840094c0eae59d53838f32ffd6f
    Closes-Bug: #1856078
    Signed-off-by: Robert Church <robert.church@windriver.com>

commit 7e10e2091497a70bb39583c0678968171790bfdf
Author: David Sullivan <david.sullivan@windriver.com>
Date:   Tue Apr 21 22:37:50 2020 -0400

Correct alarm calculations in health check
    
    The health check would incorrectly report all alarms as management
    affecting. This was a result of moving to the fm API instead of directly
    querying the database. As we are querying the API, a tuple is never
    returned and the mgmt_affecting property is calculated to "True" or
    "False".
    
    Same root cause as this bug/change:
    https://review.opendev.org/#/c/664274/
    
    Change-Id: Ia0b8a1df9526daa5052bf977f2c8812416b7e3b9
    Story: 2007403
    Task: 39517
    Signed-off-by: David Sullivan <david.sullivan@windriver.com>
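
The pitfall behind the commit above is that a non-empty string such as "False" is truthy in Python. A minimal sketch of a safer check, assuming the alarm is available as a dict with a "mgmt_affecting" key (the helper name and the dict shape are illustrative, not the fm API's actual types):

```python
def is_mgmt_affecting(alarm):
    """Interpret mgmt_affecting whether it arrives as a bool or a string."""
    value = alarm.get("mgmt_affecting")
    if isinstance(value, str):
        # The API reports "True" or "False" as text; compare explicitly.
        return value.strip().lower() == "true"
    return bool(value)


# A plain truthiness test treats the string "False" as True.
print(bool("False"))
print(is_mgmt_affecting({"mgmt_affecting": "False"}))
```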

commit 4ccb11cb4019734e424362d677afb00dd6ecc4b6
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Tue Apr 21 11:47:37 2020 +0300

Improve host-overrides
    
    Add missing variables for DC.
    
    Central+Subcloud:
    system_mode
    location
    description
    
    Subcloud:
    region_config
    region_name
    system_controller_oam_subnet
    system_controller_oam_floating_address
    system_controller_subnet
    system_controller_floating_address
    
    Partial-Bug: 1870389
    Closes-Bug: 1873617
    Change-Id: Ieb12ffc0ad769dd6ca22eb4c15f9d6d55778fd4b
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>

commit 7b8ab9ff532dca6f1bf9e1a37deef9650e790167
Author: Stefan Dinescu <stefan.dinescu@windriver.com>
Date:   Tue Apr 21 13:47:50 2020 +0000

Support host routes for storage networks
    
    Add storage network type in the list of interfaces
    that support routes
    
    Change-Id: I6fd5117006159c6622649a563d5268bbd49d05d3
    Story: 2007391
    Task: 39511
    Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>

commit b0e76a69277441b6becec6533214bdbbb38e6058
Author: Stefan Dinescu <stefan.dinescu@windriver.com>
Date:   Thu Dec 19 15:47:00 2019 +0200

Allow yaml formatting for controllerfs-list
    
    In order to be easily parsed by ansible, the controllerfs-list
    command should support yaml output format.
    
    Change-Id: Ic766980645d618d54d34bd04d82339fd2cd36562
    Depends-On: https://review.opendev.org/#/c/719782/
    Partial-bug: 1854169
    Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>

commit e169d1caea71b63034dbe1a008616df0f7a52639
Author: Andy Ning <andy.ning@windriver.com>
Date:   Mon Apr 6 10:47:09 2020 -0400

Generate admin_url to enable https for admin endpoints
    
    This commit updated platform services' sysinv puppet plugins to
    generate proper admin_url hiera data to enable https for these endpoints
    during controller unlock.
    
    This commit also updated controller_config to copy and install dc admin
    endpoint CA cert and haproxy cert for the second controller.
    
    Change-Id: I21345a96f8a0ffb416069ff28dbcfa51b9e12359
    Story: 2007347
    Task: 39314
    Signed-off-by: Andy Ning <andy.ning@windriver.com>

commit 4e0b2acdfed437e95abf789748969f26880a53a5
Author: John Kung <john.kung@windriver.com>
Date:   Thu Apr 2 10:46:56 2020 -0400

Enable duplex platform upgrades
    
    Enable the mechanism to upgrade the platform components on
    a running StarlingX system with duplex controllers.
    
    This includes upgrade updates for:
      o generation of kubernetes join_cmd to enable the N+1 controller
        to join the cluster
      o migrate of kubernetes config
      o migrate etcd on host-swact
      o migrate of DistributedCloud dcmanager and dcorch databases
    
    A maintenance release for stx3.x is required to upgrade to stx4.0
    
    Tests Performed with duplex controller: AIO-DX and Standard
    - system load-import
    - system health-query-upgrade
    - system upgrade-start
    - system host-upgrade controller-0
    - system host-lock/unlock host N while controller N, N+1
    - system host-lock/unlock controller-0 while controller N+1
    - system host-upgrade controller-1
    - system host-upgrade storage
    - system host-upgrade worker
    - system upgrade-activate
    - system upgrade-abort
    - system host-downgrade
    - system upgrade-complete
    - verified application (e.g. stx-monitor) over upgrade
    
    Change-Id: I4267c7b32b2e7b59b5ffdd8146288698962da1e0
    Story: 2007403
    Task: 39243
    Task: 39244
    Task: 39245
    Signed-off-by: John Kung <john.kung@windriver.com>

commit 4247ed2fde53aa17b51feba93421090c432084e4
Author: Carmen Rata <carmen.rata@windriver.com>
Date:   Wed Apr 15 16:40:52 2020 -0400

Update verify-license call in sysinv
    
    This commit updates the parameters required to call verify-license in
    sysinv to bring it in sync with its most recent implementation.
    
    Story: 2007403
    Task: 39433
    
    Depends-on: https://review.opendev.org/#/c/720615/
    
    Change-Id: Ie35e5bb3f1237887dfff66f4ed8d71a24f95ebdb
    Signed-off-by: Carmen Rata <carmen.rata@windriver.com>

commit 5c1361b0e81f53349d0d6715f7b627b4456147a0
Author: Robert Church <robert.church@windriver.com>
Date:   Sun Apr 19 06:23:44 2020 -0400

Update MatchNodeSelector recovery logic for NodeAffinity status
    
    NodeAffinity pods related to applications will not be removed by
    K8S automatically. These pods may block subsequent application applies
    as tiller expects these pods to be in a non failed state.
    
    This update now will look for NodeAffinity pods when the sysinv
    conductor starts. This is no longer limited to simplex nodes. This
    behavior is now observed on simplex and duplex controller configurations
    as of the upversion to k8s v1.18.1.
    
    Change-Id: I6384ffd1d14ac105e26b83c02aaa8f090e1fdde1
    Story: 2006999
    Task: 39475
    Related-Bug: #1849688
    Signed-off-by: Robert Church <robert.church@windriver.com>

commit b1a290f0ccfa0b44af6fd7247be92f361d919467
Author: Simon Cousineau <Simon.Cousineau@windriver.com>
Date:   Fri Apr 17 10:48:09 2020 -0400

Fix beat fails to parse kubernetes.pod.labels.app
    
    Metricbeat and Filebeat fail to parse labels that are used as both
    objects and keywords in the Elasticsearch document hierarchy. This
    change addresses this issue by enabling the 'labels.dedot' and
    'annotations.dedot' options on Metricbeat kubernetes modules and
    Filebeat's kubernetes metadata processor, which automatically escapes
    conflicting labels and annotations.
    
    Story: 2007221
    Task: 39463
    
    Change-Id: Id7f6cd6fc499ea4644e16c80b68ebde19c6f59ad
    Signed-off-by: Simon Cousineau <Simon.Cousineau@windriver.com>

commit e6dd6fee38f1f180c4b611db4570021eb7c85bae
Author: Zhipeng Liu <zhipengs.liu@intel.com>
Date:   Thu Apr 2 01:03:34 2020 +0800

Add mariadb database config override to support ipv6
    
    Override "config_override" in helm/mariadb.py according to ip version.
    Test Pass on both ipv4 and ipv6 simplex.
    
    Closes-Bug: 1859641
    
    Change-Id: Ic15865105f305a8d7b93187eb51ef5aaf3d7d96e
    Signed-off-by: Zhipeng Liu <zhipengs.liu@intel.com>

commit d4c2f23d4fd24fa60c79c2fe0ac7e0c6ab97887b
Author: Thomas Gao <Thomas.Gao@windriver.com>
Date:   Thu Apr 16 16:42:41 2020 -0400

Fixed invalid lldp tlv update by sysinv conductor
    
    Sysinv conductor reads tlv packets for a list of vlan names, and attempts
    to shove it into DB without checking its string size. Since in DB,
    'dot1_vlan_names' field only permits 255 char, the DB update can fail.
    
    This fix truncates the list of vlan names to ensure it is under 255 char.
    Unit tests are added to verify the conductor behavior.
    
    Closes-Bug: 1866230
    
    Change-Id: Ibe0f06bc5c6a96573a338ebbb991bfc88cde6fb4
    Signed-off-by: Thomas Gao <Thomas.Gao@windriver.com>
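
One way to keep the stored value within the 255 character limit described above is to drop whole names once the limit would be exceeded. This is only a sketch: the commit does not say whether the actual fix drops whole names or cuts the string, and the function name and separator are assumptions.

```python
MAX_VLAN_NAMES_LEN = 255  # size of the 'dot1_vlan_names' DB field per the commit


def truncate_vlan_names(names, max_len=MAX_VLAN_NAMES_LEN, sep=", "):
    """Join VLAN names, keeping only as many as fit within max_len."""
    kept = []
    length = 0
    for name in names:
        # Account for the separator in front of every name but the first.
        extra = len(name) + (len(sep) if kept else 0)
        if length + extra > max_len:
            break
        kept.append(name)
        length += extra
    return sep.join(kept)


names = ["vlan%04d" % i for i in range(100)]
joined = truncate_vlan_names(names)
print(len(joined))
```

Dropping whole names avoids storing a partially cut VLAN name, at the cost of losing the tail of the list.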

commit 2e40c98ed07abad6cc84b32b129cac52baea794f
Author: David Sullivan <david.sullivan@windriver.com>
Date:   Wed Apr 15 10:55:41 2020 -0400

Use ansible for simplex upgrade start
    
    Use the ansible backup playbook for simplex upgrade start. Pass the
    backup location and filename to the playbook.
    
    Change-Id: I624e38adfb5a7d4c1193da0dfe29991492f41d6a
    Story: 2007403
    Task: 39427
    Signed-off-by: David Sullivan <david.sullivan@windriver.com>

commit f64ae62e4dfede86ad821aa8282a783f3c406c8d
Author: Tao Liu <tao.liu@windriver.com>
Date:   Thu Apr 16 10:04:52 2020 -0400

Support subcloud deploy upload the common files
    
    Define a constant for /opt/platform/deploy/<version>
    
    Partial-Bug: 1864508
    
    Change-Id: Ide43993992aeae830631a0c1bb8ee377990a6974
    Signed-off-by: Tao Liu <tao.liu@windriver.com>

commit 49f93b5d6d4d30d5717753efe499485ea15cca8f
Author: Simon Cousineau <Simon.Cousineau@windriver.com>
Date:   Tue Apr 14 13:34:03 2020 -0400

Add system fields to container logs
    
    These changes move the system fields out from Filebeat's 'log' input
    config so that they are added to all log inputs. System fields are now
    added to autodiscovered container logs as well.
    
    Change-Id: I4810df8c79f69029347554124849ee44068f5e5f
    Signed-off-by: Simon Cousineau <Simon.Cousineau@windriver.com>

commit 994068cbd88f9eb3df99a4bff016df73493285e8
Author: Simon Cousineau <Simon.Cousineau@windriver.com>
Date:   Mon Apr 13 11:46:07 2020 -0400

Container logs collected without Kubernetes metadata
    
    Container logs are now being collected using Filebeat's 'container'
    input.
    This change excludes container logs from being collected by the 'log'
    input, so that the logs can be enriched with Kubernetes metadata.
    
    Depends-On: https://review.opendev.org/#/c/719585/
    Change-Id: Ia7ed274975bfe4c4a5bd0dc78f256fa3fae23d5f
    Signed-off-by: Simon Cousineau <Simon.Cousineau@windriver.com>

commit b1ca87c7cfca4ac493fe8ef6e57de4d425effba2
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Fri Apr 10 16:48:49 2020 +0300

Change ceph manager port
    
    Free port 5001 to be used by keystone.
    
    Story: 2007347
    Task: 39391
    
    Depends-On: I45ee810c9b4686d98c246c3a73f21f0de4ba76a1
    Change-Id: Ie2901a5affc803e0c86af6a94ed27bfa9cd9d458
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>

commit a6f0615860742b4323a6967a0f9a0059aabb1550
Author: Robert Church <robert.church@windriver.com>
Date:   Mon Mar 23 20:35:46 2020 -0400

Update get_kube_versions to align with v1.18.1
    
    Change-Id: Ib5b2cb2849a2865b8e31bc37a84d35bb9736f131
    Story: 2006999
    Task: 39341
    Depends-On: https://review.opendev.org/#/c/718568/
    Signed-off-by: Robert Church <robert.church@windriver.com>

commit 362d905dad25369bf116bb1e34a659f33b7260af
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Fri Apr 10 11:31:06 2020 +0300

Improve host-overrides
    
    Add distributed cloud role information in the host overrides.
    The restore playbook needs this information.
    
    Partial-Bug: 1870389
    Change-Id: I278f19be32d1fe87687feb75e26b2898237de86f
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>

commit 55ce64cc58c3548e66b0e2aee454087f5d17c23d
Author: Carmen Rata <carmen.rata@windriver.com>
Date:   Tue Apr 7 15:11:35 2020 -0400

Refactor obsolete versions usage in sysinv
    
    This commit removes obsolete version checks from sysinv code.
    
    Story: 2007403
    Task: 39226
    
    Change-Id: Ibc5ba1d65c16971926dfd3aae05564fbb314aa1b
    Signed-off-by: Carmen Rata <carmen.rata@windriver.com>

commit a68e15140886a8ed31a40ce8186012b25de77b87
Author: Jerry Sun <jerry.sun@windriver.com>
Date:   Fri Mar 27 14:15:52 2020 -0400

Support adding admission plugin post bootstrap
    
    This commit adds a system service parameter for admission plugins of
    kube-apiserver. We need this for pod security plugin. Starting pod
    security plugin without any policies will result in all pods being
    denied. This means pod security plugin must be started by service
    parameter after bootstrap.
    
    Story: 2007351
    Task: 38897
    Depends-On:  https://review.opendev.org/#/c/717374
    
    Change-Id: I1a7e19f85a4be609112765c975bb81a248217168
    Signed-off-by: Jerry Sun <jerry.sun@windriver.com>

commit fb8ae2dbae2e6d441579b04a9629439e2cced3c8
Author: Sharath Kumar K <sharath.kumar@intel.com>
Date:   Mon Apr 6 09:53:28 2020 +0200

De-branding in starlingx/config: Titanium Cloud -> StarlingX
    
    1. Rename Titanium Cloud to StarlingX for .spec files
    2. Rename Titanium Cloud to StarlingX for .service file
    
    Test:
    After the de-brand change, bootimage.iso was built in the flock layer
    and installed on the dev machine to validate the changes.
    
    Please note, doing de-brand changes in batches, this is batch4 changes.
    
    Story: 2006387
    Task: 36202
    
    Change-Id: I708a1edb07dcd21a623fa484bb3b935c5180d089
    Signed-off-by: Sharath Kumar K <sharath.kumar@intel.com>

commit b101cc1719e356baac24b7eda3f7ff2bdd5e984d
Author: Ambarish Das <ambarish.das@intel.com>
Date:   Fri Apr 3 21:38:17 2020 -0500

Clean up: python libvirt removed from test requirement of sysinv
    
    This patch removes the dependency of libvirt-python from
    test_requirements.txt file of sysinv. This package is no longer
    used by sysinv and generates an error in "tox" execution.
    
    Closes-Bug:#1869318
               libvirt-python in test requirement throws error in tox build
               for py27 for config module
    
    Change-Id: I6f662159d5d71465079746755dabc8c063d9a158
    Signed-off-by: Ambarish Das <ambarish.das@intel.com>

commit 01c8b191d19ed6dd7a0d6475aa3a439890e43379
Author: Carmen Rata <carmen.rata@windriver.com>
Date:   Fri Mar 27 14:03:28 2020 -0400

Config updates for stx3.0 upgrades
    
    Update controllerconfig to remove non-platform openstack components
    and fix db barbican migration.
    Create RPC call to allow to touch /etc/platform/.upgrade_controller_1
    Remove unneeded upgrade-scripts.
    Obsolete software version related fixes.
    
    Story: 2007403
    Task: 39086
    Task: 39087
    Task: 39182
    Task: 39183
    Task: 39226
    
    Change-Id: I28e746f3d267c322f59402beaf25c271138a124d
    Signed-off-by: Carmen Rata <carmen.rata@windriver.com>

commit 898d48afe5ee894277246642e3533113771d1672
Author: Simon Cousineau <Simon.Cousineau@windriver.com>
Date:   Mon Mar 2 16:23:19 2020 -0500

Update helm overrides for elastic helm charts 7.6.0
    
    Update filebeat overrides to use "filebeatConfig" parameter for config
    files.
    Update logstash "replicas" and "elasticseachHosts" overrides.
    Update metricbeat module overrides to conform to metricbeat's
    configuration format.
    
    Story: 2007221
    Task: 38473
    Task: 38476
    Task: 38477
    Task: 38478
    
    Change-Id: Ie27916c1e26c4c1ada25c15277daa0598f7599b5
    Depends-On: https://review.opendev.org/#/c/708730/

commit d7ba6775212401f2bfc0bee04febe661152e504d
Author: Kevin Smith <kevin.smith@windriver.com>
Date:   Mon Mar 23 19:06:49 2020 -0400

Wait for pod termination on stx-monitor remove
    
    On removal of the stx-monitor application, wait for all pods
    to have terminated before moving to 'uploaded' status.
    This will prevent the user from issuing an application-delete
    command which could possibly timeout.
    
    Change-Id: I116a98bdc60a4a7fe05e50eb9b4ddd4e6ef2e24f
    Closes-Bug: 1868567
    Signed-off-by: Kevin Smith <kevin.smith@windriver.com>

commit 423a475aff4f9ea1b60af6a9a2989027d1506f10
Author: Shuicheng Lin <shuicheng.lin@intel.com>
Date:   Thu Mar 12 14:06:08 2020 +0800

Refresh local registry auth info each time when access local registry
    
    Local registry uses admin account password as authentication info.
    And this password may be changed by openstack client at any time.
    When sysinv downloads images from the local registry, it must not
    cache the auth info; otherwise a stale password may cause authentication
    failures in keystone and eventually lock the account.
    
    Partial-Bug: 1853017
    
    Change-Id: I07f273a05a1bc3c08b48d13c94eb6df6aecdf7c3
    Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>

tags:	added: in-f-centos8

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2020-06-03: Fix merged to ansible-playbooks (f/centos8)

#23

Reviewed:  https://review.opendev.org/729809
Committed: https://git.openstack.org/cgit/starlingx/ansible-playbooks/commit/?id=73027425d4501a6b7785e91024c9e8ddbc03115d
Submitter: Zuul
Branch:    f/centos8

commit 55c9afd075194f7669fa2a87e546f61034679b04
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Wed May 13 14:19:52 2020 +0300

Restore: disconnect etcd from ceph
    
    At the moment etcd is restored only if ceph data is kept.
    Etcd should be restored regardless if ceph data is kept or wiped.
    
    Story: 2006770
    Task 39751
    Change-Id: I9dfb1be0a83c3fdc5f1b29cbb974c5e0e2236ad3
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>

commit 003ddff574c74adf11cf8e4758e93ba0eed45a6a
Author: Don Penney <don.penney@windriver.com>
Date:   Fri May 8 11:35:58 2020 -0400

Add playbook for updating static images
    
    This commit introduces a new playbook, upgrade-static-images.yml, used
    for downloading updated images and pushing to the local registry.
    
    Change-Id: I8884440261a5a4e27b40398e5a75c9d03b09d4ba
    Story: 2006781
    Task: 39706
    Signed-off-by: Don Penney <don.penney@windriver.com>

commit 26fd273cf5175ba4bdd31d6b6b777814f1a6c860
Author: Matt Peters <matt.peters@windriver.com>
Date:   Thu May 7 14:29:02 2020 -0500

Add kube-apiserver port to calico failsafe rules
    
    An invalid GlobalNetworkPolicy or NetworkPolicy may prevent
    calico-node from communicating with the kube-apiserver.
    Once the communication is broken, calico-node is no longer
    able to update the policies since it cannot communicate to
    read the updated policies.  It can also prevent the pod
    from starting since the policies will prevent it from
    reading the configuration.
    
    To ensure that this scenario does not happen, the kube-apiserver
    port is being added to the failsafe rules to ensure communication
    is always possible, regardless of the network policy configuration.
    
    Change-Id: I1b065a74e7ad0ba9b1fdba4b63136b97efbe98ce
    Closes-Bug: 1877166
    Related-Bug: 1877383
    Signed-off-by: Matt Peters <matt.peters@windriver.com>

commit bd0f14a7dfb206ccaa3ce0f5e7d9034703b3403c
Author: Robert Church <robert.church@windriver.com>
Date:   Tue May 5 15:11:15 2020 -0400

Provide an update strategy for Tiller deployment
    
    In the case of a simplex controller configuration the current patching
    strategy for the Tiller environment will fail as the tiller ports will
    be in use when the new deployment is attempted to be applied. The
    resulting tiller pod will be stuck in a Pending state.
    
    This will be observed if the node becomes ready after 'helm init'
    installs the initial deployment and before the deployment is patched for
    environment checks.
    
    The deployment strategy provided by 'helm init' is unspecified. This
    change will allow one additional pod (current + new) and one unavailable
    pod (current) during an update. The maxUnavailable setting allows the
    tiller pod to be deleted which will release its ports, thus allowing the
    patch deployment to spin up a new pod to a Running state.
    
    Change-Id: I83c43c52a77bce9f085bfb6c6a2c4171f2ba8f97
    Partial-Bug: #1876396
    Signed-off-by: Robert Church <robert.church@windriver.com>
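
The update strategy described in the commit above (one additional pod, one unavailable pod) corresponds to the standard Kubernetes Deployment fields `maxSurge` and `maxUnavailable`. The sketch below only builds a patch body as a Python dict; the function name is hypothetical and the way the playbook actually applies the patch is not shown here.

```python
import json


def tiller_strategy_patch(max_surge=1, max_unavailable=1):
    """Build a Deployment patch that sets an explicit rolling update strategy."""
    return {
        "spec": {
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {
                    # Allow one extra pod (current + new) during the update.
                    "maxSurge": max_surge,
                    # Allow the current pod to be removed so its ports are freed.
                    "maxUnavailable": max_unavailable,
                },
            }
        }
    }


print(json.dumps(tiller_strategy_patch(), indent=2))
```

On a single-replica deployment, `maxUnavailable` of 1 is what permits the old pod to terminate before the new one is Running, which matters when both would bind the same host ports.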

commit 0dc9e173855792c38bec90360c0c4c066c36d66b
Author: Robert Church <robert.church@windriver.com>
Date:   Mon May 4 12:59:49 2020 -0400

Ensure containerd binds to the loopback interface
    
    Set the stream_server_address to bind to the loopback interface with a
    value of "127.0.0.1" for IPv4 and "::1" for IPv6.
    
    This will explicitly update the containerd configuration to use the IP
    address of the loopback interface based on the system's network
    configuration.
    
    Change-Id: I76a4ad1c123b8b701cb1fa74b16609b50cdf9bd2
    Partial-Bug: #1875891
    Signed-off-by: Robert Church <robert.church@windriver.com>
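
A minimal sketch of choosing the loopback address by IP version, as the commit above describes. The function name and the idea of deriving the version from a sample cluster address are assumptions; the playbook may determine the IP version differently.

```python
import ipaddress


def stream_server_address(sample_address):
    """Return the loopback address matching the IP version of sample_address."""
    version = ipaddress.ip_address(sample_address).version
    # "::1" for IPv6 systems, "127.0.0.1" for IPv4 systems.
    return "::1" if version == 6 else "127.0.0.1"


print(stream_server_address("192.168.206.2"))
print(stream_server_address("fd01::2"))
```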

commit 2ea3ce6a7fdff5c2079acd76bd8eee7001b4127c
Author: Andy Ning <andy.ning@windriver.com>
Date:   Thu Apr 30 13:41:33 2020 -0400

Increase wait time for certificate during subcloud bootstrap
    
    Currently during subcloud ansible bootstrap, it waits up to 15s for
    certificate secret to be ready after the yaml file applies. For some
    slow hosts (VBox for example) 15s appears not long enough so the
    extracted certificate is partial, which in turn fails haproxy.
    
    This commit updates to use the better "kubectl wait" mechanism to wait
    for the certificate to be ready, with a timeout of 30s.
    
    Change-Id: Ibd8cab9339c6d532353b45b49cc4d141f0cf5ace
    Closes-Bug: 1876099
    Signed-off-by: Andy Ning <andy.ning@windriver.com>

commit d05785ffd9add6553662fcab43f30bf8d9f6d2e3
Author: Stefan Dinescu <stefan.dinescu@windriver.com>
Date:   Fri Apr 24 10:48:20 2020 +0000

Upversion Netapp application
    
    Changes included in this commit:
    - updated netapp required docker images
    - add support for PVC snapshots (beta feature since K8s
      1.17);
    - create new ansible role for enabling PVC snapshot
      support and start required pod
    - import role for bootstrap as well, so any backend
      added in the future will also have support enabled
      by default
    - also use snapshot role for the netapp backend
      configuration (for upgrade considerations)
    - change netapp backend configuration of mapping backends
      and storage classes from 1-to-1 mapping to many-to-many
      mapping; instead of one backend configured for each
      storage-class, now any number of backends can be
      configured for any number of storage classes
    - add a new VolumeSnapshotClass configuration option for
      PVC snapshot support
    
    Change-Id: Ib1cf5a5b46f24a6864ac6d894e37db8732e0c6fb
    Depends-On: https://review.opendev.org/#/c/724237/
    Story: 2007391
    Task: 39566
    Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>

commit 204641a5b3082c9873109169f93ae1845eb79813
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Wed Apr 15 15:54:58 2020 +0300

DC subcloud restore registry.central certs
    
    During restore a certificate is missing.
    Docker needs the certificate to connect to registry.central.
    Extract it from backup archive.
    
    Closes-Bug: 1870389
    
    Depends-On: I64c8b38a51bf04714931d70e126e0f63782deb20
    Depends-On: Ieb12ffc0ad769dd6ca22eb4c15f9d6d55778fd4b
    Depends-On: I86166da31491736d6695e04fa287f79871975b55
    Depends-On: Iebab8dc059435c7e2b0f19947fedce88bd71bb65
    Depends-On: I278f19be32d1fe87687feb75e26b2898237de86f
    
    Change-Id: Ief65a8963b81ef489171c264964d472a66fec282
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>

commit acd84841d201f1d5777edd2996086732cb3a3104
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Thu Apr 23 17:37:23 2020 +0300

Fix SystemController filesystem at restore
    
    The filesystem `dc-vault` is created at unlock.
    It doesn't exist at restore time to be resized.
    It will be correctly sized during unlock.
    
    It is not mounted into /dev/cgts-vg/dc-vault-lv.
    
    Closes-Bug: 1873617
    Change-Id: Ia2748756eaa8109065af1848374cc058c447910e
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>

commit 885cfe61269a43c7cff7e56732baefc2190d5be1
Author: Bin Qian <bin.qian@windriver.com>
Date:   Wed Apr 29 11:58:14 2020 -0400

Set root certification duration
    
    Setting root certification to 5 years and renew 30 days ahead.
    
    Change-Id: I780edaab0c041a0db1e9faf47bcd473e20068247
    Story: 3007347
    Task: 39428

commit 54e9b94773f3ae9c6be7eb14e141537cad373915
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Wed Apr 22 15:44:15 2020 +0300

Fix restore without ceph backend
    
    When ceph backend is not configured there is no ceph crushmap to be
    restored, nor ceph monitors data. Skip restoring those.
    
    The rest of the logic regarding ceph osds can be treated as if osds were
    wiped.
    
    Closes-Bug: 1873974
    Depends-On: Ic2b7a77f4a54d3d30aedd6c00747fc4586428997
    Change-Id: I2776d7c2d5801ce6e81c487da263075b6f6873c8
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>

commit dd89ba118d21027da28f860f2da47e6794d0453b
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Wed Apr 22 13:32:21 2020 +0300

Fix backup without ceph backend
    
    When ceph backend is not configured there is no ceph crushmap to be
    backed up. Skip the crushmap backup step.
    
    Partial-Bug: 1873974
    Change-Id: Ic2b7a77f4a54d3d30aedd6c00747fc4586428997
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>

commit 3bb26d81d51f0590dba2a19caf9cc430673f6018
Author: Andy Ning <andy.ning@windriver.com>
Date:   Wed Apr 8 09:42:10 2020 -0400

Setup https admin endpoint certificates for subcloud
    
    This commit updated ansible bootstrap to generate, install and
    configure certificates for https enabled admin endpoints. This change
    applies to subcloud of a DC system only.
    
    The subcloud admin endpoint certificate is valid for 180 days and is
    renewed 30 days before expiry.
    
    Tests:
      - Successfully deploy subcloud by "dcmanager subcloud add"
      - Verify haproxy admin endpoint certificate is generated and
        installed properly in subcloud.
      - Verify DC admin endpoint root CA certificate is installed in
        subcloud's trusted CA cert list in subcloud.
      - Verify the haproxy admin endpoint certificate can be validated by
        the DC endpoint root CA certificate successfully in subcloud.
    
    Change-Id: Ib24d27ac4cafe345fb57ba906ea5baf0930af892
    Story: 2007347
    Task: 39465
    Depends-On: https://review.opendev.org/#/c/720224/
    Signed-off-by: Andy Ning <andy.ning@windriver.com>

commit 2b287b1050fa2b1a7b5f5d983eaa634a055b8ec2
Author: Bin Qian <bin.qian@windriver.com>
Date:   Tue Apr 7 23:48:11 2020 -0400

Install dc root cert
    
    This is to create a distributed cloud specific root CA issuer with
    cert-manager.
    
    The root CA issuer is to authorize intermediate issuers for each
    subcloud, the latter then to issue certificate for admin endpoints.
    
    Test cases:
    Bootstrap systemcontroller from local/remote
    Replay systemcontroller bootstrap playbook
    
    Story: 3007347
    Task: 39428
    
    Change-Id: I7546d6562f0bc072c3cf76f422a258a2c32b4a34
    Signed-off-by: Bin Qian <bin.qian@windriver.com>

commit 36a01e8ba38f3e0d1e2ea7a2bce31edbedfde04e
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Tue Apr 21 17:54:53 2020 +0300

B&R: Do keystone db backup for subcloud
    
    Keystone db backup file is missing for subclouds.
    Create the keystone db backup file when running the backup playbook on
    subcloud.
    
    Partial-Bug: 1870389
    Change-Id: I64c8b38a51bf04714931d70e126e0f63782deb20
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>

commit df25466798d2487c933f7d2fc1d04ec968f4bcd2
Author: Jessica Castelino <jessica.castelino@windriver.com>
Date:   Fri Apr 24 15:23:37 2020 -0400

Rename the existing /opt/patch-vault filesystem to /opt/dc-vault
    
    The filesystem /opt/patch-vault is renamed to /opt/dc-vault so that
    it can be re-used to store FPGA images and software loads. Thus,
    necessary changes have been made to the ansible playbook files.
    
    Change-Id: I3358fe2d87c79785a8803815b1bbd2727ae80a24
    Story: 2006740
    Task: 39550
    Depends-On: https://review.opendev.org/#/c/723007/
    Signed-off-by: Jessica Castelino <jessica.castelino@windriver.com>

commit d3341102189031551e8d4d194e42d86d8878920f
Author: Jerry Sun <jerry.sun@windriver.com>
Date:   Sun Apr 19 21:30:57 2020 -0400

Enable applying applications after bootstrap
    
    This commit adds the ability to specify applications to be applied
    directly after bootstrap, before controller-0 has been unlocked.
    This is needed for cert manager.
    
    Currently, nginx and cert manager will be applied by default, with
    no overrides. The user can optionally specify overrides if they wish.
    
    NOTE: This aligns with long term direction for platform applications
    to:
    - move away from the existing platform application framework in sysinv
      due to wanting to decouple application behaviour from sysinv code
      in order to support such things as independent upgrades of these
      platform applications.
    - support auto-upload/apply of platform applications in either:
         a) bootstrap playbook, if app required for supporting bootstrap
            functions, or
         b) a post-bootstrap deployment-type playbook.
    In the case of cert-manager, in near future, it will be required at
    bootstrap to support initial configuration around generating
    certificates for kubernetes and https connections.
    
    Story: 2007360
    Task: 39471
    
    Change-Id: I91ee31c7c2d35c2a101b156ef8633fc69139938d
    Signed-off-by: Jerry Sun <jerry.sun@windriver.com>

commit 0a1c06a66bc286b306bfdf4ada7cf823787b7a94
Author: Tao Liu <tao.liu@windriver.com>
Date:   Tue Apr 21 15:36:29 2020 -0400

Increase wait timeout for service endpoints reconfig
    
    Install/bootstrap HP EL8000 as subcloud timed out, while
    waiting for endpoints reconfiguration to complete
    during bootstrapping.
    
    This server has a single processor and takes around 9 minutes to
    apply the runtime manifest, which exceeds the previous timeout of
    450 seconds. In general, everything is slower on this particular
    hardware, e.g. install is slower and CLI commands take almost twice
    as long to complete as on other servers.
    
    This update increases the endpoints reconfiguration wait
    timeout to 720 seconds which provides a safety margin.
    
    Testcases:
    Install/bootstrap HP EL8000 as a subcloud.
    
    Closes-Bug: 1871699
    
    Change-Id: If284281aa13e79cc14d0369e44e8cacebb24f415
    Signed-off-by: Tao Liu <tao.liu@windriver.com>
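
For illustration only, the wait described in the commit above can be sketched as a poll-until-deadline loop. This is not the playbook's actual task; the function name `wait_for`, its parameters, and the injectable clock are all assumptions made for the example. It shows why an operation that needs about 540 seconds fails under a 450-second limit and passes under a 720-second one.

```python
import time


def wait_for(check, timeout_s, interval_s=10,
             clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or timeout_s seconds elapse.

    clock and sleep are injectable so the loop can be exercised
    without really waiting (hypothetical helper, not StarlingX code).
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


class FakeClock:
    """Simulated clock: sleeping just advances the current time."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


if __name__ == "__main__":
    for limit in (450, 720):
        fake = FakeClock()
        # Pretend the reconfiguration completes after 540 s (about 9 minutes).
        done = wait_for(lambda: fake.now >= 540, limit, 10, fake.time, fake.sleep)
        print(limit, done)
```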

commit abbf21f7fcef00e90e75d393f638a73d58b41adb
Author: Robert Church <robert.church@windriver.com>
Date:   Mon Dec 16 12:53:10 2019 -0500

Patch tiller deployment to provide environment validation
    
    There appears to be a race condition between when kubelet sees a pod and
    when kubelet sees a service. Due to this race, required environment
    variables are missing to allow tiller to function properly.
    
    See the comment at
    https://github.com/kubernetes/kubernetes/blob/v1.18.1/pkg/kubelet/kubelet_pods.go#L566
    
    This change patches the tiller deployment to make sure the four classes
    of environment variables are present prior to starting tiller. If any
    class of variables is not present in the environment, then exit. This
    will recreate the pod and will populate the correct environment for
    tiller to function.
    
    Since the upgrade to v1.18.1, this has been seen in simplex and duplex
    controller configurations.
    
    This will cover patching during initial provisioning via ansible and
    will be reverted once StarlingX moves to helm v3.
    
    Change-Id: I78e43459fedab611a67b8d9b6b3121b78ef048a6
    Partial-Bug: #1856078
    Signed-off-by: Robert Church <robert.church@windriver.com>

commit 9a8136b5b11a874da9a5b67519a59b27530b4aad
Author: Tao Liu <tao.liu@windriver.com>
Date:   Sat Apr 18 13:54:45 2020 -0400

Backup & restore: subcloud deploy files
    
    Backup the subcloud deploy files if available on the system.
    Restore the subcloud deploy files if included in the archive.
    
    Testcases:
    Backup & restore System Controller with the subcloud deploy
    files.
    Backup & restore a regular system without the subcloud
    deploy files
    
    Partial-Bug: 1864508
    
    Change-Id: Ic14f6c02dd187a082b03458b0a766c690400e317
    Signed-off-by: Tao Liu <tao.liu@windriver.com>

commit 40cfef7c417709c234e50a1a034fb4a11dbf180a
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Tue Apr 14 14:18:29 2020 +0300

Remove subcloud task from restore mode
    
    A task supposed to run only during bootstrap is running during restore.
    
    Keystone dc variables (dc_admin_user_id and dc_admin_project_id) are
    added during bootstrap to hieradata static.yaml file.
    When doing the restore the information is already present in the file in
    the backup archive.
    
    Partial-Bug: 1870389
    Change-Id: Iebab8dc059435c7e2b0f19947fedce88bd71bb65
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>

commit 5cdd394cb10c2c2d94174fdc32beb989290c6de9
Author: Stefan Dinescu <stefan.dinescu@windriver.com>
Date:   Thu Dec 19 15:23:23 2019 +0200

Resize DRBD resources when doing a restore
    
    In cases where we do a backup of a system that has non-default
    sizes for drbd-backed partitions, the restore fails when first
    unlocking controller-0.
    
    The normal resize procedure requires all controller nodes to
    be unlocked and available because the puppet manifest does
    not support resizing at unlock.
    
    To prevent the issue from occurring, as part of the restore
    procedure, we should resize the partitions on controller-0
    with the proper sizes found in sysinv. Controller-1 will
    automatically create the partitions with the proper sizes
    from the very start, so it will not need any resizes.
    
    Change-Id: Ia73452ce721514d393b486a659730d0ca7c0d7e5
    Closes-bug: 1854169
    Depends-on: https://review.opendev.org/#/c/699990
    Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>

commit a027bcf50a037166f84d897e22535c8dedf2590f
Author: Robert Church <robert.church@windriver.com>
Date:   Mon Mar 23 20:32:08 2020 -0400

Support for upversioning of k8s to v1.18.1
    
    Changes include:
    - Renamed the v1.16.2 versioned directories to v1.18.1.
    - Updated kubeadm.yaml to align the kubernetesVersion and enable the
      featureGate for multiple hugepage support
    
    Change-Id: I7241164f0185496093c0c8b5cb541fd09926b2ed
    Story: 2006999
    Task: 39334
    Depends-On: https://review.opendev.org/#/c/718568/
    Signed-off-by: Robert Church <robert.church@windriver.com>

commit 1b50022d55a9da2bbab284b1fdda2ddc78c30c79
Author: Shuicheng Lin <shuicheng.lin@intel.com>
Date:   Wed Apr 8 10:57:50 2020 +0800

Fix account be locked due to access registry without password
    
    Correct the code so that an exception is raised when the password
    cannot be retrieved from the keyring. The account was locked because
    no exception was raised and the client tried to access the registry
    with a None password, which is incorrect.
    
    Closes-Bug: #1871141
    Change-Id: Ia68b4a4f25756fdad7a198a31d5870245ff9dc1a
    Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>

commit 9080db419d559d3d5d33c0a6459e9f5e8b7700e5
Author: Dan Voiculeasa <dan.voiculeasa@windriver.com>
Date:   Thu Apr 9 16:07:30 2020 +0300

Add registry.central host for DC subcloud restore
    
    During bootstrap the management network is temporarily assigned on the
    lo interface. The backup archive contains /etc/resolv.conf and /etc/hosts
    of an already unlocked controller. Before backup, registry.central is
    resolved through DNS (nameserver `floating central management`).
    
    During restore a temporary host for registry.central must be created.
    Since there is no reference of a backup/shadow management network that
    provides connectivity for such use cases the `floating central oam`
    can be used.
    
    Partial-Bug: 1870389
    
    Change-Id: I86166da31491736d6695e04fa287f79871975b55
    Signed-off-by: Dan Voiculeasa <dan.voiculeasa@windriver.com>
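
As a rough sketch of the idea in the commit above (not the actual ansible task), a temporary host entry can be added idempotently to hosts-file content as shown below. The function name `add_host_entry` is hypothetical, and the address `2001:db8::2` is a documentation placeholder standing in for the `floating central oam` address.

```python
def add_host_entry(hosts_text, address, hostname):
    """Return hosts_text with 'address hostname' appended.

    If hostname is already mapped on any non-comment line, the text is
    returned unchanged, so the operation is safe to repeat.
    """
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address + names.
        fields = line.split("#", 1)[0].split()
        if hostname in fields[1:]:
            return hosts_text
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{address} {hostname}\n"


if __name__ == "__main__":
    original = "127.0.0.1 localhost\n"
    # Placeholder address; a real restore would use the central OAM floating IP.
    print(add_host_entry(original, "2001:db8::2", "registry.central"), end="")
```

With such an entry in place, `registry.central` would resolve locally and not depend on the nameserver recorded in the restored /etc/resolv.conf.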

commit 46e9c405cb13972a3bf08cbfcdfe4181c12b3cfc
Author: Jerry Sun <jerry.sun@windriver.com>
Date:   Fri Mar 27 14:09:45 2020 -0400

Add default pod security policies
    
    This commit adds default pod security policies. We need this
    pod security plugin. Starting pod security plugin without any
    policies will result in all pods being denied. These default
    policies prevent the user from putting the system into an
    unusable state if they accidentally enable pod security
    policies without adding policies first.
    
    Story: 2007351
    Task: 38897
    
    Change-Id: Iac49f81ef44e6cb82ff884717888dfc1a7cd2a45
    Signed-off-by: Jerry Sun <jerry.sun@windriver.com>

commit f3340a3b5379f8c33de42aeaf11e96cc886df020
Author: Stefan Dinescu <stefan.dinescu@windriver.com>
Date:   Tue Apr 7 11:36:19 2020 +0300

Backup & restore: Restore license files
    
    STX offers support for installing license files through the
    "system license-install" command.
    
    While these licenses are not enforced, they are part of the
    backups created, but they are not restored when doing a full
    backup & restore.
    
    Since license is optional, it is not expected to always be
    present in the backup archive, so we only restore it if it
    is present in the archive.
    
    Change-Id: Ibd4cdcb53d1d55409d947c1f3af45659ed21a7ae
    Closes-bug: 1871034
    Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>
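
The "restore only if present in the archive" rule from the commit above can be illustrated with Python's standard `tarfile` module. This is a minimal sketch, not the restore playbook itself; the helper names and the member path `etc/platform/.license` are assumptions chosen for the example.

```python
import io
import tarfile


def read_optional_member(archive: tarfile.TarFile, member_name: str):
    """Return the bytes of member_name, or None if the archive lacks it."""
    try:
        member = archive.getmember(member_name)
    except KeyError:
        # getmember raises KeyError when the name is not in the archive.
        return None
    fileobj = archive.extractfile(member)
    return fileobj.read() if fileobj is not None else None


def build_archive(files: dict) -> bytes:
    """Build an in-memory tar archive from {name: bytes} (demo helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical member path, used only for this demonstration.
    raw = build_archive({"etc/platform/.license": b"demo-license"})
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        license_data = read_optional_member(tar, "etc/platform/.license")
        if license_data is not None:
            print("license found, would restore it")
        else:
            print("no license in archive, skipping")
```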

commit 5c542524e4cd9fb65da698c1d4cba4d50f56bdab
Author: Shuicheng Lin <shuicheng.lin@intel.com>
Date:   Wed Apr 1 15:58:07 2020 +0800

Add kubelet_vol_plugin_dir definition to fix ansible failure
    
    When host-swact is performed, upgrade-k8s-networking.yml is called to
    check for a calico upgrade. The kubelet_vol_plugin_dir definition is
    missing there, which causes ansible to fail. Add the definition from
    main.yml to fix it.
    
    Closes-Bug: 1870038
    Change-Id: I30287ebca7f0d4a1d3c5ee656136375a7b1c182f
    Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>

commit d6cff0496dcf52655eba340e1e57b1d973040edf
Author: Shuicheng Lin <shuicheng.lin@intel.com>
Date:   Thu Mar 12 14:34:09 2020 +0800

Refresh local registry auth info each time when access local registry
    
    The local registry uses the admin account password as authentication
    info, and this password may be changed by the openstack client at any
    time. When downloading images from the local registry, the auth info
    must not be cached; otherwise it may lead to authentication failures in
    keystone and, eventually, to the account being locked.
    In this specific case, a host-swact happens first, then the function
    "_upgrade_downgrade_kube_networking" in sysinv conductor is called,
    and upgrade-k8s-networking.yml is executed, which tries to download
    kube network images from the local registry. If the admin account
    password is changed during this period, the account is locked due to
    authentication failures in keystone.
    With this update, it is still possible for the password to be changed
    just after the get operation. Because the image downloads run in
    parallel across multiple threads, the account lock may still be hit.
    This change reduces the failure rate but cannot eliminate it.
    
    Closes-Bug: 1853017
    
    Change-Id: I686616937031a3f7ac6d65e5b118511dc549ab85
    Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>

Revision history for this message

Senthil Mukundakumar (smukunda) wrote on 2020-06-26:

#24

Verified in DC3/subcloud1 using 2020-06-24_22-16-59

tags:	removed: stx.retestneeded
