platform-integ-apps apply-failed after lock/unlock controller

Bug #1849688 reported by Yang Liu on 2019-10-24
This bug affects 4 people
Affects: StarlingX
Importance: Medium
Assigned to: Bob Church

Bug Description

Brief Description
-----------------
platform-integ-apps is in the apply-failed state after a lock/unlock of controller-0 on a simplex system.
The tiller pod is stuck at MatchNodeSelector.

Severity
--------
Major

Steps to Reproduce
------------------
1. Install and configure a simplex system --> Initial apply was successful
2. lock/unlock controller

TC-name: test_lock_unlock_host[controller]
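
For reference, the lock/unlock in step 2 corresponds to the following sysinv commands (simplified here; the full invocations with credentials appear in the logs below):

$ system host-lock controller-0
$ system host-unlock controller-0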

Expected Behavior
------------------
2. Lock/unlock succeeds and the system remains healthy afterwards.

Actual Behavior
----------------
2. Lock/unlock succeeded, but platform-integ-apps went to apply-failed and the tiller pod was stuck at MatchNodeSelector.

Reproducibility
---------------
Happened 2/3 times on simplex systems.

System Configuration
--------------------
One node system
Lab-name: wcp122

Branch/Pull Time/Commit
-----------------------
2019-10-23_20-00-00

Last Pass
---------
2019-10-21_20-00-00 on same system

Timestamp/Logs
--------------
[2019-10-24 07:04:45,331] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-lock controller-0'

[2019-10-24 07:05:24,844] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'

[2019-10-24 07:14:26,862] 433 DEBUG MainThread ssh.expect :: Output:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system tiller-deploy-d6b59fcb-z4jzg 0/1 MatchNodeSelector 0 34m <none> controller-0 <none> <none>

[2019-10-24 07:14:48,208] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-list'
+---------------------+---------+-------------------------------+---------------+--------------+------------------------------------------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+---------+-------------------------------+---------------+--------------+------------------------------------------+
| platform-integ-apps | 1.0-8 | platform-integration-manifest | manifest.yaml | apply-failed | operation aborted, check logs for detail |
+---------------------+---------+-------------------------------+---------------+--------------+------------------------------------------+

Test Activity
-------------
Sanity

Bob Church (rchurch) wrote :

It looks like an endpoint/firewall update may be impacting an application apply that is in progress.

# Initial apply works during provisioning
2019-10-24 06:56:58.365 110188 INFO sysinv.conductor.manager [-] Platform managed application platform-integ-apps: Applying...
2019-10-24 06:56:58.640 110188 INFO sysinv.conductor.kube_app [-] Register the initial abort status of app platform-integ-apps
2019-10-24 06:56:58.940 110188 INFO sysinv.conductor.kube_app [-] Application platform-integ-apps (1.0-8) apply started.
2019-10-24 06:58:09.356 110188 INFO sysinv.conductor.kube_app [-] All docker images for application platform-integ-apps were successfully downloaded in 70 seconds
2019-10-24 06:58:33.243 110188 INFO sysinv.conductor.kube_app [-] Application manifest /manifests/platform-integ-apps/1.0-8/platform-integ-apps-manifest.yaml was successfully applied/re-applied.
2019-10-24 06:58:33.244 110188 INFO sysinv.conductor.kube_app [-] Exiting progress monitoring thread for app platform-integ-apps
2019-10-24 06:58:33.549 110188 INFO sysinv.conductor.kube_app [-] Application platform-integ-apps (1.0-8) apply completed.

# An override change has been detected. Not sure why this is the case. Needs investigation…
2019-10-24 07:13:16.722 102497 INFO sysinv.conductor.manager [-] There has been an overrides change, setting up reapply of platform-integ-apps

# Firewall update is triggered
2019-10-24 07:13:16.726 102497 INFO sysinv.agent.rpcapi [-] config_apply_runtime_manifest: fanout_cast: sending config 66c95e55-43a1-4b79-847d-43e6960123d2 {'classes': ['openstack::keystone::endpoint::runtime', 'platform::firewall::runtime', 'platform::sysinv::runtime'], 'force': False, 'personalities': ['controller'], 'host_uuids': [u'4624ddd2-6b83-4e12-ada6-f6862e120509']} to agent
2019-10-24 07:13:16.728 22171 INFO sysinv.agent.manager [req-337b5587-475e-4645-8aee-9b8013fcc669 admin None] config_apply_runtime_manifest: 66c95e55-43a1-4b79-847d-43e6960123d2 {u'classes': [u'openstack::keystone::endpoint::runtime', u'platform::firewall::runtime', u'platform::sysinv::runtime'], u'force': False, u'personalities': [u'controller'], u'host_uuids': [u'4624ddd2-6b83-4e12-ada6-f6862e120509']} controller

2019-10-24 07:13:31.950 102497 INFO sysinv.conductor.manager [req-de8da658-fc3b-423c-b47e-eb2ff9cf9342 admin admin] Updating platform data for host: 4624ddd2-6b83-4e12-ada6-f6862e120509 with: {u'availability': u'services-enabled'}
2019-10-24 07:13:32.171 102497 INFO sysinv.helm.manifest_base [req-de8da658-fc3b-423c-b47e-eb2ff9cf9342 admin admin] Delete manifest file /opt/platform/armada/19.10/platform-integ-apps/1.0-8/platform-integ-apps-manifest-del.yaml generated
2019-10-24 07:13:32.172 102497 INFO sysinv.conductor.manager [req-de8da658-fc3b-423c-b47e-eb2ff9cf9342 admin admin] There has been an overrides change, setting up reapply of platform-integ-apps

# Re-apply occurs due to reapply flag being raised
2019-10-24 07:14:13.743 102497 INFO sysinv.conductor.manager [-] Reapplying platform-integ-apps app
2019-10-24 07:14:13.747 102497 INFO sysinv.conductor.kube_app [-] Register the initial abort status of app platform-integ-apps
2019-10-24 07:14:14.054 102497 INFO sy...


Yang Liu (yliu12) wrote :

Removed "tiller pod" portion from title - the tiller status may not be related to this LP since there are more than 1 tiller pods on system and one of them seems to be running.

There's another LP for multiple tiller pods - https://bugs.launchpad.net/starlingx/+bug/1848033

summary: - platform-integ-apps apply-failed after lock/unlock controller - tiller
- pod stuck at MatchNodeSelector
+ platform-integ-apps apply-failed after lock/unlock controller
Yang Liu (yliu12) on 2019-10-24
description: updated
Bob Church (rchurch) wrote :

Attempting a manual reapply fails:

2019-10-24 16:42:21.443 11 ERROR armada.handlers.wait [-] [chart=kube-system-rbd-provisioner]: Timed out waiting for pods (namespace=kube-system, labels=(app=rbd-provisioner)). These pods were not ready=['rbd-provisioner-7484d49cf6-k9bw8']
2019-10-24 16:42:21.443 11 ERROR armada.handlers.armada [-] Chart deploy [kube-system-rbd-provisioner] failed: armada.exceptions.k8s_exceptions.KubernetesWatchTimeoutException: Timed out waiting for pods (namespace=kube-system, labels=(app=rbd-provisioner)). These pods were not ready=['rbd-provisioner-7484d49cf6-k9bw8']
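
For context, a manual reapply on StarlingX is driven through the sysinv CLI; presumably something like the following was run (command assumed, as it is not shown in this comment):

$ system application-apply platform-integ-apps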

Checking the pod status, I observe some rather old pods that are stuck in MatchNodeSelector:

kube-system rbd-provisioner-7484d49cf6-k9bw8 0/1 MatchNodeSelector 0 9h <none> controller-0 <none> <none>
kube-system rbd-provisioner-7484d49cf6-vlr62 1/1 Running 1 9h 172.16.192.99 controller-0 <none> <none>
kube-system storage-init-rbd-provisioner-xwnrl 0/1 Completed 0 9h 172.16.192.72 controller-0 <none> <none>
kube-system tiller-deploy-d6b59fcb-j8q2s 1/1 Running 1 9h 192.168.204.3 controller-0 <none> <none>
kube-system tiller-deploy-d6b59fcb-ntd2v 0/1 MatchNodeSelector 0 9h <none> controller-0 <none> <none>
kube-system tiller-deploy-d6b59fcb-z4jzg 0/1 MatchNodeSelector 0 10h <none> controller-0 <none> <none>
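
These stuck pods report a Failed phase (see the describe output below), so they can be listed directly; a possible one-liner (field selector assumed):

# List pods the kubelet has marked Failed (includes the MatchNodeSelector leftovers)
$ kubectl get pods -n kube-system --field-selector=status.phase=Failed -o wide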

Looking at the status of the pods, it looks like the originally applied pods may have been scheduled and started, but eventually failed due to a missing node selector.

2019-10-24 06:58:33.549 110188 INFO sysinv.conductor.kube_app [-] Application platform-integ-apps (1.0-8) apply completed.

Name: rbd-provisioner-7484d49cf6-k9bw8
Namespace: kube-system
Priority: 0
Node: controller-0/
Start Time: Thu, 24 Oct 2019 06:58:10 +0000
Labels: app=rbd-provisioner
                pod-template-hash=7484d49cf6
Annotations: cni.projectcalico.org/podIP: 172.16.192.78/32
Status: Failed
Reason: MatchNodeSelector
Message: Pod Predicate MatchNodeSelector failed
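
To pull out the failure reason and message for all such pods at once, a jsonpath query along these lines could be used (a sketch; the filter expression is an assumption):

# Print name and failure message for every pod whose status.reason is MatchNodeSelector
$ kubectl get pods -n kube-system -o jsonpath='{range .items[?(@.status.reason=="MatchNodeSelector")]}{.metadata.name}{"\t"}{.status.message}{"\n"}{end}'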

Another provisioner is started later, but does not appear to be associated with a particular re-apply:

Name: rbd-provisioner-7484d49cf6-vlr62
Namespace: kube-system
Priority: 0
Node: controller-0/192.168.204.3
Start Time: Thu, 24 Oct 2019 07:30:51 +0000
Labels: app=rbd-provisioner
              pod-template-hash=7484d49cf6
Annotations: cni.projectcalico.org/podIP: 172.16.192.99/32
Status: Running

Bob Church (rchurch) wrote :

Killing the stuck pod allows the app reapply to succeed:

[sysadmin@controller-0 ~(keystone_admin)]$ kubectl delete pods -nkube-system rbd-provisioner-7484d49cf6-k9bw8

2019-10-24 17:35:50.842 96382 INFO sysinv.conductor.kube_app [-] Application platform-integ-apps (1.0-8) apply started.
2019-10-24 17:35:50.868 96382 INFO sysinv.conductor.kube_app [-] Generating application overrides...
2019-10-24 17:35:51.017 96382 INFO sysinv.helm.manifest_base [req-ef2900cd-a6a8-460d-9743-27eac307844f admin admin] Delete manifest file /opt/platform/armada/19.10/platform-integ-apps/1.0-8/platform-integ-apps-manifest-del.yaml generated
2019-10-24 17:35:51.017 96382 INFO sysinv.conductor.kube_app [-] Application overrides generated.
2019-10-24 17:35:51.043 96382 INFO sysinv.conductor.kube_app [-] Armada manifest file has no img tags for chart helm-toolkit
2019-10-24 17:35:51.070 96382 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/quay.io/external_storage/rbd-provisioner:v2.1.1-k8s1.11 download started from local registry
2019-10-24 17:35:51.072 96382 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/ceph-config-helper:v1.15.0 download started from local registry
2019-10-24 17:35:51.438 96382 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/quay.io/external_storage/rbd-provisioner:v2.1.1-k8s1.11 download succeeded in 0 seconds
2019-10-24 17:35:51.452 96382 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/ceph-config-helper:v1.15.0 download succeeded in 0 seconds
2019-10-24 17:35:51.452 96382 INFO sysinv.conductor.kube_app [-] All docker images for application platform-integ-apps were successfully downloaded in 0 seconds
2019-10-24 17:35:51.458 96382 INFO sysinv.conductor.kube_app [-] Armada apply command = /bin/bash -c 'set -o pipefail; armada apply --enable-chart-cleanup --debug /manifests/platform-integ-apps/1.0-8/platform-integ-apps-manifest.yaml --values /overrides/platform-integ-apps/1.0-8/kube-system-rbd-provisioner.yaml --values /overrides/platform-integ-apps/1.0-8/kube-system-ceph-pools-audit.yaml --values /overrides/platform-integ-apps/1.0-8/helm-toolkit-helm-toolkit.yaml --tiller-host tiller-deploy.kube-system.svc.cluster.local | tee /logs/platform-integ-apps-apply.log'
2019-10-24 17:35:52.456 96382 INFO sysinv.conductor.kube_app [-] Starting progress monitoring thread for app platform-integ-apps
2019-10-24 17:35:52.613 96382 INFO sysinv.conductor.kube_app [-] processing chart: stx-ceph-pools-audit, overall completion: 100.0%
2019-10-24 17:35:53.265 96382 INFO sysinv.conductor.kube_app [-] Application manifest /manifests/platform-integ-apps/1.0-8/platform-integ-apps-manifest.yaml was successfully applied/re-applied.
2019-10-24 17:35:53.265 96382 INFO sysinv.conductor.kube_app [-] Exiting progress monitoring thread for app platform-integ-apps
2019-10-24 17:35:53.546 96382 INFO sysinv.conductor.kube_app [-] Application platform-integ-apps (1.0-8) apply completed.
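
More generally, the same workaround can be applied to every pod left in this state before reapplying; a minimal sketch (the field selector and the application-apply step are assumptions):

# Delete all Failed pods in kube-system (covers the MatchNodeSelector leftovers), then reapply
$ kubectl get pods -n kube-system --field-selector=status.phase=Failed -o name | xargs -r -n1 kubectl delete -n kube-system
$ system application-apply platform-integ-apps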

Wendy Mitchell (wmitchellwr) wrote :

Hit this issue again after a lock/unlock on the single-node system.
2019-10-23_20-00-00

$ system application-list
...
| platform-integ-apps | 1.0-8 | platform-integration-manifest | manifest.yaml | apply-failed | operation aborted, check logs for detail

2019-10-24 20:09:03.486 11 ERROR armada.handlers.wait [-] [chart=kube-system-rbd-provisioner]: Timed out waiting for pods (namespace=kube-system, labels=(app=rbd-provisioner)). These pods were not ready=['rbd-provisioner-7484d49cf6-dfx6h']
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada [-] Chart deploy [kube-system-rbd-provisioner] failed: armada.exceptions.k8s_exceptions.KubernetesWatchTimeoutException: Timed out waiting for pods (namespace=kube-system, labels=(app=rbd-provisioner)). These pods were not ready=['rbd-provisioner-7484d49cf6-dfx6h']
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada Traceback (most recent call last):
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/armada.py", line 225, in handle_result
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada result = get_result()
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/armada.py", line 236, in <lambda>
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada if (handle_result(chart, lambda: deploy_chart(chart))):
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/armada.py", line 214, in deploy_chart
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada chart, cg_test_all_charts, prefix, known_releases)
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/chart_deploy.py", line 248, in execute
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada chart_wait.wait(timer)
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/wait.py", line 134, in wait
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada wait.wait(timeout=timeout)
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/wait.py", line 294, in wait
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada modified = self._wait(deadline)
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/wait.py", line 354, in _wait
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada raise k8s_exceptions.KubernetesWatchTimeoutException(error)
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada armada.exceptions.k8s_exceptions.KubernetesWatchTimeoutException: Timed out waiting for pods (namespace=kube-system, labels=(app=rbd-provisioner)). These pods were not ready=['rbd-provisioner-7484d49cf6-dfx6h']
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada
2019-10-24 20:09:03.489 11 ERROR armada.handlers.armada [-] Chart deploy(s) failed: ['kube-system-rbd-provisioner']
2019-10-24 20:09:03.979 11 ERROR armada.cli [-] Caught internal exception: armada.exceptions...


Frank Miller (sensfan22) wrote :

Assigning to Bob to triage.

Changed in starlingx:
assignee: nobody → Bob Church (rchurch)
Ghada Khalil (gkhalil) wrote :

stx.3.0 / high priority - issue related to container recovery and occurs frequently

Changed in starlingx:
importance: Undecided → High
tags: added: stx.containers
Changed in starlingx:
status: New → Triaged
tags: added: stx.3.0
Yang Liu (yliu12) on 2019-10-31
tags: added: stx.retestneeded
Bob Church (rchurch) wrote :

I applied the following upstream proposed fix: https://github.com/kubernetes/kubernetes/pull/80976

With this incorporated into the build and installed in my test lab, I ran a continuous lock/unlock cycle for over 36 hours without seeing the MatchNodeSelector issue. Prior to including this patch, I would see the issue in under ~45 minutes.
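
For reference, the soak test was essentially a loop of this shape (a sketch; wait times and the detection check are assumptions, and a real run would poll host availability rather than sleep):

# lock/unlock soak loop (sketch)
while true; do
  system host-lock controller-0;   sleep 120   # wait for the lock to complete
  system host-unlock controller-0; sleep 900   # wait for the host to recover
  kubectl get pods --all-namespaces | grep MatchNodeSelector && break
done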

Possible next steps here:

1) Live with the issue occurring periodically, wait for the fix to land upstream (if it does), and pull it in on the following rebase.

2) Patch k8s in StarlingX with this change and evaluate if this fixes the issue over the coming weeks.

Frank Miller (sensfan22) wrote :

Changing the tag for this issue from stx.3.0 to stx.4.0, as the solution requires moving to a new version of Kubernetes, which won't happen until stx.4.0.

Also lowering the priority to medium as a workaround exists: delete any pods stuck in the MatchNodeSelector state.

tags: added: stx.4.0
removed: stx.3.0
Changed in starlingx:
importance: High → Medium