platform-integ-apps apply-failed after lock/unlock controller

Bug #1849688 reported by Yang Liu
This bug affects 6 people
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Bob Church

Bug Description

Brief Description
-----------------
platform-integ-apps is in the apply-failed state after a lock/unlock of controller-0 on a simplex system.
The tiller pod is stuck at MatchNodeSelector.

Severity
--------
Major

Steps to Reproduce
------------------
1. Install and configure a simplex system --> Initial apply was successful
2. lock/unlock controller

TC-name: test_lock_unlock_host[controller]

Expected Behavior
------------------
2. Lock/unlock succeeded and the system remained healthy afterward.

Actual Behavior
----------------
2. Lock/unlock succeeded, but platform-integ-apps went to apply-failed and the tiller pod was stuck at MatchNodeSelector.

Reproducibility
---------------
Happened 2 out of 3 times on simplex systems.

System Configuration
--------------------
One node system
Lab-name: wcp122

Branch/Pull Time/Commit
-----------------------
2019-10-23_20-00-00

Last Pass
---------
2019-10-21_20-00-00 on same system

Timestamp/Logs
--------------
[2019-10-24 07:04:45,331] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-lock controller-0'

[2019-10-24 07:05:24,844] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'

[2019-10-24 07:14:26,862] 433 DEBUG MainThread ssh.expect :: Output:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system tiller-deploy-d6b59fcb-z4jzg 0/1 MatchNodeSelector 0 34m <none> controller-0 <none> <none>

[2019-10-24 07:14:48,208] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-list'
+---------------------+---------+-------------------------------+---------------+--------------+------------------------------------------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+---------+-------------------------------+---------------+--------------+------------------------------------------+
| platform-integ-apps | 1.0-8 | platform-integration-manifest | manifest.yaml | apply-failed | operation aborted, check logs for detail |
+---------------------+---------+-------------------------------+---------------+--------------+------------------------------------------+

Test Activity
-------------
Sanity

Revision history for this message
Bob Church (rchurch) wrote :

It looks like an endpoint/firewall update may be impacting an application apply that is in progress.

# Initial apply works during provisioning
2019-10-24 06:56:58.365 110188 INFO sysinv.conductor.manager [-] Platform managed application platform-integ-apps: Applying...
2019-10-24 06:56:58.640 110188 INFO sysinv.conductor.kube_app [-] Register the initial abort status of app platform-integ-apps
2019-10-24 06:56:58.940 110188 INFO sysinv.conductor.kube_app [-] Application platform-integ-apps (1.0-8) apply started.
2019-10-24 06:58:09.356 110188 INFO sysinv.conductor.kube_app [-] All docker images for application platform-integ-apps were successfully downloaded in 70 seconds
2019-10-24 06:58:33.243 110188 INFO sysinv.conductor.kube_app [-] Application manifest /manifests/platform-integ-apps/1.0-8/platform-integ-apps-manifest.yaml was successfully applied/re-applied.
2019-10-24 06:58:33.244 110188 INFO sysinv.conductor.kube_app [-] Exiting progress monitoring thread for app platform-integ-apps
2019-10-24 06:58:33.549 110188 INFO sysinv.conductor.kube_app [-] Application platform-integ-apps (1.0-8) apply completed.

# An override change has been detected. Not sure why this is the case. Needs investigation…
2019-10-24 07:13:16.722 102497 INFO sysinv.conductor.manager [-] There has been an overrides change, setting up reapply of platform-integ-apps

# Firewall update is triggered
2019-10-24 07:13:16.726 102497 INFO sysinv.agent.rpcapi [-] config_apply_runtime_manifest: fanout_cast: sending config 66c95e55-43a1-4b79-847d-43e6960123d2 {'classes': ['openstack::keystone::endpoint::runtime', 'platform::firewall::runtime', 'platform::sysinv::runtime'], 'force': False, 'personalities': ['controller'], 'host_uuids': [u'4624ddd2-6b83-4e12-ada6-f6862e120509']} to agent
2019-10-24 07:13:16.728 22171 INFO sysinv.agent.manager [req-337b5587-475e-4645-8aee-9b8013fcc669 admin None] config_apply_runtime_manifest: 66c95e55-43a1-4b79-847d-43e6960123d2 {u'classes': [u'openstack::keystone::endpoint::runtime', u'platform::firewall::runtime', u'platform::sysinv::runtime'], u'force': False, u'personalities': [u'controller'], u'host_uuids': [u'4624ddd2-6b83-4e12-ada6-f6862e120509']} controller

2019-10-24 07:13:31.950 102497 INFO sysinv.conductor.manager [req-de8da658-fc3b-423c-b47e-eb2ff9cf9342 admin admin] Updating platform data for host: 4624ddd2-6b83-4e12-ada6-f6862e120509 with: {u'availability': u'services-enabled'}
2019-10-24 07:13:32.171 102497 INFO sysinv.helm.manifest_base [req-de8da658-fc3b-423c-b47e-eb2ff9cf9342 admin admin] Delete manifest file /opt/platform/armada/19.10/platform-integ-apps/1.0-8/platform-integ-apps-manifest-del.yaml generated
2019-10-24 07:13:32.172 102497 INFO sysinv.conductor.manager [req-de8da658-fc3b-423c-b47e-eb2ff9cf9342 admin admin] There has been an overrides change, setting up reapply of platform-integ-apps

# Re-apply occurs due to reapply flag being raised
2019-10-24 07:14:13.743 102497 INFO sysinv.conductor.manager [-] Reapplying platform-integ-apps app
2019-10-24 07:14:13.747 102497 INFO sysinv.conductor.kube_app [-] Register the initial abort status of app platform-integ-apps
2019-10-24 07:14:14.054 102497 INFO sy...


Revision history for this message
Yang Liu (yliu12) wrote :

Removed "tiller pod" portion from title - the tiller status may not be related to this LP since there are more than 1 tiller pods on system and one of them seems to be running.

There's another LP for multiple tiller pods - https://bugs.launchpad.net/starlingx/+bug/1848033

summary: - platform-integ-apps apply-failed after lock/unlock controller - tiller
- pod stuck at MatchNodeSelector
+ platform-integ-apps apply-failed after lock/unlock controller
Yang Liu (yliu12)
description: updated
Revision history for this message
Bob Church (rchurch) wrote :

Attempting a manual reapply fails:

2019-10-24 16:42:21.443 11 ERROR armada.handlers.wait [-] [chart=kube-system-rbd-provisioner]: Timed out waiting for pods (namespace=kube-system, labels=(app=rbd-provisioner)). These pods were not ready=['rbd-provisioner-7484d49cf6-k9bw8']
2019-10-24 16:42:21.443 11 ERROR armada.handlers.armada [-] Chart deploy [kube-system-rbd-provisioner] failed: armada.exceptions.k8s_exceptions.KubernetesWatchTimeoutException: Timed out waiting for pods (namespace=kube-system, labels=(app=rbd-provisioner)). These pods were not ready=['rbd-provisioner-7484d49cf6-k9bw8']

Checking pod status, I observe some rather old pods that are stuck waiting on MatchNodeSelector:

kube-system rbd-provisioner-7484d49cf6-k9bw8 0/1 MatchNodeSelector 0 9h <none> controller-0 <none> <none>
kube-system rbd-provisioner-7484d49cf6-vlr62 1/1 Running 1 9h 172.16.192.99 controller-0 <none> <none>
kube-system storage-init-rbd-provisioner-xwnrl 0/1 Completed 0 9h 172.16.192.72 controller-0 <none> <none>
kube-system tiller-deploy-d6b59fcb-j8q2s 1/1 Running 1 9h 192.168.204.3 controller-0 <none> <none>
kube-system tiller-deploy-d6b59fcb-ntd2v 0/1 MatchNodeSelector 0 9h <none> controller-0 <none> <none>
kube-system tiller-deploy-d6b59fcb-z4jzg 0/1 MatchNodeSelector 0 10h <none> controller-0 <none> <none>

Looking at the status of the pods, it looks like the originally applied pods may have been scheduled and started, but eventually failed the MatchNodeSelector node predicate after the host reboot.

2019-10-24 06:58:33.549 110188 INFO sysinv.conductor.kube_app [-] Application platform-integ-apps (1.0-8) apply completed.

Name: rbd-provisioner-7484d49cf6-k9bw8
Namespace: kube-system
Priority: 0
Node: controller-0/
Start Time: Thu, 24 Oct 2019 06:58:10 +0000
Labels: app=rbd-provisioner
                pod-template-hash=7484d49cf6
Annotations: cni.projectcalico.org/podIP: 172.16.192.78/32
Status: Failed
Reason: MatchNodeSelector
Message: Pod Predicate MatchNodeSelector failed

Another provisioner is started later, but does not appear to be associated with a particular re-apply:

Name: rbd-provisioner-7484d49cf6-vlr62
Namespace: kube-system
Priority: 0
Node: controller-0/192.168.204.3
Start Time: Thu, 24 Oct 2019 07:30:51 +0000
Labels: app=rbd-provisioner
              pod-template-hash=7484d49cf6
Annotations: cni.projectcalico.org/podIP: 172.16.192.99/32
Status: Running

Revision history for this message
Bob Church (rchurch) wrote :

Killing the stuck pod allows the app reapply to succeed:

[sysadmin@controller-0 ~(keystone_admin)]$ kubectl delete pods -nkube-system rbd-provisioner-7484d49cf6-k9bw8

2019-10-24 17:35:50.842 96382 INFO sysinv.conductor.kube_app [-] Application platform-integ-apps (1.0-8) apply started.
2019-10-24 17:35:50.868 96382 INFO sysinv.conductor.kube_app [-] Generating application overrides...
2019-10-24 17:35:51.017 96382 INFO sysinv.helm.manifest_base [req-ef2900cd-a6a8-460d-9743-27eac307844f admin admin] Delete manifest file /opt/platform/armada/19.10/platform-integ-apps/1.0-8/platform-integ-apps-manifest-del.yaml generated
2019-10-24 17:35:51.017 96382 INFO sysinv.conductor.kube_app [-] Application overrides generated.
2019-10-24 17:35:51.043 96382 INFO sysinv.conductor.kube_app [-] Armada manifest file has no img tags for chart helm-toolkit
2019-10-24 17:35:51.070 96382 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/quay.io/external_storage/rbd-provisioner:v2.1.1-k8s1.11 download started from local registry
2019-10-24 17:35:51.072 96382 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/ceph-config-helper:v1.15.0 download started from local registry
2019-10-24 17:35:51.438 96382 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/quay.io/external_storage/rbd-provisioner:v2.1.1-k8s1.11 download succeeded in 0 seconds
2019-10-24 17:35:51.452 96382 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/ceph-config-helper:v1.15.0 download succeeded in 0 seconds
2019-10-24 17:35:51.452 96382 INFO sysinv.conductor.kube_app [-] All docker images for application platform-integ-apps were successfully downloaded in 0 seconds
2019-10-24 17:35:51.458 96382 INFO sysinv.conductor.kube_app [-] Armada apply command = /bin/bash -c 'set -o pipefail; armada apply --enable-chart-cleanup --debug /manifests/platform-integ-apps/1.0-8/platform-integ-apps-manifest.yaml --values /overrides/platform-integ-apps/1.0-8/kube-system-rbd-provisioner.yaml --values /overrides/platform-integ-apps/1.0-8/kube-system-ceph-pools-audit.yaml --values /overrides/platform-integ-apps/1.0-8/helm-toolkit-helm-toolkit.yaml --tiller-host tiller-deploy.kube-system.svc.cluster.local | tee /logs/platform-integ-apps-apply.log'
2019-10-24 17:35:52.456 96382 INFO sysinv.conductor.kube_app [-] Starting progress monitoring thread for app platform-integ-apps
2019-10-24 17:35:52.613 96382 INFO sysinv.conductor.kube_app [-] processing chart: stx-ceph-pools-audit, overall completion: 100.0%
2019-10-24 17:35:53.265 96382 INFO sysinv.conductor.kube_app [-] Application manifest /manifests/platform-integ-apps/1.0-8/platform-integ-apps-manifest.yaml was successfully applied/re-applied.
2019-10-24 17:35:53.265 96382 INFO sysinv.conductor.kube_app [-] Exiting progress monitoring thread for app platform-integ-apps
2019-10-24 17:35:53.546 96382 INFO sysinv.conductor.kube_app [-] Application platform-integ-apps (1.0-8) apply completed.
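
For reference, the manual cleanup above can be scripted. The following is a minimal sketch using the official kubernetes Python client (illustrative only; it assumes the client is installed and a kubeconfig is reachable, and is not the sysinv code):

# Sketch: find and delete Failed pods stuck with reason MatchNodeSelector.
# Illustrative only; assumes the 'kubernetes' Python client is installed.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when run in-cluster
v1 = client.CoreV1Api()

# Failed MatchNodeSelector pods are not garbage-collected by k8s, so they
# must be removed by hand before tiller considers the release healthy.
failed = v1.list_namespaced_pod("kube-system", field_selector="status.phase=Failed")
for pod in failed.items:
    if pod.status.reason == "MatchNodeSelector":
        print("deleting %s/%s" % (pod.metadata.namespace, pod.metadata.name))
        v1.delete_namespaced_pod(pod.metadata.name, pod.metadata.namespace)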

Revision history for this message
Wendy Mitchell (wmitchellwr) wrote :

Hit this issue again after a lock/unlock on the single-node system.
Load: 2019-10-23_20-00-00

$ system application-list
...
| platform-integ-apps | 1.0-8 | platform-integration-manifest | manifest.yaml | apply-failed | operation aborted, check logs for detail

2019-10-24 20:09:03.486 11 ERROR armada.handlers.wait [-] [chart=kube-system-rbd-provisioner]: Timed out waiting for pods (namespace=kube-system, labels=(app=rbd-provisioner)). These pods were not ready=['rbd-provisioner-7484d49cf6-dfx6h']
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada [-] Chart deploy [kube-system-rbd-provisioner] failed: armada.exceptions.k8s_exceptions.KubernetesWatchTimeoutException: Timed out waiting for pods (namespace=kube-system, labels=(app=rbd-provisioner)). These pods were not ready=['rbd-provisioner-7484d49cf6-dfx6h']
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada Traceback (most recent call last):
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/armada.py", line 225, in handle_result
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada result = get_result()
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/armada.py", line 236, in <lambda>
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada if (handle_result(chart, lambda: deploy_chart(chart))):
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/armada.py", line 214, in deploy_chart
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada chart, cg_test_all_charts, prefix, known_releases)
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/chart_deploy.py", line 248, in execute
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada chart_wait.wait(timer)
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/wait.py", line 134, in wait
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada wait.wait(timeout=timeout)
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/wait.py", line 294, in wait
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada modified = self._wait(deadline)
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada File "/usr/local/lib/python3.6/dist-packages/armada/handlers/wait.py", line 354, in _wait
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada raise k8s_exceptions.KubernetesWatchTimeoutException(error)
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada armada.exceptions.k8s_exceptions.KubernetesWatchTimeoutException: Timed out waiting for pods (namespace=kube-system, labels=(app=rbd-provisioner)). These pods were not ready=['rbd-provisioner-7484d49cf6-dfx6h']
2019-10-24 20:09:03.487 11 ERROR armada.handlers.armada
2019-10-24 20:09:03.489 11 ERROR armada.handlers.armada [-] Chart deploy(s) failed: ['kube-system-rbd-provisioner']
2019-10-24 20:09:03.979 11 ERROR armada.cli [-] Caught internal exception: armada.exceptions...


Revision history for this message
Frank Miller (sensfan22) wrote :

Assigning to Bob to triage.

Changed in starlingx:
assignee: nobody → Bob Church (rchurch)
Revision history for this message
Ghada Khalil (gkhalil) wrote :

stx.3.0 / high priority - issue related to container recovery and occurs frequently

Changed in starlingx:
importance: Undecided → High
tags: added: stx.containers
Changed in starlingx:
status: New → Triaged
tags: added: stx.3.0
Yang Liu (yliu12)
tags: added: stx.retestneeded
Revision history for this message
Bob Church (rchurch) wrote :

I applied the following upstream proposed fix: https://github.com/kubernetes/kubernetes/pull/80976

With this incorporated into the build and installed in my test lab, I ran the lab in a continual lock/unlock cycle for over 36 hours without seeing the MatchNodeSelector issue. Prior to including this patch, I'd see the issue within ~45 minutes.

Possible next steps here:

1) Live with the issue occurring periodically, wait for the fix to land upstream (if it does), and pull it in on the following rebase.

2) Patch k8s in StarlingX with this change and evaluate over the coming weeks whether it fixes the issue.

Revision history for this message
Frank Miller (sensfan22) wrote :

Changing the tag for this issue from stx.3.0 to stx.4.0, as the solution requires moving to a new version of kubernetes, which won't happen until stx.4.0.

Also lowering the priority to medium as a workaround exists: delete any pods stuck in the MatchNodeSelector state.

tags: added: stx.4.0
removed: stx.3.0
Changed in starlingx:
importance: High → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/707571

Changed in starlingx:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/707571
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=2b49e9f3f93c9913961b437d4e51d1e7d46f1222
Submitter: Zuul
Branch: master

commit 2b49e9f3f93c9913961b437d4e51d1e7d46f1222
Author: Robert Church <email address hidden>
Date: Thu Feb 13 10:00:56 2020 -0600

    Workaround for cleaning up MatchNodeSelector pods after host reboot

    Added a K8sPodOperator class to look for and remove Failed pods with a
    MatchNodeSelector reason.

    MatchNodeSelector pods related to applications will not be removed by
    K8S automatically. These pods may block subsequent application applies
    as tiller expects these pods to be in a non-failed state.

    A check for this condition is added in two locations:
    - to the _k8s_application_audit() which is run immediately on
      sysinv-conductor startup and runs every minute. This runs 4 times in a
      5 minute window at startup on a simplex install. This should catch all
      cases unless there is a delay accessing the k8s API that lasts longer
      than 5 minutes at startup.
    - to the application-apply path. This would cover any case that occurs
      after the initial 5 minute conductor startup OR any occurrence on a
      non-simplex installation (so far only observed on AIO-SX)

    NOTE: This commit will be reverted once a proper upstream k8s fix is
    provided.

    Related upstream bugs:
    - https://github.com/kubernetes/kubernetes/issues/80745
    - https://github.com/kubernetes/kubernetes/issues/85334

    The following PR was tested and fixed this issue but has not landed
    upstream in a new k8s release:
    - https://github.com/kubernetes/kubernetes/pull/80976

    Change-Id: Ia5418794a44e7821933e8335d5c5db25b58a739f
    Closes-Bug: #1849688
    Signed-off-by: Robert Church <email address hidden>
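
The commit message above gives the shape of the workaround; below is a rough, hypothetical sketch of such a K8sPodOperator-style check (the actual implementation in starlingx/config may differ in structure and naming):

# Hypothetical sketch of the cleanup hook described in the commit above;
# the real K8sPodOperator in starlingx/config may differ.
from kubernetes import client, config


class K8sPodOperator(object):
    """Remove Failed pods left behind by a scheduler predicate failure."""

    CLEANUP_REASONS = ("MatchNodeSelector",)

    def __init__(self):
        config.load_kube_config()
        self._api = client.CoreV1Api()

    def delete_failed_predicate_pods(self):
        # Per the commit message, invoked from the periodic
        # _k8s_application_audit() and from the application-apply path.
        failed = self._api.list_pod_for_all_namespaces(
            field_selector="status.phase=Failed")
        for pod in failed.items:
            if pod.status.reason in self.CLEANUP_REASONS:
                self._api.delete_namespaced_pod(
                    pod.metadata.name, pod.metadata.namespace)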

Changed in starlingx:
status: In Progress → Fix Released
Revision history for this message
Yang Liu (yliu12) wrote :

Have not seen this issue in recent simplex sanity runs. Closing.

tags: removed: stx.retestneeded
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/716137

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (f/centos8)

Reviewed: https://review.opendev.org/716137
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=cb4cf4299c2ec10fb2eb03cdee3f6d78a6413089
Submitter: Zuul
Branch: f/centos8

commit 16477935845e1c27b4c9d31743e359b0aa94a948
Author: Steven Webster <email address hidden>
Date: Sat Mar 28 17:19:30 2020 -0400

    Fix SR-IOV runtime manifest apply

    When an SR-IOV interface is configured, the platform's
    network runtime manifest is applied in order to apply the virtual
    function (VF) config and restart the interface. This results in
    sysinv being able to determine and populate the puppet hieradata
    with the virtual function PCI addresses.

    A side effect of the network manifest apply is that potentially
    all platform interfaces may be brought down/up if it is determined
    that their configuration has changed. This will likely be the case
    for a system which configures SR-IOV interfaces before initial
    unlock.

    A few issues have been encountered because of this, with some
    services not behaving well when the interface they are communicating
    over suddenly goes down.

    This commit makes the SR-IOV VF configuration much more targeted
    so that only the operation of setting the desired number of VFs
    is performed.

    Closes-Bug: #1868584
    Depends-On: https://review.opendev.org/715669
    Change-Id: Ie162380d3732eb1b6e9c553362fe68cbc313ae2b
    Signed-off-by: Steven Webster <email address hidden>

commit 45c9fe2d3571574b9e0503af108fe7c1567007db
Author: Zhipeng Liu <email address hidden>
Date: Thu Mar 26 01:58:34 2020 +0800

    Add ipv6 support for novncproxy_base_url.

    For ipv6 address, we need url with below format
    [ip]:port

    Partial-Bug: 1859641

    Change-Id: I01a5cd92deb9e88c2d31bd1e16e5bce1e849fcc7
    Signed-off-by: Zhipeng Liu <email address hidden>

commit d119336b3a3b24d924e000277a37ab0b5f93aae1
Author: Andy Ning <email address hidden>
Date: Mon Mar 23 16:26:21 2020 -0400

    Fix timeout waiting for CA cert install during ansible replay

    During ansible bootstrap replay, the ssl_ca_complete_flag file is
    removed. It expects puppet platform::config::runtime manifest apply
    during system CA certificate install to re-generate it. So this commit
    updated conductor manager to run that puppet manifest even if the CA cert
    has already installed so that the ssl_ca_complete_flag file is created
    and makes ansible replay to continue.

    Change-Id: Ic9051fba9afe5d5a189e2be8c8c2960bdb0d20a4
    Closes-Bug: 1868585
    Signed-off-by: Andy Ning <email address hidden>

commit 24a533d800b2c57b84f1086593fe5f04f95fe906
Author: Zhipeng Liu <email address hidden>
Date: Fri Mar 20 23:10:31 2020 +0800

    Fix rabbitmq could not bind port to ipv6 address issue

    When we use Armada to deploy openstack service for ipv6, rabbitmq
    pod could not start listen on [::]:5672 and [::]:15672.
    For ipv6, we need an override for configuration file.

    Upstream patch link is:
    https://review.opendev.org/#/c/714027/

    Test pass for deploying rabbitmq service on both ipv...

tags: added: in-f-centos8
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to config (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/721163

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to config (master)

Reviewed: https://review.opendev.org/721163
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=5c1361b0e81f53349d0d6715f7b627b4456147a0
Submitter: Zuul
Branch: master

commit 5c1361b0e81f53349d0d6715f7b627b4456147a0
Author: Robert Church <email address hidden>
Date: Sun Apr 19 06:23:44 2020 -0400

    Update MatchNodeSelector recovery logic for NodeAffinity status

    NodeAffinity pods related to applications will not be removed by
    K8S automatically. These pods may block subsequent application applies
    as tiller expects these pods to be in a non-failed state.

    This update will now look for NodeAffinity pods when the sysinv
    conductor starts. This is no longer limited to simplex nodes. This
    behavior is now observed on simplex and duplex controller configurations
    as of the upversion to k8s v1.18.1.

    Change-Id: I6384ffd1d14ac105e26b83c02aaa8f090e1fdde1
    Story: 2006999
    Task: 39475
    Related-Bug: #1849688
    Signed-off-by: Robert Church <email address hidden>
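
In terms of the earlier sketch, this change amounts to widening the set of failure reasons that trigger cleanup (hypothetical; see the review above for the actual change):

# As of k8s v1.18.1 the predicate failure is reported as NodeAffinity
# rather than MatchNodeSelector, so the cleanup matches both reasons
# (names here are illustrative).
CLEANUP_REASONS = ("MatchNodeSelector", "NodeAffinity")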

Revision history for this message
Peng Peng (ppeng) wrote :

Verify 1873933 on

Lab: WP_8_12
Load: 2020-04-25_13-17-56

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to config (f/centos8)

Related fix proposed to branch: f/centos8
Review: https://review.opendev.org/729812

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to config (f/centos8)

Reviewed: https://review.opendev.org/729812
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=539d476456277c22d0dcbc3cbbc832e623242264
Submitter: Zuul
Branch: f/centos8

commit 320cc40de8518787c2be234d7fdf88ec0a462df2
Author: Don Penney <email address hidden>
Date: Wed May 13 13:06:11 2020 -0400

    Add auto-versioning to starlingx/config packages

    This update makes use of the PKG_GITREVCOUNT variable to auto-version
    the packages in this repo.

    Change-Id: I3a2c8caeb4b4647608978b1f2ccfcf0661508803
    Depends-On: https://review.opendev.org/727837
    Story: 2006166
    Task: 39766
    Signed-off-by: Don Penney <email address hidden>

commit d9f2aea0fb228ed69eb9c9262e29041eedabc15d
Author: Sharath Kumar K <email address hidden>
Date: Wed Apr 22 16:22:22 2020 +0200

    De-branding in starlingx/config: CGCS -> StarlingX

    1. Rename CGCS to StarlingX for .spec files

    Test:
    After the de-brand change, bootimage.iso has been built in the flock
    Layer and installed on the dev machine to validate the changes.

    Please note, doing de-brand changes in batches, this is batch9 changes.

    Story: 2006387
    Task: 39524

    Change-Id: Ia1fe0f2baafb78c974551100f16e6a7d99882f15
    Signed-off-by: Sharath Kumar K <email address hidden>

    De-branding in starlingx/config: CGCS -> StarlingX

    1. Rename CGCS to StarlingX for .spec file
    2. Rename TIS to StarlingX for .service files

    Test:
    After the de-brand change, bootimage.iso has been built in the flock
    Layer and installed on the dev machine to validate the changes.

    Please note, doing de-brand changes in batches, this is batch10 changes.

    Story: 2006387
    Task: 36202

    Change-Id: I404ce0da2621495175ad31489e9ad6f7b0211e26
    Signed-off-by: Sharath Kumar K <email address hidden>

commit d141e954fa6bbf688929ec90d1b6604a97792c43
Author: Teresa Ho <email address hidden>
Date: Tue Mar 31 10:08:57 2020 -0400

    Sysinv extensions for FPGA support

    This update adds cli and restapi to support FPGA device
    programming.

    CLI commands:
    system device-image-apply
    system device-image-create
    system device-image-delete
    system device-image-list
    system device-image-remove
    system device-image-show
    system device-image-state-list
    system device-label-list
    system host-device-image-update
    system host-device-image-update-abort
    system host-device-label-assign
    system host-device-label-list
    system host-device-label-remove

    Story: 2006740
    Task: 39498

    Change-Id: I556c2e7a51b3931b5a66ab27b67f51e3a8aebd9f
    Signed-off-by: Teresa Ho <email address hidden>

commit 491cca42ed854d2cb3ee3646b93c56a4f45f563c
Author: Elena Taivan <email address hidden>
Date: Wed Apr 29 11:25:26 2020 +0000

    Qcow2 conversion to raw can be done using 'image-conversion' filesystem

    1. Conversion filesystem can be added before/after
       stx-openstack is applied
    2. If conversion filesystem is added after stx-openstack
       is applied, changes to stx-openstack will only take effec...
