Active controller became degraded after lock/unlock compute node

Bug #1856064 reported by Peng Peng on 2019-12-11
This bug affects 1 person
Affects: StarlingX
Importance: High
Assigned to: Dan Voiculeasa

Bug Description

Brief Description
-----------------
After locking and unlocking one compute node, the active controller became degraded and alarm 200.006 was raised.
After a force reboot of the active controller, the system recovered and the alarm was cleared.
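
A quick way to confirm the recovery described above (the same CLI used throughout this report, run from a keystone_admin session; the --os-* credential options are omitted):

    system host-list                # all hosts should return to unlocked/enabled/available
    fm alarm-list --nowrap --uuid   # alarm 200.006 should no longer be listed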

Severity
--------
Major

Steps to Reproduce
------------------
Lock and then unlock one compute node on a multi-node system, as in the Brief Description; a command sketch follows the TC-name below.

TC-name: mtc/test_lock_unlock_host.py::test_lock_unlock_host[compute]
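
A minimal command sketch of the scenario, using the same CLI calls captured in the logs below (host names match the WCP_3-6 layout; the --os-* credential options are omitted here, assuming a keystone_admin session):

    system host-lock compute-0
    system host-list                 # wait for compute-0: locked/disabled/online
    system host-unlock compute-0
    system host-list                 # controller-0 unexpectedly shows 'degraded'
    fm alarm-list --nowrap --uuid    # look for alarm 200.006 (host=controller-0.process=ceph)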

Expected Behavior
------------------
The lock/unlock completes and all hosts, including both controllers, remain unlocked/enabled/available with no alarms raised.

Actual Behavior
----------------
The active controller (controller-0) became degraded and alarm 200.006 was raised against its 'ceph' process.

Reproducibility
---------------
Unknown; first time this has been seen in sanity, will monitor.

System Configuration
--------------------
Multi-node system
IPv4

Lab-name: WCP_3-6

Branch/Pull Time/Commit
-----------------------
2019-12-10_20-00-00

Last Pass
---------
2019-12-10_20-00-00 on WP_8-12

Timestamp/Logs
--------------
[2019-12-11 08:58:20,124] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.1:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-list'
[2019-12-11 08:58:21,300] 433 DEBUG MainThread ssh.expect :: Output:
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | enabled | available |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | controller-1 | controller | unlocked | enabled | available |
+----+--------------+-------------+----------------+-------------+--------------+

[2019-12-11 08:58:22,661] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.1:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-lock compute-0'

[2019-12-11 08:59:40,320] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.1:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock compute-0'

[2019-12-11 09:05:59,264] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.1:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-list'
[2019-12-11 09:06:00,442] 433 DEBUG MainThread ssh.expect :: Output:
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | degraded |
| 2 | compute-0 | worker | unlocked | enabled | available |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | controller-1 | controller | unlocked | enabled | available |
+----+--------------+-------------+----------------+-------------+--------------+
[sysadmin@controller-0 ~(keystone_admin)]$

[2019-12-11 09:11:08,717] 311 DEBUG MainThread ssh.send :: Send 'fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.1:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne alarm-list --nowrap --uuid'
[2019-12-11 09:11:09,693] 433 DEBUG MainThread ssh.expect :: Output:
+--------------------------------------+----------+------------------------------------------------------------------------------------------------------------------------+--------------------------------+----------+----------------------------+
| UUID | Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+--------------------------------------+----------+------------------------------------------------------------------------------------------------------------------------+--------------------------------+----------+----------------------------+
| 26e10dab-15dd-45ee-b5ac-4ae73bb5db8d | 200.006 | controller-0 is degraded due to the failure of its 'ceph' process. Auto recovery of this major process is in progress. | host=controller-0.process=ceph | major | 2019-12-11T09:00:12.697608 |
+--------------------------------------+----------+------------------------------------------------------------------------------------------------------------------------+--------------------------------+----------+----------------------------+
[sysadmin@controller-0 ~(keystone_admin)]$
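
As a hedged starting point for investigating the failed 'ceph' process behind alarm 200.006 (standard Ceph/Linux commands, not from the original report; the log path is the Ceph default and may differ by release):

    ceph -s                                                   # overall cluster health on the degraded controller
    ps -ef | grep ceph-mon                                    # is the monitor process actually running?
    sudo tail -n 50 /var/log/ceph/ceph-mon.controller-0.log   # recent monitor errors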

Test Activity
-------------
Sanity

Yang Liu (yliu12) on 2019-12-11
description: updated
Ghada Khalil (gkhalil) wrote :

Waiting for triage by Dan to understand whether this issue was introduced by recent code changes related to: https://review.opendev.org/#/c/695917/

Changed in starlingx:
status: New → Triaged
tags: added: stx.config stx.storage
Changed in starlingx:
assignee: nobody → Dan Voiculeasa (dvoicule)
status: Triaged → New
tags: removed: stx.config
Yang Liu (yliu12) on 2019-12-12
tags: added: stx.retestneeded
Ghada Khalil (gkhalil) wrote :

As per Frank Miller, this was introduced by https://review.opendev.org/#/c/695917/
Given that this change is in stx.3.0, we need this LP to be fixed in the next stx.3.0 maintenance release.

Changed in starlingx:
importance: Undecided → High
status: New → Triaged
tags: added: stx.3.0
Peng Peng (ppeng) wrote :

Issue appears to have reproduced on:
Lab: WCP_71_75
Load: 2019-12-22_20-00-00

After a compute node force reboot, the active controller became degraded.

[2019-12-23 09:13:41,521] 166 INFO MainThread host_helper.reboot_hosts:: Rebooting compute-0
[2019-12-23 09:13:41,521] 311 DEBUG MainThread ssh.send :: Send 'sudo reboot -f'

[2019-12-23 09:15:51,899] 476 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-12-23 09:15:51,899] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-list'
[2019-12-23 09:15:53,064] 433 DEBUG MainThread ssh.expect :: Output:
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | disabled | offline |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | compute-2 | worker | unlocked | enabled | available |
| 5 | controller-1 | controller | unlocked | enabled | degraded |
+----+--------------+-------------+----------------+-------------+--------------+
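
A simple polling loop that would catch this transition after the forced reboot (sketch only, assuming a keystone_admin session; not part of the original test):

    # poll until any host reports 'degraded' availability
    while ! system host-list | grep -q degraded; do
        sleep 10
    done
    system host-list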

Frank Miller (sensfan22) wrote :

Dan has a proposed fix in stx-ceph:
https://github.com/starlingx-staging/stx-ceph/pull/36
