dex pod is not scheduled on active controller after network restart

Bug #2058075 reported by João Victor Portal
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Low
Assigned to: João Victor Portal
Milestone: (none)

Bug Description

Brief Description
-----------------
A dex pod (e.g. oidc-dex-5cdc87d8c9-5z5fs) is not scheduled on controller-0 after a network restart. The podAntiAffinity rules do not work correctly, so two dex pods can be scheduled on the same controller, leaving the other controller with no dex pod.

Severity
--------
Minor

Steps to Reproduce
------------------
1. Set up any multi-node lab (DX/Std/DC with multi-node System Controller) with OIDC.
2. Create an LDAP user and verify that it has kubectl access after OIDC token generation.
3. Restart networking on the active controller.

Expected Behavior
-----------------
Once the controller is online and available, a dex pod should be scheduled on it.

The LDAP user should be able to log in to the controller and access k8s after generating a new token.

Actual Behavior
---------------
A dex pod is not scheduled on the controller whose network was restarted. Both dex pods are scheduled on controller-1 (the lab has 2 controllers and therefore 2 dex pods).

No impact on LDAP login. The user is able to log in, generate a token and access the k8s CLI.

Reproducibility
---------------
Reproducible

Executed the network restart 2 times and observed the same behavior.

System Configuration
--------------------
DX/Std/DC with multinode System Controller

Branch/Pull Time/Commit
-----------------------
NA.

Last Pass
---------
NA.

Timestamp/Logs
--------------
Before network restart:
sysadmin@controller-0:~$ kubectl get pod -A -o wide | grep oidc
kube-system oidc-dex-5cdc87d8c9-54lwv 1/1 Running 0 15h aefd:206::8e22:765f:6121:eb72 controller-0 <none> <none>
kube-system oidc-dex-5cdc87d8c9-pf8xc 1/1 Running 0 15h aefd:206::a4ce:fec1:5423:e328 controller-1 <none> <none>
kube-system stx-oidc-client-8485996446-7zmz4 1/1 Running 0 16h aefd:206::8e22:765f:6121:eb4d controller-0 <none> <none>
kube-system stx-oidc-client-8485996446-mw4hg 1/1 Running 0 16h aefd:206::a4ce:fec1:5423:e323 controller-1 <none> <none>
sysadmin@controller-0:~$ sudo systemctl restart networking
sysadmin@controller-0:~$

After controller-0 is online and available:
[sysadmin@controller-1 ~(keystone_admin)]$ kubectl get pod -A -o wide | grep oidc
kube-system oidc-dex-5cdc87d8c9-5z5fs 1/1 Running 0 12m aefd:206::a4ce:fec1:5423:e325 controller-1 <none> <none>
kube-system oidc-dex-5cdc87d8c9-pf8xc 1/1 Running 0 15h aefd:206::a4ce:fec1:5423:e328 controller-1 <none> <none>
kube-system stx-oidc-client-8485996446-7lk2d 1/1 Running 0 12m aefd:206::8e22:765f:6121:eb5b controller-0 <none> <none>
kube-system stx-oidc-client-8485996446-mw4hg 1/1 Running 0 16h aefd:206::a4ce:fec1:5423:e323 controller-1 <none> <none>
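
The co-location in the listing above can also be spotted mechanically. The following is a minimal sketch, not part of the original report: it assumes the whitespace-separated column layout of `kubectl get pod -A -o wide` shown above (node name in the eighth column), and the helper names `pods_per_node` and `crowded_nodes` are hypothetical.

```python
from collections import Counter

# Rows copied from the "after restart" listing above.
# Columns: NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED READINESS
KUBECTL_OUTPUT = """\
kube-system oidc-dex-5cdc87d8c9-5z5fs 1/1 Running 0 12m aefd:206::a4ce:fec1:5423:e325 controller-1 <none> <none>
kube-system oidc-dex-5cdc87d8c9-pf8xc 1/1 Running 0 15h aefd:206::a4ce:fec1:5423:e328 controller-1 <none> <none>
kube-system stx-oidc-client-8485996446-7lk2d 1/1 Running 0 12m aefd:206::8e22:765f:6121:eb5b controller-0 <none> <none>
kube-system stx-oidc-client-8485996446-mw4hg 1/1 Running 0 16h aefd:206::a4ce:fec1:5423:e323 controller-1 <none> <none>
"""


def pods_per_node(output, name_prefix):
    """Count pods whose name starts with name_prefix, grouped by node."""
    counts = Counter()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated rows
        # Assumes RESTARTS is a bare number; a value such as "1 (5m ago)"
        # would shift the columns and needs extra handling.
        name, node = fields[1], fields[7]
        if name.startswith(name_prefix):
            counts[node] += 1
    return counts


def crowded_nodes(output, name_prefix):
    """Return the nodes that run more than one matching pod."""
    counts = pods_per_node(output, name_prefix)
    return sorted(node for node, count in counts.items() if count > 1)


print(crowded_nodes(KUBECTL_OUTPUT, "oidc-dex-"))  # ['controller-1']
```

For the rows above, only the dex pods are reported as co-located; the stx-oidc-client pods remain spread across both controllers.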

Test Activity
-------------
Feature Testing

Workaround
----------
Delete the misplaced dex pod (one of the two running on the same controller). A new pod is then created and scheduled on controller-0.

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to docs (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/docs/+/913455

description: updated
description: updated
Michel Thebeau [WIND] (mthebeau) wrote :

It is unclear why the infrastructure didn't add a comment for this review:

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/oidc-auth-armada-app/+/913336

OpenStack Infra (hudson-openstack) wrote : Fix merged to oidc-auth-armada-app (master)

Reviewed: https://review.opendev.org/c/starlingx/oidc-auth-armada-app/+/913336
Committed: https://opendev.org/starlingx/oidc-auth-armada-app/commit/defdac3b1a8f1c1641d3deba782a2e72f062fa27
Submitter: "Zuul (22348)"
Branch: master

commit defdac3b1a8f1c1641d3deba782a2e72f062fa27
Author: Joao Victor Portal <email address hidden>
Date: Fri Mar 15 11:00:56 2024 -0300

    Add missing label to dex pods

    The podAntiAffinity rules of dex pods expect each dex pod to have the
    label "app: dex" to prevent more than one pod from being scheduled in
    the same node. However, this label was not present in dex pods. This
    commit fixes this.

    Test Plan:

    PASS: Successfully compile a new OIDC app tarball, upload it to a AIO-DX
    environment, test OIDC app functionality, restart the active controller
    network and verify that the dex pods never get scheduled to the same
    controller.

    Partial-Bug: 2058075

    Change-Id: Ibc948c68478f97563daf1dc523258500c278ab6a
    Signed-off-by: Joao Victor Portal <email address hidden>
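
The root cause described in the commit message above can be illustrated with a small simulation. This is a simplified sketch, not the Kubernetes scheduler: it assumes a required (hard) anti-affinity rule keyed on the node hostname with a `matchLabels`-style selector of `app: dex`, and the function names and the `app.kubernetes.io/name` label are hypothetical placeholders.

```python
def selector_matches(selector, labels):
    """True if every key/value pair of the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def feasible_nodes(available_nodes, running_pods, anti_affinity_selector):
    """Return the available nodes that host no pod matched by the selector.

    running_pods is a list of (node_name, labels) tuples.
    """
    blocked = {
        node
        for node, labels in running_pods
        if selector_matches(anti_affinity_selector, labels)
    }
    return [node for node in available_nodes if node not in blocked]


SELECTOR = {"app": "dex"}  # what the podAntiAffinity rule looks for

# The dex pod that keeps running on controller-1 during the network restart.
# Before the fix it lacks "app: dex"; the other label is a made-up example.
UNLABELED = [("controller-1", {"app.kubernetes.io/name": "dex"})]
LABELED = [("controller-1", {"app.kubernetes.io/name": "dex", "app": "dex"})]

# While controller-0 is unavailable, only controller-1 can be considered.
print(feasible_nodes(["controller-1"], UNLABELED, SELECTOR))  # ['controller-1']
print(feasible_nodes(["controller-1"], LABELED, SELECTOR))    # []

# Once controller-0 is back, the labeled case leaves it as the only choice.
print(feasible_nodes(["controller-0", "controller-1"], LABELED, SELECTOR))  # ['controller-0']
```

Without the label, the selector matches no running pod, so nothing stops the replacement pod from landing next to the existing one. With the label, and under the hard-rule assumption, the replacement has no feasible node until controller-0 returns.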

OpenStack Infra (hudson-openstack) wrote : Fix merged to docs (master)

Reviewed: https://review.opendev.org/c/starlingx/docs/+/913455
Committed: https://opendev.org/starlingx/docs/commit/d7693a255320355c31f250e029da69449750bd83
Submitter: "Zuul (22348)"
Branch: master

commit d7693a255320355c31f250e029da69449750bd83
Author: Joao Victor Portal <email address hidden>
Date: Fri Mar 15 19:50:57 2024 -0300

    Update default dex helm overrides

    A new override for dictionary "podLabels" was added in dex helm
    overrides. This change updates the documentation.

    Closes-Bug: 2058075

    Depends-On: https://review.opendev.org/c/starlingx/oidc-auth-armada-app/+/913336
    Change-Id: I6bf81391a28462b2adf8d72609bcb9140321d2e9
    Signed-off-by: Joao Victor Portal <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.10.0 stx.apps
Changed in starlingx:
assignee: nobody → João Victor Portal (jvictorp)