system helm override update for oidc-auth-apps failed due to sysinv RPC error

Bug #1875448 reported by ayyappa
This bug report is a duplicate of: Bug #1875891: helm cmd failed after host swact.
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Jerry Sun

Bug Description

Brief Description
-----------------
System helm override update for oidc-auth-apps fails on an IPv6 duplex lab

Note: Before applying oidc-auth-apps, the app values should be overridden with WAD (Windows Active Directory) values

Severity
--------
Major

Steps to Reproduce
------------------
1) Create the secrets
kubectl create secret tls local-dex.tls --cert=ssl/dex-cert.pem --key=ssl/dex-key.pem -n kube-system
kubectl create secret generic dex-client-secret --from-file=/home/sysadmin/ssl/dex-ca.pem -n kube-system
kubectl create secret generic wadcert --from-file=/home/sysadmin/ssl/AD_CA.cer -n kube-system

2) Override with the following values
config:
  expiry:
    idTokens: "20m"
  connectors:
  - type: ldap
    name: OpenLDAP
    id: ldap
    config:
      host: pv-windows-acti.cumulus.wrs.com:636
      rootCA: /etc/ssl/certs/adcert/AD_CA.cer
      insecureNoSSL: false
      insecureSkipVerify: false
      bindDN: cn=Administrator,cn=Users,dc=cumulus,dc=wrs,dc=com
      bindPW: Li69nux*
      usernamePrompt: Username
      userSearch:
        baseDN: ou=Users,ou=Titanium,dc=cumulus,dc=wrs,dc=com
        filter: "(objectClass=user)"
        username: sAMAccountName
        idAttr: sAMAccountName
        emailAttr: sAMAccountName
        nameAttr: displayName
      groupSearch:
        baseDN: ou=Groups,ou=Titanium,dc=cumulus,dc=wrs,dc=com
        filter: "(objectClass=group)"
        userAttr: DN
        groupAttr: member
        nameAttr: cn
extraVolumes:
- name: certdir
  secret:
    secretName: wadcert
extraVolumeMounts:
- name: certdir
  mountPath: /etc/ssl/certs/adcert
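
The overrides above only work if the connector's rootCA path sits inside a directory that is actually mounted from the wadcert secret. A minimal Python sketch of that consistency check follows. It is an illustration only, not something sysinv or dex is known to do: the function name root_ca_is_mounted is hypothetical, and the dict mirrors just the relevant keys of the YAML instead of loading the file.

```python
from pathlib import PurePosixPath

# Only the keys relevant to the check, copied from the override values above.
overrides = {
    "config": {
        "connectors": [
            {"type": "ldap", "config": {"rootCA": "/etc/ssl/certs/adcert/AD_CA.cer"}},
        ],
    },
    "extraVolumes": [{"name": "certdir", "secret": {"secretName": "wadcert"}}],
    "extraVolumeMounts": [{"name": "certdir", "mountPath": "/etc/ssl/certs/adcert"}],
}


def root_ca_is_mounted(values):
    """Return True if every connector rootCA lies under a mounted extra volume."""
    volumes = {v["name"] for v in values.get("extraVolumes", [])}
    mounts = [
        PurePosixPath(m["mountPath"])
        for m in values.get("extraVolumeMounts", [])
        if m["name"] in volumes  # a mount must refer to a declared volume
    ]
    for connector in values.get("config", {}).get("connectors", []):
        root_ca = connector.get("config", {}).get("rootCA")
        if root_ca is None:
            continue  # connector does not use a CA file
        # The CA file must have one of the mount paths among its parent dirs.
        if not any(mount in PurePosixPath(root_ca).parents for mount in mounts):
            return False
    return True


print(root_ca_is_mounted(overrides))
```

For the values in this report the check passes, so the override content itself is not an obvious cause of the failure.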

3) The following error appears when overriding with the above values

[2020-04-27 10:02:49,327] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[abcd:204::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne helm-override-update --values /home/sysadmin/ssl/dex-overrides.yaml oidc-auth-apps dex kube-system'
[2020-04-27 10:03:51,064] 436 DEBUG MainThread ssh.expect :: Output:
Timeout while waiting on RPC response - topic: "sysinv.conductor_manager", RPC method: "merge_overrides" info: "<unknown>"
[sysadmin@controller-1 ~(keystone_admin)]$
[2020-04-27 10:03:51,065] 314 DEBUG MainThread ssh.send :: Send 'echo $?'
[2020-04-27 10:03:51,168] 436 DEBUG MainThread ssh.expect :: Output:
1
[sysadmin@controller-1 ~(keystone_admin)]$
[2020-04-27 10:03:51,363] 60 DEBUG MainThread conftest.update_results:: ***Failure at test setup: /home/svc-cgcsauto/wassp-repos.new/testcases/cgcs/CGCSAuto/utils/cli.py:152: utils.exceptions.CLIRejected: CLI command is rejected.
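
When triaging automation failures like the one above, it can help to tell a sysinv RPC timeout apart from other CLI rejections. The sketch below is a hypothetical helper (parse_rpc_timeout is not part of CGCSAuto or sysinv) that matches the exact message text quoted in this log and extracts the topic and RPC method. It assumes the message format stays as shown here.

```python
import re

# Pattern for the sysinv RPC timeout text quoted in the log above.
RPC_TIMEOUT_RE = re.compile(
    r'Timeout while waiting on RPC response - '
    r'topic: "(?P<topic>[^"]*)", '
    r'RPC method: "(?P<method>[^"]*)" '
    r'info: "(?P<info>[^"]*)"'
)


def parse_rpc_timeout(cli_output):
    """Return a dict with topic/method/info if the timeout is present, else None."""
    match = RPC_TIMEOUT_RE.search(cli_output)
    return match.groupdict() if match else None


sample = (
    'Timeout while waiting on RPC response - topic: "sysinv.conductor_manager", '
    'RPC method: "merge_overrides" info: "<unknown>"'
)
print(parse_rpc_timeout(sample))
```

A result other than None for the output of any "system" command would suggest a problem with sysinv RPC in general, not with one application.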

Expected Behavior
------------------
The oidc-auth-apps application should be applied successfully

Actual Behavior
----------------
The helm override update for oidc-auth-apps failed with a sysinv RPC timeout

Reproducibility
---------------
100%

System Configuration
--------------------

Duplex system, wcp_78_79_ipv6

Branch/Pull Time/Commit
-----------------------
2020-04-25

Last Pass
---------
2020-02-24
The override values didn't take effect, but the apply didn't get rejected

Timestamp/Logs
--------------
2020-04-27 10:03:51,363

Test Activity
-------------
Automation

Workaround
----------
Haven't found any

Revision history for this message
ayyappa (mantri425) wrote :
description: updated
Revision history for this message
Ghada Khalil (gkhalil) wrote :

@Ayyappa, Does the oidc application apply successfully if you have no overrides?

Changed in starlingx:
assignee: nobody → Jerry Sun (jerry-sun-u)
description: updated
tags: added: stx.4.0 stx.apps stx.security
Changed in starlingx:
importance: Undecided → High
status: New → Triaged
ayyappa (mantri425)
description: updated
Revision history for this message
ayyappa (mantri425) wrote :

@Ghada, this is an automation failure on an IPv6 duplex lab. I tested override and apply on an IPv4 duplex lab and it seems to be working fine. I haven't tried the apply without an override.

Revision history for this message
Jerry Sun (jerry-sun-u) wrote :

From the error logs, I can only confirm what Ayyappa mentioned in 3): sysinv is having issues with RPC. This issue should exist for all application overrides, not just oidc.

As for the previous comment, we do not support applying the oidc-auth-apps application without overrides. The user should specify at least one back end (LDAP with Windows Active Directory, for example) before applying.

Ghada Khalil (gkhalil)
summary: - system helm override update for oidc-auth-apps failed
+ system helm override update for oidc-auth-apps failed due to sysinv RPC
+ error
Revision history for this message
Ghada Khalil (gkhalil) wrote :

@Ayyappa, can you repeat the automated test-case a number of times and provide info on how reproducible this issue is?

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Was there a swact initiated on this system before the application overrides started failing?
I'm wondering if this is a duplicate of https://bugs.launchpad.net/starlingx/+bug/1875891

Revision history for this message
ayyappa (mantri425) wrote :

@Ghada, yes, there were tests that initiated a swact before this

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Let's wait for the fix for https://bugs.launchpad.net/starlingx/+bug/1875891 and then retest this

Revision history for this message
Senthil Mukundakumar (smukunda) wrote :

The issue is reproducible on a non-Ceph SX system (SM3) using load 2020-04-30_20-00-00

Revision history for this message
Ghada Khalil (gkhalil) wrote :

@Senthil, what is the exact issue you are seeing? An override issue with OIDC only, or an override issue with multiple apps? Is helm functional?
Please provide more information and a new set of logs.

Revision history for this message
Senthil Mukundakumar (smukunda) wrote :

The override issue reproduced when applying the OIDC app. I have attached logs.
E Details: CLI 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[abcd:204::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne helm-override-update --values /home/sysadmin/ssl/dex-overrides.yaml oidc-auth-apps dex kube-system' failed to execute. Output: Timeout while waiting on RPC response - topic: "sysinv.conductor_manager", RPC method: "merge_overrides" info: "<unknown>"

Revision history for this message
Ghada Khalil (gkhalil) wrote :

@Senthil, was helm functional when you reproduced the issue?

Revision history for this message
Jerry Sun (jerry-sun-u) wrote :

@Senthil, the new logs show only that the same command failed. They do not include any other logs from the system, or answers to the other questions in the thread, specifically:

1) Is helm functional at the time this was reproduced?
    A simple sanity check is "helm list -a". It will show whether helm is down completely, but helm can still have issues without being entirely down.

2) Is this specific to oidc-auth-apps? (I suspect not, since it is a generic overrides command that failed.)
    You can override another application with some unused value:
    create a file (/home/sysadmin/cert-manager.yaml) with one line, "me: hungry" or something similar, then run
    system helm-override-update cert-manager cert-manager cert-manager --values /home/sysadmin/cert-manager.yaml
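
The two checks above could be scripted roughly as follows. This is a sketch under stated assumptions, not an existing tool: the function names are hypothetical, the commands and the file path are taken verbatim from this comment, and run_sanity_check is assumed to be called on the active controller with admin credentials already loaded.

```python
import subprocess
from pathlib import Path

# One-line override with an unused value, as suggested in the comment above.
DUMMY_OVERRIDE = "me: hungry\n"


def write_dummy_override(path):
    """Create the throwaway values file and return its path as a string."""
    Path(path).write_text(DUMMY_OVERRIDE)
    return str(path)


def build_sanity_commands(values_path):
    """Return the two commands from the comment above as argv lists."""
    return [
        # 1) Is helm responding at all?
        ["helm", "list", "-a"],
        # 2) Does a generic override on another application also fail?
        ["system", "helm-override-update",
         "cert-manager", "cert-manager", "cert-manager",
         "--values", str(values_path)],
    ]


def run_sanity_check(values_path="/home/sysadmin/cert-manager.yaml"):
    """Run both commands and collect (argv, exit code, combined output)."""
    write_dummy_override(values_path)
    results = []
    for cmd in build_sanity_commands(values_path):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results.append((cmd, proc.returncode, proc.stdout + proc.stderr))
    return results
```

If the cert-manager override times out in the same way, the problem is not specific to oidc-auth-apps.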

Revision history for this message
Ghada Khalil (gkhalil) wrote :

@Senthil, please open a new LP for the issue you are seeing on AIO-SX with the full set of logs. The original issue reported here is related to a helm issue after swact as described in https://bugs.launchpad.net/starlingx/+bug/1875891

This LP is being marked as a duplicate

Revision history for this message
Ghada Khalil (gkhalil) wrote :
Changed in starlingx:
status: Triaged → Fix Released
Revision history for this message
Ghada Khalil (gkhalil) wrote :

It's recommended to wait for the other outstanding helm fix tracked by https://bugs.launchpad.net/starlingx/+bug/1876396 before re-testing

ayyappa (mantri425)
tags: removed: stx.retestneeded