sysinv api should not return successfully if bmc password cannot be stored

Bug #1834673 reported by Allain Legacy
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: zhipeng liu
Milestone: (not set)

Bug Description

Brief Description
-----------------
This is related to: https://bugs.launchpad.net/starlingx/+bug/1834670.

A command to set the BMC password should not return success to the API caller if the request could not be completed.

Severity
--------
Major: the system is not configured to the user's specification, and no error or result indication is given.

Steps to Reproduce
------------------
1. bootstrap the system
2. configure controller-0 as per wiki
3. add BMC username/password:
   system host-update controller-0 bm_type=bmc bm_username=root bm_password=root
4. Step 3 should fail if the password was not stored in barbican as per LP#1834670

Expected Behavior
------------------
The system command should fail if any part of that command fails.

Actual Behavior
----------------
The command returns with a successful return code.
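
The expected behavior can be illustrated with a minimal Python sketch. This is not the sysinv code: `SecretStoreError`, `store_bm_password`, `patch_host`, and the `create_secret` callable are hypothetical names that stand in for the API handler and the conductor call to Barbican. The only point is that a failed secret store aborts the request before any host attribute is changed.

```python
class SecretStoreError(Exception):
    """Hypothetical error raised when the BMC password cannot be stored."""


def store_bm_password(create_secret, host_uuid, password):
    """Store the password; raise instead of silently continuing on failure.

    `create_secret` is a stand-in for the call that talks to Barbican.
    It is assumed to raise, or to return a falsy value, on failure.
    """
    try:
        secret_ref = create_secret(host_uuid, password)
    except Exception as exc:
        raise SecretStoreError(
            "failed to create barbican secret for %s: %s" % (host_uuid, exc))
    if not secret_ref:
        raise SecretStoreError(
            "failed to create barbican secret for %s" % host_uuid)
    return secret_ref


def patch_host(create_secret, host, updates):
    """Apply `updates` to `host` only if the password was stored."""
    updates = dict(updates)
    password = updates.pop("bm_password", None)
    if password is not None:
        # Any failure here aborts the request before the host is modified.
        store_bm_password(create_secret, host["uuid"], password)
    host.update(updates)
    return host
```

With this shape, the caller sees an exception (and can map it to a non-zero return code) instead of a success response for a partially applied update.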

Reproducibility
---------------
100%

System Configuration
--------------------
Any

Branch/Pull Time/Commit
-----------------------
20190628T013000Z

Last Pass
---------
unknown

Timestamp/Logs
--------------

2019-06-28 06:44:09.610 104162 INFO sysinv.api.controllers.v1.host [-] controller-0 ihost_patch_start_2019-06-28-06-44-09 patch
2019-06-28 06:44:09.610 104162 INFO sysinv.api.controllers.v1.host [-] controller-0 1. delta_handle ['bm_username', 'bm_ip', 'bm_type']
2019-06-28 06:44:09.610 104162 INFO sysinv.api.controllers.v1.host [-] controller-0 2. delta_handle ['bm_username', 'bm_ip', 'bm_type']
2019-06-28 06:44:09.611 104162 INFO sysinv.api.controllers.v1.host [-] bm_ip in delta=set(['bm_username', 'bm_ip', 'bm_type']) obm_ip= nbm_ip=128.224.64.171
2019-06-28 06:44:09.611 104162 INFO sysinv.api.controllers.v1.host [-] Updating bm_type from bmc to bmc
2019-06-28 06:44:10.044 90155 ERROR barbicanclient.client [-] 5xx Server error: Service Unavailable
2019-06-28 06:44:10.045 90155 ERROR sysinv.conductor.openstack [req-3923eb1c-9465-4ee3-b766-e432b6df392c admin admin] Unable to find Barbican secret 9c630c87-f605-4d9a-82c5-ba49744a3806
2019-06-28 06:44:10.055 90155 ERROR barbicanclient.client [-] 5xx Server error: <html>
 <head>
  <title>503 Service Unavailable</title>
 </head>
 <body>
  <h1>503 Service Unavailable</h1>
  The server is currently unavailable. Please try again at a later time.<br /><br />
The Keystone service is temporarily unavailable.

 </body>
</html>
2019-06-28 06:44:10.055 90155 ERROR sysinv.conductor.openstack [req-3923eb1c-9465-4ee3-b766-e432b6df392c admin admin] Unable to create Barbican secret 9c630c87-f605-4d9a-82c5-ba49744a3806
2019-06-28 06:44:10.061 104162 INFO sysinv.api.controllers.v1.host [-] controller-0 bm semantic checks for user_agent gophercloud/2.0.0 passed
2019-06-28 06:44:10.061 104162 INFO sysinv.api.controllers.v1.host [-] controller-0 post delta_handle hostupdate action=None notify_vim=False notify_mtc=True skip_notify_mtce=False
2019-06-28 06:44:10.062 104162 INFO sysinv.api.controllers.v1.host [-] controller-0 apply ihost_val {'bm_type': 'bmc'}
2019-06-28 06:44:10.078 104162 INFO sysinv.api.controllers.v1.host [-] controller-0 Action none perform notify_mtce
2019-06-28 06:44:10.081 104162 INFO sysinv.api.controllers.v1.mtce_api [-] number of calls to rest_api_request=1 (max_retry=3)
2019-06-28 06:44:10.081 104162 INFO sysinv.api.controllers.v1.rest_api [-] PATCH cmd:http://localhost:2112/v1/hosts/9c630c87-f605-4d9a-82c5-ba49744a3806 hdr:{'Content-type': 'application/json', 'User-Agent': 'sysinv/1.0'} payload:{"tboot": "false", "ttys_dcd": null, "subfunctions": "controller", "bm_ip": "128.224.64.171", "install_state": null, "rootfs_device": "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0", "ihost_action": null, "bm_username": "root", "operation": "modify", "serialid": null, "id": 1, "vim_progress_status": null, "console": "ttyS0,115200n8", "uuid": "9c630c87-f605-4d9a-82c5-ba49744a3806", "mgmt_ip": "192.168.204.3", "software_load": "19.01", "config_status": null, "hostname": "controller-0", "iscsi_initiator_name": null, "capabilities": {"stor_function": "monitor"}, "install_output": "text", "location": {}, "availability": "online", "invprovision": "provisioned", "peer_id": null, "administrative": "locked", "personality": "controller", "recordtype": "standard", "bm_mac": null, "mtce_info": null, "isystem_uuid": "e22eac3b-b6d8-4e93-b68c-29ec4e0ac6c4", "boot_device": "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0", "install_state_info": null, "mgmt_mac": "00:00:00:00:00:00", "subfunction_oper": "disabled", "task": "", "target_load": "19.01", "vsc_controllers": null, "operational": "disabled", "subfunction_avail": "not-installed", "action": "none", "bm_type": "bmc"}
2019-06-28 06:44:10.083 104162 INFO sysinv.api.controllers.v1.rest_api [-] Response={u'status': u'pass'}
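
The pattern in the log above (an "Unable to create Barbican secret" error followed by a `pass` response) can be spotted with a small Python sketch. This is a heuristic written for this excerpt only: `find_silent_failures` is a hypothetical helper, and it correlates lines purely by their order, since the API and conductor entries come from different processes and do not share a request ID here.

```python
import re

# Patterns taken from the log excerpt in this report.
SECRET_ERROR = re.compile(r"ERROR .*Unable to create Barbican secret (\S+)")
PASS_RESPONSE = re.compile(r"Response=\{u?'status': u?'pass'\}")


def find_silent_failures(lines):
    """Return secret IDs whose creation failed before a 'pass' response."""
    pending = []   # failed secret IDs not yet followed by a response
    silent = []    # failures that were followed by a 'pass' response
    for line in lines:
        match = SECRET_ERROR.search(line)
        if match:
            pending.append(match.group(1))
        elif pending and PASS_RESPONSE.search(line):
            silent.extend(pending)
            pending = []
    return silent
```

Run against the lines above, it would flag the secret ID 9c630c87-f605-4d9a-82c5-ba49744a3806, which matches the host UUID in the PATCH request.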

Test Activity
-------------
Developer testing

Ghada Khalil (gkhalil) wrote :

Marking as stx.2.0 -- silent system failure

tags: added: stx.2.0 stx.config
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
assignee: nobody → Cindy Xie (xxie1)
Ghada Khalil (gkhalil) wrote :

sysinv should have a check and propagate the failure returned by barbican in this case.

Cindy Xie (xxie1)
Changed in starlingx:
assignee: Cindy Xie (xxie1) → zhipeng liu (zhipengs)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/673417

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/673417
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=5a9a373297b9a2358c1cb704c2f61075a3ae1905
Submitter: Zuul
Branch: master

commit 5a9a373297b9a2358c1cb704c2f61075a3ae1905
Author: zhipengl <email address hidden>
Date: Wed Jul 31 00:10:46 2019 +0800

    Fix sysinv handle bmc password cannot be stored case

    This implementation will raise exception to the client when bm_password
    cannot be stored successfully.

    Below test pass!
    1) Before controller-0 is unlocked, send host-update command and get
    expected reject message like below.
    $ system host-update 1 bm_type=bmc bm_username=admin bm_password=123
    controller-0 Rejected: failed to create barbican secret.
    2) After controller-0 unlocked, send this command again, it returns
    host information list as expected.

    Closes-Bug: #1834673

    Change-Id: I9b6bdc20c8bffa7a3c997ffccd088c4d6789a3f2
    Signed-off-by: zhipengl <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
Yang Liu (yliu12) wrote :

We are now seeing the following issue with the 0809 load on about half of the sanity systems.
The command was run after controller-0 was configured and unlocked, and the pods were in a good state as well.
The issue went away when I reran the command some time later.

[2019-08-09 14:06:49,438] 301 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-bulk-add hosts_bulk_add.xml'
Error:
 controller: Timeout while waiting on RPC response - topic: "sysinv.conductor_manager", RPC method: "create_barbican_secret" info: "<unknown>"
 controller: Timeout while waiting on RPC response - topic: "sysinv.conductor_manager", RPC method: "create_barbican_secret" info: "<unknown>"

[sysadmin@controller-0 ~(keystone_admin)]$
[2019-08-09 14:08:51,519] 301 DEBUG MainThread ssh.send :: Send 'echo $?'
0

A second problem is the return code should not be 0 in this case. It should be 1 instead.
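
As a rough Python sketch of the return-code point: the helper below is hypothetical (it is not the `system` CLI code) and assumes the per-host outcomes have already been collected into a dict that maps a host name to an error string, or to None on success.

```python
import sys


def bulk_add_exit_code(results):
    """Return 0 only if every host entry succeeded, otherwise 1.

    `results` maps a host name to an error string, or None on success.
    """
    failed = {name: err for name, err in results.items() if err}
    for name, err in sorted(failed.items()):
        # Report each failure on stderr so it is not lost in normal output.
        sys.stderr.write("%s: %s\n" % (name, err))
    return 1 if failed else 0
```

A command entry point would then end with something like `sys.exit(bulk_add_exit_code(results))`, so that `echo $?` reports 1 whenever any host reported an error.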

Yang Liu (yliu12) wrote :

It seems the issue in the above comment is different from the original issue. I will raise a separate LP instead.

Ghada Khalil (gkhalil) wrote :

Re-opening as the original commit by Zhipeng was reverted by John Kung via review:
https://review.opendev.org/#/c/675698/

Changed in starlingx:
status: Fix Released → Confirmed
zhipeng liu (zhipengs) wrote :

Hi Ghada,

I have added my comment in related LP
https://bugs.launchpad.net/starlingx/+bug/1839665

========================================================================
Hi John,

An RPC timeout causing the command to fail should be the expected behavior, right? (This is a coincidental scenario.)
My patch explicitly reports a command error and lets the user try again later.
If we revert this patch, we just ignore the timeout issue.

For the timeout case, we need to handle it in the bulk_add functions, which is where the timeout exception is caught.

So this is a separate issue that has existed all along.
We'd better merge my original patch and close this one, then add a new patch to fix the timeout case instead of reverting my patch, right?
===============================================================================================
Thanks!
Zhipeng
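
The idea of handling the timeout where the bulk add catches it can be sketched in Python as follows. All names here are assumptions for illustration: `RpcTimeout` stands in for whatever timeout exception the RPC layer raises, and `add_host` stands in for the per-host creation call. This is not the actual sysinv bulk-add implementation.

```python
class RpcTimeout(Exception):
    """Stand-in for the RPC layer's timeout exception (name assumed)."""


def bulk_add(hosts, add_host):
    """Try each host; record timeouts as errors instead of ignoring them.

    Returns (added, errors): a list of host names that were added and a
    dict that maps each failed host name to its error message.
    """
    added, errors = [], {}
    for host in hosts:
        name = host["hostname"]
        try:
            add_host(host)
        except RpcTimeout as exc:
            # Keep going with the remaining hosts, but remember the failure.
            errors[name] = "Timeout while waiting on RPC response: %s" % exc
        else:
            added.append(name)
    return added, errors
```

The caller can then treat a non-empty `errors` dict as a failed command, which keeps the timeout visible to the user without discarding hosts that were added successfully.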

Ghada Khalil (gkhalil) wrote :

@Zhipeng, I suggest you discuss directly with John on the next steps. As the core for config, he already reverted your commit. So we can't leave this bug as "Fix Released" given your code is removed.

zhipeng liu (zhipengs) wrote :

Hi Ghada,

Thanks for your proposal!

Zhipeng

Ghada Khalil (gkhalil) wrote :

As per agreement with the community, moving all unresolved medium priority bugs from stx.2.0 to stx.3.0

tags: added: stx.3.0
removed: stx.2.0
Changed in starlingx:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/688318
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=583f3dd2860a96331c7dfb177a8860012536b54f
Submitter: Zuul
Branch: master

commit 583f3dd2860a96331c7dfb177a8860012536b54f
Author: zhipeng liu <email address hidden>
Date: Mon Oct 14 02:27:20 2019 +0000

    Revert "Revert "Fix sysinv handle bmc password cannot be stored case""

    This reverts commit f632a4c50b6a9ccf69eac6261865d684dba6ede1.

    Suspected issue already root caused and fixed with below patch
    https://review.opendev.org/681805

    Closes-Bug: #1834673

    Change-Id: Iac904b7cbc81218fc5045559e59ecee1db033df7
    Signed-off-by: zhipengl <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released