Host re-install fails due to error in getting IPv6 BMC password from Barbican

Bug #1839870 reported by Chris Winnicki
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Alexander Kozyrev

Bug Description

Brief Description
-----------------
Mtc (maintenance) fails to retrieve IPv6 BMC password from Barbican

/var/log/mtcAgent.log snippet

2019-08-09T17:48:43.193 [324744.358043] controller-0 mtcAgent --- httpUtil.cpp ( 564) httpUtil_handler :Error : controller-0 secretUtil_get_secret 'get secret reference' Request Timeout (5)
2019-08-09T17:48:43.193 [324744.358044] controller-0 mtcAgent --- httpUtil.cpp ( 621) httpUtil_handler :Error : controller-0 secretUtil_get_secret 'get secret reference' Failed (rc:16)
2019-08-09T17:48:43.193 [324744.358045] controller-0 mtcAgent pwd secretUtil.cpp ( 83) secretUtil_manage_secret: Warn : be3e31ee-3ea5-4a1e-9c97-4fbaa68c5109 getting secret reference failed
2019-08-09T17:48:43.193 [324744.358046] controller-0 mtcAgent --- httpUtil.cpp ( 503) httpUtil_status : Warn : controller-0 failed to maintain connection to '[abde::2]:9311' for 'controller-0 secretUtil_get_secret 'get secret reference''

Severity
--------
Major

Steps to Reproduce
------------------
Install the following system:
(IPv6) 2 controllers, 2 storage, 4 workers
Configure BMC with IPv6/user/password
Lock a host
Attempt to perform host reinstall

Expected Behavior
-----------------
Host should reinstall successfully

Actual Behavior
----------------
Host reinstall results in host being stuck in:
task: Reinstall Wait ; BMC not accessible

[wrsroot@controller-0 ~(keystone_admin)]$ system host-reinstall controller-1
+---------------------+--------------------------------------------+
| Property | Value |
+---------------------+--------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | 2620:10a:a001:a107::36 |
| bm_type | bmc |
| bm_username | root |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor'} |
| config_applied | install |
| config_status | None |
| config_target | None |
| console | ttyS0,115200n8 |
| created_at | 2019-08-08T19:15:27.597121+00:00 |
| hostname | controller-1 |
| id | 6 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abde::4 |
| mgmt_mac | 90:e2:ba:ac:6a:c4 |
| operational | disabled |
| personality | controller |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | 19.05 |
| task | Reinstalling |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-08-09T14:05:04.994768+00:00 |
| uptime | 66174 |
| uuid | 90db3508-4db1-42e2-aa79-27ae85feadba |
| vim_progress_status | services-disabled |
+---------------------+--------------------------------------------+

[wrsroot@controller-0 ~(keystone_admin)]$ system host-lock controller-1
+---------------------+--------------------------------------------+
| Property | Value |
+---------------------+--------------------------------------------+
| action | none |
| administrative | unlocked |
| availability | available |
| bm_ip | 2620:10a:a001:a107::36 |
| bm_type | bmc |
| bm_username | root |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor'} |
| config_applied | install |
| config_status | None |
| config_target | None |
| console | ttyS0,115200n8 |
| created_at | 2019-08-08T19:15:27.597121+00:00 |
| hostname | controller-1 |
| id | 6 |
| install_output | text |
| install_state | booting |
| install_state_info | None |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abde::4 |
| mgmt_mac | 90:e2:ba:ac:6a:c4 |
| operational | enabled |
| personality | controller |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | 19.05 |
| task | Locking |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-08-09T14:04:04.971300+00:00 |
| uptime | 65919 |
| uuid | 90db3508-4db1-42e2-aa79-27ae85feadba |
| vim_progress_status | services-enabled |
+---------------------+--------------------------------------------+

[wrsroot@controller-0 ~(keystone_admin)]$ system host-show controller-1
+---------------------+-----------------------------------------------------------------------+
| Property | Value |
+---------------------+-----------------------------------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | 2620:10a:a001:a107::36 |
| bm_type | bmc |
| bm_username | root |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor', u'Personality': u'Controller-Standby'} |
| config_applied | install |
| config_status | None |
| config_target | None |
| console | ttyS0,115200n8 |
| created_at | 2019-08-08T19:15:27.597121+00:00 |
| hostname | controller-1 |
| id | 6 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abde::4 |
| mgmt_mac | 90:e2:ba:ac:6a:c4 |
| operational | disabled |
| personality | controller |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | 19.05 |
| task | Reinstall Wait ; BMC not accessible |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-08-09T14:06:05.014186+00:00 |
| uptime | 66239 |
| uuid | 90db3508-4db1-42e2-aa79-27ae85feadba |
| vim_progress_status | services-disabled |
+---------------------+-----------------------------------------------------------------------+

Reproducibility
---------------
100%

System Configuration
--------------------
(IPv6)
2 controllers
2 storage
4 workers

Wind River internal lab name: cgcs-wildcat-15-22.cumulus.wrs.com

Branch/Pull Time/Commit
-----------------------
2019-06-03_18-34-53

Last Pass
---------
not known

Timestamp/Logs
--------------
not required

Test Activity
-------------
Regression: Installation and Configuration

Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Alex Kozyrev (akozyrev)
summary: - Mtc (maintenance) fails to retrieve IPv6 BMC password from Barbican
+ Host re-install fails due to error in getting IPv6 BMC password from
+ Barbican
OpenStack Infra (hudson-openstack) wrote : Fix proposed to metal (master)

Fix proposed to branch: master
Review: https://review.opendev.org/676006

Changed in starlingx:
status: New → In Progress
Ghada Khalil (gkhalil) wrote :

Marking as stx.2.0 gating medium priority as this impacts IPv6 configs. Should be fixed in master and cherry-picked to the release branch before the stx.2.0 release date.

Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.2.0 stx.metal
OpenStack Infra (hudson-openstack) wrote : Fix merged to metal (master)

Reviewed: https://review.opendev.org/676006
Committed: https://git.openstack.org/cgit/starlingx/metal/commit/?id=00835385016a50aa17455d076a549a4310332867
Submitter: Zuul
Branch: master

commit 00835385016a50aa17455d076a549a4310332867
Author: Alex Kozyrev <email address hidden>
Date: Mon Aug 12 14:30:45 2019 -0400

    Properly handle Barbican IPv6 address in MTCE

    barbican.conf stores Barbican IPv6 address enclosed by square brackets:
    bind_host=[abde::2]
    MTCE fails to connect to Barbican with such an IP address.
    Need to strip square brackets during barbican.conf file read in MTCE.

    Change-Id: I28ae627cd4998a5975d39b3edc466180e11aedf6
    Closes-Bug: 1839870
    Signed-off-by: Alex Kozyrev <email address hidden>
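
The bracket stripping described in the commit message above can be sketched as follows. This is an illustration only, not the MTCE code (the actual fix is in the C++ sources of the metal repository). The helper names strip_brackets and read_bind_host are hypothetical, and the parsing assumes a simple key=value layout for barbican.conf.

```python
import ipaddress
from typing import Optional


def strip_brackets(value: str) -> str:
    """Remove one pair of enclosing square brackets, if present."""
    value = value.strip()
    if len(value) >= 2 and value[0] == "[" and value[-1] == "]":
        return value[1:-1]
    return value


def read_bind_host(conf_text: str) -> Optional[str]:
    """Return the bare bind_host address from barbican.conf-style text.

    Hypothetical helper: assumes plain key=value lines, as in the
    bind_host=[abde::2] example from the commit message.
    """
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue  # skip blank lines, comments, and section headers
        key, sep, value = line.partition("=")
        if sep and key.strip() == "bind_host":
            return strip_brackets(value)
    return None


if __name__ == "__main__":
    conf = "[DEFAULT]\nbind_host=[abde::2]\n"
    host = read_bind_host(conf)
    print(host)
    # The bare form parses as an IP address; the bracketed form would not.
    print(ipaddress.ip_address(host).version)
```

IPv4 addresses and hostnames pass through unchanged, so the same read path serves both address families in this sketch.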

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to metal (r/stx.2.0)

Fix proposed to branch: r/stx.2.0
Review: https://review.opendev.org/676180

OpenStack Infra (hudson-openstack) wrote : Fix merged to metal (r/stx.2.0)

Reviewed: https://review.opendev.org/676180
Committed: https://git.openstack.org/cgit/starlingx/metal/commit/?id=f1f0e626f363423cae8b0b2d316e426b211ccc23
Submitter: Zuul
Branch: r/stx.2.0

commit f1f0e626f363423cae8b0b2d316e426b211ccc23
Author: Alex Kozyrev <email address hidden>
Date: Mon Aug 12 14:30:45 2019 -0400

    Properly handle Barbican IPv6 address in MTCE

    barbican.conf stores Barbican IPv6 address enclosed by square brackets:
    bind_host=[abde::2]
    MTCE fails to connect to Barbican with such an IP address.
    Need to strip square brackets during barbican.conf file read in MTCE.

    Change-Id: I28ae627cd4998a5975d39b3edc466180e11aedf6
    Closes-Bug: 1839870
    Signed-off-by: Alex Kozyrev <email address hidden>
    (cherry picked from commit 00835385016a50aa17455d076a549a4310332867)

Ghada Khalil (gkhalil)
tags: added: in-r-stx20
Yang Liu (yliu12)
tags: added: stx.retestneeded
Boris Shteinbock (bshteinb) wrote :

# Testing Status
PASSED

# Configuration
2 controllers AIO
3 computes

# Load Tested
[sysadmin@controller-1 ~(keystone_admin)]$ cat /etc/build.info
###
### Wind River Titanium Cloud
### Release 19.10
###
### Wind River Systems, Inc.
###

SW_VERSION="19.10"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="2019-10-14_20-00-00"
SRC_BUILD_ID="49"

JOB="TC_19.10_Build"
BUILD_BY="jenkins"
BUILD_NUMBER="49"
BUILD_HOST="yow-cgts4-lx.wrs.com"
BUILD_DATE="2019-10-14 20:01:09 -0400"

Test scenario.

1. BMC was provisioned with IPv6.
2. controller-0 node was locked
3. controller-0 node was reinstalled

No BMC-related issues observed.

[sysadmin@controller-1 ~(keystone_admin)]$ system host-show controller-0
+-----------------------+-----------------------------------------------------------------------+
| Property | Value |
+-----------------------+-----------------------------------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | 2620:10a:a001:a102::132 |
| bm_type | bmc |
| bm_username | root |
...
Events
| 2019-10-16T | log | 200. | controller-0 reinstall completed successfully |
| 2019-10-16T | log | 200. | controller-0 is now 'online' |
| 2019-10-16T | log | 200. | controller-0 is now 'offline' |
| 2019-10-16T | log | 200. | controller-0 manual 'reinstall' request |

Yang Liu (yliu12)
tags: removed: stx.retestneeded