2021-02-05 22:39:25 |
Adriano Oliveira |
bug |
|
|
added bug |
2021-02-05 22:41:47 |
Adriano Oliveira |
description |
Brief Description
-----------------
During distributed cloud orchestrated upgrade, compute node unlock failed.
Output indicates sriov_numvfs configuration might need more time to be applied.
Severity
--------
Major
Steps to Reproduce
------------------
Follow upgrade procedure as per upgrade orchestration.
The issue is seen when orchestration attempts to unlock compute-0.
Expected Behavior
------------------
No failure on host unlock on any node during upgrade orchestration.
Actual Behavior
----------------
Unlock of compute-0 fails.
Reproducibility
---------------
Intermittent
System Configuration
--------------------
Distributed Cloud
Branch/Pull Time/Commit
-----------------------
stx4.0 as of "2020-06-27_18-35-20"
Last Pass
---------
Timestamp/Logs
--------------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+-------------+-----------+----------+------------+
| 900.203 | Software upgrade auto-apply failed | orchestration=sw-upgrade | critical | 2020-07-02T18:24:53.413091 |
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=04a34a69-8c18-4494-a97f-5035682d7427 | warning | 2020-07-02T18:11:32.397315 |
| 200.001 | compute-0 was administratively locked to take it out-of-service. | host=compute-0 | warning | 2020-07-02T18:10:53.991463 |
| 750.006 | A configuration change requires a reapply of the oidc-auth-apps application. | k8s_application=oidc-auth-apps | warning | 2020-07-02T17:31:13.243285 |
| 750.006 | A configuration change requires a reapply of the platform-integ-apps application. | k8s_application=platform-integ-apps | warning | 2020-07-02T17:31:13.059851 |
| 900.005 | System Upgrade in progress. | host=controller | minor | 2020-07-02T17:30:08.041908 |
[2020-11-16 23:55:01,419] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:55:07,956] 436 DEBUG MainThread ssh.expect :: Output:
Expecting number of interface sriov_numvfs=32. Please wait a few minutes for inventory update and retry host-unlock.
[2020-11-16 23:56:15,366] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:56:24,457] 436 DEBUG MainThread ssh.expect :: Output:
+-----------------------+--------------------------------------------+
| Property | Value |
+-----------------------+--------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | None |
| bm_type | none |
| bm_username | None |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor'} |
| clock_synchronization | ntp |
| config_applied | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| config_status | None |
| config_target | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| console | ttyS0,115200n8 |
| created_at | 2020-11-16T14:59:44.949842+00:00 |
| device_image_update | None |
| hostname | controller-0 |
| id | 1 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abcd:204::2 |
| mgmt_mac | 3c:fd:fe:a0:16:78 |
| operational | disabled |
| personality | controller |
| reboot_needed | False |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | 20.06 |
| subfunction_avail | online |
| subfunction_oper | disabled |
| subfunctions | controller,worker,lowlatency |
| task | Unlocking |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2020-11-16T23:55:38.864171+00:00 |
| uptime | 111 |
| uuid | 20167fc8-c125-4b72-aace-fc10bb8de147 |
| vim_progress_status | services-disabled |
+-----------------------+--------------------------------------------+
Test Activity
-------------
Regression Testing
Workaround
----------
Wait a couple of minutes and try to unlock the node again. |
Brief Description
-----------------
During distributed cloud orchestrated upgrade, worker node unlock failed.
Output indicates sriov_numvfs configuration might need more time to be applied.
Severity
--------
Major
Steps to Reproduce
------------------
Follow upgrade procedure as per upgrade orchestration.
The issue is seen when orchestration attempts to unlock compute-0.
Expected Behavior
------------------
No failure on host unlock on any node during upgrade orchestration.
Actual Behavior
----------------
Unlock of compute-0 fails.
Reproducibility
---------------
Intermittent
System Configuration
--------------------
Distributed Cloud
Branch/Pull Time/Commit
-----------------------
stx4.0 as of "2020-06-27_18-35-20"
Last Pass
---------
Timestamp/Logs
--------------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+-------------+-----------+----------+------------+
| 900.203 | Software upgrade auto-apply failed | orchestration=sw-upgrade | critical | 2020-07-02T18:24:53.413091 |
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=04a34a69-8c18-4494-a97f-5035682d7427 | warning | 2020-07-02T18:11:32.397315 |
| 200.001 | compute-0 was administratively locked to take it out-of-service. | host=compute-0 | warning | 2020-07-02T18:10:53.991463 |
| 750.006 | A configuration change requires a reapply of the oidc-auth-apps application. | k8s_application=oidc-auth-apps | warning | 2020-07-02T17:31:13.243285 |
| 750.006 | A configuration change requires a reapply of the platform-integ-apps application. | k8s_application=platform-integ-apps | warning | 2020-07-02T17:31:13.059851 |
| 900.005 | System Upgrade in progress. | host=controller | minor | 2020-07-02T17:30:08.041908 |
[2020-11-16 23:55:01,419] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:55:07,956] 436 DEBUG MainThread ssh.expect :: Output:
Expecting number of interface sriov_numvfs=32. Please wait a few minutes for inventory update and retry host-unlock.
[2020-11-16 23:56:15,366] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:56:24,457] 436 DEBUG MainThread ssh.expect :: Output:
+-----------------------+--------------------------------------------+
| Property | Value |
+-----------------------+--------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | None |
| bm_type | none |
| bm_username | None |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor'} |
| clock_synchronization | ntp |
| config_applied | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| config_status | None |
| config_target | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| console | ttyS0,115200n8 |
| created_at | 2020-11-16T14:59:44.949842+00:00 |
| device_image_update | None |
| hostname | controller-0 |
| id | 1 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abcd:204::2 |
| mgmt_mac | 3c:fd:fe:a0:16:78 |
| operational | disabled |
| personality | controller |
| reboot_needed | False |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | 20.06 |
| subfunction_avail | online |
| subfunction_oper | disabled |
| subfunctions | controller,worker,lowlatency |
| task | Unlocking |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2020-11-16T23:55:38.864171+00:00 |
| uptime | 111 |
| uuid | 20167fc8-c125-4b72-aace-fc10bb8de147 |
| vim_progress_status | services-disabled |
+-----------------------+--------------------------------------------+
Test Activity
-------------
Regression Testing
Workaround
----------
Wait a couple of minutes and try to unlock the node again. |
|
2021-02-05 22:43:28 |
Adriano Oliveira |
description |
Brief Description
-----------------
During distributed cloud orchestrated upgrade, worker node unlock failed.
Output indicates sriov_numvfs configuration might need more time to be applied.
Severity
--------
Major
Steps to Reproduce
------------------
Follow upgrade procedure as per upgrade orchestration.
The issue is seen when orchestration attempts to unlock compute-0.
Expected Behavior
------------------
No failure on host unlock on any node during upgrade orchestration.
Actual Behavior
----------------
Unlock of compute-0 fails.
Reproducibility
---------------
Intermittent
System Configuration
--------------------
Distributed Cloud
Branch/Pull Time/Commit
-----------------------
stx4.0 as of "2020-06-27_18-35-20"
Last Pass
---------
Timestamp/Logs
--------------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+-------------+-----------+----------+------------+
| 900.203 | Software upgrade auto-apply failed | orchestration=sw-upgrade | critical | 2020-07-02T18:24:53.413091 |
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=04a34a69-8c18-4494-a97f-5035682d7427 | warning | 2020-07-02T18:11:32.397315 |
| 200.001 | compute-0 was administratively locked to take it out-of-service. | host=compute-0 | warning | 2020-07-02T18:10:53.991463 |
| 750.006 | A configuration change requires a reapply of the oidc-auth-apps application. | k8s_application=oidc-auth-apps | warning | 2020-07-02T17:31:13.243285 |
| 750.006 | A configuration change requires a reapply of the platform-integ-apps application. | k8s_application=platform-integ-apps | warning | 2020-07-02T17:31:13.059851 |
| 900.005 | System Upgrade in progress. | host=controller | minor | 2020-07-02T17:30:08.041908 |
[2020-11-16 23:55:01,419] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:55:07,956] 436 DEBUG MainThread ssh.expect :: Output:
Expecting number of interface sriov_numvfs=32. Please wait a few minutes for inventory update and retry host-unlock.
[2020-11-16 23:56:15,366] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:56:24,457] 436 DEBUG MainThread ssh.expect :: Output:
+-----------------------+--------------------------------------------+
| Property | Value |
+-----------------------+--------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | None |
| bm_type | none |
| bm_username | None |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor'} |
| clock_synchronization | ntp |
| config_applied | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| config_status | None |
| config_target | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| console | ttyS0,115200n8 |
| created_at | 2020-11-16T14:59:44.949842+00:00 |
| device_image_update | None |
| hostname | controller-0 |
| id | 1 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abcd:204::2 |
| mgmt_mac | 3c:fd:fe:a0:16:78 |
| operational | disabled |
| personality | controller |
| reboot_needed | False |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | 20.06 |
| subfunction_avail | online |
| subfunction_oper | disabled |
| subfunctions | controller,worker,lowlatency |
| task | Unlocking |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2020-11-16T23:55:38.864171+00:00 |
| uptime | 111 |
| uuid | 20167fc8-c125-4b72-aace-fc10bb8de147 |
| vim_progress_status | services-disabled |
+-----------------------+--------------------------------------------+
Test Activity
-------------
Regression Testing
Workaround
----------
Wait a couple of minutes and try to unlock the node again. |
Brief Description
-----------------
During distributed cloud orchestrated upgrade, worker node unlock failed.
Output indicates sriov_numvfs configuration might need more time to be applied.
Severity
--------
Major
Steps to Reproduce
------------------
Follow upgrade procedure as per upgrade orchestration.
The issue is seen when orchestration attempts to unlock compute-0.
Expected Behavior
------------------
No failure on host unlock on any node during upgrade orchestration.
Actual Behavior
----------------
Unlock of compute-0 fails.
Reproducibility
---------------
Intermittent
System Configuration
--------------------
Distributed Cloud
Branch/Pull Time/Commit
-----------------------
stx4.0 as of "2020-06-27_18-35-20"
Last Pass
---------
Timestamp/Logs
--------------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+-------------+-----------+----------+------------+
| 900.203 | Software upgrade auto-apply failed | orchestration=sw-upgrade | critical | 2020-07-02T18:24:53.413091 |
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=04a34a69-8c18-4494-a97f-5035682d7427 | warning | 2020-07-02T18:11:32.397315 |
| 200.001 | compute-0 was administratively locked to take it out-of-service. | host=compute-0 | warning | 2020-07-02T18:10:53.991463 |
| 750.006 | A configuration change requires a reapply of the oidc-auth-apps application. | k8s_application=oidc-auth-apps | warning | 2020-07-02T17:31:13.243285 |
| 750.006 | A configuration change requires a reapply of the platform-integ-apps application. | k8s_application=platform-integ-apps | warning | 2020-07-02T17:31:13.059851 |
| 900.005 | System Upgrade in progress. | host=controller | minor | 2020-07-02T17:30:08.041908 |
[2020-11-16 23:55:01,419] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:55:07,956] 436 DEBUG MainThread ssh.expect :: Output:
Expecting number of interface sriov_numvfs=32. Please wait a few minutes for inventory update and retry host-unlock.
[2020-11-16 23:56:15,366] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:56:24,457] 436 DEBUG MainThread ssh.expect :: Output:
+-----------------------+--------------------------------------------+
| Property | Value |
+-----------------------+--------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | None |
| bm_type | none |
| bm_username | None |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor'} |
| clock_synchronization | ntp |
| config_applied | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| config_status | None |
| config_target | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| console | ttyS0,115200n8 |
| created_at | 2020-11-16T14:59:44.949842+00:00 |
| device_image_update | None |
| hostname | controller-0 |
| id | 1 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abcd:204::2 |
| mgmt_mac | 3c:fd:fe:a0:16:78 |
| operational | disabled |
| personality | controller |
| reboot_needed | False |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | stx 4.0 |
| subfunction_avail | online |
| subfunction_oper | disabled |
| subfunctions | controller,worker,lowlatency |
| task | Unlocking |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2020-11-16T23:55:38.864171+00:00 |
| uptime | 111 |
| uuid | 20167fc8-c125-4b72-aace-fc10bb8de147 |
| vim_progress_status | services-disabled |
+-----------------------+--------------------------------------------+
Test Activity
-------------
Regression Testing
Workaround
----------
Wait a couple of minutes and try to unlock the node again. |
|
2021-02-05 22:43:37 |
Adriano Oliveira |
starlingx: assignee |
|
Adriano Oliveira (aoliveir) |
|
2021-02-05 22:43:59 |
Adriano Oliveira |
starlingx: status |
New |
In Progress |
|
2021-02-05 22:47:18 |
Adriano Oliveira |
description |
Brief Description
-----------------
During distributed cloud orchestrated upgrade, worker node unlock failed.
Output indicates sriov_numvfs configuration might need more time to be applied.
Severity
--------
Major
Steps to Reproduce
------------------
Follow upgrade procedure as per upgrade orchestration.
The issue is seen when orchestration attempts to unlock compute-0.
Expected Behavior
------------------
No failure on host unlock on any node during upgrade orchestration.
Actual Behavior
----------------
Unlock of compute-0 fails.
Reproducibility
---------------
Intermittent
System Configuration
--------------------
Distributed Cloud
Branch/Pull Time/Commit
-----------------------
stx4.0 as of "2020-06-27_18-35-20"
Last Pass
---------
Timestamp/Logs
--------------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+-------------+-----------+----------+------------+
| 900.203 | Software upgrade auto-apply failed | orchestration=sw-upgrade | critical | 2020-07-02T18:24:53.413091 |
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=04a34a69-8c18-4494-a97f-5035682d7427 | warning | 2020-07-02T18:11:32.397315 |
| 200.001 | compute-0 was administratively locked to take it out-of-service. | host=compute-0 | warning | 2020-07-02T18:10:53.991463 |
| 750.006 | A configuration change requires a reapply of the oidc-auth-apps application. | k8s_application=oidc-auth-apps | warning | 2020-07-02T17:31:13.243285 |
| 750.006 | A configuration change requires a reapply of the platform-integ-apps application. | k8s_application=platform-integ-apps | warning | 2020-07-02T17:31:13.059851 |
| 900.005 | System Upgrade in progress. | host=controller | minor | 2020-07-02T17:30:08.041908 |
[2020-11-16 23:55:01,419] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:55:07,956] 436 DEBUG MainThread ssh.expect :: Output:
Expecting number of interface sriov_numvfs=32. Please wait a few minutes for inventory update and retry host-unlock.
[2020-11-16 23:56:15,366] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:56:24,457] 436 DEBUG MainThread ssh.expect :: Output:
+-----------------------+--------------------------------------------+
| Property | Value |
+-----------------------+--------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | None |
| bm_type | none |
| bm_username | None |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor'} |
| clock_synchronization | ntp |
| config_applied | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| config_status | None |
| config_target | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| console | ttyS0,115200n8 |
| created_at | 2020-11-16T14:59:44.949842+00:00 |
| device_image_update | None |
| hostname | controller-0 |
| id | 1 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abcd:204::2 |
| mgmt_mac | 3c:fd:fe:a0:16:78 |
| operational | disabled |
| personality | controller |
| reboot_needed | False |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | stx 4.0 |
| subfunction_avail | online |
| subfunction_oper | disabled |
| subfunctions | controller,worker,lowlatency |
| task | Unlocking |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2020-11-16T23:55:38.864171+00:00 |
| uptime | 111 |
| uuid | 20167fc8-c125-4b72-aace-fc10bb8de147 |
| vim_progress_status | services-disabled |
+-----------------------+--------------------------------------------+
Test Activity
-------------
Regression Testing
Workaround
----------
Wait a couple of minutes and try to unlock the node again. |
Brief Description
-----------------
During distributed cloud orchestrated upgrade, worker node unlock failed.
Output indicates sriov_numvfs configuration might need more time to be applied.
Severity
--------
Major
Steps to Reproduce
------------------
Follow upgrade procedure as per upgrade orchestration.
The issue is seen when orchestration attempts to unlock compute-0.
Expected Behavior
------------------
No failure on host unlock on any node during upgrade orchestration.
Actual Behavior
----------------
Unlock of worker-0 fails.
Reproducibility
---------------
Intermittent
System Configuration
--------------------
Distributed Cloud
Branch/Pull Time/Commit
-----------------------
stx4.0 as of "2020-06-27_18-35-20"
Last Pass
---------
Timestamp/Logs
--------------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+-------------+-----------+----------+------------+
| 900.203 | Software upgrade auto-apply failed | orchestration=sw-upgrade | critical | 2020-07-02T18:24:53.413091 |
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=04a34a69-8c18-4494-a97f-5035682d7427 | warning | 2020-07-02T18:11:32.397315 |
| 200.001 | compute-0 was administratively locked to take it out-of-service. | host=compute-0 | warning | 2020-07-02T18:10:53.991463 |
| 750.006 | A configuration change requires a reapply of the oidc-auth-apps application. | k8s_application=oidc-auth-apps | warning | 2020-07-02T17:31:13.243285 |
| 750.006 | A configuration change requires a reapply of the platform-integ-apps application. | k8s_application=platform-integ-apps | warning | 2020-07-02T17:31:13.059851 |
| 900.005 | System Upgrade in progress. | host=controller | minor | 2020-07-02T17:30:08.041908 |
[2020-11-16 23:55:01,419] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:55:07,956] 436 DEBUG MainThread ssh.expect :: Output:
Expecting number of interface sriov_numvfs=32. Please wait a few minutes for inventory update and retry host-unlock.
[2020-11-16 23:56:15,366] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:56:24,457] 436 DEBUG MainThread ssh.expect :: Output:
+-----------------------+--------------------------------------------+
| Property | Value |
+-----------------------+--------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | None |
| bm_type | none |
| bm_username | None |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor'} |
| clock_synchronization | ntp |
| config_applied | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| config_status | None |
| config_target | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| console | ttyS0,115200n8 |
| created_at | 2020-11-16T14:59:44.949842+00:00 |
| device_image_update | None |
| hostname | controller-0 |
| id | 1 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abcd:204::2 |
| mgmt_mac | 3c:fd:fe:a0:16:78 |
| operational | disabled |
| personality | controller |
| reboot_needed | False |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | stx 4.0 |
| subfunction_avail | online |
| subfunction_oper | disabled |
| subfunctions | controller,worker,lowlatency |
| task | Unlocking |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2020-11-16T23:55:38.864171+00:00 |
| uptime | 111 |
| uuid | 20167fc8-c125-4b72-aace-fc10bb8de147 |
| vim_progress_status | services-disabled |
+-----------------------+--------------------------------------------+
Test Activity
-------------
Regression Testing
Workaround
----------
Wait a couple of minutes and try to unlock the node again. |
|
2021-02-05 22:51:09 |
Adriano Oliveira |
description |
Brief Description
-----------------
During distributed cloud orchestrated upgrade, worker node unlock failed.
Output indicates sriov_numvfs configuration might need more time to be applied.
Severity
--------
Major
Steps to Reproduce
------------------
Follow upgrade procedure as per upgrade orchestration.
The issue is seen when orchestration attempts to unlock compute-0.
Expected Behavior
------------------
No failure on host unlock on any node during upgrade orchestration.
Actual Behavior
----------------
Unlock of worker-0 fails.
Reproducibility
---------------
Intermittent
System Configuration
--------------------
Distributed Cloud
Branch/Pull Time/Commit
-----------------------
stx4.0 as of "2020-06-27_18-35-20"
Last Pass
---------
Timestamp/Logs
--------------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+-------------+-----------+----------+------------+
| 900.203 | Software upgrade auto-apply failed | orchestration=sw-upgrade | critical | 2020-07-02T18:24:53.413091 |
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=04a34a69-8c18-4494-a97f-5035682d7427 | warning | 2020-07-02T18:11:32.397315 |
| 200.001 | compute-0 was administratively locked to take it out-of-service. | host=compute-0 | warning | 2020-07-02T18:10:53.991463 |
| 750.006 | A configuration change requires a reapply of the oidc-auth-apps application. | k8s_application=oidc-auth-apps | warning | 2020-07-02T17:31:13.243285 |
| 750.006 | A configuration change requires a reapply of the platform-integ-apps application. | k8s_application=platform-integ-apps | warning | 2020-07-02T17:31:13.059851 |
| 900.005 | System Upgrade in progress. | host=controller | minor | 2020-07-02T17:30:08.041908 |
[2020-11-16 23:55:01,419] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:55:07,956] 436 DEBUG MainThread ssh.expect :: Output:
Expecting number of interface sriov_numvfs=32. Please wait a few minutes for inventory update and retry host-unlock.
[2020-11-16 23:56:15,366] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:56:24,457] 436 DEBUG MainThread ssh.expect :: Output:
+-----------------------+--------------------------------------------+
| Property | Value |
+-----------------------+--------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | None |
| bm_type | none |
| bm_username | None |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor'} |
| clock_synchronization | ntp |
| config_applied | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| config_status | None |
| config_target | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| console | ttyS0,115200n8 |
| created_at | 2020-11-16T14:59:44.949842+00:00 |
| device_image_update | None |
| hostname | controller-0 |
| id | 1 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abcd:204::2 |
| mgmt_mac | 3c:fd:fe:a0:16:78 |
| operational | disabled |
| personality | controller |
| reboot_needed | False |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | stx 4.0 |
| subfunction_avail | online |
| subfunction_oper | disabled |
| subfunctions | controller,worker,lowlatency |
| task | Unlocking |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2020-11-16T23:55:38.864171+00:00 |
| uptime | 111 |
| uuid | 20167fc8-c125-4b72-aace-fc10bb8de147 |
| vim_progress_status | services-disabled |
+-----------------------+--------------------------------------------+
Test Activity
-------------
Regression Testing
Workaround
----------
Wait a couple of minutes and try to unlock the node again. |
Brief Description
-----------------
During distributed cloud orchestrated upgrade, worker node unlock failed.
Output indicates sriov_numvfs configuration might need more time to be applied.
Severity
--------
Major
Steps to Reproduce
------------------
Follow upgrade procedure as per upgrade orchestration.
The issue is seen when orchestration attempts to unlock worker-0.
Expected Behavior
------------------
No failure on host unlock on any node during upgrade orchestration.
Actual Behavior
----------------
Unlock of worker-0 fails.
Reproducibility
---------------
Intermittent
System Configuration
--------------------
Distributed Cloud
Branch/Pull Time/Commit
-----------------------
stx4.0 as of "2020-06-27_18-35-20"
Last Pass
---------
Timestamp/Logs
--------------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+-------------+-----------+----------+------------+
| 900.203 | Software upgrade auto-apply failed | orchestration=sw-upgrade | critical | 2020-07-02T18:24:53.413091 |
| 800.001 | Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s' for more details. | cluster=04a34a69-8c18-4494-a97f-5035682d7427 | warning | 2020-07-02T18:11:32.397315 |
| 200.001 | worker-0 was administratively locked to take it out-of-service. | host=worker-0 | warning | 2020-07-02T18:10:53.991463 |
| 750.006 | A configuration change requires a reapply of the oidc-auth-apps application. | k8s_application=oidc-auth-apps | warning | 2020-07-02T17:31:13.243285 |
| 750.006 | A configuration change requires a reapply of the platform-integ-apps application. | k8s_application=platform-integ-apps | warning | 2020-07-02T17:31:13.059851 |
| 900.005 | System Upgrade in progress. | host=controller | minor | 2020-07-02T17:30:08.041908 |
[2020-11-16 23:55:01,419] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:55:07,956] 436 DEBUG MainThread ssh.expect :: Output:
Expecting number of interface sriov_numvfs=32. Please wait a few minutes for inventory update and retry host-unlock.
[2020-11-16 23:56:15,366] 314 DEBUG MainThread ssh.send :: Send 'system --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'
[2020-11-16 23:56:24,457] 436 DEBUG MainThread ssh.expect :: Output:
+-----------------------+--------------------------------------------+
| Property | Value |
+-----------------------+--------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | None |
| bm_type | none |
| bm_username | None |
| boot_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| capabilities | {u'stor_function': u'monitor'} |
| clock_synchronization | ntp |
| config_applied | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| config_status | None |
| config_target | 1d1484c5-dd15-49b3-ab87-0ed0fc4c4a3d |
| console | ttyS0,115200n8 |
| created_at | 2020-11-16T14:59:44.949842+00:00 |
| device_image_update | None |
| hostname | controller-0 |
| id | 1 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | abcd:204::2 |
| mgmt_mac | 3c:fd:fe:a0:16:78 |
| operational | disabled |
| personality | controller |
| reboot_needed | False |
| reserved | False |
| rootfs_device | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| serialid | None |
| software_load | stx 4.0 |
| subfunction_avail | online |
| subfunction_oper | disabled |
| subfunctions | controller,worker,lowlatency |
| task | Unlocking |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2020-11-16T23:55:38.864171+00:00 |
| uptime | 111 |
| uuid | 20167fc8-c125-4b72-aace-fc10bb8de147 |
| vim_progress_status | services-disabled |
+-----------------------+--------------------------------------------+
Test Activity
-------------
Regression Testing
Workaround
----------
Wait a couple of minutes and try to unlock the node again. |
|
2021-02-08 16:48:40 |
Ghada Khalil |
tags |
|
stx.5.0 stx.networking stx.update |
|
2021-02-08 16:49:50 |
Ghada Khalil |
starlingx: importance |
Undecided |
Medium |
|
2021-03-25 17:33:34 |
Adriano Oliveira |
starlingx: status |
In Progress |
Fix Released |
|
2021-06-15 17:39:25 |
OpenStack Infra |
tags |
stx.5.0 stx.networking stx.update |
in-f-centos8 stx.5.0 stx.networking stx.update |
|
2021-06-16 12:24:31 |
OpenStack Infra |
bug watch added |
|
https://github.com/kubernetes-client/python/issues/765 |
|
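The workaround in the bug description (wait a couple of minutes, then retry the unlock) could be scripted roughly as follows. This is only a sketch under stated assumptions, not the fix that was released: the function name `unlock_with_retry`, its parameters, and the `RETRY_HINT` constant are hypothetical, and the caller supplies `run_unlock`, a callable that issues the `host-unlock` command however the environment requires and returns its text output. The only detail taken from the log is the wording of the rejection message.

```python
import time
from typing import Callable

# Substring of the rejection message quoted in the bug log:
# "... Please wait a few minutes for inventory update and retry host-unlock."
RETRY_HINT = "retry host-unlock"


def unlock_with_retry(
    run_unlock: Callable[[], str],
    attempts: int = 5,
    delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call run_unlock until its output no longer asks for a retry.

    run_unlock is a caller-supplied (hypothetical) function that runs the
    host-unlock command and returns its output as text.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    output = ""
    for attempt in range(1, attempts + 1):
        output = run_unlock()
        if RETRY_HINT not in output:
            # The unlock request was accepted (or failed for another reason).
            return output
        if attempt < attempts:
            # Give the inventory update time to report sriov_numvfs.
            sleep(delay_s)

    raise TimeoutError(
        f"host-unlock still rejected after {attempts} attempts: {output}"
    )
```

Passing the command runner and the sleep function as arguments keeps the retry logic independent of how the command is sent (SSH session, local subprocess, and so on) and lets it be exercised without real delays.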