Brief Description
-----------------
After storage node storage-0 was locked and unlocked, it never recovered because of a failure of its ceph (osd.0, osd.1, ) process. Auto recovery was not successful.

Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-show storage-0'
[2019-05-26 02:02:28,248] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+---------------------------------------------------------------+
| Property            | Value                                                         |
+---------------------+---------------------------------------------------------------+
| action              | none                                                          |
| administrative      | locked                                                        |
| availability        | online                                                        |
| bm_ip               | 128.224.64.220                                                |
| bm_type             | bmc                                                           |
| bm_username         | root                                                          |
| boot_device         | /dev/disk/by-path/pci-0000:83:00.0-nvme-1                     |
| capabilities        | {u'stor_function': u'monitor'}                                |
| config_applied      | bfa21b40-4963-486c-9d0b-a978245be1ef                          |
| config_status       | None                                                          |
| config_target       | bfa21b40-4963-486c-9d0b-a978245be1ef                          |
| console             | ttyS0,115200                                                  |
| created_at          | 2019-05-25T16:15:52.115877+00:00                              |
| hostname            | storage-0                                                     |
| id                  | 6                                                             |
| install_output      | text                                                          |
| install_state       | completed                                                     |
| install_state_info  | None                                                          |
| invprovision        | provisioned                                                   |
| location            | {}                                                            |
| mgmt_ip             | 192.168.204.191                                               |
| mgmt_mac            | 90:e2:ba:c6:95:ec                                             |
| operational         | disabled                                                      |
| peers               | {u'hosts': [u'storage-1', u'storage-0'], u'name': u'group-0'} |
| personality         | storage                                                       |
| reserved            | False                                                         |
| rootfs_device       | /dev/disk/by-path/pci-0000:83:00.0-nvme-1                     |
| serialid            | None                                                          |
| software_load       | 19.05                                                         |
| task                |                                                               |
| tboot               | false                                                         |
| ttys_dcd            | None                                                          |
| updated_at          | 2019-05-26T02:02:22.902168+00:00                              |
| uptime              | 33217                                                         |
| uuid                | a56c9e15-ca13-4c26-b9b2-c6d93e14c8da                          |
| vim_progress_status | services-disabled                                             |
+---------------------+---------------------------------------------------------------+
[wrsroot@controller-0 ~(keystone_admin)]$
Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock storage-0'
[2019-05-26 02:03:03,616] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+---------------------------------------------------------------+
| Property            | Value                                                         |
+---------------------+---------------------------------------------------------------+
| action              | none                                                          |
| administrative      | locked                                                        |
| availability        | online                                                        |
| bm_ip               | 128.224.64.220                                                |
| bm_type             | bmc                                                           |
| bm_username         | root                                                          |
| boot_device         | /dev/disk/by-path/pci-0000:83:00.0-nvme-1                     |
| capabilities        | {u'stor_function': u'monitor'}                                |
| config_applied      | bfa21b40-4963-486c-9d0b-a978245be1ef                          |
| config_status       | None                                                          |
| config_target       | bfa21b40-4963-486c-9d0b-a978245be1ef                          |
| console             | ttyS0,115200                                                  |
| created_at          | 2019-05-25T16:15:52.115877+00:00                              |
| hostname            | storage-0                                                     |
| id                  | 6                                                             |
| install_output      | text                                                          |
| install_state       | completed                                                     |
| install_state_info  | None                                                          |
| invprovision        | provisioned                                                   |
| location            | {}                                                            |
| mgmt_ip             | 192.168.204.191                                               |
| mgmt_mac            | 90:e2:ba:c6:95:ec                                             |
| operational         | disabled                                                      |
| peers               | {u'hosts': [u'storage-1', u'storage-0'], u'name': u'group-0'} |
| personality         | storage                                                       |
| reserved            | False                                                         |
| rootfs_device       | /dev/disk/by-path/pci-0000:83:00.0-nvme-1                     |
| serialid            | None                                                          |
| software_load       | 19.05                                                         |
| task                | Unlocking                                                     |
| tboot               | false                                                         |
| ttys_dcd            | None                                                          |
| updated_at          | 2019-05-26T02:02:22.902168+00:00                              |
| uptime              | 33217                                                         |
| uuid                | a56c9e15-ca13-4c26-b9b2-c6d93e14c8da                          |
| vim_progress_status | services-disabled                                             |
+---------------------+---------------------------------------------------------------+
DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-05-26 02:25:41,372] 262 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-show storage-0'
[2019-05-26 02:25:43,014] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+---------------------------------------------------------------+
| Property            | Value                                                         |
+---------------------+---------------------------------------------------------------+
| action              | none                                                          |
| administrative      | unlocked                                                      |
| availability        | failed                                                        |
| bm_ip               | 128.224.64.220                                                |
| bm_type             | bmc                                                           |
| bm_username         | root                                                          |
| boot_device         | /dev/disk/by-path/pci-0000:83:00.0-nvme-1                     |
| capabilities        | {u'stor_function': u'monitor'}                                |
| config_applied      | bfa21b40-4963-486c-9d0b-a978245be1ef                          |
| config_status       | None                                                          |
| config_target       | bfa21b40-4963-486c-9d0b-a978245be1ef                          |
| console             | ttyS0,115200                                                  |
| created_at          | 2019-05-25T16:15:52.115877+00:00                              |
| hostname            | storage-0                                                     |
| id                  | 6                                                             |
| install_output      | text                                                          |
| install_state       | completed                                                     |
| install_state_info  | None                                                          |
| invprovision        | provisioned                                                   |
| location            | {}                                                            |
| mgmt_ip             | 192.168.204.191                                               |
| mgmt_mac            | 90:e2:ba:c6:95:ec                                             |
| operational         | disabled                                                      |
| peers               | {u'hosts': [u'storage-1', u'storage-0'], u'name': u'group-0'} |
| personality         | storage                                                       |
| reserved            | False                                                         |
| rootfs_device       | /dev/disk/by-path/pci-0000:83:00.0-nvme-1                     |
| serialid            | None                                                          |
| software_load       | 19.05                                                         |
| task                | Service Failure, threshold reached, Lock/Unlock to retry      |
| tboot               | false                                                         |
| ttys_dcd            | None                                                          |
| updated_at          | 2019-05-26T02:24:51.733772+00:00                              |
| uptime              | 841                                                           |
| uuid                | a56c9e15-ca13-4c26-b9b2-c6d93e14c8da                          |
| vim_progress_status | services-disabled                                             |
+---------------------+---------------------------------------------------------------+
[wrsroot@controller-0 ~(keystone_admin)]$
019-05-26 03:09:00,908] 387 DEBUG MainThread
ssh.expect :: Output:
fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne alarm-list --nowrap --uuid
+--------------------------------------+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------+----------+----------------------------+
| UUID                                 | Alarm ID | Reason Text                                                                                                                                            | Entity ID                                                                     | Severity | Time Stamp                 |
+--------------------------------------+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------+----------+----------------------------+
| 94c0c118-4cb0-45af-a72e-429058555b9b | 200.006  | storage-0 is degraded due to the failure of its 'ceph (osd.0, osd.1, )' process. Auto recovery of this major process is in progress.                   | host=storage-0.process=ceph (osd.0, osd.1, )                                  | major    | 2019-05-26T02:14:47.616363 |
| 574b7d0e-426d-4d34-9733-dff2d693b1dd | 200.004  | storage-0 experienced a service-affecting failure. Auto-recovery in progress. Manual Lock and Unlock may be required if auto-recovery is unsuccessful. | host=storage-0                                                                | critical | 2019-05-26T02:07:44.159670 |
| 484d6b73-fe37-459c-80dd-f943dd6cc81e | 800.011  | Loss of replication in replication group group-0: OSDs are down                                                                                        | cluster=bafbb0f1-b04c-42b6-a492-03911e6c21f3.peergroup=group-0.host=storage-0 | major    | 2019-05-26T02:02:27.949793 |
| e8a473ea-1e6d-4f34-93a7-89f253c1ccc7 | 800.001  | Storage Alarm Condition: HEALTH_WARN [PGs are degraded/stuck or undersized]. Please check 'ceph -s' for more details.                                  | cluster=bafbb0f1-b04c-42b6-a492-03911e6c21f3                                  | warning  | 2019-05-26T02:02:27.676319 |
+--------------------------------------+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------+----------+----------------------------+

Severity
--------
Major

Steps to Reproduce
------------------
1. Install a storage lab with the OpenStack application as per the install procedure.
2. Lock and unlock the storage node.
3. The storage node does not reach the available state, as per the description.

Expected Behavior
------------------
No alarms, and PTP enabled on all hosts.

Actual Behavior
----------------
storage-0 did not recover. It remained in the failed state.

Reproducibility
---------------

System Configuration
--------------------
Regular system

Branch/Pull Time/Commit
-----------------------
BUILD_DATE=":2019-05-24_17-39-51"

Last Pass
---------
20190503T013000Z

Timestamp/Logs
--------------
2019-05-26T02:02:22.902168+00:00

Test Activity
-------------
Regression test
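
Triage note
-----------
The host-show outputs in this report differ only in a few fields (administrative, availability, task, updated_at, uptime), which is easy to miss when reading the raw tables. Below is a minimal sketch, not part of the test framework, of how such output could be parsed and checked after an unlock. The function names parse_property_table and host_recovered are hypothetical, and the sketch assumes that a recovered host reports unlocked / enabled / available; the sample is a trimmed excerpt of the 02:25:43 output with the column width reduced.

```python
def parse_property_table(text):
    """Parse a '| Property | Value |' table from CLI output into a dict.

    Border lines (starting with '+') and any surrounding log lines are skipped.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # border or log line, not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Property":
            continue  # header row or a row with an unexpected shape
        props[cells[0]] = cells[1]
    return props


def host_recovered(props):
    """Return True if the host shows the assumed healthy state triple."""
    return (
        props.get("administrative") == "unlocked"
        and props.get("operational") == "enabled"
        and props.get("availability") == "available"
    )


# Excerpt of the host-show output taken after the unlock (rows trimmed).
SAMPLE = """\
+---------------------+----------------------------------------------------------+
| Property            | Value                                                    |
+---------------------+----------------------------------------------------------+
| administrative      | unlocked                                                 |
| availability        | failed                                                   |
| hostname            | storage-0                                                |
| operational         | disabled                                                 |
| task                | Service Failure, threshold reached, Lock/Unlock to retry |
+---------------------+----------------------------------------------------------+
"""

if __name__ == "__main__":
    props = parse_property_table(SAMPLE)
    # For this excerpt the host is reported as not recovered.
    print(props["hostname"], props["availability"], host_recovered(props))
```

Run against the three outputs above, a check like this would flag the last one: the host is unlocked but still disabled and failed, with the task field asking for another lock/unlock.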