Confusing behaviour if a host is unlocked before the VIM has finished disabling services after a host-lock
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Low | John Kung |
Bug Description
Brief Description
-----------------
If a user locks a host, then unlocks it soon after (before the VIM has disabled all of the services), the lock appears to get cancelled, and the VIM enables all of the services again.
I believe this is by design, but there are a couple of issues:
1. A system host-show shows the host's task seemingly 'stuck' in 'Locking-', even after the unlock is issued and rejected.
2. The user is not notified that the previous lock has been cancelled. The host-unlock simply gives the message:
"Avoiding 'unlock' action on already 'unlocked' host compute-0"
Additional info:
-----------------
I have seen this while testing the potential fix for https:/
In the above potential fix, the VIM waits until all pods are terminated before locking the host. This can take up to 30 seconds. If a host-unlock is issued before the pods are terminated, the VIM will take away the NoExecute taint on the node and restart all of the terminating pods.
So consider a user who follows this sequence:
1. lock a host
2. make a system change
3. unlock the host
They may not get the behaviour they expect if steps 2 and 3 happen before the VIM has disabled all of the services. I think the user should be given some indication that the initial 'lock' action has been cancelled. I'm also not sure the task should still read 'Locking-' once the lock has been cancelled.
If the unlock happens after the pods are terminated, the VIM has disabled services, and the host is actually locked, everything works as expected.
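Until the behaviour changes, a script driving the lock/change/unlock sequence could avoid the race by waiting for the lock to complete before unlocking. The following is only a minimal sketch of that idea, not how sysinv or the VIM work. The helper names and the `get_host` callable are hypothetical, and it assumes that a fully locked host reports `administrative` as `locked` with an empty `task`, using the field names shown by system host-show.

```python
import time
from typing import Callable, Dict, Optional


def is_fully_locked(host: Dict[str, Optional[str]]) -> bool:
    """Assumption: a completed lock shows administrative=locked and no task."""
    task = (host.get("task") or "").strip()
    return host.get("administrative") == "locked" and task == ""


def wait_until_locked(
    get_host: Callable[[], Dict[str, Optional[str]]],
    timeout_s: float = 60.0,
    poll_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll a hypothetical get_host() until the lock completes or we time out.

    get_host is supplied by the caller and returns the host-show properties
    as a dict; how it obtains them is outside the scope of this sketch.
    """
    deadline = clock() + timeout_s
    while True:
        if is_fully_locked(get_host()):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

A caller would issue the unlock only when `wait_until_locked(...)` returns True, and report a timeout otherwise.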
sysinv logs:
host-lock:
2019-10-08 19:00:58.470 95074 INFO sysinv.
2019-10-08 19:00:58.470 95074 INFO sysinv.
2019-10-08 19:00:58.470 95074 INFO sysinv.
2019-10-08 19:00:58.470 95074 INFO sysinv.
2019-10-08 19:00:58.478 95074 INFO sysinv.
2019-10-08 19:00:58.478 95074 INFO sysinv.
2019-10-08 19:00:58.478 95074 INFO sysinv.
2019-10-08 19:00:58.478 95074 INFO sysinv.
2019-10-08 19:00:58.478 95074 INFO sysinv.
2019-10-08 19:00:58.478 95074 INFO sysinv.
2019-10-08 19:00:58.478 95074 INFO sysinv.
2019-10-08 19:00:58.478 95074 INFO sysinv.
2019-10-08 19:00:58.479 95074 INFO sysinv.
2019-10-08 19:00:58.479 95074 INFO sysinv.
2019-10-08 19:00:58.479 95074 INFO sysinv.
2019-10-08 19:00:58.490 95074 INFO sysinv.
2019-10-08 19:00:58.490 95074 INFO sysinv.
2019-10-08 19:00:58.490 95074 WARNING sysinv.
2019-10-08 19:00:58.491 95074 WARNING sysinv.
2019-10-08 19:00:58.491 95074 INFO sysinv.
2019-10-08 19:00:58.502 95074 INFO sysinv.
2019-10-08 19:00:58.595 95074 INFO sysinv.
2019-10-08 19:00:58.609 95074 INFO sysinv.
2019-10-08 19:01:11.024 95074 INFO sysinv.
host-unlock:
2019-10-08 19:01:11.978 95073 INFO sysinv.
2019-10-08 19:01:11.978 95073 INFO sysinv.
2019-10-08 19:01:11.990 95073 INFO sysinv.
2019-10-08 19:01:12.001 95073 WARNING wsme.api [-] Client-side error: Avoiding 'unlock' action on already 'unlocked' host compute-1
2019-10-08 19:01:13.950 95074 INFO sysinv.
2019-10-08 19:01:13.950 95074 INFO sysinv.
2019-10-08 19:01:13.978 95074 INFO sysinv.
2019-10-08 19:01:14.040 95073 INFO sysinv.
2019-10-08 19:01:29.809 95074 INFO sysinv.
2019-10-08 19:01:29.809 95074 INFO sysinv.
2019-10-08 19:01:48.883 95074 INFO sysinv.
2019-10-08 19:01:48.883 95074 INFO sysinv.
2019-10-08 19:01:48.883 95074 INFO sysinv.
2019-10-08 19:01:48.884 95074 INFO sysinv.
2019-10-08 19:01:48.884 95074 INFO sysinv.
2019-10-08 19:01:48.884 95074 INFO sysinv.
2019-10-08 19:01:48.884 95074 INFO sysinv.
2019-10-08 19:01:48.884 95074 INFO sysinv.
2019-10-08 19:01:48.884 95074 WARNING sysinv.
2019-10-08 19:01:48.884 95074 WARNING sysinv.
2019-10-08 19:01:48.884 95074 INFO sysinv.
2019-10-08 19:01:48.884 95074 INFO sysinv.
vim logs:
2019-10-
cli:
[sysadmin@
+------
| Property | Value |
+------
| action | none |
| administrative | unlocked |
| availability | available |
| bm_ip | None |
| bm_type | None |
| bm_username | None |
| boot_device | /dev/disk/
| capabilities | {} |
| clock_synchroni
| config_applied | 5d0880f6-
| config_status | None |
| config_target | 5d0880f6-
| console | tty0 |
| created_at | 2019-10-
| hostname | compute-1 |
| id | 3 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | 192.168.204.6 |
| mgmt_mac | 08:00:27:48:33:46 |
| operational | enabled |
| personality | worker |
| reserved | False |
| rootfs_device | /dev/disk/
| serialid | None |
| software_load | 19.09 |
| task | |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-10-
| uptime | 2215 |
| uuid | c4a7e3d5-
| vim_progress_status | services-enabled |
+------
+------
| Property | Value |
+------
| action | none |
| administrative | unlocked |
| availability | available |
| bm_ip | None |
| bm_type | None |
| bm_username | None |
| boot_device | /dev/disk/
| capabilities | {} |
| clock_synchroni
| config_applied | 5d0880f6-
| config_status | None |
| config_target | 5d0880f6-
| console | tty0 |
| created_at | 2019-10-
| hostname | compute-1 |
| id | 3 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | 192.168.204.6 |
| mgmt_mac | 08:00:27:48:33:46 |
| operational | enabled |
| personality | worker |
| reserved | False |
| rootfs_device | /dev/disk/
| serialid | None |
| software_load | 19.09 |
| task | Locking |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-10-
| uptime | 2215 |
| uuid | c4a7e3d5-
| vim_progress_status | services-enabled |
+------
[sysadmin@
+------
| Property | Value |
+------
| action | none |
| administrative | unlocked |
| availability | available |
| bm_ip | None |
| bm_type | None |
| bm_username | None |
| boot_device | /dev/disk/
| capabilities | {} |
| clock_synchroni
| config_applied | 5d0880f6-
| config_status | None |
| config_target | 5d0880f6-
| console | tty0 |
| created_at | 2019-10-
| hostname | compute-1 |
| id | 3 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | 192.168.204.6 |
| mgmt_mac | 08:00:27:48:33:46 |
| operational | enabled |
| personality | worker |
| reserved | False |
| rootfs_device | /dev/disk/
| serialid | None |
| software_load | 19.09 |
| task | Locking- |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-10-
| uptime | 2215 |
| uuid | c4a7e3d5-
| vim_progress_status | services-enabled |
+------
While pods are still terminating:
[sysadmin@
Avoiding 'unlock' action on already 'unlocked' host compute-1
~~~ A minute or two, or even an hour, later ~~~
[sysadmin@
+------
| Property | Value |
+------
| action | none |
| administrative | unlocked |
| availability | available |
| bm_ip | None |
| bm_type | None |
| bm_username | None |
| boot_device | /dev/disk/
| capabilities | {} |
| clock_synchroni
| config_applied | 5d0880f6-
| config_status | None |
| config_target | 5d0880f6-
| console | tty0 |
| created_at | 2019-10-
| hostname | compute-1 |
| id | 3 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | 192.168.204.6 |
| mgmt_mac | 08:00:27:48:33:46 |
| operational | enabled |
| personality | worker |
| reserved | False |
| rootfs_device | /dev/disk/
| serialid | None |
| software_load | 19.09 |
| task | Locking- |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-10-
| uptime | 2280 |
| uuid | c4a7e3d5-
| vim_progress_status | services-enabled |
+------
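The stale state in the output above (administrative 'unlocked' while the task still reads 'Locking-') can be detected mechanically. The sketch below parses a two-column property table of the shape shown here and flags that combination. Both function names are hypothetical, and the parser assumes values never contain a '|' character; rows that do not split into exactly two cells are skipped.

```python
from typing import Dict


def parse_property_table(text: str) -> Dict[str, str]:
    """Turn '| Property | Value |' rows into a dict, ignoring borders."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # border lines such as '+------' or shell prompts
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Property":
            continue  # header row, or a row this simple parser cannot handle
        props[cells[0]] = cells[1]
    return props


def lock_looks_stalled(props: Dict[str, str]) -> bool:
    """True for the combination reported in this bug."""
    return (
        props.get("administrative") == "unlocked"
        and props.get("task", "").startswith("Locking")
    )
```

Feeding it the last table above would report the host as stalled, whereas the first table (empty task) would not.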
Severity
--------
Minor
Steps to Reproduce
------------------
May need the potential fix for https:/
- Standard system.
- Lock a host.
- Wait one second, then unlock the host.
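The steps above can be scripted so the timing is repeatable. This is a rough sketch that shells out to the `system` CLI; it assumes the CLI is on the path with credentials already sourced, the `reproduce` function is hypothetical, and the `run` and `sleep` parameters exist only so the sequence can be exercised without a live system.

```python
import subprocess
import time
from typing import Callable, List


def reproduce(
    hostname: str,
    delay_s: float = 1.0,
    run: Callable[..., object] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[str]]:
    """Lock a host, wait briefly, unlock it, then show its state."""
    lock_cmd = ["system", "host-lock", hostname]
    unlock_cmd = ["system", "host-unlock", hostname]
    show_cmd = ["system", "host-show", hostname]

    run(lock_cmd, check=False)
    sleep(delay_s)  # short enough that the VIM is still disabling services
    run(unlock_cmd, check=False)  # expected to be rejected in this scenario
    run(show_cmd, check=False)  # inspect 'administrative' and 'task'
    return [lock_cmd, unlock_cmd, show_cmd]
```

For example, `reproduce("compute-1")` would run the three commands in order against the worker used in this report.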
Expected Behavior
------------------
The user should be given more information about why the unlock could not be completed, or the interaction between the VIM/sysinv should change to account for this scenario.
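As an illustration of the first option, the rejection text could mention a lock that was still in progress. The sketch below is not sysinv code: the function name and the added wording are hypothetical, and only the base message is taken from this report.

```python
from typing import Optional


def unlock_rejection_message(
    hostname: str, administrative: str, task: Optional[str]
) -> str:
    """Build the rejection text, adding a hint when a lock was in progress."""
    base = "Avoiding 'unlock' action on already 'unlocked' host %s" % hostname
    pending = (task or "").strip()
    if administrative == "unlocked" and pending.startswith("Locking"):
        # Hypothetical extra wording; the real message has no such hint.
        return "%s; a lock was in progress (task '%s') and may have been cancelled" % (
            base,
            pending,
        )
    return base
```

With no task pending, the function returns the existing message unchanged, so only the racing case would see different output.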
Actual Behavior
----------------
The host will not unlock until it is locked again and the VIM has been allowed to finish disabling services.
Reproducibility
---------------
100%
System Configuration
-------
2 controller, 2 worker standard config
Branch/Pull Time/Commit
-------
BUILD_DATE=
tags: added: stx.config
Appears to be a day 1 limitation; assigning to stx.config TL for review and recommendation.
For now, I will mark this as low priority / not gating.