Locking controller timed out waiting for helm-controller to terminate
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | High | Joshua Reed |
Bug Description
Brief Description
-----------------
Locking the standby controller timed out with the error 'Terminating pods on disabled host controller-0 timed out...'
Severity
-----------------
Major
Steps to Reproduce
-----------------
Manual:
Find the standby controller.
system host-list
Lock standby controller
system host-lock controller-0
+------
| Property | Value |
+------
| action | none |
| administrative | unlocked |
| apparmor | disabled |
| availability | available |
| bm_ip | 2620:10a:
| bm_type | dynamic |
| bm_username | sysadmin |
| boot_device | /dev/disk/
| capabilities | {'is_max_
| clock_synchroni
| config_applied | 525564f9-
| config_status | None |
| config_target | 525564f9-
| console | ttyS0,115200n8 |
| created_at | 2023-08-
| cstates_available | C1,C1E,C6,POLL |
| device_image_update | None |
| hostname | controller-0 |
| hw_settle | 0 |
| id | 1 |
| install_output | text |
| install_state | None |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| max_cpu_mhz_allowed | 3300 |
| max_cpu_
| mgmt_ip | fdff:719a:
| mgmt_mac | b4:83:51:00:ae:f8 |
| min_cpu_mhz_allowed | 800 |
| operational | enabled |
| personality | controller |
| reboot_needed | False |
| reserved | False |
| rootfs_device | /dev/disk/
| serialid | None |
| software_load | 23.09 |
| task | Locking |
| tboot | |
| ttys_dcd | False |
| updated_at | 2023-08-
| uptime | 23186 |
| uuid | f1b4280a-
| vim_progress_status | services-enabled |
+------
[sysadmin@
Wait for some time. The lock of controller-0 fails and the host remains unlocked:
system host-show controller-0
+------
| Property | Value |
+------
| action | none |
| administrative | unlocked |
| apparmor | disabled |
| availability | available |
| bm_ip | 2620:10a:
| bm_type | dynamic |
| bm_username | sysadmin |
| boot_device | /dev/disk/
| capabilities | {'is_max_
| | 'Personality': 'Controller-
| clock_synchroni
| config_applied | 525564f9-
| config_status | None |
| config_target | 525564f9-
| console | ttyS0,115200n8 |
| created_at | 2023-08-
| cstates_available | C1,C1E,C6,POLL |
| device_image_update | None |
| hostname | controller-0 |
| hw_settle | 0 |
| id | 1 |
| install_output | text |
| install_state | None |
| install_state_info | None |
| inv_state | inventoried |
| invprovision | provisioned |
| location | {} |
| max_cpu_mhz_allowed | 3300 |
| max_cpu_
| mgmt_ip | fdff:719a:
| mgmt_mac | b4:83:51:00:ae:f8 |
| min_cpu_mhz_allowed | 800 |
| operational | enabled |
| personality | controller |
| reboot_needed | False |
| reserved | False |
| rootfs_device | /dev/disk/
| serialid | None |
| software_load | 23.09 |
| task | |
| tboot | |
| ttys_dcd | False |
| updated_at | 2023-08-
| uptime | 23956 |
| uuid | f1b4280a-
| vim_progress_status | Terminating pods on disabled host controller-0 timed out... |
+------
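The manual check above (lock the host, wait, then re-run `system host-show`) could be scripted roughly as follows. This is only a sketch: `host_field` and `wait_for_lock` are hypothetical helper names, the attempt count and polling interval are arbitrary, and it assumes the `administrative` field reads `locked` once the lock succeeds. It only parses the table layout shown in this report.

```shell
# Extract one property value from `system host-show` table output on stdin.
# Usage: system host-show controller-0 | host_field vim_progress_status
host_field() {
    awk -F'|' -v key="$1" '
        {
            name = $2; value = $3
            # Trim the padding around the property name and its value.
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }'
}

# Poll until the host is locked, the lock attempt reports a timeout,
# or the attempts run out. Hypothetical helper; defaults are arbitrary.
wait_for_lock() {
    host="$1"; attempts="${2:-60}"; interval="${3:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        out=$(system host-show "$host")
        admin=$(printf '%s\n' "$out" | host_field administrative)
        status=$(printf '%s\n' "$out" | host_field vim_progress_status)
        # Assumption: a successful lock sets administrative to "locked".
        if [ "$admin" = "locked" ]; then
            echo "locked"
            return 0
        fi
        case "$status" in
            *"timed out"*) echo "lock failed: $status"; return 1 ;;
        esac
        i=$((i + 1))
        sleep "$interval"
    done
    echo "gave up waiting"
    return 2
}
```

A possible use is `system host-lock controller-0 && wait_for_lock controller-0`. Note that a stale "timed out" status left over from an earlier attempt would be reported as a failure on the first poll, so the sketch is best run against a host whose previous lock attempt has been cleared.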
Expected Behavior
-----------------
The controller lock should succeed.
Actual Behavior
-----------------
The controller lock fails and the host reverts to the unlocked state.
Reproducibility
-----------------
Intermittent
System Configuration
-----------------
AIO-PLUS, AIO-DX, STANDARD with Storage.
Last Pass
-----------------
8/31/23 - Bug introduced by https:/
Test Activity
-----------------
Sanity
Workaround
-----------------
1. kubectl edit deployment -n flux-helm helm-controller
2. Edit the following:
- Add argument: "--graceful-
- Change the "terminationGra
- from: terminationGrac
- to: terminationGrac
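Before and after editing the deployment, it may help to read back the two settings the workaround touches. The sketch below uses `kubectl get` with a JSONPath expression. The function names are hypothetical, the `KUBECTL` override exists only so the commands can be dry-run with a stub, and reading `containers[0]` assumes helm-controller is the first container in the pod template.

```shell
# Print the pod termination grace period of the helm-controller deployment.
# KUBECTL may be overridden (for example with a stub); it defaults to kubectl.
helm_controller_grace_period() {
    "${KUBECTL:-kubectl}" get deployment helm-controller -n flux-helm \
        -o jsonpath='{.spec.template.spec.terminationGracePeriodSeconds}'
}

# Print the container arguments of the helm-controller deployment.
# Assumption: the helm-controller container is containers[0].
helm_controller_args() {
    "${KUBECTL:-kubectl}" get deployment helm-controller -n flux-helm \
        -o jsonpath='{.spec.template.spec.containers[0].args}'
}
```

Calling `helm_controller_grace_period` and `helm_controller_args` once before and once after the edit shows whether the change was saved to the deployment.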
Changed in starlingx:
assignee: nobody → Joshua Reed (jreed7)
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/ansible-playbooks/+/893978