Reinstall of worker node results in Configuration failure. "Could not find command 'configure'"
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | High | Al Bailey |
Bug Description
Brief Description
-----------------
Host Reinstall (worker node) failed to apply puppet manifest
Configuration failure, threshold reached
Severity
--------
standard
Steps to Reproduce
------------------
1. worker node was running prior to this test
2. reinstall initiated
3. attempt to unlock after worker node installed and online
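The steps above can be sketched with the standard StarlingX host CLI (run on the active controller with admin credentials sourced; host name `compute-2` is taken from the failure output below and is an assumption for illustration):

```shell
# Lock the running worker so it can be reinstalled
system host-lock compute-2

# Wipe and reinstall the host
system host-reinstall compute-2

# ...wait until the host shows as online in `system host-list`...

# Unlock; this is the step where the puppet manifest apply fails
system host-unlock compute-2
```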
Expected Behavior
------------------
node reinstalls and runs the puppet manifest without failures
Actual Behavior
----------------
reinstalled but failed to successfully unlock
compute-2 Worker Unlocked Disabled Failed 2 minutes
Configuration failure, threshold reached, Lock/Unlock to retry
[truncated console log: repeated "worker_" configuration messages around t=101.8s, date fragment "-05-18-"]
Reproducibility
---------------
yes
System Configuration
-------
2+3
(Lab: wp_3-7)
Branch/Pull Time/Commit
-------
BUILD_ID=
Last Pass
---------
Timestamp/Logs
--------------
see attached puppet.log output
Truncated puppet.log excerpt (2019-04-):
Kubernetes: ... to 0 failed: Could not find command 'configure'
Test Activity
-------------
[Platform pinning Feature Testing]
tags: added: stx.retestneeded
Changed in starlingx:
assignee: Kevin Smith (kevin.smith.wrs) → Al Bailey (albailey1974)
This problem is almost certainly happening because the re-install causes the worker node to attempt to rejoin the cluster (with `kubeadm join`). This fails because the node still exists in Kubernetes. The likely solution is to have sysinv delete the Kubernetes node when the re-install is done; we already do this when the host itself is deleted.
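A minimal sketch of the proposed fix. `remove_stale_k8s_node` is a hypothetical helper name, not actual sysinv code; it assumes a client object shaped like the official `kubernetes` Python client's `CoreV1Api`, whose `delete_node(name)` raises an `ApiException` with `status == 404` when the node does not exist. The exception class is stubbed locally so the sketch is self-contained:

```python
# Hypothetical sketch (not actual sysinv code): once a host re-install
# completes, delete the stale Node object from Kubernetes so that the
# subsequent `kubeadm join` on unlock does not fail.

class ApiException(Exception):
    """Stand-in for kubernetes.client.rest.ApiException."""
    def __init__(self, status):
        super().__init__("API error %d" % status)
        self.status = status

def remove_stale_k8s_node(core_api, host_name):
    """Delete the Kubernetes Node named `host_name`.

    `core_api` is expected to behave like kubernetes.client.CoreV1Api.
    Returns True if a node was deleted, False if there was nothing to
    delete (e.g. the host was never joined, or was already cleaned up).
    """
    try:
        core_api.delete_node(host_name)
        return True
    except ApiException as exc:
        if exc.status == 404:   # node already gone: nothing to do
            return False
        raise                   # any other API failure is a real error
```

Operationally, the same effect can be had as a workaround by running `kubectl delete node compute-2` before retrying the lock/unlock.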