aws-k8s-storage hung evaluating manifests
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
AWS Cloud Provider Charm | Fix Released | Medium | Adam Dyess |
Azure Cloud Provider | Fix Released | Medium | Adam Dyess |
Charm AWS Kubernetes Storage | Fix Released | High | Adam Dyess |
Charm GCP Kubernetes Storage | Fix Released | Medium | Adam Dyess |
KubeVirt Charm | Fix Released | Medium | Adam Dyess |
vSphere Cloud Provider Charm | Fix Released | Medium | Adam Dyess |
Bug Description
During a run of k8s 1.26 on AWS, the aws-k8s-storage charm hung for 4 hours with the status "Evaluating Manifests". Unfortunately, I can't find any logs explaining why this happens or what the charm is doing. A few minutes before the message appears, the unit log shows an unauthorized error when trying to set the aws-secret, but that action never retries, so perhaps the failure is that the charm never gets permission to access AWS.
Testrun: https:/
crashdump: https:/
bundle: https:/
Changed in charm-aws-cloud-provider:
assignee: nobody → Adam Dyess (addyess)
Changed in charm-aws-k8s-storage:
assignee: nobody → Adam Dyess (addyess)
Changed in charm-azure-cloud-provider:
assignee: nobody → Adam Dyess (addyess)
Changed in charm-gcp-k8s-storage:
assignee: nobody → Adam Dyess (addyess)
Changed in charm-vsphere-cloud-provider:
assignee: nobody → Adam Dyess (addyess)
Changed in charm-aws-cloud-provider:
status: In Progress → Fix Committed
Changed in charm-aws-k8s-storage:
status: In Progress → Fix Committed
Changed in charm-azure-cloud-provider:
status: In Progress → Fix Committed
Changed in charm-gcp-k8s-storage:
status: In Progress → Fix Committed
Changed in charm-kubevirt:
status: In Progress → Fix Committed
Changed in charm-aws-cloud-provider:
status: Fix Committed → Fix Released
Changed in charm-aws-k8s-storage:
status: Fix Committed → Fix Released
Changed in charm-azure-cloud-provider:
status: Fix Committed → Fix Released
Changed in charm-gcp-k8s-storage:
status: Fix Committed → Fix Released
Changed in charm-kubevirt:
status: Fix Committed → Fix Released
Changed in charm-vsphere-cloud-provider:
status: Fix Committed → Fix Released
Looks like the manifests failed to apply once due to a temporary API outage during the deployment. The event was deferred[1], so the charm reprocessed the event later.
There is an early return[2] that prevents manifests from being applied if the config has not changed since the last time it was run. I think that early return prevented a reattempt from occurring.
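The failure mode can be sketched as a config-hash guard that records the hash before confirming the apply succeeded. This is a minimal illustration only, not the charm's actual code; the `ManifestApplier` class and its method names are hypothetical:

```python
import hashlib
import json


class ManifestApplier:
    """Sketch of an early-return guard that can swallow a retry.

    The hash of the config is recorded before we know whether apply()
    succeeded, so a deferred event that re-runs with unchanged config
    hits the early return and the manifests are never re-applied.
    """

    def __init__(self):
        self._last_hash = None
        self._applied = False

    def _config_hash(self, config: dict) -> str:
        # Stable hash of the charm config (hypothetical helper).
        data = json.dumps(config, sort_keys=True).encode()
        return hashlib.sha256(data).hexdigest()

    def reconcile(self, config: dict, apply) -> bool:
        new_hash = self._config_hash(config)
        if new_hash == self._last_hash:
            # Early return: config unchanged since last run, so skip
            # re-applying -- even if the previous attempt failed.
            return self._applied
        # Hash is recorded before apply() is known to have succeeded.
        self._last_hash = new_hash
        try:
            apply()
            self._applied = True
        except RuntimeError:
            # Temporary API outage: the deferred retry will re-enter
            # reconcile() with the same config and return early above.
            self._applied = False
        return self._applied
```

A deferred-event replay then looks like this: the first `reconcile()` fails during the outage, and the retry with identical config returns early without ever calling `apply()` again, matching the "Evaluating Manifests" hang described above.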
[1]: https://github.com/charmed-kubernetes/aws-k8s-storage/blob/007f315ba090e3c8f02754c51e4bb87c36afdc9e/src/charm.py#L181-L186
[2]: https://github.com/charmed-kubernetes/aws-k8s-storage/blob/007f315ba090e3c8f02754c51e4bb87c36afdc9e/src/charm.py#L168-L169