STX-Openstack: Pods locked in Init state after node reboot
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
StarlingX | Fix Released | High | Daniel Marques Caires | |
Bug Description
Brief Description
-----------------
After a node reboot, several Pods were stuck in the Init state, with their init containers waiting for Jobs that had already been cleaned up after completion by the TTL configuration.
Cause: airship/
In order to be able to add the "app.starlingx.
Therefore, at least for now, Jobs should not have the TTL configuration and, consequently, should not receive the spec update adding the "app.starlingx.
[1] https:/
[2] https:/
[3] https:/
[4] https:/
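To check which Jobs are subject to cleanup, something like the following sketch may help. It assumes that the "TTL configuration" mentioned above is the Kubernetes Job field `spec.ttlSecondsAfterFinished`; the function name `jobs_with_ttl` is hypothetical and not part of the application.

```shell
# Sketch: print the names of Jobs that carry a TTL, i.e. Jobs that are
# deleted some time after they complete.
# Assumption: the TTL is set through spec.ttlSecondsAfterFinished.
# Expects on stdin the output of:
#   kubectl -n openstack get jobs \
#     -o custom-columns=NAME:.metadata.name,TTL:.spec.ttlSecondsAfterFinished
# kubectl prints "<none>" in the TTL column when the field is unset.
jobs_with_ttl() {
    # Skip the header row, keep rows whose TTL column holds a value.
    awk 'NR > 1 && $2 != "<none>" { print $1 }'
}
```

Any Job listed by this filter can disappear before a rebooted node's init containers look for it.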
Severity
--------
Major: System endurance was jeopardized
Steps to Reproduce
------------------
- Apply stx-openstack
- Reboot a node
Expected Behavior
-----------------
All pods should be Running
Actual Behavior
---------------
Some pods are locked in the Init phase
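A quick way to spot the affected pods is to filter the pod listing on the STATUS column. The sketch below assumes the default column layout of `kubectl get pods` (NAME, READY, STATUS, ...); the function name `pods_in_init` is hypothetical.

```shell
# Sketch: print the names of pods whose STATUS starts with "Init:".
# Expects on stdin the output of:
#   kubectl -n openstack get pods
# Assumption: default columns, with STATUS as the third field.
pods_in_init() {
    # Skip the header row, match statuses such as "Init:0/1".
    awk 'NR > 1 && $3 ~ /^Init:/ { print $1 }'
}
```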
Reproducibility
---------------
Reproducible
System Configuration
--------------------
Found in AIO-SX (virtual) and AIO-DX (physical) deployments.
Timestamp/Logs
--------------
$ kubectl -n openstack logs -f pod/cinder-
Entrypoint WARNING: 2024/07/30 13:09:43 entrypoint.go:72: Resolving dependency Job cinder-db-sync in namespace openstack failed: jobs.batch "cinder-db-sync" not found .
$ kubectl -n openstack logs -f pod/cinder-
Entrypoint WARNING: 2024/07/30 13:10:17 entrypoint.go:72: Resolving dependency Job cinder-db-sync in namespace openstack failed: jobs.batch "cinder-db-sync" not found .
Entrypoint WARNING: 2024/07/30 13:10:17 entrypoint.go:72: Resolving dependency Job cinder-ks-user in namespace openstack failed: jobs.batch "cinder-ks-user" not found
Entrypoint WARNING: 2024/07/30 13:10:17 entrypoint.go:72: Resolving dependency Job cinder-ks-endpoints in namespace openstack failed: jobs.batch "cinder-
$ kubectl -n openstack logs -f pod/heat-
$ kubectl -n openstack logs -f pod/nova-
Entrypoint WARNING: 2024/07/29 23:41:11 entrypoint.go:72: Resolving dependency Job nova-db-sync in namespace openstack failed: jobs.batch "nova-db-sync" not found .
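The missing Jobs can be summarized from logs like the ones above with a small filter. This is only a sketch: the pattern is taken from the warning lines shown here, and the function name `missing_jobs` is hypothetical.

```shell
# Sketch: print the unique Job names that the entrypoint reported as
# "not found". Reads log text on stdin, for example the output of
# "kubectl -n openstack logs" for an init container.
missing_jobs() {
    # grep -o emits every match, so several warnings fused on one
    # line are still picked up individually.
    grep -o 'jobs\.batch "[^"]*" not found' \
        | sed 's/^jobs\.batch "\([^"]*\)" not found$/\1/' \
        | sort -u
}
```

Warnings that are cut off before the closing quote do not match the pattern and are skipped.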
Alarms
------
None
Test Activity
-------------
Developer Testing
Workaround
----------
Remove and re-apply the application.
Changed in starlingx: | |
assignee: | nobody → Daniel Marques Caires (daniel-caires) |
importance: | Undecided → High |
tags: | added: stx.distro.openstack |
Changed in starlingx: | |
status: | New → Confirmed |
Changed in starlingx: | |
assignee: | Daniel Marques Caires (daniel-caires) → nobody |
assignee: | nobody → Daniel Marques Caires (dcaires) |
Changed in starlingx: | |
status: | In Progress → Fix Released |
tags: | added: stx.10.0 |
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/openstack-armada-app/+/925972