tridentctl commands fail with "could not find a Trident pod in the trident namespace"
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | New | Undecided | Unassigned |
Bug Description
Brief Description
-----------------
sysadmin@
Error: could not find a Trident pod in the trident namespace. You may need to use the -n option to specify the correct namespace
Severity
--------
Critical: System/Feature is not usable due to the defect
Expected Behavior
------------------
NetApp Trident drivers should be running version 21.04.1, and I should be able to view the installed version with `tridentctl version --namespace trident`.
Actual Behavior
----------------
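The above command fails. The pod listing shows that some Trident pods are not running: the controller pod is stuck in Pending because it cannot be scheduled, and two of the three trident-csi node pods report only 1/2 containers ready.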
Reproducibility
---------------
100% reproducible
System Configuration
-------
2 controllers, 1 worker
Timestamp/Logs
--------------
[sysadmin@
Error: could not find a Trident pod in the trident namespace. You may need to use the -n option to specify the correct namespace
[sysadmin@
NAME READY STATUS RESTARTS AGE
trident-
trident-csi-dbwpz 2/2 Running 2 30h
trident-csi-qq4ft 1/2 Running 2 30h
trident-csi-tvxwh 1/2 Running 0 30h
[sysadmin@
Name: trident-
Namespace: trident
Priority: 0
Node: <none>
Labels: app=controller.
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/
Containers:
trident-main:
Image: registry.
Ports: 8678/TCP, 8001/TCP
Host Ports: 0/TCP, 0/TCP
Command:
/
Args:
-
--k8s_pod
--https_rest
-
-
-
-
-
-
--port=8677
--metrics
-
Liveness: exec [tridentctl -s 127.0.0.1:8677 version] delay=120s timeout=90s period=120s #success=1 #failure=2
Environment:
KUBE_
CSI_ENDPOINT: unix://
TRIDENT_
Mounts:
/certs from certs (ro)
/plugin from socket-dir (rw)
/
csi-provisioner:
Image: registry.
Port: <none>
Host Port: <none>
Args:
--v=2
-
-
-
-
Environment:
ADDRESS: /var/lib/
Mounts:
/
/
csi-attacher:
Image: registry.
Port: <none>
Host Port: <none>
Args:
--v=2
--timeout=60s
-
-
Environment:
ADDRESS: /var/lib/
Mounts:
/
/
csi-resizer:
Image: registry.
Port: <none>
Host Port: <none>
Args:
--v=2
-
-
Environment:
ADDRESS: /var/lib/
Mounts:
/
/
csi-snapshotter:
Image: registry.
Port: <none>
Host Port: <none>
Args:
--v=2
-
-
Environment:
ADDRESS: /var/lib/
Mounts:
/
/
Conditions:
Type Status
PodScheduled False
Volumes:
socket-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
certs:
Type: Secret (a volume populated by a Secret)
SecretName: trident-csi
Optional: false
asup-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: 1Gi
kube-
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpira
ConfigMapName: kube-root-ca.crt
ConfigMapOp
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: kubernetes.
Tolerations: node.kubernetes
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 120m default-scheduler 0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) had taint {node-role.
Warning FailedScheduling 120m default-scheduler 0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) had taint {node-role.
Warning FailedScheduling 100m default-scheduler 0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 2 node(s) had taint {node-role.
Warning FailedScheduling 87m default-scheduler 0/3 nodes are available: 1 node(s) had taint {services: disabled}, that the pod didn't tolerate, 2 node(s) had taint {node-role.
Warning FailedScheduling 64m default-scheduler 0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 2 node(s) had taint {node-role.
Warning FailedScheduling 63m default-scheduler 0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 2 node(s) had taint {node-role.
Warning FailedScheduling 58m default-scheduler 0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 2 node(s) had taint {node-role.
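
The pod listing above can be filtered to show only the pods that need attention. The following is a minimal sketch, not part of the original report: the function name `not_ready_pods` is hypothetical, and it assumes the default `kubectl get pods` column layout (NAME READY STATUS RESTARTS AGE) with a header line.

```shell
# Hypothetical helper (not from the original report).
# Reads `kubectl get pods` output on stdin and prints every pod whose
# READY count is short (e.g. 1/2) or whose STATUS is not "Running".
# Assumes the default columns and a header row, which is skipped.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")   # READY column looks like "1/2"
    if (ready[1] != ready[2] || $3 != "Running")
      print $1, $2, $3
  }'
}

# Usage (requires cluster access):
#   kubectl get pods -n trident | not_ready_pods
```

Against the listing above this would report trident-csi-qq4ft and trident-csi-tvxwh (both 1/2) along with the Pending controller pod. Since the scheduler events cite taints and node selectors, comparing the output of `kubectl get nodes --show-labels` and the Taints field of `kubectl describe nodes` with the pod's Node-Selectors and Tolerations is a reasonable next step.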
Workaround
----------
$ cat <<EOF > ~/trident_patch.yml
spec:
  template:
    spec:
      tolerations:
      - key: "node-role.
        operator: "Exists"
        effect: "NoSchedule"
      - key: "node-role.
        operator: "Exists"
        effect: "NoSchedule"
EOF
$ kubectl patch deployment trident-csi -n trident --patch "$(cat ~/trident_
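
After applying the patch, it may be worth confirming that the controller pod actually gets scheduled. The sketch below is an assumption-laden illustration rather than part of the reported workaround: the function name `verify_trident` and the `KUBECTL` override variable are hypothetical, while the deployment name `trident-csi` and the namespace `trident` are taken from the patch command above.

```shell
# Hypothetical wrapper (not from the original report).
# Waits for the patched trident-csi deployment to roll out, then lists
# the pods together with the nodes they were scheduled on.
# KUBECTL may be overridden, e.g. KUBECTL=echo for a dry run.
verify_trident() {
  ns="${1:-trident}"
  "${KUBECTL:-kubectl}" rollout status deployment/trident-csi -n "$ns" --timeout=120s &&
    "${KUBECTL:-kubectl}" get pods -n "$ns" -o wide
}

# Usage (requires cluster access):
#   verify_trident            # uses the "trident" namespace
#   verify_trident other-ns   # or pass a namespace explicitly
```

If the rollout completes, rerunning the version command from the Expected Behavior section should no longer report that no Trident pod could be found.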