A Calico pod is stuck waiting during a k8s deployment. The logs show
-----------------------------------------------------------
2024-02-09 08:35:22 ERROR unit.calico/4.juju-log server.go:325 b'resource does not exist: Node(ip-172-31-37-35) with error: <nil>\n'
2024-02-09 08:35:22 ERROR unit.calico/4.juju-log server.go:325 b'null\n'
2024-02-09 08:35:22 ERROR unit.calico/4.juju-log server.go:325 Failed to configure node.
Traceback (most recent call last):
File "./src/charm.py", line 298, in _configure_node
node = self._calicoctl_get("node", node_name)
File "./src/charm.py", line 640, in _calicoctl_get
output = self.calicoctl(*args)
File "./src/charm.py", line 632, in calicoctl
return subprocess.check_output(cmd, env=env, stderr=subprocess.PIPE, timeout=timeout)
File "/usr/lib/python3.8/subprocess.py", line 415, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/opt/calicoctl/calicoctl', 'get', '-o', 'yaml', '--export', 'node', 'ip-172-31-37-35']' returned non-zero exit status 1.
2024-02-09 08:35:22 ERROR unit.calico/4.juju-log server.go:325 Failed to configure Calico, will retry.
Traceback (most recent call last):
File "./src/charm.py", line 174, in _install_or_upgrade
self._configure_calico()
File "./src/charm.py", line 125, in _configure_calico
self._configure_node()
File "./src/charm.py", line 305, in _configure_node
raise e
File "./src/charm.py", line 298, in _configure_node
node = self._calicoctl_get("node", node_name)
File "./src/charm.py", line 640, in _calicoctl_get
output = self.calicoctl(*args)
File "./src/charm.py", line 632, in calicoctl
return subprocess.check_output(cmd, env=env, stderr=subprocess.PIPE, timeout=timeout)
File "/usr/lib/python3.8/subprocess.py", line 415, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/opt/calicoctl/calicoctl', 'get', '-o', 'yaml', '--export', 'node', 'ip-172-31-37-35']' returned non-zero exit status 1.
---------------------------------------------------------------------------------------------
and in the cloud manager logs we see
I0209 04:26:41.428228 1 node_controller.go:484] Successfully initialized node ip-172-31-42-39.ec2.internal with cloud provider
I0209 04:26:41.428263 1 node_controller.go:415] Initializing node ip-172-31-37-35 with cloud provider
I0209 04:26:41.428621 1 event.go:294] "Event occurred" object="ip-172-31-42-39.ec2.internal" fieldPath="" kind="Node" apiVersion="v1" type="Normal" reason="Synced" message="Node synced successfully"
E0209 04:26:41.519423 1 node_controller.go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing
I0209 04:26:41.529643 1 node_controller.go:415] Initializing node ip-172-31-37-35 with cloud provider
E0209 04:26:41.619732 1 node_controller.go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing
I0209 04:26:41.639955 1 node_controller.go:415] Initializing node ip-172-31-37-35 with cloud provider
E0209 04:26:41.731775 1 node_controller.go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing
I0209 04:26:41.772253 1 node_controller.go:415] Initializing node ip-172-31-37-35 with cloud provider
E0209 04:26:41.857045 1 node_controller.go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing
I0209 04:26:41.937298 1 node_controller.go:415] Initializing node ip-172-31-37-35 with cloud provider
E0209 04:26:42.018322 1 node_controller.go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing
I0209 04:26:42.178671 1 node_controller.go:415] Initializing node ip-172-31-37-35 with cloud provider
E0209 04:26:42.269708 1 node_controller.go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing
I0209 04:26:42.590273 1 node_controller.go:415] Initializing node ip-172-31-37-35 with cloud provider
E0209 04:26:42.677117 1 node_controller.go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing
I0209 04:26:43.317318 1 node_controller.go:415] Initializing node ip-172-31-37-35 with cloud provider
E0209 04:26:43.467989 1 node_controller.go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing
I0209 04:26:44.748498 1 node_controller.go:415] Initializing node ip-172-31-37-35 with cloud provider
E0209 04:26:44.840428 1 node_controller.go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing
I0209 04:26:45.070342 1 node_lifecycle_controller.go:164] deleting node since it is no longer present in cloud provider: ip-172-31-37-35
I0209 04:26:45.070488 1 event.go:294] "Event occurred" object="ip-172-31-37-35" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="DeletingNode" message="Deleting node ip-172-31-37-35 because it does not exist in the cloud provider"
---------------------------------------------------------------------------------------------------
It seems like the instance was deleted by no new pod was provisioned and the old ip kept on being reused;
A Calico pod is stuck waiting during a k8s deployment. The logs show
------- ------- ------- ------- ------- ------- ------- ------- --- 4.juju- log server.go:325 b'resource does not exist: Node(ip- 172-31- 37-35) with error: <nil>\n' 4.juju- log server.go:325 b'null\n' 4.juju- log server.go:325 Failed to configure node. _get("node" , node_name) *args) check_output( cmd, env=env, stderr= subprocess. PIPE, timeout=timeout) python3. 8/subprocess. py", line 415, in check_output python3. 8/subprocess. py", line 516, in run ror(retcode, process.args, CalledProcessEr ror: Command '['/opt/ calicoctl/ calicoctl' , 'get', '-o', 'yaml', '--export', 'node', 'ip-172-31-37-35']' returned non-zero exit status 1. 4.juju- log server.go:325 Failed to configure Calico, will retry. _configure_ calico( ) _configure_ node() _get("node" , node_name) *args) check_output( cmd, env=env, stderr= subprocess. PIPE, timeout=timeout) python3. 8/subprocess. py", line 415, in check_output python3. 8/subprocess. py", line 516, in run ror(retcode, process.args, CalledProcessEr ror: Command '['/opt/ calicoctl/ calicoctl' , 'get', '-o', 'yaml', '--export', 'node', 'ip-172-31-37-35']' returned non-zero exit status 1.
2024-02-09 08:35:22 ERROR unit.calico/
2024-02-09 08:35:22 ERROR unit.calico/
2024-02-09 08:35:22 ERROR unit.calico/
Traceback (most recent call last):
File "./src/charm.py", line 298, in _configure_node
node = self._calicoctl
File "./src/charm.py", line 640, in _calicoctl_get
output = self.calicoctl(
File "./src/charm.py", line 632, in calicoctl
return subprocess.
File "/usr/lib/
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/
raise CalledProcessEr
subprocess.
2024-02-09 08:35:22 ERROR unit.calico/
Traceback (most recent call last):
File "./src/charm.py", line 174, in _install_or_upgrade
self.
File "./src/charm.py", line 125, in _configure_calico
self.
File "./src/charm.py", line 305, in _configure_node
raise e
File "./src/charm.py", line 298, in _configure_node
node = self._calicoctl
File "./src/charm.py", line 640, in _calicoctl_get
output = self.calicoctl(
File "./src/charm.py", line 632, in calicoctl
return subprocess.
File "/usr/lib/
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/
raise CalledProcessEr
subprocess.
------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- --
and in the cloud manager logs we see
I0209 04:26:41.428228 1 node_controller .go:484] Successfully initialized node ip-172- 31-42-39. ec2.internal with cloud provider .go:415] Initializing node ip-172-31-37-35 with cloud provider "ip-172- 31-42-39. ec2.internal" fieldPath="" kind="Node" apiVersion="v1" type="Normal" reason="Synced" message="Node synced successfully" .go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing .go:415] Initializing node ip-172-31-37-35 with cloud provider .go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing .go:415] Initializing node ip-172-31-37-35 with cloud provider .go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing .go:415] Initializing node ip-172-31-37-35 with cloud provider .go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing .go:415] Initializing node ip-172-31-37-35 with cloud provider .go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing .go:415] Initializing node ip-172-31-37-35 with cloud provider .go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing .go:415] Initializing node ip-172-31-37-35 with cloud provider .go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing .go:415] Initializing node ip-172-31-37-35 with cloud provider .go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing .go:415] Initializing node ip-172-31-37-35 with cloud provider .go:229] error syncing 'ip-172-31-37-35': failed to get provider ID for node ip-172-31-37-35 at cloudprovider: failed to get instance ID from cloud provider: instance not found, requeuing controller. go:164] deleting node since it is no longer present in cloud provider: ip-172-31-37-35 "ip-172- 31-37-35" fieldPath="" kind="Node" apiVersion="" type="Normal" reason= "DeletingNode" message="Deleting node ip-172-31-37-35 because it does not exist in the cloud provider"
I0209 04:26:41.428263 1 node_controller
I0209 04:26:41.428621 1 event.go:294] "Event occurred" object=
E0209 04:26:41.519423 1 node_controller
I0209 04:26:41.529643 1 node_controller
E0209 04:26:41.619732 1 node_controller
I0209 04:26:41.639955 1 node_controller
E0209 04:26:41.731775 1 node_controller
I0209 04:26:41.772253 1 node_controller
E0209 04:26:41.857045 1 node_controller
I0209 04:26:41.937298 1 node_controller
E0209 04:26:42.018322 1 node_controller
I0209 04:26:42.178671 1 node_controller
E0209 04:26:42.269708 1 node_controller
I0209 04:26:42.590273 1 node_controller
E0209 04:26:42.677117 1 node_controller
I0209 04:26:43.317318 1 node_controller
E0209 04:26:43.467989 1 node_controller
I0209 04:26:44.748498 1 node_controller
E0209 04:26:44.840428 1 node_controller
I0209 04:26:45.070342 1 node_lifecycle_
I0209 04:26:45.070488 1 event.go:294] "Event occurred" object=
------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- -
It seems like the instance was deleted by no new pod was provisioned and the old ip kept on being reused;
Crashdumps and more information can be found here: https:/ /oil-jenkins. canonical. com/artifacts/ f987f8e9- 1490-4812- aed3-07d3ba4623 d1/index. html