One of the calico instances stuck in "Configuring Calico"

Bug #2064145 reported by Jeffrey Chang
This bug affects 2 people
Affects       Status   Importance  Assigned to  Milestone
Calico Charm  Triaged  Undecided   Unassigned

Bug Description

In this SQA run (https://solutions.qa.canonical.com/testruns/18cd9a82-ba5a-4938-98a5-2be7e8d6e5a5),
when deploying Charmed Kubernetes on bare metal with the Calico charm rev 105 on focal,
one of the calico instances got stuck in "Configuring Calico",
and kubernetes-control-plane is left waiting for the calico service.

Error logs
2024-04-26 16:00:11 INFO unit.calico/1.juju-log server.go:325 Configured Calico IP pool.
2024-04-26 16:00:11 ERROR unit.calico/1.juju-log server.go:325 b'resource does not exist: Node(duosion) with error: <nil>\n'
2024-04-26 16:00:11 ERROR unit.calico/1.juju-log server.go:325 b'null\n'
2024-04-26 16:00:11 ERROR unit.calico/1.juju-log server.go:325 Failed to configure node.
Traceback (most recent call last):
  File "./src/charm.py", line 298, in _configure_node
    node = self._calicoctl_get("node", node_name)
  File "./src/charm.py", line 640, in _calicoctl_get
    output = self.calicoctl(*args)
  File "./src/charm.py", line 632, in calicoctl
    return subprocess.check_output(cmd, env=env, stderr=subprocess.PIPE, timeout=timeout)
  File "/usr/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/opt/calicoctl/calicoctl', 'get', '-o', 'yaml', '--export', 'node', 'duosion']' returned non-zero exit status 1.
2024-04-26 16:00:11 ERROR unit.calico/1.juju-log server.go:325 Failed to configure Calico, will retry.
Traceback (most recent call last):
  File "./src/charm.py", line 174, in _install_or_upgrade
    self._configure_calico()
  File "./src/charm.py", line 125, in _configure_calico
    self._configure_node()
  File "./src/charm.py", line 305, in _configure_node
    raise e
  File "./src/charm.py", line 298, in _configure_node
    node = self._calicoctl_get("node", node_name)
  File "./src/charm.py", line 640, in _calicoctl_get
    output = self.calicoctl(*args)
  File "./src/charm.py", line 632, in calicoctl
    return subprocess.check_output(cmd, env=env, stderr=subprocess.PIPE, timeout=timeout)
  File "/usr/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/opt/calicoctl/calicoctl', 'get', '-o', 'yaml', '--export', 'node', 'duosion']' returned non-zero exit status 1.
2024-04-26 16:00:11 DEBUG unit.calico/1.juju-log server.go:325 Deferring <InstallEvent via CalicoCharm/on/install[1]>.
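
For anyone trying to reproduce this outside the charm: the traceback shows the charm calling subprocess.check_output with stderr=subprocess.PIPE, and in that configuration the raised CalledProcessError carries both output streams. The two byte strings logged just before "Failed to configure node." look like those streams, though the report does not say which is which. The following is a minimal sketch, not the charm's code; run_and_report is a hypothetical helper, and a stand-in Python command is used in place of calicoctl.

```python
import subprocess
import sys


def run_and_report(cmd, env=None, timeout=60):
    """Run cmd and return (returncode, stdout, stderr) instead of raising.

    Hypothetical helper: it mirrors the check_output call seen in the
    traceback, but it is not taken from the charm's source.
    """
    try:
        out = subprocess.check_output(
            cmd, env=env, stderr=subprocess.PIPE, timeout=timeout
        )
        # check_output only returns stdout on success; stderr is discarded.
        return 0, out, b""
    except subprocess.CalledProcessError as e:
        # Because stderr was piped, the exception exposes both streams.
        return e.returncode, e.stdout, e.stderr


# Stand-in for the failing command: prints "null", writes an error, exits 1.
fake_cmd = [
    sys.executable,
    "-c",
    "import sys; print('null'); "
    "sys.stderr.write('resource does not exist\\n'); sys.exit(1)",
]
code, out, err = run_and_report(fake_cmd)
```

Swapping fake_cmd for the calicoctl command line from the traceback, run on the affected unit, would show the exit status and both streams separately.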

Michael Fischer (michaelandrewfischer) wrote:

I am having this issue with revision 105 on jammy. All of the calico nodes are stuck in an infinite loop, failing to configure. If I ssh into any of the calico nodes and run the command manually (/opt/calicoctl/calicoctl get -o yaml --export node novel-bird), it does not return null, nor does it exit with a non-zero status.

command:
/opt/calicoctl/calicoctl get -o yaml --export node novel-bird

output:
apiVersion: projectcalico.org/v3
kind: Node
metadata:
  annotations:
    projectcalico.org/kube-labels: '{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","juju-application":"kubernetes-control-plane","juju-charm":"kubernetes-control-plane","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"novel-bird","kubernetes.io/os":"linux","node-role.kubernetes.io/control-plane":""}'
  creationTimestamp: null
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    juju-application: kubernetes-control-plane
    juju-charm: kubernetes-control-plane
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: novel-bird
    kubernetes.io/os: linux
    node-role.kubernetes.io/control-plane: ""
  name: novel-bird
spec:
  addresses:
  - address: 192.168.2.38
    type: InternalIP
  orchRefs:
  - nodeName: novel-bird
    orchestrator: k8s
status: {}
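
The same command succeeds by hand but fails from the charm, so one thing that may be worth comparing is the environment. The traceback shows the charm passing its own env= to subprocess, and the contents of that environment are not shown in this report. The sketch below is only a diagnostic idea under that assumption: exit_status and compare_envs are hypothetical helpers, DEMO_DATASTORE is a made-up variable, and a stand-in Python command replaces calicoctl.

```python
import os
import subprocess
import sys


def exit_status(cmd, env=None, timeout=60):
    """Return the exit status of cmd when run with the given environment."""
    result = subprocess.run(cmd, env=env, capture_output=True, timeout=timeout)
    return result.returncode


def compare_envs(cmd, extra_env):
    """Run cmd with the inherited environment, then with extra_env layered on top.

    Returns (status_with_inherited_env, status_with_extra_env).
    """
    inherited_status = exit_status(cmd)
    merged = {**os.environ, **extra_env}
    overridden_status = exit_status(cmd, env=merged)
    return inherited_status, overridden_status


# Stand-in command: fails only when the made-up DEMO_DATASTORE variable is "other".
demo_cmd = [
    sys.executable,
    "-c",
    "import os, sys; "
    "sys.exit(1 if os.environ.get('DEMO_DATASTORE') == 'other' else 0)",
]
statuses = compare_envs(demo_cmd, {"DEMO_DATASTORE": "other"})
```

If the two statuses differ for the real calicoctl command once extra_env holds whatever the charm actually sets, the environment is a likely suspect; if they match, the cause is elsewhere.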

error logs:

unit-calico-3: 09:10:37 ERROR unit.calico/3.juju-log b'resource does not exist: Node(novel-bird) with error: <nil>\n'
unit-calico-3: 09:10:37 ERROR unit.calico/3.juju-log b'null\n'
unit-calico-3: 09:10:37 ERROR unit.calico/3.juju-log Failed to configure node.
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-calico-3/charm/./src/charm.py", line 298, in _configure_node
    node = self._calicoctl_get("node", node_name)
  File "/var/lib/juju/agents/unit-calico-3/charm/./src/charm.py", line 640, in _calicoctl_get
    output = self.calicoctl(*args)
  File "/var/lib/juju/agents/unit-calico-3/charm/./src/charm.py", line 632, in calicoctl
    return subprocess.check_output(cmd, env=env, stderr=subprocess.PIPE, timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 421, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/opt/calicoctl/calicoctl', 'get', '-o', 'yaml', '--export', 'node', 'novel-bird']' returned non-zero exit status 1.
unit-calico-3: 09:10:37 ERROR unit.calico/3.juju-log Failed to configure Calico, will retry.
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-calico-3/charm/./src/charm.py", line 174, in _install_or_upgrade
    self._configure_calico()
  File "/var/lib/juju/agents/unit-calico-3/charm/./src/charm.py", line 125, in _configure_calico
    self._configure_node()
  File "/var/lib/juju/agents/unit-calico-3/charm/./src/charm.py", line 305, in _configure_node
    raise e
  File "/var/lib/juju/agents/unit-calico-3/charm/./src/charm.py", line 298, in _configure_node
    node = self._calicoctl_get("node", node_name)
  File "/var/lib/juju/agents/unit-calico-3/charm/./src/charm.py", line 640, in _calicoctl_get
    output = s...


Adam Dyess (addyess)
Changed in charm-calico:
milestone: none → 1.30+ck1
status: New → Triaged