Charm Calico Enterprise

Tigera units do not become active after the first installation of the bundle

Bug #2053143 reported by Ebrar Leblebici on 2024-02-14

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Charm Calico Enterprise	Fix Released	High	Kevin W Monroe	Charm Calico Enterprise 1.29+ck1

Bug Description

Hi,

After the deployment of the bundle, most of the tigera units become stuck in "waiting" state and some of them become stuck in "error" state.

For the "waiting" units we have the message:

tigera-operator POD not found

And for the "error" units we have the message:

hook-failed: "calico-enterprise-relation-changed"

We have the error here for the unit in error state: https://pastebin.ubuntu.com/p/xrYrTXK3JV/

It seems to be failing when it tries to label the nodes because the apiserver is not running yet.

After running the "juju resolved" command for the units in "error" state, all the tigera units become "active" and "idle" within approx. 10-15 minutes. So, there may be a race condition?
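
As a stopgap for automated builds on affected revisions, the manual "juju resolved" step described above could be scripted. This is a minimal sketch, not part of the charm. It assumes `juju status --format json` reports each unit's state under `workload-status.current` and lists subordinate units under their principal's `subordinates` key. The function names are illustrative.

```python
import json
import subprocess


def units_in_error(status: dict) -> list[str]:
    """Return the names of all units whose workload status is 'error'.

    Subordinate units are nested under their principal unit in the
    status document, so both levels are checked.
    """
    failed = []
    for app in status.get("applications", {}).values():
        for unit_name, unit in app.get("units", {}).items():
            candidates = {unit_name: unit, **unit.get("subordinates", {})}
            for name, data in candidates.items():
                if data.get("workload-status", {}).get("current") == "error":
                    failed.append(name)
    return sorted(failed)


def resolve_failed_units() -> list[str]:
    """Run 'juju resolved' on every unit currently in error state."""
    raw = subprocess.run(
        ["juju", "status", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    failed = units_in_error(json.loads(raw))
    for name in failed:
        subprocess.run(["juju", "resolved", name], check=True)
    return failed
```

Calling `resolve_failed_units()` periodically from the build pipeline until it returns an empty list would mirror the manual workaround. It does not address the underlying race.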

Regards,
Ebrar

Adrian Flynn (flynna) wrote on 2024-02-14:

This bug requires manual user intervention and is a blocker to deploying Charmed K8S in nightly automated cluster builds.

Kevin W Monroe (kwmonroe) wrote on 2024-03-07:

Thanks for the report. Will you share the bundle used to hit this? Or otherwise let me know the steps to reproduce?

Changed in charm-calico-enterprise:
assignee:	nobody → Kevin W Monroe (kwmonroe)
importance:	Undecided → High
milestone:	none → 1.29+ck1
status:	New → Incomplete

Adrian Flynn (flynna) wrote on 2024-03-08:

Unable to share the bundle on a publicly accessible site.

Kevin W Monroe (kwmonroe) on 2024-03-11

Changed in charm-calico-enterprise:
status:	Incomplete → In Progress

Kevin W Monroe (kwmonroe) wrote on 2024-03-11:

Taking a closer look at the failure from the description, I can tell the charm is going through lifecycle hooks faster than it should; e.g. attempting to use kubectl before the api server is up.
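
To illustrate the ordering problem, a hook could check that the API server answers before issuing any kubectl command. The sketch below is an assumption-laden illustration, not the code from the PR: `api_server_ready` and `wait_until` are hypothetical names, and the probe relies on `kubectl get --raw /readyz` returning a zero exit code once the API server is ready.

```python
import subprocess
import time
from typing import Callable


def api_server_ready() -> bool:
    """Probe the API server readiness endpoint through kubectl."""
    try:
        result = subprocess.run(
            ["kubectl", "get", "--raw", "/readyz"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # kubectl is missing or the request hung: treat as not ready.
        return False
    return result.returncode == 0


def wait_until(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to 'attempts' times, sleeping 'delay' seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A caller would run `wait_until(api_server_ready)` and only label nodes when it returns True. Inside a charm hook, returning early and retrying on a later event is usually preferable to blocking for long periods.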

This charm has significant config requirements, and if they're not all provided at deployment (and if the hook timing isn't quite right), we'll end up in the described failed state.

I made adjustments to the charm so that it only proceeds as far as the provided config allows. IOW, if you haven't provided image registry credentials, it won't try to pull images. If you haven't provided CIDR ranges, it won't attempt to configure bgp peering data.
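
The "only proceed as far as the provided config allows" idea can be sketched roughly as follows. This is not the code from the PR. The step names and config keys (`registry_credentials`, `cidr_ranges`) are hypothetical placeholders chosen to match the two examples above.

```python
# Hypothetical mapping of each setup step to the config keys it needs.
REQUIREMENTS = {
    "pull_images": ("registry_credentials",),
    "configure_bgp_peering": ("cidr_ranges",),
}


def plan_steps(config: dict) -> tuple[list[str], list[str]]:
    """Split the setup steps into those that can run now and the config still missing.

    Returns (runnable_steps, missing_keys). A step is runnable only when
    every key it requires has a non-empty value in 'config'.
    """
    runnable, missing = [], []
    for step, keys in REQUIREMENTS.items():
        absent = [key for key in keys if not config.get(key)]
        if absent:
            missing.extend(absent)
        else:
            runnable.append(step)
    return runnable, missing
```

With this shape, a unit that lacks some config can report what it is waiting on instead of failing a hook partway through.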

PR for review:

https://github.com/charmed-kubernetes/charm-calico-enterprise/pull/5

Adrian Flynn (flynna) wrote on 2024-03-16:

Over the last 24 hours I have had to deploy a charmed cluster 4 times before it would deploy successfully. This is really not a place we want to be when deploying Kubernetes clusters.

Is the PR above likely to fix things? Any ETA when the fix will be available?

I have updated support case 00380420.

Thanks

Regards

Adrian

Kevin W Monroe (kwmonroe) on 2024-03-24

Changed in charm-calico-enterprise:
status:	In Progress → Fix Committed

Kevin W Monroe (kwmonroe) wrote on 2024-03-24:

cherry picks to release_1.29:

https://github.com/charmed-kubernetes/charm-calico-enterprise/commit/e3c1316c21d5ca01a5f8615510d61b2190d9db74

https://github.com/charmed-kubernetes/charm-calico-enterprise/commit/5a696f268f97ec9ec22440a891d6b6445f50e3e4

Kevin W Monroe (kwmonroe) wrote on 2024-03-24:

overnight builds should get this landed in the 1.29/candidate channel, with promotion to 1.29/stable shortly after (pending ci for the whole 1.29+ck1 release).

Ebrar Leblebici (birru2) wrote on 2024-04-15:

Thank you Kevin for your effort on this one. But we need a backport to 1.28. Can you please help with that, too?

Kevin W Monroe (kwmonroe) on 2024-04-23

Changed in charm-calico-enterprise:
status:	Fix Committed → Fix Released
