Canonical Juju

Status Code 418 (I'm a teapot) thrown by the Pebble readiness check

Bug #2059105 reported by Bartlomiej Gmerek on 2024-03-26

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Canonical Juju	Expired	Undecided	Unassigned

Bug Description

Hello Team,

While working on integration tests for my projects (Charmed 5G), I've noticed that around 20%-25% of the runs fail because Pebble is not able to start.
The charm goes into an error state (i.e. hook failed: "start"), and when I look into the pod's logs it turns out that Pebble is a teapot:
[pebble] Check "readiness" failure 190 (threshold 3): received non-20x status code 418
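For readers unfamiliar with how a log line like the one above comes about, here is a minimal sketch of consecutive-failure counting for an HTTP health check. It is not Pebble's actual code: `CheckTracker` and `is_success` are hypothetical names, and treating any 2xx response as a pass is an assumption. The point it illustrates is that every non-success response increments the failure count, and the check counts as down once the count reaches the threshold.

```python
from dataclasses import dataclass


def is_success(status_code: int) -> bool:
    """Assumption: any 2xx response counts as a passing check."""
    return 200 <= status_code < 300


@dataclass
class CheckTracker:
    """Hypothetical tracker for consecutive health-check failures."""

    threshold: int = 3
    failures: int = 0

    def record(self, status_code: int) -> str:
        """Record one check result and return a log-style message."""
        if is_success(status_code):
            # A passing check resets the consecutive-failure count.
            self.failures = 0
            return "ok"
        self.failures += 1
        return (
            f"failure {self.failures} (threshold {self.threshold}): "
            f"received non-2xx status code {status_code}"
        )

    @property
    def down(self) -> bool:
        """The check is considered down once failures reach the threshold."""
        return self.failures >= self.threshold


tracker = CheckTracker(threshold=3)
for _ in range(190):
    message = tracker.record(418)
print(message)       # failure 190 (threshold 3): received non-2xx status code 418
print(tracker.down)  # True
```

Under these assumptions, a failure count of 190 against a threshold of 3 means the endpoint never answered with a success code for the whole period covered by the log.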

My environment is Juju 3.4 + MicroK8s 1.29-strict running on one of Canonical's self-hosted GitHub runners.

From my observations, the problem shows up more often when infrastructure performance becomes a bottleneck. Charmed 5G includes around 20 charms. When using Canonical's self-hosted runners, if I try to deploy it on the `large` runner, there's an almost 100% chance of failure. If I use `xlarge`, the failure rate goes down to maybe 10-15%.

It would be great if the status code 418 could be replaced with something meaningful.

Latest failed run is available at https://github.com/canonical/sdcore-tests/actions/runs/8434000175.
At the bottom of the page, there's a Juju crashdump and K8s logs available for your reference.

BR,
Bartek

Ben Hoyt (benhoyt) wrote on 2024-03-26:

It looks like this is coming from the Juju "caasprober" worker here: https://github.com/juju/juju/pull/12048/files#diff-17cd0462495cd82a91e96cdd4070e2e3a39e1e51db5d0d05e9d2df114657da64R103 ... it's not Pebble that's unable to start, but the Juju probe returning a not-good return value from the "suppler" (not sure what that is, haven't followed it through).

Thomas Miller added this code in the above PR, so he may be able to help here.

Harry Pidcock (hpidcock) wrote on 2024-03-26:

Are we able to see why the start hook failed? The unit is not considered ready until the start hook has completed successfully.
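To make the relationship between the two comments above concrete, here is a rough sketch of a readiness endpoint that answers 418 until the start hook has completed. This is not the Juju caasprober code: the function names and the three hook states are hypothetical, chosen only to illustrate why a failed start hook would surface as an endless stream of 418 responses.

```python
HTTP_OK = 200
HTTP_TEAPOT = 418  # "I'm a teapot", the code seen in the Pebble log

# Hypothetical start-hook states, for illustration only.
PENDING = "pending"
FAILED = "failed"
COMPLETED = "completed"


def unit_ready(start_hook_state: str) -> bool:
    """The unit counts as ready only after the start hook has succeeded."""
    return start_hook_state == COMPLETED


def readiness_status(start_hook_state: str) -> int:
    """Map the probe result to the HTTP status the checker would receive."""
    return HTTP_OK if unit_ready(start_hook_state) else HTTP_TEAPOT


for state in (PENDING, FAILED, COMPLETED):
    print(state, readiness_status(state))
# pending 418
# failed 418
# completed 200
```

If the real probe behaves like this sketch, the 418 is a symptom rather than the cause, and the log of the failed start hook is the place to look.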

Changed in juju:
status:	New → Incomplete

Launchpad Janitor (janitor) wrote on 2024-05-26:

[Expired for Canonical Juju because there has been no activity for 60 days.]

Changed in juju:
status:	Incomplete → Expired
