racy configuration of kubelet with recent images - eks nodes failing to taint / label on creation

Bug #2023284 reported by jan grant
This bug affects 3 people
Affects: cloud-images
Status: Fix Released
Importance: High
Assigned to: George Kraft
Milestone: (none)

Bug Description

Hi. I'm using EKS with a self-managed nodepool, set up via Terraform.

Over the last 72 hours or so, there has been a high incidence (but not 100%) of new instances registering themselves with k8s but not honouring the --register-with-taints or --node-labels flags.

These instances are booted with the following user-data:

[[[
#!/bin/bash
set -e
B64_CLUSTER_CA=
API_SERVER_URL=
/etc/eks/bootstrap.sh serverless-dev-europe-1 --container-runtime containerd --kubelet-extra-args '--register-with-taints=foo-node=true:NoSchedule --node-labels=foo-node=true' --b64-cluster-ca $B64_CLUSTER_CA --apiserver-endpoint $API_SERVER_URL
]]]

On inspection of a broken node, kubelet is running from snap with those command-line arguments.

My *suspicion* is that the user-data script is racing: perhaps kubelet comes up once without the appropriate arguments. That would cause the node to be registered; by the time the process is replaced, the node resource already exists in k8s and isn't updated. (Certainly, restarting the kubelet process doesn't address the issue.)

I see that https://bugs.launchpad.net/cloud-images/+bug/2012689 is mentioned as a recent fix; I'm wondering whether this is related.
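
For reference, this is how the missing taints/labels can be confirmed on a broken node (the node name below is just a placeholder):

[[[
# Placeholder node name; on an affected node the foo-node taint and label are absent.
kubectl get node ip-10-0-0-1.eu-west-1.compute.internal \
    -o jsonpath='{.spec.taints}{"\n"}{.metadata.labels}{"\n"}'
]]]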

Revision history for this message
jan grant (jangrant) wrote :

The problem here is two "snap set" commands, one after the other, in bootstrap.sh.

The first gives kubelet all the information to contact the API server and register itself, but not the extra args. This restarts kubelet and begins a race.

The second adds the extra-args and restarts kubelet.

--register-with-taints and --node-labels have no effect if they are applied after the node has registered.

If the first restart of kubelet gets as far as registering the node, those taints / labels will *never* apply.

The fix is to merge the two `snap set` commands into one.

A more principled fix would be to not start kubelet at all until it is fully configured.
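
For illustration, a merged call might look something like the sketch below. The taint and label keys match the workaround further down this thread; the endpoint/CA key names are placeholders, since the actual option names bootstrap.sh passes to `snap set` aren't shown in this report.

[[[
# Sketch only: merge everything into ONE "snap set" so kubelet restarts once,
# fully configured. The first two key names below are placeholders, not the
# real options bootstrap.sh uses.
snap set kubelet-eks \
    apiserver-endpoint="$API_SERVER_URL" \
    cluster-ca="$B64_CLUSTER_CA" \
    register-with-taints=foo-node=true:NoSchedule \
    node-labels=foo-node=true
]]]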

jan grant (jangrant)
summary: - amazon/ubuntu-eks/k8s_1.24/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20230607 - eks nodes failing to taint / label on creation
+ racy configuration of kubelet with recent images - eks nodes failing to taint / label on creation
Revision history for this message
jan grant (jangrant) wrote :

Our workaround is to preconfigure the critical flags in snap before launching the bootstrap script; the user-data now contains this:

[[[
snap set kubelet-eks register-with-taints=foo-node=true:NoSchedule
snap set kubelet-eks node-labels=foo-node=true
B64_CLUSTER_CA=
API_SERVER_URL=
/etc/eks/bootstrap.sh yet-another-cluster --container-runtime containerd --kubelet-extra-args '--register-with-taints=foo-node=true:NoSchedule --node-labels=foo-node=true' --b64-cluster-ca $B64_CLUSTER_CA --apiserver-endpoint $API_SERVER_URL
]]]

and this is reliably working around the race.

Revision history for this message
Robby Pocase (rpocase) wrote :

Hey Jan - thanks for filing this! The thorough notes are really appreciated. I agree with your assessment. This is an unforeseen consequence of [1]. I'll aim to work on a more permanent resolution ASAP. For a full fix, I believe we need to do two things.

* Update bootstrap.sh as mentioned (use a single `snap set` call; the original implementation assumed `snap set` was atomic and did not restart the service)
* Follow up with the MP owner to discuss how to adjust this so it doesn't introduce this inconsistency for users of older AMI deployments (e.g. fleets that refresh the snap to consume the other fixes without a full AMI upgrade)

[1] - https://git.launchpad.net/snap-kubelet/commit/?id=6c89724888bc5427bbc8828620de0d3c509a5884

Revision history for this message
Robby Pocase (rpocase) wrote :

@Nikita I've added you as a watcher here, as the first proposed fix may be sufficient to address your Packer concerns. I'll let you know when we have AMIs out with this fix (hopefully tomorrow, but that depends on whether I can fully cycle back to bootstrap.sh). If this does not address it, we should follow up with a new LP bug to focus our troubleshooting.

Revision history for this message
Badal (badaldavda8) wrote :

I get the following output when bootstrapping self-managed nodes:

/etc/eks/bootstrap.sh myeksABC123 --kubelet-extra-args --max-pods=29 --b64-cluster-ca ABC123.gr7.eu-west-1.eks.amazonaws.com --use-max-pods false
Using containerd as the container runtime
Aliasing EKS k8s snap commands
error: snap "kubelet-eks" has "auto-refresh" change in progress
Exited with error on line 349

I added a sleep of 30 seconds, but I still get the same error. This is with Ubuntu EKS AMI v1.26.4.

Output of `snap tasks --last=auto-refresh`:
Status Spawn Ready Summary
Done today at 19:26 UTC today at 19:26 UTC Ensure prerequisites for "snapd" are available
Done today at 19:26 UTC today at 19:26 UTC Download snap "snapd" (19361) from channel "latest/stable"
Done today at 19:26 UTC today at 19:26 UTC Fetch and check assertions for snap "snapd" (19361)
Done today at 19:26 UTC today at 19:26 UTC Mount snap "snapd" (19361)
Done today at 19:26 UTC today at 19:26 UTC Run pre-refresh hook of "snapd" snap if present
Done today at 19:26 UTC today at 19:26 UTC Stop snap "snapd" services
Done today at 19:26 UTC today at 19:26 UTC Remove aliases for snap "snapd"
Done today at 19:26 UTC today at 19:26 UTC Make current revision for snap "snapd" unavailable
Done today at 19:26 UTC today at 19:26 UTC Copy snap "snapd" data
Done today at 19:26 UTC today at 19:26 UTC Setup snap "snapd" (19361) security profiles
Done today at 19:26 UTC today at 19:26 UTC Make snap "snapd" (19361) available to the system
Done today at 19:26 UTC today at 19:26 UTC Automatically connect eligible plugs and slots of snap "snapd"
Done today at 19:26 UTC today at 19:26 UTC Set automatic aliases for snap "snapd"
Done today at 19:26 UTC today at 19:26 UTC Setup snap "snapd" aliases
Done today at 19:26 UTC today at 19:26 UTC Run post-refresh hook of "snapd" snap if present
Done today at 19:26 UTC today at 19:26 UTC Start snap "snapd" (19361) services
Done today at 19:26 UTC today at 19:26 UTC Clean up "snapd" (19361) install
Done today at 19:26 UTC today at 19:26 UTC Run health check of "snapd" snap
Done today at 19:26 UTC today at 19:26 UTC Ensure prerequisites for "core18" are available
Done today at 19:26 UTC today at 19:27 UTC Download snap "core18" (2785) from channel "latest/stable"
Done today at 19:26 UTC today at 19:27 UTC Fetch and check assertions for snap "core18" (2785)
Done today at 19:26 UTC today at 19:27 UTC Mount snap "core18" (2785)
Done today at 19:26 UTC today at 19:27 UTC Run pre-refresh hook of "core18" snap if present
Done today at 19:26 UTC today at 19:27 UTC Stop snap "core18" services
Done today at 19:26 UTC today at 19:27 UTC Remove aliases for snap "core18"
Done today at 19:26 UTC today at 19:27 UTC Make current revision for snap "core18" unavailable
Done today at 19:26 UTC today at 19:27 UTC Copy snap "core18" data
Done today at 19:26 UTC today at 19:27 UTC Setup snap "core18" (2785) security profiles
Done today at 19:26 UTC today at 19:27 UTC Make snap "core18" (2785) available to the system
Done today at 19:26 UTC today at 19:27 UT...
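
One possible mitigation for the "auto-refresh change in progress" error, untested here, is to wait for snapd to finish any in-flight changes before invoking bootstrap.sh. A rough sketch (it assumes the current column layout of `snap changes`):

[[[
#!/bin/bash
# Untested sketch: block until snapd reports no change in progress, so
# bootstrap.sh's "snap set kubelet-eks ..." isn't rejected mid auto-refresh.
# Relies on the column layout of `snap changes` output (ID, Status, ...).
while snap changes 2>/dev/null | grep -Eq '^[0-9]+ +(Do|Doing|Undo|Undoing|Wait)\b'; do
    sleep 10
done
# ...then run /etc/eks/bootstrap.sh with the usual arguments.
]]]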

Revision history for this message
Nikita Somikov (qwedcftyu) wrote :

@Robby, thanks!

Revision history for this message
Robby Pocase (rpocase) wrote :

George reverted some of the kubelet snap changes yesterday. New serials should roll out today with this fix included, and it is expected to resolve the problems cited in this bug report. For the time being, we are holding off on bootstrap.sh changes, since George beat us to the punch on one of the possible solution paths (thanks again!). I'll be monitoring pipelines and will try to notify on the latest serials as soon as they become available.

Changed in cloud-images:
assignee: nobody → George Kraft (cynerva)
importance: Undecided → High
status: New → Fix Committed
Revision history for this message
Robby Pocase (rpocase) wrote :

I just checked our latest serials and the new builds still haven't been produced for some reason. I'll touch base with the team early AM to make sure the build is triggered properly and new serials roll out.

Revision history for this message
Robby Pocase (rpocase) wrote :

Quick update - new serials are out for 1.23 (20230616), but we are seeing test failures on 1.24-1.26. These seem related to not using a kubelet version that matches EKS exactly. We are working on this and hope to have new builds out by Monday, but it is likely to slip to Tuesday because the request for new snap channels came in late on Friday and most of the people on the teams responsible for the snaps and the EKS image will be out Monday (US holiday).

Revision history for this message
Krishna Venkata (krishna-venkata) wrote :

@rpocase do you have an update on the 1.24 AMIs? When will they be available?

Revision history for this message
Robby Pocase (rpocase) wrote :

@krishna 1.24 SHOULD be out today. We've hit multiple (completely unrelated) issues getting these released, but I have dedicated eyes on the EKS pipelines today.

Revision history for this message
Thomas Bechtold (toabctl) wrote :

We have updated images for 1.23, 1.24, 1.25 and 1.26 (serial 20230623) which will hopefully help with this issue. Please try those.

Changed in cloud-images:
status: Fix Committed → Fix Released