cluster fails to deploy greater than k8s 1.23

Bug #2063473 reported by Vivian Rook
This bug affects 1 person
Affects: Magnum
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

On Bobcat, I cannot deploy k8s clusters on versions greater than 1.23.

If I change only the kube_tag variable in a working template from v1.23.15-rancher1-linux-amd64 to v1.26.8-rancher1 (or to a 1.24 or 1.25 version), the template builds and the cluster starts to deploy, but only the master node comes up.

The heat log (/var/log/heat-config/heat-config-script/15fceb87-7fdf-40db-aa7e-0c35bcae1b6b-paws-dev-126-1-lurpkkeiwwva-kube_masters-jfbcxmblt4n4-0-o6nsp2f3habo-master_config-3im4cpl67mdb.log) gives:
```
Trying to label master node with node-role.kubernetes.io/master=""
++ kubectl get --raw=/healthz
+ '[' ok = ok ']'
+ kubectl patch node paws-dev-126-1-lurpkkeiwwva-master-0 --patch '{"metadata": {"labels": {"node-role.kubernetes.io/master": ""}}}'
Error from server (NotFound): nodes "paws-dev-126-1-lurpkkeiwwva-master-0" not found
+ echo 'Trying to label master node with node-role.kubernetes.io/master=""'
+ sleep 5s
```
paws-dev-126-1-lurpkkeiwwva-master-0 does appear in the output of `openstack server list`.

I'm deploying with OpenTofu (tofu). My resources look like the following:

```
resource "openstack_containerinfra_cluster_v1" "k8s_126_1" {
  name = "paws${var.name[var.datacenter]}-126-1"
  cluster_template_id = resource.openstack_containerinfra_clustertemplate_v1.template_126_1.id
  master_count = 1
  node_count = var.workers[var.datacenter]
}

resource "openstack_containerinfra_clustertemplate_v1" "template_126_1" {
  name = "paws${var.name[var.datacenter]}-126-1"
  coe = "kubernetes"
  dns_nameserver = "8.8.8.8"
  docker_storage_driver = "overlay2"
  docker_volume_size = var.volume_size[var.datacenter]
  external_network_id = var.external_network_id[var.datacenter]
  fixed_subnet = var.fixed_subnet[var.datacenter]
  fixed_network = var.fixed_network[var.datacenter]
  flavor = var.worker_flavor[var.datacenter]
  floating_ip_enabled = "false"
  image = "Fedora-CoreOS-38-2"
  keypair_id = "pawsdev"
  master_flavor = var.control_flavor[var.datacenter]
  network_driver = "flannel"

  labels = {
    kube_tag = "v1.26.15-rancher1-linux-amd64"
    hyperkube_prefix = "docker.io/rancher/"
    cloud_provider_enabled = "true"
  }
}
```
My Fedora-CoreOS-38-2 image is a copy of fedora-coreos-38.20230806.3.0.

Vivian Rook (vrook) wrote :

The following labels were needed:
```
    container_runtime = "containerd"
    containerd_version = "1.6.20"
    containerd_tarball_sha256 = "1d86b534c7bba51b78a7eeb1b67dd2ac6c0edeb01c034cc5f590d5ccd824b416"
```
I found them in the documentation. This bug can be closed.
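
For anyone hitting the same failure, here is a sketch of how the `labels` block of the cluster template above might look with the containerd labels merged in. It only combines values already given in this report. A likely explanation, stated here as an assumption rather than something confirmed in this report, is that Kubernetes 1.24 removed dockershim, so a template left on the Docker runtime cannot register its nodes. Whether this containerd version and checksum suit other images or Kubernetes releases should be checked against the Magnum documentation.

```
  labels = {
    kube_tag = "v1.26.15-rancher1-linux-amd64"
    hyperkube_prefix = "docker.io/rancher/"
    cloud_provider_enabled = "true"

    # Labels from the follow-up comment. Without them, only the master
    # node deployed and labeling it failed with "NotFound".
    container_runtime = "containerd"
    containerd_version = "1.6.20"
    containerd_tarball_sha256 = "1d86b534c7bba51b78a7eeb1b67dd2ac6c0edeb01c034cc5f590d5ccd824b416"
  }
```

The rest of the `openstack_containerinfra_clustertemplate_v1` resource stays as shown in the bug description.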
