lxd profile failure during upgrade-charm with more than 1 unit on a machine.

Bug #1904619 reported by Chris Johnston on 2020-11-17
This bug affects 4 people

Affects    Importance  Assigned to
juju       Medium      Heather Lanigan
juju 2.9   Medium      Heather Lanigan

Bug Description

Using Juju 2.8.6

Deploy CK 1.18+ck2 [1] where kubernetes-master goes on a container. I did -n 3 with 3 different bare metal machines for kubernetes-master.
Check the lxd profile on the metal nodes hosting kubernetes-master (I deployed from a local charm, hence the profile suffix is -0):

$ sudo lxc profile list
+-------------------------------------+---------+
|                NAME                 | USED BY |
+-------------------------------------+---------+
| default                             | 6       |
+-------------------------------------+---------+
| juju-kubernetes-kubernetes-master-0 | 1       |
+-------------------------------------+---------+
ubuntu@node02:~$ sudo lxc profile show juju-kubernetes-kubernetes-master-0
config:
  linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
  raw.lxc: |
    lxc.apparmor.profile=unconfined
    lxc.mount.auto=proc:rw sys:rw
    lxc.cgroup.devices.allow=a
    lxc.cap.drop=
  security.nesting: "true"
  security.privileged: "true"
description: ""
devices:
  aadisable:
    path: /dev/kmsg
    source: /dev/kmsg
    type: unix-char
name: juju-kubernetes-kubernetes-master-0
used_by:
- /1.0/containers/juju-bf96f0-1-lxd-3
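
As the reporter notes, the profile name encodes the model, the application, and the charm revision (the trailing -0 here is the local charm's revision). A minimal sketch of that naming convention, inferred from the profile names shown in this report:

```python
def lxd_profile_name(model: str, application: str, revision: int) -> str:
    """Build the per-charm LXD profile name as it appears in this report:
    juju-<model>-<application>-<charm revision>."""
    return f"juju-{model}-{application}-{revision}"

# The profile shown above for the locally deployed charm (revision 0):
print(lxd_profile_name("kubernetes", "kubernetes-master", 0))
```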

Upgrade the charm:

$ juju upgrade-charm kubernetes-master --switch cs:~containers/kubernetes-master-891

When the upgrade has completed, look at the lxd profile again:

ubuntu@node02:~$ sudo lxc profile list
+---------------------------------------+---------+
|                 NAME                  | USED BY |
+---------------------------------------+---------+
| default                               | 6       |
+---------------------------------------+---------+
| juju-kubernetes-kubernetes-master-891 | 1       |
+---------------------------------------+---------+
ubuntu@node02:~$ sudo lxc profile show juju-kubernetes-kubernetes-master-891
config: {}
description: ""
devices: {}
name: juju-kubernetes-kubernetes-master-891
used_by:
- /1.0/containers/juju-bf96f0-1-lxd-3

[1] https://api.jujucharms.com/charmstore/v5/charmed-kubernetes-485/archive/bundle.yaml

Tags: sts
Heather Lanigan (hmlanigan) wrote :

Confirmed.

I can reproduce with the charmed-kubernetes-485 bundle and upgrading the kubernetes-master or kubernetes-worker charms.

I cannot reproduce with a local lxd-profile test charm, including the charm-upgrade step. Nor can I reproduce with the cs:~juju-qa/bionic/lxd-profile-subordinate charm; I started with revision 0 and upgraded to the current revision, 2.

Heather Lanigan (hmlanigan) wrote :

If needed, the profiles can be edited by hand as a workaround.
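
Editing by hand amounts to copying the expected config and devices (from the charm's lxd-profile.yaml, or from a pre-upgrade copy of the profile) into the new, empty profile while keeping its name. A hedged sketch of that merge using plain dicts in place of the YAML (restore_profile is a hypothetical helper, not Juju code):

```python
def restore_profile(broken: dict, expected: dict) -> dict:
    """Return the broken (empty) profile with its config and devices
    replaced by the expected ones; the name is kept unchanged so LXD
    still associates the profile with the same containers."""
    fixed = dict(broken)
    fixed["config"] = dict(expected.get("config", {}))
    fixed["devices"] = dict(expected.get("devices", {}))
    return fixed

# The empty post-upgrade profile and the config from this report:
empty = {"name": "juju-kubernetes-kubernetes-master-891",
         "config": {}, "devices": {}}
wanted = {"config": {"security.nesting": "true",
                     "security.privileged": "true"},
          "devices": {"aadisable": {"path": "/dev/kmsg",
                                    "source": "/dev/kmsg",
                                    "type": "unix-char"}}}
fixed = restore_profile(empty, wanted)
print(fixed["name"], fixed["config"]["security.nesting"])
```

The repaired YAML would then be loaded back with `lxc profile edit`.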

Changed in juju:
status: New → Triaged
importance: Undecided → High
Chris Johnston (cjohnston) wrote :

This seems quite similar to LP#1856832

I have also seen this in an environment running Juju 2.7.7.

We should be able to use the workaround as described:

https://discourse.charmhub.io/t/juju-charms-and-lxd-profiles-gotchas-and-troubleshooting-and-potential-recovery/3387

Felipe Reyes (freyes) on 2020-11-17
tags: added: sts
Felipe Reyes (freyes) wrote :

could we assign this bug to the 2.8.7 milestone?

Chris Johnston (cjohnston) wrote :

Please also backport this to 2.7.x

John A Meinel (jameinel) wrote :

2.7 is generally out of support at this point (being >1 year old). If it is particularly necessary, we can certainly entertain a discussion, but it isn't something we would normally do.

At this point it won't go to 2.8.7 given where that release is at, but it is plausible to target it to a 2.8 release. I haven't followed the discussion to understand what the scope of the fix is.

From what I read above, it looks like we are creating a new profile for the updated version of the charm, and we are switching the container to use that profile, but we are just not populating the content of that profile correctly, is that correct?

John A Meinel (jameinel) wrote :

Also, do we know if this is something that happens only when using '--switch' ? or is it any upgrade?

Changed in juju:
milestone: none → 3.0.0
milestone: 3.0.0 → none
importance: High → Medium
Chris Johnston (cjohnston) wrote :

It also happens on upgrade

On Tue, 2020-12-08 at 15:01 +0000, Pete Vander Giessen wrote:
> ** Changed in: juju
> Importance: High => Medium

Considering that upgrading k8s charms would leave you with a broken environment, shouldn't this qualify as High?

@jameinel, I agree there is no need for a 2.7 backport, since the fix is on the upgrade path. Being scheduled for a 2.8 release would make SEG and Support feel better. Thanks for your assistance here.

James Troup (elmo) wrote :

I've subscribed field-high to this bug, as when it triggers it can cause (and has caused) a complete outage of a Keystone-backed Kubernetes.

Changed in juju:
assignee: nobody → Heather Lanigan (hmlanigan)
status: Triaged → In Progress
Heather Lanigan (hmlanigan) wrote :

This can happen on any juju machine with multiple units, where not all units have lxd profiles.

PR for 2.8
https://github.com/juju/juju/pull/12560
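
The triggering condition described above is a machine hosting several units where only some of their charms ship an lxd-profile.yaml. A hypothetical sketch of that check (the unit records and has_lxd_profile flag are illustrative, not Juju's internal representation):

```python
def machine_can_hit_bug(units: list) -> bool:
    """True when a machine hosts more than one unit and not every
    unit's charm ships an lxd-profile.yaml -- the situation in which
    the upgraded profile came out empty in this report."""
    with_profiles = [u for u in units if u["has_lxd_profile"]]
    return len(units) > 1 and 0 < len(with_profiles) < len(units)

# Mixed machine: a profile-bearing charm plus a subordinate without one.
units = [{"name": "kubernetes-master/0", "has_lxd_profile": True},
         {"name": "ntp/0", "has_lxd_profile": False}]
print(machine_can_hit_bug(units))
```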

Changed in juju:
milestone: none → 2.8.8
summary: - kubernetes-master charm lxc profile not getting updated with the correct
- configuration
+ lxd profile failure during upgrade-charm with more than 1 unit on a
+ machine.
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released