lxd profile failure during upgrade-charm with more than 1 unit on a machine.

Bug #1904619 reported by Chris Johnston
This bug affects 5 people
Affects         Status        Importance  Assigned to      Milestone
Canonical Juju  Fix Released  Medium      Heather Lanigan
2.9             Fix Released  Medium      Heather Lanigan

Bug Description

Using Juju 2.8.6

Deploy CK 1.18+ck2 [1], where kubernetes-master goes in a container. I used -n 3 with 3 different bare-metal machines for kubernetes-master.
Check the lxd profiles on the metal nodes hosting kubernetes-master (I deployed from a local charm, hence the profile suffix is -0):

$ sudo lxc profile list
+-------------------------------------+---------+
|                NAME                 | USED BY |
+-------------------------------------+---------+
| default                             | 6       |
+-------------------------------------+---------+
| juju-kubernetes-kubernetes-master-0 | 1       |
+-------------------------------------+---------+
ubuntu@node02:~$ sudo lxc profile show juju-kubernetes-kubernetes-master-0
config:
  linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
  raw.lxc: |
    lxc.apparmor.profile=unconfined
    lxc.mount.auto=proc:rw sys:rw
    lxc.cgroup.devices.allow=a
    lxc.cap.drop=
  security.nesting: "true"
  security.privileged: "true"
description: ""
devices:
  aadisable:
    path: /dev/kmsg
    source: /dev/kmsg
    type: unix-char
name: juju-kubernetes-kubernetes-master-0
used_by:
- /1.0/containers/juju-bf96f0-1-lxd-3

Upgrade the charm:

$ juju upgrade-charm kubernetes-master --switch cs:~containers/kubernetes-master-891

When upgrade has been completed, look at the lxd-profile again:

ubuntu@node02:~$ sudo lxc profile list
+---------------------------------------+---------+
|                 NAME                  | USED BY |
+---------------------------------------+---------+
| default                               | 6       |
+---------------------------------------+---------+
| juju-kubernetes-kubernetes-master-891 | 1       |
+---------------------------------------+---------+
ubuntu@node02:~$ sudo lxc profile show juju-kubernetes-kubernetes-master-891
config: {}
description: ""
devices: {}
name: juju-kubernetes-kubernetes-master-891
used_by:
- /1.0/containers/juju-bf96f0-1-lxd-3
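
A quick way to check whether a host is affected is to look for an empty config in the juju-created profile, as in the output above. A minimal sketch, grepping sample output captured here (on a real host you would pipe `sudo lxc profile show <name>` instead):

```shell
# Sample of the post-upgrade profile output shown above.
sample='config: {}
description: ""
devices: {}'

# An affected host shows an empty config map for the juju-created profile.
if printf '%s\n' "$sample" | grep -q '^config: {}$'; then
  echo "profile looks empty: affected"
fi
```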

[1] https://api.jujucharms.com/charmstore/v5/charmed-kubernetes-485/archive/bundle.yaml

Tags: sts
Revision history for this message
Heather Lanigan (hmlanigan) wrote :

Confirmed.

I can reproduce with the charmed-kubernetes-485 bundle and upgrading the kubernetes-master or kubernetes-worker charms.

I cannot reproduce with a local lxd profile test charm. Including the charm upgrade part. Nor can I reproduce with the cs:~juju-qa/bionic/lxd-profile-subordinate charm. I started with revision 0 and upgraded to the current revision, 2.

Revision history for this message
Heather Lanigan (hmlanigan) wrote :

If needed, the profiles can be edited by hand as a workaround.
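
A sketch of that manual workaround, re-applying the expected profile content (taken verbatim from the pre-upgrade `lxc profile show` output above) to the new, empty profile. The profile name and the `/tmp` path are from this report and for illustration; adjust them to your model:

```shell
# Write the known-good profile content (from the pre-upgrade output) to a file.
cat > /tmp/km-profile.yaml <<'EOF'
config:
  linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
  raw.lxc: |
    lxc.apparmor.profile=unconfined
    lxc.mount.auto=proc:rw sys:rw
    lxc.cgroup.devices.allow=a
    lxc.cap.drop=
  security.nesting: "true"
  security.privileged: "true"
description: ""
devices:
  aadisable:
    path: /dev/kmsg
    source: /dev/kmsg
    type: unix-char
EOF
echo "wrote /tmp/km-profile.yaml"

# Then, on the affected host (not run here), apply it to the empty profile:
#   sudo lxc profile edit juju-kubernetes-kubernetes-master-891 < /tmp/km-profile.yaml
```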

Changed in juju:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Chris Johnston (cjohnston) wrote :

This seems quite similar to LP#1856832

I have also seen this in an environment running Juju 2.7.7.

We should be able to use the workaround as described:

https://discourse.charmhub.io/t/juju-charms-and-lxd-profiles-gotchas-and-troubleshooting-and-potential-recovery/3387

Felipe Reyes (freyes)
tags: added: sts
Revision history for this message
Felipe Reyes (freyes) wrote :

could we assign this bug to the 2.8.7 milestone?

Revision history for this message
Chris Johnston (cjohnston) wrote :

Please also backport this to 2.7.x

Revision history for this message
John A Meinel (jameinel) wrote :

2.7 is generally out of support at this point (being >1 year old). If it is particularly necessary, we can certainly entertain a discussion, but it isn't something we would normally do.

At this point it won't go to 2.8.7 given where that release is at, but it is plausible to target it to a 2.8 release. I haven't followed the discussion to understand what the scope of the fix is.

From what I read above, it looks like we are creating a new profile for the updated version of the charm, and we are switching the container to use that profile, but we are just not populating the content of that profile correctly, is that correct?

Revision history for this message
John A Meinel (jameinel) wrote :

Also, do we know if this is something that happens only when using '--switch' ? or is it any upgrade?

Pen Gale (pengale)
Changed in juju:
milestone: none → 3.0.0
milestone: 3.0.0 → none
importance: High → Medium
Revision history for this message
Chris Johnston (cjohnston) wrote :

It also happens on upgrade

Revision history for this message
Felipe Reyes (freyes) wrote : Re: [Bug 1904619] Re: kubernetes-master charm lxc profile not getting updated with the correct configuration

On Tue, 2020-12-08 at 15:01 +0000, Pete Vander Giessen wrote:
> ** Changed in: juju
> Importance: High => Medium

Considering that upgrading k8s charms would leave you with a broken
environment, shouldn't this qualify as High?

Revision history for this message
Jeffrey Jay Scheel (jscheel) wrote : Re: kubernetes-master charm lxc profile not getting updated with the correct configuration

@jameinel, I agree there is no need for a 2.7 backport, since this is on the upgrade path. So being scheduled for a 2.8 release would make SEG and Support feel better. Thanks for your assistance here.

Revision history for this message
James Troup (elmo) wrote :

I've subscribed field-high to this bug as when it triggers it can (and has) cause(d) a complete outage of a Keystone backed Kubernetes.

Changed in juju:
assignee: nobody → Heather Lanigan (hmlanigan)
status: Triaged → In Progress
Revision history for this message
Heather Lanigan (hmlanigan) wrote :

This can happen on any juju machine with multiple units, where not all units have lxd profiles.

PR for 2.8
https://github.com/juju/juju/pull/12560

Changed in juju:
milestone: none → 2.8.8
summary: - kubernetes-master charm lxc profile not getting updated with the correct
- configuration
+ lxd profile failure during upgrade-charm with more than 1 unit on a
+ machine.
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released