sunbeam-microk8s doesn't join cluster on 2023.2/edge

Bug #2045670 reported by Radu Malica
This bug affects 2 people
OpenStack Snap (status tracked in 2023.2)

| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| 2023.1 | Fix Released | High | James Page | |
| 2023.2 | Fix Released | High | James Page | |

Bug Description

Hello,

I am trying to deploy microstack/sunbeam as a multi-node setup on 3 machines within the recommended spec. Each machine has 8 CPUs, 32 GB of memory, a 50 GB OS disk, and a 32 GB Ceph disk, all on enterprise SSD storage. I am following the instructions here: https://microstack.run/docs/multi-node

After running into multiple other bugs with multi-node deployment on 2023.1/stable, I tried 2023.2/edge, since a lot of fixes have already been applied in that version. However:

When a second node tries to join the cluster, it fails with the following error in 'juju status':

```
microk8s blocked 2 microk8s legacy/stable 121 no join failed, will try again
```

The node that is trying to join the cluster then eventually times out.

The debug-log shows:

```
unit-microk8s-1: 14:47:51 DEBUG unit.microk8s/1.config-changed Contacting cluster at 10.8.1.228
unit-microk8s-1: 14:47:51 DEBUG unit.microk8s/1.config-changed Joining cluster failed. Could not verify the identity of 10.8.1.228. Use '--skip-verify' to skip server certificate check.
unit-microk8s-1: 14:47:51 ERROR unit.microk8s/1.juju-log Failed to join cluster; deferring to try again later.
```
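To check a longer debug-log for the same failure, a small filter along these lines may help. This is only a sketch: `failed_join_hosts` and the regular expression are hypothetical, written against the message format shown in the log excerpt above, and other microk8s versions may word the message differently.

```python
import re
from typing import Iterable, List

# Matches the message seen in the debug-log excerpt and captures the
# address of the node whose identity could not be verified.
VERIFY_FAILURE = re.compile(r"Could not verify the identity of (\S+?)\.\s")


def failed_join_hosts(lines: Iterable[str]) -> List[str]:
    """Return the cluster addresses named in identity-verification failures."""
    hosts = []
    for line in lines:
        match = VERIFY_FAILURE.search(line)
        if match:
            hosts.append(match.group(1))
    return hosts
```

Feeding it the output of `juju debug-log` (for example, read from a saved file) would list each address the joining unit failed to verify.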

Inspecting the config of the microk8s charm shows that "skip_verify" is set to False by default.

In microk8scluster.py, in the function "on_node_added":

```
        try:
            join_cmd = ["microk8s", "join", url]
            if self.model.config.get("skip_verify"):
                join_cmd += ["--skip-verify"]
```
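The effect of that conditional can be sketched in isolation. `build_join_command` is a hypothetical helper, and a plain dict stands in for `self.model.config`. It is not the charm's actual code, only an illustration of the logic in the excerpt.

```python
from typing import List, Mapping


def build_join_command(url: str, config: Mapping[str, object]) -> List[str]:
    """Build the argv for `microk8s join`, mirroring the charm excerpt."""
    join_cmd = ["microk8s", "join", url]
    # The flag is appended only when skip_verify is truthy. A missing key
    # or False (the default reported here) leaves verification enabled.
    if config.get("skip_verify"):
        join_cmd += ["--skip-verify"]
    return join_cmd
```

With the default config the joining node therefore runs a plain `microk8s join <url>`, which is the call that fails with the identity error; only an explicit `skip_verify: True` adds the flag.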

For some reason, the SSL certificate of the master node is not "transported" to the joining node via the registration token, nor is any other verification done when the "sunbeam join" command is issued that would let the microk8s configuration be set automatically to "skip_verify: True".

Manually applying the setting with "juju config microk8s skip_verify=True" BEFORE the "sunbeam cluster join" command times out on the joining node fixes the issue.
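If the workaround needs to be scripted, something like the following could be used. It is a sketch only: the function names are hypothetical, it assumes `juju` is on the PATH and already pointed at the right model, and it runs the same `juju config` command quoted in this report.

```python
import subprocess
from typing import List


def skip_verify_command(app: str = "microk8s", enabled: bool = True) -> List[str]:
    """Return the argv for the `juju config` call used as the workaround."""
    return ["juju", "config", app, f"skip_verify={enabled}"]


def apply_workaround() -> None:
    """Run the command; raises CalledProcessError if juju reports failure."""
    subprocess.run(skip_verify_command(), check=True)
```

As noted in the report, this has to be applied before the join on the new node times out.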
