Relevant test run in Solutions QA lab: https://solutions.qa.canonical.com/testruns/508a6008-f1b4-460a-8fd2-e04804416a8c
All artefacts from the run: https://oil-jenkins.canonical.com/artifacts/508a6008-f1b4-460a-8fd2-e04804416a8c/index.html
Relevant artefact: generated/generated/microk8s/juju-crashdump-microk8s-2023-12-14-09.32.16.tar.gz
Our test run failed while adding the juju-system-microk8s model on the microk8s_cloud cloud, with the following error:
```
2023-12-14-09:31:15 root DEBUG [localhost]: juju add-model -c foundations-maas juju-system-microk8s microk8s_cloud
2023-12-14-09:31:17 root ERROR [localhost] Command failed: juju add-model -c foundations-maas juju-system-microk8s microk8s_cloud
2023-12-14-09:31:17 root ERROR 1[localhost] STDOUT follows:
b''
2023-12-14-09:31:17 root ERROR 2[localhost] STDERR follows:
ERROR creating namespace "juju-system-microk8s": rpc error: code = Unknown desc = exec (try: 500): database is locked
```
In the relevant artifact, the syslog file at 0/baremetal/var/log/syslog shows the following lines:
```
Dec 14 09:31:11 microk8s-55-1-3 microk8s.daemon-kubelite[9310]: I1214 09:31:11.024783 9310 job_controller.go:562] "enqueueing job" key="rook-ceph/rook-ceph-csi-detect-version"
Dec 14 09:31:11 microk8s-55-1-3 microk8s.daemon-kubelite[9310]: I1214 09:31:11.106641 9310 replica_set.go:676] "Finished syncing" kind="ReplicaSet" key="rook-ceph/csi-rbdplugin-provisioner-5885496bf5" duration="83.646528ms"
Dec 14 09:31:11 microk8s-55-1-3 microk8s.daemon-kubelite[9310]: I1214 09:31:11.116299 9310 replica_set.go:676] "Finished syncing" kind="ReplicaSet" key="rook-ceph/csi-rbdplugin-provisioner-5885496bf5" duration="651.454µs"
Dec 14 09:31:11 microk8s-55-1-3 microk8s.daemon-kubelite[9310]: I1214 09:31:11.709716 9310 job_controller.go:562] "enqueueing job" key="rook-ceph/rook-ceph-csi-detect-version"
Dec 14 09:31:17 microk8s-55-1-3 microk8s.daemon-k8s-dqlite[11911]: time="2023-12-14T09:31:17Z" level=error msg="error in txn: exec (try: 500): database is locked"
Dec 14 09:31:17 microk8s-55-1-3 microk8s.daemon-kubelite[9310]: E1214 09:31:17.734550 9310 status.go:71] apiserver received an error that is not an metav1.Status: &status.Error{s:(*status.Status)(0xc00950fd48)}: rpc error: code = Unknown desc = exec (try: 500): database is locked
Dec 14 09:31:17 microk8s-55-1-3 microk8s.daemon-kubelite[9310]: I1214 09:31:17.734937 9310 trace.go:236] Trace[1854421033]: "Create" accept:application/json, */*,audit-id:c1a0ffa6-961c-4f85-92c5-da79207e583e,client:10.246.167.34,protocol:HTTP/2.0,resource:namespaces,scope:resource,url:/api/v1/namespaces,user-agent:Go-http-client/2.0,verb:POST (14-Dec-2023 09:31:15.682) (total time: 2052ms):
Dec 14 09:31:17 microk8s-55-1-3 microk8s.daemon-kubelite[9310]: Trace[1854421033]: ["Create etcd3" audit-id:c1a0ffa6-961c-4f85-92c5-da79207e583e,key:/namespaces/juju-system-microk8s,type:*core.Namespace,resource:namespaces 2030ms (09:31:15.704)
Dec 14 09:31:17 microk8s-55-1-3 microk8s.daemon-kubelite[9310]: Trace[1854421033]: ---"Txn call failed" err:rpc error: code = Unknown desc = exec (try: 500): database is locked 2029ms (09:31:17.734)]
Dec 14 09:31:17 microk8s-55-1-3 microk8s.daemon-kubelite[9310]: Trace[1854421033]: [2.052580676s] [2.052580676s] END
```
Versions of the components being deployed:
maas 3.3.5-
juju 3.1.6
cpe-foundation 2.21.2+git.4.g72d209f6
infra-ubuntu focal
solutions-qa-ci b6014422
ceph quincy/stable
charms yoga/stable
fce-container-image ubuntu:jammy
openstack yoga
landscape-server 23.10+1-0landscape0
charmed-kubernetes 1.28
microk8s v1.28.3
cloud-init 23.3.3-0ubuntu0~22.04.1
sku fcb-master-yoga-jammy-ironic
cos-lite latest/stable:11
fcbtest latest/beta
This looks like a k8s error - juju asks k8s to create a namespace for the new model and k8s responds with an error. If there's a db contention issue, k8s should be handling that internally. Juju can consider its own retry strategy to deal with flaky upstream components, but I think we need to push on microk8s to fix the issue closer to the source.
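For illustration only, a client-side retry of the kind mentioned above could look roughly like the sketch below. It is not Juju's code and not a proposed patch: `retry_transient`, `is_transient`, and the default attempt and delay values are hypothetical names and numbers chosen for the example. The only thing taken from this report is the "database is locked" error text.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Error text seen in the failed run; matching on it is an assumption
# made for this sketch, not an established Juju or microk8s convention.
TRANSIENT_MARKER = "database is locked"


def is_transient(exc: Exception) -> bool:
    """Treat an error as retryable only if it carries the lock message."""
    return TRANSIENT_MARKER in str(exc)


def retry_transient(
    op: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call op(), retrying with exponential backoff on transient errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except Exception as exc:
            # Re-raise non-transient errors, and the last transient one.
            if not is_transient(exc) or attempt == attempts - 1:
                raise
            # Back off: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Here `op` would wrap whatever call creates the namespace. Any other error is raised immediately, so genuine failures are not hidden by the retry. This would only paper over the contention on the microk8s side, which still needs fixing at the source.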