I did an experiment: I installed the cluster step by step using the CLI instead of the bundle, to catch at which point /etc/ceph/ceph.conf gets clobbered:

juju add-machine -n 3

juju deploy ceph-mon --channel quincy/stable -n 3 --to lxd:0,lxd:1,lxd:2
juju deploy ceph-osd --channel quincy/stable -n 3 --to 0,1,2 \
    --config osd-devices=/dev/nvme0n1
juju integrate ceph-mon:osd ceph-osd:mon

# at this point /etc/ceph/ceph.conf shows up on machines 0..2 with correct contents

juju deploy ceph-fs --channel quincy/stable -n 3 --to lxd:0,lxd:1,lxd:2
juju integrate ceph-mon:mds ceph-fs:ceph-mds

juju deploy easyrsa --channel 1.28/stable -n 3 --to lxd:0,lxd:1,lxd:2
juju deploy etcd --channel 1.28/stable -n 3 --to lxd:0,lxd:1,lxd:2
juju integrate easyrsa:client etcd:certificates

juju deploy kubernetes-control-plane --channel 1.28/stable -n 3 --to 0,1,2 \
    --config extra_sans="127.0.0.1 192.168.3.5 k8s.stagnum.caltha.eu" \
    --config loadbalancer-ips=192.168.3.5 \
    --config service-cidr=10.152.180.0/22 \
    --config register-with-taints="" \
    --config proxy-extra-config="{ mode: ipvs, ipvs: { strictARP: true } }" \
    --config sysctl="{ net.bridge.bridge-nf-call-iptables: 0, net.ipv4.conf.all.forwarding: 1, net.ipv4.conf.all.rp_filter: 1, net.ipv4.neigh.default.gc_thresh1: 128, net.ipv4.neigh.default.gc_thresh2: 28672, net.ipv4.neigh.default.gc_thresh3: 32768, net.ipv6.neigh.default.gc_thresh1: 128, net.ipv6.neigh.default.gc_thresh2: 28672, net.ipv6.neigh.default.gc_thresh3: 32768, fs.inotify.max_user_instances: 8192, fs.inotify.max_user_watches: 1048576, kernel.panic: 10, kernel.panic_on_oops: 1, vm.overcommit_memory: 1 }" \
    --config allow-privileged=true
juju integrate easyrsa:client kubernetes-control-plane:certificates
juju integrate etcd:db kubernetes-control-plane:etcd

juju deploy containerd --channel 1.28/stable
juju integrate kubernetes-control-plane:container-runtime containerd:containerd

juju deploy calico --channel 1.28/stable \
    --config cidr=92.168.64.0/20
juju integrate etcd:db calico:etcd
juju integrate kubernetes-control-plane:cni calico:cni

juju deploy kubeapi-load-balancer --channel 1.28/stable -n 3 --to lxd:0,lxd:1,lxd:2 \
    --config extra_sans="127.0.0.1 192.168.3.5 k8s.stagnum.caltha.eu"
juju deploy keepalived --channel stable \
    --config vip_hostname=k8s.stagnum.caltha.eu \
    --config virtual_ip=192.168.3.5
juju integrate easyrsa:client kubeapi-load-balancer:certificates
juju integrate kubeapi-load-balancer:juju-info keepalived:juju-info
juju integrate kubernetes-control-plane:loadbalancer-internal kubeapi-load-balancer:lb-consumers
juju integrate kubernetes-control-plane:loadbalancer-external kubeapi-load-balancer:lb-consumers

juju deploy ceph-csi --channel stable \
    --config namespace=kube-system \
    --config cephfs-enable=true
juju integrate ceph-csi:kubernetes kubernetes-control-plane:juju-info
juju integrate ceph-csi:ceph-client ceph-mon:client

# as soon as ceph-csi starts, /etc/ceph/ceph.conf on machines 0..2 gets overwritten with incorrect contents

My hypothesis is that running the kubernetes-control-plane and ceph-osd units unconfined on the same host (i.e. not inside LXD) causes the interference. I need to run ceph-osd unconfined so it can access the physical disks (I think), but I can try running kubernetes-control-plane in LXD containers instead. I tried that at some point before and it seemed to work, but I thought I'd better run it unconfined to avoid any potential containerd and calico problems. I'll try it and report back.
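
For reference, by "run kubernetes-control-plane in LXD" I just mean changing the placement in the deploy above, roughly like this (not tested on this cluster yet; the remaining --config options would stay exactly as in the unconfined deploy):

juju deploy kubernetes-control-plane --channel 1.28/stable -n 3 --to lxd:0,lxd:1,lxd:2 \
    --config allow-privileged=true   # plus the rest of the --config options from above, unchanged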
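
While testing that, a quick way to spot the clobbering is to compare checksums on the three machines before and after integrating ceph-csi (nothing charm-specific here):

for m in 0 1 2; do juju ssh $m sudo md5sum /etc/ceph/ceph.conf; done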
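
And if I need to know exactly which process rewrites the file, a minimal sketch with auditd (assuming auditd is installed on the machines; the key name cephconf is just a label I picked):

juju ssh 0 sudo auditctl -w /etc/ceph/ceph.conf -p wa -k cephconf
# ...reproduce the issue (integrate ceph-csi), then:
juju ssh 0 sudo ausearch -k cephconf -i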