Ceph Monitor Charm

Q -> R: ceph mgr down after upgrade due to start-limit-hit

Bug #2038518 reported by Peter Sabaini on 2023-10-05

This bug affects 2 people

| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Ceph Monitor Charm | New | Undecided | Unassigned | |

Bug Description

**Description:**
While upgrading `ceph-mon` from Quincy to Reef, I encountered an issue where `ceph-mgr` is restarted too many times in quick succession. It hits the `systemd` start limit and is left down.

The failure is not consistent, though: across two consecutive runs, 3 of 3 mgrs were down on the first run and only 1 of 3 on the second.

**Reproduction Steps:**
1. Deploy a Quincy cloud.
2. Run `juju config ceph-mon source=cloud:jammy-bobcat`.

**Error Message:**
When checking the status using `sudo systemctl status <email address hidden>`, this error was shown:

```shell
ubuntu@juju-bc9f56-zaza-5ec88f2270ac-7:~$ sudo systemctl status <email address hidden>
× <email address hidden> - Ceph cluster manager daemon
...
Oct 05 09:02:11 juju-bc9f56-zaza-5ec88f2270ac-7 systemd[1]: <email address hidden>: Start request repeated too quickly.
Oct 05 09:02:11 juju-bc9f56-zaza-5ec88f2270ac-7 systemd[1]: <email address hidden>: Failed with result 'start-limit-hit'.
Oct 05 09:02:11 juju-bc9f56-zaza-5ec88f2270ac-7 systemd[1]: Failed to start Ceph cluster manager daemon.

```

**Workaround:**
Reloading `systemd` seems to solve this, as the service starts correctly after running `sudo systemctl daemon-reload`.
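The workaround can be scripted roughly as follows. This is only a sketch, not something the charm does: the function names and the `SYSTEMCTL` override are hypothetical, and the `reset-failed` and explicit `start` steps are assumptions added on top of the `daemon-reload` described above. The unit name is supplied by the caller, since it is hidden in this report.

```shell
#!/usr/bin/env bash
# Sketch only: recover a unit that failed with 'start-limit-hit'.
# The unit name is passed in by the caller; it is not taken from the report.

# Return 0 if the text on stdin contains the failure result seen in the report.
hit_start_limit() {
    grep -q "Failed with result 'start-limit-hit'"
}

# Apply the workaround to one unit.
# SYSTEMCTL is a hypothetical override hook (e.g. SYSTEMCTL="sudo systemctl");
# it defaults to plain systemctl.
recover_unit() {
    local unit="$1"
    local ctl="${SYSTEMCTL:-systemctl}"
    # Only act when the unit's status output shows the start-limit failure.
    if $ctl status "$unit" 2>&1 | hit_start_limit; then
        $ctl daemon-reload          # the workaround from the report
        $ctl reset-failed "$unit"   # assumption: clear the failed state and start counter
        $ctl start "$unit"          # assumption: start explicitly instead of waiting
    fi
}
```

After sourcing the functions, a call such as `SYSTEMCTL="sudo systemctl" recover_unit "$MGR_UNIT"` would apply it, where `MGR_UNIT` is a placeholder for the mgr unit name shown in the status output. Units that are not in the `start-limit-hit` state are left untouched.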

**Additional Information:**
The charm was from `latest/edge`, at git revision 55beb25.
