Comment 1 for bug 2000630

Nobuto Murata (nobuto) wrote :

After I configured authentication for the crash module, the OSD nodes reported a recent crash to the MON. Real outages can follow a few crashes of MONs or OSDs, so surfacing this in the cluster health should give operators a heads-up to diagnose a recent crash.

https://docs.ceph.com/en/quincy/mgr/crash/

$ juju ssh ceph-mon/leader -- sudo ceph auth get-or-create client.crash mon 'profile crash' mgr 'profile crash'
[client.crash]
     key = AQCRI6xje9HrHxAAU20bKTeL3k2pIlPNazeVfQ==

$ juju run -a ceph-osd '
cat <<EOF | sudo tee /etc/ceph/ceph.client.crash.keyring
[client.crash]
     key = AQCRI6xje9HrHxAAU20bKTeL3k2pIlPNazeVfQ==
EOF
'

$ sudo ceph health detail
HEALTH_WARN 1 daemons have recently crashed
[WRN] RECENT_CRASH: 1 daemons have recently crashed
    osd.2 crashed on host famous-skunk at 2022-12-28T10:42:04.661282Z
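
To act on a warning like the one above, it may help to pull the crashed daemon, host and timestamp out of the `ceph health detail` output. The following is only a sketch: `recent_crashes` is a hypothetical helper name, not part of Ceph, and it assumes the crash lines keep the "<daemon> crashed on host <host> at <timestamp>" layout shown above. The `ceph crash` subcommands in the comments are the ones described on the crash module page linked earlier; they were not run as part of this comment.

```shell
# Hypothetical helper (not part of Ceph): read `ceph health detail` output
# on stdin and print "daemon host timestamp" for each recent crash line.
# Assumes lines shaped like: "osd.2 crashed on host famous-skunk at <time>".
recent_crashes() {
    awk '$2 == "crashed" && $3 == "on" && $4 == "host" { print $1, $5, $7 }'
}

# Possible usage on a node with an admin keyring (not run here):
#   sudo ceph health detail | recent_crashes
#   sudo ceph crash ls-new            # list crashes that are not archived yet
#   sudo ceph crash info <crash-id>   # metadata and backtrace for one crash
#   sudo ceph crash archive-all       # acknowledge them, clearing RECENT_CRASH
```

Archiving only silences the health warning; the crash reports themselves stay available through `ceph crash ls`.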