Changing osd-devices config restarts all OSD daemons and attempts to upgrade

Bug #1966414 reported by Sandor Zeestraten
This bug affects 2 people
Affects: Ceph OSD Charm
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

# Problems

1. When changing the `osd-devices` config option, for example when adding disks, the charm restarts all OSD daemons at once. This is not recommended, as it can impact cluster performance under load, and it should not happen for most normal config changes.

2. Judging by the source code (https://github.com/openstack/charm-ceph-osd/blob/73fe60b3dfde78c21e2b4ade0654ce750453dc7b/hooks/ceph_hooks.py#L472) and the charm unit logs, the charm also attempts to upgrade the Ceph packages.

3. Finally, there are many WARNING messages as the charm attempts, and fails, to read the OSD daemon state (https://github.com/openstack/charm-ceph-osd/blob/99ef6cb306b361621cdbcd6e44268b80051955b9/lib/charms_ceph/utils.py#L2877-L2912).

# /var/log/juju/unit-ceph-osd-0.log
https://paste.ubuntu.com/p/ZYjdHFnp7X/

# version
juju 2.9.27
ceph-mon rev 73
ceph-osd rev 513

Revision history for this message
Matus Kosut (matuskosut) wrote :

Generally, regarding control of restarts, it feels like the Ceph charms could use the same approach as the OpenStack charms:

https://docs.openstack.org/charm-guide/latest/admin/deferred-events.html

The option to disable restarts, and having control over what is restarted and when restarts happen, is very important when running a production-grade cluster. The big-bang approach used currently introduces too much uncertainty.

Question to the OpenStack charmers: is there any interest in or plan to implement some of these points? If so, I would be willing to help out.

1. Enable/disable auto restarts:
juju config <charm-name> enable-auto-restarts=False

2. Deferred events:
Possibility to run the show-deferred-events/run-deferred-hooks actions on specific units (see the sketch below).
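
A rough sketch of that workflow as it exists today for the OpenStack charms (config option and action names per the deferred-events guide linked above; none of this is implemented in the Ceph charms yet):

# keep the charm from restarting services itself; restarts become deferred events
juju config <charm-name> enable-auto-restarts=False
# inspect and, when convenient, process the deferred events on a specific unit
juju run-action --wait <charm-name>/0 show-deferred-events
juju run-action --wait <charm-name>/0 run-deferred-hooks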

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

I'm uncertain that we really understand what's happening here. The ceph-osd charms should almost never restart the OSD daemons and, in the one case that they do (ceph upgrade), they should go one disk at a time, on one machine at a time, through the whole cluster. A config-change of the osd-devices should absolutely not do that, unless there's something else very strange going on.

Revision history for this message
Sandor Zeestraten (szeestraten) wrote (last edit ):

Hi Chris, this is pretty simple to replicate. I'm pretty sure this gets called to restart all OSDs on each host: https://github.com/openstack/charm-ceph-osd/blob/55720fa087f3ddaddbd761d24c2ceb1ef72d70d3/lib/charms_ceph/utils.py#L2699

# bundle.yaml
relations:
- - ceph-osd:mon
  - ceph-mon:osd
series: focal
applications:
  ceph-mon:
    charm: cs:ceph-mon-73
    num_units: 3
    constraints: tags=ceph-mon
    bindings:
      "": site2-oam
      public: site2-ceph-public
    options:
      monitor-count: 3
      source: cloud:focal-xena
      ceph-public-network: "172.17.104.0/23"
  ceph-osd:
    charm: cs:ceph-osd-513
    num_units: 3
    constraints: tags=ceph-osd
    options:
      osd-devices: "/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf"
      #osd-devices: "/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm"
      source: cloud:focal-xena
      ceph-public-network: "172.17.104.0/23"
      ceph-cluster-network: "172.17.106.0/23"
      bluestore-db: "/dev/nvme0n1"
      bluestore-block-db-size: "266000000000"
    bindings:
      "": site2-oam
      public: site2-ceph-public
      cluster: site2-ceph-cluster

# Steps
1. juju deploy ./bundle.yaml
2. Wait for the deploy to finish and for the cluster to reach HEALTH_OK
3. juju config ceph-osd osd-devices="/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg"
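
To watch what happens while step 3 runs, something along these lines can be used (illustrative only; the ceph command needs the admin keyring, e.g. on a ceph-mon host):

# on the Juju client: unit/workload status
juju status ceph-osd
# on a ceph-mon host: cluster health and OSD up/down events
ceph -s
# on an OSD host: watch the OSD services restart
journalctl -f -u 'ceph-osd@*'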

# Ceph monitor logs showing all OSDs reported down per host after changing osd-devices config flag
root@hcc-admin24:/var/log/ceph# tail -f ceph.log | grep down
2022-04-04T13:05:37.800286+0000 mon.hcc-admin19 (mon.0) 1334 : cluster [WRN] Health check failed: 6 osds down (OSD_DOWN)
2022-04-04T13:05:37.800340+0000 mon.hcc-admin19 (mon.0) 1335 : cluster [WRN] Health check failed: 1 host (6 osds) down (OSD_HOST_DOWN)
2022-04-04T13:05:41.834425+0000 mon.hcc-admin19 (mon.0) 1359 : cluster [INF] Health check cleared: OSD_DOWN (was: 6 osds down)
2022-04-04T13:05:41.834483+0000 mon.hcc-admin19 (mon.0) 1360 : cluster [INF] Health check cleared: OSD_HOST_DOWN (was: 1 host (6 osds) down)
2022-04-04T13:06:04.373403+0000 mon.hcc-admin19 (mon.0) 1622 : cluster [WRN] Health check failed: 5 osds down (OSD_DOWN)
2022-04-04T13:06:04.373456+0000 mon.hcc-admin19 (mon.0) 1623 : cluster [WRN] Health check failed: 1 host (5 osds) down (OSD_HOST_DOWN)
2022-04-04T13:06:08.416227+0000 mon.hcc-admin19 (mon.0) 1648 : cluster [INF] Health check cleared: OSD_HOST_DOWN (was: 1 host (5 osds) down)
2022-04-04T13:08:55.647011+0000 mon.hcc-admin19 (mon.0) 2013 : cluster [WRN] Health check failed: 5 osds down (OSD_DOWN)
2022-04-04T13:08:55.647064+0000 mon.hcc-admin19 (mon.0) 2014 : cluster [WRN] Health check failed: 1 host (5 osds) down (OSD_HOST_DOWN)
2022-04-04T13:08:59.694032+0000 mon.hcc-admin19 (mon.0) 2038 : cluster [INF] Health check cleared: OSD_DOWN (was: 5 osds down)
2022-04-04T13:08:59.694076+0000 mon.hcc-admin19 (mon.0) 2039 : cluster [INF] Health check cleared: OSD_HOST_DOWN (was: 1 host (5 osds) down)

# Example of a few Ceph OSD logs showing they all got restarted at the same time
root@hcc-store36:~# journalctl -u ceph-osd@3
-- Logs begin at Mon 2022-04-04 12:56:29 UTC, end at Mon 2022-04-04 13:11:27 UTC. --
Apr 0...


Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

I agree that an upgrade triggers a service restart (although it should be coordinated to only bring one OSD down at a time). My concern is that it should not be thinking it needs an upgrade in the first place.

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

I suspect that you're using a version of the Ceph charms that doesn't understand xena == pacific yet. Can you confirm whether this[1] is included in the charm that you've deployed?

1: https://github.com/openstack/charms.ceph/blob/18fc2c67483f6d277b26f9f1287451f4bebe058b/charms_ceph/utils.py#L3216
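
One quick way to check on a deployed unit is to grep the charm's embedded copy of charms_ceph (path as reported by the unit agent; the unit name here is illustrative):

# does the deployed charm's UCA_CODENAME_MAP know about xena?
juju ssh ceph-osd/0 'grep -n xena /var/lib/juju/agents/unit-ceph-osd-*/charm/lib/charms_ceph/utils.py'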

Revision history for this message
Sandor Zeestraten (szeestraten) wrote :

We're using rev 513, which is the latest stable charm according to https://charmhub.io/ceph-osd.

Looking at utils.py in the deployed charm shows only older releases:

UCA_CODENAME_MAP = {
    'icehouse': 'firefly',
    'juno': 'firefly',
    'kilo': 'hammer',
    'liberty': 'hammer',
    'mitaka': 'jewel',
    'newton': 'jewel',
    'ocata': 'jewel',
    'pike': 'luminous',
    'queens': 'luminous',
    'rocky': 'mimic',
    'stein': 'mimic',
    'train': 'nautilus',
    'ussuri': 'octopus',
}
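
For comparison, the newer charms.ceph that Chris linked extends this map for more recent UCA releases; going by the Ceph version each release ships, the missing entries would look roughly like this (approximate, not an exact quote of the upstream file):

    'victoria': 'octopus',
    'wallaby': 'pacific',
    'xena': 'pacific',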

root@hcc-store36:/var/lib/juju/agents/unit-ceph-osd-0/charm# cat repo-info
commit-sha-1: b4642b0e2f3c145252a3e29ff5f5a1453656abed
commit-short: b4642b0
branch: HEAD
remote: https://opendev.org/openstack/charm-ceph-osd
info-generated: Wed Nov 24 09:44:22 UTC 2021
note: This file should exist only in a built or released charm artifact (not in the charm source code tree).

Revision history for this message
Sandor Zeestraten (szeestraten) wrote :

Hi Chris, it looks like there are 2 issues here.

1. The latest/stable charm revisions ended up being old and do not support Xena/Pacific. I learned from Peter that you're all busy with the new release and the transition to Charmhub and channels, so I assume that will be fixed later this month.
2. Do you wish to keep this issue open for the problem where the charm tries to upgrade and restarts all ceph-osd services when the source is unknown or missing from UCA_CODENAME_MAP? Please feel free to edit/change the bug if so.
