Changing osd-devices config restarts all OSD daemons and attempts to upgrade

Bug #1966414 reported by Sandor Zeestraten
This bug affects 2 people
Affects: Ceph OSD Charm
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

# Problems

1. When changing the `osd-devices` config option, for example when adding disks, the charm restarts all OSD daemons at once. This is not recommended, as it can impact cluster performance under load, and it should not happen for most normal config changes.

2. Judging by the source code (https://github.com/openstack/charm-ceph-osd/blob/73fe60b3dfde78c21e2b4ade0654ce750453dc7b/hooks/ceph_hooks.py#L472) and the charm unit logs, the charm also attempts to upgrade the Ceph packages.

3. Finally, there are many WARNING messages as the charm attempts, and fails, to read the OSD daemon state (https://github.com/openstack/charm-ceph-osd/blob/99ef6cb306b361621cdbcd6e44268b80051955b9/lib/charms_ceph/utils.py#L2877-L2912).

# /var/log/juju/unit-ceph-osd-0.log
https://paste.ubuntu.com/p/ZYjdHFnp7X/

# version
juju 2.9.27
ceph-mon rev 73
ceph-osd rev 513

Revision history for this message
Matus Kosut (matuskosut) wrote :

Generally, regarding control of restarts, it feels like the Ceph charms could use the same approach as the OpenStack charms:

https://docs.openstack.org/charm-guide/latest/admin/deferred-events.html

The option to disable restarts, and having control over what is restarted and when restarts happen, is very important when running a production-grade cluster. The big-bang approach used currently introduces too much uncertainty.

Question to the OpenStack charmers: is there any interest in or plan to implement some of these points? If so, I would be willing to help out.

1. Enable/disable auto restarts:
juju config <charm-name> enable-auto-restarts=False

2. Deferred events:
Possibility to run the show-deferred-events/run-deferred-hooks actions on specific units (see the sketch below).
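
A rough sketch of that workflow as it exists today for the OpenStack charms (config option and action names per the deferred-events guide linked above; none of this is implemented in the Ceph charms yet):

# keep the charm from restarting services itself; restarts become deferred events
juju config <charm-name> enable-auto-restarts=False
# inspect and, when convenient, process the deferred events on a specific unit
juju run-action --wait <charm-name>/0 show-deferred-events
juju run-action --wait <charm-name>/0 run-deferred-hooks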

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

I'm uncertain that we really understand what's happening here. The ceph-osd charms should almost never restart the OSD daemons and, in the one case that they do (ceph upgrade), they should go one disk at a time, on one machine at a time, through the whole cluster. A config-change of the osd-devices should absolutely not do that, unless there's something else very strange going on.

Revision history for this message
Sandor Zeestraten (szeestraten) wrote (last edit ):

Hi Chris, this is pretty simple to replicate. I'm pretty sure this gets called to restart all OSDs on each host: https://github.com/openstack/charm-ceph-osd/blob/55720fa087f3ddaddbd761d24c2ceb1ef72d70d3/lib/charms_ceph/utils.py#L2699

# bundle.yaml
relations:
- - ceph-osd:mon
  - ceph-mon:osd
series: focal
applications:
  ceph-mon:
    charm: cs:ceph-mon-73
    num_units: 3
    constraints: tags=ceph-mon
    bindings:
      "": site2-oam
      public: site2-ceph-public
    options:
      monitor-count: 3
      source: cloud:focal-xena
      ceph-public-network: "172.17.104.0/23"
  ceph-osd:
    charm: cs:ceph-osd-513
    num_units: 3
    constraints: tags=ceph-osd
    options:
      osd-devices: "/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf"
      #osd-devices: "/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm"
      source: cloud:focal-xena
      ceph-public-network: "172.17.104.0/23"
      ceph-cluster-network: "172.17.106.0/23"
      bluestore-db: "/dev/nvme0n1"
      bluestore-block-db-size: "266000000000"
    bindings:
      "": site2-oam
      public: site2-ceph-public
      cluster: site2-ceph-cluster

# Steps
1. juju deploy ./bundle.yaml
2. Wait for the deploy to finish and for the cluster to reach HEALTH_OK
3. juju config ceph-osd osd-devices="/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg"
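
To watch what happens while step 3 runs, something along these lines can be used (illustrative only; the ceph command needs the admin keyring, e.g. on a ceph-mon host):

# on the Juju client: unit/workload status
juju status ceph-osd
# on a ceph-mon host: cluster health and OSD up/down events
ceph -s
# on an OSD host: watch the OSD services restart
journalctl -f -u 'ceph-osd@*'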

# Ceph monitor logs showing all OSDs reported down per host after changing osd-devices config flag
root@hcc-admin24:/var/log/ceph# tail -f ceph.log | grep down
2022-04-04T13:05:37.800286+0000 mon.hcc-admin19 (mon.0) 1334 : cluster [WRN] Health check failed: 6 osds down (OSD_DOWN)
2022-04-04T13:05:37.800340+0000 mon.hcc-admin19 (mon.0) 1335 : cluster [WRN] Health check failed: 1 host (6 osds) down (OSD_HOST_DOWN)
2022-04-04T13:05:41.834425+0000 mon.hcc-admin19 (mon.0) 1359 : cluster [INF] Health check cleared: OSD_DOWN (was: 6 osds down)
2022-04-04T13:05:41.834483+0000 mon.hcc-admin19 (mon.0) 1360 : cluster [INF] Health check cleared: OSD_HOST_DOWN (was: 1 host (6 osds) down)
2022-04-04T13:06:04.373403+0000 mon.hcc-admin19 (mon.0) 1622 : cluster [WRN] Health check failed: 5 osds down (OSD_DOWN)
2022-04-04T13:06:04.373456+0000 mon.hcc-admin19 (mon.0) 1623 : cluster [WRN] Health check failed: 1 host (5 osds) down (OSD_HOST_DOWN)
2022-04-04T13:06:08.416227+0000 mon.hcc-admin19 (mon.0) 1648 : cluster [INF] Health check cleared: OSD_HOST_DOWN (was: 1 host (5 osds) down)
2022-04-04T13:08:55.647011+0000 mon.hcc-admin19 (mon.0) 2013 : cluster [WRN] Health check failed: 5 osds down (OSD_DOWN)
2022-04-04T13:08:55.647064+0000 mon.hcc-admin19 (mon.0) 2014 : cluster [WRN] Health check failed: 1 host (5 osds) down (OSD_HOST_DOWN)
2022-04-04T13:08:59.694032+0000 mon.hcc-admin19 (mon.0) 2038 : cluster [INF] Health check cleared: OSD_DOWN (was: 5 osds down)
2022-04-04T13:08:59.694076+0000 mon.hcc-admin19 (mon.0) 2039 : cluster [INF] Health check cleared: OSD_HOST_DOWN (was: 1 host (5 osds) down)

# Example of a few Ceph OSD logs showing they all got restarted at the same time
root@hcc-store36:~# journalctl -u ceph-osd@3
-- Logs begin at Mon 2022-04-04 12:56:29 UTC, end at Mon 2022-04-04 13:11:27 UTC. --
Apr 0...


Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

I agree that an upgrade triggers a service restart (although it should be coordinated to only bring one OSD down at a time). My concern is that it should not be thinking it needs an upgrade in the first place.

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

I suspect that you're using a version of the Ceph charms that doesn't understand xena == pacific yet. Can you confirm whether this[1] is included in the charm that you've deployed?

1: https://github.com/openstack/charms.ceph/blob/18fc2c67483f6d277b26f9f1287451f4bebe058b/charms_ceph/utils.py#L3216
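
One quick way to check on a deployed unit is to grep the charm's embedded copy of charms_ceph (path as reported by the unit agent; the unit name here is illustrative):

# does the deployed charm's UCA_CODENAME_MAP know about xena?
juju ssh ceph-osd/0 'grep -n xena /var/lib/juju/agents/unit-ceph-osd-*/charm/lib/charms_ceph/utils.py'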

Revision history for this message
Sandor Zeestraten (szeestraten) wrote :

We're using rev 513, which is the latest stable charm according to https://charmhub.io/ceph-osd.

Looking at utils.py in the deployed charm shows only older releases:

UCA_CODENAME_MAP = {
    'icehouse': 'firefly',
    'juno': 'firefly',
    'kilo': 'hammer',
    'liberty': 'hammer',
    'mitaka': 'jewel',
    'newton': 'jewel',
    'ocata': 'jewel',
    'pike': 'luminous',
    'queens': 'luminous',
    'rocky': 'mimic',
    'stein': 'mimic',
    'train': 'nautilus',
    'ussuri': 'octopus',
}
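
For comparison, the newer charms.ceph that Chris linked extends this map for more recent UCA releases; going by the Ceph version each release ships, the missing entries would look roughly like this (approximate, not an exact quote of the upstream file):

    'victoria': 'octopus',
    'wallaby': 'pacific',
    'xena': 'pacific',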

root@hcc-store36:/var/lib/juju/agents/unit-ceph-osd-0/charm# cat repo-info
commit-sha-1: b4642b0e2f3c145252a3e29ff5f5a1453656abed
commit-short: b4642b0
branch: HEAD
remote: https://opendev.org/openstack/charm-ceph-osd
info-generated: Wed Nov 24 09:44:22 UTC 2021
note: This file should exist only in a built or released charm artifact (not in the charm source code tree).

Revision history for this message
Sandor Zeestraten (szeestraten) wrote :

Hi Chris, it looks like there are 2 issues here.

1. The latest/stable charm revisions ended up being old and do not support Xena/Pacific. I learned from Peter that you're all busy with the new release and the transition to Charmhub and channels, so I assume that will be fixed later this month.
2. Do you wish to keep this issue open for the problem where the charm tries to upgrade and restarts all ceph-osd services when the source is unknown or missing from UCA_CODENAME_MAP? Please feel free to edit/change the bug if so.
