Setting up multisite values causes blocked state with message "Services not running that should be: ceph-radosgw@rgw..."

Bug #1987127 reported by utkarsh bhatt
This bug affects 1 person
Affects                   Status         Importance  Assigned to    Milestone
Ceph RADOS Gateway Charm  Fix Committed  Undecided   utkarsh bhatt
Quincy.2                  Fix Released   Undecided   Unassigned

Bug Description

In an RGW multisite deployment, the RGW service on the secondary site stops and fails to restart.

The multisite configuration values used were as follows:

juju config ceph-radosgw realm="demo_realm"
juju config ceph-radosgw zonegroup="demo_zg"
juju config ceph-radosgw zone="demo_sz"

The Ceph log shows that radosgw fails to find the configured default zonegroup (demo_zg):
>
2022-08-19T14:57:43.056+0000 7f29ee596e40 0 deferred set uid:gid to 64045:64045 (ceph:ceph)
2022-08-19T14:57:43.056+0000 7f29ee596e40 0 ceph version 17.2.0 (43e2e60a7559d3f46c9d53f1ca875fd499a1e35e) quincy (stable), process radosgw, pid 33167
2022-08-19T14:57:43.056+0000 7f29ee596e40 0 framework: beast
2022-08-19T14:57:43.056+0000 7f29ee596e40 0 framework conf key: port, val: 423
2022-08-19T14:57:43.056+0000 7f29ee596e40 1 radosgw_Main not setting numa affinity
2022-08-19T14:57:43.060+0000 7f29ee596e40 1 rgw_d3n: rgw_d3n_l1_local_datacache_enabled=0
2022-08-19T14:57:43.060+0000 7f29ee596e40 1 D3N datacache enabled: 0
2022-08-19T14:57:43.104+0000 7f29ee596e40 0 rgw main: ERROR: could not find zonegroup (demo_zg)
2022-08-19T14:57:43.104+0000 7f29ee596e40 0 rgw main: ERROR: failed to start notify service ((2) No such file or directory
2022-08-19T14:57:43.104+0000 7f29ee596e40 0 rgw main: ERROR: failed to init services (ret=(2) No such file or directory)
2022-08-19T14:57:43.108+0000 7f29ee596e40 -1 Couldn't init storage provider (RADOS)
>

Interestingly, if we query the site for zonegroups, we find that the configured default zonegroup (in our case "demo_zg") has not even been created on the site yet.
>
ubuntu@juju-3effcd-zaza-339a902d9c20-3:~$ sudo radosgw-admin --id=rgw.$hn zonegroup list
{
    "default_info": "22608c8b-4dd6-4b4e-9f4e-74309fba15fa",
    "zonegroups": [
        "default"
    ]
}
>
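
The same check can be done programmatically. The following is a minimal sketch, not code from the charm: it parses the JSON printed by `radosgw-admin zonegroup list` and reports whether a given zonegroup name is present. The function name `zonegroup_exists` is hypothetical, and the listing is passed in as a string so that no Ceph cluster is needed to try it.

```python
import json


def zonegroup_exists(listing_json: str, name: str) -> bool:
    """Return True if `name` appears in `radosgw-admin zonegroup list` output."""
    listing = json.loads(listing_json)
    # The "zonegroups" key holds the names of all zonegroups known to the site.
    return name in listing.get("zonegroups", [])


# Output captured from the secondary site in this report.
SAMPLE = """
{
    "default_info": "22608c8b-4dd6-4b4e-9f4e-74309fba15fa",
    "zonegroups": [
        "default"
    ]
}
"""
```

With the listing above, `zonegroup_exists(SAMPLE, "demo_zg")` is false while `zonegroup_exists(SAMPLE, "default")` is true, which matches the failure seen in the log.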

However, a look at the ceph.conf file (https://paste.ubuntu.com/p/zcTFZ5Z2GF/) confirms that the charm config values are set as the defaults.

Thus we can conclude that ceph.conf overwrites the defaults for realm/zonegroup/zone with the charm config values without verifying that they exist on the site.
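
One way to avoid this is to emit a default only when the named entity already exists. The sketch below is illustrative and is not the code from the merged fix: `safe_defaults` and its arguments are hypothetical names, and the existing realms, zonegroups and zones are assumed to have been listed beforehand (for example with `radosgw-admin realm list`, `zonegroup list` and `zone list`). The keys `rgw_realm`, `rgw_zonegroup` and `rgw_zone` are the Ceph options that select these entities.

```python
from typing import Dict, Iterable, Optional


def safe_defaults(
    configured: Dict[str, Optional[str]],
    existing: Dict[str, Iterable[str]],
) -> Dict[str, str]:
    """Keep only configured multisite values that already exist on the site.

    configured: e.g. {"realm": "demo_realm", "zonegroup": "demo_zg", "zone": "demo_sz"}
    existing:   names already present on the site, keyed the same way.
    """
    defaults = {}
    for kind in ("realm", "zonegroup", "zone"):
        name = configured.get(kind)
        # Skip unset values and values that do not exist on the site yet,
        # so radosgw is never pointed at a missing default.
        if name and name in set(existing.get(kind, ())):
            defaults["rgw_" + kind] = name
    return defaults
```

For the site in this report, where only the "default" zonegroup exists, the function returns an empty mapping, so none of the three values would be written to ceph.conf until the entities are created.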

Changed in charm-ceph-radosgw:
assignee: nobody → utkarsh bhatt (utkarshbhatthere)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ceph-radosgw (master)
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-radosgw (master)

Reviewed: https://review.opendev.org/c/openstack/charm-ceph-radosgw/+/853838
Committed: https://opendev.org/openstack/charm-ceph-radosgw/commit/e97e3607e26d14d6f2da47ab6b6b6a28c098105a
Submitter: "Zuul (22348)"
Branch: master

commit e97e3607e26d14d6f2da47ab6b6b6a28c098105a
Author: utkarshbhatthere <email address hidden>
Date: Sat Aug 20 00:30:42 2022 +0530

    Adds existence verification for config values

    Multisite config values (realm, zonegroup, zone) are written
    to ceph.conf as the defaults without verifying their existence, this
    causes failure for commands which use the default values.

    Closes-Bug: #1987127
    Change-Id: I0ab4df34f0000339227e5d5b80352355ea7bd36e

Changed in charm-ceph-radosgw:
status: In Progress → Fix Committed