In multisite config, secondary zone is unable to sync when TLS is enabled
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ceph RADOS Gateway Charm | Fix Released | High | James Page |
Bug Description
Ceph version: 15.2.14-
When a new bucket is created and an object is inserted on the Secondary, the object is replicated to the Primary.
However, an object inserted on the Primary is not replicated to the Secondary,
and the sync status shows the following.
Secondary sync status:
# radosgw-admin sync status
realm 9db21932-
zonegroup b94227a8-
zone 399fe045-
metadata sync syncing
full sync: 0/64 shards
incremental sync: 64/64 shards
metadata is behind on 1 shards
behind shards: [23]
oldest incremental change not applied: 2022-03-
data sync source: d0b0d796-
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
1 shards are recovering
recovering shards: [45]
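The `behind shards` and `recovering shards` lines are the ones that flag the problem in the output above. A minimal sketch of pulling them out of the text output, for example to compare runs while reproducing the issue, might look like the following. The function name `shard_lists` and the sample text are illustrative only, and the parsing assumes the line format shown in this report.

```python
import re


def shard_lists(sync_status_text):
    """Collect shard numbers from 'behind shards' and 'recovering shards' lines."""
    result = {"behind": [], "recovering": []}
    for line in sync_status_text.splitlines():
        # Matches e.g. "behind shards: [23]" or "recovering shards: [45]"
        match = re.search(r"(behind|recovering) shards:\s*\[([^\]]*)\]", line)
        if match:
            kind, body = match.groups()
            result[kind].extend(int(n) for n in body.split(",") if n.strip())
    return result


# Sample lines copied from the sync status output in this report
sample = """\
metadata is behind on 1 shards
behind shards: [23]
1 shards are recovering
recovering shards: [45]
"""
print(shard_lists(sample))  # {'behind': [23], 'recovering': [45]}
```

Empty lists for both keys would indicate that no shard is reported as behind or recovering.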
On the Secondary zone, the output of `radosgw-admin sync error list` is full of Permission errors, e.g.:
{
"shard_id": 10,
"entries": [
{
}
}
]
},
After increasing the log level on ceph-radosgw, we can see a signature mismatch error in the Primary logs:
2022-03-
2022-03-
2022-03-
2022-03-
2022-03-
2022-03-
2022-03-
The above returns an HTTP 403 error.
We have tried removing the Secondary zone, cleaning the pools, and recreating the zone. However, as soon as we create a new bucket on the Primary, a behind shard appears in the metadata sync on the Secondary, and once we create an object on the Primary, a recovering shard appears in the data sync status output on the Secondary.
Changed in charm-ceph-radosgw:
milestone: none → 22.04
Changed in charm-ceph-radosgw:
status: Fix Committed → Fix Released
I've reproduced this in the lab. It only impacts deployments where TLS is enabled.
When TLS is enabled, Apache2 terminates the secure connection and then proxies the connection to the radosgw process. Something in this data pipeline causes the client-provided signature to mismatch the server-calculated signature, and authentication fails as a result.
Bypassing Apache2 and terminating the secure connection on haproxy works around the issue, but it does change the security profile of the deployment.
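To illustrate why a proxy in the request path can produce the signature mismatch described above, here is a simplified sketch of header-based request signing. It is not the actual S3 signing algorithm used by radosgw, and this report does not identify which part of the request changes in the pipeline. The rewritten `Host` header, the secret, the header values, and the function name `sign` are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def sign(secret, method, path, headers):
    """HMAC-SHA256 over method, path, and sorted lower-cased headers (simplified)."""
    canonical_headers = "".join(
        f"{name}:{value}\n"
        for name, value in sorted((k.lower(), v.strip()) for k, v in headers.items())
    )
    canonical_request = f"{method}\n{path}\n{canonical_headers}"
    return hmac.new(secret.encode(), canonical_request.encode(), hashlib.sha256).hexdigest()


secret = "hypothetical-secret-key"  # placeholder, not a real credential

# Headers as the client signs and sends them (hypothetical values)
sent = {"Host": "rgw.example.com", "x-amz-date": "20220101T000000Z"}
client_sig = sign(secret, "GET", "/bucket/object", sent)

# Assumed proxy behaviour: the Host header is rewritten before reaching the backend
received = dict(sent, Host="backend.internal:8080")
server_sig = sign(secret, "GET", "/bucket/object", received)

print(hmac.compare_digest(client_sig, server_sig))  # False
```

Because the server recomputes the signature from the request it actually receives, any signed element that the proxy alters in transit is enough to make the comparison fail and the request be rejected.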