[RBD] Amend the formula of retrieving total_capacity for a rbd pool

Bug #1960206 reported by zhaoleilc
This bug affects 5 people
Affects: Cinder
Status: Fix Released
Importance: Medium
Assigned to: Gorka Eguileor

Bug Description

Description
===========
At present, when no quota limit is set for an rbd pool, the value of
free_capacity is the "max_avail" value retrieved from the command
'ceph df -f json'. In that scenario, with dynamic total capacity, the
value of total_capacity is the sum of free_capacity and 'bytes_used',
which is also retrieved from 'ceph df -f json'.

However, the output of 'ceph df -f json' differs considerably between
ceph versions. For example, the output of version 12.2.11 lacks the
'STORED' field and its '%USED' field is computed from the 'USED' field,
while the output of version 12.2.13 has a 'STORED' field and its
'%USED' field is computed from the 'STORED' field.
More concrete outputs and formulas are as follows.

root@cmn01:~# ceph --version
ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
root@cmn01:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED    %RAW USED
    375TiB    320TiB    54.5TiB     14.54
POOLS:
    NAME                        ID    USED       %USED    MAX AVAIL    OBJECTS
    cn-a.rgw.meta               1     219KiB     0        68.6TiB      789
    cn-a.rgw.buckets.index      2     0B         0        68.6TiB      352
    cn-a.rgw.control            3     0B         0        68.6TiB      8
    cn-a.rgw.buckets.data       4     312GiB     0.44     68.6TiB      442440
    .rgw.root                   5     23.3KiB    0        68.6TiB      37
    volumes                     6     3.99TiB    31.37    8.73TiB      1123150
    images                      7     6.31TiB    41.47    8.90TiB      828990
    vms                         8     3.40TiB    24.84    10.3TiB      889292
    cn-a.rgw.log                9     197B       0        68.6TiB      316
    cn-a.rgw.buckets.non-ec     22    0B         0        68.6TiB      25
    volumes_nvme                44    4.25TiB    29.24    10.3TiB      1913190
    cn-a.rgw.buckets-ec.data    47    4.30GiB    0        137TiB       1792

%USED of volumes pool: USED/(USED+MAX AVAIL) = 3.99/(3.99+8.73) ≈ 31.37%

root@stor-mgt01:~# ceph --version
ceph version 12.2.13 (584a20eb0237c657dc0567da126be145106aa47e) luminous (stable)
root@stor-mgt01:~# ceph df
RAW STORAGE:
    CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
    sas      1.6 TiB    1.6 TiB    522 MiB    6.5 GiB     0.39
    sata     106 TiB    104 TiB    2.2 TiB    2.2 TiB     2.08
    ssd      21 TiB     18 TiB     3.2 TiB    3.2 TiB     15.42
    TOTAL    129 TiB    124 TiB    5.4 TiB    5.5 TiB     4.23

POOLS:
    POOL                                   ID    PGS     STORED     OBJECTS    USED        %USED    MAX AVAIL
    nova.vms                               3     64      277 GiB    84.14k     831 GiB     4.97     5.2 TiB
    cinder.volumes_ssd                     4     1024    254 GiB    76.76k     759 GiB     4.56     5.2 TiB
    glance.images                          5     256     565 GiB    73.36k     1.7 TiB     9.63     5.2 TiB
    cinder.volumes_sas                     17    256     0 B        0          0 B         0        529 GiB
    cinder.volumes_sata                    18    1024    332 GiB    99.07k     1018 GiB    1.00     33 TiB
    region-stackdeva.rgw.meta              19    16      5.1 KiB    20         3.0 MiB     0        33 TiB
    .rgw.root                              20    16      7.7 KiB    19         3.4 MiB     0        33 TiB
    region-stackdeva.rgw.control           21    16      0 B        8          0 B         0        33 TiB
    region-stackdeva.rgw.log               22    16      399 KiB    214        783 KiB     0        33 TiB
    region-stackdeva.rgw.buckets.index     23    16      72 MiB     4          72 MiB      0        33 TiB
    region-stackdeva.rgw.buckets.data      24    256     98 GiB     100.53k    294 GiB     0.29     33 TiB
    region-stackdeva.rgw.buckets.non-ec    25    16      0 B        0          0 B         0        33 TiB
    region-stackdeva.rgw.buckets-ec.data   27    64      0 B        0          0 B         0        66 TiB
    rbd                                    32    64      389 B      1          132 KiB     0        32 TiB

%USED of cinder.volumes_ssd pool: STORED/(STORED+MAX AVAIL) = 254/(254+5.2*1024) ≈ 4.55%
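
The two %USED formulas above can be checked with a few lines of Python. This is only a worked-arithmetic sketch: the helper name pct_used is hypothetical, and the inputs are the rounded values printed by 'ceph df', so the results match the tables only to within rounding.

```python
def pct_used(numerator, max_avail):
    """Return numerator / (numerator + max_avail) as a percentage."""
    return 100.0 * numerator / (numerator + max_avail)


# 12.2.11, 'volumes' pool: USED = 3.99 TiB, MAX AVAIL = 8.73 TiB.
pct_12_2_11 = pct_used(3.99, 8.73)

# 12.2.13, 'cinder.volumes_ssd' pool: STORED = 254 GiB, MAX AVAIL = 5.2 TiB.
pct_12_2_13_stored = pct_used(254, 5.2 * 1024)

# Same pool, but computed from USED = 759 GiB instead of STORED.
pct_12_2_13_used = pct_used(759, 5.2 * 1024)

print(round(pct_12_2_11, 2))
print(round(pct_12_2_13_stored, 2))
print(round(pct_12_2_13_used, 2))
```

Only the STORED-based value is close to the 4.56 reported for cinder.volumes_ssd; the USED-based value is several times larger, which suggests that 12.2.13 derives %USED from STORED.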

The rbd driver currently calculates total_capacity as free_capacity plus 'bytes_used', which corresponds to the 'USED' field only.
Would it therefore be better to take the 'STORED' field into account when this field exists?
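
A minimal sketch of the proposed calculation, assuming the JSON from 'ceph df -f json' has a top-level "pools" list whose entries carry a "name" and a "stats" dict with "max_avail", "bytes_used" and, on newer releases, "stored". The function name pool_capacity_gib and the sample inputs are hypothetical and are not taken from the cinder driver; the sketch covers only the no-quota, dynamic total capacity case described above.

```python
import json

GIB = 1024 ** 3


def pool_capacity_gib(df_json, pool_name):
    """Return (free_gib, total_gib) for one pool of `ceph df -f json` output."""
    report = json.loads(df_json)
    stats = next(p["stats"] for p in report["pools"] if p["name"] == pool_name)
    free_bytes = stats["max_avail"]
    # Prefer 'stored' when the cluster reports it; otherwise fall back to
    # 'bytes_used', which is all that older releases provide.
    used_bytes = stats["stored"] if "stored" in stats else stats["bytes_used"]
    return free_bytes / GIB, (free_bytes + used_bytes) / GIB


# Illustrative input, rounded from the cinder.volumes_ssd row above
# (5.2 TiB is taken as 5324 GiB).
newer = json.dumps({"pools": [{"name": "cinder.volumes_ssd", "stats": {
    "stored": 254 * GIB, "bytes_used": 759 * GIB, "max_avail": 5324 * GIB}}]})
# The same pool as a release without the 'stored' key would report it.
older = json.dumps({"pools": [{"name": "cinder.volumes_ssd", "stats": {
    "bytes_used": 759 * GIB, "max_avail": 5324 * GIB}}]})

print(pool_capacity_gib(newer, "cinder.volumes_ssd"))
print(pool_capacity_gib(older, "cinder.volumes_ssd"))
```

With 'stored' present the total is 5324 + 254 GiB; without it the total is 5324 + 759 GiB, which is the value the driver reports today on both kinds of cluster.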

zhaoleilc (zhaoleilc)
description: updated
description: updated
description: updated
summary: - Amend the formula of retrieving total_capacity for the rbd driver
+ Amend the formula of retrieving total_capacity for a rbd pool
summary: - Amend the formula of retrieving total_capacity for a rbd pool
+ [RBD] Amend the formula of retrieving total_capacity for a rbd pool
Changed in cinder:
importance: Undecided → Medium
tags: added: quotas rbd total-capacity
Changed in cinder:
importance: Medium → Low
tags: removed: quotas
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/cinder/+/829565

Changed in cinder:
status: New → In Progress
Gorka Eguileor (gorka)
Changed in cinder:
assignee: nobody → Gorka Eguileor (gorka)
Sofia Enriquez (lsofia-enriquez) wrote :

This is about how the RBD driver calculates the backend stats it reports to the scheduler. The bug was discussed at the 2022-02-16 bug meeting: https://meetings.opendev.org/meetings/cinder_bs/2022/cinder_bs.2022-02-16-15.01.log.html

weisongf (songwei-8)
description: updated
yao ning (mslovy11022) wrote :
Eric Harney (eharney)
Changed in cinder:
importance: Low → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/829565
Committed: https://opendev.org/openstack/cinder/commit/86d9ec5d5932557ade18e7893cc2b8f564b5b2d8
Submitter: "Zuul (22348)"
Branch: master

commit 86d9ec5d5932557ade18e7893cc2b8f564b5b2d8
Author: Gorka Eguileor <email address hidden>
Date: Wed Feb 16 17:03:41 2022 +0100

    RBD: Fix total_capacity

    Ceph has changed the meaning of the ``bytes_used`` column in the pools
    reported by the ``df`` command, which means that in some deployments the
    rbd driver is not reporting the expected information to the schedulers.

    The information we should use for the calculations is returned in
    the ``stored`` field in those systems.

    This patch uses ``stored`` when present and falls back to ``bytes_used``
    if not.

    Closes-Bug: #1960206
    Change-Id: I0ca25789a0b279d82f766091235f24f429405da6

Changed in cinder:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/cinder/+/876472

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/cinder/+/876473

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/cinder/+/876474

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 22.0.0.0rc1

This issue was fixed in the openstack/cinder 22.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/876472
Committed: https://opendev.org/openstack/cinder/commit/97926fb888a6cdb96640ebe4e06785e6ce198226
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 97926fb888a6cdb96640ebe4e06785e6ce198226
Author: Gorka Eguileor <email address hidden>
Date: Wed Feb 16 17:03:41 2022 +0100

    RBD: Fix total_capacity

    Ceph has changed the meaning of the ``bytes_used`` column in the pools
    reported by the ``df`` command, which means that in some deployments the
    rbd driver is not reporting the expected information to the schedulers.

    The information we should use for the calculations is returned in
    the ``stored`` field in those systems.

    This patch uses ``stored`` when present and falls back to ``bytes_used``
    if not.

    Closes-Bug: #1960206
    Change-Id: I0ca25789a0b279d82f766091235f24f429405da6
    (cherry picked from commit 86d9ec5d5932557ade18e7893cc2b8f564b5b2d8)

tags: added: in-stable-zed
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 21.2.0

This issue was fixed in the openstack/cinder 21.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/876473
Committed: https://opendev.org/openstack/cinder/commit/ad5a711fdb1c73744537636c0567c4198666925a
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit ad5a711fdb1c73744537636c0567c4198666925a
Author: Gorka Eguileor <email address hidden>
Date: Wed Feb 16 17:03:41 2022 +0100

    RBD: Fix total_capacity

    Ceph has changed the meaning of the ``bytes_used`` column in the pools
    reported by the ``df`` command, which means that in some deployments the
    rbd driver is not reporting the expected information to the schedulers.

    The information we should use for the calculations is returned in
    the ``stored`` field in those systems.

    This patch uses ``stored`` when present and falls back to ``bytes_used``
    if not.

    Closes-Bug: #1960206
    Change-Id: I0ca25789a0b279d82f766091235f24f429405da6
    (cherry picked from commit 86d9ec5d5932557ade18e7893cc2b8f564b5b2d8)
    (cherry picked from commit 97926fb888a6cdb96640ebe4e06785e6ce198226)

tags: added: in-stable-yoga
tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/876474
Committed: https://opendev.org/openstack/cinder/commit/84693e0207cfc34a6b99c50066f3d1846bd4050a
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 84693e0207cfc34a6b99c50066f3d1846bd4050a
Author: Gorka Eguileor <email address hidden>
Date: Wed Feb 16 17:03:41 2022 +0100

    RBD: Fix total_capacity

    Ceph has changed the meaning of the ``bytes_used`` column in the pools
    reported by the ``df`` command, which means that in some deployments the
    rbd driver is not reporting the expected information to the schedulers.

    The information we should use for the calculations is returned in
    the ``stored`` field in those systems.

    This patch uses ``stored`` when present and falls back to ``bytes_used``
    if not.

    Closes-Bug: #1960206
    Change-Id: I0ca25789a0b279d82f766091235f24f429405da6
    (cherry picked from commit 86d9ec5d5932557ade18e7893cc2b8f564b5b2d8)
    (cherry picked from commit 97926fb888a6cdb96640ebe4e06785e6ce198226)
    (cherry picked from commit 2f798cb06a34a669459f2de6ab55d06aa985d221)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 20.3.1

This issue was fixed in the openstack/cinder 20.3.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder xena-eom

This issue was fixed in the openstack/cinder xena-eom release.
