So each cinder-volume instance has its own local value of allocated_capacity_gb for each pool it serves.
When a cinder-volume instance starts, it recalculates allocated_capacity_gb for each pool it serves, based on the volumes in that pool.
Each time a cinder-volume instance gets a task to create a volume, it increases its local value.
When a cinder-volume instance gets a task to delete a volume, it decreases this value.
It works more or less fine for the independent cinder-volume deployment case, because then we have one pool per cinder-volume instance.
When we have an Active-Active cinder-volume setup, there is only ONE pool with allocated_capacity_gb, yet each cinder-volume instance reports its own local (and different for each instance) value to the scheduler. If the first cinder-volume instance reports 1, you will see 1 in allocated_capacity_gb (cinder get-pools --detail) until the next report from the second cinder-volume instance, which reports 2. As soon as the scheduler receives the 2, you will see 2 in allocated_capacity_gb (cinder get-pools --detail). When the scheduler gets the next report from the first instance, it will show 1 again (until the next report from the second instance, which reports 2), and so on.
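To make the flapping concrete, here is a toy Python simulation. This is hypothetical code, not Cinder's; it only assumes what is described above, i.e. that the scheduler overwrites the per-pool entry with whatever the latest report says:

# Toy model of the scheduler's per-pool view; not actual Cinder code.
pool_view = {}  # one entry per pool, shared by all reporting hosts

def receive_capability_report(host, allocated_gb):
    # The latest report wins: the value is overwritten, not merged.
    pool_view['backend@pool'] = {'host': host,
                                 'allocated_capacity_gb': allocated_gb}

receive_capability_report('cinder-volume-1', 1)
print(pool_view['backend@pool']['allocated_capacity_gb'])  # -> 1
receive_capability_report('cinder-volume-2', 2)
print(pool_view['backend@pool']['allocated_capacity_gb'])  # -> 2
# ...and back to 1 on the next report from cinder-volume-1, forever.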
Well, Rajat asked me at the Cinder team meeting where in the cinder-volume source code we calculate allocated_capacity_gb: https://meetings.opendev.org/meetings/cinder/2023/cinder.2023-06-14-14.00.log.html
So there are three places:
1. On cinder-volume startup: https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L403
2. When we destroy a volume: https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L1074
3. During the volume creation process: https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L759 when calling _update_allocated_capacity: https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L3717
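As a rough illustration, here is the pattern those three places implement. This is a simplified, hypothetical sketch (class and field names are made up), not the actual manager.py code:

# Hypothetical sketch: every cinder-volume process keeps its own
# in-memory counters; nothing here is shared between processes.
class VolumeManagerSketch:
    def __init__(self):
        self.stats = {'pools': {}}  # local to this process

    def init_host(self, my_volumes):
        # (1) On startup: recount only the volumes this service hosts.
        for vol in my_volumes:
            self._update_allocated_capacity(vol)

    def delete_volume(self, vol):
        # (2) On delete: decrement the local counter.
        self.stats['pools'][vol['pool']]['allocated_capacity_gb'] -= vol['size']

    def create_volume(self, vol):
        # (3) On create: increment via the same helper as startup.
        self._update_allocated_capacity(vol)

    def _update_allocated_capacity(self, vol):
        pool = self.stats['pools'].setdefault(
            vol['pool'], {'allocated_capacity_gb': 0})
        pool['allocated_capacity_gb'] += vol['size']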
Here is the case from my lab:
3 cinder-volume instances in one cluster with one (and the same) Ceph backend, so all three cinder-volume services belong to the same cluster.
I created 200 volumes of 50 GB each and then deleted one volume, so the total allocated capacity should be 9950 GB. The volume-creation tasks were spread across the cinder-volume instances at roughly 200/3 each.
2023-06-14 18:45:59.240 7 DEBUG cinder.scheduler.host_manager [req-bcbf17da-59b6-44a8-9985-c4337aef53f5 - - - - -] Received volume service update from Cluster: os_lab@ceph_hdd - Host: os_lab-vct02@ceph_hdd: {'vendor_name': 'Open Source', 'driver_version': '1.2.0', 'storage_protocol': 'ceph', 'total_capacity_gb': 125821.54, 'free_capacity_gb': 125804.88, 'reserved_percentage': 0, 'multiattach': True, 'thin_provisioning_support': True, 'max_over_subscription_ratio': '20.0', 'location_info': 'ceph:/etc/ceph/ceph.conf:0252f788-fb05-11ec-bf1d-0117d320bc05:cl1ceph1_os_lab_cinder:cl1ceph1_os_lab_volumes', 'backend_state': 'up', 'volume_backend_name': 'cinder_ceph_hdd', 'replication_enabled': False, 'allocated_capacity_gb': 3350, 'filter_function': None, 'goodness_function': None} update_service_capabilities /var/lib/kolla/venv/lib/python3.8/site-packages/cinder/scheduler/host_manager.py:575
2023-06-14 18:46:21.213 7 DEBUG cinder.scheduler.host_manager [req-71d4339d-f0cd-4ad4-8259-5c7e793fd56e - - - - -] Received volume service update from Cluster: os_lab@ceph_hdd - Host: os_lab-vct01@ceph_hdd: {'vendor_name': 'Open Source', 'driver_version': '1.2.0', 'storage_protocol': 'ceph', 'total_capacity_gb': 125821.54, 'free_capacity_gb': 125804.88, 'reserved_percentage': 0, 'multiattach': True, 'thin_provisioning_support': True, 'max_over_subscription_ratio': '20.0', 'location_info': 'ceph:/etc/ceph/ceph.conf:0252f788-fb05-11ec-bf1d-0117d320bc05:cl1ceph1_os_lab_cinder:cl1ceph1_os_lab_volumes', 'backend_state': 'up', 'volume_backend_name': 'cinder_ceph_hdd', 'replication_enabled': False, 'allocated_capacity_gb': 3250, 'filter_function': None, 'goodness_function': None} update_service_capabilities /var/lib/kolla/venv/lib/python3.8/site-packages/cinder/scheduler/host_manager.py:575
2023-06-14 18:46:24.627 7 DEBUG cinder.scheduler.host_manager [req-f50cf689-b375-4109-87bf-b59033009858 - - - - -] Received volume service update from Cluster: os_lab@ceph_hdd - Host: os_lab-vct03@ceph_hdd: {'vendor_name': 'Open Source', 'driver_version': '1.2.0', 'storage_protocol': 'ceph', 'total_capacity_gb': 125821.54, 'free_capacity_gb': 125804.88, 'reserved_percentage': 0, 'multiattach': True, 'thin_provisioning_support': True, 'max_over_subscription_ratio': '20.0', 'location_info': 'ceph:/etc/ceph/ceph.conf:0252f788-fb05-11ec-bf1d-0117d320bc05:cl1ceph1_os_lab_cinder:cl1ceph1_os_lab_volumes', 'backend_state': 'up', 'volume_backend_name': 'cinder_ceph_hdd', 'replication_enabled': False, 'allocated_capacity_gb': 3350, 'filter_function': None, 'goodness_function': None} update_service_capabilities /var/lib/kolla/venv/lib/python3.8/site-packages/cinder/scheduler/host_manager.py:575
As we can see from the log example, each cinder-volume instance reports its own local allocated capacity, based only on the volumes it created itself:
os_lab-vct02@ceph_hdd: 'allocated_capacity_gb': 3350
os_lab-vct01@ceph_hdd: 'allocated_capacity_gb': 3250
os_lab-vct03@ceph_hdd: 'allocated_capacity_gb': 3350
So if we do not create or delete volumes and fetch the pool information in a cycle, we will "happily" see either 3250 or 3350 in allocated_capacity_gb for the pool, as the scheduler updates the pool value after each report it gets from one of the cinder-volume instances.
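A quick sanity check of the arithmetic (values taken from the log above):

# Expected total for the lab case: 200 volumes of 50 GB, one deleted.
print(200 * 50 - 50)       # -> 9950

# The per-instance reports do sum to the correct total, but no single
# report (and thus no value the scheduler shows) ever carries it.
print(3350 + 3250 + 3350)  # -> 9950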
So what is incorrect:
1. allocated_capacity_gb should be 9950, not 3350 or 3250, for the pool if we have configured an Active-Active cluster.
2. allocated_capacity_gb is a single value per pool, so it should be shared between all cinder-volume instances serving this pool.
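For illustration only: if the scheduler summed the latest report from every host serving the clustered pool instead of overwriting a single value, the numbers would add up. This is a hypothetical sketch of the desired behaviour, not a patch (all names are made up):

# Hypothetical aggregation across the hosts of one cluster.
reports = {}  # (cluster, host) -> latest reported allocated_capacity_gb

def receive(cluster, host, allocated_gb):
    reports[(cluster, host)] = allocated_gb

def pool_allocated(cluster):
    # Shared view: sum the latest report from every host in the cluster.
    return sum(gb for (c, _host), gb in reports.items() if c == cluster)

receive('os_lab@ceph_hdd', 'os_lab-vct01', 3250)
receive('os_lab@ceph_hdd', 'os_lab-vct02', 3350)
receive('os_lab@ceph_hdd', 'os_lab-vct03', 3350)
print(pool_allocated('os_lab@ceph_hdd'))  # -> 9950, the expected value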