rbd backend reports wrong 'local_gb_used' for compute node
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Compute (nova) | Invalid | Undecided | Unassigned | |
Bug Description
When an instance's disks are in the rbd backend, the compute node reports the status of the whole ceph cluster, which makes sense. We get the local_gb usage in https:/
    def get_pool_info(self):
        # Cluster-wide stats from RADOS, reported in KiB; converted to bytes.
        with RADOSClient(self) as client:
            stats = client.cluster.get_cluster_stats()
            return {'total': stats['kb'] * units.Ki,
                    'free': stats['kb_avail'] * units.Ki,
                    'used': stats['kb_used'] * units.Ki}
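For context, the libvirt driver then turns these byte counts into the GiB figures the resource tracker reports. A minimal sketch of that conversion (the helper name is illustrative, not nova's exact code):

    from oslo_utils import units

    def pool_stats_to_local_gb(stats):
        # 'stats' is the dict returned by get_pool_info() above, in bytes.
        # Integer-divide down to GiB, since the resource tracker works in GB.
        return {
            'local_gb': stats['total'] // units.Gi,
            'local_gb_used': stats['used'] // units.Gi,
            'free_disk_gb': stats['free'] // units.Gi,
        }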
get_pool_info reports the same disk usage as the command 'ceph -s', for example:
[root@node-1 ~]# ceph -s
cluster e598930a-
health HEALTH_OK
monmap e1: 1 mons at {node-1=
osdmap e28: 2 osds: 2 up, 2 in
pgmap v3985: 576 pgs, 5 pools, 295 MB data, 57 objects
21149 MB used, 76479 MB / 97628 MB avail
[root@node-1 ~]# rbd -p compute ls
45892200-
8c6a5555-
944d9028-
9ea375dc-
9fce4606-
cedce585-
e17c9391-
e19143c7-
f9caf4a7-
[root@node-1 ~]# rbd -p compute info 45892200-
rbd image '45892200-
size 20480 MB in 2560 objects
order 23 (8192 kB objects)
block_name_prefix: rbd_data.
format: 2
features: layering
parent: compute/
overlap: 40162 kB
In the above example we have two compute nodes, and each can create 4 instances with a 20G disk. The interesting thing is that the total local_gb is 95G, yet 160G has been allocated to instances.
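The mismatch is easy to check against the numbers above; a quick sanity check in Python:

    # Numbers taken from the 'ceph -s' output above.
    total_gb = 97628 / 1024.0        # ~95 GB cluster capacity
    provisioned_gb = 2 * 4 * 20      # 2 nodes x 4 instances x 20G disks = 160 GB
    used_gb = 21149 / 1024.0         # ~21 GB actually written

    # The cluster happily holds 160 GB of provisioned disks in a 95 GB pool
    # because RBD images are thin-provisioned: only written data consumes space.
    assert provisioned_gb > total_gb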
The root cause is that client.cluster.get_cluster_stats() reports only the space actually consumed in the cluster; because RBD images are thin-provisioned, the virtual size promised to instances never shows up in 'used'.
An alternative solution is to calculate the sum of all instances' disk sizes in some way and report that as local_gb_used.
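A minimal sketch of that idea using the python-rados/python-rbd bindings, summing the virtual (provisioned) size of every image in a pool; the pool name and conffile path are assumptions for illustration:

    import rados
    import rbd

    def provisioned_bytes(pool='compute', conf='/etc/ceph/ceph.conf'):
        """Sum the virtual size of every RBD image in a pool; this is
        what the instances were promised, not what they have written."""
        cluster = rados.Rados(conffile=conf)
        cluster.connect()
        try:
            ioctx = cluster.open_ioctx(pool)
            try:
                total = 0
                for name in rbd.RBD().list(ioctx):
                    image = rbd.Image(ioctx, name)
                    try:
                        total += image.size()
                    finally:
                        image.close()
                return total
            finally:
                ioctx.close()
        finally:
            cluster.shutdown()

Dividing this total by 1024**3 would give a local_gb_used that reflects what instances were promised rather than what they have written, though it would also need to account for ephemeral and swap disks stored in the same pool.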
tags: added: ceph libvirt
If you look at https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L4628
you can see the three different functions used for getting available disk space:
'get_volume_group_info',
'get_pool_info' and
'get_fs_info'
All of these methods return the ACTUAL disk space used, rather than the theoretical maximum of all the instance sizes. This is because locally stored disks are sparse qcow images, and LVM disks are sparse volumes.
I believe that the intention of 'local_gb_used' is to report the actual disk space.
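For comparison, here is a paraphrase of what a statvfs-based 'get_fs_info' ends up reporting (not nova's exact code): only blocks actually allocated count as used, so a sparse qcow image contributes its written data rather than its virtual size.

    import os

    def get_fs_info(path):
        # statvfs sees allocated blocks only; a 20G sparse qcow file with
        # 1G written reduces 'free' by roughly 1G, not 20G.
        st = os.statvfs(path)
        total = st.f_frsize * st.f_blocks
        free = st.f_frsize * st.f_bavail
        return {'total': total, 'free': free, 'used': total - free}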