CapacityFilter calculates free virtual capacity incorrectly
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Cinder | Fix Released | Undecided | Arnon Yaari (arnony) | |
Bug Description
Even though max_over_subscription_ratio is documented as the ratio between total provisioned (virtual) capacity and total physical capacity, some volume drivers calculate and report it differently:

* The Pure Storage volume driver returns the ratio between provisioned space and used (written) space. This is the ratio between currently allocated and written data, not the totals available on the array.
* The Kaminario volume driver does the same, although the driver doesn't even report this capability under the correct key (it returns "max_oversubscription_ratio" instead of "max_over_subscription_ratio").

I'm assuming that the drivers are calculating the value wrong (or I don't understand the values they use) because the rest of Cinder itself treats max_over_subscription_ratio as a ratio between totals.

There is one other place in Cinder which uses max_over_subscription_ratio differently: the CapacityFilter in the scheduler multiplies *free* physical capacity by the ratio to estimate free virtual capacity.

If max_over_subscription_ratio is a ratio between totals, that calculation is incorrect.

The function calculate_virtual_free_capacity in cinder/utils.py calculates free virtual capacity from the totals instead (total * ratio - provisioned), so the two code paths disagree.
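For concreteness, the two calculations being contrasted can be sketched as follows (a simplified illustration in GB, ignoring reserved space and thick provisioning; the function names are mine, not Cinder's):

```python
def capacity_filter_free_virtual(free_physical, max_osr):
    # CapacityFilter's approach: scale the *free* physical space
    return free_physical * max_osr

def utils_free_virtual(total_physical, provisioned, max_osr):
    # calculate_virtual_free_capacity's approach: scale the *total*,
    # then subtract what is already provisioned
    return total_physical * max_osr - provisioned

# 1000 GB array, ratio 1.2, 200 GB provisioned, 100 GB written
print(capacity_filter_free_virtual(1000 - 100, 1.2))  # 1080.0
print(utils_free_virtual(1000, 200, 1.2))             # 1000.0
```

The two results differ, which is exactly the inconsistency this bug describes.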
Changed in cinder:
assignee: nobody → Arnon Yaari (arnony)
status: New → In Progress
Gorka Eguileor (gorka) wrote (#2):
We are using max_over_subscription_ratio as an adaptive ratio applied to the currently free space, not only to the totals.
I agree that we should consolidate the way we do these calculations, but first we would need to make a statement on the assumptions regarding the use of the pools backing Cinder (i.e. exclusive use by Cinder, and by one and only one service).
Changed in cinder:
status: In Progress → Opinion
Arnon Yaari (arnony) wrote (#3):
The problem with applying max_over_subscription_ratio to free space is that the ratio between free virtual capacity and free physical capacity is not constant.
Here are two examples, similar to what I wrote above:
1. Physical capacity: 1 TB. Virtual capacity: 1.2 TB. Provisioned capacity: 200 GB. Written data: 100 GB. In this case: free physical: 900 GB, free virtual: 1TB. ratio would be 1000/900 = 1.11
2. Same as above but now we write more data - 200 GB. Now free physical is 800 GB and free virtual is still 1 TB. Ratio is now 1.25.
i.e. we cannot calculate free virtual capacity based on free physical capacity with a constant ratio. The calculation in CapacityFilter is not correct, and it's not a matter of opinion of what we decide this ratio is.
Since we're using a constant ratio, this ratio *has* to be between totals and not between currently allocated and written. Cinder also treats it like that in every other place in the code.
I'm not sure how other data being stored on the same pool changes that (and isn't that what "reserved" is for?)
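The two examples above can be checked numerically; the point is that the free-virtual to free-physical ratio changes as data is written, so no constant ratio can link the two (a sketch with a hypothetical helper name, values in GB):

```python
def free_ratio(total_phys, total_virt, provisioned, written):
    # ratio between free virtual and free physical capacity
    free_virt = total_virt - provisioned
    free_phys = total_phys - written
    return free_virt / free_phys

# Example 1: 100 GB written -> ratio ~1.11
print(round(free_ratio(1000, 1200, 200, 100), 2))  # 1.11
# Example 2: 100 GB more written -> ratio 1.25
print(round(free_ratio(1000, 1200, 200, 200), 2))  # 1.25
```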
Gorka Eguileor (gorka) wrote (#4):
You can think of our over provisioning calculation as an adaptive calculation, where, as you say, we'll be evaluating if there will be enough capacity based on the current free space.
You may not agree with the current mechanism, but there are always cases in favor of both approaches, for example:
Physical: 1 TB
Virtual: 1.2 TB
Provisioned: 1 TB
Written: 1 TB
If under these circumstances you request a 1 GB disk to be created from an image, then with your approach it would pass the scheduling phase but fail while trying to write data, while with the current code it would fail to schedule the creation because there is not enough space.
That's why I say that changing the current calculation is just a matter of opinion.
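Gorka's scenario can be expressed as a quick comparison of the two scheduling checks (a hypothetical sketch; neither function is actual Cinder code):

```python
def schedulable_totals(total, provisioned, request, max_osr):
    # "totals" interpretation: free virtual = total * max_osr - provisioned
    return total * max_osr - provisioned >= request

def schedulable_adaptive(total, used, request, max_osr):
    # current CapacityFilter behavior: free virtual = (total - used) * max_osr
    return (total - used) * max_osr >= request

# 1000 GB physical, ratio 1.2, 1000 GB provisioned and written, 1 GB request
print(schedulable_totals(1000, 1000, 1, 1.2))    # True: schedules, later fails on write
print(schedulable_adaptive(1000, 1000, 1, 1.2))  # False: rejected up front
```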
Reserved is used to ensure that created volumes will have enough space to add new data, not to account for another Cinder volume service using the same pool and thereby reducing the available space.
Arnon Yaari (arnony) wrote (#5):
Regarding the example: over-provisioning always means that writing data may fail; that's why it's called "over". I now understand why it can be considered a matter of opinion, because in my opinion the creation should succeed in that case. That opinion follows from how over-provisioning behaves in general: storage arrays generally follow this logic and allow the creation. It's also the way the feature is documented:
"Default ratio is 20.0, meaning provisioned capacity can be 20 times of the total physical capacity". It doesn't say anything about free capacities.
Should we bring this up in a Cinder meeting to get consensus?
In either case (no matter whose opinion we follow), the "adaptive" approach simply doesn't work, again because the ratio is not constant. There is no fixed arithmetical relation between free physical space and free virtual space, so I don't understand how the approach in CapacityFilter makes sense and I don't think we should leave it like that.
Gorka Eguileor (gorka) wrote (#6):
Yes, probably best to discuss this in the meeting.
Xing Yang (xing-yang) wrote (#7):
Please take a look at the following patch set, which included a lengthy discussion on the exact same line you are trying to change in this patch. After discussion, we decided it is safer to keep this calculation. We also decided not to use utils.calculate_virtual_free_capacity:
https:/
After discussion, we agreed just to rename the variable to adjusted_free_virtual:
https:/
Here is a list of patches from the past on oversubscription:
Oversubscription in thin provisioning:
https:/
Initial spec review:
https:/
Code in scheduler to introduce oversubscription ratio:
https:/
Code to add provisioned_capacity_gb:
https:/
Update spec after code is merged:
https:/
Fix capacity filter to allow oversubscription:
https:/
Attempt not to use virtual free capacity, abandoned after lengthy discussion:
https:/
Patch to rename the variable to adjusted_free_virtual:
https:/
Differentiate thick and thin provisioning:
https:/
https:/
Huang Zhiteng (zhiteng-huang) wrote (#8):
I am re-posting my comment from the patch.
calculate_virtual_free_capacity does:
'free' = total * max_osr - provisioned (note it doesn't use used capacity).
There is a significant difference between provisioned capacity and actual used capacity. What the current behavior does is:
'free' = total * max_osr - used * max_osr = (total - used) * max_osr
Does it make sense to count used capacity more than it is actually consumed like we are doing right now? I think so.
Let's look deeper into max_osr, and find out where it should be applied:
1) provisioned / total <= max_osr
2) used / provisioned <= 1/max_osr
1) is so obvious that we all agree; 2) is not mentioned in earlier discussions but it's actually quite intuitive too, because max_osr basically encodes the admin's belief (at a certain confidence level) that on average the consumption ratio of volumes won't go beyond 1/max_osr; otherwise he/she would be in trouble. So 1/max_osr = max_consumption_ratio.
From 1): 3) 'free' <= total * max_osr - provisioned
From 2), the data eventually written to newly provisioned volumes must fit in the remaining physical space: 4) 'free' * max_consumption_ratio <= total - used
4) can be rewritten as
5) 'free' <= (total - used) * max_osr
The meaningful 'free' should be the lower value calculated from 3) and 5), and that's exactly what we are doing now.
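Putting bounds 3) and 5) together, the effective free virtual capacity is their minimum (a sketch of the derivation, not Cinder's actual code):

```python
def virtual_free(total, used, provisioned, max_osr):
    bound_totals = total * max_osr - provisioned   # inequality 3)
    bound_adaptive = (total - used) * max_osr      # inequality 5)
    return min(bound_totals, bound_adaptive)

# Fully written 1000 GB pool with ratio 1.2: bound 3) still allows 200 GB,
# but bound 5) is 0, so nothing more should be scheduled
print(virtual_free(1000, 1000, 1000, 1.2))  # 0.0
```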
Huang Zhiteng (zhiteng-huang) wrote (#9):
I think the fundamental disagreement between me and Yaari is the interpretation of 'over-provision'. I can accept that 'over-provision' means at some point the service level would become degraded. But a storage device becoming read-only is not degraded service to me, and when this happens to a storage cluster, it means read-only for everybody. That is a disaster to me.
Arnon Yaari (arnony) wrote (#10):
Your approach causes a problem where max_over_subscription_ratio has no single value a volume driver can report that makes all the calculations correct.
If I'm a storage admin (or storage device) who knows exactly, in absolute values, the following:
1. Total virtual capacity (amount storage backend will let me provision)
2. Total physical capacity (amount storage backend will let me write)
3. Used virtual (amount I provisioned)
4. Used physical (amount I wrote)
What should I write or return in max_over_subscription_ratio?
1. should I return total virtual / total physical to make the calculation in calculate_virtual_free_capacity correct?
2. should I return free virtual / free physical to make the calculation in CapacityFilter correct?
3. should I return used virtual / used physical to return the same as other volume drivers like Pure's and Kaminario's?
As a volume driver developer, after this discussion I'm leaning towards option 4: not calculating this at all and just returning the value from the configuration, so the operator will have to figure out what it means. That seems to be what all the other vendors are doing.
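Option 4 would look roughly like this in a driver's stats report (a hypothetical sketch; only the stats keys mirror real Cinder capability names, the class names and values are made up):

```python
class FakeConfig:
    # value an operator would set in cinder.conf
    max_over_subscription_ratio = 20.0

class ExampleDriver:
    def __init__(self, configuration):
        self.configuration = configuration

    def get_volume_stats(self):
        # report the configured ratio verbatim instead of deriving it
        # from array counters
        return {
            'total_capacity_gb': 1000,
            'provisioned_capacity_gb': 200,
            'thin_provisioning_support': True,
            'max_over_subscription_ratio':
                self.configuration.max_over_subscription_ratio,
        }

stats = ExampleDriver(FakeConfig()).get_volume_stats()
print(stats['max_over_subscription_ratio'])  # 20.0
```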
Gorka Eguileor (gorka) wrote (#11):
I have proposed a spec supporting standard over-provisioning calculations: https:/
Arnon Yaari (arnony) wrote (#12):
The inconsistency between the two calculation methods has been resolved by spec:
https:/
and commit https:/
Changed in cinder:
status: Opinion → Invalid
status: Invalid → Fix Released
Change abandoned by Arnon Yaari (<email address hidden>) on branch: master
Review: https:/
Reason: The inconsistency between the two calculation methods has been resolved by spec:
https:/
and commit https:/
Fix proposed to branch: master
Review: https://review.openstack.org/484354