Thanks for the link https://review.opendev.org/#/c/609180/, it helps avoid overload by limiting the number of concurrent IO tasks.
But it does not solve my issue; let me explain my use case further:
Imagine a compute host with 100 VMs where customers run backups (instance snapshots) at night.
We want to allow 5-10 simultaneous snapshots even if the snapshot directory is busy.
Since backup is best effort, it is not an issue for us if it takes hours.
Unfortunately, I can still hit the issue with max_concurrent_disk_ops=1 if the snapshot directory is on a shared network filesystem, as the network drive can be busy regardless of my compute activity.
-> Resource contention on the snapshot directory should just slow down the upload, not hang nova-compute.
Currently the nova-compute process owns the IO during the upload to glance (it uses the glance client).
To scale nova-compute, we need to offload it from doing that IO.
We work around this by running the snapshot upload to glance via utils.execute(*curl) so that it fork()s into a separate process.
This prevents us from having hundreds of nova-compute services flapping as described in the bug report.
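Our workaround amounts to something like the following simplified sketch. It uses subprocess in place of nova's utils.execute to stay self-contained, and the URL and token handling are illustrative only:

```python
import subprocess

def build_upload_cmd(image_path, upload_url, token):
    """Build the curl command line for a snapshot upload (illustrative)."""
    return [
        "curl", "--fail", "--silent",
        "-X", "PUT",
        "-H", "X-Auth-Token: %s" % token,
        "-H", "Content-Type: application/octet-stream",
        "--upload-file", image_path,
        upload_url,
    ]

def upload_snapshot_external(image_path, upload_url, token):
    # Shelling out forks a child process, so a slow or stalled upload
    # blocks only that child, never the nova-compute process itself.
    # In nova we call utils.execute(*cmd) instead of subprocess.
    subprocess.check_call(build_upload_cmd(image_path, upload_url, token))
```

The design choice is simply process isolation: the kernel schedules and buffers the child's IO independently, so contention on the snapshot directory or the network drive cannot wedge the service's own event loop.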
As you suggested, we can discuss that during VPTG!