[NFS/Quobyte/NetApp] No feedback from Nova when extending attached Volumes

Bug #1978294 reported by Konrad Gube
This bug affects 3 people
Affects                    Status        Importance   Assigned to   Milestone
Cinder                     In Progress   Medium       Konrad Gube
OpenStack Compute (nova)   In Progress   Undecided    Konrad Gube

Bug Description

When extending an attached Volume, Cinder will notify Nova after running the volume driver's extend_volume method, so that Nova can take steps to make the additional space available to the instance the volume is attached to.

Nova is notified using the external server event API [1], which is an asynchronous operation and has no effect on the volume status or the success of the extend.
For most drivers this is fine, but there are proposed changes to the NFS [2] and NetApp [3] drivers that rely on Nova to perform the actual resize if the volume is attached.
A failure to resize by Nova would then lead to an inconsistent volume status and size, because currently both are set before Nova even gets notified of the resize.
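
To make the ordering problem concrete, here is a minimal sketch. It is not Cinder's actual code: the `Volume` class, `extend_attached_volume` and the `send_event` callable are hypothetical names used only for illustration. The request body follows the `volume-extended` event of the external server event API [1], where `tag` carries the volume ID.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical stand-in for a Cinder volume record."""
    id: str
    size_gb: int
    status: str = "in-use"


def build_volume_extended_event(server_uuid, volume_id):
    """Build the request body for POST /os-server-external-events."""
    return {
        "events": [
            {
                "name": "volume-extended",
                "server_uuid": server_uuid,
                "tag": volume_id,  # the only event-specific field
            }
        ]
    }


def extend_attached_volume(volume, new_size_gb, server_uuid, send_event):
    """Mimic the current ordering: commit size and status, then notify Nova."""
    volume.status = "extending"
    # ... the driver's extend_volume method would run here ...
    volume.size_gb = new_size_gb
    volume.status = "in-use"
    # Fire-and-forget: the outcome of the resize in Nova never comes back,
    # so a failure there leaves the size and status above unchanged.
    send_event(build_volume_extended_event(server_uuid, volume.id))
    return volume
```

By the time `send_event` is called, the new size and the final status are already recorded, which is exactly the inconsistency described above.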

Lucio Seki, who first worked on [2] and [3], collected some thoughts on this in an etherpad [4] together with other potential issues of the changes.

There are essentially two approaches to fixing this:

1. Have Cinder poll Nova for the completion.
This might require [5] to be fixed first, so that resize failures show up in the instance action log.
Cinder would then poll the actions log of the instance to get the status of the operation and update the volume status accordingly.

2. Have Nova call back into Cinder when the resize is finished.
The "os-reset_status" volume action could perhaps be abused for this, but a cleaner solution would be the introduction of a new volume action, akin to "os-migrate_volume_completion".
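
As a rough illustration of approach 1, the polling loop could look like the sketch below. Everything here is an assumption: `fetch_actions` is a hypothetical callable that returns the instance's action records as dicts, the `request_id` and `result` keys are an assumed record shape, and the returned status strings are placeholders for whatever Cinder would actually set.

```python
import time


def wait_for_extend(fetch_actions, request_id, timeout=60.0, interval=2.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll the instance action log until the extend finishes or times out.

    Returns the volume status Cinder would set (assumed names).
    """
    deadline = clock() + timeout
    while True:
        for action in fetch_actions():
            if action.get("request_id") != request_id:
                continue  # action belongs to a different request
            result = action.get("result")
            if result == "Success":
                return "in-use"
            if result == "Error":
                return "error_extending"
        if clock() >= deadline:
            # No verdict from Nova in time; treat it as a failure.
            return "error_extending"
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.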

I believe a new volume action would be the best way to implement this, and I would like to work on this, with the goal of getting [2] and [3] merged eventually.
I'm new to Cinder development though, so I would welcome any suggestions or feedback.

[1]: https://docs.openstack.org/api-ref/compute/?expanded=run-events-detail#run-events
[2]: https://review.opendev.org/c/openstack/cinder/+/739079
[3]: https://review.opendev.org/c/openstack/cinder/+/820578
[4]: https://etherpad.opendev.org/p/fix-nfs-online-extend
[5]: https://review.opendev.org/c/openstack/nova/+/680648/

Konrad Gube (kgube)
description: updated
Konrad Gube (kgube)
Changed in cinder:
assignee: nobody → Konrad Gube (kgube)
description: updated
Konrad Gube (kgube)
Changed in cinder:
status: New → In Progress
tags: added: attach nova
Changed in cinder:
importance: Undecided → Medium
summary: - No feedback from Nova when extending attached Volumes
+ [NFS/NetApp] No feedback from Nova when extending attached Volumes
tags: added: netapp nfs
tags: added: quobyte
summary: - [NFS/NetApp] No feedback from Nova when extending attached Volumes
+ [NFS/Quobyte/NetApp] No feedback from Nova when extending attached Volumes
Konrad Gube (kgube)
Changed in nova:
status: New → In Progress
assignee: nobody → Konrad Gube (kgube)
Konrad Gube (kgube) wrote :

Since we have to rely on Nova to actually perform the volume resize, we need to transform the external server event that notifies Nova of the resize into a synchronous operation that returns information on the success of the resize.

Introducing a new callback into Cinder to be called from Nova does accomplish this, but has a number of drawbacks.
For example, the external server event API only has a single field for event-specific data, in this case holding the ID of the volume. The new size of the volume would need to be either set in the database before the actual resize, which I would like to avoid, or put into the volume's admin metadata for Nova to access, which is not a clean solution either.
Another problem is how to deal with the case of the callback never arriving (e.g. because of a network error). It would be difficult to implement a timeout with this design.
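
To show why the timeout is awkward with a callback, here is a small sketch under stated assumptions. `PendingExtends` is a hypothetical helper, not existing Cinder code. The point is that Cinder would have to keep extra state per pending extend and run a periodic sweep to notice callbacks that never arrive.

```python
import time


class PendingExtends:
    """Hypothetical tracker for extends awaiting a callback from Nova."""

    def __init__(self, timeout=60.0, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._pending = {}  # volume_id -> (new_size_gb, deadline)

    def start(self, volume_id, new_size_gb):
        """Record that an extend was requested and when it should expire."""
        deadline = self._clock() + self._timeout
        self._pending[volume_id] = (new_size_gb, deadline)

    def complete(self, volume_id, success):
        """Handle the callback; return the new size on success, else None."""
        entry = self._pending.pop(volume_id, None)
        if entry is None:
            return None  # unknown or already expired: a late callback
        new_size_gb, _deadline = entry
        return new_size_gb if success else None

    def expire(self):
        """Must be run periodically; returns IDs whose callback never came."""
        now = self._clock()
        expired = [v for v, (_s, d) in self._pending.items() if d <= now]
        for volume_id in expired:
            del self._pending[volume_id]
        return expired
```

A callback that arrives after `expire` has already dropped the entry can no longer be matched, so the volume may end up marked as failed even though Nova resized it.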

A much cleaner solution would be to replace the event with a new server action in the Nova API that Cinder can call synchronously. This solves the problems mentioned above and is also semantically closer to what we actually want to do.

So I am going to focus now on introducing a new admin server action to Nova for extending attached volumes, that can be used by Cinder to trigger the operation and get feedback on its success.
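
The intended control flow on the Cinder side could be sketched as follows. This is only an illustration: `nova_extend` stands in for the not-yet-existing server action, `ExtendFailed` is a hypothetical exception, the volume is modelled as a plain dict, and the status names are assumptions.

```python
class ExtendFailed(Exception):
    """Hypothetical error raised when Nova reports a failed resize."""


def extend_attached_volume_sync(volume, new_size_gb, nova_extend):
    """Ask Nova to resize first; record the new size only on success."""
    previous_status = volume["status"]
    volume["status"] = "extending"
    try:
        # Blocks until Nova has finished (or failed) the resize.
        nova_extend(volume["id"], new_size_gb)
    except ExtendFailed:
        volume["status"] = "error_extending"  # size stays unchanged
        return False
    volume["size"] = new_size_gb
    volume["status"] = previous_status
    return True
```

Because the call returns only after Nova is done, the new size can be passed as an argument and never has to be written to the database ahead of the resize.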

Konrad Gube (kgube) wrote :

I submitted a spec for a synchronous assisted volume extend API in nova:
https://blueprints.launchpad.net/nova/+spec/assisted-volume-extend
