NetApp ONTAP: QoS policy group is deleted after migration

Bug #1906291 reported by Lucio Seki on 2020-11-30
This bug affects 1 person
Affects: Cinder
Importance: Medium
Assigned to: Unassigned

Bug Description

NetApp ONTAP Cinder driver has support for setting QoS.
When a Cinder volume is created with a volume type associated with a QoS entity, the driver creates a QoS policy group on the ONTAP back end and associates it with the entity representing the Cinder volume (either a LUN or a file within an NFS share).
When a migrate operation is issued, the QoS policy group is deleted.

Steps to reproduce:
- Set up 2 ONTAP back ends `ontap1` and `ontap2`
- Create a Cinder QoS `qos_test`
- Create a Cinder volume type `ontap`
- Associate the QoS `qos_test` to the volume type `ontap`
- Create a Cinder volume with the volume type `ontap`
- Migrate the volume to another ONTAP back end
- Wait for the driver to perform a host-assisted migration
- Wait for the driver to create a new QoS policy group and associate it to the new LUN/file representing the volume

Expected result:
- The new QoS policy group remains permanently associated with the LUN/file.

Actual result:
- The new QoS policy group is deleted after a few minutes.

Detailed commands and outputs are here [0].

[0] http://paste.openstack.org/show/800563/

Changed in cinder:
status: New → Triaged
importance: Undecided → Medium
Gorka Eguileor (gorka) wrote :

I believe the issue is in the NetApp driver itself, and it is a two-part issue:

- When checking which QoS policy group to delete, we do not take the volume's name into account and only check its id [1]

- The "update_migrated_volume" method is not implemented in NetApp's NFS driver, so the inherited implementation from cinder.volume.drivers.nfs.NfsDriver is used, which does an "os.rename" [2] on the backing file, breaking the match between the file and the QoS policy group.

[1]: https://github.com/openstack/cinder/blob/d3ffa90baa959530eaa1cd1d4e3800fbe9148806/cinder/volume/drivers/netapp/utils.py#L269
[2]: https://github.com/openstack/cinder/blob/d3ffa90baa959530eaa1cd1d4e3800fbe9148806/cinder/volume/drivers/nfs.py#L484
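The first point, the fix later suggested on IRC, can be sketched as a self-contained snippet. The `OPENSTACK_PREFIX` value and function shape mirror the linked utils.py code, but this is an illustration of the suggested change, not the merged patch:

```python
# Sketch of the suggested fix to
# cinder/volume/drivers/netapp/utils.py:get_qos_policy_group_name.
OPENSTACK_PREFIX = 'openstack-'


def get_qos_policy_group_name(volume):
    """Return the ONTAP QoS policy group name for a Cinder volume.

    Prefer the volume's current name, which changes after a
    host-assisted migration, and fall back to its id, so the driver
    matches the policy group against the LUN/file that actually
    exists on the back end instead of one named after the old id.
    """
    return OPENSTACK_PREFIX + (volume.get('name') or volume['id'])
```

With only the id, the driver looks for a policy group named after the pre-migration volume and garbage-collects the new one; using the name keeps the lookup consistent with the renamed back-end object.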

Lucio Seki (lseki) wrote :

The change suggested by Gorka at [0] fixed the issue:
---
geguileo lseki: https://github.com/openstack/cinder/blob/d3ffa90baa959530eaa1cd1d4e3800fbe9148806/cinder/volume/drivers/netapp/utils.py#L269 18:07
geguileo lseki: it's using the id and not the name... 18:07
geguileo lseki: I believe changing that line to something like return OPENSTACK_PREFIX + (volume.get('name') or volume['id'])
---

I created a Cinder volume on an ONTAP NFS back end with QoS and migrated it to another ONTAP NFS back end.

The operation successfully created a new QoS policy group and associated it with the new file backing the Cinder volume.

After a while, the ONTAP driver properly deleted the old QoS policy group.

Detailed commands and outputs are available here [1].

It was not necessary to implement the update_migrated_volume method.

[0] http://eavesdrop.openstack.org/irclogs/%23openstack-cinder/%23openstack-cinder.2020-12-09.log.html#t2020-12-09T18:07:33
[1] http://paste.openstack.org/show/800913

Lucio Seki (lseki) wrote :

I'll test it again with different pools, as I was testing with 2 back ends using the same pool.

Lucio Seki (lseki) wrote :

Just modifying the `cinder/volume/drivers/netapp/utils.py:get_qos_policy_group_name` was not sufficient [0].
The QoS policy group gets associated with the wrong filename and ends up being deleted after a while (since the associated file does not exist).

As Gorka suggested, I also had to implement `cinder/volume/drivers/netapp/dataontap/nfs_cmode.py:NetAppCmodeNfsDriver.update_migrated_volume` so that it raises NotImplementedError, preventing the base class (cinder/volume/drivers/nfs.py:NfsDriver) from renaming the file.
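The override described above can be sketched as follows. The class and method names come from this comment; the minimal base class is a stand-in for the real NfsDriver, not Cinder's actual code:

```python
class NfsDriver:
    """Minimal stand-in for cinder.volume.drivers.nfs.NfsDriver."""

    def update_migrated_volume(self, ctxt, volume, new_volume,
                               original_volume_status):
        # The real base implementation renames the backing file via
        # os.rename(), which breaks the ONTAP driver's matching of the
        # file against its QoS policy group.
        raise AssertionError('base rename path should not be reached')


class NetAppCmodeNfsDriver(NfsDriver):
    """Sketch of the override described in the comment above."""

    def update_migrated_volume(self, ctxt, volume, new_volume,
                               original_volume_status):
        # Raising NotImplementedError tells the volume manager to fall
        # back to its generic handling instead of renaming the file, so
        # the QoS policy group stays associated with an existing file.
        raise NotImplementedError
```

The override does not need any ONTAP-specific logic; simply opting out of the base class's rename is enough to keep the file name stable for the QoS association.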

Now the QoS policy group is associated to the correct file [1].

[0] http://paste.openstack.org/show/800918
[1] http://paste.openstack.org/show/800919

Changed in cinder:
status: Triaged → In Progress