Fail to extend attached volume using generic NFS driver

Bug #1870367 reported by Arthur Nascimento Santos
This bug affects 9 people
Affects: Cinder
Status: In Progress
Importance: High
Assigned to: Andre Luiz Beltrami Rocha
Milestone: (none listed)

Bug Description

The command to extend an attached volume fails when using the generic NFS driver. After it runs, the volume is left in status 'error_extending'. The cause is that qemu-img cannot acquire the write lock on the image because another process is using it (as the log shows). The error occurs only while the volume is attached to a Nova instance; running the same command on a detached volume works.

# Error log (cinder-volume service)

Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager [req-f190705c-c270-4280-9c13-31eab0580797 req-971d7a69-ae16-4049-9ea2-36051f20535d admin None] Extend volume failed.: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf qemu-img resize /opt/stack/data/cinder/mnt/1af9e5980aa9bae6e080782e4ebad95e/volume-613d3202-052c-4c4d-b0eb-62117a9de081 2G
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: Exit code: 1
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: Stdout: ''
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: Stderr: 'WARNING: Image format was not specified for \'/opt/stack/data/cinder/mnt/1af9e5980aa9bae6e080782e4ebad95e/volume-613d3202-052c-4c4d-b0eb-62117a9de081\' and probing guessed raw.\n Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.\n Specify the \'raw\' format explicitly to remove the restrictions.\nqemu-img: Could not open \'/opt/stack/data/cinder/mnt/1af9e5980aa9bae6e080782e4ebad95e/volume-613d3202-052c-4c4d-b0eb-62117a9de081\': Failed to get "write" lock\nIs another process using the image?\n'
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager Traceback (most recent call last):
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/volume/manager.py", line 2740, in extend_volume
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager self.driver.extend_volume(volume, new_size)
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/volume/drivers/nfs.py", line 376, in extend_volume
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager run_as_root=self._execute_as_root)
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/image/image_utils.py", line 354, in resize_image
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager utils.execute(*cmd, run_as_root=run_as_root)
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/utils.py", line 126, in execute
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager return processutils.execute(*cmd, **kwargs)
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager File "/usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py", line 424, in execute
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager cmd=sanitized_cmd)
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf qemu-img resize /opt/stack/data/cinder/mnt/1af9e5980aa9bae6e080782e4ebad95e/volume-613d3202-052c-4c4d-b0eb-62117a9de081 2G
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager Exit code: 1
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager Stdout: ''
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager Stderr: 'WARNING: Image format was not specified for \'/opt/stack/data/cinder/mnt/1af9e5980aa9bae6e080782e4ebad95e/volume-613d3202-052c-4c4d-b0eb-62117a9de081\' and probing guessed raw.\n Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.\n Specify the \'raw\' format explicitly to remove the restrictions.\nqemu-img: Could not open \'/opt/stack/data/cinder/mnt/1af9e5980aa9bae6e080782e4ebad95e/volume-613d3202-052c-4c4d-b0eb-62117a9de081\': Failed to get "write" lock\nIs another process using the image?\n'
Apr 02 13:46:01 47-throne-dev-openstack cinder-volume[27508]: ERROR cinder.volume.manager
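
The failure can be recognized from the stderr text alone. The sketch below is a minimal illustration, not Cinder code: `is_write_lock_failure` is a hypothetical helper and the sample path is a placeholder. It checks for the `Failed to get "write" lock` message seen in the log above, which qemu-img prints when another process, such as the QEMU process of a running instance, holds the image lock.

```python
# Text qemu-img prints when it cannot take the write lock on an image.
LOCK_MESSAGE = 'Failed to get "write" lock'


def is_write_lock_failure(stderr: str) -> bool:
    """Return True if qemu-img stderr reports a write lock conflict.

    Hypothetical helper for illustration; not part of Cinder.
    """
    return LOCK_MESSAGE in stderr


# Shortened sample of the stderr from the log; the path is a placeholder.
sample_stderr = (
    "qemu-img: Could not open '/mnt/nfs/volume-example': "
    'Failed to get "write" lock\n'
    "Is another process using the image?\n"
)

print(is_write_lock_failure(sample_stderr))
print(is_write_lock_failure("some unrelated error\n"))
```

A check like this only classifies the error; it does not make the resize safe to retry while the volume is attached.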

# Environment

- DevStack on master branch
- Ubuntu 18.04

- cinder.conf
[nfs]
nfs_shares_config = /etc/cinder/nfs_share
volume_driver = cinder.volume.drivers.nfs.NfsDriver
volume_backend_name = nfsbackend

- /etc/cinder/nfs_share
localhost:/mnt/sharedfolder

- /etc/exports
/mnt/sharedfolder *(rw,sync,no_subtree_check,no_root_squash)

# Steps to reproduce the error

1 - Create the volume with cinder

cinder type-create nfs
cinder type-key nfs set volume_backend_name=nfsbackend
cinder create --name vol1 1

2 - Create the nova instance

openstack server create --network private --image cirros --flavor m1.small server1

3 - Attach the volume to the nova instance

openstack server add volume server1 vol1

4 - Try to extend the attached volume

cinder extend vol1 2

* the error occurs here in the cinder-volume service

Lucio Seki (lseki)
description: updated
Brian Rosmaita (brian-rosmaita) wrote :

Discussed at today's cinder meeting. We may need to fix this in two stages:

(1) don't allow extend of an attached volume for nfs drivers
(2) figure out how to do the online extend safely, and then re-enable that functionality
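
Stage (1) amounts to a guard at the top of the driver's extend path. A minimal sketch, assuming a simplified `Volume` record and a local stand-in for Cinder's `ExtendVolumeError`; the names and fields here are illustrative and not the driver's actual code:

```python
from dataclasses import dataclass


class ExtendVolumeError(Exception):
    """Local stand-in for the error Cinder raises when an extend fails."""


@dataclass
class Volume:
    # Simplified, hypothetical volume record.
    name: str
    size_gb: int
    attached: bool


def extend_volume(volume: Volume, new_size_gb: int) -> None:
    """Refuse to extend an attached volume; otherwise record the new size."""
    if volume.attached:
        raise ExtendVolumeError(
            f"extending attached volume {volume.name} is not supported")
    # A real driver would resize the backing file here.
    volume.size_gb = new_size_gb


detached = Volume("vol1", 1, attached=False)
extend_volume(detached, 2)
print(detached.size_gb)

attached = Volume("vol2", 1, attached=True)
try:
    extend_volume(attached, 2)
except ExtendVolumeError as exc:
    print(f"rejected: {exc}")
```

Failing early keeps the attached volume untouched instead of letting the resize command run and fail partway.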

Changed in cinder:
status: New → Triaged
importance: Undecided → High
milestone: none → ussuri-rc1
Silvan Kaiser (2-silvan) wrote :

This also affects the Quobyte driver and I think more remotefs based drivers might be affected.

Lucio Seki (lseki) wrote :

This affects NetApp Cinder NFS driver as well.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.opendev.org/725805

Changed in cinder:
assignee: nobody → Silvan Kaiser (2-silvan)
status: Triaged → In Progress
Changed in cinder:
milestone: ussuri-rc1 → victoria-1
Lucio Seki (lseki) wrote :

Talked to lyarwood and sean-k-mooney at #openstack-nova [0] and did some tests.

In summary, n-cpu is able to extend an attached NFS volume by itself, but c-vol is preventing this from happening when it tries to perform `qemu-img resize`.

This should be fixed in two patches:

1. Add a condition to `image_utils.resize_image` [1], so that c-vol calls `qemu-img resize` only when the volume is detached.
2. Implement `LibvirtNFSVolumeDriver.extend_volume` [2], so that it no longer raises `NotImplementedError` [3].

On the n-cpu side, `LibvirtDriver.extend_volume` [4] will call `LibvirtNFSVolumeDriver.extend_volume` [5] implemented in step 2, and then `LibvirtDriver._resize_attached_volume` [6] will call `BlockDevice.resize`, which in turn calls the `blockResize` [7] method that performs the actual resize operation.

[0] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2020-07-01.log.html#t2020-07-01T13:08:24
[1] https://github.com/openstack/cinder/blob/master/cinder/image/image_utils.py#L408
[2] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/volume/nfs.py#L20
[3] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/volume/volume.py#L153
[4] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2105
[5] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L1665
[6] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2053
[7] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/guest.py#L819
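
The split between the two services can be sketched as follows. This is a simplified model, not the proposed patches: `extend_nfs_volume`, `run`, and `notify_compute` are hypothetical names, with `notify_compute` standing in for whatever mechanism tells n-cpu to resize the attached volume, and the path is a placeholder. The only point illustrated is that c-vol runs `qemu-img resize` solely for detached volumes.

```python
from typing import Callable, List


def extend_nfs_volume(
    path: str,
    new_size_gb: int,
    attached: bool,
    run: Callable[[List[str]], None],
    notify_compute: Callable[[int], None],
) -> str:
    """Pick who performs the resize and return a label for that choice."""
    if attached:
        # The running instance holds the image lock, so the volume service
        # does not touch the file and leaves the resize to the compute side.
        notify_compute(new_size_gb)
        return "compute"
    # Detached volume: nothing else has the image open.
    run(["qemu-img", "resize", path, f"{new_size_gb}G"])
    return "qemu-img"


# Record calls instead of executing anything.
commands: List[List[str]] = []
notifications: List[int] = []

print(extend_nfs_volume("/mnt/nfs/volume-example", 2, True,
                        commands.append, notifications.append))
print(extend_nfs_volume("/mnt/nfs/volume-example", 2, False,
                        commands.append, notifications.append))
print(commands)
```

Passing the command runner and the notifier as callables keeps the sketch testable without qemu-img or a hypervisor.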

Lucio Seki (lseki) wrote :

The fix worked when I manually tested it in a DevStack Train deployment.
I'll submit the patches for both Cinder and Nova repos.

Lucio Seki (lseki) wrote :

Also, we need to modify DevStack to make it run extend_attached_volume tests on generic NFS driver. Currently it runs only for LVM [0].

[0] https://opendev.org/openstack/devstack/src/branch/master/lib/tempest#L471

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/739079

Changed in cinder:
assignee: Silvan Kaiser (2-silvan) → Lucio Seki (lseki)
Lucio Seki (lseki) wrote :

Relating the steps and the patches:

1) don't allow extend of an attached volume for nfs drivers => addressed by https://review.opendev.org/725805 "Disallow extension of attached volumes for NFS & Quobyte drivers"

2) figure out how to do the online extend safely, and then re-enable that functionality => addressed by https://review.opendev.org/739079 "Generic NFS: skip qemu-img resize if volume is attached"
      Depends-On: https://review.opendev.org/#/c/739049 "Enable extend attached volume tests for Cinder generic NFS driver" (this addresses comment #7)
      Depends-On: https://review.opendev.org/#/c/739077 "Implement extend_volume for libvirt NFS volume driver"

If the patch and dependencies in 2) are merged soon, we can skip step 1).

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/740531

Lucio Seki (lseki) wrote :

Created an etherpad tracking the issues with the proposed fix:
https://etherpad.opendev.org/p/fix-nfs-online-extend

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/756358

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.opendev.org/725805
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=57892623dee9d64c0ac52e2830ff28b866f10d65
Submitter: Zuul
Branch: master

commit 57892623dee9d64c0ac52e2830ff28b866f10d65
Author: Silvan Kaiser <email address hidden>
Date: Wed May 6 11:10:49 2020 +0200

    Disallow extension of attached volumes for NFS & Quobyte drivers

    NFS and Quobyte drivers no longer allow extension of an attached
    volume. The extend operation raises an ExtendVolumeError
    in the Cinder Volume service in case the volume to be extended
    is currently attached.

    This should be reverted once a a more capable solution for
    bug #1870367 has been found.

    Partial-bug: #1870367

    Change-Id: Ib2a7c1cdf269b4907ff8adff1b9d900072eedde2

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/victoria)

Reviewed: https://review.opendev.org/756358
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=390a63244706ab2d41c852528ac3ab71dfed2b25
Submitter: Zuul
Branch: stable/victoria

commit 390a63244706ab2d41c852528ac3ab71dfed2b25
Author: Silvan Kaiser <email address hidden>
Date: Wed May 6 11:10:49 2020 +0200

    Disallow extension of attached volumes for NFS & Quobyte drivers

    NFS and Quobyte drivers no longer allow extension of an attached
    volume. The extend operation raises an ExtendVolumeError
    in the Cinder Volume service in case the volume to be extended
    is currently attached.

    This should be reverted once a a more capable solution for
    bug #1870367 has been found.

    Partial-bug: #1870367

    Change-Id: Ib2a7c1cdf269b4907ff8adff1b9d900072eedde2
    (cherry picked from commit 57892623dee9d64c0ac52e2830ff28b866f10d65)

tags: added: in-stable-victoria
Felipe Rodrigues (felipefutty) wrote :

I could reproduce the bug on the current master branch after reverting the patch that disallows the online extend [1]. Error log [2].

I uploaded a new patch set to the correct fix [3], rebasing it and re-enabling the feature.

[1] https://review.opendev.org/c/openstack/cinder/+/725805/
[2] https://paste.opendev.org/show/811227/
[3] https://review.opendev.org/c/openstack/cinder/+/739079

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to cinder (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/cinder/+/820578

Changed in cinder:
assignee: Lucio Seki (lseki) → Andre Luiz Beltrami Rocha (andrebeltrami)
Konrad Gube (kgube) wrote :

I had a look at the issues described in the etherpad, specifically the lack of feedback from Nova on the success of the resize.
This seems to be a problem not just for the NFS driver, since drivers using os-brick may fail [1], and Nova also handles extending of the LUKS structure of attached encrypted volumes, which may also fail.

So I believe it might be a good idea to handle this as a separate bug.
The cleanest way to implement it would probably be to leave the volume status as "extending" for attached volumes, and use a new volume action, akin to the "os-migrate_volume_completion" action, to notify Cinder once Nova is done.

It might also be an option to just use the existing "os-reset_status" action to set the volume status to "error_extending" or back to "in-use", depending on whether something went wrong in Nova.

[1]: https://bugs.launchpad.net/cinder/+bug/1849425
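
The status handling proposed above can be sketched as a small state transition. This is only a model of the idea under stated assumptions: `begin_extend` and `complete_extend` are hypothetical names, not existing Cinder actions, and the `Volume` record is simplified. The volume stays in "extending" until the compute side reports the outcome, and the new size is recorded only on success.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    # Simplified, hypothetical volume record.
    status: str
    size_gb: int


def begin_extend(volume: Volume) -> None:
    """Mark an attached volume as extending while the compute side works."""
    if volume.status != "in-use":
        raise ValueError(f"unexpected status: {volume.status}")
    volume.status = "extending"


def complete_extend(volume: Volume, new_size_gb: int, success: bool) -> None:
    """Apply the outcome reported by the compute side."""
    if volume.status != "extending":
        raise ValueError(f"unexpected status: {volume.status}")
    if success:
        volume.size_gb = new_size_gb
        volume.status = "in-use"
    else:
        # The size is left unchanged when the resize failed.
        volume.status = "error_extending"


ok = Volume(status="in-use", size_gb=1)
begin_extend(ok)
complete_extend(ok, 2, success=True)
print(ok)

failed = Volume(status="in-use", size_gb=1)
begin_extend(failed)
complete_extend(failed, 2, success=False)
print(failed)
```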

Konrad Gube (kgube) wrote :

I created a new bug for the Nova feedback issue:

https://bugs.launchpad.net/cinder/+bug/1978294
