NFS retype w/ migrate issue

Bug #1798468 reported by Eric Harney
This bug affects 1 person
Affects: Cinder
Status: Fix Released
Importance: Critical
Assigned to: Alan Bishop

Bug Description

Something is up with the NFS driver's migration code. I think it may be
relying on qemu-img locking behavior to work correctly, and that behavior
has changed in newer versions of qemu-img.

This implies that anyone installing a qemu-img newer than 2.10 may encounter a data loss bug, but I'm still working out the details.

Apply https://review.openstack.org/#/c/611423/ to Cinder, then:

$ cinder type-list
+--------------------------------------+------+-------------+-----------+
| ID                                   | Name | Description | Is_Public |
+--------------------------------------+------+-------------+-----------+
| 08290bc3-0bab-4ddb-9602-fcb2ced76bcc | nfs2 | -           | True      |
| e5eb00a3-5814-4459-b1de-94b4bcebdfdd | nfs  | -           | True      |
+--------------------------------------+------+-------------+-----------+

$ cinder create 1 --poll
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2018-10-17T20:02:06.000000           |
| description                    | None                                 |
| encrypted                      | False                                |
| group_id                       | None                                 |
| id                             | 66774b90-01ed-4d34-8b9a-5ef586f01380 |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | None                                 |
| os-vol-host-attr:host          | None                                 |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | 1a081dd2505547f5a8bb1a230f2295f4     |
| provider_id                    | None                                 |
| replication_status             | None                                 |
| size                           | 1                                    |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | available                            |
| updated_at                     | None                                 |
| user_id                        | ad9fe430b3a6416f908c79e4de3bfa98     |
| volume_type                    | nfs                                  |
+--------------------------------+--------------------------------------+

$ cinder list
+--------------------------------------+-----------+------+------+-------------+----------+-------------+
| ID                                   | Status    | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------+------+-------------+----------+-------------+
| 66774b90-01ed-4d34-8b9a-5ef586f01380 | available | -    | 1    | nfs         | false    |             |
+--------------------------------------+-----------+------+------+-------------+----------+-------------+

$ ls -l /opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/
total 0
-rw-rw-rw-. 1 root root 1073741824 Oct 17 16:02 volume-66774b90-01ed-4d34-8b9a-5ef586f01380

$ cinder retype --migration-policy on-demand 66774b90-01ed-4d34-8b9a-5ef586f01380 nfs2

During migration:

$ ls -l /opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/
total 0
-rw-rw-rw-. 1 root root 1073741824 Oct 17 16:02 volume-66774b90-01ed-4d34-8b9a-5ef586f01380
-rw-r--r--. 1 root root 1073741824 Oct 17 16:03 volume-e256cf97-0392-44e9-9faa-0ef263cdafdc

After migration:

$ ls -l /opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/
total 0

!!
Where is the volume file?

Snippets from the c-vol log:

Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: DEBUG oslo_concurrency.processutils [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] CMD "sudo cinder-rootwrap /etc/cinder/rootwrap.conf dd if=/opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/volume-66774b90-01ed-4d34-8b9a-5ef586f01380 of=/opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/volume-e256cf97-0392-44e9-9faa-0ef263cdafdc count=1073741824 bs=1M iflag=count_bytes,direct oflag=direct conv=sparse" returned: 0 in 4.374s {{(pid=20690) execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:409}}

Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: DEBUG cinder.volume.utils [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Volume copy details: src /opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/volume-66774b90-01ed-4d34-8b9a-5ef586f01380, dest /opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/volume-e256cf97-0392-44e9-9faa-0ef263cdafdc, size 1024.00 MB, duration 4.38 sec {{(pid=20690) _copy_volume_with_path /opt/stack/cinder/cinder/volume/utils.py:487}}

Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: DEBUG os_brick.initiator.connectors.remotefs [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] ==> disconnect_volume: call u"{'connection_properties': {u'name': u'volume-e256cf97-0392-44e9-9faa-0ef263cdafdc', u'format': u'raw', u'encrypted': False, u'qos_specs': None, u'export': u'localhost:/srv/nfs1', u'access_mode': u'rw', u'options': None}, 'self': <os_brick.initiator.connectors.remotefs.RemoteFsConnector object at 0x7fa3add7dc50>, 'force': True, 'device_info': {'path': u'/opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/volume-e256cf97-0392-44e9-9faa-0ef263cdafdc'}, 'ignore_errors': False}" {{(pid=20690) trace_logging_wrapper /opt/stack/os-brick/os_brick/utils.py:146}}
Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: DEBUG os_brick.initiator.connectors.remotefs [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] <== disconnect_volume: return (0ms) None {{(pid=20690) trace_logging_wrapper /opt/stack/os-brick/os_brick/utils.py:170}}
Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: INFO cinder.volume.manager [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Terminate volume connection completed successfully.
Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: DEBUG os_brick.initiator.connectors.remotefs [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] ==> disconnect_volume: call u"{'connection_properties': {'name': u'volume-66774b90-01ed-4d34-8b9a-5ef586f01380', 'format': 'raw', 'encrypted': False, 'qos_specs': None, 'export': u'localhost:/srv/nfs1', 'access_mode': 'rw', 'options': None}, 'self': <os_brick.initiator.connectors.remotefs.RemoteFsConnector object at 0x7fa3ae03e9d0>, 'force': True, 'device_info': {'path': u'/opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/volume-66774b90-01ed-4d34-8b9a-5ef586f01380'}, 'ignore_errors': False}" {{(pid=20690) trace_logging_wrapper /opt/stack/os-brick/os_brick/utils.py:146}}

Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: DEBUG os_brick.initiator.connectors.remotefs [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] <== disconnect_volume: return (0ms) None {{(pid=20690) trace_logging_wrapper /opt/stack/os-brick/os_brick/utils.py:170}}
Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: INFO cinder.volume.manager [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Terminate volume connection completed successfully.
Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: INFO cinder.volume.manager [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Remove volume export completed successfully.
Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: INFO cinder.volume.manager [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Remove volume export completed successfully.
Oct 17 16:03:15 centstack.localdomain cinder-volume[20664]: DEBUG cinder.volume.manager [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] migrate_volume_completion: completing migration for volume 66774b90-01ed-4d34-8b9a-5ef586f01380 (temporary volume e256cf97-0392-44e9-9faa-0ef263cdafdc {{(pid=20690) migrate_volume_completion /opt/stack/cinder/cinder/volume/manager.py:2221}}
Oct 17 16:03:16 centstack.localdomain cinder-volume[20664]: INFO cinder.volume.manager [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Complete-Migrate volume completed successfully.
Oct 17 16:03:16 centstack.localdomain cinder-volume[20664]: INFO cinder.volume.manager [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Migrate volume completed successfully.
Oct 17 16:03:16 centstack.localdomain cinder-volume[20664]: DEBUG cinder.coordination [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Lock "cinder-e256cf97-0392-44e9-9faa-0ef263cdafdc-delete_volume" acquired by "delete_volume" :: waited 0.033s {{(pid=20690) _synchronized /opt/stack/cinder/cinder/coordination.py:150}}
Oct 17 16:03:16 centstack.localdomain cinder-volume[20664]: DEBUG cinder.coordination [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Lock "cinder-nfs-e256cf97-0392-44e9-9faa-0ef263cdafdc" acquired by "delete_volume" :: waited 0.021s {{(pid=20690) _synchronized /opt/stack/cinder/cinder/coordination.py:150}}
Oct 17 16:03:16 centstack.localdomain cinder-volume[20664]: DEBUG cinder.volume.drivers.nfs [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Deleting volume e256cf97-0392-44e9-9faa-0ef263cdafdc, provider_location: localhost:/srv/nfs1 {{(pid=20690) delete_volume /opt/stack/cinder/cinder/volume/drivers/nfs.py:517}}
Oct 17 16:03:16 centstack.localdomain cinder-volume[20664]: DEBUG oslo_concurrency.processutils [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf rm -f /opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/volume-66774b90-01ed-4d34-8b9a-5ef586f01380 {{(pid=20690) execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:372}}
Oct 17 16:03:17 centstack.localdomain cinder-volume[20664]: DEBUG oslo_concurrency.processutils [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] CMD "sudo cinder-rootwrap /etc/cinder/rootwrap.conf rm -f /opt/stack/data/cinder/mnt/896fb15da6036b68a917322e72ebfe57/volume-66774b90-01ed-4d34-8b9a-5ef586f01380" returned: 0 in 0.639s {{(pid=20690) execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:409}}
Oct 17 16:03:17 centstack.localdomain cinder-volume[20664]: DEBUG cinder.coordination [req-4b404526-db5d-4bbf-ba20-e23f684c83b5 req-283ab440-82d4-4eff-8953-f732d18ce01f demo None] Lock "cinder-nfs-e256cf97-0392-44e9-9faa-0ef263cdafdc" released by "delete_volume" :: held 0.655s {{(pid=20690) _synchronized /opt/stack/cinder/cinder/coordination.py:162}}

The problem stops happening if we revert the fix for force-share.
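
For anyone trying to follow the sequence without a devstack, here is a minimal Python sketch of the file operations on a single share. It is an illustration under assumptions, not Cinder code: the function name and the placeholder file names volume-SOURCE and volume-TEMP are made up, and the rename step is inferred from the fix's commit message (it does not appear in the log snippets above). The copy and the final removal correspond to the dd and rm -f commands in the log.

```python
import os
import tempfile


def simulate_same_share_migration(share_dir, rename_to_original):
    """Replay the file operations on one share and return the files left over.

    Hypothetical helper for illustration only; not part of Cinder.
    """
    src = os.path.join(share_dir, "volume-SOURCE")  # stands in for volume-66774b90-...
    tmp = os.path.join(share_dir, "volume-TEMP")    # stands in for volume-e256cf97-...

    # 1. The source volume file exists before the retype.
    with open(src, "wb") as f:
        f.write(b"user data")

    # 2. The copy step (dd in the log) writes the data to the temporary file.
    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        fout.write(fin.read())

    # 3. Assumed step: the driver renames the new file to the original name.
    #    On the same share this replaces the source file.
    if rename_to_original:
        os.replace(tmp, src)

    # 4. Migration then deletes the source volume; the log shows rm -f run
    #    against the original file name.
    if os.path.exists(src):
        os.remove(src)

    return sorted(os.listdir(share_dir))


with tempfile.TemporaryDirectory() as share:
    print(simulate_same_share_migration(share, rename_to_original=True))   # []
with tempfile.TemporaryDirectory() as share:
    print(simulate_same_share_migration(share, rename_to_original=False))  # ['volume-TEMP']
```

With the rename, the only remaining copy of the data carries the original name and is removed by the delete, which matches the empty directory listing above. Without the rename, the temporary file survives.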

Eric Harney (eharney)
Changed in cinder:
importance: Undecided → Critical
assignee: nobody → Eric Harney (eharney)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.opendev.org/678278

Changed in cinder:
assignee: Eric Harney (eharney) → Alan Bishop (alan-bishop)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.opendev.org/678278
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=33a3a4bf6fe72b093702922e2e5f102d06249375
Submitter: Zuul
Branch: master

commit 33a3a4bf6fe72b093702922e2e5f102d06249375
Author: Alan Bishop <email address hidden>
Date: Fri Aug 23 10:44:08 2019 -0700

    Fix NFS volume retype with migrate

    When migrating a volume to an NFS backend, don't rename the associated
    volume file if the original source volume was on the same backend (same
    provider_location). Volume migration always deletes the source volume,
    so the volume file should *not* be renamed to the original.

    Closes-Bug: #1798468
    Change-Id: Iadf537521af93bbd28452a54e0e70e7ed18f606c
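
The rule in this commit message can be sketched roughly as follows. This is a hedged illustration, not the merged change: VolumeRecord and should_rename_migrated_file are hypothetical names, and the real driver code in the review linked above may be structured differently. The localhost:/srv/nfs1 export comes from the log in the bug description; localhost:/srv/nfs2 is an assumed second share.

```python
from dataclasses import dataclass


@dataclass
class VolumeRecord:
    """Hypothetical stand-in for a volume record; not Cinder's Volume object."""
    id: str
    provider_location: str


def should_rename_migrated_file(original: VolumeRecord, new: VolumeRecord) -> bool:
    """Return True only when renaming cannot clobber the source volume file.

    If both volumes share a provider_location, the files live in the same
    directory, and migration deletes the source volume afterwards, so the new
    file must keep its own name.
    """
    return original.provider_location != new.provider_location


same_share = should_rename_migrated_file(
    VolumeRecord("66774b90", "localhost:/srv/nfs1"),
    VolumeRecord("e256cf97", "localhost:/srv/nfs1"),
)
other_share = should_rename_migrated_file(
    VolumeRecord("66774b90", "localhost:/srv/nfs1"),
    VolumeRecord("e256cf97", "localhost:/srv/nfs2"),  # assumed second share
)
print(same_share, other_share)  # False True
```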

Changed in cinder:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/678936

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/679322

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/rocky)

Reviewed: https://review.opendev.org/679322
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=058700edc93a5815bedc8c7f53cccd672618019c
Submitter: Zuul
Branch: stable/rocky

commit 058700edc93a5815bedc8c7f53cccd672618019c
Author: Alan Bishop <email address hidden>
Date: Fri Aug 23 10:44:08 2019 -0700

    Fix NFS volume retype with migrate

    When migrating a volume to an NFS backend, don't rename the associated
    volume file if the original source volume was on the same backend (same
    provider_location). Volume migration always deletes the source volume,
    so the volume file should *not* be renamed to the original.

    Closes-Bug: #1798468
    Change-Id: Iadf537521af93bbd28452a54e0e70e7ed18f606c
    (cherry picked from commit 33a3a4bf6fe72b093702922e2e5f102d06249375)
    (cherry picked from commit 6f3bec3c64e27f7daac830e62de935b2d5ffa20f)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/679863

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/queens)

Reviewed: https://review.opendev.org/679863
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=f3639046cbdc5202340ffd7a691c21ce3b6cc1bf
Submitter: Zuul
Branch: stable/queens

commit f3639046cbdc5202340ffd7a691c21ce3b6cc1bf
Author: Alan Bishop <email address hidden>
Date: Fri Aug 23 10:44:08 2019 -0700

    Fix NFS volume retype with migrate

    When migrating a volume to an NFS backend, don't rename the associated
    volume file if the original source volume was on the same backend (same
    provider_location). Volume migration always deletes the source volume,
    so the volume file should *not* be renamed to the original.

    Closes-Bug: #1798468
    Change-Id: Iadf537521af93bbd28452a54e0e70e7ed18f606c
    (cherry picked from commit 33a3a4bf6fe72b093702922e2e5f102d06249375)
    (cherry picked from commit 6f3bec3c64e27f7daac830e62de935b2d5ffa20f)
    (cherry picked from commit 058700edc93a5815bedc8c7f53cccd672618019c)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.opendev.org/680237

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/pike)

Reviewed: https://review.opendev.org/680237
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=01d3d2ee242f265ce9639349a8353891284c8142
Submitter: Zuul
Branch: stable/pike

commit 01d3d2ee242f265ce9639349a8353891284c8142
Author: Alan Bishop <email address hidden>
Date: Fri Aug 23 10:44:08 2019 -0700

    Fix NFS volume retype with migrate

    When migrating a volume to an NFS backend, don't rename the associated
    volume file if the original source volume was on the same backend (same
    provider_location). Volume migration always deletes the source volume,
    so the volume file should *not* be renamed to the original.

    Closes-Bug: #1798468
    Change-Id: Iadf537521af93bbd28452a54e0e70e7ed18f606c
    (cherry picked from commit 33a3a4bf6fe72b093702922e2e5f102d06249375)
    (cherry picked from commit 6f3bec3c64e27f7daac830e62de935b2d5ffa20f)
    (cherry picked from commit 058700edc93a5815bedc8c7f53cccd672618019c)
    (cherry picked from commit f3639046cbdc5202340ffd7a691c21ce3b6cc1bf)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.opendev.org/681477

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 15.0.0.0rc1

This issue was fixed in the openstack/cinder 15.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 14.0.2

This issue was fixed in the openstack/cinder 14.0.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 13.0.7

This issue was fixed in the openstack/cinder 13.0.7 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 12.0.9

This issue was fixed in the openstack/cinder 12.0.9 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/ocata)

Reviewed: https://review.opendev.org/681477
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=936a4aef6e02b6f8d96cb3510ca94c2b17eb73cd
Submitter: Zuul
Branch: stable/ocata

commit 936a4aef6e02b6f8d96cb3510ca94c2b17eb73cd
Author: Alan Bishop <email address hidden>
Date: Fri Aug 23 10:44:08 2019 -0700

    Fix NFS volume retype with migrate

    When migrating a volume to an NFS backend, don't rename the associated
    volume file if the original source volume was on the same backend (same
    provider_location). Volume migration always deletes the source volume,
    so the volume file should *not* be renamed to the original.

    Closes-Bug: #1798468
    Change-Id: Iadf537521af93bbd28452a54e0e70e7ed18f606c
    (cherry picked from commit 33a3a4bf6fe72b093702922e2e5f102d06249375)
    (cherry picked from commit 6f3bec3c64e27f7daac830e62de935b2d5ffa20f)
    (cherry picked from commit 058700edc93a5815bedc8c7f53cccd672618019c)
    (cherry picked from commit f3639046cbdc5202340ffd7a691c21ce3b6cc1bf)
    (cherry picked from commit 01d3d2ee242f265ce9639349a8353891284c8142)

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (driverfixes/newton)

Fix proposed to branch: driverfixes/newton
Review: https://review.opendev.org/687126

OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (driverfixes/newton)

Change abandoned by Sean McGinnis (<email address hidden>) on branch: driverfixes/newton
Review: https://review.opendev.org/687126
Reason: driverfixes branches are now EOL and going away.
