Cannot create boot-from-volume images concurrently

Bug #1840712 reported by David Hill
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
Cinder    Fix Released   Undecided    Eric Harney

Bug Description

Boot-from-volume images cannot be created concurrently when glance uses the cinder backend to store images on NFS storage, because of the way the temporary snapshot name is generated:

            temp_snapshot = Snapshot(volume_name=volume_name,
                                     volume_size=src_vref.size,
                                     name='clone-snap-%s' % src_vref.id,
                                     volume_id=src_vref.id,
                                     id='tmp-snap-%s' % src_vref.id,
                                     volume=src_vref)

The above behavior prevents concurrency: every clone of the same source volume gets the same temporary snapshot id.

            temp_snapshot = Snapshot(volume_name=volume_name,
                                     volume_size=src_vref.size,
                                     name='clone-snap-%s' % src_vref.id,
                                     volume_id=src_vref.id,
                                     id='tmp-snap-%s' % volume.id,
                                     volume=src_vref)

The above, which builds the snapshot id from the destination volume id (volume.id), would be better.
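
To illustrate the collision, here is a minimal sketch. It is not Cinder code: the helper `temp_snapshot_id` and the sample ids are hypothetical, and only the 'tmp-snap-%s' format string comes from the snippets above.

```python
import uuid


def temp_snapshot_id(volume_id):
    """Hypothetical helper: format the id the way the snippets above do."""
    return 'tmp-snap-%s' % volume_id


# One image volume (the source), cloned into two new volumes at once.
source_volume_id = str(uuid.uuid4())
dest_volume_ids = [str(uuid.uuid4()), str(uuid.uuid4())]

# Keyed on the source id: both clones compute the same snapshot id.
by_source = {temp_snapshot_id(source_volume_id) for _ in dest_volume_ids}

# Keyed on the destination id: each clone computes its own snapshot id.
by_dest = {temp_snapshot_id(dest_id) for dest_id in dest_volume_ids}

print(len(by_source))  # distinct ids when keyed on the source volume
print(len(by_dest))    # distinct ids when keyed on the destination volume
```

With the source id, two concurrent clones of the same image volume share a single temporary snapshot id; with the destination id, each clone has a distinct one.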

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.opendev.org/677292

Changed in cinder:
assignee: nobody → David Hill (david-hill-ubisoft)
status: New → In Progress
Changed in cinder:
assignee: David Hill (david-hill-ubisoft) → Takashi Kajinami (kajinamit)
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/686386

Changed in cinder:
assignee: Takashi Kajinami (kajinamit) → David Hill (david-hill-ubisoft)
Changed in cinder:
assignee: David Hill (david-hill-ubisoft) → Eric Harney (eharney)
OpenStack Infra (hudson-openstack) wrote : Related fix merged to cinder (master)

Reviewed: https://review.opendev.org/693186
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=7cc2e402f95371fa8d94295016127eb3b716508f
Submitter: Zuul
Branch: master

commit 7cc2e402f95371fa8d94295016127eb3b716508f
Author: Eric Harney <email address hidden>
Date: Wed Nov 6 09:22:27 2019 -0500

    Fix remotefs clone volume locking

    Any method in the remotefs/nfs code that manipulates
    the qcow2 snapshot chain must be run separately
    from other methods that may touch this snapshot chain.

    This code intended to do this with a lock on the
    volume id, but it unintentionally locked only on
    the destination volume id rather than the source
    volume id where the snapshots are.

    This causes concurrent clone operations to fail in
    the NFS driver. This patch fixes this in the
    RemoteFSSnapDriverDistributed class which fixes the
    NFS driver and a handful of others.

    A follow up patch should be applied for the
    RemoteFSSnapDriver class with a similar change, but as
    that class is only used by one driver (which I can't
    test), this patch only adds a TODO for that change.

    Related-Bug: #1840712
    Related-Bug: #1852449
    Closes-Bug: #1851512

    Change-Id: I64e041feaeb50c95808da46a34f334a9985018a8
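
The locking idea described in this commit message can be sketched as follows. This is not the Cinder implementation: `lock_for`, `clone_volume`, and the ids are hypothetical, and a plain in-process `threading.Lock` stands in for whatever locking facility the driver actually uses.

```python
import threading
from collections import defaultdict

# Hypothetical per-volume-id lock registry (illustration only).
_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()


def lock_for(volume_id):
    """Return the lock associated with a volume id, creating it if needed."""
    with _registry_guard:
        return _locks[volume_id]


def clone_volume(src_id, dest_id, log):
    # Lock on the SOURCE id: the snapshot chain being manipulated belongs
    # to the source volume, so clones of one source must not overlap.
    with lock_for(src_id):
        log.append(('start', dest_id))
        # ... create temp snapshot, copy data, delete temp snapshot ...
        log.append(('end', dest_id))


log = []
threads = [
    threading.Thread(target=clone_volume, args=('src-1', 'dest-%d' % i, log))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each 'start' entry is directly followed by the matching 'end' entry,
# i.e. no two clones of 'src-1' ran their critical section at once.
serialized = all(
    log[i][0] == 'start' and log[i + 1] == ('end', log[i][1])
    for i in range(0, len(log), 2)
)
print(serialized)
```

Had `clone_volume` taken `lock_for(dest_id)` instead, each thread would hold a different lock and nothing would prevent the four clones from touching the source's snapshot chain at the same time.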

OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (master)

Change abandoned by Brian Rosmaita (<email address hidden>) on branch: master
Review: https://review.opendev.org/686386
Reason: This change is implemented by https://review.opendev.org/#/c/677292/ (I think this set of patches arose from a bad merge that removed the original Change-Id)

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.opendev.org/677292
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=f3ed9d436bdac7368068853991e9314fbccc0a00
Submitter: Zuul
Branch: master

commit f3ed9d436bdac7368068853991e9314fbccc0a00
Author: David Hill <email address hidden>
Date: Mon Aug 19 16:28:45 2019 -0400

    RemoteFS: Use dest vol id instead of source id in snapshot temp name

    Use the destination volume id instead of the source volume id
    in the temporary snapshot file name. This is likely not strictly
    needed after Change I64e041fe, which ensures that multiple clone
    volume operations won't run simultaneously from the same source
    volume, but is still a good idea to ensure that there is less
    that can go wrong in a failure scenario.

    Change-Id: I5bd185d04dbda673a5882d61aaab7acdd99b74a6
    Related-Bug: #1851512
    Closes-Bug: #1840712

Changed in cinder:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to cinder (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/694653

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/694654

OpenStack Infra (hudson-openstack) wrote : Related fix merged to cinder (stable/train)

Reviewed: https://review.opendev.org/694653
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=408155b8905ad23035ce06ec5aee8532ab9642e9
Submitter: Zuul
Branch: stable/train

commit 408155b8905ad23035ce06ec5aee8532ab9642e9
Author: Eric Harney <email address hidden>
Date: Wed Nov 6 09:22:27 2019 -0500

    Fix remotefs clone volume locking

    Any method in the remotefs/nfs code that manipulates
    the qcow2 snapshot chain must be run separately
    from other methods that may touch this snapshot chain.

    This code intended to do this with a lock on the
    volume id, but it unintentionally locked only on
    the destination volume id rather than the source
    volume id where the snapshots are.

    This causes concurrent clone operations to fail in
    the NFS driver. This patch fixes this in the
    RemoteFSSnapDriverDistributed class which fixes the
    NFS driver and a handful of others.

    A follow up patch should be applied for the
    RemoteFSSnapDriver class with a similar change, but as
    that class is only used by one driver (which I can't
    test), this patch only adds a TODO for that change.

    Related-Bug: #1840712
    Related-Bug: #1852449
    Closes-Bug: #1851512

    Change-Id: I64e041feaeb50c95808da46a34f334a9985018a8
    (cherry picked from commit 7cc2e402f95371fa8d94295016127eb3b716508f)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/train)

Reviewed: https://review.opendev.org/694654
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=2f230b6bd75c73eb7507f4252e5801b6cfb3b78c
Submitter: Zuul
Branch: stable/train

commit 2f230b6bd75c73eb7507f4252e5801b6cfb3b78c
Author: David Hill <email address hidden>
Date: Mon Aug 19 16:28:45 2019 -0400

    RemoteFS: Use dest vol id instead of source id in snapshot temp name

    Use the destination volume id instead of the source volume id
    in the temporary snapshot file name. This is likely not strictly
    needed after Change I64e041fe, which ensures that multiple clone
    volume operations won't run simultaneously from the same source
    volume, but is still a good idea to ensure that there is less
    that can go wrong in a failure scenario.

    Change-Id: I5bd185d04dbda673a5882d61aaab7acdd99b74a6
    Related-Bug: #1851512
    Closes-Bug: #1840712
    (cherry picked from commit f3ed9d436bdac7368068853991e9314fbccc0a00)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 15.0.1

This issue was fixed in the openstack/cinder 15.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 16.0.0.0b1

This issue was fixed in the openstack/cinder 16.0.0.0b1 development milestone.
