cinder-volume can create truncated volumes when masking glanceclient errors

Bug #1799221 reported by Francois Deppierraz
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Cinder
Fix Released
High
Francois Deppierraz
Pike
Fix Committed
Undecided
Brian Rosmaita
Queens
Fix Committed
Undecided
Brian Rosmaita
Rocky
Fix Released
Undecided
iain MacDonnell
OpenStack Security Advisory
Won't Fix
Undecided
Unassigned

Bug Description

The following code from cinder/image/image_utils.py doesn't correctly handle cases where image_service.download() raises an IOError due to something else than disk space issues (ie. network connection timeouts).

            try:
                image_service.download(context, image_id, image_file)
            except IOError as e:
                if e.errno == errno.ENOSPC:
                    params = {'path': os.path.dirname(path),
                              'image': image_id}
                    reason = _("No space left in image_conversion_dir "
                               "path (%(path)s) while fetching "
                               "image %(image)s.") % params
                    LOG.exception(reason)
                    raise exception.ImageTooBig(image_id=image_id,
                                                reason=reason)

Using openstack-ansible for queens, we experienced this issue because the storage system behind /var/lib/cinder/conversion was too slow and glance connections sometimes failed because of haproxy timeouts. When this happened, a truncated volume in available state was created while we would have expected it to be in error state instead.

The solution looks simple: raise the initial exception when e.errno != errno.ENOSPC.

Changed in cinder:
assignee: nobody → Francois Deppierraz (francois-ctrlaltdel)
description: updated
Changed in cinder:
status: New → In Progress
Revision history for this message
Francois Deppierraz (francois-ctrlaltdel) wrote :
Jay Bryant (jsbryant)
Changed in cinder:
importance: Undecided → High
Revision history for this message
iain MacDonnell (imacdonn) wrote :

This also causes checksum/hash verification failures to get missed, as glanceclient throws a IOError with EPIPE in that case, so you can get a volume with corrupted image content and not know it....

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to cinder (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/625135

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to cinder (master)

Reviewed: https://review.openstack.org/625135
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=80fdc0a71b55659d92efdda826807d60d8bd4ab9
Submitter: Zuul
Branch: master

commit 80fdc0a71b55659d92efdda826807d60d8bd4ab9
Author: imacdonn <email address hidden>
Date: Fri Dec 14 00:54:32 2018 +0000

    Set message property in ImageDownloadFailed

    Set the correct property, "message". Looks like "_msg_fmt" may have
    been erroneously carried over from some other project (Ironic?)

    Change-Id: I2aa56da73660794c6dedcbb8a66e84bcec511a9c
    Closes-Bug: #1808443
    Related-Bug: #1799221

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to cinder (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.openstack.org/625148

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to cinder (stable/rocky)

Reviewed: https://review.openstack.org/625148
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=9c696ce29f63275ec229a943a9b9bcf63f688f0c
Submitter: Zuul
Branch: stable/rocky

commit 9c696ce29f63275ec229a943a9b9bcf63f688f0c
Author: imacdonn <email address hidden>
Date: Fri Dec 14 00:54:32 2018 +0000

    Set message property in ImageDownloadFailed

    Set the correct property, "message". Looks like "_msg_fmt" may have
    been erroneously carried over from some other project (Ironic?)

    Change-Id: I2aa56da73660794c6dedcbb8a66e84bcec511a9c
    Closes-Bug: #1808443
    Related-Bug: #1799221
    (cherry picked from commit 80fdc0a71b55659d92efdda826807d60d8bd4ab9)

tags: added: in-stable-rocky
Changed in cinder:
assignee: Francois Deppierraz (francois-ctrlaltdel) → Brian Rosmaita (brian-rosmaita)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/612393
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=864c074ff15db601c896dadc2842ae703861c0dd
Submitter: Zuul
Branch: master

commit 864c074ff15db601c896dadc2842ae703861c0dd
Author: Francois Deppierraz <email address hidden>
Date: Mon Oct 22 15:33:25 2018 +0200

    cinder-volume: Stop masking IOError different than ENOSPC

    When glanceclient raises an IOError with a different errno than ENOSPC,
    cinder-volume silently masked it and continued its volume creation
    process. The result was volumes with invalid content being successfuly
    created.

    With the patch, an ImageDownloadFailed exception is raised in this case,
    which makes the volume creation process fail and gives enough
    information to operators for troubleshooting.

    Change-Id: Ic011fe30b4840e5098db1a594ea276ec98768bff
    Closes-Bug: #1799221

Changed in cinder:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/629091

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

Note to security team: this defect was introduced in Pike. See comment #2 for security implications. Not sure if this is worth a note or not.

Changed in cinder:
assignee: Brian Rosmaita (brian-rosmaita) → Francois Deppierraz (francois-ctrlaltdel)
milestone: none → stein-2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/rocky)

Reviewed: https://review.openstack.org/629091
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=bf89f76fb1b7a52299c17467106018eae01608e8
Submitter: Zuul
Branch: stable/rocky

commit bf89f76fb1b7a52299c17467106018eae01608e8
Author: Francois Deppierraz <email address hidden>
Date: Mon Oct 22 15:33:25 2018 +0200

    cinder-volume: Stop masking IOError different than ENOSPC

    When glanceclient raises an IOError with a different errno than ENOSPC,
    cinder-volume silently masked it and continued its volume creation
    process. The result was volumes with invalid content being successfuly
    created.

    With the patch, an ImageDownloadFailed exception is raised in this case,
    which makes the volume creation process fail and gives enough
    information to operators for troubleshooting.

    Change-Id: Ic011fe30b4840e5098db1a594ea276ec98768bff
    Closes-Bug: #1799221
    (cherry picked from commit 864c074ff15db601c896dadc2842ae703861c0dd)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/629463

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

The backport to Queens is complicated by the fact that the exception used in this fix doesn't exist in Queens.

It was introduced in Stein and backported to Rocky as change If7c22ac4516f8c2a6ccd8bf6b6ed98409312b138 to fix Bug #1798147.

Change If7c22ac4516f8c2a6ccd8bf6b6ed98409312b138 didn't contain any tests, so I figured it would be a good idea to add one to increase the probablity that the stable cores would approve a backport to Queens of the fix for Bug #1798147 (which would give us the exception we need in Queens). While adding the test, I came across Bug #1811184.

Unfortunately, the code fixing Bug #1798147 also introduced Bug #1808443, which has already been fixed by change I2aa56da73660794c6dedcbb8a66e84bcec511a9c in Stein (and backported to Rocky).

What I think we should do to get this bug (i.e., the one where this comment is being displayed) fixed in Queens is:

1. Fix Bug #1811184 in master (change I6d8dedfd056add3414f8f4bf7f7279eae4763286) and backport it to Rocky.

2. From Rocky, cherry pick to Queens:
   If7c22ac4516f8c2a6ccd8bf6b6ed98409312b138 (fixes Bug #1798147)
   I2aa56da73660794c6dedcbb8a66e84bcec511a9c (fixes Bug #1808443)
   I6d8dedfd056add3414f8f4bf7f7279eae4763286 (fixes Bug #1811184)
   Ic011fe30b4840e5098db1a594ea276ec98768bff (fixes Bug #1799221, i.e., this very bug)

3. Squash them all together in a single backport patch to Queens.

And then hope it's a simple backport from Queens to Pike!

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

I should add that even though there are 4 patches, we're talking a very few lines of code (it's just that a surprising number of bugs have been packed into a very small space).

Revision history for this message
Jay Bryant (jsbryant) wrote :

Brian,

I think you plan makes sense. This is a notable enough problem to try to get the fix backported. It is good that it is a limited number of lines touched. If you can clearly articulate in the commit message explaining how the backport was done, then I think it will be able to be merged.

Jay

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

I had this as affecting "ossn", but am thinking that was incorrect. Anyway, this defect was introduced in Pike. See comment #2 for security implications. Not sure if this is worth a note or not.

affects: ossn → ossa
Nick Tait (nickthetait)
description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/queens)
Download full text (3.8 KiB)

Reviewed: https://review.openstack.org/629463
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=937af5be0e7c17b34860e25cc434a219d7143387
Submitter: Zuul
Branch: stable/queens

commit 937af5be0e7c17b34860e25cc434a219d7143387
Author: Francois Deppierraz <email address hidden>
Date: Mon Oct 22 15:33:25 2018 +0200

    cinder-volume: Stop masking IOError different than ENOSPC

    When glanceclient raises an IOError with a different errno than ENOSPC,
    cinder-volume silently masked it and continued its volume creation
    process. The result was volumes with invalid content being successfuly
    created.

    With the patch, an ImageDownloadFailed exception is raised in this case,
    which makes the volume creation process fail and gives enough
    information to operators for troubleshooting.

    As explained in detail below, this patch is a squash of four cherry
    picks to fix Bug #1799221. The cherry-picks are being squashed
    instead of done separately per normal cinder practice because they
    are dependent; this will make sure that one of them isn't further
    backported without the supporting patches.

    Here's the relationship between the four cherry-picks:
    (1) The purpose of this backport is to fix Bug #1799221, which was
        introduced in Pike. It is fixed in Stein and Rocky by
        Ic011fe30b4840e5098db1a594ea276ec98768bff
    (2) That change requires an exception introduced in Stein and backported
        to Rocky by If7c22ac4516f8c2a6ccd8bf6b6ed98409312b138 to fix Bug
        #1798147 (which defect is also present in Queens and Pike)
    (3) The change in (2) introduced Bug #1808443 which was fixed in Stein
        and Rocky by I6d8dedfd056add3414f8f4bf7f7279eae4763286
    (4) The change in (2) also introduced Bug #1811184, which is fixed by
        I6d8dedfd056add3414f8f4bf7f7279eae4763286 in Stein and Rocky, and
        which adds a unit test for Bug #1798147.
    In short, in order to backport the fix for (1), we need to backport (2),
    but in order to backport (2) we need to follow up immediately with
    backports of (3) and (4) to fix the defects (2) introduces.

    The attentive reader will note that this patch smuggles in the fix for
    Bug #1798147. We could have left this out, but it's a very small
    isolated change, the defect is present in Queens and Pike (remember that
    Bug #1799221, the subject of this patch, was introduced in Pike), and it
    has a unit test (see (4), above). Finally, leaving out the fix for
    Bug #1798147 and backporting only the exception would still require
    backporting the fixes for Bug #1808443 and Bug #1811184, so it really
    would not simplify this patch.

    To summarize what's being included here:
    commit changeId fixed bug
    805368e If7c22ac4516f8c2a6ccd8bf6b6ed98409312b138 Bug #1798147
    9c696ce I2aa56da73660794c6dedcbb8a66e84bcec511a9c Bug #1808443
    844b627 I6d8dedfd056add3414f8f4bf7f7279eae4763286 Bug #1811184
    bf89f76 Ic011fe30b4840e5098db1a594ea276ec98768bff Bug #1799221

    Change-Id: Ic011fe30b4840e5098db1a594ea276ec98768bf...

Read more...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/633015

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 12.0.5

This issue was fixed in the openstack/cinder 12.0.5 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 13.0.3

This issue was fixed in the openstack/cinder 13.0.3 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/pike)
Download full text (3.9 KiB)

Reviewed: https://review.openstack.org/633015
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=0551f0a9ff7b787fcb3e1b686b83e25f99cad874
Submitter: Zuul
Branch: stable/pike

commit 0551f0a9ff7b787fcb3e1b686b83e25f99cad874
Author: Francois Deppierraz <email address hidden>
Date: Mon Oct 22 15:33:25 2018 +0200

    cinder-volume: Stop masking IOError different than ENOSPC

    When glanceclient raises an IOError with a different errno than ENOSPC,
    cinder-volume silently masked it and continued its volume creation
    process. The result was volumes with invalid content being successfuly
    created.

    With the patch, an ImageDownloadFailed exception is raised in this case,
    which makes the volume creation process fail and gives enough
    information to operators for troubleshooting.

    As explained in detail below, this patch is a squash of four cherry
    picks to fix Bug #1799221. The cherry-picks are being squashed
    instead of done separately per normal cinder practice because they
    are dependent; this will make sure that one of them isn't further
    backported without the supporting patches.

    Here's the relationship between the four cherry-picks:
    (1) The purpose of this backport is to fix Bug #1799221, which was
        introduced in Pike. It is fixed in Stein and Rocky by
        Ic011fe30b4840e5098db1a594ea276ec98768bff
    (2) That change requires an exception introduced in Stein and backported
        to Rocky by If7c22ac4516f8c2a6ccd8bf6b6ed98409312b138 to fix Bug
        #1798147 (which defect is also present in Queens and Pike)
    (3) The change in (2) introduced Bug #1808443 which was fixed in Stein
        and Rocky by I6d8dedfd056add3414f8f4bf7f7279eae4763286
    (4) The change in (2) also introduced Bug #1811184, which is fixed by
        I6d8dedfd056add3414f8f4bf7f7279eae4763286 in Stein and Rocky, and
        which adds a unit test for Bug #1798147.
    In short, in order to backport the fix for (1), we need to backport (2),
    but in order to backport (2) we need to follow up immediately with
    backports of (3) and (4) to fix the defects (2) introduces.

    The attentive reader will note that this patch smuggles in the fix for
    Bug #1798147. We could have left this out, but it's a very small
    isolated change, the defect is present in Queens and Pike (remember that
    Bug #1799221, the subject of this patch, was introduced in Pike), and it
    has a unit test (see (4), above). Finally, leaving out the fix for
    Bug #1798147 and backporting only the exception would still require
    backporting the fixes for Bug #1808443 and Bug #1811184, so it really
    would not simplify this patch.

    To summarize what's being included here:
    commit changeId fixed bug
    805368e If7c22ac4516f8c2a6ccd8bf6b6ed98409312b138 Bug #1798147
    9c696ce I2aa56da73660794c6dedcbb8a66e84bcec511a9c Bug #1808443
    844b627 I6d8dedfd056add3414f8f4bf7f7279eae4763286 Bug #1811184
    bf89f76 Ic011fe30b4840e5098db1a594ea276ec98768bff Bug #1799221

    Change-Id: Ic011fe30b4840e5098db1a594ea276ec98768bff
...

Read more...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 11.2.0

This issue was fixed in the openstack/cinder 11.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 14.0.0.0rc1

This issue was fixed in the openstack/cinder 14.0.0.0rc1 release candidate.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

I don't think this needs an OSSA. It's kind of old, plus the outcome was a corrupt download, not a leakage of content or anything like that.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Per Brian's comment #23 this sounds like a security hardening opportunity, not an exploitable vulnerability.

Changed in ossa:
status: New → Won't Fix
tags: added: security
Revision history for this message
Zhang Xu (mchiwxzn) wrote :

You can delete some unused image in the glance-registry and check if it usable

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.