Support chunked reads, uploads, etc

Bug #1513545 reported by Brian Curtin
This bug affects 1 person
Affects        Status     Importance  Assigned to  Milestone
OpenStack SDK  Confirmed  High        Unassigned

Bug Description

At least in object_store, and potentially elsewhere (images, maybe?), we may be reading huge files into memory and trying to send them all at once, e.g., when uploading something big to Swift. Rather than loading everything into memory and uploading it as one piece, we should read the file in chunks and upload them in sequence, which at least Swift supports.

Tags: objectstore
Ian Cordasco (icordasc) wrote :

So I'm not familiar with Swift's upload, but let me give you a few options if you're going to be trying out chunked or other streaming options with requests:

1. If you want to stream (not chunk) uploads that are multipart form-data uploads, there's the MultipartEncoder from the requests_toolbelt library which works very well for this use-case

2. If you want to stream a file, just provide the file handle to requests as the data parameter and it will stream the data for you (httplib actually does the streaming, and it has a problem that also applies to option 1, which I explain in [footnote#1]).

3. If you want to do a real chunked upload, you need to provide requests with a generator object. Requests will set the appropriate headers for you and then send the data in the chunks produced by iterating over the generator. If you wanted to upload in chunks of 16 MB, you could do something like:

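    import requests
    from oslo_utils import units

    # Size of each chunk sent to the server (16 MiB; oslo_utils.units
    # exposes binary units as Ki/Mi/Gi, there is no units.MB).
    upload_size = 16 * units.Mi

    def generate_file_content(file_object):
        # Yield the file in upload_size pieces until it is exhausted.
        while True:
            chunk = file_object.read(upload_size)
            if not chunk:
                break
            yield chunk

    # `path` and `url` are placeholders for the local file and the upload URL.
    with open(path, 'rb') as fd:
        requests.post(url, data=generate_file_content(fd))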
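
One way to see the difference between options 2 and 3 without sending anything over the network is to prepare the request and look at the headers requests chose. This is only a sketch: the URL and payload are placeholders, io.BytesIO stands in for a real file handle, and the `chunks` helper is a hypothetical name, not part of requests.

```python
import io

import requests

URL = "http://example.invalid/upload"  # placeholder; nothing is sent
payload = b"x" * 1024                   # stand-in for file content


def chunks(file_object, size=256):
    """Yield successive blocks of at most `size` bytes from file_object."""
    while True:
        block = file_object.read(size)
        if not block:
            break
        yield block


# Option 2: a file-like object has a discoverable length, so requests
# sets Content-Length and streams the body.
streamed = requests.Request("POST", URL, data=io.BytesIO(payload)).prepare()

# Option 3: a generator has no discoverable length, so requests falls
# back to Transfer-Encoding: chunked.
chunked = requests.Request(
    "POST", URL, data=chunks(io.BytesIO(payload))
).prepare()

print(streamed.headers.get("Content-Length"))
print(chunked.headers.get("Transfer-Encoding"))
```

Whether the server accepts a chunked request body is a separate question; the sketch only shows what requests puts on the wire.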

[footnote#1] When passed a file-like object, httplib reads 8192 *bytes* at a time, which is wildly inefficient and slow for very large files. There is no way to override that without wrapping your file object in something that forces larger read sizes.
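
The wrapping idea from the footnote could look roughly like the sketch below. `MinReadSize` is a hypothetical helper, not something requests or httplib provides, and it has not been tested against a live server; it assumes the sending loop simply forwards whatever read() returns.

```python
import io


class MinReadSize:
    """Wrap a file-like object so every sized read() asks for at least
    min_size bytes, regardless of how little the caller requested."""

    def __init__(self, file_object, min_size=1024 * 1024):
        self._file = file_object
        self._min_size = min_size

    def read(self, size=-1):
        # A missing or negative size means "read everything"; pass it through.
        if size is None or size < 0:
            return self._file.read()
        return self._file.read(max(size, self._min_size))

    def __getattr__(self, name):
        # Delegate everything else (fileno, tell, seek, ...) to the wrapped
        # file so length detection keeps working.
        return getattr(self._file, name)


wrapped = MinReadSize(io.BytesIO(b"a" * 100_000), min_size=65536)
first = wrapped.read(8192)  # the caller asks for only 8192 bytes
print(len(first))
```

Returning more bytes than requested breaks the usual read(n) contract, so a wrapper like this is only appropriate when the consumer is a send loop and not something that relies on exact read sizes.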

Brian Curtin (brian.curtin) wrote :
tags: added: objectstore
Changed in python-openstacksdk:
milestone: 1.0 → 1.1