OpenStack SDK

Support chunked reads, uploads, etc

Bug #1513545 reported by Brian Curtin on 2015-11-05

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
OpenStack SDK	Confirmed	High	Unassigned	OpenStack SDK 1.1

Bug Description

At least in object_store, but potentially elsewhere (images, maybe?), we may be reading huge files and trying to send them all at once, e.g., uploading something big to Swift. Rather than loading everything into memory and uploading it as one piece, we should read the file in chunks and upload them in sequence, which Swift at least supports.

Ian Cordasco (icordasc) wrote on 2015-11-05:

So I'm not familiar with Swift's upload, but let me give you a few options if you're going to be trying out chunked or other streaming options with requests:

1. If you want to stream (not chunk) multipart form-data uploads, the MultipartEncoder from the requests_toolbelt library works very well for this use case.

2. If you want to stream a file, just pass the file handle to requests as the data parameter and it will stream the data for you (httplib actually does the streaming, and it has a problem that also applies to option 1; see [footnote#1]).

3. If you want to do a real chunked upload, you need to give requests a generator object. Requests will set the appropriate headers for you and then send the data in the chunks it gets by iterating over the generator. If you wanted to upload in chunks of 16 MB, you could do something like:

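import requests
from oslo_utils import units

# 16 MiB per chunk; oslo_utils.units names the binary multiples Ki, Mi, Gi.
upload_size = 16 * units.Mi

def generate_file_content(file_object):
    # Yield the file in upload_size pieces until it is exhausted.
    while True:
        chunk = file_object.read(upload_size)
        if not chunk:
            break
        yield chunk

# path and url are placeholders for the local file and the upload target.
with open(path, 'rb') as fd:
    requests.post(url, data=generate_file_content(fd))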

[footnote#1] When passed a file-like object, httplib reads 8192 *bytes* at a time, which is wildly inefficient and slow for very large files. There is no way to override that short of wrapping your file object in something that forces larger read sizes.
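
A minimal sketch of the kind of wrapper the footnote describes, assuming the consumer only calls read() and sends whatever comes back. LargeReadWrapper and min_read are hypothetical names for illustration, not part of requests or httplib. Returning more bytes than the caller asked for breaks the usual read() contract, so this is only reasonable for a consumer that does not rely on the requested size.

```python
import io


class LargeReadWrapper:
    """Wrap a file object so each sized read() asks for at least min_read bytes."""

    def __init__(self, file_object, min_read=1024 * 1024):
        self._file = file_object
        self._min_read = min_read

    def read(self, size=-1):
        # A missing or negative size means "read everything"; pass it through.
        if size is None or size < 0:
            return self._file.read()
        # Otherwise ignore small requests and read at least min_read bytes.
        return self._file.read(max(size, self._min_read))

    def __getattr__(self, name):
        # Delegate everything else (seek, tell, fileno, ...) to the wrapped file.
        return getattr(self._file, name)


# Demonstration with an in-memory file: the caller asks for 8192 bytes
# but receives 1 MiB pieces until the data runs out.
wrapped = LargeReadWrapper(io.BytesIO(b"x" * (2 * 1024 * 1024 + 5)))
read_sizes = []
while True:
    block = wrapped.read(8192)
    if not block:
        break
    read_sizes.append(len(block))
```

In real use the wrapped object would be an open file passed as the data parameter, as in option 2 above; whether that improves throughput depends on the httplib version in use.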

Brian Curtin (brian.curtin) wrote on 2015-11-05:

We need to use http://docs.openstack.org/developer/swift/overview_large_objects.html
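
To make the large-object approach concrete, here is a rough sketch of how a file could be split for a Swift dynamic large object: each segment is uploaded as its own object under a common prefix, and a zero-byte manifest object carries an X-Object-Manifest header pointing at that prefix. The helper names (iter_segments, manifest_headers), the zero-padded naming scheme, and the segment container are assumptions for illustration, not the SDK's API. The actual PUT requests are left out.

```python
import io


def iter_segments(file_object, object_name, segment_size):
    """Yield (segment_name, data) pairs, reading segment_size bytes at a time."""
    index = 0
    while True:
        data = file_object.read(segment_size)
        if not data:
            break
        # Zero-padded suffixes keep the segments in upload order when sorted by name.
        yield "%s/%08d" % (object_name, index), data
        index += 1


def manifest_headers(segment_container, object_name):
    """Headers for the zero-byte manifest object that ties the segments together."""
    return {"X-Object-Manifest": "%s/%s/" % (segment_container, object_name)}


# Demonstration with an in-memory file and a tiny segment size.
segments = list(iter_segments(io.BytesIO(b"abcdefghij"), "big", 4))
headers = manifest_headers("segs", "big")
```

Only one segment is held in memory at a time, so the segment size bounds memory use; each segment could in turn be sent with the chunked generator approach from the earlier comment.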

Brian Curtin (brian.curtin) on 2015-11-06

tags:

added: objectstore

Brian Curtin (brian.curtin) on 2015-11-06

Changed in python-openstacksdk:
milestone:	1.0 → 1.1
