Uploads have terrible performance

Bug #1671621 reported by Tim Burke
Affects: python-swiftclient
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Looks like we've got some 40+% penalty hiding somewhere?

  $ dd if=/dev/zero of=big_file bs=1M count=1024
  1024+0 records in
  1024+0 records out
  1073741824 bytes (1.1 GB) copied, 4.25035 s, 253 MB/s

  $ `swift auth`

  $ time swift upload c big_file
  big_file

  real 0m35.958s
  user 0m0.339s
  sys 0m8.279s

  $ time curl $OS_STORAGE_URL/c/big_file2 -T big_file -H "x-auth-token: $OS_AUTH_TOKEN"

  real 0m25.114s
  user 0m0.240s
  sys 0m2.999s

Even concurrent uploads don't get us back up to curl's single-upload speed:

  $ time swift upload c big_file -S 100M
  big_file segment 3
  big_file segment 0
  big_file segment 2
  big_file segment 1
  big_file segment 9
  big_file segment 6
  big_file segment 8
  big_file segment 4
  big_file segment 7
  big_file segment 5
  big_file segment 10
  big_file

  real 0m27.509s
  user 0m0.583s
  sys 0m7.193s

This seems to have to do with Python's 8k blocksize when httplib is handed a file-like object [1], which we force requests to do through our use of LengthWrapper [2]. The alternative seemed to involve requests sending both `Content-Length` *and* `Transfer-Encoding: chunked` headers? Hacking up my standard library to use 1MB blocks instead gets us much closer to curl speeds:

  $ time swift upload c big_file
  big_file

  real 0m25.866s
  user 0m0.864s
  sys 0m3.481s

...and concurrent uploads make it better still:

  $ time swift upload c big_file -S 100M
  big_file segment 3
  big_file segment 1
  big_file segment 0
  big_file segment 2
  big_file segment 5
  big_file segment 7
  big_file segment 9
  big_file segment 4
  big_file segment 8
  big_file segment 6
  big_file segment 10
  big_file

  real 0m22.209s
  user 0m1.013s
  sys 0m3.245s
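
To make the block-size effect concrete, here is a minimal sketch of the kind of read-then-send loop that httplib runs when handed a file-like object [1]. The function name `send_in_blocks` and the in-memory stand-ins for the file and the socket are assumptions for illustration only; this is not the httplib code and it measures no real I/O, it only counts how many reads a given block size implies.

```python
import io


def send_in_blocks(source, sink, blocksize=8192):
    """Copy source to sink one block at a time.

    Returns the number of non-empty reads that were needed.
    (Hypothetical helper; it mirrors the shape of a read/send loop.)
    """
    reads = 0
    block = source.read(blocksize)
    while block:
        sink.write(block)  # stands in for sock.sendall(block)
        reads += 1
        block = source.read(blocksize)
    return reads


payload = b"\0" * (1 << 20)  # 1 MiB stand-in for big_file

reads_8k = send_in_blocks(io.BytesIO(payload), io.BytesIO())
reads_1m = send_in_blocks(io.BytesIO(payload), io.BytesIO(), blocksize=1 << 20)

print(reads_8k, reads_1m)  # 128 1
```

At 8k per block, every MiB uploaded costs 128 read/send round trips through the interpreter instead of one, which would scale to roughly 131,072 iterations for the 1 GiB test file above.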

[1] https://github.com/python/cpython/blob/2.7/Lib/httplib.py#L850
[2] https://github.com/openstack/python-swiftclient/blob/3.3.0/swiftclient/utils.py#L257-L261

OpenStack Infra (hudson-openstack) wrote : Fix merged to python-swiftclient (master)

Reviewed: https://review.openstack.org/449771
Committed: https://git.openstack.org/cgit/openstack/python-swiftclient/commit/?id=638d7c789cf3ccab61bf6af6fcab6e6d79b9e0a4
Submitter: Jenkins
Branch: master

commit 638d7c789cf3ccab61bf6af6fcab6e6d79b9e0a4
Author: Tim Burke <email address hidden>
Date: Wed Mar 15 18:05:38 2017 +0000

    Buffer reads from disk

    Otherwise, Python defaults to 8k reads which seems kinda terrible.

    Change-Id: I3160626e947083af487fd1c3cb0aa6a62646527b
    Closes-Bug: #1671621
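
The commit message does not show the change itself. As a rough illustration of the general idea only (not the actual patch), the sketch below puts a buffering layer between a raw stream and a consumer that reads in 8k blocks, then counts how often the raw stream is touched. The class `CountingRaw`, the helper `drain`, and the 64 KiB buffer size are all assumptions made for this example.

```python
import io


class CountingRaw(io.RawIOBase):
    """Unbuffered in-memory stream that counts how often it is read."""

    def __init__(self, payload):
        super().__init__()
        self._src = io.BytesIO(payload)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        self.calls += 1  # one call here stands in for one read from disk
        return self._src.readinto(b)


def drain(reader, blocksize=8192):
    """Read the stream to EOF in 8k blocks, the way the send loop would."""
    total = 0
    while True:
        block = reader.read(blocksize)
        if not block:
            return total
        total += len(block)


payload = b"\0" * (1 << 20)  # 1 MiB stand-in for the uploaded file

# Consumer reads straight from the raw stream.
unbuffered = CountingRaw(payload)
drain(unbuffered)

# Same consumer, but with a 64 KiB buffer in front of the raw stream.
raw = CountingRaw(payload)
copied = drain(io.BufferedReader(raw, buffer_size=1 << 16))

print(unbuffered.calls, raw.calls)
```

The consumer still asks for 8k at a time in both runs; with the buffer in place, most of those requests are served from memory and the raw stream is read far less often.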

Changed in python-swiftclient:
status: New → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/python-swiftclient 3.4.0

This issue was fixed in the openstack/python-swiftclient 3.4.0 release.
