Bug #1249181: “500 Error when getting /recon/diskusage” (OpenStack Object Storage, swift)

Kun Huang (academicgareth) wrote on 2013-11-12:

#1

Could you provide more traceback details from the log? It seems os.lstat(<your-mount-point>) raises an unexpected error.

WangChao (chaowsh) wrote on 2013-11-12:

#2

Yes, it really is os.lstat() (in the method swift.common.utils.ismount(path)) that raises the I/O exception, and the method does not catch the errno.EIO error.
In swift 1.9.0 I did the same thing and it worked OK, but in Havana the recon daemon crashes.
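
To illustrate the failure mode described in this comment, here is a minimal sketch of an lstat-based mount check that tolerates only a missing path. It is not Swift's actual implementation: `ismount_strict`, the injectable `lstat` parameter, and `failing_lstat` are hypothetical names used for illustration.

```python
import errno
import os


def ismount_strict(path, lstat=os.lstat):
    """Hypothetical mount check that only swallows ENOENT."""
    try:
        s1 = lstat(path)
    except OSError as err:
        if err.errno == errno.ENOENT:
            return False  # a missing path is simply "not mounted"
        raise  # EIO and any other error propagate to the caller
    s2 = lstat(os.path.join(path, '..'))
    # A mount point sits on a different device than its parent,
    # or is the same inode as its parent (the root directory).
    return s1.st_dev != s2.st_dev or s1.st_ino == s2.st_ino


def failing_lstat(path):
    """Stand-in for os.lstat on a disk whose filesystem has shut down."""
    raise OSError(errno.EIO, os.strerror(errno.EIO), path)


if __name__ == '__main__':
    try:
        ismount_strict('/srv/node/sdd', lstat=failing_lstat)
    except OSError as err:
        print('mount check raised:', err)
```

Any caller that does not wrap such a check in its own try/except would see the OSError escape, which is consistent with the 500 response reported here.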

Kun Huang (academicgareth) wrote on 2013-11-12:

#3

I submitted a patch to catch all OSError exceptions. One reason is that the newer implementation of os.path.ismount does this too.

Changed in swift:
assignee:	nobody → Kun Huang (academicgareth)

Peter Portante (peter-a-portante) wrote on 2013-11-12:

#4

Before we go changing things, can we find out what the error is that is being raised and why it is being raised? This smells like a configuration problem that will likely be hidden by catching all errors as proposed.

WangChao (chaowsh) wrote on 2013-11-13:

#5

Below are the error logs from both the kernel and swift. When I run the command "curl -v http://proxy-ip:6000/recon/diskusage", a 500 is returned. I am sure that in swift-1.9.0 I could still get recon/diskusage.

Nov 4 11:35:59 swift-node7 kernel: [1818200.382441] XFS (sdd): metadata I/O error: block 0x7470770d ("xlog_iodone") error 5 numblks 64
Nov 4 11:35:59 swift-node7 kernel: [1818200.382729] XFS (sdd): xfs_do_force_shutdown(0x2) called from line 1038 of file /build/buildd/linux-lts-quantal-3.5.0/fs/xfs/xfs_log.c. Return address = 0xffffffffa0265ac1
Nov 4 11:35:59 swift-node7 kernel: [1818200.382754] XFS (sdd): Log I/O Error Detected. Shutting down filesystem
Nov 4 11:35:59 swift-node7 kernel: [1818200.382762] XFS (sdd): xfs_log_force: error 5 returned.
Nov 4 11:35:59 swift-node7 kernel: [1818200.382768] XFS (sdd): xfs_do_force_shutdown(0x1) called from line 1098 of file /build/buildd/linux-lts-quantal-3.5.0/fs/xfs/xfs_buf.c. Return address = 0xffffffffa02115a4
Nov 4 11:35:59 swift-node7 kernel: [1818200.382977] XFS (sdd): Please umount the filesystem and rectify the problem(s)
Nov 4 11:35:59 swift-node7 kernel: [1818200.384131] sd 0:0:11:0: [sdd] Synchronizing SCSI cache

Nov 4 11:36:00 swift-node7 account-server ERROR __call__ error with REPLICATE /sdd/989/7bba9492c36a4163c0c8a3991c3c5625 : [Errno 5] Input/output error: '/srv/node/sdd'
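
The "error 5" in the kernel messages and the "[Errno 5]" in the swift log refer to the same error number. As a small sketch (assuming a Linux host, where errno 5 is EIO), Python's `errno` module can be used to map the number to its symbolic name; the path below is copied from the log and the exception is constructed by hand purely for illustration.

```python
import errno
import os

# Build the same kind of exception the account-server log shows.
err = OSError(errno.EIO, os.strerror(errno.EIO), '/srv/node/sdd')

# Map the numeric code back to its symbolic name.
print(errno.errorcode[err.errno])
# The string form combines errno, message and filename.
print(err)
```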

Peter Portante (peter-a-portante) wrote on 2013-11-13:

#6

Looks like you have a corrupted disk. Perhaps it was a good thing we did not fix this ismount call?

Kun Huang (academicgareth) wrote on 2013-11-13:

#7

@peter

If we don't catch EIO in utils.ismount, we have to catch it everywhere utils.ismount is used. In this case, for example, we would have to catch EIO at https://github.com/openstack/swift/blob/master/swift/common/middleware/recon.py#L200, and in other places as well. Compare the newer code in os.path.ismount (http://hg.python.org/releasing/3.3.3/file/6e81c3a16d1c/Lib/posixpath.py#l211): it catches all OS errors and seems to be designed to return True or False as a simple check. So my understanding is that if something is wrong with your disk, it just gives you False. That is why the patch here (https://review.openstack.org/#/c/55991/) catches all OS errors.
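
The catch-all approach argued for in this comment can be sketched as follows. This is a simplified illustration in the spirit of the os.path.ismount behavior linked above, not the code from the proposed patch; `ismount_lenient`, the injectable `lstat` parameter, and `failing_lstat` are hypothetical names.

```python
import errno
import os


def ismount_lenient(path, lstat=os.lstat):
    """Hypothetical mount check that reports any OS error as 'not mounted'."""
    try:
        s1 = lstat(path)
        s2 = lstat(os.path.join(path, '..'))
    except OSError:
        # ENOENT, EIO, EACCES, ... all collapse into a plain False.
        return False
    return s1.st_dev != s2.st_dev or s1.st_ino == s2.st_ino


def failing_lstat(path):
    """Stand-in for os.lstat on a disk that returns I/O errors."""
    raise OSError(errno.EIO, os.strerror(errno.EIO), path)


if __name__ == '__main__':
    print(ismount_lenient('/srv/node/sdd', lstat=failing_lstat))
```

The trade-off is visible in the sketch: callers never need their own error handling, but a failing disk and an unmounted one become indistinguishable.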

Peter Portante (peter-a-portante) wrote on 2013-11-13:

#8

@kun

Wouldn't we want to have recon raise a red flag to say that there is a major problem with a configured device, rather than just reporting that the disk is simply not mounted?

Personally, I think the upstream Python code does a disservice by swallowing all errors and just reporting that the path is not mounted.

In your case, the device IS mounted, but it is reporting errors. That seems to be the information recon should be catching to help report what is actually happening.

I will not prevent that patch from going forward, so if other swifters disagree, that is fine.

WangChao (chaowsh) wrote on 2013-11-13:

#9

@peter
I agree with Huang Kun. I don't think it is a good idea for recon to crash when just one disk has failed while the rest all work fine.

Peter Portante (peter-a-portante) wrote on 2013-11-13:

#10

So then we need to fix recon to catch errors checking disk availability, so it won't crash, flagging errors instead, rather than changing utils.ismount. Do you agree?

Kun Huang (academicgareth) wrote on 2013-11-13:

#11

In this case catching the error in recon is OK, but a request to the object server for that disk would also raise errors, because ismount doesn't catch it. So we would have to catch this error in the many places that use ismount or mount_check. I don't think that's a good idea.

In other words, IMO, returning False means the device is not mounted CORRECTLY. So catching all OS errors seems more reasonable.

Peter Portante (peter-a-portante) wrote on 2013-11-13:

#12

@Kun, respectfully, I disagree. There is only one place in the object server that checks mounts in master, so I don't think this is an issue. There are other places in the account and container servers that make that same check today, but there is a series of patches proposed that will reduce those to one place as well.

Hiding the errors in ismount seems too expedient. Let's get our code to handle the errors properly.

WangChao (chaowsh) wrote on 2013-11-13:

#13

@peter it is ok to fix recon to catch errors checking disk availability.

OpenStack Infra (hudson-openstack) on 2013-11-13

Changed in swift:
status:	New → In Progress

OpenStack Infra (hudson-openstack) wrote on 2013-11-13: Fix merged to swift (master)

#14

Reviewed: https://review.openstack.org/55991
Committed: http://github.com/openstack/swift/commit/fd4843f8e72227318b1e2ee5fbd2d43cc3732a1d
Submitter: Jenkins
Branch: master

commit fd4843f8e72227318b1e2ee5fbd2d43cc3732a1d
Author: Kun Huang <email address hidden>
Date: Wed Nov 13 19:20:16 2013 +0800

catch OSError to prevent breaking request /recon/diskusage

    swift.common.utils.ismount maybe raise some OSError in some special
    cases; and the request against /recon/diskusage doesn't handle it
    before. This patch let output of mounted keyword is the error's message.

    Change-Id: I5d9018f580181e618a3fa072b7a760d41795d8eb
    Closes-Bug: #1249181
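
The behavior the commit message describes, where the `mounted` field carries the error message instead of the request failing, might look roughly like the sketch below. This is not the merged code: `diskusage_report`, `broken_ismount`, the injectable `ismount` parameter, and every dictionary key other than `mounted` are assumptions made for illustration.

```python
import errno
import os


def diskusage_report(devices_root, devices, ismount=os.path.ismount):
    """Hypothetical recon-style disk usage report, one entry per device."""
    report = []
    for name in devices:
        path = os.path.join(devices_root, name)
        try:
            mounted = ismount(path)
        except OSError as err:
            # Surface the failure in the report instead of letting it
            # escape and turn the whole request into a 500.
            mounted = str(err)
        entry = {'device': name, 'mounted': mounted,
                 'size': '', 'used': '', 'avail': ''}
        if mounted is True:
            st = os.statvfs(path)
            size = st.f_blocks * st.f_frsize
            avail = st.f_bavail * st.f_frsize
            entry.update(size=size, used=size - avail, avail=avail)
        report.append(entry)
    return report


def broken_ismount(path):
    """Stand-in for a mount check on a disk that returns EIO."""
    raise OSError(errno.EIO, os.strerror(errno.EIO), path)


if __name__ == '__main__':
    print(diskusage_report('/srv/node', ['sdd'], ismount=broken_ismount))
```

With this shape, one failed disk yields an entry whose `mounted` value is a string describing the error, while healthy devices in the same report are unaffected.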

Changed in swift:
status:	In Progress → Fix Committed

OpenStack Infra (hudson-openstack) wrote on 2013-12-03: Fix proposed to swift (feature/ec)

#15

Fix proposed to branch: feature/ec
Review: https://review.openstack.org/59766

OpenStack Infra (hudson-openstack) wrote on 2013-12-03: Fix merged to swift (feature/ec)

#16

Reviewed:  https://review.openstack.org/59766
Committed: http://github.com/openstack/swift/commit/239f88a42b00a71a07860f953d00771c8aef4305
Submitter: Jenkins
Branch:    feature/ec

commit 8a64bff2dc28b43b3ed4fa7b65da1a9ea29677cc
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Wed Nov 27 17:23:59 2013 -0800

Report transaction ID in failure exceptions
    
    This way, when something fails in Jenkins, you have some chance of
    searching the logs for the relevant transaction.
    
    Change-Id: I3cf606cb4963e32b5c6ac3deda08e73541b3ff7d

commit e0147e60d800fd67bc05bc4299c315f1761bd60b
Author: Peter Portante <peter.portante@redhat.com>
Date:   Fri Nov 22 16:59:09 2013 -0500

Add a unit test to verify proxy logging fields
    
    Also bring unit test coverage to 100% (well, at least every line is
    reported as "covered").
    
    Change-Id: I659d0c02008368897b1307a7a5c9aaba73b80588
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 87cd5598476d0835c526918a9e1f03fe2d698866
Author: Alex Gaynor <alex.gaynor@gmail.com>
Date:   Sun Nov 24 20:24:45 2013 -0600

Account for a platform difference in semaphores
    
    On OS X (and probably other Operating Systems) it isn't possible to
    introspect the value of a semaphore. Account for this by skipping a
    test about this.
    
    Change-Id: I97824f9fc4e36de4f7a62c8ce53865e6977dfdfe

commit 3c7c355120a3ebe5c3f47e62176cec8cab824143
Author: Peter Portante <peter.portante@redhat.com>
Date:   Mon Nov 25 13:30:41 2013 -0500

Use TCP_NODELAY for created sockets.
    
    Mark Seger at HP has been looking at small objects, 1 and 2 KB size,
    and with Rick Jones' help noticed that TCP protocol traces showed
    effects from the Nagel algorithm client-to-server and
    server-to-client.
    
    This patch just addresses our WSGI server responses, but does not
    address out-bound connections from the various servers.
    
    Change-Id: I11f86df1f56fba1c6ab6084dc1f580c395f072dc
    Signed-off-by: Peter Portante <peter.portante@redhat.com>
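
For readers unfamiliar with the option named in this commit, the following is a minimal sketch of setting TCP_NODELAY on a socket with Python's standard `socket` module. It is not the Swift change itself; `make_nodelay_socket` is a hypothetical helper.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # With TCP_NODELAY set, small writes are sent immediately rather
    # than being held back to coalesce with later data.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == '__main__':
    s = make_nodelay_socket()
    try:
        print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
    finally:
        s.close()
```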

commit 39032c359f01a5e397fce2eb8326b961c9673607
Author: Darrell Bishop <darrell@swiftstack.com>
Date:   Wed Nov 27 12:07:42 2013 -0500

Add HTML reporting for test branch coverage.
    
    When including branch coverage results, also generate HTML reports into
    a "cover" subdirectory under the directory in which .unittests resides
    (i.e. known location at the top of the swift tree).
    
    Change-Id: I493d74f38755f7bf0d7043052585efb27840b238

commit 0ba071f27c009e1d028189e812f722e8583a07ee
Author: Darrell Bishop <darrell@swiftstack.com>
Date:   Tue Nov 26 15:08:13 2013 -0500

Fix bug in obj updater run_once().
    
    The "not" in front of the ismount() call got accidentally dropped in a
    recent change.  This patch adds it back along with a few more tests.
    
    Note that this bug only showed up on an SAIO during probe tests because
    I used actually-mounted (virtual) "disks".  So keep that in mind when
    building SAIOs for development/testing.
    
    Change-Id: Ia193f3c4b73203605954036863575c22ddab6b03

commit edc9f62ed6c537c4c112cf552310705b99fa08b8
Author: Peter Portante <peter.portante@redhat.com>
Date:   Mon Sep 30 12:35:07 2013 -0400

Report path information in failure exceptions
    
    When an error occurs during functional tests that use the
    swift_test_client module, the reported error message includes the
    method and path:
    
        ResponseError: 500: 'Internal Error' ('HEAD' \
        '/v1/AUTH_test/d5ce...')
    
    Change-Id: I631cd9e83879fb644778d4ded62625483bf38045

commit 71d180568322dc23eea2b48ebf93dd7d89b9362b
Author: John Dickinson <me@not.mn>
Date:   Tue Nov 26 14:39:30 2013 -0800

bare excepts, as is proper
    
    Change-Id: Ifd28f6f14a781a67644315690491888161a7250c

commit cdb6cd830ada81252a670a49eafe198ad898fd7a
Author: John Dickinson <me@not.mn>
Date:   Wed Nov 20 17:15:19 2013 -0800

add bare except to catch errors
    
    Change-Id: Ibe78912cf923591bddd6a8cf0e683cd028c9c4e8

commit 853853edcedec2b2d2ea6c46c812111d8a7895d0
Author: Peter Portante <peter.portante@redhat.com>
Date:   Tue Sep 3 15:27:57 2013 -0400

Push cooperative sleep call down into ThreadPool
    
    The PUT REST API has no idea how writes are performed, so when thread
    pools are in use, the sleep is not necessary, though it is still
    necessary when thread pools are not in use. Since the ThreadPool
    object knows when threads are actually in use, it can take care of
    being cooperative with the eventlet hub.
    
    In addition, we can hide the cooperative iterator hook, given that the
    only other consumer of this was the auditor, which does not need it
    any longer.  The only consumer of the DiskFile class that wants the
    cooperative behavior is the REST API layer of the object server, which
    is also using thread pools.
    
    Change-Id: Ibc4ac672899f9a35fd68c85d7f56403c19b4f991
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 6e313e957d2fd7e02bc3abe59f6a3c6cd37f23a2
Author: Peter Portante <peter.portante@redhat.com>
Date:   Tue Nov 19 22:55:09 2013 -0500

Fix for memcache middleware configuration
    
    The documentation rightly said to use "memcache_max_connections", but
    the code was looking for "max_connections", and only looking for it in
    proxy-server.conf, not in memcache.conf as a fall back.
    
    This commit brings the code coverage for the memcache middleware to
    100%.
    
    Closes-Bug: 1252893
    Change-Id: I6ea64baa2f961a09d60b977b40d5baf842449ece
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 15c31c0373dc8f102320637908c1ed59e854a4df
Author: Peter Portante <peter.portante@redhat.com>
Date:   Mon Oct 21 18:19:30 2013 -0400

Remove unnecessary "is not None" check
    
    From a review comment on https://review.openstack.org/30051 remove
    the "is not None" check, as the assignment in the try block will
    never assign None as value as the int() built-in will not return it.
    
    There was some concern of long-term maintenance of the DiskFile
    class's _quarantine method raising exceptions. If that routine were
    ever mistakenly changed to NOT raise an exception, subtle problems
    could creep into the code (see https://review.openstack.org/53237).
    We address this concern by raising an exception explicitly at the
    call sites of DiskFile._quarantine().
    
    Change-Id: I1729a2d77a6b72b4494b24a8838b47ad5272c075
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 2c0bbaf05c06968cc0376f34acbfb09e72603d47
Author: Peter Portante <peter.portante@redhat.com>
Date:   Sun Nov 24 23:29:53 2013 -0500

Protect against hash cleanup errors on PUTs
    
    In http://launchpad.net/bugs/1254405 an exception occurs when
    finalizing the PUT of an object, but it is obscured by the thread pool
    code, so we don't see the actual line where it originated. However,
    there are two possible functions where this exception could originate:
    
        1. renamer()
        2. hash_cleanup_listdir()
    
    If this is an error in renamer(), there is some other waky problem
    where the temporary file has been removed. It is likely that this is a
    problem where a file name from os.listdir ends up disappearing (but
    even this is a rare occurence). Regardless, it is not clear that we
    really want an error from hash_cleanup_listdir() from affecting the
    return result of the PUT.
    
    To that end, we squelsh OSErrors from hash_cleanup_listdir() for
    now. One might argue for all errors, but since os.unlink() and
    os.listdir() raise OSError today, that is probably sufficient.
    
    Even that we use "Closes-Bug" below, it is not clear it can even be
    determined that this closes that bug report.
    
    Change-Id: I2f55df835c387e4d17cffda74c04c9994aebbe1f
    Closes-Bug: 1254405
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 7207926cffb617b89869c9cd4bed850395334eac
Author: Michael Barton <mike@weirdlooking.com>
Date:   Mon Nov 25 18:58:34 2013 +0000

slightly less early quorum
    
    The early quorum change has maybe added a little bit too much
    eventual to the consistency of requests in Swift, and users can
    sometimes get unexpected
    results.
    
    This change gives us a knob to turn in finding the right balance,
    by adding a timeout where pending requests can finish after quorum
    is achieved.
    
    Change-Id: Ife91aaa8653e75b01313bbcf19072181739e932c

commit fa1f7d9420150dcfa30f4197b20813fcbfd3d9e0
Author: John Dickinson <me@not.mn>
Date:   Thu Nov 14 15:58:52 2013 -0800

Use upstream patched Pool.get
    
    This works around an eventlet bug in eventlet 0.9.16.
    This version properly keeps track of pool size accounting, and
    therefore doesn't let the pool grow without bound. This patched
    version is the result of commit
    f5e5b2bda7b442f0262ee1084deefcc5a1cc0694 in eventlet and is
    documented at https://bitbucket.org/eventlet/eventlet/issue/91
    
    This patch includes full test coverage of the back-ported code, even
    when the actually-installed eventlet is newer.
    
    This fixes bug #1254119
    
    Change-Id: I075bb5e40e08571d52fe17fcc3fa0e25be5befed

commit 91deed871bba196a7ac35e0b0baf63339f734e24
Author: Alex Gaynor <alex.gaynor@gmail.com>
Date:   Sun Nov 24 09:11:04 2013 -0600

Use a more portable errno in tests.
    
    ELIBBAD doesn't exist on OS X. The exact value we use here isn't
    important, so use something more portable.
    
    Change-Id: Id03dc1773f416a94bbd14ad31b2b2a70f16b9a51

commit 2c4bf81464ad2058226f457eb7ef64addb2f136e
Author: Richard (Rick) Hawkins <richard.hawkins@rackspace.com>
Date:   Wed Oct 16 19:28:37 2013 -0500

Added discoverable capabilities.
    
    Swift can now optionally be configured to allow requests to '/info',
    providing information about the swift cluster.  Additionally a HMAC
    signed requests to
    '/info?swiftinfo_sig=<sign>&swiftinfo_expires=<expires>' can be
    configured allowing privileged access to more sensitive information
    not meant to be public.
    
    DocImpact
    Change-Id: I2379360fbfe3d9e9e8b25f1dc34517d199574495
    Implements: blueprint capabilities
    Closes-Bug: #1245694

commit c859ebf5ce1161e0fc2ca5258b8d3f45e29fc9ea
Author: gholt <z-launchpad@brim.net>
Date:   Sat Nov 9 03:18:11 2013 +0000

Per device replication_lock
    
    New replication_one_per_device (True by default)
    that restricts incoming REPLICATION requests to
    one per device, replication_currency allowing.
    
    Also has replication_lock_timeout (15 by default)
    to control how long a request will wait to obtain
    a replication device lock before giving up.
    
    This should be very useful in that you can be
    assured any concurrent REPLICATION requests are
    each writing to distinct devices. If you have 100
    devices on a server, you can set
    replication_concurrency to 100 and be confident
    that, even if 100 replication requests were
    executing concurrently, they'd each be writing to
    separate devices. Before, all 100 could end up
    writing to the same device, bringing it to a
    horrible crawl.
    
    NOTE: This is only for ssync replication. The
    current default rsync replication still has the
    potentially horrible behavior.
    
    Change-Id: I36e99a3d7e100699c76db6d3a4846514537ff685

commit f5648638ee6f939556ebfcb40dfdb8a590d3b5ae
Author: David Goetz <david.goetz@rackspace.com>
Date:   Mon Nov 4 17:06:06 2013 +0000

Get retry.
    
    If a source times out on read try another one of them with a
    modified range.  There had to be a lot of moved around code
    to get this working but it should all make sense.
    
    Change-Id: Ieaf045690a8823927a6f38098a95b37a4d4adb70

commit b5b0b78fc7c2add5ee5211f504ee79c5ae15f162
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Fri Nov 22 12:23:58 2013 -0800

Remove obsolete future imports
    
    The with statement has been standard since Python 2.5, so we can get
    rid of these imports.
    
    Change-Id: I280971c3d8c01e94cc2c17cacaedcbe9d9c8a3c3

commit 4cb5e2f4563593adab9a9ccbb1577ae53514d504
Author: Peter Portante <peter.portante@redhat.com>
Date:   Wed Nov 20 15:52:07 2013 -0500

Simple fix for proxy-logging empty field handling
    
    Change-Id: Ia135575bd30a0bc04a2105291e68a6d18c7a3047
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 4ed1c8473f50f7f11082ad1ee389f5339707e212
Author: Peter Portante <peter.portante@redhat.com>
Date:   Fri Nov 22 00:37:11 2013 -0500

Handle optional arguments for run_forever()
    
    All the other daemons do this, and since the out deamon wrapper
    scripts pass all the command line options through directly, seems
    simple enough to handle them by ignoring.
    
    This is also applied to run_once().
    
    Change-Id: I1df83bdf78f0dc3d911019f67f78301967b5da72
    Closes-Bug: #1253891
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit b9efe1cd462177f5bb86363d767f74f3a4a53a23
Author: John Dickinson <me@not.mn>
Date:   Fri Nov 22 09:44:36 2013 -0800

add an "inline" query parameter to tempurl
    
    Giving the inline query parameter will cause the tempurl
    response to be given a "Content-Disposition: inline" header,
    regardless of other query parameters or metadata. This allows
    easy in-line viewing, eg in browsers.
    
    DocImpact
    
    Change-Id: Icd5c544d6a749d4f58e8a921968f4e432a2185db

commit a410730a2b64838d6bd0d102b6a9fb276ce1e7ae
Author: Peter Portante <peter.portante@redhat.com>
Date:   Wed Nov 20 16:43:40 2013 -0500

Do not format messages before they are logged
    
    Change-Id: Ia645c9eca47b7f404d9b987f68a96b4744031e9d
    Signed-off-by: Peter Portante <peter.portante@redhat.com>
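
This commit has no body, so the following is only one common reading of its title: pass format arguments to the logging call and let the `logging` module interpolate them, instead of building the string first. The sketch uses only the standard library; the logger name, messages, and `demo` function are made up for illustration.

```python
import io
import logging


def demo():
    """Log three messages and return what was actually emitted."""
    stream = io.StringIO()
    logger = logging.getLogger('lazy-format-demo')
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    device = 'sdd'
    try:
        # Eager: the string is built even though DEBUG is disabled.
        logger.debug('checking device %s' % device)
        # Deferred: interpolation happens only if the record is emitted.
        logger.debug('checking device %s', device)
        logger.info('device %s is not mounted', device)
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


if __name__ == '__main__':
    print(demo(), end='')
```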

commit 62b87fca2091970a2369ad7ff059255259422f2b
Author: Steven Lang <Steven.Lang@hgst.com>
Date:   Wed Nov 20 10:57:24 2013 -0800

Fixed locale test in the presence of LANGUAGE
    
    According to GNU documents, the priority order of language variables
    is LANGUAGE, LC_ALL, LC_*, LANG. Therefore, if LANGUAGE is set, it
    overrides the LC_ALL setting from the test. An empty value is ignored,
    and setting it to empty is easier to deal with than just deleting the
    variable.
    
    Also fixed the Google translate fail esperanto grammar.
    
    Fixes bug 1235058
    
    Change-Id: Ic97b90dfc21997e19cc473250794a9b3c526beb5

commit 700479bd67be23648b5998ee4353db7e290f8601
Author: Steven Lang <Steven.Lang@hgst.com>
Date:   Thu Nov 21 11:46:15 2013 -0800

Fix DB locked error on commit
    
    This bug was introduced in ef7f9e27; while moving timeout for execute
    to the cursor wrapper, commit was moved as well; however commit is
    purely a connection method, only execute is passed on to a cursor.
    Added unit tests to check both methods for correct timeouts.
    
    This manifested in a test failure as:
    ERROR __call__ error with POST /sdb1/418/AUTH_d1c4b610b16a48de83219c696261009c/TestContainer-tempest-1572414684 :
    Traceback (most recent call last):
      File "/opt/stack/new/swift/swift/container/server.py", line 486, in __call__
        res = method(req)
      File "/opt/stack/new/swift/swift/common/utils.py", line 1915, in wrapped
        return func(*a, **kw)
      File "/opt/stack/new/swift/swift/common/utils.py", line 687, in _timing_stats
        resp = func(ctrl, *args, **kwargs)
      File "/opt/stack/new/swift/swift/container/server.py", line 464, in POST
        broker.update_metadata(metadata)
      File "/opt/stack/new/swift/swift/common/db.py", line 677, in update_metadata
        conn.commit()
    OperationalError: database is locked (txn: tx5065394f288740e69fcec-00528e184e)
    
    Change-Id: I269b133fac53d4792d21b62f801cc0c0ccf337ea

commit 4f6d89ab5156526e3a371c0c954166d298621de6
Author: Peter Portante <peter.portante@redhat.com>
Date:   Thu Nov 14 13:56:47 2013 -0500

Import filter and app into namespace correctly
    
    The module used to simply "import swift", and then reference
    the classes:
    
        swift.common.middleware.catch_errors.CatchErrorsMiddleware
        swift.proxy.server.Application
    
    in order to very that the WSGI services loaded the proper filters and
    apps.
    
    However, those references only happen to work, as the WSGI loading
    would properly import the rest of the path so that the namespace
    reference would be okay. If the WSGI configuration were to change, or
    if the behavior of WSGI broke, instead of of seeing the actual failure
    condition, a module attribute error would result instead:
    
        AttributeError: 'module' object has no attribute 'middleware'
    
    The referenced names are now properly imported with this change to
    avoid misleading error conditions.
    
    Change-Id: Ifff4271bc5be1136bf17e4e5b291b01033d608db
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 14c5b547f207d5d93e659a2aa4a78c725dcb081d
Author: Gonéri Le Bouder <goneri.lebouder@enovance.com>
Date:   Wed Nov 20 16:26:03 2013 +0100

test: improve db_replicator coverage
    
    This patch adds a test for ReplicatorRpc.complete_rsync()
    and complete extract_device() coverage.
    
    test_extract_device:
      test the case the parameter is invalid
    
    test_complete_rsync_with_bad_input:
      ensure the use of invalid parameters return a 404 erro
    
    test_complete_rsync:
      validate the returned code in case of success
    
    Change-Id: I59e0d26a1efe59d8beff1e81c2a7edc6de0872e9

commit 9e80fd45a0072e15e9d49299a6fa9e9c725e2207
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Mon Nov 18 11:41:58 2013 -0800

Add a DebugLogger for wsgi server tests
    
    Change-Id: Ifd2528be443ba3879bf4921f6c5f4ef31f29044b

commit 934354f0dea1b257596f352498bffa82091d207c
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Thu Nov 21 01:33:48 2013 -0800

workaround probetest race from early response
    
    Change-Id: I594633887c86fc2212850409a37ee2257633a23c

commit 37e0654adb3563bc84176ebdea4e36f97e3c3bb5
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Tue Oct 29 13:04:59 2013 -0700

in case you lose your builder backups
    
    Change-Id: Ica555be2be492c3ec5fdeab738058ff35989a603

commit ef7f9e27a2d10aeaa9ed550e9595c54ccacdd4f2
Author: Steven Lang <Steven.Lang@hgst.com>
Date:   Wed Oct 23 11:35:57 2013 -0700

Fixed concurrent PUT requests to accounts or containers
    
    When concurrent PUT requests come in for an account or container, the
    resulting DB access will try to lock the DB for writing. Normal access
    will retry when it encounters a locked DB, change 0fdad0d9 introduced
    a cursor for doing the initialization which did not have the retry
    capability, resulting in a hard failure.
    
    Fixes bug 1243973
    
    Change-Id: I73b219e0f5eacf314d87b4d5e56c03daf51b2eca

commit 94090e8760da324d6586b5d45b7c339457e33647
Author: Kun Huang <academicgareth@gmail.com>
Date:   Tue Oct 22 18:05:43 2013 +0800

Use POST in bulk-delete
    
    The DELETE verb applies to a single resource, and doesn't define any
    semantics for the body.
    
    http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.7
    
    The swift Bulk Delete command affects multiple resources specified in a
    DELETE body.
    
    http://docs.openstack.org/developer/swift/misc.html#module-swift.common.middleware.bulk
    
    While Bulk Delete is a welcome operation, its usage of DELETE is
    unusual: affecting multiple resources and relying on reading content.
    
    More typically, such an operation employs POST (or PUT), which folks
    including api-craft usually agree is the best "catch-all" verb for
    behaviors such as those affecting multiple resources. That's the TL;DR;
    of the thread below.
    
    https://groups.google.com/forum/#!searchin/api-craft/Regarding$20Bulk$20actions/api-craft/wY-W1NdZDRs/7YDwMhCR608J
    
    Note that this topic isn't nasal or abstract. The current behavior is
    unsupported using the built-in java http client. Even if third-party
    libraries can work around this behavior, it is probably best to not be a
    snowflake wrt http verb semantics where possible!
    
    http://stackoverflow.com/questions/9100776/http-delete-with-request-body-issues
    
    DocImpact
    
    Closes-Bug: #1232787
    Change-Id: I0fc74c85618fe4dd7ff5e7f9756c7f6f67aa0465

commit 9151bcc92c7854b04ca41ac3c96efd903ad12b5b
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Mon Nov 11 14:59:14 2013 -0800

Reorganize SLO unit tests
    
    This brings some sanity to the SLO test app (the thing the middleware
    wraps in the unit tests) as well as splits things into multiple test
    classes.
    
    This is part of the effort to move all SLO functionality to
    middleware, but it's a separate commit to prove that these test
    changes don't harm anything when running against the old code.
    
    Change-Id: I52a16f15a80dfaf9b3c595b0e634d52f418caf6c

commit 2e1fc7446f188399408eb8695c12d751b48defa1
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Mon Nov 18 13:17:48 2013 -0800

Some functional tests for static large objects
    
    There's some sort-of-hacky code in there to detect SLO support in
    order to skip tests when SLO is off so that the functests won't fail
    on older clusters.
    
    Change-Id: I6ad5974a0db7213747b0f4497d08ffc706d3f220

commit 63370ea466fb1ae4eb8f29ef5318c50a5a0356ae
Author: Peter Portante <peter.portante@redhat.com>
Date:   Sat Oct 5 11:12:43 2013 -0400

Quiet all locale warnings and dummy thread, too
    
    Change-Id: I0c68b94ec234e470ce2d50da01d8ae1cd10fae58
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit c2c2a14df700e2e0eaa855acec977311dc82134a
Author: Peter Portante <peter.portante@redhat.com>
Date:   Fri Nov 1 14:39:53 2013 -0400

Replace httplib with bufferedhttp in sphinx docs
    
    We also replace references in a few comments as well.
    
    Change-Id: Ifc8d78e943219fefc73f41abed7d393a060e3926
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit b1f51d00cde51a77aacbe41401cde0b3b9e35719
Author: Peter Portante <peter.portante@redhat.com>
Date:   Wed Nov 13 11:16:59 2013 -0500

Use utils.ismount in place of os.path.ismount
    
    See comments from: https://review.openstack.org/55991
    
    Change-Id: Ibb4153702b3dc4c60f66abb11cd3fa1953449827
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 092eb56e79e79891351003ff2cc97f7cf5d4104c
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Mon Nov 18 14:24:55 2013 -0800

minor fix to unittest fake error
    
    Change-Id: Ife7add5646afb94ec56a38f335112ab70998b1b1

commit 304fb34153b0761f45136b304756c463fd158e70
Author: Steven Lang <Steven.Lang@hgst.com>
Date:   Thu Oct 24 11:46:49 2013 -0700

Fix UnboundLocalError on container PUT
    
    Fixes bug-1243973
    
    Change-Id: If165fdcccb5d4712570b1cdabcc89e618f539849

commit ef42c0f0dec044543b0698aa24709b9c2ed847e7
Author: John Dickinson <me@not.mn>
Date:   Fri Nov 15 15:10:39 2013 -0800

allow bare excepts in flake8
    
    Change-Id: I4751f9f65712960019f9a303b7d7fe017c6e85c7

commit 8255810e7da289004363e06c7186aa9e1844267a
Author: Chmouel Boudjnah <chmouel@enovance.com>
Date:   Fri Nov 15 15:08:36 2013 +0100

Add swiftsync.
    
    Change-Id: Ie588db762aedc55e00f8f51b3d2e329ba9a08a6c

commit b468f79daa39ac42a182f4a6e7c15db4631368a5
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Wed Nov 13 12:51:18 2013 -0800

Fix test to work with mock 0.8.0
    
    This particular test case used a construct that worked in mock 1.0,
    but not in 0.8. Our test-requirements.txt says mock>=0.8.0.
    
    Change-Id: I93ed4b7b5d169572bbed6490cb9b0dd421a3b9e2

commit fd4843f8e72227318b1e2ee5fbd2d43cc3732a1d
Author: Kun Huang <academicgareth@gmail.com>
Date:   Wed Nov 13 19:20:16 2013 +0800

catch OSError to prevent breaking request /recon/diskusage
    
    swift.common.utils.ismount may raise an OSError in some special
    cases, and the request against /recon/diskusage did not handle it
    before. This patch makes the value of the "mounted" key the error's
    message.
    
    Change-Id: I5d9018f580181e618a3fa072b7a760d41795d8eb
    Closes-Bug: #1249181
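
A minimal sketch of the behavior this commit message describes, not the actual Swift code: the mount check is wrapped in a try/except for OSError, and on failure the error text is reported under "mounted" with empty strings for the size fields. The name disk_usage_report and its injectable ismount/statvfs parameters are hypothetical and exist only for illustration.

```python
import os


def disk_usage_report(devices_root, ismount=os.path.ismount,
                      statvfs=os.statvfs):
    """Build one usage entry per device directory under devices_root.

    Hypothetical helper: ismount and statvfs are parameters so the error
    path can be exercised without a failing disk.
    """
    report = []
    for device in sorted(os.listdir(devices_root)):
        path = os.path.join(devices_root, device)
        try:
            mounted = ismount(path)
        except OSError as err:
            # Report the error text instead of letting the request fail.
            report.append({'device': device, 'mounted': str(err),
                           'size': '', 'used': '', 'avail': ''})
            continue
        if mounted:
            stats = statvfs(path)
            size = stats.f_blocks * stats.f_frsize
            avail = stats.f_bavail * stats.f_frsize
            report.append({'device': device, 'mounted': True,
                           'size': size, 'used': size - avail,
                           'avail': avail})
        else:
            report.append({'device': device, 'mounted': False,
                           'size': '', 'used': '', 'avail': ''})
    return report
```

Note that any client consuming this report has to cope with "mounted" being a string and the size fields being empty.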

commit 985c7bf38b8448e342c9547b5712fe7272296fd0
Author: gholt <z-launchpad@brim.net>
Date:   Tue Nov 12 18:00:15 2013 +0000

Fix probe test
    
    Fix for a probe test that failed every once in a
    while due to the early-majority change previously
    committed. Sometimes a write would return success
    before the third node had succeeded and the probe
    test would look for on-disk evidence and fail,
    when it would've been fine had it waited just a
    bit longer for the third node to complete.
    
    Since there's no real way for the probe test to
    know when all three nodes are done, I just made
    it retry once a second for several seconds before
    reporting an error.
    
    There may be more tests like this we'll have to
    fix as we run across them.
    
    Change-Id: I749e43d4580a7c726a9a8648f71bafefa70a05f5
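
The retry approach described above can be sketched roughly as follows. This is an illustration only, not the probe test's code; wait_for and its parameters are hypothetical names, and the attempt count is an assumed stand-in for "several seconds".

```python
import time


def wait_for(check, attempts=5, delay=1.0, sleep=time.sleep):
    """Call check() until it returns a true value or attempts run out.

    Sleeps `delay` seconds between attempts (not after the last one).
    Returns True on success, False if every attempt failed.
    """
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A probe test could pass a check that looks for the on-disk evidence and report an error only when wait_for returns False.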

commit 615239188f1eed2f11e5bfe7560c6a28b3ecd4a5
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Thu Nov 7 17:00:26 2013 -0800

Remove redundant hash check
    
    Change-Id: I46f69d48e60349d28c6a297b703c9ff16b79bfe3

commit a80c720af598627a8419cc72403e33b9d59fa10d
Author: gholt <z-launchpad@brim.net>
Date:   Wed Aug 28 16:10:43 2013 +0000

Object replication ssync (an rsync alternative)
    
    For this commit, ssync is just a direct replacement for how
    we use rsync. Assuming we switch over to ssync completely
    someday and drop rsync, we will then be able to improve the
    algorithms even further (removing local objects as we
    successfully transfer each one rather than waiting for whole
    partitions, using an index.db with hash-trees, etc., etc.)
    
    For easier review, this commit can be thought of in distinct
    parts:
    
    1)  New global_conf_callback functionality for allowing
        services to perform setup code before workers, etc. are
        launched. (This is then used by ssync in the object
        server to create a cross-worker semaphore to restrict
        concurrent incoming replication.)
    
    2)  A bit of shifting of items up from object server and
        replicator to diskfile or DEFAULT conf sections for
        better sharing of the same settings. conn_timeout,
        node_timeout, client_timeout, network_chunk_size,
        disk_chunk_size.
    
    3)  Modifications to the object server and replicator to
        optionally use ssync in place of rsync. This is done in
        a generic enough way that switching to FutureSync should
        be easy someday.
    
    4)  The biggest part, and (at least for now) completely
        optional part, are the new ssync_sender and
        ssync_receiver files. Nice and isolated for easier
        testing and visibility into test coverage, etc.
    
    All the usual logging, statsd, recon, etc. instrumentation
    is still there when using ssync, just as it is when using
    rsync.
    
    Beyond the essential error and exceptional condition
    logging, I have not added any additional instrumentation at
    this time. Unless there is something someone finds super
    pressing to have added to the logging, I think such
    additions would be better as separate change reviews.
    
    FOR NOW, IT IS NOT RECOMMENDED TO USE SSYNC ON PRODUCTION
    CLUSTERS. Some of us will be using it in a limited fashion to look
    for any subtle issues, tuning, etc. but generally ssync is
    an experimental feature. In its current implementation it is
    probably going to be a bit slower than rsync, but if all
    goes according to plan it will end up much faster.
    
    There are no comparisons yet between ssync and rsync other
    than some raw virtual machine testing I've done to show it
    should compete well enough once we can put it in use in the
    real world.
    
    If you Tweet, Google+, or whatever, be sure to indicate it's
    experimental. It'd be best to keep it out of deployment
    guides, howtos, etc. until we all figure out if we like it,
    find it to be stable, etc.
    
    Change-Id: If003dcc6f4109e2d2a42f4873a0779110fff16d6

commit 845b8beeb5d7b32fce66e6e9100d232970dcc595
Author: Juan J. Martinez <juan@memset.com>
Date:   Thu Nov 7 11:36:55 2013 +0000

Default region loading an old-style pickled ring
    
    This is to support upgrades from swift < 1.8 using old-style pickled
    rings to 1.10. Old-style pickled rings won't have region information.
    
    Change-Id: I18b2acba3d346e41def9d25d3d4dbd12705e5375
    Closes-Bug: #1248919

commit f0c0855ec8207fa8be63ed4f3135c8e29a31acc0
Author: Michael Barton <mike@weirdlooking.com>
Date:   Wed Oct 30 21:43:35 2013 +0000

early quorum responses
    
    Allow the proxy to respond to many types of requests as soon as it has a
    quorum.  This can help speed up responses (without changing the results),
    especially when one node is acting up.
    
    I had to fix a few unit tests that no longer match the backend http requests
    made by our proxy.
    
    Change-Id: Ieb070dc3019e217e717b96154a7a809409bf40a5
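
As a rough illustration of responding as soon as a quorum is reached (not the proxy's actual code), the sketch below consumes backend statuses in arrival order and stops once enough successes have been seen. It assumes a simple-majority quorum and treats only 2xx as success; both are assumptions, and quorum_size and early_quorum are hypothetical names.

```python
def quorum_size(replicas):
    """Simple majority, assumed here for illustration."""
    return replicas // 2 + 1


def early_quorum(statuses, replicas):
    """Consume backend statuses in arrival order.

    Returns (reached, consumed): whether a quorum of 2xx statuses was
    reached, and how many responses were read before deciding.
    """
    needed = quorum_size(replicas)
    successes = 0
    consumed = 0
    for status in statuses:
        consumed += 1
        if 200 <= status < 300:
            successes += 1
            if successes >= needed:
                # Enough nodes agreed; no need to wait for the rest.
                return True, consumed
    return False, consumed
```

With three replicas, two successful responses are enough, so a slow third node no longer delays the client.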

commit 729430f349b651ec889382ba9092b68a24b655db
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Mon Oct 28 14:57:18 2013 -0700

Alternate DiskFile constructor for efficient auditing.
    
    Before, to audit an object, the auditor:
     - calls listdir(object-hash-dir)
     - picks out the .data file from the listing
     - pulls out all N of its user.swift.metadata* xattrs
     - unpickles them
     - pulls out the value for 'name'
     - splits the name into a/c/o
     - then instantiates and opens a DiskFile(a, c, o),
    which does the following
     - joins a/c/o back into a name
     - hashes the name
     - calls listdir(object-hash-dir) (AGAIN)
     - picks out the .data file (and maybe .meta) from the listing (AGAIN)
     - pulls out all N of its user.swift.metadata* xattrs (AGAIN)
     - unpickles them (AGAIN)
     - starts reading object's contents off disk
    
    Now, the auditor simply locates the hash dir on the filesystem (saving
    one listdir) and then hands it off to
    DiskFileManager.get_diskfile_from_audit_location, which then
    instantiates a DiskFile in a way that lazy-loads the name later
    (saving one xattr reading).
    
    As part of this, DiskFile.open() will now quarantine a hash
    "directory" that's actually a file. Before, the audit location
    generator would skip those, but now they make it clear into
    DiskFile(). It's better to quarantine them anyway, as they're not
    doing any good the way they are.
    
    Also, removed the was_quarantined attribute on DiskFileReader. Now you
    can pass in a quarantine_hook callable to DiskFile.reader() that gets
    called if the file was quarantined. Default is to log quarantines, but
    otherwise do nothing.
    
    Change-Id: I04fc14569982a17fcc89e00832725ae71009335a

commit 68025b8f9a6b7e7ffb1976920208185a8377448b
Author: Peter Portante <peter.portante@redhat.com>
Date:   Fri Nov 1 14:57:28 2013 -0400

Fix up sphinx docs for *make_request* methods
    
    Change-Id: I05ac17ba45eda99dc9fb9d6ecf2374d1e8371a32
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 26483a2fd11e82bbfa4f43c6c36f0916c3bb7f49
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Tue Sep 24 16:46:45 2013 -0700

Add more stuff to SAIO doc's proxy pipeline.
    
    If you're setting one of these up, you're probably going to use it for
    development, in which case you want everything but the kitchen sink
    turned on so you can just start hacking away.
    
    Change-Id: I98d178ff545cbf8d853c102e9fce76fb9f6773ac

commit 691fa2c506d24261a502bcdb9ead1161f1a1dd0b
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Tue Nov 5 12:48:35 2013 -0800

Remove some WebOb leftovers.
    
    The deleted comments were talking about a swob EmptyResponse object,
    which doesn't exist; it seems this was left over from the big WebOb
    removal. Also, swob doesn't exhibit that behavior, so there's no point
    in having this extra code path around.
    
    Further, it seems like the hack was only needed with WebOb < 1.0.8. I
    went back to Swift 1.4.8 (Essex) and compared it to master, and this
    is the result:
    
    DLO manifest HEAD
    =================
    
    Essex (ffadbc3)
    ---------------
    Small #segments --> Content-Length: 12345
    Large #segments --> Content-Length: 0
    Large #segments sans shenanigans --> Content-Length: 0
    
    master (f63b58f)
    ----------------
    Small #segments --> Content-Length: 12345
    Large #segments --> Content-Length: 0
    Large #segments sans shenanigans --> Content-Length: 0
    
    So, whatever WebOb wackiness this was intended to hack around, it
    doesn't seem to have been here for a long, long time.
    
    Change-Id: I7b717b1b36de1139cc5b76c166d1715dbc34b332

commit 429cb8b7453662e325d6120457638827527483cd
Author: anc <alistair.coles@hp.com>
Date:   Tue Nov 5 12:15:24 2013 +0000

Fix swift_test_client duplicating request params
    
    Remove extra lines of code that result in
    swift_test_client adding params twice to
    PUT request urls.
    
    Bug was revealed while developing functional
    tests for SLO - SLO manifest PUTs behave incorrectly
    because the url constructed by swift_test_client ends
    with ?multipart-manifest=put?multipart-manifest=put
    
    Fixes bug 1248121
    
    Change-Id: Ie5a8651a55049bb52ef641edfd6eb29b0ff3c245
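
One way to avoid this class of bug is to append the query string in exactly one place. The sketch below is written for Python 3 and is not the swift_test_client code; build_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_url(path, params=None):
    """Return path with the query string appended exactly once."""
    if not params:
        return path
    # Sort for a stable parameter order.
    return '%s?%s' % (path, urlencode(sorted(params.items())))
```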

commit f63b58f5b7819073fb086a4415d6ef4ad2bf4fd1
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Mon Nov 4 12:28:48 2013 -0800

Fix deprecation warning.
    
    BaseException.message is deprecated; if you have an exception of type
    Exception (or subclass thereof), then "str(ex)" is the preferred way
    to get the message.
    
    Change-Id: I5b4acf88de538c1ef0f2db4fefaa92699937cd50
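
A short illustration of the preferred form. The exception and its text are made up for the example.

```python
try:
    raise ValueError('bad ring file')
except ValueError as ex:
    # str(ex) works on Python 2 and 3; ex.message was deprecated in
    # Python 2.6 and does not exist on Python 3.
    message = str(ex)
```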

commit dde510590cd77e19e988de83a7a5a0beb6f1be8c
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Wed Oct 30 12:43:54 2013 -0700

Fix quarantine, error counts in audit logs
    
    Any quarantines and errors that happened between the last recon dump
    and the end of an audit pass weren't getting counted in the logs.
    
    This is particularly easy to see on a mostly-empty SAIO: go corrupt a
    file, and the auditor will probably tell you (a) that it's
    quarantining a file, and then (b) that 0 files were quarantined.
    
    Change-Id: I78e32b911e457078144564e3be42527260148ade

commit b20bab41b12f6b714f78a607f3a49fa407db05d9
Author: Peter Portante <peter.portante@redhat.com>
Date:   Tue Oct 1 11:30:15 2013 -0400

Remove unnecessary swift_conn comments
    
    Change-Id: I659073a979e2ed6f76cc0df828e600dc1d955b90
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 023a0615874443e2fc28d0df64a5ab5dd485405d
Author: Peter Portante <peter.portante@redhat.com>
Date:   Sun Aug 18 13:37:44 2013 -0400

Tie socket write buffer size to server parameters
    
    By default, Python 2.*'s standard library "socket" module performs 8K
    writes. For 10ge networks, with large MTUs (typically 9,000), this is
    not optimal. We tie the default buffer size to the client_chunk_size
    parameter for the proxy server, and to the network_chunk_size for the
    object server.
    
    One might be tempted to ask, isn't there a way to set this value on a
    per-request basis? This author was unable to find a reference to the
    _fileobject in the context of WSGI. By the time a request is passed to a
    WSGI object's __call__ method, the "wfile" attribute of the
    req.environ['eventlet.input'] (Input) object has been set to None, and
    the "rfile" attribute is the object wrapping the socket for reading,
    not writing.
    
    One might also be tempted to ask, why not just override the
    wsgi.HttpProtocol's "wbufsize" class attribute instead? Until
    eventlet/wsgi.py is fixed, we can't set wsgi.HttpProtocol.wbufsize to
    anything but zero (the default, see Python's SocketServer.py,
    StreamRequestHandler class), since Eventlet does not ensure the socket
    _fileobject's flush() method is called after Eventlet invokes a
    write() method on the same.  NOTE: wbufsize (a class attribute of
    StreamRequestHandler originally, not to be confused with the standard
    library's socket._fileobject._wbufsize class attribute) is used for
    the bufsize parameter of the connection object's makefile() method. As
    a result, the socket's _fileobject code uses that value to set both
    _rbufsize and _wbufsize. While that would allow us to transmit in 64KB
    chunks, it also means that write() and writeline() method calls on the
    socket _fileobject are only transmitted once 64KB have been
    accumulated, or a flush() is called.
    
    As for performance improvement:
    
    Run       8KB   64KB    Diff  Improvement
      0     8.101  6.367
      1     7.892  6.216
      2     7.732  6.246
      3     7.594  6.229
      4     7.594  6.292
      5     7.555  6.230
      6     7.575  6.270
      7     7.528  6.278
      8     7.547  6.304
      9     7.550  6.313
    Average 7.667  6.275  1.3923  18.16%
    
    Run using the following after adjusting the test value for obj_len to
    1 GB:
    
    nosetests -v --nocapture --nologcapture \
    test/unit/proxy/test_server.py:TestProxyObjectPerformance.test_GET_debug_large_file
    
    Change-Id: I4dd93acc3376e9960fbdcdcae00c6d002e545894
    Signed-off-by: Peter Portante <peter.portante@redhat.com>
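
The last two figures on the Average row of the table above appear to be the difference between the two averages and that difference as a percentage of the 8KB average. A quick check under that assumption, using the ten runs as listed:

```python
runs_8kb = [8.101, 7.892, 7.732, 7.594, 7.594,
            7.555, 7.575, 7.528, 7.547, 7.550]
runs_64kb = [6.367, 6.216, 6.246, 6.229, 6.292,
             6.230, 6.270, 6.278, 6.304, 6.313]

avg_8kb = sum(runs_8kb) / len(runs_8kb)
avg_64kb = sum(runs_64kb) / len(runs_64kb)

# Assumed meaning of the two unlabeled columns on the Average row.
difference = avg_8kb - avg_64kb
improvement_pct = difference / avg_8kb * 100.0

print('%.3f %.3f %.4f %.2f%%'
      % (avg_8kb, avg_64kb, difference, improvement_pct))
```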

commit fc9f9d7baebfa98cdbd9fbe8289e82268ab5e149
Author: Pete Zaitcev <zaitcev@kotori.zaitcev.us>
Date:   Tue Oct 29 16:10:47 2013 -0600

Strengthen account tests
    
    This fell out of Peter Portante's "acctcont-api" branch and seems
    obviously good. Apparently one of these would've triggered while
    doing the Pluggable Backend work.
    
    Now, why commit this separately while simultaneously working on
    a big unified Backend Patch? Because posting them separately
    proves that the test changes worked on the old code.
    
    Change-Id: I9ca6ad45fb255f5c0a177a93b93c1acc68da5bbe

commit f52d96b904a375625df0c2cc6ff9c828ab47e665
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Fri Oct 25 12:02:42 2013 -0700

Quarantine objects with busted metadata.
    
    Before, if you encountered an object with corrupt or missing xattrs,
    the object server would return a 500 on GET, and wouldn't quarantine
    anything. Now the object server returns a 404 for that GET and the
    corrupted file is quarantined, thus giving replication a chance to fix
    it.
    
    Change-Id: Ib1d7ab965391742c88fde3d83dc0b5afe85bada9

commit 97b5f59ea4407f9b5696b2b6f2e6d14e4e686d0f
Author: Jon Snitow <otherjon@swiftstack.com>
Date:   Mon Oct 28 17:28:57 2013 -0700

HEAD on account returns 410 if account was deleted and not yet reaped
    
     * updated CONTRIBUTING.md
     * improved unit tests
     * 100% coverage for proxy acct controller tests
     * invoke fake_http_connect correctly
    
    Change-Id: I0826c5a1c52efdd5ae95f7fde8024f2bff0751ba

commit 0717133197d8929a03ec267d4e3c5b1696f02fb3
Author: John Dickinson <me@not.mn>
Date:   Tue Oct 29 12:29:49 2013 -0700

Make pbr a build-time only dependency
    
    This lets you build swift packages that don't require pbr
    to be installed at all. You would need pbr on the machine running
    rpmbuild / debuild, but not on the machines that install the packages.
    
    Unfortunately, this does not make swift able to be
    installed via pip 0.3.1 on Lucid; you'll need to uninstall the system
    python-pip package and install a new pip some other way. Given that
    pip < 1.3 doesn't perform SSL certificate validation for pypi (trivial
    MITM attack, anyone?), you'd probably want to get a new pip anyway.
    
    Change-Id: Ia50a229c5ae4dd2158beeaa953619b5e8f987c55

commit 56c902c8deed5bad6e8ae3993c4634207f6aa7ef
Author: David Goetz <david.goetz@rackspace.com>
Date:   Mon Oct 28 17:19:18 2013 +0000

form post over XMLHttpRequest (cors) broken
    
    Change-Id: Ia55e0d3974a96e11d49ab3cb26b6dcd7129b5cc8

commit 264766127ebea3f0a72cf696197b7acb7e4e015e
Author: Kun Huang <academicgareth@gmail.com>
Date:   Mon Oct 28 17:41:09 2013 +0800

improve docs in etc/dispersion.conf-sample
    
    1. add a comment suggesting a new account for the dispersion tools
    2. change sample url for keystone from 'saio' to 'localhost'
    
    Change-Id: I4683f5eb0af534b39112f1b7420f67d569c29b3a

commit d6c65c34aa927911b2fbb82d62913f204e58b2c0
Author: David Goetz <david.goetz@rackspace.com>
Date:   Fri Oct 25 19:44:46 2013 +0000

catch decompression errors
    
    Change-Id: Ica380edc2364a5e18cefc26f70710e18ea329cfa

commit a454d1271cc640fac26d6f7381f6d554be363209
Author: Zhenguo Niu <Niu.ZGlinux@gmail.com>
Date:   Fri Oct 25 16:28:14 2013 +0800

Update my mailmap
    
    Using new email address.
    
    Change-Id: I9d49992e64302ff5d879761fa63dd01cefb63cad

commit 0a26bb20b198939f309a47c8ba05bc9e62193a71
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Mon Oct 21 13:26:11 2013 -0700

Simplify callers of diskfile.[read|write]_metadata()
    
    As it happens, diskfile.read_metadata() and diskfile.write_metadata()
    can take either an open file or a filename as their first arguments
    (since xattr.[get|set]xattr() can), so we can clean up a couple places
    where we were opening a file just to call read_metadata() or
    write_metadata() on it. This results in 2 fewer system calls.
    
    Example strace output:
    
    /* read_metadata(filename) */
    getxattr("/mnt/sdb1/1/node/sdb1/afile", "user.some.key", 0x0, 0) = 10
    getxattr("/mnt/sdb1/1/node/sdb1/afile", "user.some.key", "some-value", 10) = 10
    
    /* fp = open(filename); read_metadata(fp) */
    open("/mnt/sdb1/1/node/sdb1/afile", O_RDONLY) = 4
    fstat(4, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
    fgetxattr(4, "user.some.key", 0x0, 0)   = 10
    fgetxattr(4, "user.some.key", "some-value", 10) = 10
    
    Change-Id: I321d8663b9e9e47b8f3ee6c21a1b65b408bb80e6

commit b48435cd25070d6d56987503908d3cb6bc03b9d5
Author: Peter Portante <peter.portante@redhat.com>
Date:   Thu Oct 24 12:23:39 2013 -0400

Fix UnboundLocalError on account PUT
    
    Fixes bug-1243973
    
    Change-Id: I67143535c0f7a0c6b53f67329a0bb128a355a4de
    Signed-off-by: Peter Portante <peter.portante@redhat.com>

commit 2664f58d10b6a210c1e46256226789f4d7931cbf
Author: Pete Zaitcev <zaitcev@kotori.zaitcev.us>
Date:   Tue Oct 22 17:29:38 2013 -0600

Add colon after service name
    
    In Fedora 21 the log format changed from this:
    
    Oct 10 23:54:20 rhev-a24c-01 proxy-server 10.10.55.128 .....
    
    to this:
    
    Oct 10 23:57:49 kvm-rei journal: proxy-server 192.168.128.11 .....
    
    It is clearly incompatible, because the word "journal:" is added.
    
    It happens because all the log messages are filtered by Systemd
    and it adds "journal:" if it cannot find a word ending with colon.
    See:
     https://bugzilla.redhat.com/show_bug.cgi?id=1018042
    
    It seems that one simple fix could be in Swift. We already add
    the identifier word, it's just we do not end it in a colon.
    Therefore, we could start adding a colon to make the log look
    more like any other system log.
    
    Unfortunately, this is not entirely compatible either! The number
    of words is kept, so if you parse with regexp, you get the things
    in expected places, but the name of the server now has a colon.
    Would be simple enough to fix, I suppose, but still, this needs
    a consideration. Reviewers should feel free to put -1 or -2 on
    this if they find an application that breaks with this patch.
    
    Change-Id: I0d641ae49e6fc989283868ade2ca542a8133cb07
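
For log consumers worried about the compatibility issue raised above, a parser can tolerate the optional colon. The sketch below is an illustration, not part of Swift. The first sample line in the usage note is the old format quoted in the commit message; the second, with "proxy-server:", is an assumed rendering of the format after this patch. It does not handle the "journal:" variant.

```python
import re

# Month, day, time, host, server name (optional trailing colon), rest.
LOG_LINE = re.compile(
    r'^(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+'
    r'(?P<host>\S+)\s+'
    r'(?P<server>[^\s:]+):?\s+'
    r'(?P<rest>.*)$')


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None
```

Both `parse_log_line('Oct 10 23:54:20 rhev-a24c-01 proxy-server 10.10.55.128 .....')` and the same call with `proxy-server:` yield a "server" field without the colon, so downstream code sees the same name either way.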

commit 6342cb387a8538a8b59e4a4ba9e2574ac23ff45f
Author: Fabien Boucher <fabien.boucher@enovance.com>
Date:   Tue Oct 15 17:46:01 2013 +0200

Handle COPY verb in account quota middleware
    
    Since the COPY verb allows copying an existing object,
    we must check the size of the source object before
    allowing the copy.
    
    Fixes bug: #1201884
    
    Change-Id: Ia37bc716be0c3e5a3174dc4370bb5084f81073ad
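
The check described in this commit message amounts to a simple comparison. A minimal sketch, assuming the middleware knows the account's current byte usage and its quota; copy_exceeds_quota and its parameters are hypothetical names, not the middleware's API.

```python
def copy_exceeds_quota(bytes_used, quota_bytes, source_object_size):
    """Return True if copying the source object would exceed the quota.

    quota_bytes of None means no quota is set on the account.
    """
    if quota_bytes is None:
        return False
    return bytes_used + source_object_size > quota_bytes
```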

John Dickinson (notmyname) on 2013-12-06

Changed in swift:
milestone:	none → 1.11.0

Thierry Carrez (ttx) on 2013-12-10

Changed in swift:
status:	Fix Committed → Fix Released


WangChao (chaowsh) wrote on 2014-01-08:

#17

Another problem has come up. The /recon/diskusage output now looks like this:

[{"device": "sdb", "avail": 32162545664, "mounted": true, "used": 33980416, "size": 32196526080}, {"device": "sdd", "avail": 32162545664, "mounted": true, "used": 33980416, "size": 32196526080}, {"device": "sdc", "avail": 32162312192, "mounted": true, "used": 34213888, "size": 32196526080}, {"device": "sde", "avail": "", "mounted": "[Errno 5] Input/output error: '/srv/node/sde'", "used": "", "size": ""}]

And then when I run "swift-recon -d", I get the error below:

Traceback (most recent call last):
  File "/usr/local/bin/swift-recon", line 5, in <module>
    pkg_resources.run_script('swift==1.11.0', 'swift-recon')
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 499, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1239, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/swift-1.11.0-py2.7.egg/EGG-INFO/scripts/swift-recon", line 867, in <module>
    reconnoiter.main()
  File "/usr/local/lib/python2.7/dist-packages/swift-1.11.0-py2.7.egg/EGG-INFO/scripts/swift-recon", line 855, in main
    self.disk_usage(hosts, options.top, options.human_readable)
  File "/usr/local/lib/python2.7/dist-packages/swift-1.11.0-py2.7.egg/EGG-INFO/scripts/swift-recon", line 660, in disk_usage
    * 100.0
ValueError: could not convert string to float:
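
The traceback is consistent with float() being called on the empty "used" or "size" string of the sde entry. That is an inference from the output above, not something confirmed against the swift-recon script. A minimal sketch of a client-side guard follows; percent_used is a hypothetical helper, not swift-recon's code.

```python
def percent_used(entries):
    """Return ({device: percent used}, [skipped device names]).

    Entries whose "mounted" value is not True (False, or an error
    string) or whose size is empty are skipped rather than converted.
    """
    usage = {}
    skipped = []
    for entry in entries:
        if entry.get('mounted') is not True or not entry.get('size'):
            skipped.append(entry['device'])
            continue
        usage[entry['device']] = (
            float(entry['used']) / float(entry['size']) * 100.0)
    return usage, skipped
```

Fed the four entries shown above, this would compute a percentage for sdb, sdd and sdc and list sde as skipped instead of raising ValueError.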
