container-sync apparently stuck on certain objects with status 408 or 409

Bug #1385171 reported by Filippo Giunchedi
This bug affects 1 person
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Undecided
Assigned to: Eran Rom
Milestone:

Bug Description

hi,
we've been trying container-sync again with swift 1.13 and are running into some issues
with (afaict) some objects.

There are two swift clusters, eqiad being the primary and codfw being the
secondary, so we'd like to set up container sync eqiad -> codfw for ~5k out of
~40k containers in eqiad.

Setting up container sync seems to work except that on some containers/objects
it seemingly gets stuck, e.g.

eqiad# swift stat wikipedia-it-local-thumb.fc
       Account: AUTH_mw
     Container: wikipedia-it-local-thumb.fc
       Objects: 8435
         Bytes: 385823776
      Read ACL: mw:media,.r:*
     Write ACL: mw:media
       Sync To: //mw_media/codfw/AUTH_mw/wikipedia-it-local-thumb.fc
      Sync Key: REDACTED
 Accept-Ranges: bytes
   X-Timestamp: 1381945180.34197
    X-Trans-Id: tx307f2b3824b24de6aafd1-00544a2840
  Content-Type: text/plain; charset=utf-8

codfw# swift stat wikipedia-it-local-thumb.fc
       Account: AUTH_mw
     Container: wikipedia-it-local-thumb.fc
       Objects: 1291
         Bytes: 58424838
      Read ACL:
     Write ACL:
       Sync To:
      Sync Key: REDACTED
 Accept-Ranges: bytes
   X-Timestamp: 1414080235.37348
    X-Trans-Id: txe6677cd472114f098271c-00544a2839
  Content-Type: text/plain; charset=utf-8
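
To put a number on how far behind the destination is, the two `swift stat` outputs above can be compared programmatically. This is only a minimal sketch: the helper names `parse_swift_stat` and `sync_gap` are hypothetical (not part of swift or python-swiftclient), and it assumes the plain "Key: value" layout shown above, abbreviated here to the two relevant fields.

```python
def parse_swift_stat(text):
    """Parse the 'Key: value' lines of `swift stat` output into a dict.

    Splits on the first colon only, so values that contain colons
    (ACLs, Sync To) are kept whole.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def sync_gap(source_stat, dest_stat):
    """Return (objects, bytes) still missing on the destination."""
    src = parse_swift_stat(source_stat)
    dst = parse_swift_stat(dest_stat)
    return (
        int(src["Objects"]) - int(dst["Objects"]),
        int(src["Bytes"]) - int(dst["Bytes"]),
    )


# Abbreviated copies of the two outputs from this report.
EQIAD = """\
     Container: wikipedia-it-local-thumb.fc
       Objects: 8435
         Bytes: 385823776
      Read ACL: mw:media,.r:*
"""
CODFW = """\
     Container: wikipedia-it-local-thumb.fc
       Objects: 1291
         Bytes: 58424838
      Read ACL:
"""

missing_objects, missing_bytes = sync_gap(EQIAD, CODFW)
print(missing_objects, missing_bytes)
```

For the container above this gives 7144 objects (327398938 bytes) not yet synced to codfw.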

we have observed from the object-server logs in codfw that for certain objects
(e.g. f/fc/Esistono_gli_angeli%3F.jpg/225px-Esistono_gli_angeli%3F.jpg)
object-server replies with 408 (or 409) to PUT, proxy-server relays that
back, and eventually the container-sync processes all get stuck on that failing
object.

the attached log below shows when this first happens (i.e. the initial
container-sync)

I've also attached object/container info for an object that doesn't work and
one that works at the bottom.

I'd be happy to provide more details if needed!

thanks,
filippo

Filippo Giunchedi (filippo) wrote :

object details

Filippo Giunchedi (filippo) wrote :

hi,
any ideas on what could be causing this? happy to provide more logs/tests if need be

thanks!

Eran Rom (eranr) wrote :

Confirmed.
I believe this is a duplicate of https://bugs.launchpad.net/swift/+bug/1419901

Changed in swift:
status: New → Confirmed
assignee: nobody → Eran Rom (eranr)
Tim Burke (1-tim-z) wrote :

Following https://github.com/openstack/swift/commit/f68e22d4 (swift 2.25.0+) we should no longer get stuck on 409s; 408 (Request Timeout) seems like a legit reason to fail and retry the transfer later.
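
The distinction drawn in this comment can be modelled roughly as below. This is an illustrative sketch only, not the code from the commit above: the function name `classify_sync_put` and the outcome labels are hypothetical, and the mapping simply restates the comment (a 409 should no longer block progress, a 408 is a failure to retry later).

```python
def classify_sync_put(status):
    """Map the HTTP status of a sync PUT to a hypothetical outcome label.

    'synced' - the destination accepted the object
    'skip'   - 409 Conflict: do not block on this object, move on
    'retry'  - anything else (including 408 Request Timeout): fail this
               pass and attempt the transfer again later
    """
    if 200 <= status < 300:
        return "synced"
    if status == 409:
        return "skip"
    return "retry"


for status in (201, 408, 409):
    print(status, classify_sync_put(status))
```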

Changed in swift:
status: Confirmed → Fix Released