finalize_durable traceback and fragment lost during PUT with old x-timestamp

Bug #1936508 reported by Alistair Coles
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Undecided
Assigned to: Alistair Coles

Bug Description

We've seen tracebacks from PUTs of EC fragments:

Jun 25, 2021 @ 15:50:31.205 object-server - - 2021-06-25T15:50:31.205521+00:00 ss4009 object-server: err ERROR __call__ error with PUT <snip> :
Traceback (most recent call last):
  File "/opt/ss/lib/python2.7/site-packages/swift/obj/server.py", line 1338, in __call__
    res = getattr(self, req.method)(req)
  File "/opt/ss/lib/python2.7/site-packages/swift/common/utils.py", line 2067, in _timing_stats
    resp = func(ctrl, *args, **kwargs)
  File "/opt/ss/lib/python2.7/site-packages/swift/obj/server.py", line 1053, in PUT#012    writer.commit(request.timestamp)
  File "/opt/ss/lib/python2.7/site-packages/swift/obj/diskfile.py", line 3191, in commit
    timestamp)#012  File "/opt/ss/lib/python2.7/site-packages/eventlet/tpool.py", line 129, in execute#012    six.reraise(c, e, tb)
  File "/opt/ss/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker#012    rv = meth(*args, **kwargs)
  File "/opt/ss/lib/python2.7/site-packages/swift/obj/diskfile.py", line 3170, in _finalize_durable
    raise exc
DiskFileError: Problem making data file durable /srv/node/d10719/objects-1/12760/c64/18ec6afbb7c759618fa2eeed9beefc64/1617866962.00000#10#d.data ([Errno 2] No such file or directory: '/srv/node/d10719/objects-1/12760/c64/18ec6afbb7c759618fa2eeed9beefc64') (txn: txce4d79b58ff44d0aa08ee-0060d5fb41)
Jun 25, 2021 @ 15:50:31.199 object-server - - 2021-06-25T15:50:31.199604+00:00 ss4009 object-server: err Problem making data file durable /srv/node/d10719/objects-1/12760/c64/18ec6afbb7c759618fa2eeed9beefc64/1617866962.00000#10#d.data ([Errno 2] No such file or directory: '/srv/node/d10719/objects-1/12760/c64/18ec6afbb7c759618fa2eeed9beefc64'):
Traceback (most recent call last):
  File "/opt/ss/lib/python2.7/site-packages/swift/obj/diskfile.py", line 3129, in _finalize_durable
    files = os.listdir(self._datadir)
OSError: [Errno 2] No such file or directory: '/srv/node/d10719/objects-1/12760/c64/18ec6afbb7c759618fa2eeed9beefc64' (txn: txce4d79b58ff44d0aa08ee-0060d5fb41)
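
For context, a minimal sketch (assumed names, not Swift's actual implementation) of the finalize step those tracebacks come from: the writer re-lists its hash dir and renames the freshly written fragment to its durable name, so if another process has already removed the hash dir, the os.listdir() call fails with ENOENT and the commit errors out as above:

import os

def finalize_durable_sketch(datadir, data_filename):
    # re-list the hash dir; this is where the OSError above is raised when the
    # dir has already been reclaimed out from under the writer
    try:
        os.listdir(datadir)
    except OSError as err:
        raise Exception('Problem making data file durable %s (%s)'
                        % (os.path.join(datadir, data_filename), err))
    # rename e.g. 1617866962.00000#10.data -> 1617866962.00000#10#d.data
    durable_filename = data_filename.replace('.data', '#d.data')
    os.rename(os.path.join(datadir, data_filename),
              os.path.join(datadir, durable_filename))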

On investigation, the diskfile dir does not exist, and in some cases the object is left with fewer than ndata fragments because multiple fragments hit this error during finalize_durable.

The proxy server should return 500 to the client if fewer than ndata fragments are made durable. However, a HEAD of the object will return success because some durable fragments do exist.

We were first alerted to this by seeing reconstructor warnings such as:

2021-07-15T12:23:09.926371+00:00 cpu0353 object-reconstructor: err [worker 1/4 pid=274882] Unable to get enough responses (5/8 from 5 ok responses) to reconstruct durable ...
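
To make the arithmetic concrete, a small illustration (the 8+4 policy and fragment count are assumed from the 5/8 in the warning above, not taken from any cluster config): a HEAD only needs to find one durable fragment, but reconstruction needs at least ndata distinct fragments, so the object can look healthy while being unrecoverable:

ndata, nparity = 8, 4       # assumed EC 8+4 policy, matching the 5/8 in the warning
durable_fragments = 5       # durable fragments that survived

head_succeeds = durable_fragments >= 1          # HEAD only needs one durable frag
reconstructable = durable_fragments >= ndata    # decoding needs at least ndata frags

print(head_succeeds)      # True: the object looks healthy to clients
print(reconstructable)    # False: the data cannot be rebuilt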

My hypothesis for the cause is:

* an agent PUTs an EC object *with an x-timestamp in the past, older than reclaim_age* (container-sync, the reconciler or another agent might do this)
* the object-server writes a non-durable fragment to disk
* the reconstructor (or another daemon, perhaps the auditor) passes over the hash dir and calls DiskFileManager.cleanup_ondisk_files()
* cleanup_ondisk_files() finds the non-durable data older than reclaim_age and removes it
* the object-server calls finalize_durable, which generates the traceback and fails to make the data durable
* the data file is lost (a sketch of the reclaim check follows this list)
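
A minimal sketch of the reclaim decision the hypothesis relies on (helper and constant names are mine, not Swift's): the staleness test compares the fragment's *timestamp*, not how recently the file was written, so a non-durable file created by a PUT with an old x-timestamp is immediately eligible for removal:

import time

RECLAIM_AGE = 7 * 24 * 60 * 60    # one week, the usual default

def looks_reclaimable(filename, now=None):
    # '1617866962.00000#10.data' is non-durable; '1617866962.00000#10#d.data' is durable
    now = time.time() if now is None else now
    timestamp = float(filename.split('#')[0])
    is_non_durable = filename.endswith('.data') and not filename.endswith('#d.data')
    return is_non_durable and timestamp < now - RECLAIM_AGE

# the non-durable file written by the old-x-timestamp PUT above is immediately
# eligible for removal, even though it only just hit the disk:
print(looks_reclaimable('1617866962.00000#10.data'))   # True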

This is similar to previously reported bugs [1] [2], but I believe the scenario is different: in particular, this bug manifests when the PUT x-timestamp is already older than reclaim_age.

[1] https://bugs.launchpad.net/swift/+bug/1554378 "EC: Durable frag_set is reclaimable in racing cond..."
[2] https://bugs.launchpad.net/swift/+bug/1886848 "rebalance error"

Changed in swift:
status: New → Confirmed
assignee: nobody → Alistair Coles (alistair-coles)
Changed in swift:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.opendev.org/c/openstack/swift/+/800974
Committed: https://opendev.org/openstack/swift/commit/bbaed18e9b681ce9cf26ffa6a5d5292f5cb219b7
Submitter: "Zuul (22348)"
Branch: master

commit bbaed18e9b681ce9cf26ffa6a5d5292f5cb219b7
Author: Alistair Coles <email address hidden>
Date: Thu Jul 15 18:08:01 2021 +0100

    diskfile: don't remove recently written non-durables

    DiskFileManager will remove any stale files during
    cleanup_ondisk_files(): these include tombstones and nondurable EC
    data fragments whose timestamps are older than reclaim_age. It can
    usually be safely assumed that a non-durable data fragment older than
    reclaim_age is not going to become durable. However, if an agent PUTs
    objects with specified older X-Timestamps (for example the reconciler
    or container-sync) then there is a window of time during which the
    object server has written an old non-durable data file but has not yet
    committed it to make it durable.

    Previously, if another process (for example the reconstructor) called
    cleanup_ondisk_files during this window then the non-durable data file
    would be removed. The subsequent attempt to commit the data file would
    then result in a traceback due to there no longer being a data file to
    rename, and of course the data file is lost.

    This patch modifies cleanup_ondisk_files to not remove old, otherwise
    stale, non-durable data files that were only written to disk in the
    preceding 'commit_window' seconds. 'commit_window' is configurable for
    the object server and defaults to 60.0 seconds.

    Closes-Bug: #1936508
    Related-Change: I0d519ebaaade35249fb7b17bd5f419ffdaa616c0
    Change-Id: I5f3318a44af64b77a63713e6ff8d0fd3b6144f13
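
A minimal sketch (assumed names; the commit_window default is taken from the commit message above) of the guard the fix adds: a non-durable fragment whose timestamp is older than reclaim_age is still left in place if the file itself was written within the last commit_window seconds, giving the object server time to commit it:

import os
import time

COMMIT_WINDOW = 60.0    # seconds; the default named in the commit message

def written_recently(path, now=None):
    # judge recency by the file's on-disk mtime, not by its swift timestamp
    now = time.time() if now is None else now
    try:
        return os.stat(path).st_mtime > now - COMMIT_WINDOW
    except OSError:
        return False    # file already gone; nothing to protect

def should_reclaim(path, timestamp, reclaim_age, now=None):
    now = time.time() if now is None else now
    stale_by_timestamp = timestamp < now - reclaim_age
    # the fix: leave alone any non-durable file that has only just been written
    return stale_by_timestamp and not written_recently(path, now)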

Changed in swift:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift 2.28.0

This issue was fixed in the openstack/swift 2.28.0 release.
