Container replicator can propagate corrupt timestamps

Bug #1823785 reported by Tim Burke
This bug affects 1 person
Affects: OpenStack Object Storage (swift)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Seen in the wild [1]: Apparently some forms of container DB corruption can leave users with mangled created_at timestamps like '14870&4222&3r980' that otherwise go undetected and are never cleaned up by the auditor. Naturally, this causes errors on GET, with tracebacks like

ERROR __call__ error with GET /disk48/67379/AUTH_XXXXXX/XXXXXX :
Traceback (most recent call last):
  File ".../swift/container/server.py", line 582, in __call__
    res = method(req)
  File ".../swift/common/utils.py", line 2693, in wrapped
    return func(*a, **kw)
  File ".../swift/common/utils.py", line 1230, in _timing_stats
    resp = func(ctrl, *args, **kwargs)
  File ".../swift/container/server.py", line 469, in GET
    resp_headers = gen_resp_headers(info, is_deleted=is_deleted)
  File ".../swift/container/server.py", line 54, in gen_resp_headers
    'X-Backend-Timestamp': Timestamp(info.get('created_at', 0)).internal,
  File ".../swift/common/utils.py", line 756, in __init__
    self.timestamp = float(parts.pop(0))
ValueError: invalid literal for float(): 14870&4222&3r980
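
The last frame can be reproduced in isolation. This is a minimal sketch, not Swift's Timestamp class: parse_timestamp is a hypothetical stand-in, and the split on '_' for an optional offset is an assumption; only the float(parts.pop(0)) call is taken from the traceback above.

```python
def parse_timestamp(value):
    """Hypothetical stand-in for the parsing step shown in the traceback."""
    # Assumption: an optional '_'-separated offset is split off first.
    parts = str(value).split('_', 1)
    # Same call as the failing frame: the leading part must parse as a float.
    return float(parts.pop(0))


try:
    parse_timestamp('14870&4222&3r980')
    failed = False
except ValueError as err:
    failed = True
    print('GET would fail with:', err)
```

The exact wording of the ValueError message differs between Python 2 and Python 3, but the corrupted string is rejected either way.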

Even more fun, since merge_timestamps deals in SQL [2], the replicators manage to settle on the corrupted version! We should

(1) on the receiver, have some validation that the timestamps we're merging are actually valid; if they aren't, abort replication, since this other guy doesn't seem to know what he's talking about,
(2) on the sender, validate the timestamps we're about to send, and quarantine ourselves if they aren't valid, and
(3) in the auditor, similarly check for valid timestamps and quarantine if they aren't valid.
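
To make the failure mode and proposal (1) concrete, here is a rough sketch using only the standard library. Several things are assumptions for illustration and not Swift code: the helper names is_valid_timestamp and merge_created_at, the one-column container_stat table, the well-formed value GOOD, and the premise that created_at is merged with a text-based MIN() (my reading of [2]).

```python
import math
import sqlite3

GOOD = '1487044222.33980'   # assumed well-formed value, for illustration only
BAD = '14870&4222&3r980'    # corrupted value quoted in this report


def is_valid_timestamp(value):
    """Hypothetical check: the float part parses, is finite and non-negative;
    an optional '_'-separated offset parses as hex."""
    base, _, offset = str(value).partition('_')
    try:
        ts = float(base)
        if offset:
            int(offset, 16)
    except ValueError:
        return False
    return math.isfinite(ts) and ts >= 0


def merge_created_at(conn, incoming, validate=True):
    """Merge an incoming created_at into the local row (simplified)."""
    if validate and not is_valid_timestamp(incoming):
        # Proposal (1): abort instead of trusting the sender.
        raise ValueError('refusing to merge invalid timestamp %r' % (incoming,))
    # Assumed merge rule: keep the smaller value, compared as text.
    conn.execute('UPDATE container_stat SET created_at = MIN(?, created_at)',
                 (incoming,))
    return conn.execute('SELECT created_at FROM container_stat').fetchone()[0]


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE container_stat (created_at TEXT)')
conn.execute('INSERT INTO container_stat VALUES (?)', (GOOD,))

# Without validation, the text comparison prefers the corrupted string,
# because '&' sorts before the digits.
unvalidated = merge_created_at(conn, BAD, validate=False)
print('merged without validation:', unvalidated)

# Restore the good value, then retry with validation enabled.
conn.execute('UPDATE container_stat SET created_at = ?', (GOOD,))
try:
    merge_created_at(conn, BAD)
    rejected = False
except ValueError:
    rejected = True
print('rejected with validation:', rejected)
```

The same is_valid_timestamp check could plausibly back proposals (2) and (3) as well, with the sender or auditor quarantining the DB instead of raising. The check is deliberately lenient (it accepts anything float() accepts that is finite and non-negative); a real fix may want to be stricter.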

[1] http://eavesdrop.openstack.org/irclogs/%23openstack-swift/%23openstack-swift.2019-04-01.log.html#t2019-04-01T18:57:43
[2] https://github.com/openstack/swift/blob/2.21.0/swift/common/db.py#L566-L570
