Container replicator can propagate corrupt timestamps
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | New | Undecided | Unassigned |
Bug Description
Seen in the wild [1]: Apparently some forms of container DB corruption can leave users with mangled created_at timestamps like '14870&4222&3r980' that otherwise go undetected and are not cleaned up by the auditor. Naturally, this causes errors on GET, with tracebacks like:
ERROR __call__ error with GET /disk48/
Traceback (most recent call last):
File ".../swift/
res = method(req)
File ".../swift/
return func(*a, **kw)
File ".../swift/
resp = func(ctrl, *args, **kwargs)
File ".../swift/
resp_headers = gen_resp_
File ".../swift/
'X-
File ".../swift/
self.timestamp = float(parts.pop(0))
ValueError: invalid literal for float(): 14870&4222&3r980
Even more fun, since merge_timestamps deals in SQL [2], the replicators manage to settle on the corrupted version! We should
(1) on the receiver, validate that the timestamps we're merging are actually valid; if they aren't, abort replication, since the other side doesn't seem to know what it's talking about,
(2) on the sender, validate the timestamps we're about to send, and quarantine ourselves if they aren't valid, and
(3) in the auditor, similarly check for valid timestamps and quarantine if they aren't valid.
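All three proposals boil down to the same check, so a rough sketch of it may help the discussion. This is not taken from the swift tree: is_valid_timestamp and find_invalid_timestamps are hypothetical helpers, the accepted format (decimal seconds, optional fraction, optional "_<hex offset>" suffix) is an assumption, and the put_timestamp and delete_timestamp key names are assumed alongside the created_at field reported above.

```python
import re

# Assumed timestamp shape, for illustration only: decimal seconds, an
# optional fractional part, and an optional "_<hex offset>" suffix.
_TIMESTAMP_RE = re.compile(r'\d+(\.\d+)?(_[0-9a-f]+)?')


def is_valid_timestamp(value):
    """Return True if value is a string matching the assumed format."""
    if not isinstance(value, str):
        return False
    # fullmatch rejects stray characters anywhere, including a trailing
    # newline that a plain '$' anchor would let through.
    return _TIMESTAMP_RE.fullmatch(value) is not None


def find_invalid_timestamps(info, keys=('created_at', 'put_timestamp',
                                        'delete_timestamp')):
    """Return the names of timestamp fields in info that fail validation.

    info is a plain dict standing in for a container DB info record;
    the key names are assumptions. Missing keys are reported as invalid.
    """
    return [key for key in keys if not is_valid_timestamp(info.get(key))]


if __name__ == '__main__':
    info = {
        'created_at': '14870&4222&3r980',   # the mangled value from this bug
        'put_timestamp': '1487042223.00980',
        'delete_timestamp': '0',
    }
    print(find_invalid_timestamps(info))
```

A receiver could refuse the merge when the list is non-empty for the incoming values, while the sender and the auditor could quarantine the local DB when it is non-empty for their own values.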
[1] http://
[2] https:/