OpenStack Object Storage (swift)

Encryption doesn't play well with processes that copy cleartext data while preserving timestamps

Bug #1910804 reported by Tim Burke on 2021-01-08

Affects                           Status        Importance  Assigned to  Milestone
OpenStack Object Storage (swift)  Fix Released  Undecided   Unassigned

Bug Description

There are at least two processes that use internal-clients to pull user data out of a cluster and re-upload it with the same (or very nearly the same) timestamp:

Container-sync has a special carve-out to allow client-settable X-Timestamp values so that the modification times between different clusters will match. It also needs to decrypt user data as the remote may not have encryption enabled and, even if it does, there's no guarantee that the two clusters share knowledge of root secrets.

The reconciler deterministically adds an offset to the timestamp when moving data between policies. It may not need to decrypt client data, but it is often configured to use a shared internal-client config that may need to have encryption enabled for other use-cases.

When both of those were originally written, there was an assumption that preserving timestamps like that would be safe, because the write should be idempotent -- if two processes tried to do the same work, the data going out to disk should be the same.

With encryption enabled, however, writes become non-deterministic -- we have to choose random values for the body key, body key iv, body iv, and various metadata ivs. As a result, concurrent writers almost certainly *do not* try to write the same data to disk.
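A minimal sketch of why the idempotency assumption breaks, using only the Python standard library. The `toy_encrypt` and `encrypted_put` helpers are hypothetical stand-ins, not Swift's encryption middleware: the keystream is a SHA-256 based toy in place of a real cipher, and the crypto-meta is reduced to a body key and one IV. The point is only that two writers given identical cleartext and an identical timestamp pick different random values, so their on-disk bytes differ.

```python
import hashlib
import os


def toy_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with a SHA-256 based keystream.

    Illustration only: a toy stand-in for a real stream cipher, not secure
    and not the code Swift uses.
    """
    stream = b""
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def encrypted_put(plaintext: bytes, timestamp: str) -> dict:
    """Simulate one writer: choose a fresh random body key and IV per write."""
    body_key = os.urandom(32)
    body_iv = os.urandom(16)
    return {
        "timestamp": timestamp,
        "crypto_meta": {"body_key": body_key, "iv": body_iv},
        "ciphertext": toy_encrypt(body_key, body_iv, plaintext),
    }


data = b"same cleartext object body"
ts = "1610064000.00000"  # hypothetical preserved timestamp

# Two processes doing "the same work" with the same preserved timestamp.
write_a = encrypted_put(data, ts)
write_b = encrypted_put(data, ts)

same_timestamp = write_a["timestamp"] == write_b["timestamp"]
same_bytes = write_a["ciphertext"] == write_b["ciphertext"]
print(same_timestamp, same_bytes)
```

The timestamps compare equal while the ciphertexts almost certainly do not, so timestamp equality no longer implies the writes are interchangeable.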

When writing to a replicated policy, this isn't too much of a problem. Each replica is self-contained, and any one of them can service reads. It'll likely cause some confusion at some point when manually comparing replicas, but it shouldn't lead to data loss.

Erasure-coded policies are a problem, though: we've observed objects moved by the reconciler with three distinct fragment sets in an 8+4 policy:

  * seven fragments were encrypted with some set-of-crypto-meta A,
  * three fragments were encrypted with set B, and
  * one fragment was encrypted with set C.

As a result, we don't have enough fragments to decode: no single set reaches the 8 fragments an 8+4 policy needs. *Maybe* we once could, but since we can only find 11 of the 12 frags now... looks like we're out of luck.
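The arithmetic behind the counts above can be sketched as follows. This assumes that fragments written under different crypto-meta are encodings of different ciphertexts and so cannot be mixed when decoding; `decodable_sets` is a hypothetical helper for illustration, not a Swift function.

```python
from collections import Counter


def decodable_sets(frag_sets, ndata):
    """Return the crypto-meta sets that have at least ndata fragments."""
    counts = Counter(frag_sets)
    return [name for name, n in counts.items() if n >= ndata]


NDATA, NPARITY = 8, 4  # the 8+4 policy from the report

# Observed fragments, labelled by the crypto-meta set they were written with.
found = ["A"] * 7 + ["B"] * 3 + ["C"] * 1

total_found = len(found)
usable = decodable_sets(found, NDATA)
print(total_found, "of", NDATA + NPARITY, "fragments found; decodable sets:", usable)
# prints: 11 of 12 fragments found; decodable sets: []
```

Even if the missing twelfth fragment had belonged to set A, that set would only just have reached the 8 needed, which is why the object may once have been readable.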
