OpenStack Object Storage (swift)

POST can cause subsequent EC GET to return 503

Bug #1912014 reported by Alistair Coles on 2021-01-15

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
OpenStack Object Storage (swift)	Fix Committed	Undecided	Alistair Coles	

Bug Description

When backend servers hold a mix of durable fragments and newer non-durable fragments for an EC object, a GET will return 200 with the older durable object (assuming there are sufficient older durable fragments to reconstruct the object). However, if a POST is made to an object in that state, subsequent GETs may then return a 503. The bug is not deterministic: a subsequent GET may still return a 200.

The bug is caused by an interaction between the fragments' X-Backend-Timestamp and X-Backend-Data-Timestamp in the proxy EC response handler. As backend 200 responses are added to the proxy response buckets, the bucket timestamp, which is initially equal to the X-Backend-Data-Timestamp, is updated by the X-Backend-Timestamp. Without the newer metadata, these two timestamps are the same, but when the metadata is POSTed the X-Backend-Timestamp takes a newer value and so the proxy response bucket timestamp deviates from the X-Backend-Data-Timestamp of the fragments that it is collecting.
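The timestamp drift described above can be illustrated with a toy model. This is only a sketch: ToyBucket, collect and the timestamp values are hypothetical names made up for illustration and are not Swift's actual proxy classes. Only the two header names come from this report.

```python
class ToyBucket:
    """Hypothetical stand-in for a proxy response bucket (not Swift's class)."""

    def __init__(self, data_timestamp):
        # The bucket starts out keyed by the fragment's data timestamp.
        self.timestamp = data_timestamp
        self.frag_indexes = []

    def add_response(self, headers, frag_index):
        # Behaviour described in this report: every added 200 response
        # overwrites the bucket timestamp with X-Backend-Timestamp.
        self.timestamp = headers['X-Backend-Timestamp']
        self.frag_indexes.append(frag_index)


def collect(headers, frag_index):
    bucket = ToyBucket(headers['X-Backend-Data-Timestamp'])
    bucket.add_response(headers, frag_index)
    return bucket


DATA_TS = '1610000002.00000'  # assumed timestamp of the non-durable data
META_TS = '1610000003.00000'  # assumed timestamp of the later POST

# Before the POST both headers carry the data timestamp.
before_post = collect({'X-Backend-Data-Timestamp': DATA_TS,
                       'X-Backend-Timestamp': DATA_TS}, frag_index=3)
# After the POST, X-Backend-Timestamp reports the newer metadata timestamp.
after_post = collect({'X-Backend-Data-Timestamp': DATA_TS,
                      'X-Backend-Timestamp': META_TS}, frag_index=3)

print(before_post.timestamp == DATA_TS)  # True
print(after_post.timestamp == DATA_TS)   # False: the bucket has drifted
```

In this model the bucket still holds fragment 3 of the data written at DATA_TS, but after the POST it is labelled with META_TS.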

This deviation feeds into the X-Backend-Fragment-Preferences that are sent to backend servers as the proxy tries to hunt down the older durable fragments: the frag prefs *should* exclude data frags with the timestamp of the non-durable data, but instead the frag prefs refer only to the metadata timestamp. There are no data frags with the metadata timestamp, so the backend servers continue to return the newer non-durable fragments. I have observed repeated GET requests to the same object server, each returning the same non-durable fragment, which eventually consume the proxy's request allowance and cause a 503 from the proxy to the client.
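A rough sketch of why the wrong timestamp in the frag prefs makes the backend keep returning the same fragment. The header is modelled here as a JSON list of {timestamp, exclude} entries; the helper names, the on-disk layout and the selection rule (newest fragment that is not excluded) are simplifying assumptions for illustration, not the object server's real logic.

```python
import json


def build_frag_prefs(buckets):
    """buckets: list of (bucket_timestamp, [frag indexes already collected])."""
    return json.dumps([{'timestamp': ts, 'exclude': sorted(indexes)}
                       for ts, indexes in buckets])


def toy_backend_pick(frags_on_disk, frag_prefs_header):
    """Return the newest (data_timestamp, frag_index) that is not excluded.

    Hypothetical simplification of the backend's fragment selection.
    """
    excluded = {(pref['timestamp'], index)
                for pref in json.loads(frag_prefs_header)
                for index in pref['exclude']}
    candidates = [frag for frag in frags_on_disk if frag not in excluded]
    return max(candidates) if candidates else None


OLD_DATA_TS = '1610000001.00000'  # older durable data (assumed value)
NEW_DATA_TS = '1610000002.00000'  # newer non-durable data (assumed value)
META_TS = '1610000003.00000'      # metadata POSTed afterwards (assumed value)

# One object server holding fragment index 3 at both data timestamps.
frags_on_disk = [(OLD_DATA_TS, 3), (NEW_DATA_TS, 3)]

# Buggy prefs: the bucket timestamp has drifted to the metadata timestamp,
# so the exclusion matches no data fragment on disk.
buggy_pick = toy_backend_pick(frags_on_disk,
                              build_frag_prefs([(META_TS, [3])]))

# Intended prefs: exclude the non-durable data fragment already seen.
intended_pick = toy_backend_pick(frags_on_disk,
                                 build_frag_prefs([(NEW_DATA_TS, [3])]))

print(buggy_pick)     # the same non-durable fragment again
print(intended_pick)  # the older durable fragment
```

With the buggy prefs every retry in this model yields the fragment the proxy already has, which matches the repeated GETs observed above.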

The bug appears to have been introduced by https://review.opendev.org/c/openstack/swift/+/711342, commit 8f60e0a2607514f05fb873e4a313ab4a93df7601, which enhanced the proxy response bucket class to collect bad responses as well as good responses. In the case of bad responses, buckets collect responses with the same status, and we *do* want the bucket timestamp to be updated with X-Backend-Timestamp. But for good responses the buckets collect responses with the same timestamp, and this must always be the X-Backend-Data-Timestamp.
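The distinction between good and bad responses could be expressed roughly as follows. This is a sketch of the rule as stated above, not the committed patch; next_bucket_timestamp is a hypothetical helper, and treating any 2xx status as "good" is an assumption.

```python
def next_bucket_timestamp(current, status, headers):
    """Return the timestamp a bucket should carry after adding a response.

    Hypothetical helper: 'current' is the bucket's existing timestamp.
    """
    if 200 <= status < 300:
        # Good responses are grouped by data timestamp, so the bucket
        # timestamp stays pinned to X-Backend-Data-Timestamp.
        return current
    # Bad responses are grouped by status; here the bucket timestamp is
    # allowed to follow X-Backend-Timestamp when the header is present.
    return headers.get('X-Backend-Timestamp', current)


DATA_TS = '1610000002.00000'  # assumed data timestamp
META_TS = '1610000003.00000'  # assumed newer metadata timestamp

good = next_bucket_timestamp(DATA_TS, 200, {'X-Backend-Timestamp': META_TS})
bad = next_bucket_timestamp(DATA_TS, 404, {'X-Backend-Timestamp': META_TS})

print(good == DATA_TS)  # True: a 200 does not move the bucket timestamp
print(bad == META_TS)   # True: a bad response does
```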

The proposed fix will include a unit test that reproduces the bug.