EC missing durable can prevent reconstruction
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Fix Released | Critical | Unassigned |
Bug Description
If there are not enough durable [1] fragments to rebuild a missing fragment, the reconstructor will fail to rebuild it:
object-6020: Unable to get enough responses (2/4) to reconstruct 127.0.0.
Luckily, if a non-durable fragment's neighbor has a durable fragment, it will make the non-durable fragment durable when it syncs. In this manner a durable can be "passed" around to all the nodes until it encounters a missing fragment. When a node is missing a fragment, the reconstructor will attempt reconstruction, which might fail (if there are < ndata durable fragments available), and it will not attempt to mark other nodes as durable.
This behavior allows a non-durable fragment to be "trapped" by adjacent missing fragments, which can cause it to *never* be marked durable by its partners. The missing fragments therefore *never* get rebuilt, so the durability is *never* repaired.
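The trapping scenario can be illustrated with a toy model. This is only a sketch of the behavior described above, not Swift's reconstructor code: `Frag`, `run_reconstructor`, the single ring of neighbors, and the 4+2 layout are all hypothetical names and assumptions made for illustration.

```python
from enum import Enum


class Frag(Enum):
    DURABLE = "durable"
    NONDURABLE = "non-durable"
    MISSING = "missing"


def run_reconstructor(ring, ndata, max_passes=10):
    """Toy model: repeat sync passes over a ring until nothing changes.

    Assumption: only a node holding a durable fragment drives a sync
    with its left and right neighbors.
    """
    ring = list(ring)
    n = len(ring)
    for _ in range(max_passes):
        changed = False
        for i in range(n):
            if ring[i] is not Frag.DURABLE:
                continue
            for j in ((i - 1) % n, (i + 1) % n):
                if ring[j] is Frag.NONDURABLE:
                    # A durable neighbor marks the fragment durable on sync.
                    ring[j] = Frag.DURABLE
                    changed = True
                elif ring[j] is Frag.MISSING:
                    # Rebuild succeeds only with >= ndata durable fragments.
                    durable = sum(f is Frag.DURABLE for f in ring)
                    if durable >= ndata:
                        ring[j] = Frag.DURABLE
                        changed = True
        if not changed:
            break
    return ring


D, N, M = Frag.DURABLE, Frag.NONDURABLE, Frag.MISSING
# Node 3 is non-durable and sits between two missing fragments.
trapped = run_reconstructor([D, D, M, N, M, N], ndata=4)
healthy = run_reconstructor([D, D, D, N, M, N], ndata=4)
```

In the `trapped` run the model settles with three durable fragments, one short of `ndata`, even though four fragments exist on disk. In the `healthy` run the durable state reaches every node and the missing fragment is rebuilt.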
There's an attached probetest.
[1] It doesn't matter if you have *enough* fragments - if they're not durable - their object servers won't give them to the reconstructor.
Making the Object Server prefer to serve non-durable frags rather than 404 seems like the most straightforward way to address this terrible, terrible failure:
https://gist.github.com/clayg/2988c8804ab4b704905d203d534a5342
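As a rough illustration of the idea (not the contents of the gist, and not Swift's diskfile code), the selection rule could look like the sketch below. `OnDiskFrag` and `select_frag` are hypothetical names, and the choice to serve a durable fragment first whenever one exists is an assumption of this sketch.

```python
from typing import NamedTuple, Optional, Sequence


class OnDiskFrag(NamedTuple):
    timestamp: float
    durable: bool


def select_frag(frags: Sequence[OnDiskFrag],
                allow_nondurable: bool = True) -> Optional[OnDiskFrag]:
    """Pick the fragment to serve; None means the caller responds 404."""
    durable = [f for f in frags if f.durable]
    if durable:
        # Assumption: a durable fragment always wins when one exists.
        return max(durable, key=lambda f: f.timestamp)
    if allow_nondurable and frags:
        # Proposed behavior: serve the newest non-durable frag, not 404.
        return max(frags, key=lambda f: f.timestamp)
    return None


only_nondurable = [OnDiskFrag(timestamp=100.0, durable=False)]
old_behavior = select_frag(only_nondurable, allow_nondurable=False)
new_behavior = select_frag(only_nondurable)
```

With `allow_nondurable=False` the node holding only a non-durable fragment returns nothing, which is the case that starves the reconstructor. With the default it hands the fragment back.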
I'm not aware of any situation where the proxy or reconstructor would ask for a fragment and not be robust to a potentially uncommitted minority response. If there was an uncommitted *majority* response - it would be served while it's available - but the eventual durability would still be subject to the existence and propagation of the durable file.
As such, this solution should be considered at *least* a partial fix for lp bug #1469094 - additional work to improve availability of a minority overwrite while additional primaries are offline (i.e. servicing a durable majority from "under" a non-durable minority using secondary frags from the same nodes) should fall under lp bug #1484598.
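The robustness to an uncommitted minority response mentioned above can be sketched as grouping fragment responses by timestamp and only decoding a timestamp that has at least ndata fragments. This is a hypothetical helper, not the proxy's actual logic: `decodable_timestamp` is an invented name, and the sketch assumes each response carries a distinct fragment index.

```python
from collections import Counter


def decodable_timestamp(response_timestamps, ndata):
    """Return the newest timestamp with >= ndata fragments, else None.

    Assumption: every response is a distinct fragment index.
    """
    counts = Counter(response_timestamps)
    candidates = [ts for ts, count in counts.items() if count >= ndata]
    return max(candidates) if candidates else None


# Four frags of the older write plus one uncommitted newer frag:
minority = decodable_timestamp([1, 1, 1, 1, 2], ndata=4)
# An uncommitted majority is decodable while it is available:
majority = decodable_timestamp([2, 2, 2, 2, 1], ndata=4)
# Neither timestamp has enough fragments:
neither = decodable_timestamp([1, 1, 2, 2], ndata=4)
```

In this model a lone non-durable fragment at a newer timestamp cannot reach `ndata` on its own, so it is ignored rather than breaking the read.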