EC non-durable fragment won't be deleted by reconstructor.

Bug #1778002 reported by Hugo Kou
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Object Storage (swift) | Fix Released | High | Unassigned | |

Bug Description

The handoff partition `24228` stays on the disk. The reconstructor doesn't remove it even though the primary location already has this fragment. Perhaps it's a garbage fragment left over from an old Swift version?

```
[root@swift-node15 objects-1]# pwd
/srv/node/d329/objects-1

[root@swift-node15 objects-1]# ls 24228/3d3/5ea48f8dbc06abf1369634fee53d13d3/1528179813.50318#0.data
24228/3d3/5ea48f8dbc06abf1369634fee53d13d3/1528179813.50318#0.data

[root@swift-node15 objects-1]# /opt/ss/bin/python /opt/ss/bin/swift-object-reconstructor ~/ss-agent/2.conf -o -v -d d329 -p 24228
object-reconstructor: Starting 37988
object-reconstructor: Spawned worker 38189 with {'override_devices': ['d329'], 'override_partitions': [24228]}
object-reconstructor: Running object reconstructor in script mode.
object-reconstructor: Run listdir on /srv/node/d329/objects-1/24228
object-reconstructor: 1/15954 (0.01%) partitions reconstructed in 0.17s (5.80/sec, 45m remaining)
object-reconstructor: Object reconstruction complete (once). (0.00 minutes)
object-reconstructor: Forked worker 38189 finished
object-reconstructor: Worker 38189 exited
object-reconstructor: Finished 37988
object-reconstructor: Exited 37988
```

```
Swift Package Version: 2.16.0.2-1.el7
Operating System Version: Linux-3.10.0-514.16.1.el7.x86_64-x86_64-with-centos-7.3.1611-Core
```
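For context on the filenames above: EC fragment `.data` files encode the fragment index after the timestamp, and durable fragments carry an extra `#d` marker (e.g. `1528179813.50318#0#d.data`). The stuck file here has no `#d`, i.e. it is a non-durable fragment. A minimal illustrative parser (not Swift's actual diskfile code):

```python
# Illustrative only -- not swift.obj.diskfile. Splits an EC .data filename
# into (timestamp, frag_index, durable).
#   <timestamp>#<frag_index>.data     -> fragment present but not marked durable
#   <timestamp>#<frag_index>#d.data   -> fragment marked durable
def parse_ec_data_filename(name):
    stem = name[:-len('.data')]
    parts = stem.split('#')
    timestamp = parts[0]
    frag_index = int(parts[1])
    durable = len(parts) > 2 and parts[2] == 'd'
    return timestamp, frag_index, durable

print(parse_ec_data_filename('1528179813.50318#0.data'))    # ('1528179813.50318', 0, False)
print(parse_ec_data_filename('1528179813.50318#0#d.data'))  # ('1528179813.50318', 0, True)
```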

Revision history for this message
Tim Burke (1-tim-z) wrote :

Some maybe-related bugs:

- https://bugs.launchpad.net/swift/+bug/1706321
- https://bugs.launchpad.net/swift/+bug/1554378

So is there *any* HTTP traffic with the other nodes? Is the fragment durable on the other nodes? What's the reclaim_age like?
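For anyone checking the reclaim_age question: a non-durable fragment only becomes a candidate for reclamation once its timestamp has aged past reclaim_age (Swift's default is 604800 seconds, one week). A rough back-of-the-envelope check, not Swift code:

```python
import time

# Rough check, not Swift code: is this fragment's timestamp older than
# reclaim_age?  reclaim_age defaults to 604800 seconds (one week).
def older_than_reclaim_age(data_filename, reclaim_age=604800, now=None):
    now = time.time() if now is None else now
    timestamp = float(data_filename.split('#')[0])
    return (now - timestamp) > reclaim_age

print(older_than_reclaim_age('1528179813.50318#0.data'))
```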

Revision history for this message
clayg (clay-gerrard) wrote :

Same symptoms as described in lp bug #1816501 (or the earlier incomplete fix for lp bug #1706321)

I think the fix for all of this is https://review.openstack.org/#/c/637662/

Revision history for this message
clayg (clay-gerrard) wrote :

I was wrong, the "un-durable data frag can't be reverted" bug is NOT fixed:

```
vagrant@saio:~$ find /srv/node*/sdb*/object* -name \*.data
/srv/node1/sdb1/objects-1/941/cf9/eb4596345889fc8709c6826a79305cf9/1551809077.79093#5.data
/srv/node1/sdb5/objects-1/941/cf9/eb4596345889fc8709c6826a79305cf9/1551809077.79093#1#d.data
/srv/node2/sdb2/objects-1/941/cf9/eb4596345889fc8709c6826a79305cf9/1551809077.79093#4#d.data
/srv/node2/sdb6/objects-1/941/cf9/eb4596345889fc8709c6826a79305cf9/1551809077.79093#0#d.data
/srv/node3/sdb7/objects-1/941/cf9/eb4596345889fc8709c6826a79305cf9/1551809077.79093#3#d.data
/srv/node4/sdb4/objects-1/941/cf9/eb4596345889fc8709c6826a79305cf9/1551809077.79093#5#d.data
/srv/node4/sdb8/objects-1/941/cf9/eb4596345889fc8709c6826a79305cf9/1551809077.79093#2#d.data
```

^ no matter how many times I run the reconstructor over sdb1 it won't ssync non-durable fragments or consider itself "in-sync" with the remote.

I'm not entirely sure where the "bug" is. I know we had originally designed the object-server to be "pessimistic" and had to move it towards a more "optimistic" design when dealing with non-durable fragments. I think yield_hashes needs similar treatment...
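To make the "pessimistic vs optimistic" distinction concrete, here's a hypothetical sketch (the names and structure are illustrative, not the real `yield_hashes` in swift.obj.diskfile): a pessimistic generator only offers durable fragments for sync, so a handoff that holds nothing but a non-durable frag has nothing to ssync and nothing to revert; an optimistic one offers it anyway and lets the receiving primary decide.

```python
# Hypothetical sketch of the behaviour discussed above; not the actual
# swift.obj.diskfile.yield_hashes implementation.
def yield_hashes(objects, frag_index, optimistic=False):
    """objects: {object_hash: [(timestamp, frag_index, durable), ...]}"""
    for object_hash, frags in objects.items():
        for timestamp, fi, durable in frags:
            if fi != frag_index:
                continue
            if durable or optimistic:
                # Pessimistic (optimistic=False): a handoff holding only
                # e.g. 1551809077.79093#5.data yields nothing, so ssync has
                # nothing to push and the frag is never reverted.
                yield object_hash, timestamp

# e.g. the sdb1 handoff above:
handoff = {'eb4596345889fc8709c6826a79305cf9': [('1551809077.79093', 5, False)]}
print(list(yield_hashes(handoff, frag_index=5)))                   # []
print(list(yield_hashes(handoff, frag_index=5, optimistic=True)))  # [('eb45...', '1551809077.79093')]
```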

clayg (clay-gerrard)
Changed in swift:
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
clayg (clay-gerrard) wrote :

I just hit this again, this handoff partition is almost empty:

```
[root@ss0123 ~]# find /srv/node/d29/objects-1/55253/
/srv/node/d29/objects-1/55253/
/srv/node/d29/objects-1/55253/.lock
/srv/node/d29/objects-1/55253/hashes.invalid
/srv/node/d29/objects-1/55253/.lock-replication
/srv/node/d29/objects-1/55253/hashes.pkl
/srv/node/d29/objects-1/55253/fe9
/srv/node/d29/objects-1/55253/fe9/d7d5775328949fa8aac50eebd300cfe9
/srv/node/d29/objects-1/55253/fe9/d7d5775328949fa8aac50eebd300cfe9/1593739848.20150#7.data
```

The frag-index 7 node for p 55253 already has the data (and it's already durable):

```
[root@ss0127 ~]# find /srv/node/d90/objects-1/55253/fe9/
/srv/node/d90/objects-1/55253/fe9/
/srv/node/d90/objects-1/55253/fe9/d7d50221a8ba28234ce497155e8fbfe9
/srv/node/d90/objects-1/55253/fe9/d7d50221a8ba28234ce497155e8fbfe9/1591163929.63116#7#d.data
/srv/node/d90/objects-1/55253/fe9/d7d5775328949fa8aac50eebd300cfe9
/srv/node/d90/objects-1/55253/fe9/d7d5775328949fa8aac50eebd300cfe9/1593739848.20150#7#d.data
/srv/node/d90/objects-1/55253/fe9/d7d55a200ac390b0ddf74d3901622fe9
/srv/node/d90/objects-1/55253/fe9/d7d55a200ac390b0ddf74d3901622fe9/1594258195.26649#7#d.data
/srv/node/d90/objects-1/55253/fe9/d7d58bd330a6efe3cdd52458cd7fcfe9
/srv/node/d90/objects-1/55253/fe9/d7d58bd330a6efe3cdd52458cd7fcfe9/1591693555.81645#7#d.data
```

But no matter how many times I run a handoffs_only replicator over this, yield_hashes keeps coming back with nothing to sync, so the "success" comes back with nothing in in_sync_objs to clean up.
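Again as a hypothetical sketch (not the real reconstructor/ssync code) of why the handoff never empties: the revert job only deletes what the sync step reports back as in-sync, so if yield_hashes never offers the non-durable frag, in_sync_objs stays empty while the job still "succeeds".

```python
# Hypothetical revert-job flow, illustrating the failure mode described above.
# sync_to_primary and delete_local stand in for ssync and the local cleanup.
def revert_handoff_partition(local_objects, sync_to_primary, delete_local):
    success, in_sync_objs = sync_to_primary(local_objects)
    if not success:
        return  # retry on a later pass
    for object_hash in in_sync_objs:
        delete_local(object_hash)
    # If yield_hashes skipped the non-durable frag, in_sync_objs is empty,
    # the job still reports success, and the handoff partition keeps the
    # frag (plus its hashes.pkl, .lock, etc.) indefinitely.
```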

Changed in swift:
importance: Medium → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift 2.27.0

This issue was fixed in the openstack/swift 2.27.0 release.

Alistair Coles (alistair-coles)
Changed in swift:
status: Confirmed → Fix Released