container sync gets stuck after deleting all objects

Bug #1413619 reported by dba
This bug affects 2 people
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Undecided
Assigned to: Eran Rom
Milestone: (none listed)

Bug Description

I configured container sync from container1 to container2.

Then I uploaded several files to container1.
After container sync ran, all files in container1 were copied to container2.
No problem so far.

Then I deleted all files in container1.
After container sync ran, all files in container2 were deleted.
This was also fine.

But since then, whenever container sync runs, the following error is printed in container_sync.log.

Jan 22 22:57:26 dev-swift02 container-sync: ERROR Syncing /data1/sdc/containers/46110/8b8/b41eda9ca916f6262b8986fae95648b8/b41eda9ca916f6262b8986fae95648b8.db {'name': 'file1', 'deleted': 1, 'created_at': '1421928707.61045', 'storage_policy_index': 0, 'etag': 'noetag', 'content_type': 'application/deleted', 'ROWID': 729, 'size': 0}: #012Traceback (most recent call last):#012 File "/usr/lib/python2.6/site-packages/swift/container/sync.py", line 362, in container_sync_row#012 logger=self.logger)#012 File "/usr/lib/python2.6/site-packages/swift/common/internal_client.py", line 839, in delete_object#012 client.retry_request('DELETE', **kwargs)#012 File "/usr/lib/python2.6/site-packages/swift/common/internal_client.py", line 805, in retry_request#012 return self.base_request(method, **kwargs)#012 File "/usr/lib/python2.6/site-packages/swift/common/internal_client.py", line 765, in base_request#012 conn = urllib2.urlopen(req, timeout=timeout)#012 File "/usr/lib64/python2.6/urllib2.py", line 126, in urlopen#012 return _opener.open(url, data, timeout)#012 File "/usr/lib64/python2.6/urllib2.py", line 397, in open#012 response = meth(req, response)#012 File "/usr/lib64/python2.6/urllib2.py", line 510, in http_response#012 'http', request, response, code, msg, hdrs)#012 File "/usr/lib64/python2.6/urllib2.py", line 435, in error#012 return self._call_chain(*args)#012 File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain#012 result = func(*args)#012 File "/usr/lib64/python2.6/urllib2.py", line 518, in http_error_default#012 raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)#012HTTPError: HTTP Error 404: Not Found
Jan 22 22:57:43 dev-swift02 container-sync: ERROR Syncing /data1/sdc/containers/46110/8b8/b41eda9ca916f6262b8986fae95648b8/b41eda9ca916f6262b8986fae95648b8.db {'name': 'file2', 'deleted': 1, 'created_at': '1421928665.53463', 'storage_policy_index': 0, 'etag': 'noetag', 'content_type': 'application/deleted', 'ROWID': 730, 'size': 0}: #012Traceback (most recent call last):#012 File "/usr/lib/python2.6/site-packages/swift/container/sync.py", line 362, in container_sync_row#012 logger=self.logger)#012 File "/usr/lib/python2.6/site-packages/swift/common/internal_client.py", line 839, in delete_object#012 client.retry_request('DELETE', **kwargs)#012 File "/usr/lib/python2.6/site-packages/swift/common/internal_client.py", line 805, in retry_request#012 return self.base_request(method, **kwargs)#012 File "/usr/lib/python2.6/site-packages/swift/common/internal_client.py", line 765, in base_request#012 conn = urllib2.urlopen(req, timeout=timeout)#012 File "/usr/lib64/python2.6/urllib2.py", line 126, in urlopen#012 return _opener.open(url, data, timeout)#012 File "/usr/lib64/python2.6/urllib2.py", line 397, in open#012 response = meth(req, response)#012 File "/usr/lib64/python2.6/urllib2.py", line 510, in http_response#012 'http', request, response, code, msg, hdrs)#012 File "/usr/lib64/python2.6/urllib2.py", line 435, in error#012 return self._call_chain(*args)#012 File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain#012 result = func(*args)#012 File "/usr/lib64/python2.6/urllib2.py", line 518, in http_error_default#012 raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)#012HTTPError: HTTP Error 404: Not Found

...
...
...

The error appears to be caused by object rows that still exist in the container metadata DB with the deleted flag set to 1, even though the files themselves are already deleted.
I confirmed this by inspecting the container's metadata DB with sqlite:
sqlite> select * from object;
ROWID|name|created_at|size|content_type|etag|deleted|storage_policy_index
727|file1|1421928707.61045|0|application/deleted|noetag|1|0
728|file2|1421928665.53463|0|application/deleted|noetag|1|0
...
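
The same check can be scripted. This is a minimal sketch, assuming Python's standard sqlite3 module and a simplified in-memory stand-in for the `object` table that has only the columns shown in the output above. A real container DB has more tables and lives at a path like the one in the log; the helper name `list_tombstones` is hypothetical.

```python
import sqlite3


def list_tombstones(conn):
    """Return (ROWID, name, created_at) for rows flagged as deleted."""
    cur = conn.execute(
        "SELECT ROWID, name, created_at FROM object "
        "WHERE deleted = 1 ORDER BY ROWID"
    )
    return cur.fetchall()


# In-memory stand-in for the container DB (assumption: columns as listed above).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object ("
    "ROWID INTEGER PRIMARY KEY, name TEXT, created_at TEXT, size INTEGER, "
    "content_type TEXT, etag TEXT, deleted INTEGER, "
    "storage_policy_index INTEGER)"
)
conn.executemany(
    "INSERT INTO object VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (727, "file1", "1421928707.61045", 0, "application/deleted", "noetag", 1, 0),
        (728, "file2", "1421928665.53463", 0, "application/deleted", "noetag", 1, 0),
    ],
)

tombstones = list_tombstones(conn)
for rowid, name, created_at in tombstones:
    print(rowid, name, created_at)
```

Every row returned here is a delete marker that container sync will still try to propagate as a DELETE.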

When container sync runs the second time, the sync_point2 sweep attempts to validate the work done by the other replicas of the container database for the other objects it skipped during the sync_point1 run.

The code needs to handle the 404 case, but there was a regression when swiftclient got swapped out for SimpleClient.
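
As an illustration of what handling the 404 case could look like, here is a minimal sketch. It is not the actual Swift patch: `delete_remote_object` and `fake_delete` are hypothetical names, and the sketch uses Python 3's `urllib.error.HTTPError` where the traceback above shows the Python 2 `urllib2` equivalent. The idea is that a 404 on a DELETE means the remote object is already gone, so the row can be treated as synced instead of failing on every pass.

```python
from urllib.error import HTTPError


def delete_remote_object(do_delete, name):
    """Issue a DELETE via do_delete(name), tolerating 404.

    Returns True if the DELETE succeeded, False if the object was
    already absent (404). Any other HTTP error is re-raised.
    """
    try:
        do_delete(name)
    except HTTPError as err:
        if err.code != 404:
            raise
        return False  # already deleted remotely; nothing left to sync
    return True


# Hypothetical stand-in for the real client call: always answers 404.
def fake_delete(name):
    raise HTTPError("http://example.invalid/" + name, 404, "Not Found", None, None)


already_gone = not delete_remote_object(fake_delete, "file1")
print("treated as synced:", already_gone)
```

Either return value lets the caller advance past the row; only non-404 errors should leave it for a retry.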

dba (lee203)
affects: horizon → swift
Gil Vernik (gilv) wrote :

Container sync should also sync objects that were deleted; therefore objects that are marked as deleted in the DB are 'synced' with a DELETE. This is the correct behavior.
Why do you see those errors? It looks like a problem with the type of exception that was thrown. I will check this.

Changed in swift:
assignee: nobody → Gil Vernik (gilv)
dba (lee203) wrote :

In the container metadata DB, object rows with the deleted flag set to 1 persist for reclaim_age = 604800.
Therefore, for 604800 seconds, whenever container sync runs, the already-deleted objects are synced again and the delete sync is repeated continuously.

A delete sync must run only once. But with the current behavior, the delete sync runs repeatedly for the whole reclaim_age period.
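
To put the number in perspective, here is a small arithmetic sketch. The reclaim_age value comes from this report; the sync interval of 300 seconds is only an assumed example value, and `repeated_attempts` is a hypothetical helper.

```python
RECLAIM_AGE = 604800  # seconds, as quoted in this report
SECONDS_PER_DAY = 86400


def repeated_attempts(reclaim_age, sync_interval):
    """Rough upper bound on sync passes that revisit one deleted row."""
    return reclaim_age // sync_interval


days = RECLAIM_AGE / SECONDS_PER_DAY
# Assumption: one container-sync pass every 300 seconds.
attempts = repeated_attempts(RECLAIM_AGE, 300)
print(days, "days;", attempts, "possible repeated DELETE attempts per object")
```

So each deleted object could be retried on every pass for a full week before its row is reclaimed.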

clayg (clay-gerrard)
summary: - when container sync run, already deleted object is synced
+ container sync gets stuck after deleting all objects
description: updated
clayg (clay-gerrard) wrote :

I codified the steps to reproduce from https://bugs.launchpad.net/swift/+bug/1453993 into a probe test and have an idea how we could fix it - attached.

Marc Heckmann (marc-w-heckmann) wrote :

I've been testing the attached patch. It seems to work fine, at least in my initial testing.

Eran Rom (eranr) wrote :

I believe this is a duplicate of: https://bugs.launchpad.net/swift/+bug/1419901

Changed in swift:
status: New → Confirmed
assignee: Gil Vernik (gilv) → Eran Rom (eranr)
Tim Burke (1-tim-z) wrote :
Changed in swift:
status: Confirmed → Fix Released