Swift replication fails due to missing rsync in swift-account/container/object images

Bug #1751168 reported by Lei Zhang on 2018-02-23

Affects: kolla
Importance: Undecided
Assigned to: Unassigned

Bug Description

There are two issues regarding rsync in the kolla swift-* images.

First, rsync is missing from the swift-account/container/object* images. rsync is the default sync method: the Swift replicator (replicator.py) invokes it to push local replicas to remote drives, so without the binary, replication fails.
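
The traceback further down ends in "OSError: [Errno 2] No such file or directory", which is what Python's subprocess module raises when the executable itself cannot be found. The following is a minimal Python 3 sketch of that failure mode and of a pre-flight check for it. It is not Swift's code; tool_available and run_tool are hypothetical helper names used only for illustration.

```python
import errno
import shutil
import subprocess


def tool_available(name):
    """Return True if `name` resolves to an executable on PATH."""
    return shutil.which(name) is not None


def run_tool(args):
    """Run a command and return its exit code, or None if the binary is missing.

    Hypothetical helper for illustration only; it is not part of Swift.
    """
    try:
        return subprocess.call(
            args, stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT
        )
    except OSError as exc:
        # ENOENT (errno 2) means the executable was not found, which is
        # the same error the replicator traceback reports.
        if exc.errno == errno.ENOENT:
            return None
        raise
```

Calling tool_available("rsync") inside one of the affected containers would be expected to return False, and run_tool(["rsync", "--version"]) would return None instead of raising.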

Second, the swift-rsyncd image is built with 'swift' as the default user, which does not have permission for rsyncd to chroot, so rsync clients see errors like "@ERROR chroot failed". In addition, the swift replicator seems to use the default rsync port (873) for replication, while in kolla-ansible the default value of swift_rsync_port is 10873, so swift-rsyncd listens on a different port. Although the default swift_rsync_port can be overridden, swift-rsyncd still needs to run as root (or with sudo privileges) to listen on port 873.
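
To confirm the port mismatch from a replicator host, something like the Python 3 sketch below could probe both ports. The helper names needs_privileged_bind and port_open are hypothetical, and the 1024 threshold reflects the usual Linux default for privileged ports rather than anything specific to kolla.

```python
import socket


def needs_privileged_bind(port):
    """Ports below 1024 normally require root (or CAP_NET_BIND_SERVICE on Linux)."""
    return 0 < port < 1024


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False
```

For example, calling port_open(host, 873) and port_open(host, 10873) against a storage node shows which port swift-rsyncd is actually listening on, and needs_privileged_bind(873) returning True is the reason an unprivileged 'swift' user cannot bind the default port.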

General steps to reproduce:
1, build swift ring files with two regions (r1 and r2)
2, customize /etc/kolla/config/swift/proxy-server.conf as follows to enable write_affinity:
[app:proxy-server]
sorting_method = affinity
read_affinity = r1=100
write_affinity = r1
3, kolla deploy, post-deploy and source admin-openrc.sh
4, create an object, e.g. swift upload test abc, then run swift stat test abc to find the account AUTH_xxxxxx.
5, enter one swift-proxy container and run swift-get-nodes, for example:
docker exec swift_proxy_server swift-get-nodes /etc/swift/object.ring.gz AUTH_<project_id> test abc
This shows where the replicas will be written.
6, the replicas will be present in region1, but not in region2
7, check the replicator logs with docker logs swift_object_replicator; you will see errors like this:
swift-object-replicator: Error syncing with node: {'index': 0, u'replication_port': 6000, u'weight': 1.0, u'zone': 4, u'ip': u'10.66.25.13', u'region': 2, u'id': 154, u'replication_ip': u'10.66.25.13', u'meta': u'', u'device': u'd16', u'port': 6000}:
Traceback (most recent call last):
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/swift/obj/replicator.py", line 467, in update
    success, _junk = self.sync(node, job, suffixes)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/swift/obj/replicator.py", line 154, in sync
    return self.sync_method(node, job, suffixes, *args, **kwargs)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/swift/obj/replicator.py", line 245, in rsync
    return self._rsync(args) == 0, {}
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/swift/obj/replicator.py", line 178, in _rsync
    stderr=subprocess.STDOUT)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/eventlet/green/subprocess.py", line 55, in __init__
    subprocess_orig.Popen.__init__(self, args, 0, *argss, **kwds)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1343, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
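
When Swift logs pass through rsyslog, control characters are by default escaped as "#" followed by three octal digits, so a multi-line Python traceback arrives on a single line with #012 in place of each newline. A small Python 3 sketch for turning such a line back into a readable traceback follows; unescape_syslog is a hypothetical helper name, and the sketch assumes the text contains no literal "#" followed by three octal digits that should be preserved.

```python
import re

# Matches rsyslog-style escapes such as #012 (newline) or #011 (tab).
_OCTAL_ESCAPE = re.compile(r"#([0-7]{3})")


def unescape_syslog(line):
    """Replace each #ooo octal escape with the character it encodes."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), line)
```

Passing a raw replicator log line through unescape_syslog and printing the result restores the original line breaks of the traceback.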

My test environment used ubuntu-source-* images with the Pike tag, but I also see this in Ocata and Queens.

So the proposals here are:
1, include the rsync package when building the swift-account/container/object* images
2, change USER to root in the Dockerfile.j2 of swift-rsyncd
3, change the default swift_rsync_port in kolla-ansible/ansible/group_vars/all.yml to 873.

Regards,

Lei Zhang (zhangleiop) wrote :

It's been a long time with no response. Could someone have a look?

Mark Goddard (mgoddard) wrote :

I have seen swift syncing working in Rocky, could you retest?

Changed in kolla:
status: New → Incomplete

Lei Zhang (zhangleiop) wrote :

Hi Mark, thanks for replying. The bug was created a year ago against the Pike release. If it has already been fixed in the latest release, feel free to close this bug.

Mark Goddard (mgoddard) wrote :

I can't see any changes in the swift image code that would fix this so I'll leave it open.
