regression in 2.23-0ubuntu10: rsync in glusterfs's georeplication fails

Bug #1746995 reported by Mrten on 2018-02-02
This bug affects 5 people
Affects          Status    Importance   Assigned to   Milestone
rsync            Unknown   Unknown
rsync (Ubuntu)             Undecided    Unassigned

Bug Description

Since sometime between Jan 14 and Jan 18, my glusterfs georeplication has been reporting Faulty. Georeplication uses rsync internally, and tracking this down leads to rsync exiting with code 3.

I have no stacktraces myself, but I found this report about the same problem:

https://www.spinics.net/lists/gluster-users/msg33568.html

with traces:

strace rsync :

30743 23:34:47 newfstatat(3, "6737", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
30743 23:34:47 newfstatat(3, "6741", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
30743 23:34:47 getdents(3, /* 0 entries */, 131072) = 0
30743 23:34:47 munmap(0x7fa4feae7000, 135168) = 0
30743 23:34:47 close(3) = 0
30743 23:34:47 write(2, "rsync: getcwd(): No such file or directory (2)", 46) = 46
30743 23:34:47 write(2, "\n", 1) = 1
30743 23:34:47 rt_sigaction(SIGUSR1, {SIG_IGN, [], SA_RESTORER, 0x7fa4fdf404b0}, NULL, 8) = 0
30743 23:34:47 rt_sigaction(SIGUSR2, {SIG_IGN, [], SA_RESTORER, 0x7fa4fdf404b0}, NULL, 8) = 0
30743 23:34:47 write(2, "rsync error: errors selecting input/output files, dirs (code 3) at util.c(1056) [Receiver=3.1.1]", 96) = 96
30743 23:34:47 write(2, "\n", 1) = 1
30743 23:34:47 exit_group(3) = ?
30743 23:34:47 +++ exited with 3 +++
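
For anyone matching exit statuses against the trace above, here is a minimal sketch that maps a few rsync exit codes to the descriptions listed in the rsync man page. The table is deliberately partial, and the names RSYNC_EXIT_CODES and describe_rsync_exit are hypothetical, chosen only for this illustration.

```python
# Partial table of rsync exit codes, as documented in the rsync man page.
# The dictionary and function names are illustrative, not part of rsync.
RSYNC_EXIT_CODES = {
    0: "Success",
    1: "Syntax or usage error",
    2: "Protocol incompatibility",
    3: "Errors selecting input/output files, dirs",
    23: "Partial transfer due to error",
    24: "Partial transfer due to vanished source files",
}


def describe_rsync_exit(code):
    """Return a human-readable description for an rsync exit code."""
    return RSYNC_EXIT_CODES.get(code, f"Unlisted exit code {code}")


if __name__ == "__main__":
    # Code 3 is the one seen in the georeplication failure.
    print(describe_rsync_exit(3))
```

The description for code 3 matches the "errors selecting input/output files, dirs (code 3)" message that rsync writes to stderr in the trace.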

The Changelog of glibc mentions that getcwd has been changed:

  * SECURITY UPDATE: Buffer underflow in realpath()
    - debian/patches/any/cvs-make-getcwd-fail-if-path-is-no-absolute.diff:
      Make getcwd(3) fail if it cannot obtain an absolute path
    - CVE-2018-1000001

Downgrading glibc to 2.23-0ubuntu3 does indeed fix my georeplication problem. 0ubuntu3 is the latest version available in the repositories other than 0ubuntu10.
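
To illustrate the rule the changelog entry describes, here is a rough Python model of it: a result from the kernel is accepted only if it is an absolute path, otherwise the call fails with ENOENT. This is a simplified sketch and not the glibc code. It leaves out any fallback glibc may attempt before giving up, and the function name model_patched_getcwd is hypothetical. The string "(unreachable)/" is used as an example of a non-absolute result, which the Linux kernel can return when the current directory is not reachable from the process root.

```python
import errno
import os


def model_patched_getcwd(kernel_path):
    """Simplified model: accept only absolute paths, else fail with ENOENT.

    kernel_path stands in for whatever the getcwd system call returned.
    """
    if kernel_path.startswith("/"):
        return kernel_path
    # Assumption for this sketch: a non-absolute result is reported as ENOENT,
    # which matches the "No such file or directory (2)" text rsync prints.
    raise OSError(errno.ENOENT, os.strerror(errno.ENOENT))


if __name__ == "__main__":
    print(model_patched_getcwd("/srv/brick"))
    try:
        model_patched_getcwd("(unreachable)/")
    except OSError as exc:
        print("failed with errno", exc.errno)
```

Under this model, a program that treats a getcwd failure as fatal, as rsync does here, would stop working as soon as its working directory can no longer be expressed as an absolute path.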

Florian Weimer (fweimer) wrote :

Do you have strace output for the getcwd system call before the failure? That would be helpful.

Mrten (bugzilla-ii) wrote :

I do now:

11051 geteuid() = 0
11051 getegid() = 0
11051 umask(0) = 022
11051 umask(022) = 0
11051 brk(NULL) = 0x55f0d0a14000
11051 brk(0x55f0d0a35000) = 0x55f0d0a35000
11051 open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
11051 fstat(3, {st_mode=S_IFREG|0644, st_size=2276256, ...}) = 0
11051 mmap(NULL, 2276256, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f43b7a35000
11051 close(3) = 0
11051 open("/usr/etc/popt", O_RDONLY) = -1 ENOENT (No such file or directory)
11051 open("/etc/popt", O_RDONLY) = -1 ENOENT (No such file or directory)
11051 stat("/etc/popt.d", 0x7ffcb94d9230) = -1 ENOENT (No such file or directory)
11051 rt_sigaction(SIGINT, {0x55f0cffb8ac0, [], SA_RESTORER|SA_NOCLDSTOP, 0x7f43b7c964b0}, NULL, 8) = 0
11051 rt_sigaction(SIGHUP, {0x55f0cffb8ac0, [], SA_RESTORER|SA_NOCLDSTOP, 0x7f43b7c964b0}, NULL, 8) = 0
11051 rt_sigaction(SIGTERM, {0x55f0cffb8ac0, [], SA_RESTORER|SA_NOCLDSTOP, 0x7f43b7c964b0}, NULL, 8) = 0
11051 rt_sigprocmask(SIG_UNBLOCK, [HUP INT USR1 USR2 TERM CHLD], NULL, 8) = 0
11051 rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER|SA_NOCLDSTOP, 0x7f43b7c964b0}, NULL, 8) = 0
11051 rt_sigaction(SIGXFSZ, {SIG_IGN, [], SA_RESTORER|SA_NOCLDSTOP, 0x7f43b7c964b0}, NULL, 8) = 0
11051 getcwd("(unreachable)/", 4095) = 15
11051 lstat(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
11051 lstat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
11051 openat(AT_FDCWD, "..", O_RDONLY|O_CLOEXEC) = 3
11051 fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
11051 fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
11051 fcntl(3, F_GETFL) = 0x8000 (flags O_RDONLY|O_LARGEFILE)
11051 fcntl(3, F_SETFD, FD_CLOEXEC) = 0
11051 mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f43b883c000
11051 getdents(3, /* 11 entries */, 131072) = 376
11051 getdents(3, /* 0 entries */, 131072) = 0
11051 lseek(3, 0, SEEK_SET) = 0
11051 getdents(3, /* 11 entries */, 131072) = 376
11051 newfstatat(3, ".trashcan", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
11051 newfstatat(3, "acme", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
11051 newfstatat(3, "web", {st_mode=S_IFDIR|0777, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
11051 newfstatat(3, "XXX", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
11051 newfstatat(3, "glbackup", {st_mode=S_IFDIR, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
11051 getdents(3, /* 0 entries */, 131072) = 0
11051 munmap(0x7f43b883c000, 135168) = 0
11051 close(3) = 0
11051 write(2, "rsync: getcwd(): No such file or directory (2)", 46) = 46
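
In case it helps others check their own traces, here is a small sketch that scans strace output for getcwd system calls that succeeded but returned a path not starting with "/", like the "(unreachable)/" result above. The regular expression only covers the line format shown in this trace, and the names GETCWD_RE and non_absolute_getcwd_calls are hypothetical.

```python
import re

# Matches strace lines such as: 11051 getcwd("(unreachable)/", 4095) = 15
# Failed calls (printed with a raw buffer address and -1) do not match.
GETCWD_RE = re.compile(
    r'getcwd\("(?P<path>[^"]*)",\s*\d+\)\s*=\s*(?P<ret>-?\d+)'
)


def non_absolute_getcwd_calls(trace_lines):
    """Return the paths from successful getcwd calls that are not absolute."""
    hits = []
    for line in trace_lines:
        match = GETCWD_RE.search(line)
        if not match:
            continue
        succeeded = int(match.group("ret")) > 0
        if succeeded and not match.group("path").startswith("/"):
            hits.append(match.group("path"))
    return hits


if __name__ == "__main__":
    sample = [
        '11051 getcwd("(unreachable)/", 4095) = 15',
        '11051 lstat(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0',
    ]
    print(non_absolute_getcwd_calls(sample))
```

It can be fed the lines of a file written with strace -o; a non-empty result points at the same condition seen here.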

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in glibc (Ubuntu):
status: New → Confirmed

Steve Beattie (sbeattie) wrote :

Hi,

Florian has submitted a proposed patch to rsync upstream for review: https://lists.samba.org/archive/rsync/2018-February/031478.html . I've incorporated Florian's patch into test packages for Ubuntu 16.04 and 17.10 (other releases will follow later) in a test ppa at https://launchpad.net/~sbeattie/+archive/ubuntu/lp1746995 . It would be appreciated if people seeing this issue with glusterfs' geo-replication could confirm whether the patched version of rsync in that ppa addresses the issue for them.

Thanks, and my apologies for your inconvenience.

Mrten (bugzilla-ii) wrote :

Hi,

After a few kicks and reboots I'll say that it does not work. The strace looks quite different, though: rsync now trips over a message that's in Florian's patch.

14151 lstat(".gfid/f55c143a-83a5-4ca0-a833-95ff8e30ff32", {st_mode=S_IFREG|0666, st_size=3257, ...}) = 0
14151 stat(".gfid", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
14151 mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7facbe951000
14151 getxattr(".gfid", "system.posix_acl_access", 0x7ffc7f0b63b0, 132) = -1 EOPNOTSUPP (Operation not supported)
14151 getxattr(".gfid", "system.posix_acl_default", 0x7ffc7f0b63b0, 132) = -1 EOPNOTSUPP (Operation not supported)
14151 llistxattr(".gfid", 0x55954b86dc10, 1024) = -1 EOPNOTSUPP (Operation not supported)
14151 mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7facbe910000
14151 write(2, "rsync: unknown current directory for \".gfid\": No such file or directory (2)", 75) = 75
14151 write(2, "\n", 1) = 1
14151 rt_sigaction(SIGUSR1, {SIG_IGN, [], SA_RESTORER, 0x7facbddcb4b0}, NULL, 8) = 0
14151 rt_sigaction(SIGUSR2, {SIG_IGN, [], SA_RESTORER, 0x7facbddcb4b0}, NULL, 8) = 0
14151 wait4(14152, 0x7ffc7f0b85e4, WNOHANG, NULL) = 0
14151 getpid() = 14151
14151 kill(14152, SIGUSR1) = 0
14151 write(2, "rsync error: errors selecting input/output files, dirs (code 3) at util.c(1057) [sender=3.1.1]", 94) = 94
14151 write(2, "\n", 1) = 1
14151 select(6, [5], [4], [5], {30, 0}) = 1 (out [4], left {29, 999998})
14151 write(4, "\x04\x00\x00\x5d\x03\x00\x00\x00", 8) = 8
14151 select(6, [5], [4], [5], {30, 0}) = 1 (out [4], left {29, 999998})
14151 write(4, "\x20\x00\x00\x07\x05\x21\x05\x2e\x67\x66\x69\x64\x00\x00\x10\x5a\xf2\x16\x7c\xf0\x1b\xdc\xe1\x2e\xed\x41\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", 36) = 36
14151 select(6, [5], [], [5], {30, 0} <unfinished ...>
14152 <... poll resumed> ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
14152 --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=14151, si_uid=0} ---
14151 <... select resumed> ) = 1 (in [5], left {29, 999641})
14151 read(5, "", 32768) = 0
14151 rt_sigaction(SIGUSR1, {SIG_IGN, [], SA_RESTORER, 0x7facbddcb4b0}, NULL, 8) = 0
14151 rt_sigaction(SIGUSR2, {SIG_IGN, [], SA_RESTORER, 0x7facbddcb4b0}, NULL, 8) = 0
14151 exit_group(3) = ?
14151 +++ exited with 3 +++
14152 +++ killed by SIGUSR1 +++

full strace here:

https://pastebin.ubuntu.com/26540269/

Adam Conrad (adconrad) on 2018-12-08
affects: glibc → rsync
affects: glibc (Ubuntu) → rsync (Ubuntu)