Joining node fails after a successful SST
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | New | Undecided | Unassigned |
Bug Description
I observe a strange situation when I try to add a second node to the Percona Galera cluster.
After a long file transfer using xtrabackup-v2 (3.5 TB of data and about 1,000,000 files), which passes all stages successfully judging by the log, the second node fails with a completely uninformative error:
2017-09-02 01:31:14 17707 [Note] InnoDB: Using atomics to ref count buffer pool pages
2017-09-02 01:31:14 17707 [Note] InnoDB: The InnoDB memory heap is disabled
2017-09-02 01:31:14 17707 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-09-02 01:31:14 17707 [Note] InnoDB: Memory barrier is not used
2017-09-02 01:31:14 17707 [Note] InnoDB: Compressed tables use zlib 1.2.7
2017-09-02 01:31:14 17707 [Note] InnoDB: Using Linux native AIO
2017-09-02 01:31:14 17707 [Note] InnoDB: Using CPU crc32 instructions
2017-09-02 01:31:14 17707 [Warning] InnoDB: innodb_
2017-09-02 01:31:14 17707 [Note] InnoDB: Initializing buffer pool, size = 10.0G
2017-09-02 01:31:15 17707 [Note] InnoDB: Completed initialization of buffer pool
2017-09-02 01:31:16 17707 [Note] InnoDB: Highest supported file format is Barracuda.
2017-09-02 01:32:27 17707 [Note] WSREP: Created page /var/lib/
2017-09-02 02:44:02 17707 [Note] WSREP: Created page /var/lib/
2017-09-02 03:03:40 17707 [Note] WSREP: Created page /var/lib/
2017-09-02 03:21:14 17707 [Note] WSREP: Created page /var/lib/
2017-09-02 03:40:19 17707 [Note] WSREP: Created page /var/lib/
2017-09-02 03:44:41 17707 [Note] WSREP: Created page /var/lib/
2017-09-02 03:49:21 17707 [Note] WSREP: Created page /var/lib/
2017-09-02 03:53:24 17707 [Note] InnoDB: 128 rollback segment(s) are active.
2017-09-02 03:53:24 17707 [Note] InnoDB: Waiting for purge to start
2017-09-02 03:53:24 17707 [Note] InnoDB: Percona XtraDB (http://
2017-09-02 03:53:24 17707 [ERROR] Aborting
2017-09-02 03:53:24 17707 [Note] WSREP: Signalling cancellation of the SST request.
2017-09-02 03:53:24 17707 [Note] WSREP: SST request was cancelled
2017-09-02 03:53:24 17707 [Note] WSREP: Closing send monitor...
2017-09-02 03:53:24 17707 [Note] WSREP: Closed send monitor.
2017-09-02 03:53:24 17707 [Note] WSREP: gcomm: terminating thread
2017-09-02 03:53:24 17707 [Note] WSREP: gcomm: joining thread
2017-09-02 03:53:24 17707 [Note] WSREP: gcomm: closing backend
2017-09-02 03:53:26 17707 [Note] WSREP: Service disconnected.
2017-09-02 03:53:26 17707 [Note] WSREP: Waiting to close threads......
2017-09-02 03:53:26 17707 [Note] WSREP: rollbacker thread exiting
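Since the only [ERROR] line in the excerpt above carries no reason, it can help to look at what was logged just before it and at where the time went. The following is a minimal sketch, not part of the original report: it assumes the "date time pid [Level] message" layout seen above, and the names `parse`, `context_before_errors`, `largest_gap` and `SAMPLE_LOG` are hypothetical helpers introduced here for illustration. The sample lines are copied from the log above, including the truncated paths.

```python
import re
from datetime import datetime
from typing import NamedTuple

# Layout assumed from the excerpt: "YYYY-MM-DD HH:MM:SS pid [Level] message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\d+) \[(\w+)\] (.*)$"
)


class Entry(NamedTuple):
    ts: datetime
    pid: int
    level: str
    message: str


def parse(lines):
    """Turn raw log lines into Entry tuples, skipping lines that do not match."""
    entries = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            entries.append(Entry(ts, int(m.group(2)), m.group(3), m.group(4)))
    return entries


def context_before_errors(entries, n=3):
    """For each [ERROR] entry, return it with the n entries logged before it."""
    return [
        (e, entries[max(0, i - n):i])
        for i, e in enumerate(entries)
        if e.level == "ERROR"
    ]


def largest_gap(entries):
    """Return (seconds, earlier, later) for the longest pause between entries."""
    best = None
    for prev, cur in zip(entries, entries[1:]):
        seconds = (cur.ts - prev.ts).total_seconds()
        if best is None or seconds > best[0]:
            best = (seconds, prev, cur)
    return best


# A few lines copied verbatim from the excerpt (paths are truncated there too).
SAMPLE_LOG = [
    "2017-09-02 01:31:16 17707 [Note] InnoDB: Highest supported file format is Barracuda.",
    "2017-09-02 01:32:27 17707 [Note] WSREP: Created page /var/lib/",
    "2017-09-02 02:44:02 17707 [Note] WSREP: Created page /var/lib/",
    "2017-09-02 03:53:24 17707 [Note] InnoDB: Waiting for purge to start",
    "2017-09-02 03:53:24 17707 [ERROR] Aborting",
]

if __name__ == "__main__":
    entries = parse(SAMPLE_LOG)
    for err, before in context_before_errors(entries, n=2):
        print("ERROR:", err.message)
        for b in before:
            print("  preceded by:", b.ts.time(), b.message)
    seconds, earlier, later = largest_gap(entries)
    print("longest pause:", seconds, "s between", earlier.ts.time(), "and", later.ts.time())
```

Run against the full error log instead of `SAMPLE_LOG`, the same functions would show which messages immediately precede the abort and where the startup spent its time.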
-------
This matches the strace log data:
2049 03:53:24 <... setpriority resumed> ) = 0
2049 03:53:24 getpriority(
17707 03:53:24 write(2, "2017-09-02 03:53:24 17707 [Note] InnoDB: Percona XtraDB (http://
2049 03:53:24 select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
17707 03:53:24 <... write resumed> ) = 137
17707 03:53:24 mmap(NULL, 8392704, PROT_READ|
17707 03:53:24 mprotect(
17707 03:53:24 clone(child_
2050 03:53:24 set_robust_
17707 03:53:24 mmap(NULL, 8392704, PROT_READ|
2050 03:53:24 <... set_robust_list resumed> ) = 0
17707 03:53:24 <... mmap resumed> ) = 0x7fb32c3f8000
17707 03:53:24 mprotect(
2050 03:53:24 futex(0x7fbf974
17707 03:53:24 clone( <unfinished ...>
2051 03:53:24 set_robust_
17707 03:53:24 <... clone resumed> child_stack=
2051 03:53:24 <... set_robust_list resumed> ) = 0
17707 03:53:24 mmap(NULL, 8392704, PROT_READ|
2051 03:53:24 futex(0x7fb4cd3
17707 03:53:24 <... mmap resumed> ) = 0x7fb32bbf7000
17707 03:53:24 mprotect(
17707 03:53:24 clone( <unfinished ...>
2052 03:53:24 set_robust_
17707 03:53:24 <... clone resumed> child_stack=
2052 03:53:24 <... set_robust_list resumed> ) = 0
2052 03:53:24 futex(0x7fb334a
2047 03:53:24 <... pread resumed> "K\315\
2047 03:53:24 pread(12, "\230S\314\376\0\n \354\0\
2047 03:53:24 pread(12, <unfinished ...>
17707 03:53:24 write(2, "2017-09-02 03:53:24 17707 [ERROR] Aborting\n\n", 44) = 44
17707 03:53:24 write(2, "2017-09-02 03:53:24 17707 [Note] WSREP: Signalling cancellation of the SST request.\n", 84) = 84
17707 03:53:24 write(2, "2017-09-02 03:53:24 17707 [Note] WSREP: SST request was cancelled\n", 66) = 66
17707 03:53:24 futex(0x7fb885b
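To line up the strace output with the error log, one option is to pull out only the complete `write(2, ...)` calls, which are the messages mysqld sent to stderr. The sketch below is illustrative only: it assumes the "tid time syscall" layout shown above (as produced by strace with thread IDs and wall-clock timestamps enabled), and `stderr_writes` and `SAMPLE_STRACE` are hypothetical names. Calls split into "<unfinished ...>" and "resumed" halves, or truncated in the excerpt, are deliberately skipped.

```python
import re

# Matches a complete write to fd 2, e.g.:
#   17707 03:53:24 write(2, "text\n", 44) = 44
WRITE_RE = re.compile(
    r'^(\d+)\s+(\d{2}:\d{2}:\d{2}) write\(2, "(.*)", (\d+)\)\s+= (\d+)$'
)


def stderr_writes(lines, tid=None):
    """Return (tid, time, text) for each complete write(2, ...) line.

    If tid is given, only writes from that thread/process ID are kept.
    """
    found = []
    for line in lines:
        m = WRITE_RE.match(line.strip())
        if not m:
            continue  # other syscalls, unfinished/resumed halves, truncated lines
        line_tid = int(m.group(1))
        if tid is not None and line_tid != tid:
            continue
        # strace prints newlines as the two characters "\n"; turn them back
        # into real newlines and drop the trailing ones.
        text = m.group(3).replace("\\n", "\n").rstrip("\n")
        found.append((line_tid, m.group(2), text))
    return found


# Lines copied from the strace excerpt above.
SAMPLE_STRACE = [
    r'2050 03:53:24 <... set_robust_list resumed> ) = 0',
    r'2047 03:53:24 pread(12, <unfinished ...>',
    r'17707 03:53:24 write(2, "2017-09-02 03:53:24 17707 [ERROR] Aborting\n\n", 44) = 44',
    r'17707 03:53:24 write(2, "2017-09-02 03:53:24 17707 [Note] WSREP: SST request was cancelled\n", 66) = 66',
]

if __name__ == "__main__":
    for line_tid, when, text in stderr_writes(SAMPLE_STRACE, tid=17707):
        print(line_tid, when, text)
```

The syscalls that appear between the last ordinary write and the "[ERROR] Aborting" write are then the ones worth inspecting for a failing return value.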
------------------
All logs have been attached.
I have added an archive with the mysql/innobackup.*.log files.
- output of rpm -qa |egrep -i "mysql|percona|maria" (or dpkg -l |egrep -i "mysql|percona|maria" |grep "ii")
- output of dmesg
- output of pt-summary
- output of pt-mysql-summary
- full copy of /var/log/messages (or syslog)
- full copy of /etc/my.cnf and any other cnf file that might be included
- full copy of error logs
- full copy of /var/lib/
- output of rpm -qa |egrep -i "mysql|