xtrabackup sst clobber clobbers error log

Bug #1229968 reported by Jay Janssen on 2013-09-24
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Fix Released
Undecided
Unassigned

Bug Description

130924 22:05:24 [Note] WSREP: Requesting state transfer: success, donor: 1
WSREP_SST: [INFO] Proceeding with SST (20130924 22:05:26.055)
WSREP_SST: [INFO] Cleaning the existing datadir (20130924 22:05:26.058)
removed `/var/lib/mysql/backup-my.cnf'
removed `/var/lib/mysql/error.log'
removed `/var/lib/mysql/mysql/db.frm'
removed `/var/lib/mysql/mysql/db.MYI'
removed `/var/lib/mysql/mysql/db.MYD'
removed `/var/lib/mysql/mysql/host.frm'
removed `/var/lib/mysql/mysql/host.MYI'
removed `/var/lib/mysql/mysql/host.MYD'
removed `/var/lib/mysql/mysql/user.frm'
removed `/var/lib/mysql/mysql/user.MYI'
removed `/var/lib/mysql/mysql/user.MYD'
removed `/var/lib/mysql/mysql/func.frm'
removed `/var/lib/mysql/mysql/func.MYI'
removed `/var/lib/mysql/mysql/func.MYD'
removed `/var/lib/mysql/mysql/plugin.frm'
removed `/var/lib/mysql/mysql/plugin.MYI'
removed `/var/lib/mysql/mysql/plugin.MYD'
removed `/var/lib/mysql/mysql/servers.frm'
removed `/var/lib/mysql/mysql/servers.MYI'
removed `/var/lib/mysql/mysql/servers.MYD'
removed `/var/lib/mysql/mysql/tables_priv.frm'
removed `/var/lib/mysql/mysql/tables_priv.MYI'
removed `/var/lib/mysql/mysql/tables_priv.MYD'
removed `/var/lib/mysql/mysql/columns_priv.frm'
removed `/var/lib/mysql/mysql/columns_priv.MYI'
removed `/var/lib/mysql/mysql/columns_priv.MYD'
removed `/var/lib/mysql/mysql/help_topic.frm'
removed `/var/lib/mysql/mysql/help_topic.MYI'
removed `/var/lib/mysql/mysql/help_topic.MYD'
removed `/var/lib/mysql/mysql/help_category.frm'
removed `/var/lib/mysql/mysql/help_category.MYI'
removed `/var/lib/mysql/mysql/help_category.MYD'
removed `/var/lib/mysql/mysql/help_relation.frm'
removed `/var/lib/mysql/mysql/help_relation.MYI'
removed `/var/lib/mysql/mysql/help_relation.MYD'
removed `/var/lib/mysql/mysql/help_keyword.frm'
removed `/var/lib/mysql/mysql/help_keyword.MYI'
removed `/var/lib/mysql/mysql/help_keyword.MYD'
removed `/var/lib/mysql/mysql/time_zone_name.frm'
removed `/var/lib/mysql/mysql/time_zone_name.MYI'
removed `/var/lib/mysql/mysql/time_zone_name.MYD'
removed `/var/lib/mysql/mysql/time_zone.frm'
removed `/var/lib/mysql/mysql/time_zone.MYI'
removed `/var/lib/mysql/mysql/time_zone.MYD'
removed `/var/lib/mysql/mysql/time_zone_transition.frm'
removed `/var/lib/mysql/mysql/time_zone_transition.MYI'
removed `/var/lib/mysql/mysql/time_zone_transition.MYD'
removed `/var/lib/mysql/mysql/time_zone_transition_type.frm'
removed `/var/lib/mysql/mysql/time_zone_transition_type.MYI'
removed `/var/lib/mysql/mysql/time_zone_transition_type.MYD'
removed `/var/lib/mysql/mysql/time_zone_leap_second.frm'
removed `/var/lib/mysql/mysql/time_zone_leap_second.MYI'
removed `/var/lib/mysql/mysql/time_zone_leap_second.MYD'
removed `/var/lib/mysql/mysql/proc.frm'
removed `/var/lib/mysql/mysql/proc.MYI'
removed `/var/lib/mysql/mysql/proc.MYD'
removed `/var/lib/mysql/mysql/procs_priv.frm'
removed `/var/lib/mysql/mysql/procs_priv.MYI'
removed `/var/lib/mysql/mysql/procs_priv.MYD'
removed `/var/lib/mysql/mysql/general_log.frm'
removed `/var/lib/mysql/mysql/general_log.CSM'
removed `/var/lib/mysql/mysql/general_log.CSV'
removed `/var/lib/mysql/mysql/slow_log.frm'
removed `/var/lib/mysql/mysql/slow_log.CSM'
removed `/var/lib/mysql/mysql/slow_log.CSV'
removed `/var/lib/mysql/mysql/event.frm'
removed `/var/lib/mysql/mysql/event.MYI'
removed `/var/lib/mysql/mysql/event.MYD'
removed `/var/lib/mysql/mysql/ndb_binlog_index.frm'
removed `/var/lib/mysql/mysql/ndb_binlog_index.MYI'
removed `/var/lib/mysql/mysql/ndb_binlog_index.MYD'
removed `/var/lib/mysql/mysql/proxies_priv.frm'
removed `/var/lib/mysql/mysql/proxies_priv.MYI'
removed `/var/lib/mysql/mysql/proxies_priv.MYD'
removed directory: `/var/lib/mysql/mysql'
removed directory: `/var/lib/mysql/test'
removed `/var/lib/mysql/performance_schema/db.opt'
removed `/var/lib/mysql/performance_schema/cond_instances.frm'
removed `/var/lib/mysql/performance_schema/events_waits_current.frm'
removed `/var/lib/mysql/performance_schema/events_waits_history.frm'
removed `/var/lib/mysql/performance_schema/events_waits_history_long.frm'
removed `/var/lib/mysql/performance_schema/events_waits_summary_by_instance.frm'
removed `/var/lib/mysql/performance_schema/events_waits_summary_by_thread_by_event_name.frm'
removed `/var/lib/mysql/performance_schema/events_waits_summary_global_by_event_name.frm'
removed `/var/lib/mysql/performance_schema/file_instances.frm'
removed `/var/lib/mysql/performance_schema/file_summary_by_event_name.frm'
removed `/var/lib/mysql/performance_schema/file_summary_by_instance.frm'
removed `/var/lib/mysql/performance_schema/mutex_instances.frm'
removed `/var/lib/mysql/performance_schema/performance_timers.frm'
removed `/var/lib/mysql/performance_schema/rwlock_instances.frm'
removed `/var/lib/mysql/performance_schema/setup_consumers.frm'
removed `/var/lib/mysql/performance_schema/setup_instruments.frm'
removed `/var/lib/mysql/performance_schema/setup_timers.frm'
removed `/var/lib/mysql/performance_schema/threads.frm'
removed directory: `/var/lib/mysql/performance_schema'
removed `/var/lib/mysql/ibdata1'
removed `/var/lib/mysql/ib_logfile0'
removed `/var/lib/mysql/ib_logfile1'
removed `/var/lib/mysql/mysql.sock'
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20130924 22:05:26.405)
WSREP_SST: [ERROR] xtrabackup process ended without creating '/var/lib/mysql//xtrabackup_galera_info' (20130924 22:05:44.432)
WSREP_SST: [INFO] Contents of datadir (20130924 22:05:44.436)
130924 22:05:44 [Warning] WSREP: 1 (ip-10-238-177-207): State transfer to 0 (ip-10-141-160-44) failed: -1 (Operation not permitted)
130924 22:05:44 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():719: Will never receive state. Need to abort.
130924 22:05:44 [Note] WSREP: gcomm: terminating thread
130924 22:05:44 [Note] WSREP: gcomm: joining thread
130924 22:05:44 [Note] WSREP: gcomm: closing backend
WSREP_SST: [INFO] -rw-rw----. 1 mysql mysql 263 Sep 24 22:05 /var/lib/mysql//backup-my.cnf
-rw-------. 1 mysql mysql 2147484968 Sep 24 22:05 /var/lib/mysql//galera.cache
-rw-rw----. 1 mysql mysql 108 Sep 24 22:05 /var/lib/mysql//grastate.dat
-rw-rw----. 1 mysql mysql 0 Sep 24 22:05 /var/lib/mysql//sst_in_progress (20130924 22:05:44.444)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20130924 22:05:44.448)
WSREP_SST: [INFO] Removing the sst_in_progress file (20130924 22:05:44.452)
130924 22:05:44 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '10.141.160.44' --auth 'sst:secret' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '13658': 32 (Broken pipe)
130924 22:05:44 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
130924 22:05:44 [ERROR] WSREP: SST failed: 32 (Broken pipe)
130924 22:05:44 [ERROR] Aborting

130924 22:05:45 [Note] WSREP: view(view_id(NON_PRIM,67a871a1-2565-11e3-a1fe-46b4dcd8ab8f,18) memb {
 67a871a1-2565-11e3-a1fe-46b4dcd8ab8f,
} joined {
} left {
} partitioned {
 9275e205-255f-11e3-b2ab-36ce11b3ab07,
 ba03803a-255e-11e3-bbe4-9fbeb2b4b2a4,
})
130924 22:05:45 [Note] WSREP: view((empty))
130924 22:05:45 [Note] WSREP: gcomm: closed
130924 22:05:45 [Note] WSREP: /usr/sbin/mysqld: Terminated.
^

My SST failed, and then I have this in my datadir:

[root@ip-10-141-160-44 mysql]# ls -lah
total 2.1G
drwxr-xr-x. 2 mysql mysql 80 Sep 24 22:05 .
drwxr-xr-x. 16 root root 4.0K Sep 24 21:05 ..
-rw-rw----. 1 mysql mysql 263 Sep 24 22:05 backup-my.cnf
-rw-r-----. 1 mysql root 91 Sep 24 22:05 error.log
-rw-------. 1 mysql mysql 2.1G Sep 24 22:05 galera.cache
-rw-rw----. 1 mysql mysql 108 Sep 24 22:05 grastate.dat

[root@ip-10-141-160-44 mysql]# cat error.log
130924 22:05:45 mysqld_safe mysqld from pid file /var/lib/mysql/ip-10-141-160-44.pid ended
130924 22:11:14 [Note] WSREP: Read nil XID from storage engines, skipping position init
130924 22:11:14 [Note] WSREP: wsrep_load(): loading provider library 'none'
130924 22:11:14 [Note] WSREP: Service disconnected.
130924 22:11:15 [Note] WSREP: Some threads may fail to exit.
130924 22:11:15 [Note] WSREP: Read nil XID from storage engines, skipping position init
130924 22:11:15 [Note] WSREP: wsrep_load(): loading provider library 'none'
130924 22:11:15 [Note] WSREP: Service disconnected.
130924 22:11:16 [Note] WSREP: Some threads may fail to exit.

This makes it very hard to diagnose the error. I caught my initial output from my terminal window, it was not clear what happened after.

The default extension of '.err' for error log should avoid this.

            find $DATA -mindepth 1 -regex '.*galera.cache$\|.*sst_in_progress$\|.*grastate.dat$\|.*err$' -prune -o -exec rm -rfv {} 1>&2 \+

is the command used to remove.

Only other alternative is to make this pattern configurable from my.cnf.

Alex Yurchenko (ayurchen) wrote :

From what I see your policy now to clobber everything but some known files. Initially we used the same with rsync SST, but found it wanting. I'd suggest it should be the opposite: clobber only known files, leave unknown intact.

The "clobber known files" has its issues with xbstream (fails to clobber) and even with tar - https://bugs.launchpad.net/bugs/1213073
That issue won't present with rsync since rsync is much more filetree hierarchy aware, hence rsync SST is fine here.

Making it configurable should also allow for other files like RPM_UPGRADE_HISTORY etc.

Note that socket and pid files are created after SST hence, they won't be removed.

This doesn't cover the case of the error log being renamed to 'error.log' and still being in my datadir. This should be easy to figure out from the my.cnf, I'd think, but maybe not.

On Sep 25, 2013, at 1:45 AM, Raghavendra D Prabhu <email address hidden> wrote:

> The default extension of '.err' for error log should avoid this.
>
> find $DATA -mindepth 1 -regex
> '.*galera.cache$\|.*sst_in_progress$\|.*grastate.dat$\|.*err$' -prune
> -o -exec rm -rfv {} 1>&2 \+
>
>
> is the command used to remove.
>
>
> Only other alternative is to make this pattern configurable from my.cnf.

Jay Janssen, MySQL Consulting Lead, Percona
http://about.me/jay.janssen

Nothing has been renamed or clobbered here. There is no error log backed up by xtrabackup, so it is not clobbered here either. It has been removed by SST script before the streaming (" removed `/var/lib/mysql/error.log'"). The error.log present in datadir is re-created by joiner itself. (probably after it has been removed). The default error log is `hostname`.err, can you share the my.cnf of the joiner? (it must be defined in that).

Jay Janssen (jay-janssen) wrote :

[root@ip-10-149-26-184 mysql]# cat /etc/my.cnf
[mysqld]
datadir = /var/lib/mysql
log_error = error.log

binlog_format = ROW

innodb_buffer_pool_size = 10G
innodb_log_file_size = 1G
innodb_flush_method = O_DIRECT
innodb_file_per_table
innodb_flush_log_at_trx_commit = 0

wsrep_cluster_name = mycluster
wsrep_node_name = ip-10-149-26-184

wsrep_provider = /usr/lib64/libgalera_smm.so
wsrep_provider_options = "gcache.size=2G; gcs.fc_limit=2048"

wsrep_slave_threads = 16
wsrep_auto_increment_control = OFF

wsrep_sst_method = xtrabackup
wsrep_sst_auth = sst:secret

innodb_locks_unsafe_for_binlog = 1
innodb_autoinc_lock_mode = 2

[mysql]
prompt = "ip-10-149-26-184 mysql> "

[client]
user = root

!include /etc/my-pxc.cnf
!include /etc/my-pxc-extra.cnf

" log_error = error.log" as expected (the paths are taken relative to datadir). You can make it "log_error = error.err" as workaround..

I have created - https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1231121 to make this configurable.

Also, added to errata here http://www.percona.com/doc/percona-xtradb-cluster/errata.html

Alex Yurchenko (ayurchen) wrote :

@3: I fail to see what issues can be there. Shouldn't removing all subdirs (schemas), ibdata and ib_log* files on joiner behalf be sufficient? And you don't need to configure anything.

@Alex,

Yes, that should do. However, there can be nuances in that - also need to remove backup-my.cnf (if present from older runs), xtrabackup* files, ib_lru_dump and any other file which may be backed up (innodb_doublewrite file and other exotic variants included). Also, even if one file is missed and is present at destination while being present in backup, xbstream fails with errno 17.

Also, keeping older files like mysql binlog ones may cause trouble while starting. Hence, I decided to go with whitelisting while removing the files.

Fixed in 5.5.34.

Changed in percona-xtradb-cluster:
milestone: none → 5.6-future
status: New → Fix Released
milestone: 5.6-future → none
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers