xb_galera_sst failures in XB Jenkins

Reported by Alexey Kopytov on 2013-07-16
This bug affects 1 person
Affects             Importance  Assigned to
Percona XtraBackup  High        Raghavendra D Prabhu
2.0                 High        Raghavendra D Prabhu
2.1                 High        Raghavendra D Prabhu

Bug Description

After rebuilding build-xtradb-cluster-binaries-for-xtrabackup-tests from an earlier revision (to avoid upstream http://bugs.mysql.com/69623), I now see the following xb_galera_sst failures in XB Jenkins builds:

130712 20:19:33 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_xtrabackup --role 'joiner' --address '127.0.0.1:3313' --auth 'jenkins:password' --datadir '/mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/ubuntu-lucid-64bit/xtrabackuptarget/galera55/test/var/w1/var901/data/' --defaults-file '/mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/ubuntu-lucid-64bit/xtrabackuptarget/galera55/test/var/w1/var901/my.cnf' --parent '22439'
 Read: '(null)'
130712 20:19:33 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '127.0.0.1:3313' --auth 'jenkins:password' --datadir '/mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/ubuntu-lucid-64bit/xtrabackuptarget/galera55/test/var/w1/var901/data/' --defaults-file '/mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/ubuntu-lucid-64bit/xtrabackuptarget/galera55/test/var/w1/var901/my.cnf' --parent '22439': 2 (No such file or directory)
130712 20:19:33 [ERROR] WSREP: Failed to prepare for 'xtrabackup' SST. Unrecoverable.
130712 20:19:33 [ERROR] Aborting

http://jenkins.percona.com/view/XtraBackup/job/percona-xtrabackup-2.1-param/378/BUILD_TYPE=release,Host=ubuntu-lucid-64bit,xtrabackuptarget=galera55/testReport/junit/%28root%29/t_xb_galera_sst/sh/

A few issues:

a)

"sh: ip: command not found"

So, either iproute2 needs to be installed or I need to set wsrep_node_address explicitly

b) socat needs to be installed

This is critical (the alternative is a workaround using a my.cnf with an [sst] section, which can take longer).

c)

130712 19:58:56 [Warning] Ignoring user change to 'jenkins' because the user was set to 'mysql' earlier on the command line

130712 19:58:56 [Warning] Can't create test file /var/lib/mysql/jhc-new-centos6-64.lower-test
130712 19:58:56 [Warning] Can't create test file /var/lib/mysql/jhc-new-centos6-64.lower-test
/mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/centos6-64/xtrabackuptarget/galera55/test/server/bin//mysqld: Can't change dir to '/var/lib/mysql/' (Errcode: 2)
130712 19:58:56 [Warning] One can only use the --user switch if running as root

This is not critical, however, I am not sure about this warning.

${MYSQLD} --basedir=$MYSQL_BASEDIR --user=$USER --help --verbose --wsrep-sst-method=rsync| grep -q wsrep

Should --user=$USER not be passed here? I guess omitting it makes mysqld run as the jenkins user, which should be fine?

Raghavendra D Prabhu <email address hidden> writes:
> "sh: ip: command not found"
>
>
> So, either iproute2 needs to be installed or I need to set
> wsrep_node_address explicitly

Why not use 127.0.0.1 ?

> b) socat needs to be installed
>
> This is critical (or need to workaround with a my.cnf containing [sst]
> section, can take longer).

You can patch our puppet: lp:percona-dev-puppet to install the needed
packages.

> c)
>
> 130712 19:58:56 [Warning] Ignoring user change to 'jenkins' because the
> user was set to 'mysql' earlier on the command line
>
> 130712 19:58:56 [Warning] Can't create test file /var/lib/mysql/jhc-new-centos6-64.lower-test
> 130712 19:58:56 [Warning] Can't create test file /var/lib/mysql/jhc-new-centos6-64.lower-test
> /mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/centos6-64/xtrabackuptarget/galera55/test/server/bin//mysqld: Can't change dir to '/var/lib/mysql/' (Errcode: 2)
> 130712 19:58:56 [Warning] One can only use the --user switch if running as root
>
>
> This is not critical, however, I am not sure about this warning.

You don't get run as root, so you'll have to work around it. Everything
is run as user Jenkins, which doesn't have sudo or anything :)

> ${MYSQLD} --basedir=$MYSQL_BASEDIR --user=$USER --help --verbose
> --wsrep-sst-method=rsync| grep -q wsrep
>
>
> should --user=$USER be not passed here? I guess not passing this
> makes it run as jenkins user which I guess is fine?

you shouldn't need it.

--
Stewart Smith
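
A minimal sketch of the wsrep probe with the user switch dropped, as suggested above. The function names are hypothetical; MYSQLD and MYSQL_BASEDIR are the variables from the quoted test command, and the rest of the command line is unchanged from the report.

```sh
# Hypothetical helper: succeeds if the text on stdin mentions wsrep.
help_mentions_wsrep() {
    grep -q wsrep
}

# Same probe as in the report, minus --user=$USER. The test run is not
# root, so mysqld only warns about the switch and ignores it anyway.
server_supports_wsrep() {
    "${MYSQLD}" --basedir="${MYSQL_BASEDIR}" --help --verbose \
        --wsrep-sst-method=rsync | help_mentions_wsrep
}
```

Without the switch, mysqld simply runs as whichever user invoked it, which on the Jenkins hosts is the jenkins user.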

 > "sh: ip: command not found"
>
>
> So, either iproute2 needs to be installed or I need to set
> wsrep_node_address explicitly

Why not use 127.0.0.1 ?

Actually, ip is used by wsrep_sst_xtrabackup itself, so iproute2 has to be installed regardless.

> b) socat needs to be installed

Ack, I will patch it.

Both iproute2 and socat are dependencies of PXC (trunk); however, since an older branch tip is being used, they may not be installed.
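
Until the packages are installed on every build host, a test could check for the two commands up front and fail with a clear message. The sketch below is only an illustration: both function names are hypothetical and are not part of the XtraBackup test suite.

```sh
# Print the name of every given command that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Return non-zero (with a message on stderr) if ip or socat is absent.
check_sst_prerequisites() {
    missing=$(missing_commands ip socat)
    if [ -n "$missing" ]; then
        # $missing is left unquoted on purpose so the names print on one line.
        echo "missing SST prerequisites:" $missing >&2
        return 1
    fi
}
```

Calling check_sst_prerequisites at the start of the test turns "sh: ip: command not found" deep in the server log into an immediate, readable failure.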

Alexey Kopytov (akopytov) wrote :

Here it fails differently:

130720 15:24:47 [Note] WSREP: Requesting state transfer: success, donor: 0
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors
WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 0 2 (20130720 15:24:47.815)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20130720 15:24:47.819)

http://jenkins.percona.com/view/XtraBackup/job/percona-xtrabackup-2.1-param/BUILD_TYPE=release,Host=ubuntu-lucid-64bit,xtrabackuptarget=galera55/382/testReport/junit/%28root%29/t_xb_galera_sst/sh/

It seems the donor is failing; however, the donor's logs are not present there (or in the workspace). I will reproduce this locally and see what is happening.

From my investigation, I found:

a) A bug in xb_galera_sst introduced by the new framework - fixing it.

b) An issue with innobackupex itself:

 >> log scanned up to (8443459)
 130723 5:20:06 InnoDB: Warning: allocated tablespace 10, old maximum was 9
 [01] Streaming ./ibdata1
 [01] ...done
 [01] Streaming ./sakila/actor.ibd
 [01] ...done
 [01] Streaming ./sakila/address.ibd
 [01] ...done
 [01] Streaming ./sakila/category.ibd
 [01] ...done
 [01] Streaming ./sakila/city.ibd
 [01] ...done
 [01] Streaming ./sakila/country.ibd
 [01] ...done
 [01] Streaming ./sakila/customer.ibd
 [01] ...done
 [01] Streaming ./sakila/film.ibd
 [01] ...done
 [01] Streaming ./sakila/film_actor.ibd
 [01] ...done
 [01] Streaming ./sakila/film_category.ibd
 [01] ...done
 [01] Streaming ./sakila/inventory.ibd
 [01] ...done
 [01] Streaming ./sakila/language.ibd
 [01] ...done
 [01] Streaming ./sakila/payment.ibd
 [01] ...done
 [01] Streaming ./sakila/rental.ibd
 [01] ...done
 [01] Streaming ./sakila/staff.ibd
 [01] ...done
 [01] Streaming ./sakila/store.ibd
 [01] ...done
 >> log scanned up to (8443459)
 130723 05:20:08 innobackupex: Continuing after ibbackup has suspended
 130723 05:20:08 innobackupex: Starting to lock all tables...
 tar: -: Cannot write: Broken pipe
 tar: Error is not recoverable: exiting now
 innobackupex: Error: Failed to stream 'xtrabackup_binlog_info': Inappropriate ioctl for device at /usr/sbin/innobackupex line 413.

It looks like an EOF is being sent somewhere in between, which closes socat on the other end and causes EPIPE later on.

Alexey Kopytov (akopytov) wrote :

I'm not sure I understand issue b). I.e. how is it possible to "send EOF" to a pipe?

The EOF is for socat, since it operates over the network. The handling of EOF is what I am looking at.
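
The general failure mode (a reader that stops early, followed by a writer that fails on its next write) can be reproduced with a plain pipe. This is only an analogy under stated assumptions: `head -n 1` stands in for the receiving end that goes away, the subshell stands in for the streaming side, and the function name is hypothetical. It does not reproduce the socat behaviour itself.

```sh
# Print the exit status of a write attempted after the reader has exited.
writer_status_after_reader_exit() {
    status_file=$(mktemp)
    (
        # Ignore SIGPIPE so the failed write is reported as an error
        # (EPIPE) instead of silently killing the writer.
        trap '' PIPE
        printf 'first chunk\n'      # the reader consumes this line and exits
        sleep 1                     # give the reader time to go away
        printf 'second chunk\n' 2>/dev/null
        echo $? > "$status_file"    # record the status of the second write
    ) | head -n 1 >/dev/null
    cat "$status_file"
    rm -f "$status_file"
}
```

The function prints a non-zero status, because the second write hits a pipe with no reader. The same class of error is what tar reports as "Cannot write: Broken pipe" in the log above.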

Fixed in both PXC and PXB. Starting builds at http://jenkins.percona.com/job/build-xtradb-cluster-binaries-for-xtrabackup-tests/ and param builds after that.
