xb_galera_sst failures in XB Jenkins

Bug #1201686 reported by Alexey Kopytov
Affects             Status        Importance  Assigned to           Milestone
Percona XtraBackup  Fix Released  High        Raghavendra D Prabhu
  (moved to https://jira.percona.com/projects/PXB)
2.0                 Fix Released  High        Raghavendra D Prabhu
2.1                 Fix Released  High        Raghavendra D Prabhu

Bug Description

After rebuilding build-xtradb-cluster-binaries-for-xtrabackup-tests from an earlier revision (to avoid upstream http://bugs.mysql.com/69623), I now see the following xb_galera_sst failures in XB Jenkins builds:

130712 20:19:33 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_xtrabackup --role 'joiner' --address '127.0.0.1:3313' --auth 'jenkins:password' --datadir '/mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/ubuntu-lucid-64bit/xtrabackuptarget/galera55/test/var/w1/var901/data/' --defaults-file '/mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/ubuntu-lucid-64bit/xtrabackuptarget/galera55/test/var/w1/var901/my.cnf' --parent '22439'
 Read: '(null)'
130712 20:19:33 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '127.0.0.1:3313' --auth 'jenkins:password' --datadir '/mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/ubuntu-lucid-64bit/xtrabackuptarget/galera55/test/var/w1/var901/data/' --defaults-file '/mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/ubuntu-lucid-64bit/xtrabackuptarget/galera55/test/var/w1/var901/my.cnf' --parent '22439': 2 (No such file or directory)
130712 20:19:33 [ERROR] WSREP: Failed to prepare for 'xtrabackup' SST. Unrecoverable.
130712 20:19:33 [ERROR] Aborting

http://jenkins.percona.com/view/XtraBackup/job/percona-xtrabackup-2.1-param/378/BUILD_TYPE=release,Host=ubuntu-lucid-64bit,xtrabackuptarget=galera55/testReport/junit/%28root%29/t_xb_galera_sst/sh/

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

A few issues:

a)

"sh: ip: command not found"

So either iproute2 needs to be installed, or I need to set wsrep_node_address explicitly.

b) socat needs to be installed.

This is critical (the alternative is to work around it with a my.cnf containing an [sst] section, which can take longer).

c)

130712 19:58:56 [Warning] Ignoring user change to 'jenkins' because the user was set to 'mysql' earlier on the command line

130712 19:58:56 [Warning] Can't create test file /var/lib/mysql/jhc-new-centos6-64.lower-test
130712 19:58:56 [Warning] Can't create test file /var/lib/mysql/jhc-new-centos6-64.lower-test
/mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/centos6-64/xtrabackuptarget/galera55/test/server/bin//mysqld: Can't change dir to '/var/lib/mysql/' (Errcode: 2)
130712 19:58:56 [Warning] One can only use the --user switch if running as root

This is not critical; however, I am not sure about this warning.

${MYSQLD} --basedir=$MYSQL_BASEDIR --user=$USER --help --verbose --wsrep-sst-method=rsync | grep -q wsrep

Should --user=$USER not be passed here? I guess omitting it makes mysqld run as the jenkins user, which should be fine?
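
A minimal sketch of the same wsrep check with --user dropped, assuming a POSIX shell. The helper name has_wsrep and its two positional arguments are hypothetical and are not part of the test suite; the mysqld options are the ones from the command above.

```shell
# Hypothetical helper: succeeds if the given mysqld binary mentions
# "wsrep" in its verbose help output.
# --user is deliberately omitted, so mysqld runs as the invoking user.
has_wsrep() {
    mysqld_bin=$1   # path to the mysqld binary, e.g. "$MYSQLD"
    basedir=$2      # server base directory, e.g. "$MYSQL_BASEDIR"
    "$mysqld_bin" --basedir="$basedir" --help --verbose \
        --wsrep-sst-method=rsync 2>/dev/null | grep -q wsrep
}
```

It could be called as: has_wsrep "$MYSQLD" "$MYSQL_BASEDIR" || echo "no wsrep support, skipping". Discarding stderr also hides the --user warnings, which may or may not be desirable here.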

Revision history for this message
Stewart Smith (stewart) wrote : Re: [Bug 1201686] Re: xb_galera_sst failures in XB Jenkins

Raghavendra D Prabhu <email address hidden> writes:
> "sh: ip: command not found"
>
>
> So, either iproute2 needs to be installed or I need to set
> wsrep_node_address explicitly

Why not use 127.0.0.1 ?

> b) socat needs to be installed
>
> This is critical (or need to workaround with a my.cnf containing [sst]
> section, can take longer).

You can patch our puppet: lp:percona-dev-puppet to install the needed
packages.

> c)
>
> 130712 19:58:56 [Warning] Ignoring user change to 'jenkins' because the
> user was set to 'mysql' earlier on the command line
>
> 130712 19:58:56 [Warning] Can't create test file /var/lib/mysql/jhc-new-centos6-64.lower-test
> 130712 19:58:56 [Warning] Can't create test file /var/lib/mysql/jhc-new-centos6-64.lower-test
> /mnt/workspace/percona-xtrabackup-2.1-param/BUILD_TYPE/release/Host/centos6-64/xtrabackuptarget/galera55/test/server/bin//mysqld: Can't change dir to '/var/lib/mysql/' (Errcode: 2)
> 130712 19:58:56 [Warning] One can only use the --user switch if running as root
>
>
> This is not critical, however, I am not sure about this warning.

You don't get run as root, so you'll have to work around it. Everything
is run as user Jenkins, which doesn't have sudo or anything :)

> ${MYSQLD} --basedir=$MYSQL_BASEDIR --user=$USER --help --verbose
> --wsrep-sst-method=rsync| grep -q wsrep
>
>
> should --user=$USER be not passed here? I guess not passing this
> makes it run as jenkins user which I guess is fine?

You shouldn't need it.

--
Stewart Smith

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

> "sh: ip: command not found"
>
>
> So, either iproute2 needs to be installed or I need to set
> wsrep_node_address explicitly

> Why not use 127.0.0.1 ?

Actually, ip is used by wsrep_sst_xtrabackup itself, so iproute2 will need to be installed either way.

> b) socat needs to be installed

Ack, I will patch it.

Both iproute2 and socat are dependencies of PXC (trunk); however, since an older branch tip is being used, they may not be installed.
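
Until the packages are in place, a pre-flight check along these lines could make the failure more obvious. This is only a sketch, assuming a POSIX shell; the function name missing_tools and the skip message are hypothetical, and the tool list (ip, socat) is the one from this thread.

```shell
#!/bin/sh
# Print the names of any helper binaries that are not found in PATH,
# separated by spaces (prints an empty line if nothing is missing).
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="${missing:+$missing }$tool"
        fi
    done
    printf '%s\n' "$missing"
}

# ip comes from iproute2; socat is needed for the xtrabackup SST stream.
missing=$(missing_tools ip socat)
if [ -n "$missing" ]; then
    echo "Skipping SST test, missing tools: $missing" >&2
fi
```

Whether the test should be skipped or failed when a tool is missing is a separate decision; the sketch only reports it.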

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Here it fails differently:

130720 15:24:47 [Note] WSREP: Requesting state transfer: success, donor: 0
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors
WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 0 2 (20130720 15:24:47.815)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20130720 15:24:47.819)

http://jenkins.percona.com/view/XtraBackup/job/percona-xtrabackup-2.1-param/BUILD_TYPE=release,Host=ubuntu-lucid-64bit,xtrabackuptarget=galera55/382/testReport/junit/%28root%29/t_xb_galera_sst/sh/

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

It seems the donor is failing; however, the donor logs are not present there (or in the workspace). I will reproduce this locally and see what is happening.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

From my investigation, I found:

a) A bug in xb_galera_sst introduced by the new framework. I am fixing it.

b) An issue with innobackupex itself:

 >> log scanned up to (8443459)
 130723 5:20:06 InnoDB: Warning: allocated tablespace 10, old maximum was 9
 [01] Streaming ./ibdata1
 [01] ...done
 [01] Streaming ./sakila/actor.ibd
 [01] ...done
 [01] Streaming ./sakila/address.ibd
 [01] ...done
 [01] Streaming ./sakila/category.ibd
 [01] ...done
 [01] Streaming ./sakila/city.ibd
 [01] ...done
 [01] Streaming ./sakila/country.ibd
 [01] ...done
 [01] Streaming ./sakila/customer.ibd
 [01] ...done
 [01] Streaming ./sakila/film.ibd
 [01] ...done
 [01] Streaming ./sakila/film_actor.ibd
 [01] ...done
 [01] Streaming ./sakila/film_category.ibd
 [01] ...done
 [01] Streaming ./sakila/inventory.ibd
 [01] ...done
 [01] Streaming ./sakila/language.ibd
 [01] ...done
 [01] Streaming ./sakila/payment.ibd
 [01] ...done
 [01] Streaming ./sakila/rental.ibd
 [01] ...done
 [01] Streaming ./sakila/staff.ibd
 [01] ...done
 [01] Streaming ./sakila/store.ibd
 [01] ...done
 >> log scanned up to (8443459)
 130723 05:20:08 innobackupex: Continuing after ibbackup has suspended
 130723 05:20:08 innobackupex: Starting to lock all tables...
 tar: -: Cannot write: Broken pipe
 tar: Error is not recoverable: exiting now
 innobackupex: Error: Failed to stream 'xtrabackup_binlog_info': Inappropriate ioctl for device at /usr/sbin/innobackupex line 413.

It looks like an EOF is being sent somewhere in between, which closes socat on the other end and causes EPIPE later on.
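
The general effect can be shown in isolation. The following is a demonstration only, assuming a POSIX shell: it is not the actual SST pipeline. `yes` stands in for the tar stream and `head -n 1` for a receiver that stops reading early.

```shell
#!/bin/sh
# The reader (head) exits after one line; the writer (yes) keeps writing
# into the closed pipe and is stopped by SIGPIPE or fails with EPIPE.
status_file=$(mktemp)

# The brace group records the writer's exit status in a file, because the
# pipeline as a whole only reports the status of its last command.
{ yes "data"; echo $? > "$status_file"; } 2>/dev/null | head -n 1 >/dev/null

writer_status=$(cat "$status_file")
rm -f "$status_file"
echo "writer exit status: $writer_status"
```

The writer ends with a non-zero status even though the reader exited cleanly, which is the same shape as the "tar: -: Cannot write: Broken pipe" error above.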

Revision history for this message
Alexey Kopytov (akopytov) wrote :

I'm not sure I understand issue b). I.e. how is it possible to "send EOF" to a pipe?

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

The EOF is for socat, since it operates over the network. The handling of that EOF is what I am looking at.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Fixed in both PXC and PXB. I am starting the builds at http://jenkins.percona.com/job/build-xtradb-cluster-binaries-for-xtrabackup-tests/ and will start the param builds after that.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PXB-380
