Test rpl.rpl_heartbeat_basic is unstable

Bug #1620200 reported by Laurynas Biveinis
This bug affects 1 person
Affects: Percona Server (moved to https://jira.percona.com/projects/PS)
Status tracked in 5.7

Series  Status        Importance  Assigned to        Milestone
5.5     Fix Released  Low         Laurynas Biveinis
5.6     Invalid       Undecided   Unassigned
5.7     Invalid       Undecided   Unassigned

Bug Description

On 5.5 trunk:

rpl.rpl_heartbeat_basic 'mix' w7 [ fail ]
        Test ended at 2016-08-31 13:16:53

CURRENT_TEST: rpl.rpl_heartbeat_basic
==14618== Memcheck, a memory error detector
==14618== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==14618== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==14618== Command: /mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/client//mysqltest --defaults-file=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/var/7/my.cnf --silent --tmpdir=/tmp/RgVDPRv73K/7 --character-sets-dir=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/sql/share/charsets --logdir=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/var/7/log --plugin_dir=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/plugin/auth --database=test --timer-file=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/var/7/log/timer --test-file=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/suite/rpl/t/rpl_heartbeat_basic.test --tail-lines=500 --result-file=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/suite/rpl/r/rpl_heartbeat_basic.result
==14618==
--- /mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/suite/rpl/r/rpl_heartbeat_basic.result 2016-08-31 18:15:28.880683377 +0300
+++ /mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/suite/rpl/r/rpl_heartbeat_basic.reject 2016-08-31 20:16:46.040931999 +0300
@@ -223,7 +223,7 @@
 CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=MASTER_PORT, MASTER_USER='root', MASTER_CONNECT_RETRY=20, MASTER_HEARTBEAT_PERIOD=5;
 include/start_slave.inc
 SET @@global.event_scheduler=1;
-Number of received heartbeat events: 0
+Number of received heartbeat events: 1
 DELETE FROM t1;
 DROP EVENT e1;

mysqltest: Result content mismatch

This is a known upstream issue, fixed in 5.6 by

commit 9ad3b3f94c66f9f28d3a64f4f339e4ea4738e489
Author: Andrei Elkin <email address hidden>
Date: Mon Dec 17 18:33:13 2012 +0200

    Bug#14258884 RPL.RPL_HEARTBEAT_BASIC FAILS SPORADICALLY ON PB2
    Bug#13627066 RPL.RPL_DEADLOCK_INNODB FAILS ON PB2 SPRADICALLY

    Sporadic and long time standing mismatch at the test run

      include/start_slave.inc
      SET @@global.event_scheduler=1;
      -Number of received heartbeat events: 0
      +Number of received heartbeat events: 1

    was caused by an incorrect assumption that no heartbeat event should
    be sent in the context of that test's snippet.
    In fact, there is no guarantee that an idle binlog state won't last
    longer than the heartbeat period. Even though the period of the UPDATE queries
    scheduled by the server's event scheduler is 1/5 of the heartbeat period, the actual time
    between two successive binlogging actions can be as long as the HB period. That is what
    the PB runs prove in practice.

    Fixed by removing the inefficient piece of the test.

    Bug#13627066 RPL.RPL_DEADLOCK_INNODB FAILS ON PB2 SPRADICALLY

    A possible reason for the SQL thread failing to increment the slave_transaction_retries
    status is a failure to start the slave threads that went unnoticed because of the non-blocking
    style of the slave start,
    combined with a small value of innodb_lock_wait_timeout. That could allow a race between
    the SQL thread retrying after the timeout and the mtr user thread counting through polling at an interval.

    Attempted to fix by correcting the slave start.
    A separate failure in this test that randomly happens on PB near

     source include/wait_for_slave_sql_error.inc;

    is addressed by adding debug print-outs via rpl_debug=1.
    Extra info will be necessary to actually tackle this issue (to be reported once the extra info is
    available in the PB saved test logs).
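
The timing argument in the commit message (updates scheduled at 1/5 of the heartbeat period can still be spaced far enough apart for a heartbeat to slip in) can be illustrated with a small simulation. This is only a sketch under a simplified model, not the server's actual logic: it assumes the master sends a heartbeat whenever a full heartbeat period passes with nothing sent, and that every binlog event or heartbeat resets that idle timer. The function name `count_heartbeats` and the millisecond timestamps are hypothetical and chosen for illustration.

```python
def count_heartbeats(event_times_ms, heartbeat_period_ms, duration_ms):
    """Count heartbeats under a simplified idle-timer model (an assumption).

    event_times_ms: times at which binlog events are sent, all within
    the observation window [0, duration_ms].
    """
    heartbeats = 0
    last_sent = 0  # time of the last thing sent (event or heartbeat)
    # The end of the window is appended as a final boundary.
    for t in sorted(event_times_ms) + [duration_ms]:
        # Emit one heartbeat for every full idle period before time t.
        while last_sent + heartbeat_period_ms <= t:
            last_sent += heartbeat_period_ms
            heartbeats += 1
        last_sent = t  # a binlog event resets the idle timer
    return heartbeats


# MASTER_HEARTBEAT_PERIOD=5 as in the test; events nominally every 1 s.
on_time = list(range(1000, 20001, 1000))
# Hypothetical stall (e.g. a slow Valgrind run): 6 s between two events.
stalled = [1000, 2000, 8000, 9000, 10000]

print(count_heartbeats(on_time, 5000, 20000))
print(count_heartbeats(stalled, 5000, 10000))
```

With evenly spaced events the model produces no heartbeats, which is what the original .result file expected. A single scheduling stall longer than the heartbeat period is enough to produce one, matching the "0" versus "1" mismatch in the diff.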

Tags: ci upstream
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PS-3548
