Percona Server moved to https://jira.percona.com/projects/PS

Test rpl.rpl_heartbeat_basic is unstable

Bug #1620200 reported by Laurynas Biveinis on 2016-09-05

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
Percona Server (status tracked in 5.7)
5.5	Fix Released	Low	Laurynas Biveinis	5.5.52-38.3
5.6	Invalid	Undecided	Unassigned
5.7	Invalid	Undecided	Unassigned

Bug Description

On 5.5 trunk:

rpl.rpl_heartbeat_basic 'mix' w7 [ fail ]
Test ended at 2016-08-31 13:16:53

CURRENT_TEST: rpl.rpl_heartbeat_basic
==14618== Memcheck, a memory error detector
==14618== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==14618== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==14618== Command: /mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/client//mysqltest --defaults-file=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/var/7/my.cnf --silent --tmpdir=/tmp/RgVDPRv73K/7 --character-sets-dir=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/sql/share/charsets --logdir=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/var/7/log --plugin_dir=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/plugin/auth --database=test --timer-file=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/var/7/log/timer --test-file=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/suite/rpl/t/rpl_heartbeat_basic.test --tail-lines=500 --result-file=/mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/suite/rpl/r/rpl_heartbeat_basic.result
==14618==
--- /mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/suite/rpl/r/rpl_heartbeat_basic.result 2016-08-31 18:15:28.880683377 +0300
+++ /mnt/workspace/percona-server-5.5-valgrind/BUILD_TYPE/valgrind/Host/ubuntu-xenial-64bit/mysql-test/suite/rpl/r/rpl_heartbeat_basic.reject 2016-08-31 20:16:46.040931999 +0300
@@ -223,7 +223,7 @@
CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=MASTER_PORT, MASTER_USER='root', MASTER_CONNECT_RETRY=20, MASTER_HEARTBEAT_PERIOD=5;
include/start_slave.inc
SET @@global.event_scheduler=1;
-Number of received heartbeat events: 0
+Number of received heartbeat events: 1
DELETE FROM t1;
DROP EVENT e1;

mysqltest: Result content mismatch

This is a known upstream issue, fixed in 5.6 by

commit 9ad3b3f94c66f9f28d3a64f4f339e4ea4738e489
Author: Andrei Elkin <email address hidden>
Date: Mon Dec 17 18:33:13 2012 +0200

Bug#14258884 RPL.RPL_HEARTBEAT_BASIC FAILS SPORADICALLY ON PB2
Bug#13627066 RPL.RPL_DEADLOCK_INNODB FAILS ON PB2 SPRADICALLY

Sporadic and long-standing mismatch in the test run

      include/start_slave.inc
      SET @@global.event_scheduler=1;
      -Number of received heartbeat events: 0
      +Number of received heartbeat events: 1

    was caused by an incorrect assumption that no heartbeat event should
    be sent in the context of that test's snippet.
    In fact, there is no guarantee that the binlog won't stay idle for
    longer than the heartbeat period. Even though the period of the UPDATE queries
    scheduled by the server's event scheduler is 1/5 of the heartbeat period, the actual time between two
    successive binlogging actions can be as long as the HB period. That is what the PB runs
    prove in practice.

Fixed by removing the ineffective piece of the test.

Bug#13627066 RPL.RPL_DEADLOCK_INNODB FAILS ON PB2 SPRADICALLY

    A possible reason for the SQL thread failing to increment the slave_transaction_retries
    status is a failure to start the slave threads that went unnoticed because of the non-blocking
    style of the slave start,
    combined with an also small value of innodb_lock_wait_timeout. That could allow a race between
    the SQL thread retrying after a timeout and the mtr user thread counting through polling at an interval.

Attempted to fix by correcting the slave start.
A separate failure in this test that randomly happens on PB near

source include/wait_for_slave_sql_error.inc;

    is addressed by adding debug printouts via rpl_debug=1.
    Extra info will be necessary to actually tackle this issue (to be reported once the extra info is
    available in the PB saved test logs).
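
The heartbeat reasoning in the commit message can be illustrated with a small sketch. It assumes a simplified model in which the master sends one heartbeat each time a full heartbeat period passes with nothing written to the binlog. The function name count_heartbeats and the timestamps are hypothetical and are not taken from the test or the server code; only the 5-second period (MASTER_HEARTBEAT_PERIOD=5) and the 1/5 ratio come from the report.

```python
def count_heartbeats(binlog_times, heartbeat_period):
    """Count heartbeats sent between successive binlog writes.

    Simplified model (an assumption, not the server's actual logic): one
    heartbeat is emitted each time a full heartbeat_period elapses with
    nothing written to the binlog.
    """
    heartbeats = 0
    for earlier, later in zip(binlog_times, binlog_times[1:]):
        idle = later - earlier
        heartbeats += int(idle // heartbeat_period)
    return heartbeats


HEARTBEAT_PERIOD = 5.0                  # MASTER_HEARTBEAT_PERIOD=5 in the test
EVENT_INTERVAL = HEARTBEAT_PERIOD / 5   # scheduled UPDATE every 1/5 of the period

# Ideal run: every scheduled UPDATE is binlogged exactly on time.
on_time = [i * EVENT_INTERVAL for i in range(10)]

# Stalled run (made-up timestamps): the writes due at 3 s to 7 s are held up,
# for example on a slow Valgrind host, and land together with the one due at 8 s.
stalled = [0.0, 1.0, 2.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 9.0]

print(count_heartbeats(on_time, HEARTBEAT_PERIOD))   # 0
print(count_heartbeats(stalled, HEARTBEAT_PERIOD))   # 1
```

Under this model a single 6-second stall is enough to turn the expected "0" into "1", which matches the mismatch in the failing run even though the nominal event interval is well below the heartbeat period.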

Tags: ci upstream

Laurynas Biveinis (laurynas-biveinis) on 2016-09-05

tags: added: ci upstream

Laurynas Biveinis (laurynas-biveinis) wrote on 2016-09-05:

     https://github.com/percona/percona-server/pull/977
     https://github.com/percona/percona-server/pull/978
     https://github.com/percona/percona-server/pull/979

Laurynas Biveinis (laurynas-biveinis) wrote on 2016-11-03:

See bug 1638919

Shahriyar Rzayev (rzayev-sehriyar) wrote on 2018-01-25:

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-3548
