Slave crash with malformed binlog event
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Percona Server moved to https://jira.percona.com/projects/PS |
Fix Released
|
High
|
Vlad Lesin | ||
5.5 |
New
|
Undecided
|
Unassigned | ||
5.6 |
Fix Released
|
High
|
Vlad Lesin |
Bug Description
The fix for https:/
mysqlbinlog attempts on binlog from surviving (master) node show:
"ERROR: Error in Log_event:
ERROR: Could not read entry at offset 216385052: Error in log format or read error.
WARNING: The range of printed events ends with a row event or a table map event that does not have the STMT_END_F flag set. This might be because the last statement was not fully written to the log, or because you are using a --stop-position or --stop-datetime that refers to an event in the middle of a statement. The event(s) from the partial statement have not been written to output."
Customer added:
If I look in the error log of a crashed node I see that the last applied Xid is:
"2015-06-11 09:51:25 48997 [ERROR] WSREP: applier could not read binlog event, seqno: 12575799, len: 551
2015-06-11 09:51:25 48997 [Warning] WSREP: Failed to apply app buffer: seqno: 12575799, status: 1
at galera/
I searched for that Xid in binary logs on the primary and could not find it. The last one to be displayed 12575798.
So, given the fix for lp:1438990 was related to adding a table map, I'm assuming that it could have introduced a bug where it would not add a STMT_END_F under certain conditions.
Customer is working to isolate steps to reproduce, I will update here when we have more details.
Changed in percona-server: | |
assignee: | nobody → Vlad Lesin (vlad-lesin) |
The following analysis is based on the code revision number c250ec9791ed6c7 695ae6a191397a3 b65db2e994:
Analysis #1.
1) The assertion takes place on debug build here:
(gdb) bt sysdeps/ unix/sysv/ linux/raise. c:56 assertion@ entry=0x10886c9 "is_binlog_ empty() ", file=file@ entry=0x1088548 "./sql/binlog.cc", entry=433, function@ entry=0x108cdc0 <binlog_ cache_data: :reset( )::PRETTY_ FUNCTION> "virtual void binlog_ cache_data: :reset( )") at assert.c:92 0x10886c9 "is_binlog_ empty() ", cache_data: :reset( )::PRETTY_ FUNCTION> "virtual void binlog_ cache_data: :reset( )") at assert.c:101 cache_data: :reset (this=0x7fff980 24fe0) at ./sql/binlog.cc:433 trx_cache_ data::reset (this=0x7fff980 24fe0) at ./sql/binlog.cc:648 cache_data: :flush (this=0x7fff980 24fe0, thd=0x20adb10, bytes_written= 0x7fffdc0d14b8, wrote_xid= 0x7fffdc0d14eb) cc:1270 cache_mngr: :flush (this=0x7fff980 24e70, thd=0x20adb10, bytes_written= 0x7fffdc0d14f0, wrote_xid= 0x7fffdc0d14eb) LOG::flush_ thread_ caches (this=0x1859dc0 <mysql_bin_log>, thd=0x20adb10) cc:6779 LOG::process_ flush_stage_ queue (this=0x1859dc0 <mysql_bin_log>, total_bytes_ var=0x7fffdc0d1 5f0, var=0x7fffdc0d1 5de, out_queue_ var=0x7fffdc0d1 5f8) at ./sql/binlog. cc:6835 5700) at pthread_ create. c:312 unix/sysv/ linux/x86_ 64/clone. S:111
#0 0x00007ffff67c6cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/
#1 0x00007ffff67ca0d8 in __GI_abort () at abort.c:89
#2 0x00007ffff67bfb86 in __assert_fail_base (fmt=0x7ffff6910830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=
line=line@
function=
#3 0x00007ffff67bfc32 in GI_assert_fail (assertion=
file=0x1088548 "./sql/binlog.cc", line=433,
function=0x108cdc0 <binlog_
#4 0x0000000000a47967 in binlog_
#5 0x0000000000a47d2c in binlog_
#6 0x0000000000a32a7e in binlog_
at ./sql/binlog.
#7 0x0000000000a4827a in binlog_
at ./sql/binlog.cc:778
#8 0x0000000000a40b63 in MYSQL_BIN_
at ./sql/binlog.
#9 0x0000000000a40cfc in MYSQL_BIN_
rotate_
#21 0x00007ffff737d182 in start_thread (arg=0x7fffdc0d
#22 0x00007ffff688a47d in clone () at ../sysdeps/
binlog_ cache_data: :is_binlog_ empty() returns false because binlog_ cache_data: :cache_ log.pos_ in_file have some suspicious value (10344644715844 964239) .
2) binlog_ cache_data: :cache_ log.pos_ in_file is set from here:
Old value = 368 64239 4db48, type=WRITE_CACHE, seek_offset= 103446447158449 64239, use_async_io=0 '\000', clear_cache=0 '\000') mf_iocache. c:387 not_done= 1; 4db48, type=WRITE_CACHE, seek_offset= 103446447158449 64239, use_async_io=0 '\000', clear_cache=0 '\000') mf_iocache. c:387 cache_data: :truncate (this=0x7fff940 4db40, pos=10344644715 844964239) trx_cache_ data::restore_ savepoint (this=0x7fff940 4db40, pos=10344644715 844964239) savepoint_ rollback (hton=0x188ca40, thd=0x20b38c0, sv=0x7fff94007040) cc:1900 to_savepoint (thd=0x20b38c0, sv=0x7fff94007010) cc:2132 to_savepoint (thd=0x20b38c0, name=...) at ./...
New value = 103446447158449
reinit_io_cache (info=0x7fff940
at ./mysys/
387 info->seek_
(gdb) bt
#0 reinit_io_cache (info=0x7fff940
at ./mysys/
#1 0x0000000000a47aaa in binlog_
at ./sql/binlog.cc:486
#2 0x0000000000a48011 in binlog_
at ./sql/binlog.cc:691
#3 0x0000000000a34368 in binlog_
at ./sql/binlog.
#4 0x00000000006391c7 in ha_rollback_
at ./sql/handler.
#5 0x00000000008aac33 in trans_rollback_