Binlog corruption when tmpdir gets full

Bug #1703346 reported by Przemek on 2017-07-10
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
MySQL Server
Unknown
Unknown
Percona Server moved to https://jira.percona.com/projects/PS
Status tracked in 5.7
5.5
Won't Fix
Undecided
Unassigned
5.6
Triaged
High
Unassigned
5.7
Invalid
High
Unassigned
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Status tracked in 5.6
5.6
Confirmed
Undecided
Unassigned
5.7
Invalid
Undecided
Unassigned

Bug Description

In a situation where disk partition used for tmpdir has little free space, and large enough write transaction is executed, which does not fit in binlog_cache_size, an individual write fails with error, but actual binary log gets corrupted, causing the slave to fail.

Reproduced for InnoDB tables with compound primary key, with standard replication as well as for PXC.

Example error messages when tmpdir gets full (ML* of MY* files gets created, apparently depending on whether total transaction file can't be increased (ML) or individual update query (MY)):

master [localhost] {msandbox} (test) > update test1 set a=a+1,b=b+1,c=c+2 limit 110000;
ERROR 3 (HY000): Error writing file '/home/przemek/sandboxes/rsandbox_mysql-5_6_36/tmp/MLFgYjKP' (Errcode: 28 - No space left on device)
master [localhost] {msandbox} (test) > update test1 set a=a+1,b=b+1,c=c+2 limit 210000;
ERROR 3 (HY000): Error writing file '/home/przemek/sandboxes/rsandbox_mysql-5_6_36/tmp/MYUTuIOS' (Errcode: 28 - No space left on device)

Example error on the slave side:

                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'binlog truncated in the middle of event; consider out of disk space on master; the first event 'mysql-bin.000008' at 12574553, the last event read from './mysql-bin.000009' at 10668433, the last byte read from './mysql-bin.000009' at 10668452.'
               Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Update_rows event on table test.u1; Can't find record in 'u1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000009, end_log_pos 10668433

Example error on PXC slave node:

2017-07-07 18:50:39 8665 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 0, event_type: 0
2017-07-07 18:50:39 8665 [ERROR] WSREP: applier could not read binlog event, seqno: 378308, len: 2680586
2017-07-07 18:50:47 8665 [Warning] WSREP: Failed to apply app buffer: seqno: 378308, status: 1
         at galera/src/trx_handle.cpp:apply():351
Retrying 2th time

Example output from mysqlbinlog tool:

ERROR: Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 2180319, event_type: 0
ERROR: Could not read entry at offset 7233086: Error in log format or read error.
WARNING: The range of printed events ends with a row event or a table map event that does not have the STMT_END_F flag set. This might be because the last statement was not fully written to the log, or because you are using a --stop-position or --stop-datetime that refers to an event in the middle of a statement. The event(s) from the partial statement have not been written to output.

Test case to be found in upstream bug report: https://bugs.mysql.com/bug.php?id=86991

tags: added: upstream

Why 5.7 has been set to Invalid? https://bugs.mysql.com/bug.php?id=72457 says "[8 May 20:13] Jay Edgar
I was able to reproduce this on both 5.6.35 and 5.7.18."

Sveta Smirnova (svetasmirnova) wrote :

Because I was not able to reproduce this bug with 5.7 back in time. But looks like it depends from internal_tmp_disk_storage_engine and default_tmp_storage_engine settings. If they set to MyISAM I either get:

mysql> insert into t1 select * from t1;
ERROR 3 (HY000): Error writing file '/home/sveta/build/ps-5.7/mysql-test/bug64849_data/MLKUpL2m' (Errcode: 28 - No space left on device)

If binlog_error_action=IGNORE_ERROR or

mysql> insert into t1 select * from t1;
ERROR 1598 (HY000): Binary logging not possible. Message: An error occurred during flush stage of the commit. 'binlog_error_action' is set to 'ABORT_SERVER'. Hence aborting the server.

If binlog_error_action=ABORT_SERVER

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1994

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1107

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.