Query "stop slave" hangs
Bug #906323 reported by Dreas van Donselaar
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MariaDB | Confirmed | Undecided | Kristian Nielsen |
Bug Description
On several of our slave servers, we had problems with replication. It appears this started with hardware issues (power failures) on our central MariaDB master. Trying to stop replication with "mysql -e 'stop slave'" resulted in the query simply hanging. MySQL would no longer respond to any queries relating to replication (e.g. show slave status). The only way to fix this appeared to be to remove the /var/lib/
tags: added: replication upstream
Based on the data uploaded to FTP separately, I think it's a manifestation of the bug http://bugs.mysql.com/bug.php?id=45940 (part 4) and its duplicate http://bugs.mysql.com/bug.php?id=53985, which describe exactly the same problem: a hanging STOP SLAVE.
The analysis in the latter bug says that it happens when the SQL thread is being stopped in the middle of a transaction while the IO thread has already exited, and that it relates to the situation where a mix of transactional and non-transactional engines is involved (so rolling back the started group is not safe).
In our case, we have all the same elements, just due to different reasons.
According to the slave error log, on server start the IO thread exited immediately due to ER_MASTER_FATAL_ERROR_READING_BINLOG. The previous HW problem on the master can account for that.
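As a rough illustration of that starting state, the check below looks at a SHOW SLAVE STATUS row that has already been fetched into a dictionary. It is only a sketch: the helper name and the example row are hypothetical, and it assumes the server exposes the Slave_IO_Running, Slave_SQL_Running and Last_IO_Errno columns. Error number 1236 is the code for ER_MASTER_FATAL_ERROR_READING_BINLOG.

```python
# Error number for ER_MASTER_FATAL_ERROR_READING_BINLOG.
ER_MASTER_FATAL_ERROR_READING_BINLOG = 1236


def io_thread_died_reading_binlog(status):
    """Return True if the row shows the IO thread stopped with error 1236
    while the SQL thread is still running.

    `status` is a dict built from one SHOW SLAVE STATUS row (hypothetical
    input; fetching the row from the server is left out of this sketch).
    """
    io_stopped = status.get("Slave_IO_Running") == "No"
    sql_running = status.get("Slave_SQL_Running") == "Yes"
    last_io_errno = int(status.get("Last_IO_Errno") or 0)
    return (
        io_stopped
        and sql_running
        and last_io_errno == ER_MASTER_FATAL_ERROR_READING_BINLOG
    )


# Made-up row resembling the state described in the error log.
example_row = {
    "Slave_IO_Running": "No",
    "Slave_SQL_Running": "Yes",
    "Last_IO_Errno": "1236",
}
print(io_thread_died_reading_binlog(example_row))
```

A positive result only says the preconditions are present; it does not by itself prove that STOP SLAVE will hang.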
The SQL thread started, but its position pointed at the beginning of an unfinished transaction (group). So, as the bugs above describe, it finished executing what it had and started waiting for the rest, which the IO thread of course could not provide. The error log does not even show any sign of the SQL thread attempting to exit when it presumably should have received the STOP command.
As for the mix of transactional and non-transactional engines, in our case we instead have different table engines on master and slave. The transaction itself apparently consisted of only two DML statements (the first was written to the binlog, the second and the COMMIT weren't), so there was no mix. But the slave table is Aria, while the master table is most likely InnoDB (judging by the look of the binary log). So, since the binary log is transactional, the SQL thread treats the group as a transaction, but it also raises the flag 'modified_non_transactional_table'.
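The reasoning above can be condensed into a small toy model. This is not the server's code: the class, field names and outcome strings are made up for illustration, and the branches only restate the analysis in this comment and in the upstream bugs.

```python
from dataclasses import dataclass


@dataclass
class SqlThreadState:
    """Hypothetical snapshot of the slave at the moment STOP SLAVE arrives."""
    in_unfinished_group: bool               # SQL thread is mid-transaction
    modified_non_transactional_table: bool  # the flag mentioned above
    io_thread_running: bool                 # can more events still arrive?


def stop_outcome(state):
    """Return a label for what the analysis predicts for STOP SLAVE."""
    if not state.in_unfinished_group:
        return "stops"
    if not state.modified_non_transactional_table:
        # Rolling back the partial group is safe.
        return "rolls back and stops"
    if state.io_thread_running:
        # The rest of the group can still be fetched and applied.
        return "finishes group, then stops"
    # Cannot roll back safely, and nothing will deliver the rest of the group.
    return "hangs"


# The case in this report: Aria table on the slave, IO thread already gone.
reported = SqlThreadState(
    in_unfinished_group=True,
    modified_non_transactional_table=True,
    io_thread_running=False,
)
print(stop_outcome(reported))
```

Only the last branch corresponds to the hang reported here; the other branches are included to show why the same STOP SLAVE is harmless in the more common situations.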
I'm assigning it to Kristian so he can confirm (or deny) this and, importantly, decide whether anything should be done about it in 5.2/5.3. The original bug was fixed in 5.5, according to the bug comments.