(M)aria recovery failure: ma_key_recover.c:981: _ma_apply_redo_index: Assertion `check_page_length == page_length' failed.

Bug #613418 reported by Philip Stoev
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
High
Michael Widenius

Bug Description

When recovering from the rqg_maria_varchar test, which uses random DML statements with lengthy strings and multibyte character sets, (M)aria failed as follows:

# 2010-08-04T13:18:55 recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%mysqld: ma_key_recover.c:981: _ma_apply_redo_index: Assertion `check_page_length == page_length' failed.
# 2010-08-04T13:18:55 100804 13:18:55 [ERROR] mysqld got signal 6 ;
# 2010-08-04T13:18:55 Recovery has apparently crashed.
# 2010-08-04T13:18:55 This could be because you hit a bug. It is also possible that this binary
# 2010-08-04T13:18:55 or one of the libraries it was linked against is corrupt, improperly built,
# 2010-08-04T13:18:55 Log message 'or one of the libraries it was linked against is corrupt, improperly built,' indicates database corruption
# 2010-08-04T13:18:55 or misconfigured. This error can also be caused by malfunctioning hardware.
# 2010-08-04T13:18:55 We will try our best to scrape up some info that will hopefully help diagnose
# 2010-08-04T13:18:55 the problem, but since we have already crashed, something is definitely wrong
# 2010-08-04T13:18:55 and this may fail.
# 2010-08-04T13:18:55
# 2010-08-04T13:18:55 key_buffer_size=0
# 2010-08-04T13:18:55 read_buffer_size=131072
# 2010-08-04T13:18:55 max_used_connections=0
# 2010-08-04T13:18:55 max_threads=153
# 2010-08-04T13:18:55 threads_connected=0
# 2010-08-04T13:18:55 It is possible that mysqld could use up to
# 2010-08-04T13:18:55 key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 334854 K
# 2010-08-04T13:18:55 bytes of memory
# 2010-08-04T13:18:55 Hope that's ok; if not, decrease some variables in the equation.
# 2010-08-04T13:18:55
# 2010-08-04T13:18:55 thd: 0x0
# 2010-08-04T13:18:55 Attempting backtrace. You can use the following information to find out
# 2010-08-04T13:18:55 where mysqld died. If you see no messages after this, something went
# 2010-08-04T13:18:55 terribly wrong...
# 2010-08-04T13:18:55 stack_bottom = (nil) thread_stack 0x48000
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld(my_print_stacktrace+0x35) [0xb9375f]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld(handle_segfault+0x2ee) [0x6716bb]
# 2010-08-04T13:18:55 Recovery has apparently crashed.
# 2010-08-04T13:18:55 /lib64/libpthread.so.0() [0x307ba0f440]
# 2010-08-04T13:18:55 /lib64/libc.so.6(gsignal+0x35) [0x307b6329c5]
# 2010-08-04T13:18:55 /lib64/libc.so.6(abort+0x175) [0x307b6341a5]
# 2010-08-04T13:18:55 /lib64/libc.so.6(__assert_fail+0xf5) [0x307b62b945]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld(_ma_apply_redo_index+0xaf8) [0x982fe7]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld() [0x9ee6de]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld() [0x9ec155]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld() [0x9f0e38]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld(maria_apply_log+0x234) [0x9ebb15]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld(maria_recovery_from_log+0x10e) [0x9eb868]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld() [0x96ad38]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld(ha_initialize_handlerton(st_plugin_int*)+0xb5) [0x7c3bbc]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld() [0x8920a7]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld(plugin_init(int*, char**, int)+0x628) [0x89294e]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld() [0x6740a9]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld(main+0x259) [0x674bd5]
# 2010-08-04T13:18:55 /lib64/libc.so.6(__libc_start_main+0xfd) [0x307b61ec5d]
# 2010-08-04T13:18:55 /home/buildbot/rqg-buildbot-slave/rqg-perpush-mariaengine/build/mysql-test//../sql/mysqld() [0x57dcb9]

The core, binary and such will be uploaded shortly.

Tags: rqg aria recovery
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

RQG command line:

# 2010-08-04T13:13:06 # /home/buildbot/randgen/runall.pl \
# 2010-08-04T13:13:06 # --basedir=. \
# 2010-08-04T13:13:06 # --vardir=../../vardir-maria_tiny_inserts \
# 2010-08-04T13:13:06 # --grammar=/home/buildbot/randgen/conf/engines/varchar.yy \
# 2010-08-04T13:13:06 # --gendata=/home/buildbot/randgen/conf/engines/varchar.zz \
# 2010-08-04T13:13:06 # --reporter=Recovery,Shutdown \
# 2010-08-04T13:13:06 # --duration=240 \
# 2010-08-04T13:13:06 # --queries=1M \
# 2010-08-04T13:13:06 # --engine=Maria \
# 2010-08-04T13:13:06 # --mysqld=--maria-checkpoint-interval=0 \
# 2010-08-04T13:13:06 # --mysqld=--log-output=file \
# 2010-08-04T13:13:06 # --seed=time \
# 2010-08-04T13:13:06 # --mysqld=--loose-skip-innodb \
# 2010-08-04T13:13:06 # --mysqld=--loose-pbxt=OFF

Changed in maria:
importance: Undecided → High
assignee: nobody → Michael Widenius (monty)
milestone: none → 5.2
milestone: 5.2 → none
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

bzr version-info:

revision-id: <email address hidden>
date: 2010-08-04 12:43:51 +0300
build-date: 2010-08-04 15:27:53 +0300
revno: 2836
branch-nick: maria-5.2

core, binary, vardir, etc.

http://pstoev-askmonty.s3.amazonaws.com/var-bug613418.zip

Michael Widenius (monty)
Changed in maria:
milestone: none → 5.1
Revision history for this message
Michael Widenius (monty) wrote :

Fix committed in Maria 5.1

Changed in maria:
status: New → Fix Committed
Michael Widenius (monty)
Changed in maria:
status: Fix Committed → Fix Released
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

This is still repeatable even after the latest patch.

Changed in maria:
status: Fix Released → Confirmed
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

RQG command line:

# 2010-09-07T00:41:20 # /home/buildbot/randgen/runall.pl \
# 2010-09-07T00:41:20 # --queries=1M \
# 2010-09-07T00:41:20 # --engine=Maria \
# 2010-09-07T00:41:20 # --mysqld=--log-output=file \
# 2010-09-07T00:41:20 # --reporters=ErrorLog,Backtrace,Recovery,Shutdown \
# 2010-09-07T00:41:20 # --duration=240 \
# 2010-09-07T00:41:20 # --threads=20 \
# 2010-09-07T00:41:20 # --rows=1 \
# 2010-09-07T00:41:20 # --mask-level=0 \
# 2010-09-07T00:41:20 # --mysqld=--maria-repair-threads=2 \
# 2010-09-07T00:41:20 # --mysqld=--loose-maria-group-commit=soft \
# 2010-09-07T00:41:20 # --mysqld=--loose-maria_group_commit_interval=0 \
# 2010-09-07T00:41:20 # --mysqld=--maria-checkpoint-interval=0 \
# 2010-09-07T00:41:20 # --mysqld=--maria-block-size=16K \
# 2010-09-07T00:41:20 # --grammar=conf/engines/varchar.yy \
# 2010-09-07T00:41:20 # --gendata=conf/engines/varchar.zz \
# 2010-09-07T00:41:20 # --mask=51717 \
# 2010-09-07T00:41:20 # --queries=100000000 \
# 2010-09-07T00:41:20 # --mask=51717 \
# 2010-09-07T00:41:20 # --basedir=. \
# 2010-09-07T00:41:20 # --seed=1283809284 \
# 2010-09-07T00:41:20 # --vardir=/home/buildbot/rqg-buildbot-slave/rqg-weekend/vardir-maria_recovery_weekend/current

Revision history for this message
Michael Widenius (monty) wrote :

Tried to repeat by running the above test for 10 hours, but no success.
Could not repeat the problem with the log file either.

Changed in maria:
status: Confirmed → Incomplete
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

This is still repeatable with

revision-id: <email address hidden>
date: 2010-09-07 19:58:39 +0300
build-date: 2010-09-09 10:42:24 +0300
revno: 2924
branch-nick: 5.1

maria_read_log reports:

The transaction log starts from lsn (1,0x2007)
TRACE of the last maria_read_log
/home/buildbot/rqg-buildbot-slave/rqg-weekend/build/mysql-test//../storage/maria/maria_read_log: Maria engine: starting recovery
recovered pages: 0%maria_read_log: ma_key_recover.c:1062: _ma_apply_redo_index: Assertion `check_page_length == page_length' failed.

Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

Latest RQG command line that produced the problem

# 2010-09-09T07:46:51 # /home/buildbot/randgen/runall.pl \
# 2010-09-09T07:46:51 # --queries=1M \
# 2010-09-09T07:46:51 # --engine=Maria \
# 2010-09-09T07:46:51 # --mysqld=--log-output=file \
# 2010-09-09T07:46:51 # --mysqld=--maria_log_purge_type=at_flush \
# 2010-09-09T07:46:51 # --reporters=ErrorLog,Backtrace,Recovery,Shutdown \
# 2010-09-09T07:46:51 # --duration=30 \
# 2010-09-09T07:46:51 # --threads=1 \
# 2010-09-09T07:46:51 # --rows=1 \
# 2010-09-09T07:46:51 # --mask-level=2 \
# 2010-09-09T07:46:51 # --mysqld=--loose-maria-group-commit=soft \
# 2010-09-09T07:46:51 # --mysqld=--loose-maria_group_commit_interval=100 \
# 2010-09-09T07:46:51 # --mysqld=--maria-checkpoint-interval=120 \
# 2010-09-09T07:46:51 # --mysqld=--maria-block-size=16K \
# 2010-09-09T07:46:51 # --grammar=conf/engines/varchar.yy \
# 2010-09-09T07:46:51 # --gendata=conf/engines/varchar.zz \
# 2010-09-09T07:46:51 # --mask=62443 \
# 2010-09-09T07:46:51 # --queries=100000000 \
# 2010-09-09T07:46:51 # --mask=62443 \
# 2010-09-09T07:46:51 # --basedir=. \
# 2010-09-09T07:46:51 # --seed=time \
# 2010-09-09T07:46:51 # --vardir=/home/buildbot/rqg-buildbot-slave/rqg-weekend/vardir-maria_recovery_weekend/current

Changed in maria:
status: Incomplete → Opinion
status: Opinion → Confirmed
Revision history for this message
Michael Widenius (monty) wrote :

I run the test case but it failed with failures from maria_read_log like:

WARNING: Aria engine currently logs no records about insertion of data by ALTER TABLE and CREATE SELECT, as they are not necessary for recovery; present applying of log records to table './test/table1_maria_varchar_1024_not_null.MAI' may well not work

There is one cases when maria_read_log can't recreate all data. This is when one is doing inserting into the table with a bulk insert and then creating the index with repair. This happens in the following cases:
- Insert many values into an empty table
- INSERT ... SELECT into an empty table
- ALTER TABLE

From recovery point of few things are safe as the recovery log will log an undo entry that will restore the table to it's empty state if the bulk insert failed and if it succeeded the undo entry will be ignored.

Running maria_read_log on crashed data should work as everything before the checkpoint will be ignored and on bulk insert we have a check that the table is up to date.

One can force Aria to never do bulk insert, in which case maria_read_log should work. This is done by starting mysqld with the --safe-mode option.

With the --safe-mode option I was not able to repeat the problem.

I downloaded var-bug613418-2.zip, but maria_read_log -a -s worked perfectly in the master-data directory.
I then copied the maria_* and test/* from master-data to a MariaDB 5.1 installation and start MariaDB; The recovery worked without any problems.

How can I repeat the problem with the test file ?
What was the error message you got?

Michael Widenius (monty)
Changed in maria:
status: Confirmed → Incomplete
Revision history for this message
Michael Widenius (monty) wrote :

Closed as we haven't seen this issue in over a month (probably fixed)

Changed in maria:
status: Incomplete → Fix Committed
Michael Widenius (monty)
Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers