mysqld got signal 11 , segfault

Bug #514157 reported by alecm3 on 2010-01-29
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona patches
Invalid
Undecided
Unassigned

Bug Description

When setting up a new slave (copied InnoDB .frm .ibd ib_logfile0 ib_logfile1, etc from
another slave), and starting new slave, mysql crashes like this:

InnoDB: The InnoDB memory heap is disabled
InnoDB: use atomic builtins.
InnoDB: Log scan progressed past the checkpoint lsn 345 3140792695
100128 21:31:33 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 345 3146035200
InnoDB: Doing recovery: scanned up to log sequence number 345 3151278080
InnoDB: Doing recovery: scanned up to log sequence number 345 3156520960
InnoDB: Doing recovery: scanned up to log sequence number 345 3161763840
InnoDB: Doing recovery: scanned up to log sequence number 345 3167006720
InnoDB: Doing recovery: scanned up to log sequence number 345 3172249600
InnoDB: Doing recovery: scanned up to log sequence number 345 3177492480
InnoDB: Doing recovery: scanned up to log sequence number 345 3182735360
InnoDB: Doing recovery: scanned up to log sequence number 345 3187978240
InnoDB: Doing recovery: scanned up to log sequence number 345 3193221120
InnoDB: Doing recovery: scanned up to log sequence number 345 3198464000
InnoDB: Doing recovery: scanned up to log sequence number 345 3203706880
InnoDB: Doing recovery: scanned up to log sequence number 345 3203925571
100128 21:31:37 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 100128 21:31:38 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_connections=1024
threads_connected=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 8519680 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x40000, backtrace may not be correct.
Bogus stack limit or frame pointer, fp=0x40000, stack_bottom=0x2afd969e0000,
thread_stack=262144, aborting backtrace.

----
I am using 5.0.87 highperf build 20, on Ubuntu 9.10-server (2.6.31) that I had to patch manually since it would not otherwise compile on this Ubuntu:
--- ha_innodb.orig
+++ ha_innodb.cc
@@ -6786,7 +6786,7 @@
     table->field[10]->store(index->name, strlen(index->name), system_charset_info);
     // fprintf(stderr, " index %s, table %s", index->name, index->table_name);

- if((p = strchr(index->table_name, '/')))
+ if((p = strchr((char *) index->table_name, '/')))
     {
      strncpy(db_name_raw, index->table_name, p-index->table_name);
      db_name_raw[p-index->table_name] = 0;

This slave is identical to 4 other slaves that have been successfully configured this
exact way, EXCEPT that the other slaves were on Ubuntu 8.04-server (2.6.24), and this one
is on Ubuntu 9.10-server ( 2.6.31 )

Some additional info:

root@db6:/log6# free
             total used free shared buffers cached
Mem: 33006912 12679028 20327884 0 264644 11361056
-/+ buffers/cache: 1053328 31953584
Swap: 4883752 0 4883752

root@db6:/log6# cat /proc/cpuinfo
....
processor : 15
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz

identical my.cnf to other slaves that work

strace:

.....

24023 futex(0x3770220, FUTEX_WAKE_PRIVATE, 1) = 0
24023 futex(0x377025c, FUTEX_WAIT_PRIVATE, 1603, NULL <unfinished ...>
24033 <... pread resumed> "\221t\1\30\0\0Gf\0\0\r\200\0\0\22\264\0\0\1T`}
`E\277\0\0\0\0\0\0"..., 16384, 299466752) = 16384
24033 futex(0x377025c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3770220, 1604) = 1
24033 sched_yield( <unfinished ...>
24023 <... futex resumed> ) = 0
24023 futex(0x3770220, FUTEX_WAKE_PRIVATE, 1) = 0
24023 futex(0x377025c, FUTEX_WAIT_PRIVATE, 1605, NULL <unfinished ...>
24032 <... pread resumed>
"\226\337)u\0\0@\1\0\0\0\0\0\0\0\0\0\0\1Y\272*\244\353\0\0\0\0\0\0\0\0"..., 16384,
268451840) = 16384
24032 pread(20, <unfinished ...>
24033 <... sched_yield resumed> ) = 0
24033 futex(0x1f2bd78c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
24034 <... pread resumed>
")\6Ai\0\0Gi\0\0003?\0\0<\362\0\0\1Yg8.\333E\277\0\0\0\0\0\0"..., 16384, 299515904) =
16384
24034 futex(0x377025c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3770220, 1606) = 1
24034 sched_yield( <unfinished ...>
24023 <... futex resumed> ) = 0
24023 futex(0x3770220, FUTEX_WAKE_PRIVATE, 1) = 0
24023 futex(0x377025c, FUTEX_WAIT_PRIVATE, 1607, NULL <unfinished ...>
24034 <... sched_yield resumed> ) = 0
24034 futex(0x1f2bd78c, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
24029 <... pread resumed>
")-3\r\0\0Gt\0\0\6\306\0\0\31\245\0\0\1Yk\212\246\324E\277\0\0\0\0\0\0"..., 16384,
299696128) = 16384
24029 futex(0x377025c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3770220, 1608) = 1
24029 sched_yield( <unfinished ...>
24023 <... futex resumed> ) = 0
24023 futex(0x3770220, FUTEX_WAKE_PRIVATE, 1) = 0
24023 futex(0x377025c, FUTEX_WAIT_PRIVATE, 1609, NULL <unfinished ...>
24028 <... pread resumed>
"\263^\240M\0\0@\244\0\0#\356\0\0\1$\0\0\1Y\2715\354\357E\277\0\0\0\0\0\0"..., 16384,
271122432) = 16384
24028 futex(0x377025c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3770220, 1610) = 1
24028 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
24028 write(2, "InnoDB: Log scan progressed past"..., 2634) = 2634
24028 exit_group(1) = ?

another attempt:

.....

24084 futex(0x318225c, FUTEX_WAIT_PRIVATE, 1677, NULL <unfinished ...>
24095 <... pread resumed> "\221t\1\30\0\0Gf\0\0\r\200\0\0\22\264\0\0\1T`}
`E\277\0\0\0\0\0\0"..., 16384, 299466752) = 16384
24095 futex(0x318225c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3182220, 1678) = 1
24095 sched_yield( <unfinished ...>
24091 <... pread resumed>
"\226\337)u\0\0@\1\0\0\0\0\0\0\0\0\0\0\1Y\272*\244\353\0\0\0\0\0\0\0\0"..., 16384,
268451840) = 16384
24091 pread(20, <unfinished ...>
24084 <... futex resumed> ) = 0
24084 futex(0x3182220, FUTEX_WAKE_PRIVATE, 1) = 0
24084 futex(0x318225c, FUTEX_WAIT_PRIVATE, 1679, NULL <unfinished ...>
24095 <... sched_yield resumed> ) = 0
24095 futex(0x1eccfc0c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
24090 <... pread resumed>
")\6Ai\0\0Gi\0\0003?\0\0<\362\0\0\1Yg8.\333E\277\0\0\0\0\0\0"..., 16384, 299515904) =
16384
24090 futex(0x318225c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3182220, 1680) = 1
24090 sched_yield( <unfinished ...>
24084 <... futex resumed> ) = 0
24084 futex(0x3182220, FUTEX_WAKE_PRIVATE, 1) = 0
24084 futex(0x318225c, FUTEX_WAIT_PRIVATE, 1681, NULL <unfinished ...>
24090 <... sched_yield resumed> ) = 0
24090 futex(0x1eccfc0c, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
24093 <... pread resumed>
"\263^\240M\0\0@\244\0\0#\356\0\0\1$\0\0\1Y\2715\354\357E\277\0\0\0\0\0\0"..., 16384,
271122432) = 16384
24093 futex(0x318225c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3182220, 1682) = 1
24093 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
24093 write(2, "InnoDB: Log scan progressed past"..., 2634) = 2634
24093 exit_group(1)

Without copying InnodDB tables from another slave, mysql starts normally. InnoDB tables
can then be created from SQL statements.

alecm3 (alecm-chatango) wrote :

This is not percona-specific: I reproduced it on 5.0.90 mysql source without percona patches, using different data from different slaves. It has to do with Ubuntu 9.10

I think the problem is...
"(copied InnoDB .frm .ibd ib_logfile0 ib_logfile1, etc from another slave)"

simple copy is not consistent at all. you should use consistent backuped files for that...

Recovery process of InnoDB is not made to recover inconsistent copied files.
It is for recovery from crushed only.

Download full text (9.8 KiB)

> simple copy is not consistent at all. you should use consistent
> backuped files for that...
>
> Recovery process of InnoDB is not made to recover inconsistent copied files.
> It is for recovery from crushed only.

I have two remarks here:
1) We set up a number of slaves this exact way (although not on this particular set of data), they all went through the recovery process successfully.
This method is described for example in http://addumb.com/blog/2009/07/04/mysql-slave-initialization-dos-and-donts/ (method 3)

2) As a user, I would still expect a message like "Error: 1033, Incorrect information in file: ‘./database/table.frm", instead of segfault and silent crash, even if the slave cloning process that we use is incorrect.

> -----Original Message-----
> From: <email address hidden> [mailto:<email address hidden>] On Behalf
> Of Yasufumi Kinoshita
> Sent: Sunday, January 31, 2010 11:58 PM
> To: <email address hidden>
> Subject: [Bug 514157] Re: mysqld got signal 11 , segfault
>
> I think the problem is...
> "(copied InnoDB .frm .ibd ib_logfile0 ib_logfile1, etc from another slave)"
>
> simple copy is not consistent at all. you should use consistent
> backuped files for that...
>
> Recovery process of InnoDB is not made to recover inconsistent copied files.
> It is for recovery from crushed only.
>
> --
> mysqld got signal 11 , segfault
> https://bugs.launchpad.net/bugs/514157
> You received this bug notification because you are a direct subscriber
> of the bug.
>
> Status in Patches for MySQL by Percona: New
>
> Bug description:
> When setting up a new slave (copied InnoDB .frm .ibd ib_logfile0 ib_logfile1,
> etc from
> another slave), and starting new slave, mysql crashes like this:
>
> InnoDB: The InnoDB memory heap is disabled
> InnoDB: use atomic builtins.
> InnoDB: Log scan progressed past the checkpoint lsn 345 3140792695
> 100128 21:31:33 InnoDB: Database was not shut down normally!
> InnoDB: Starting crash recovery.
> InnoDB: Reading tablespace information from the .ibd files...
> InnoDB: Restoring possible half-written data pages from the doublewrite
> InnoDB: buffer...
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3146035200
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3151278080
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3156520960
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3161763840
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3167006720
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3172249600
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3177492480
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3182735360
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3187978240
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3193221120
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3198464000
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3203706880
> InnoDB: Doing recovery: scanned up to log sequence number 345
> 3203925571
> 100128 21:31:37 InnoDB: Starting an apply batch of log records to the...

> This method is described for example in
> http://addumb.com/blog/2009/07/04/mysql-slave-initialization-dos-and-donts/ (method 3)

Do you mean
"Execute “stop slave;” on the slave and tar+scp or rsync the files to the new master." ?
This is wrong for InnoDB.
If you use simple cp or tar, you should shutdown the InnoDB.

Changed in percona-patches:
status: New → Invalid
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers