innodb plugin - Corrupted rollback segment

Bug #484639 reported by Seppo Jaakola
Affects: MySQL patches by Codership
Status: Fix Released
Importance: High
Assigned to: Seppo Jaakola

Bug Description

Running the ultimate conflict rate test against a two-node cluster configured to use the InnoDB plugin results in a segfault.
The test load was generated with sqlgen:

~/sqlgen --user=root --password=rootpass --port=3306 --host=abyssinian --host=bengal --create=0 --users=10 --duration=60

Only one node segfaults. Here is the stack trace from the gdb session:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f4ddb49e950 (LWP 24174)]
0x00007f4e00babfb0 in mem_heap_create_block (heap=0x7f4dcc08fed0,
    n=18446744073709542546, type=0, file_name=0x7f4dcc08fed8 "0vers.c",
    line=528) at mem/mem0mem.c:357
357 mem/mem0mem.c: No such file or directory.
 in mem/mem0mem.c
Current language: auto; currently c
(gdb) bt
#0 0x00007f4e00babfb0 in mem_heap_create_block (heap=0x7f4dcc08fed0, n=18446744073709542546, type=0, file_name=0x7f4dcc08fed8 "0vers.c", line=528) at mem/mem0mem.c:357
#1 0x00007f4e00bac0fd in mem_heap_add_block (heap=0x7f4dcc08fed0, n=18446744073709542546) at mem/mem0mem.c:432
#2 0x00007f4e00bab017 in mem_heap_alloc (heap=0x7f4dcc08fed0, n=18446744073709542546) at ./include/mem0mem.ic:186
#3 0x00007f4e00bab9aa in mem_heap_dup (heap=0x7f4dcc08fed0, data=0x7f4de6b6236e, len=18446744073709542546) at mem/mem0mem.c:125
#4 0x00007f4e00bff36b in trx_undo_rec_copy (undo_rec=0x7f4de6b6236e "", heap=0x7f4dcc08fed0) at ./include/trx0rec.ic:110
#5 0x00007f4e00c015d7 in trx_undo_get_undo_rec_low (roll_ptr= {high = 0, low = 14164846}, heap=0x7f4dcc08fed0) at trx/trx0rec.c:1350
#6 0x00007f4e00c01654 in trx_undo_get_undo_rec (roll_ptr= {high = 0, low = 14164846}, trx_id={high = 0, low = 264366},
    undo_rec=0x7f4ddb49a658, heap=0x7f4dcc08fed0) at trx/trx0rec.c:1388
#7 0x00007f4e00c0187d in trx_undo_prev_version_build ( index_rec=0x7f4de6b8807e "\200", index_mtr=0x7f4ddb49ac20,
    rec=0x7f4dcc042056 "\200", index=0x3ea09d0, offsets=0x7f4ddb49a830, heap=0x7f4dcc08fed0, old_vers=0x7f4ddb49a750) at trx/trx0rec.c:1473
#8 0x00007f4e00bf175d in row_vers_build_for_consistent_read ( rec=0x7f4de6b8807e "\200", mtr=0x7f4ddb49ac20, index=0x3ea09d0, offsets=0x7f4ddb49ab78, view=0x4558720, offset_heap=0x7f4ddb49ab80,
    in_heap=0x44f9b00, old_vers=0x7f4ddb49ab50) at row/row0vers.c:559
#9 0x00007f4e00be81e2 in row_sel_build_prev_vers_for_mysql ( read_view=0x4558720, clust_index=0x3ea09d0, prebuilt=0x4504b80, rec=0x7f4de6b8807e "\200", offsets=0x7f4ddb49ab78, offset_heap=0x7f4ddb49ab80, old_vers=0x7f4ddb49ab50, mtr=0x7f4ddb49ac20) at row/row0sel.c:2823
#10 0x00007f4e00bea41a in row_search_for_mysql (buf=0x44f9928 "�", mode=2, prebuilt=0x4504b80, match_mode=1, direction=0) at row/row0sel.c:4140
#11 0x00007f4e00b8f77f in ha_innodb::index_read (this=0x44f9738, buf=0x44f9928 "�", key_ptr=0x46d69a0 "", key_len=4, find_flag=HA_READ_KEY_EXACT) at handler/ha_innodb.cc:5287
#12 0x00000000007889be in handler::index_read_map (this=0x44f9738, buf=0x44f9928 "�", key=0x46d69a0 "", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at ../../sql/handler.h:1394
#13 0x000000000077f83b in handler::index_read_idx_map (this=0x44f9738, buf=0x44f9928 "�", index=0, key=0x46d69a0 "", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at handler.cc:4285
#14 0x00000000006be636 in join_read_const (tab=0x46d64e8) at sql_select.cc:11593
#15 0x00000000006cb191 in join_read_const_table (tab=0x46d64e8, pos=0x7f4dcc1b30f0) at sql_select.cc:11492
#16 0x00000000006d5092 in make_join_statistics (join=0x7f4dcc1b3058, tables_arg=0x46d5640, conds=0x46d6398, keyuse_array=0x7f4dcc1b4618) at sql_select.cc:2772
#17 0x00000000006d6b8f in JOIN::optimize (this=0x7f4dcc1b3058) at sql_select.cc:954
#18 0x00000000006dabeb in mysql_select (thd=0x4705ae8, rref_pointer_array=0x4707c20, tables=0x46d5640, wild_num=1, fields=@0x4707b58, conds=0x46d5af8, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2148813312, result=0x46d5ce8, unit=0x4707628, select_lex=0x4707a50) at sql_select.cc:2384
#19 0x00000000006e04a4 in handle_select (thd=0x4705ae8, lex=0x4707588, result=0x46d5ce8, setup_tables_done_option=0) at sql_select.cc:268
#20 0x0000000000645d2b in execute_sqlcom_select (thd=0x4705ae8, all_tables=0x46d5640) at sql_parse.cc:5462
#21 0x00000000006485f1 in mysql_execute_command (thd=0x4705ae8) at sql_parse.cc:2535
#22 0x00000000006534cd in mysql_parse (thd=0x4705ae8, inBuf=0x46d5458 "SELECT * FROM comm01 WHERE p = 0", length=32, found_semicolon=0x7f4ddb49dbc8) at sql_parse.cc:6395
#23 0x0000000000654549 in dispatch_command (command=COM_QUERY, thd=0x4705ae8,
    packet=0x4708519 "SELECT * FROM comm01 WHERE p = 0", packet_length=32) at sql_parse.cc:1381
#24 0x0000000000656570 in do_command (thd=0x4705ae8) at sql_parse.cc:964
#25 0x000000000063ee43 in handle_one_connection (arg=0x4705ae8) at sql_connect.cc:1165
#26 0x00007f4e038103ba in start_thread () from /lib/libpthread.so.0
#27 0x00007f4e02564fcd in clone () from /lib/libc.so.6
#28 0x0000000000000000 in ?? ()

Seppo Jaakola (seppo-jaakola) wrote :

This issue looks similar to lp:408290

That "Corrupted rollback segment" issue was fixed by the merge to MySQL 5.1.37. The diagnosis was that InnoDB undo processing had a hidden bug, which was fixed in version 5.1.37.

Changed in codership-mysql:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Seppo Jaakola (seppo-jaakola)
milestone: none → 0.8

Alex Yurchenko (ayurchen) wrote :

It is probably worth noting that 18446744073709542546 in the block allocation functions is in fact -9070 when read as a signed 64-bit integer.
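
As a quick check of that observation, here is a minimal Python sketch that reinterprets the unsigned 64-bit value from the backtrace as a two's-complement signed integer. The helper name `as_signed_64` is made up for illustration and is not part of InnoDB.

```python
def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a two's-complement signed integer."""
    value &= (1 << 64) - 1  # keep only the low 64 bits
    return value - (1 << 64) if value >= (1 << 63) else value


# The allocation size n seen in frames #0 to #3 of the backtrace.
n = 18446744073709542546
print(as_signed_64(n))  # -9070
```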

Seppo Jaakola (seppo-jaakola) wrote :

Wow,

This large number, which translates to -9070, is the length of the undo record, computed by:
 len = mach_read_from_2(undo_rec)
  - ut_align_offset(undo_rec, UNIV_PAGE_SIZE);

And the address to read from (undo_rec) is undo_page + offset.
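
The subtraction can be replayed outside the server. The sketch below models it in Python under two assumptions that are not stated in this report: that UNIV_PAGE_SIZE is the default 16 KiB, and that the two-byte field read by mach_read_from_2 was 0 (the value that reproduces the observed length, and consistent with gdb printing undo_rec as an empty string in frame #4). The Python function names mirror the C ones but are only stand-ins.

```python
UNIV_PAGE_SIZE = 16 * 1024  # assumption: default InnoDB page size
ULINT_BITS = 64             # ulint is 64 bits wide on this platform


def ut_align_offset(ptr: int, align: int) -> int:
    """Offset of ptr inside its align-sized block (align must be a power of two)."""
    return ptr & (align - 1)


def undo_rec_len(next_rec_offset: int, undo_rec_addr: int) -> int:
    """Model len = mach_read_from_2(undo_rec) - ut_align_offset(undo_rec, UNIV_PAGE_SIZE)
    with unsigned wraparound, as happens for a C unsigned long."""
    diff = next_rec_offset - ut_align_offset(undo_rec_addr, UNIV_PAGE_SIZE)
    return diff % (1 << ULINT_BITS)


undo_rec = 0x7F4DE6B6236E  # undo_rec pointer from frame #4 of the backtrace

print(ut_align_offset(undo_rec, UNIV_PAGE_SIZE))  # 9070
print(undo_rec_len(0, undo_rec))                  # 18446744073709542546
```

If the stored two-byte value is smaller than the record's own offset in the page, the unsigned subtraction wraps around, which is how a length of "-9070" ends up in mem_heap_dup.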

#5 0x00007f4e00c015d7 in trx_undo_get_undo_rec_low (roll_ptr=
      {high = 0, low = 14164846}, heap=0x7f4dcc08fed0) at trx/trx0rec.c:1350
1350 trx/trx0rec.c: No such file or directory. in trx/trx0rec.c

(gdb) p offset
$3 = 9070

What's your magic number?

Seppo Jaakola (seppo-jaakola) wrote :

Testing with UNIV_DEBUG revealed issues in the delayed conflict resolution phase. These were fixed in a series of changesets (the last being 2919), after which the above-mentioned sqlgen test passes.

Changed in codership-mysql:
status: In Progress → Fix Committed
milestone: 0.8 → 0.7
Changed in codership-mysql:
status: Fix Committed → Fix Released