Corrupted rollback segment

Bug #408290 reported by Seppo Jaakola
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Fix Released
High
Seppo Jaakola

Bug Description

InnoDB is crashing for trying to allocate too big chunk of memory:

090720 23:14:30 [Note] DEBUG: mm_galera.c:process_query_write_set():955: remote
trx seqno: 9896 9945 last_seen_trx: 9895 9896, cert: 0

090720 23:14:30 [Note] DEBUG: job_queue.c:job_queue_start_job():144: job: 0 star
ting

090720 23:14:30 [Note] DEBUG: mm_galera.c:process_query_write_set_applying():907
: GALERA ws commit for: 9896 9945

090720 23:14:30 [Note] DEBUG: job_queue.c:job_queue_end_job():187: job: 0 comple
te

090720 23:14:30 InnoDB: Error: cannot allocate 4294964960 bytes of

Stack trace shows that undo log contains corrupted information:
gdb) bt
#0 0x0850ad98 in ut_malloc_low (n=4294962096, set_to_zero=1,
    assert_on_error=1) at ut/ut0mem.c:150
#1 0x0850ae63 in ut_malloc (n=4294962096) at ut/ut0mem.c:189
#2 0x084c8b17 in mem_area_alloc (size=4294962096, pool=0xa035ce8)
    at mem/mem0pool.c:355
#3 0x084c7ed2 in mem_heap_create_block (heap=0xaa673c8, n=4294962032,
    init_block=0x0, type=0, file_name=0xaa673cc "0vers.c", line=451)
    at mem/mem0mem.c:362
#4 0x084c8129 in mem_heap_add_block (heap=0xaa673c8, n=4294962032)
    at mem/mem0mem.c:465
#5 0x084c7173 in mem_heap_alloc (heap=0xaa673c8, n=4294962032)
    at ../../storage/innobase/include/mem0mem.ic:155
#6 0x084fd4c2 in trx_undo_rec_copy (undo_rec=0xad645490 "", heap=0xaa673c8)
    at ../../storage/innobase/include/trx0rec.ic:81
#7 0x084fef95 in trx_undo_get_undo_rec_low (roll_ptr=
      {high = 0, low = 68883600}, heap=0xaa673c8) at trx/trx0rec.c:1202
#8 0x084ff000 in trx_undo_get_undo_rec (roll_ptr={high = 0, low = 68883600},
    trx_id={high = 0, low = 2250904}, undo_rec=0xa6551ccc, heap=0xaa673c8)
    at trx/trx0rec.c:1240
#9 0x084ff197 in trx_undo_prev_version_build (index_rec=0xad5ec0d4 "\200",
    index_mtr=0xa655200c, rec=0xaa49b66 "\200", index=0xb6bb3868,
    offsets=0xa6551e18, heap=0xaa673c8, old_vers=0xa6551d6c)
    at trx/trx0rec.c:1325
#10 0x084f29cd in row_vers_build_for_consistent_read (rec=0xad5ec0d4 "\200",
    mtr=0xa655200c, index=0xb6bb3868, offsets=0xa6551fac, view=0xb6bc6e68,
    offset_heap=0xa6551fb0, in_heap=0xb6bc3c28, old_vers=0xa6551fa8)
    at row/row0vers.c:482
#11 0x084ebcc7 in row_sel_build_prev_vers_for_mysql (read_view=0xb6bc6e68,
    clust_index=0xb6bb3868, prebuilt=0xb6bd5868, rec=0xad5ec0d4 "\200",
    offsets=0xa6551fac, offset_heap=0xa6551fb0, old_vers=0xa6551fa8,
    mtr=0xa655200c) at row/row0sel.c:2730
#12 0x084edcfd in row_search_for_mysql (buf=0xa6421040 "�\002", mode=2,
    prebuilt=0xb6bd5868, match_mode=1, direction=0) at row/row0sel.c:4040
#13 0x08480fbf in ha_innobase::index_read (this=0xa6420ef0,
    buf=0xa6421040 "�\002", key_ptr=0xab15860 "\002", key_len=4,
    find_flag=HA_READ_KEY_EXACT) at handler/ha_innodb.cc:4644
#14 0x08397b9e in handler::index_read_map (this=0xa6420ef0,
    buf=0xa6421040 "�\002", key=0xab15860 "\002", keypart_map=1,
    find_flag=HA_READ_KEY_EXACT) at ../../sql/handler.h:1390
#15 0x0838e89e in handler::index_read_idx_map (this=0xa6420ef0,
    buf=0xa6421040 "�\002", index=0, key=0xab15860 "\002", keypart_map=1,
    find_flag=HA_READ_KEY_EXACT) at handler.cc:4261
#16 0x082d11dd in join_read_const (tab=0xab15590) at sql_select.cc:11596
#17 0x082e4e5b in join_read_const_table (tab=0xab15590, pos=0xab13f48)
    at sql_select.cc:11495
#18 0x082e8ee7 in make_join_statistics (join=0xab13ed0, tables_arg=0xab13a88,
    conds=0xab154c0, keyuse_array=0xab14fd4) at sql_select.cc:2761
#19 0x082ec0ea in JOIN::optimize (this=0xab13ed0) at sql_select.cc:954
#20 0x082ef818 in mysql_select (thd=0xa647e400, rref_pointer_array=0xa647f940,
    tables=0xab13a88, wild_num=1, fields=@0xa647f8dc, conds=0xab13d68,
    og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0,
    select_options=2148813312, result=0xab13eb8, unit=0xa647f5d8,
    select_lex=0xa647f848) at sql_select.cc:2372
#21 0x082f503d in handle_select (thd=0xa647e400, lex=0xa647f57c,
    result=0xab13eb8, setup_tables_done_option=0) at sql_select.cc:268
#22 0x0825f5e8 in execute_sqlcom_select (thd=0xa647e400, all_tables=0xab13a88)
    at sql_parse.cc:5407
#23 0x0826035c in mysql_execute_command (thd=0xa647e400) at sql_parse.cc:2480
#24 0x0826c056 in mysql_parse (thd=0xa647e400,
    inBuf=0xab13938 "SELECT * FROM comm03 WHERE p = 2", length=32,
    found_semicolon=0xa6554034) at sql_parse.cc:6340
#25 0x0826daa6 in dispatch_command (command=COM_QUERY, thd=0xa647e400,
    packet=0xa64964b9 "SELECT * FROM comm03 WHERE p = 2", packet_length=32)
    at sql_parse.cc:1350
#26 0x0826fa75 in do_command (thd=0xa647e400) at sql_parse.cc:933
#27 0x082573b8 in handle_one_connection (arg=0xa647e400) at sql_connect.cc:1156
#28 0x00acb4d2 in start_thread () from /lib/i686/nosegneg/libpthread.so.0
#29 0x00a2148e in clone () from /lib/i686/nosegneg/libc.so.6

The crash happens with sqlgen test with ultimately high conflict rate (one table with 10 rows,
two nodes with 8 connections, R/W rate 70/30)

Related branches

Changed in codership-mysql:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Seppo Jaakola (seppo-jaakola)
Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

Analysis fronm query log show that the crashing query is select for a row, which has been modified by applier thread. (that's why select processing is looking for undo log)

The select query connection, suffered a brute force abort during previous transaction (which was rolled back). That aborted transaction had update(s), for this very same row.

Further, the query logs show that in all the crashes, there were involved transactions, which had two or more updates for the same row. Note, that total row count in the database is only 10, and sqlgen, when having long transactions, will generate update queries at random and they will quite often hit the same row.

Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

Testing with current runk version, does not bring this issue anymore.

Tested with: http://bazaar.launchpad.net/~codership/codership-mysql/dev/revision/2878

The change in the trunk is essentially just merging with MySQL 5.1.37

Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

closing this case as problem does not reproduce anymore after 5.1.37 merging

Changed in codership-mysql:
milestone: none → 0.7
status: In Progress → Fix Committed
Changed in codership-mysql:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.