Corrupted rollback segment
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MySQL patches by Codership | Fix Released | High | Seppo Jaakola |
Bug Description
InnoDB crashes while trying to allocate an excessively large chunk of memory:
090720 23:14:30 [Note] DEBUG: mm_galera.
trx seqno: 9896 9945 last_seen_trx: 9895 9896, cert: 0
090720 23:14:30 [Note] DEBUG: job_queue.
ting
090720 23:14:30 [Note] DEBUG: mm_galera.
: GALERA ws commit for: 9896 9945
090720 23:14:30 [Note] DEBUG: job_queue.
te
090720 23:14:30 InnoDB: Error: cannot allocate 4294964960 bytes of
The stack trace shows that the undo log contains corrupted information:
(gdb) bt
#0 0x0850ad98 in ut_malloc_low (n=4294962096, set_to_zero=1,
assert_
#1 0x0850ae63 in ut_malloc (n=4294962096) at ut/ut0mem.c:189
#2 0x084c8b17 in mem_area_alloc (size=4294962096, pool=0xa035ce8)
at mem/mem0pool.c:355
#3 0x084c7ed2 in mem_heap_
init_block=0x0, type=0, file_name=0xaa673cc "0vers.c", line=451)
at mem/mem0mem.c:362
#4 0x084c8129 in mem_heap_add_block (heap=0xaa673c8, n=4294962032)
at mem/mem0mem.c:465
#5 0x084c7173 in mem_heap_alloc (heap=0xaa673c8, n=4294962032)
at ../../storage/
#6 0x084fd4c2 in trx_undo_rec_copy (undo_rec=
at ../../storage/
#7 0x084fef95 in trx_undo_
{high = 0, low = 68883600}, heap=0xaa673c8) at trx/trx0rec.c:1202
#8 0x084ff000 in trx_undo_
trx_id={high = 0, low = 2250904}, undo_rec=
at trx/trx0rec.c:1240
#9 0x084ff197 in trx_undo_
index_
offsets=
at trx/trx0rec.c:1325
#10 0x084f29cd in row_vers_
mtr=0xa655200c, index=0xb6bb3868, offsets=0xa6551fac, view=0xb6bc6e68,
offset_
at row/row0vers.c:482
#11 0x084ebcc7 in row_sel_
clust_
offsets=
mtr=0xa655200c) at row/row0sel.c:2730
#12 0x084edcfd in row_search_
prebuilt=
#13 0x08480fbf in ha_innobase:
buf=0xa6421040 "�\002", key_ptr=0xab15860 "\002", key_len=4,
find_
#14 0x08397b9e in handler:
buf=0xa6421040 "�\002", key=0xab15860 "\002", keypart_map=1,
find_
#15 0x0838e89e in handler:
buf=0xa6421040 "�\002", index=0, key=0xab15860 "\002", keypart_map=1,
find_
#16 0x082d11dd in join_read_const (tab=0xab15590) at sql_select.cc:11596
#17 0x082e4e5b in join_read_
at sql_select.cc:11495
#18 0x082e8ee7 in make_join_
conds=
#19 0x082ec0ea in JOIN::optimize (this=0xab13ed0) at sql_select.cc:954
#20 0x082ef818 in mysql_select (thd=0xa647e400, rref_pointer_
tables=
og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0,
select_
select_
#21 0x082f503d in handle_select (thd=0xa647e400, lex=0xa647f57c,
result=
#22 0x0825f5e8 in execute_
at sql_parse.cc:5407
#23 0x0826035c in mysql_execute_
#24 0x0826c056 in mysql_parse (thd=0xa647e400,
inBuf=0xab13938 "SELECT * FROM comm03 WHERE p = 2", length=32,
found_
#25 0x0826daa6 in dispatch_command (command=COM_QUERY, thd=0xa647e400,
packet=
at sql_parse.cc:1350
#26 0x0826fa75 in do_command (thd=0xa647e400) at sql_parse.cc:933
#27 0x082573b8 in handle_
#28 0x00acb4d2 in start_thread () from /lib/i686/
#29 0x00a2148e in clone () from /lib/i686/
The crash happens in the sqlgen test when run with an extremely high conflict rate (one table with 10 rows,
two nodes with 8 connections, read/write ratio 70/30).
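The allocation sizes in the error message and the backtrace above all lie just below 2^32. On a 32-bit build (the trace shows /lib/i686/ and 32-bit addresses), that is what a small negative number looks like once it is stored in an unsigned integer. The sketch below only reinterprets the reported values as signed 32-bit integers. The helper name as_signed_32 is made up for illustration, and the suggestion that a length derived from corrupted undo record fields went negative is an assumption that this report does not confirm.

```python
def as_signed_32(n: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n >= (1 << 31) else n


# Sizes copied verbatim from the InnoDB error message and the gdb backtrace.
requested = {
    "InnoDB error message": 4294964960,
    "ut_malloc_low / ut_malloc / mem_area_alloc": 4294962096,
    "mem_heap_add_block / mem_heap_alloc": 4294962032,
}

for where, size in requested.items():
    print(f"{where}: {size} -> {as_signed_32(size)}")
```

Each value maps to a negative number of a few thousand, which is on the order of a position inside a single page and would fit a length computed as the difference of two offsets where one of them is wrong.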
Changed in codership-mysql:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Seppo Jaakola (seppo-jaakola)
Changed in codership-mysql:
status: Fix Committed → Fix Released
Analysis of the query log shows that the crashing query is a SELECT for a row that has been modified by the applier thread (which is why SELECT processing consults the undo log).
The connection issuing the SELECT suffered a brute force abort during its previous transaction, which was rolled back. That aborted transaction had one or more updates to this very same row.
Furthermore, the query logs show that every crash involved transactions with two or more updates to the same row. Note that the database holds only 10 rows in total, and sqlgen, when running long transactions, generates update queries at random, so they quite often hit the same row.
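To put a rough number on "quite often": the sketch below estimates the chance that a single transaction updates some row at least twice. It assumes each update picks one of the 10 rows uniformly and independently, which is an assumption about sqlgen rather than something stated in this report. The function name p_repeat is hypothetical.

```python
from math import perm


def p_repeat(rows: int, updates: int) -> float:
    """Probability that `updates` uniformly random picks out of `rows` rows
    hit at least one row more than once (birthday-problem style)."""
    if updates > rows:
        return 1.0  # pigeonhole: a repeat is unavoidable
    # perm(rows, updates) counts the update sequences with no repeated row.
    return 1.0 - perm(rows, updates) / rows ** updates


for k in range(2, 7):
    print(f"{k} updates per transaction: {p_repeat(10, k):.4f}")
```

Under that assumption, a transaction with four updates already repeats a row about half the time, so multi-update hits on one row would be routine in this test setup.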