Crash in create_internal_tmp_table or OOM or internal error: 'Using too big key for internal temp tables' with derived_with_keys=on, FROM subquery or view

Bug #957409 reported by jocelyn fournier
32
This bug affects 5 people
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
Undecided
Igor Babaev

Bug Description

Hi,

I've just hit a bug with MariaDB 5.3.5-ga. The stack trace is the following :

Mar 15 16:57:46 mysql mysqld: 2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ;
Mar 15 16:57:46 mysql mysqld: This could be because you hit a bug. It is also possible that this binary
Mar 15 16:57:46 mysql mysqld: or one of the libraries it was linked against is corrupt, improperly built,
Mar 15 16:57:46 mysql mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
Mar 15 16:57:46 mysql mysqld:
Mar 15 16:57:46 mysql mysqld: To report this bug, see http://kb.askmonty.org/en/reporting-bugs
Mar 15 16:57:46 mysql mysqld:
Mar 15 16:57:46 mysql mysqld: We will try our best to scrape up some info that will hopefully help
Mar 15 16:57:46 mysql mysqld: diagnose the problem, but since we have already crashed,
Mar 15 16:57:46 mysql mysqld: something is definitely wrong and this may fail.
Mar 15 16:57:46 mysql mysqld:
Mar 15 16:57:46 mysql mysqld: Server version: 5.3.5-MariaDB-ga
Mar 15 16:57:46 mysql mysqld: key_buffer_size=1073741824
Mar 15 16:57:46 mysql mysqld: read_buffer_size=4194304
Mar 15 16:57:46 mysql mysqld: max_used_connections=24
Mar 15 16:57:46 mysql mysqld: max_threads=201
Mar 15 16:57:46 mysql mysqld: thread_count=19
Mar 15 16:57:46 mysql mysqld: connection_count=19
Mar 15 16:57:46 mysql mysqld: It is possible that mysqld could use up to
Mar 15 16:57:46 mysql mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2697769 K bytes of memory
Mar 15 16:57:46 mysql mysqld: Hope that's ok; if not, decrease some variables in the equation.
Mar 15 16:57:46 mysql mysqld:
Mar 15 16:57:46 mysql mysqld: Thread pointer: 0xda56fe0
Mar 15 16:57:46 mysql mysqld: Attempting backtrace. You can use the following information to find out
Mar 15 16:57:46 mysql mysqld: where mysqld died. If you see no messages after this, something went
Mar 15 16:57:46 mysql mysqld: terribly wrong...

Mar 15 16:57:46 mysql mysqld: stack_bottom = 7fc0f7ccae98 thread_stack 0x3c000
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(my_print_stacktrace+0x2e) [0xa2ab6e]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(handle_fatal_signal+0x3f9) [0x761e89]
Mar 15 16:57:46 mysql mysqld: /lib/libpthread.so.0(+0xef60) [0x7fc29fe9ef60]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(create_internal_tmp_table(st_table*, st_key*, st_maria_columndef*, st_maria_columndef**, unsigned long long)+0x218) [0x6afc58]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(mysql_derived_create(THD*, st_lex*, TABLE_LIST*)+0xc4) [0x7ad5a4]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(mysql_handle_single_derived(st_lex*, TABLE_LIST*, unsigned int)+0xb8) [0x7ae5e8]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(st_join_table::preread_init()+0xa1) [0x6aaf91]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(sub_select(JOIN*, st_join_table*, bool)+0x496) [0x6bcd16]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld() [0x6bce63]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(JOIN::exec()+0x852) [0x6cf6d2]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*)+0x185) [0x6d17b5]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(handle_select(THD*, st_lex*, select_result*, unsigned long)+0x16f) [0x6d21ef]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld() [0x6480ce]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(mysql_execute_command(THD*)+0x39c8) [0x64dc38]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(mysql_parse(THD*, char*, unsigned int, char const**)+0x299) [0x650a19]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0xa72) [0x6518e2]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(do_command(THD*)+0x101) [0x652431]
Mar 15 16:57:46 mysql mysqld: ./bin/mysqld(handle_one_connection+0xfd) [0x64391d]
Mar 15 16:57:46 mysql mysqld: /lib/libpthread.so.0(+0x68ba) [0x7fc29fe968ba]
Mar 15 16:57:46 mysql mysqld: /lib/libc.so.6(clone+0x6d) [0x7fc29f46002d]

The query :

SELECT tmp.*, m.photo , m.login member_name, m.sexe, pv.name plateform_name FROM ((SELECT mp.product_id id, mp.date_review, mp.status, mp.plateform_id, mp.id review_id, mp.member_id, mp.note, mp.nb_comment, mp.review_all, mp.review_like, mp.selection, mp.conso_future, mpd.review, mpd.review_status, mpd.title_review from sc_member_product_light mp LEFT JOIN sc_member_product_detail mpd ON mpd.member_id = mp.member_id AND mpd.product_id = mp.product_id where mp.product_id = 366123681178224 AND is_review = 1 AND mp.status = 0 AND mp.note > 5 order by review_like DESC, review_all ASC, note DESC limit 1) UNION (SELECT mp.product_id id, mp.date_review, mp.status, mp.plateform_id, mp.id review_id, mp.member_id, mp.note, mp.nb_comment, mp.review_all, mp.review_like, mp.selection, mp.conso_future, mpd.review, mpd.review_status, mpd.title_review from sc_member_product_light mp LEFT JOIN sc_member_product_detail mpd ON mpd.member_id = mp.member_id AND mpd.product_id = mp.product_id where mp.product_id = 366123681178224 AND is_review = 1 AND mp.status = 0 AND mp.note <= 5 order by review_like DESC, review_all ASC, note ASC limit 1)) as tmp JOIN sc_member m ON m.id = tmp.member_id LEFT JOIN sc_param_values pv ON pv.id = tmp.plateform_id

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

and execution plan

+----+--------------+------------+--------+----------------------------------------------------+-----------------------+---------+-----------------------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+------------+--------+----------------------------------------------------+-----------------------+---------+-----------------------+------+----------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | key0 | NULL | NULL | NULL | 2 | |
| 1 | PRIMARY | pv | eq_ref | PRIMARY | PRIMARY | 4 | tmp.plateform_id | 1 | |
| 1 | PRIMARY | m | eq_ref | PRIMARY,member_photo | PRIMARY | 4 | tmp.member_id | 1 | |
| 2 | DERIVED | mp | range | note,is_review,product_2,member_2,product_3,status | product_3 | 12 | NULL | 201 | Using index condition; Using where; Using filesort |
| 2 | DERIVED | mpd | eq_ref | unique_product_member,product,member,member_2 | unique_product_member | 12 | sc.mp.member_id,const | 1 | Using where |
| 3 | UNION | mp | range | note,is_review,product_2,member_2,product_3,status | product_3 | 12 | NULL | 129 | Using index condition; Using where; Using filesort |
| 3 | UNION | mpd | eq_ref | unique_product_member,product,member,member_2 | unique_product_member | 12 | sc.mp.member_id,const | 1 | Using where |
| NULL | UNION RESULT | <union2,3> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+------------+--------+----------------------------------------------------+-----------------------+---------+-----------------------+------+----------------------------------------------------+
8 rows in set (0,24 sec)

The crash is quite random. The query could finish successfully or just issue a
ERROR 5 (HY000): Out of memory (Needed 1730537048 bytes)

Before trying to dig more and try to make a reproducable testcase, is there any specific parameters I could control which could explain the reported memory allocation ?

Thanks and regards,
  Jocelyn Fournier

Revision history for this message
jocelyn fournier (joce) wrote :

Hi,

Actually, I've been able to reproduce one time the crash with the following table and datas, with the same query as above :

http://www.wiktik.com/error.tar.gz

HTH,
  Jocelyn

Revision history for this message
Elena Stepanova (elenst) wrote :

Hi Jocelyn,

How much memory do you actually have?

As indicated in the error log, key_buffer_size=1073741824 --
so, key_buffer_size alone takes 1Gb (default is 128 Mb); you might also have other non-default options which increase the memory usage further, check various %_buffer_size values in your cnf file or SHOW GLOBAL VARIABLES output. If your HW has little memory available, and you are using a 'big' config, it might be a problem.

That said, we had a very similar issue reported once (https://bugs.launchpad.net/maria/+bug/929603), for 5.3.3-rc, but it does not seem to be reproducible with the reporter's use case and data any longer, so we would greatly appreciate if you could provide more info about yours: DDL for tables used in the query, and, if possible, the data dump (it can be uploaded to ftp://ftp.askmonty.org. If you cannot provide the table contents, please at least give us the idea about the amount of data stored there. The DDLs (SHOW CREATE TABLE <table name> results) are still very important.

Thank you.

Revision history for this message
Elena Stepanova (elenst) wrote :

Okay, I have not seen your comment when I was sending mine. I will try your test case, thank you.

Revision history for this message
Elena Stepanova (elenst) wrote :

Hi Jocelyn,
I am getting "Oups, cette page n'existe pas ! " trying to use that link...

Revision history for this message
jocelyn fournier (joce) wrote :

Yup sorry, wrong link.

The right one :

http://static1.wiktik.com/error.tar.gz

Note that the latest query execution has produced the following error :

ERROR 1928 (HY000): Internal error: 'Using too big key for internal temp tables'

Revision history for this message
jocelyn fournier (joce) wrote :

Note also I've the following specific settings for Aria :

aria_pagecache_buffer_size = 1G
aria_sort_buffer_size = 2G

Revision history for this message
jocelyn fournier (joce) wrote :

The problem is really random, so you can need to import several times the sql dump before getting an issue while executing the query.

Revision history for this message
Elena Stepanova (elenst) wrote :

Jocelyn,

thank you, I got both OOM and 'Using too big key for internal temp tables'.
I will see if I can make the test case more deterministic.

Changed in maria:
assignee: nobody → Elena Stepanova (elenst)
Elena Stepanova (elenst)
summary: - Crash in create_internal_tmp_table
+ Crash in create_internal_tmp_table or out of memory or internal error:
+ 'Using too big key for internal temp tables' with derived_with_keys=on,
+ FROM subquery or view
summary: - Crash in create_internal_tmp_table or out of memory or internal error:
- 'Using too big key for internal temp tables' with derived_with_keys=on,
- FROM subquery or view
+ Crash in create_internal_tmp_table or OOM or internal error: 'Using too
+ big key for internal temp tables' with derived_with_keys=on, FROM
+ subquery or view
Revision history for this message
Elena Stepanova (elenst) wrote :

Hi Jocelyn,

It seems that the problem has been fixed in the current maria/5.3 tree in revision 3459, along with the bug https://bugs.launchpad.net/maria/+bug/953649. If you build binaries yourself, please pick up the latest code and try if it works for you. If you are using release builds, please try to switch off 'derived_with_keys' in optimizer switch, as a workaround till the next version is released: add optimizer_switch=derived_with_keys=off to your cnf file (and set it globally on the server, if you don't want to restart it at the moment).

If you do either of that, please let us know whether it worked for you.

Meanwhile, I will ask Igor who fixed the other bug whether it should have been expected to fix this one, too, or it just happened to mask it and we still need to do some work about it.

Thank you.

Revision history for this message
Elena Stepanova (elenst) wrote :
Download full text (6.6 KiB)

Hi Igor,

As said above, the problem described in the bug is reproducible on maria-5.3 up to and including revno 3458, and but not on revno 3459 and further. Revision 3459 is your fix for https://bugs.launchpad.net/maria/+bug/953649. Could you please take a look and advise whether it is expected to fix the problem(s) described here, too, or it just changed something so that test cases provided here don't trigger the issue any longer, while it might still exist?

I hope you will be able to say just looking at it, but I will also attach a test case which is not minimal, but is considerably smaller than the initial one. I run it as "perl ./mtr test_oom", without any custom parameters.

The offending query in it is

SELECT COUNT(*) FROM (
SELECT * FROM sc_member_product_detail
UNION
SELECT * FROM sc_member_product_detail
) tmp, sc_member m
WHERE m.id = tmp.member_id;

COUNT is not important, it is here just for convenience as sometimes the query returns thousands of rows instead of expected 1.
With derived_with_keys=off everything works fine.
Replacing the subquery with a view does not eliminate a problem.

The issue appears in different ways, with the same or similar data and query it can be a crash, or out of memory error, or 'Using too big key for internal temp tables', or even a ridiculous result. However, valgrind warnings are all similar if not identical:

Conditional jump or move depends on uninitialised value(s)
at 0xC3D2DA: alloc_root (my_alloc.c:199)
by 0x795D18: create_internal_tmp_table(st_table*, st_key*, st_maria_columndef*, st_maria_columndef**, unsigned long long) (sql_select.cc:14459)
by 0x8FA01C: mysql_derived_create(THD*, st_lex*, TABLE_LIST*) (sql_derived.cc:814)
by 0x8F8EDA: mysql_handle_single_derived(st_lex*, TABLE_LIST*, unsigned int) (sql_derived.cc:176)
by 0x78AC56: st_join_table::preread_init() (sql_select.cc:9897)
by 0x797C19: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15387)
by 0x798614: evaluate_join_record(JOIN*, st_join_table*, int) (sql_select.cc:15629)
by 0x797E5E: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15432)
by 0x79762C: do_select(JOIN*, List<Item>*, st_table*, Procedure*) (sql_select.cc:15093)
by 0x778E32: JOIN::exec() (sql_select.cc:2731)
by 0x7796CC: mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned lon
g long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:2952)
by 0x770294: handle_select(THD*, st_lex*, select_result*, unsigned long) (sql_select.cc:285)
by 0x6FEB03: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:5151)
by 0x6F5BC0: mysql_execute_command(THD*) (sql_parse.cc:2284)
by 0x7014DE: mysql_parse(THD*, char*, unsigned int, char const**) (sql_parse.cc:6152)
by 0x6F336F: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1228)
Conditional jump or move depends on uninitialised value(s)
at 0xC3D349: alloc_root (my_alloc.c:209)
by 0x795D18: create_internal_tmp_table(st_table*, st_key*, st_maria_columndef*, st_maria_columndef**, unsigned long long) (sql_select.cc:14459)
by 0x8FA01C: mysql_derived_create(THD*, st_lex*, TABLE_LIST*) (sql_...

Read more...

Revision history for this message
Elena Stepanova (elenst) wrote :
Changed in maria:
milestone: none → 5.3
assignee: Elena Stepanova (elenst) → Igor Babaev (igorb-seattle)
Revision history for this message
Igor Babaev (igorb-seattle) wrote :

I confirm that the bug is reproducible with 5.3 rev.3458

Changed in maria:
status: New → Confirmed
Revision history for this message
Igor Babaev (igorb-seattle) wrote :

I checked that this was another manifestation of bug #953649 fixed in rev.3459.

Changed in maria:
status: Confirmed → Fix Committed
Revision history for this message
Elena Stepanova (elenst) wrote :

Fix for bug #953649 was released in 5.3.6

Changed in maria:
status: Fix Committed → Fix Released
Revision history for this message
Pietro Baricco (pietro-baricco) wrote :

Hello, the bug seems to be still there in 5.5.28a
I have prepared a zip file containing the test dataset, the offending query and server specs and configuration to help you fix it.
Many thanks!

http://109.74.195.95/test.zip

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers