Innodb crash on JOIN query when "join_buffer_size" > "join_buffer_space_limit"

Bug #925985 reported by Troex Nevelin
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
Critical
Igor Babaev

Bug Description

I don't how to describe the problem but the query below crashes innodb (xtradb). It does not happening on mysql 5.1 & 5.5
I'm using 5.3.3-MariaDB-rc-mariadb108~squeeze-log from mariadb repository on debian 6 64bit

Query:

SELECT DISTINCT
B.TYPE_SID,
B.ID BANNER_ID,
B.WEIGHT BANNER_WEIGHT,
B.SHOWS_FOR_VISITOR,
B.FIX_CLICK,
B.FIX_SHOW,
B.KEYWORDS BANNER_KEYWORDS,
DATE_FORMAT(B.DATE_SHOW_FIRST, '%d.%m.%Y %H:%i:%s') DATE_SHOW_FIRST,
DATE_FORMAT(B.DATE_SHOW_FROM, '%d.%m.%Y %H:%i:%s') DATE_SHOW_FROM,
DATE_FORMAT(B.DATE_SHOW_TO, '%d.%m.%Y %H:%i:%s') DATE_SHOW_TO,
B.FLYUNIFORM FLYUNIFORM,
B.MAX_SHOW_COUNT MAX_SHOW_COUNT,
B.SHOW_COUNT SHOW_COUNT,
C.ID CONTRACT_ID,
C.WEIGHT CONTRACT_WEIGHT,
C.KEYWORDS CONTRACT_KEYWORDS
FROM
b_adv_type T

INNER JOIN b_adv_banner B ON (
B.ACTIVE='Y'
and B.TYPE_SID = T.SID
and B.STATUS_SID = 'PUBLISHED'
and (B.FOR_NEW_GUEST is null or B.FOR_NEW_GUEST='N')
and (ifnull(B.MAX_SHOW_COUNT,0)>ifnull(B.SHOW_COUNT,0) or ifnull(B.MAX_SHOW_COUNT,0)=0)
and (ifnull(B.MAX_CLICK_COUNT,0)>ifnull(B.CLICK_COUNT,0) or ifnull(B.MAX_CLICK_COUNT,0)=0)
and (ifnull(B.MAX_VISITOR_COUNT,0)>ifnull(B.VISITOR_COUNT,0) or ifnull(B.MAX_VISITOR_COUNT,0)=0)
and (B.DATE_SHOW_FROM<=now() or B.DATE_SHOW_FROM is null or length(B.DATE_SHOW_FROM)<=0)
and (B.DATE_SHOW_TO>=now() or B.DATE_SHOW_TO is null or length(B.DATE_SHOW_TO)<=0))

INNER JOIN b_adv_banner_2_site BS ON (
BS.BANNER_ID = B.ID
and BS.SITE_ID = 'en')

INNER JOIN b_adv_contract C ON (
C.ID = B.CONTRACT_ID
and C.ACTIVE='Y'
and (ifnull(C.MAX_SHOW_COUNT,0)>ifnull(C.SHOW_COUNT,0) or ifnull(C.MAX_SHOW_COUNT,0)=0)
and (ifnull(C.MAX_CLICK_COUNT,0)>ifnull(C.CLICK_COUNT,0) or ifnull(C.MAX_CLICK_COUNT,0)=0)
and (ifnull(C.MAX_VISITOR_COUNT,0)>ifnull(C.VISITOR_COUNT,0) or ifnull(C.MAX_VISITOR_COUNT,0)=0)
and (C.DATE_SHOW_FROM<=now() or C.DATE_SHOW_FROM is null or length(C.DATE_SHOW_FROM)<=0)
and (C.DATE_SHOW_TO>=now() or C.DATE_SHOW_TO is null or length(C.DATE_SHOW_TO)<=0))

INNER JOIN b_adv_contract_2_site CS ON (
CS.CONTRACT_ID = B.CONTRACT_ID
and CS.SITE_ID = 'en')

INNER JOIN b_adv_contract_2_type CT ON (
CT.CONTRACT_ID = C.ID
and (CT.TYPE_SID = 'ALL' or CT.TYPE_SID = T.SID))

INNER JOIN b_adv_banner_2_weekday BW ON (
BW.BANNER_ID = B.ID
and BW.C_WEEKDAY='FRIDAY'
and BW.C_HOUR = '11')

INNER JOIN b_adv_contract_2_weekday CW ON (
CW.CONTRACT_ID = C.ID
and CW.C_WEEKDAY='FRIDAY'
and CW.C_HOUR = '11')

LEFT JOIN b_adv_banner_2_group UG1 ON (
(UG1.BANNER_ID = B.ID
 and UG1.GROUP_ID in (1,2) and UG1.GROUP_ID<>2)
)

LEFT JOIN b_adv_banner_2_page BP1 ON (
BP1.BANNER_ID = B.ID
and BP1.SHOW_ON_PAGE='Y')

LEFT JOIN b_adv_banner_2_page BP2 ON (
BP2.BANNER_ID = B.ID
and BP2.SHOW_ON_PAGE='N'
and '/club/index.php' like concat(BP2.PAGE, '%'))

LEFT JOIN b_adv_contract_2_page CP1 ON (
CP1.CONTRACT_ID = C.ID
and CP1.SHOW_ON_PAGE='Y')

LEFT JOIN b_adv_contract_2_page CP2 ON (
CP2.CONTRACT_ID = C.ID
and CP2.SHOW_ON_PAGE='N'
and '/club/index.php' like concat(CP2.PAGE, '%'))

LEFT JOIN b_adv_banner_2_stat_adv BA ON BA.BANNER_ID = B.ID
LEFT JOIN b_adv_banner_2_country BC ON BC.BANNER_ID = B.ID AND (
(
(B.STAT_TYPE is null OR length(B.STAT_TYPE)=0 OR B.STAT_TYPE='COUNTRY')
AND BC.COUNTRY_ID='UA'
) OR (
B.STAT_TYPE='REGION'
AND BC.COUNTRY_ID='UA'
AND BC.REGION=''
) OR (
B.STAT_TYPE='CITY'
AND BC.CITY_ID='2'
)
)

WHERE
T.ACTIVE = 'Y'

and (
B.STAT_COUNT is null
or B.STAT_COUNT = 0
or BC.BANNER_ID is not null
)
and BP2.ID is null
and CP2.ID is null
and (BP1.ID is null or '/club/index.php' like concat(BP1.PAGE, '%'))
and (CP1.ID is null or '/club/index.php' like concat(CP1.PAGE, '%'))
and (BA.STAT_ADV_ID is null or BA.STAT_ADV_ID='0')
and (BC.COUNTRY_ID is null or BC.COUNTRY_ID='UA')
 and
 (
(B.SHOW_USER_GROUP = 'Y' and UG1.GROUP_ID is not null)
or
(B.SHOW_USER_GROUP <> 'Y' and UG1.GROUP_ID is null)
 )
ORDER BY B.TYPE_SID desc, C.ID desc

This query is used in most popular russian CMS (bitrix) upon user login. I would like to provide more info if needed

Tags: 5.3 crash innodb
Revision history for this message
Troex Nevelin (troex) wrote :
Revision history for this message
Troex Nevelin (troex) wrote :

Downgraded to 5.2.10-MariaDB-mariadb107~squeeze-log without re-creating tables - now this query works fine

Revision history for this message
Elena Stepanova (elenst) wrote :

Hi,

Could you please provide information about the tables used in the query?
Ideally, the full database dump, but if the contents is confidential and can't be obfuscated, then at least the structure (e.g. the output of SHOW CREATE TABLE) of the involved tables, and, if possible, some idea about the size of the tables.

You can use the private section on ftp.askmonty.org to upload the data you don't want to make public.

Thanks.

Revision history for this message
Troex Nevelin (troex) wrote :

Okay, I figured out what exactly lead to the problem.
This problem was not happening with default server configuration, but only happend with my config, after excluding options one by one I found that option which is causing crash - "join_buffer_size = 4M".

Setting this option down to 2M or less will not cause crash. As I see "join_buffer_size" was bigger than "join_buffer_space_limit" which is by default 2M.

Setting "join_buffer_space_limit = 4M" and "join_buffer_size = 4M" did the trick and that query does not crash the DB anymore.

If you still need the database sample - let me know, I'll upload it to the private ftp area.

Revision history for this message
Elena Stepanova (elenst) wrote :

Yes, please upload it anyway.
Thanks.

Revision history for this message
Troex Nevelin (troex) wrote :

I've extracted part of db which doesn't contain any private info. The error is still happening with query and setting described above. Correct result of this query must return one row.

Revision history for this message
Troex Nevelin (troex) wrote :
description: updated
summary: - innodb crash on big query (please review)
+ Innodb crash on JOIN query when join_buffer_size >
+ join_buffer_space_limit
summary: - Innodb crash on JOIN query when join_buffer_size >
- join_buffer_space_limit
+ Innodb crash on JOIN query when "join_buffer_size" >
+ "join_buffer_space_limit"
Revision history for this message
Elena Stepanova (elenst) wrote :
Download full text (4.6 KiB)

Thank you.
I suppose even if the combination of parameters is not very smart, it is not bad enough to allow server to die.
I'm attaching a somewhat reduced MTR test case.

Crash is reproducible on release builds (5.3.3, 5.3.4 from buildbot).

Debug builds don't crash, but cause valgrind warnings:

==16502== Conditional jump or move depends on uninitialised value(s)
==16502== at 0x8329AC6: JOIN_CACHE::join_matching_records(bool) (sql_join_cache.cc:2202)
==16502== by 0x83297C2: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2062)
==16502== by 0x8329805: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2076)
==16502== by 0x8374A7E: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15154)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374AA7: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15156)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x837478A: do_select(JOIN*, List<Item>*, st_table*, Procedure*) (sql_select.cc:15038)
==16502== by 0x8357012: JOIN::exec() (sql_select.cc:2268)
==16502==
==16502== Conditional jump or move depends on uninitialised value(s)
==16502== at 0x832A0CF: JOIN_CACHE::join_null_complements(bool) (sql_join_cache.cc:2496)
==16502== by 0x83298D1: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2096)
==16502== by 0x8329805: JOIN_CACHE::join_records(bool) (sql_join_cache.cc:2076)
==16502== by 0x8374A7E: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15154)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x8374AA7: sub_select_cache(JOIN*, st_join_table*, bool) (sql_select.cc:15156)
==16502== by 0x8374C90: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:15316)
==16502== by 0x837478A: do_select(JOIN*, List<Item>*, st_table*, Procedure*) (sql_sel...

Read more...

Changed in maria:
milestone: none → 5.3
assignee: nobody → Igor Babaev (igorb-seattle)
Changed in maria:
status: New → Confirmed
importance: Undecided → High
Changed in maria:
status: Confirmed → In Progress
Revision history for this message
Igor Babaev (igorb-seattle) wrote :

The following simple test case crashes the server (debug version) due to the same problem.

CREATE TABLE t1 (a int);
INSERT INTO t1 VALUES (5), (3);

CREATE TABLE t2 (a int, b int);
INSERT INTO t2 VALUES
  (3,30), (1,10), (7,70), (2,20),
  (3,31), (1,11), (7,71), (2,21),
  (3,32), (1,12), (7,72), (2,22);

CREATE TABLE t3 (b int, c int);
INSERT INTO t3 VALUES (32, 302), (42,400), (30,300);

set @tmp_optimizer_switch=@@optimizer_switch;
set optimizer_switch='optimize_join_buffer_size=off';
set join_buffer_space_limit=4096;
set join_buffer_size=4096*2;
set join_cache_level=2;
set optimizer_switch='outer_join_with_cache=on';

EXPLAIN
SELECT * FROM t1, t2 LEFT JOIN t3 ON t2.b=t3.b WHERE t1.a=t2.a;
SELECT * FROM t1, t2 LEFT JOIN t3 ON t2.b=t3.b WHERE t1.a=t2.a;

set join_buffer_space_limit=default;
set join_buffer_size=default;
set join_cache_level=default;
set optimizer_switch=@tmp_optimizer_switch;

DROP TABLE t1,t2,t3;

Changed in maria:
importance: High → Critical
Changed in maria:
status: In Progress → Fix Committed
Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.