main.create-big fails on mariadb 5.2 (64bit linux)

Bug #491097 reported by Oleksandr "Sanja" Byelkin
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
Undecided
Oleksandr "Sanja" Byelkin

Bug Description

Build with ./BUILD/compile-amd64-debug-max-no-ndb (I think it does not matter) main.create-big fails:
CURRENT_TEST: main.create-big
--- /home/bell/maria/bzr/work-maria-5.2-test/mysql-test/r/create-big.result 2009-12-02 00:58:10.216015662 +0300
+++ /home/bell/maria/bzr/work-maria-5.2-test/mysql-test/r/create-big.reject 2009-12-02 01:10:32.209627135 +0300
@@ -71,7 +71,6 @@
 select * from t1;
 i
 1
-2
 drop table t1;
 set @a:=0;
 create table t1 select 1 as i;;
@@ -96,7 +95,6 @@
 select * from t1;
 i
 1
-2
 drop table t1;
 set @a:=0;
 create table t1 select 1 as i;;

mysqltest: Result length mismatch

Revision history for this message
Hakan Küçükyılmaz (hakan-askmonty) wrote :

Can't repeat:

BUILD/compile-amd64-debug-max-no-ndb

./mysql-test-run.pl create-big
Logging: ./mysql-test-run.pl create-big
MySQL Version 5.2.0
Checking supported features...
 - SSL connections supported
 - binaries are debug compiled
Collecting tests...
vardir: /home/hakan/work/monty_program/maria-5.2/mysql-test/var
Removing old var directory...
Creating var directory '/home/hakan/work/monty_program/maria-5.2/mysql-test/var'...
Installing system database...
Using server port 57903

==============================================================================

TEST RESULT TIME (ms)
------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
main.create-big [ pass ] 171914
------------------------------------------------------------
The servers were restarted 0 times
Spent 171.914 of 183 seconds executing testcases

All 1 tests were successful.

Changed in maria:
status: New → Incomplete
assignee: nobody → Hakan Küçükyılmaz (hakan-askmonty)
assignee: Hakan Küçükyılmaz (hakan-askmonty) → Oleksandr "Sanja" Byelkin (sanja-byelkin)
Revision history for this message
Oleksandr "Sanja" Byelkin (sanja-byelkin) wrote :

Actually it fails in the same way on 5.1.n Fail is sporadical but rate is much higher then 50%.

Here is minimal test suite (number of connections is important even if all they are not used, so maybe my 4 core is also important):

# Create auxilliary connections
connect (addconroot1, localhost, root,,);
connect (addconroot2, localhost, root,,);
connect (addconroot3, localhost, root,,);
connection default;

# What happens if other statement sneaks in after the table creation but
# before its opening ?
set session debug="-d,sleep_create_select_before_create:+d,sleep_create_select_before_open";
# Concurrent INSERT
--send create table t1 select 1 as i;
connection addconroot1;
--sleep 2
insert into t1 values (2);
connection default;
--reap
select * from t1;
drop table t1;

Revision history for this message
Oleksandr "Sanja" Byelkin (sanja-byelkin) wrote :

The problem is that when we perform "insert into t1 values (2);" we first sends OK then close tables:

      net_end_statement(thd);
      query_cache_end_of_result(thd);
      /*
        Multiple queries exits, execute them individually
      */
      close_thread_tables(thd);

So client starts continue the script as soon as it got OK, and execute "select * from t1;" before other thread close tables, so result is not visible during select execution, here is how it is in debug trace (T@11 is default connection, T@12 is addconroot1):
...
T@12 : | | >net_end_statement
T@12 : | | | >my_ok
...
T@12 : | | | | | | >query_cache_insert
T@12 : | | | | | | <query_cache_insert
T@12 : | | | | | | >vio_write
T@12 : | | | | | | | enter: sd: 52 buf: 0x215a5a8 size: 11
T@11 : | <vio_read
T@11 : | >vio_read
T@11 : | | enter: sd: 28 buf: 0x20d73a8 size: 17
T@11 : | <vio_read
...
T@11 : | | query: select * from t1
...
T@11 : | | query: drop table t1
...
T@11 : | | | | | | <remove_table_from_cache
T@11 : | | | | | | >wait_for_locked_table_names
T@11 : | | | | | | | >table_is_used
...
T@11 : | | | | | | | >wait_for_condition
T@12 : | | | | | | | exit: 11
T@12 : | | | | | | <vio_write
...
T@12 : | | | | info: OK sent, so no more error sending allowed
T@12 : | | | <my_ok
T@12 : | | <net_end_statement

Revision history for this message
Oleksandr "Sanja" Byelkin (sanja-byelkin) wrote :

I propose to do something like this:
=== modified file 'sql/sql_parse.cc'
--- sql/sql_parse.cc 2009-11-30 21:37:27 +0000
+++ sql/sql_parse.cc 2009-12-02 23:42:54 +0000
@@ -1252,12 +1252,12 @@
       ha_maria::implicit_commit(thd, FALSE);
 #endif

+ /*
+ Multiple queries exits, execute them individually
+ */
+ close_thread_tables(thd);
       net_end_statement(thd);
       query_cache_end_of_result(thd);
- /*
- Multiple queries exits, execute them individually
- */
- close_thread_tables(thd);
       ulong length= (ulong)(packet_end - beginning_of_next_stmt);

       log_slow_statement(thd);
@@ -1645,12 +1645,12 @@
   ha_maria::implicit_commit(thd, FALSE);
 #endif

- net_end_statement(thd);
- query_cache_end_of_result(thd);

   thd->proc_info= "closing tables";
   /* Free tables */
   close_thread_tables(thd);
+ net_end_statement(thd);
+ query_cache_end_of_result(thd);

   log_slow_statement(thd);

Revision history for this message
Michael Widenius (monty) wrote :

The problem is that someone moved in MySQL 5.1 mysql_unlock_tables() from mysql_insert to close_thread_tables().

There is two solutions how to fix this:
a) Move that we call close_thread_tables() before sending the result to the client (as you suggested)
b) Move out the mysql_unlock_tables() + thd->binlog_flush_pending_rows_event from close_threaded_tables() to it's own function (unlock_thread_tables() ?) and call this before we call net_end_statement() in sql_parse.cc

I would suggest we do b) because this enables the client to get some work done while the server is doing end-of-statement cleanup, which results in higher throughput in many cases.

Changed in maria:
status: Incomplete → In Progress
Changed in maria:
status: In Progress → Fix Committed
Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.