Comment 10 for bug 1220975

Revision history for this message
John H. Embretsen (johnemb) wrote :

Roel,

I have looked briefly into the issue. As in most cases, reducing the test as much as possible helps to understand the issue better. In this case I tried running only a single query with all transformers enabled, and there was a difference of 2 statements (I used the grammar optimizer_no_subquery.yy):

$ grep "transforms" /tmp/test1.log | wc -l
26
$ grep "transforms" /tmp/test1.txt | wc -l
28

Then I did some extra checks with grep, e.g. "grep SELECT", "grep INSERT", "grep CREATE" added to the command lines above, before the "wc -l". After quickly inspecting the output with this kind of filtering, it seemed that something related to triggers was missing, so I re-ran the test using only the ExecuteAsTrigger transformer (--transformer=ExecuteAsTrigger). Then I got:

$ grep "transforms" /tmp/test1.log | wc -l
6
$ grep "transforms" /tmp/test1.txt | wc -l
8

With this few statements, manual inspection is feasible. Two statements were in the general log, but not in the trace:

1) CREATE DATABASE IF NOT EXISTS transforms

This was the very first statement in the general log which contained the string "transforms". The trace did not include this, but it had:

/* GenTest::Transform::ExecuteAsTrigger */ CREATE DATABASE IF NOT EXISTS transforms;
which also the general log had.

It turns out that the latter is from the ExecuteAsTrigger transformer itself, while the fomer (the statement that was missing from the trace) is from GenTest/Validator/Transformer.pm::validate():

 if ($database_created == 0) {
  $executor->dbh()->do("CREATE DATABASE IF NOT EXISTS transforms");
  $database_created = 1;
 }

Here, the code bypasses the executor code by doing $executor->dbh()->do() instead of $executor->execute(). So this explains why it is not included in the trace, as the trace functionality is in the execute() sub. I do not know why the executor is bypassed like this, or if it can/should be changed.

I suspect that the code shown above is only executed once per test run, so it should not contribute that much to a huge difference in counts as shown in the bug description.

2) INSERT INTO transforms.trigger213329 SELECT alias2 . `col_time_key` AS field1 , alias1 . `col_varchar_key` AS field2 , alias2 . `col_datetime_key` AS field3 , alias1 . `col_varchar_key` AS field4 , alias2 . `col_int_key` AS field5 FROM ( CC AS alias1 , CC AS alias2 ) WHERE ( alias1 . `pk` > 122 AND alias1 . `pk` < ( 122 + 103 ) OR alias1 . `col_varchar_key` IN ('USA' , 'm') ) AND ( alias1 . `col_int_nokey` <> alias1 . `col_varchar_nokey` OR alias1 . `col_int_nokey` IS NULL ) GROUP BY field1, field2, field3, field4, field5 ORDER BY field2 DESC

This statement is also missing from the trace. This statement turns out to be the action caused by the trigger itself. The trigger definition looks like this:

CREATE TRIGGER trigger113329 BEFORE INSERT ON trigger113329 FOR EACH ROW INSERT INTO transforms.trigger213329 SELECT alias2 . `col_time_key` AS field1 , alias1 . `col_varchar_key` AS field2 , alias2 . `col_datetime_key` AS field3 , alias1 . `col_varchar_key` AS field4 , alias2 . `col_int_key` AS field5 FROM ( CC AS alias1 , CC AS alias2 ) WHERE ( alias1 . `pk` > 122 AND alias1 . `pk` < ( 122 + 103 ) OR alias1 . `col_varchar_key` IN ('USA' , 'm') ) AND ( alias1 . `col_int_nokey` <> alias1 . `col_varchar_nokey` OR alias1 . `col_int_nokey` IS NULL ) GROUP BY field1, field2, field3, field4, field5 ORDER BY field2 DESC

In other words, "before each insert into trigger113329 do an insert into transforms.trigger213329". This is reflected in the general log, but, as it happens only on the server side, it is not reflected in the RQG trace.

Similarly, we will see that e.g. prepared statements have more "footprint" in the general log than in the RQG trace. For example,

EXECUTE prep_stmt_13655 /* TRANSFORM_OUTCOME_UNORDERED_MATCH */;

in the RQG sql trace results in

9 Query EXECUTE prep_stmt_13655 /* TRANSFORM_OUTCOME_UNORDERED_MATCH */
9 Execute SELECT alias2 . `col_time_key` AS field1 , alias1 . `col_varchar_key` AS field2 , alias2 . `col_datetime_key` AS field3 , alias1 . `col_varchar_key` AS field4 , alias2 . `col_int_key` AS field5 FROM ( CC AS alias1 , CC AS alias2 ) WHERE ( alias1 . `pk` > 122 AND alias1 . `pk` < ( 122 + 103 ) OR alias1 . `col_varchar_key` IN ('USA' , 'm') ) AND ( alias1 . `col_int_nokey` <> alias1 . `col_varchar_nokey` OR alias1 . `col_int_nokey` IS NULL ) GROUP BY field1, field2, field3, field4, field5 ORDER BY field2 DESC

in the general log. There was no difference in "transforms" counts i this latter example, but there may be other situations where the count will be different yet valid.

That said, when I did a run with --queries=1000 and all transformers enabled, I got:

$ grep "transforms" /tmp/test1.log | grep -v "trigger" | wc -l
17274
$ grep "transforms" /tmp/test1.txt | grep -v "trigger" | wc -l
17275

and this difference of 1 is explained by 1) above.

Based on what I know at this point I would recommend closing this bug as "Not a bug". But I will leave some time for further feedback or new evidence, in case I missed something.