Select literal causes the next select empty string to return junk

Bug #1408148 reported by Weishiun Tsai
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Trafodion
Fix Released
Critical
Howard Qin

Bug Description

As shown in the following example. The 1st query selects a literal with 10 characters. It causes the 2nd query, which selects '', to return rows with junk. It only happens when the 2 queries are executed together. The problem seems to go away if the 1st query is commented out.

This is seen on the v0105_0830 build installed on a cluster. As the nature of junk result often is, it can’t be reproduced 100%, but it does occur often enough. It is likely to be a regression introduced sometime between the v1218_0830 daily build and the v0105_0830 daily build.

---------------------------

Here is the entire script to reproduce it:

create schema mytest;
set schema mytest;
create table t (a int);
insert into t values (1),(2);
select '0123456789' from t;
select '' from t;
drop schema mytest cascade;

---------------------------

Here is the execution output:

[trafodion@n009 bin]$ sqlci
Trafodion Conversational Interface 0.9.1
(c) Copyright 2014 Hewlett-Packard Development Company, LP.
>>obey mytest.sql;
>>create schema mytest;

--- SQL operation complete.
>>set schema mytest;

--- SQL operation complete.
>>create table t (a int);

--- SQL operation complete.
>>insert into t values (1),(2);

--- 2 row(s) inserted.
>>select '0123456789' from t;

(EXPR)
----------

0123456789
0123456789

--- 2 row(s) selected.
>>select '' from t;

(EXPR)
----------

23456789
23456789

--- 2 row(s) selected.
>>drop schema mytest cascade;

--- SQL operation complete.

---------------------------

Here is the execution output showing that if the 1st query is commented out, the 2nd query returns correct empty strings.

[trafodion@n009 bin]$ sqlci
Trafodion Conversational Interface 0.9.1
(c) Copyright 2014 Hewlett-Packard Development Company, LP.
>>obey mytest.sql;
>>create schema mytest;

--- SQL operation complete.
>>set schema mytest;

--- SQL operation complete.
>>create table t (a int);

--- SQL operation complete.
>>insert into t values (1),(2);

--- 2 row(s) inserted.
>>-- select '0123456789' from t;
>>select '' from t;

(EXPR)
------

--- 2 row(s) selected.
>>drop schema mytest cascade;

--- SQL operation complete.

Tags: sql-cmp
Revision history for this message
Anoop Sharma (anoop-sharma) wrote :

Looks like something related to query cache. Maybe the hybrid cache changes are causing it.
Turning off query_cache does not show this issue.

This may be related to how varchar literals are being handled during query cache.
The literal in second select is a string of length zero which is represented as a varchar.
If that literal is changed to one blank, then this issue does not show up since one blank
is treated as a fixed char of length 1.
The output also doesn't contain the first 2 bytes which is being treated as the length
bytes of a varchar.

Forwarding to compiler group to take the first look.

/opt/home/ansharma/ansharma_lp_fixes/core/sqf/rundir/seabase:sqlci
Trafodion Conversational Interface 0.9.1
(c) Copyright 2014 Hewlett-Packard Development Company, LP.
>>select '0123456789' from t;

(EXPR)
----------

0123456789
0123456789

--- 2 row(s) selected.
>>select '' from t;

(EXPR)
----------

23456789
23456789

--- 2 row(s) selected.
>>cqd query_cache '0';

--- SQL operation complete.
>>select '' from t;

(EXPR)
------

--- 2 row(s) selected.
>>

****Run with one blank in the second select which works*****

/opt/home/ansharma/ansharma_lp_fixes/core/sqf/rundir/seabase:sqlci
Trafodion Conversational Interface 0.9.1
(c) Copyright 2014 Hewlett-Packard Development Company, LP.

>>select '1234' from t;

(EXPR)
------

1234
1234

--- 2 row(s) selected.
>>select ' ' from t;

(EXPR)
------

--- 2 row(s) selected.
>>

tags: added: sql-cmp
removed: sql-exe
Changed in trafodion:
assignee: nobody → QF Chen (qifan-chen)
Revision history for this message
Weishiun Tsai (wei-shiun-tsai) wrote :

Here is another script to reproduce a problem which is likely to be caused by this same issue. Notice that query 2 and query 4 return wrong results when query cache is on. But it returns correct results once the cqd query_cache is set to '0'. This problem is causing a lot of queries to return wrong results. The bug report is now bumped to Critical as it is a data correctness issue:

--------------------------------------------------

Here is the script to reproduce it:

create schema mytest;
set schema mytest;
create table t (a char(10), b char(10));
insert into t values ('BOOK','row1'), ('book','row2');
select * from t;

-- query 1: should return row1
select b from t where a like 'BOO_%';
-- query 2: should return row2
select b from t where a like 'boo_%';
-- query 3: should return row1
select b from t where a like 'BOO%';
-- query 4: should return row2
select b from t where a like 'boo%';

cqd query_cache '0';

-- query 1: should return row1
select b from t where a like 'BOO_%';
-- query 2: should return row2
select b from t where a like 'boo_%';
-- query 3: should return row1
select b from t where a like 'BOO%';
-- query 4: should return row2
select b from t where a like 'boo%';

drop schema mytest cascade;

--------------------------------------------------

>>create schema mytest;

--- SQL operation complete.
>>set schema mytest;

--- SQL operation complete.
>>create table t (a char(10), b char(10));

--- SQL operation complete.
>>insert into t values ('BOOK','row1'), ('book','row2');

--- 2 row(s) inserted.
>>select * from t;

A B
---------- ----------

BOOK row1
book row2

--- 2 row(s) selected.
>>
>>-- query 1: should return row1
>>select b from t where a like 'BOO_%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 2: should return row2
>>select b from t where a like 'boo_%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 3: should return row1
>>select b from t where a like 'BOO%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 4: should return row2
>>select b from t where a like 'boo%';

B
----------

row1

--- 1 row(s) selected.
>>
>>cqd query_cache '0';

--- SQL operation complete.
>>
>>-- query 1: should return row1
>>select b from t where a like 'BOO_%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 2: should return row2
>>select b from t where a like 'boo_%';

B
----------

row2

--- 1 row(s) selected.
>>-- query 3: should return row1
>>select b from t where a like 'BOO%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 4: should return row2
>>select b from t where a like 'boo%';

B
----------

row2

--- 1 row(s) selected.
>>
>>drop schema mytest cascade;

--- SQL operation complete.

Changed in trafodion:
importance: High → Critical
Revision history for this message
Weishiun Tsai (wei-shiun-tsai) wrote :

This problem can probably be narrowed down from the generic query cache to hybrid query cache, as setting the cqd hybrid_query_cache 'off' also make the queries return correct results.

>>create schema mytest;

--- SQL operation complete.
>>set schema mytest;

--- SQL operation complete.
>>create table t (a char(10), b char(10));

--- SQL operation complete.
>>insert into t values ('BOOK','row1'), ('book','row2');

--- 2 row(s) inserted.
>>select * from t;

A B
---------- ----------

BOOK row1
book row2

--- 2 row(s) selected.
>>
>>-- query 1: should return row1
>>select b from t where a like 'BOO_%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 2: should return row2
>>select b from t where a like 'boo_%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 3: should return row1
>>select b from t where a like 'BOO%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 4: should return row2
>>select b from t where a like 'boo%';

B
----------

row1

--- 1 row(s) selected.
>>
>>cqd hybrid_query_cache 'off';

--- SQL operation complete.
>>
>>-- query 1: should return row1
>>select b from t where a like 'BOO_%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 2: should return row2
>>select b from t where a like 'boo_%';

B
----------

row2

--- 1 row(s) selected.
>>-- query 3: should return row1
>>select b from t where a like 'BOO%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 4: should return row2
>>select b from t where a like 'boo%';

B
----------

row2

--- 1 row(s) selected.
>>
>>drop schema mytest cascade;

--- SQL operation complete.
>>
>>cqd hybrid_query_cache 'off';

QF Chen (qifan-chen)
Changed in trafodion:
assignee: QF Chen (qifan-chen) → Howard Qin (hao-qin)
Revision history for this message
QF Chen (qifan-chen) wrote :

the fix to LP1408148 is with the following two lines (in red, commented out) in QCache.cpp.

Regardless of the length of the source string, we should stick with the protocol to represent a VARCHAR. In particular, the varchar length field has to be present. This is the reason why after commenting out the two lines, the query works.

1391 Lng32 targetLen = targetType->getNominalSize();
   1392 Lng32 sourceScale = constVal->getType()->getScale();
   1393 Lng32 targetScale = targetType->getScale();
   1394 Lng32 varCharLenSize = 0;
   1395 char* varCharLen = NULL;
   1396
   1397 if ((targetType->getFSDatatype() >= REC_MIN_NUMERIC) and (targetType->getFSDatatype() <= REC_MAX_FLOAT))
   1398 {
   1399 Lng32 extraBuffer = targetLen - (offset % targetLen);
   1400 if (extraBuffer != targetLen)
   1401 offset += extraBuffer;
   1402 }
   1403
   1404 if (DFS2REC::isAnyVarChar(targetType->getFSDatatype()))
   1405 {
   1406 varCharLenSize = targetType->getVarLenHdrSize();
   1407 // align on a 2-byte since this is an integer
   1408 offset += (offset % varCharLenSize);
   1409 varCharLen = (char*)(targetBugPtr+offset);
   1410 offset += varCharLenSize;
   1411
   1412 // is this an empty string
   1413 //if (constVal->isEmptyString()) <===========
   1414 // varCharLenSize = 0; <===========
   1415 }
   1416
   1417 if (DFS2REC::isAnyCharacter(targetType->getFSDatatype()))
   1418 {
   1419 sourceScale= constVal->getType()->getCharSet();
   1420 targetScale= targetType->getCharSet();
   1421 }
   1422

Changed in trafodion:
status: New → In Progress
Howard Qin (hao-qin)
Changed in trafodion:
status: In Progress → Fix Committed
Revision history for this message
Weishiun Tsai (wei-shiun-tsai) wrote :

The way Hybrid Query Cache handles literals is still problematic. But the 2 scenarios described in this bug report are fixed. Instead of keep adding new scenarios to this bug report, a new bug report https://bugs.launchpad.net/trafodion/+bug/1409863 'Hybrid Query Cache causes POSITON to return wrong results' has been created. This one will be closed. Here are the execution results of the 2 scenarios described in this bug report from the v0110 build:

------------------------------------------------------------------

>>set schema mytest;

--- SQL operation complete.
>>create table t (a int);

--- SQL operation complete.
>>insert into t values (1),(2);

--- 2 row(s) inserted.
>>select '0123456789' from t;

(EXPR)
----------

0123456789
0123456789

--- 2 row(s) selected.
>>select '' from t;

(EXPR)
----------

--- 2 row(s) selected.
>>drop schema mytest cascade;

------------------------------------------------------------------

>>create schema mytest;

--- SQL operation complete.
>>set schema mytest;

--- SQL operation complete.
>>create table t (a char(10), b char(10));

--- SQL operation complete.
>>insert into t values ('BOOK','row1'), ('book','row2');

--- 2 row(s) inserted.
>>select * from t;

A B
---------- ----------

BOOK row1
book row2

--- 2 row(s) selected.
>>
>>-- query 1: should return row1
>>select b from t where a like 'BOO_%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 2: should return row2
>>select b from t where a like 'boo_%';

B
----------

row2

--- 1 row(s) selected.
>>-- query 3: should return row1
>>select b from t where a like 'BOO%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 4: should return row2
>>select b from t where a like 'boo%';

B
----------

row2

--- 1 row(s) selected.
>>
>>cqd hybrid_query_cache 'off';

--- SQL operation complete.
>>
>>-- query 1: should return row1
>>select b from t where a like 'BOO_%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 2: should return row2
>>select b from t where a like 'boo_%';

B
----------

row2

--- 1 row(s) selected.
>>-- query 3: should return row1
>>select b from t where a like 'BOO%';

B
----------

row1

--- 1 row(s) selected.
>>-- query 4: should return row2
>>select b from t where a like 'boo%';

B
----------

row2

--- 1 row(s) selected.
>>
>>drop schema mytest cascade;

--- SQL operation complete.

Changed in trafodion:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.