DELETE concurrent with index creation causes corruption
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Trafodion |
New
|
Medium
|
Sandhya Sundaresan |
Bug Description
If queries delete rows from a table during CREATE INDEX there is a risk that the index will have more rows than the base table. See the example sqlci session quoted below. Note that the delete happens in the background with no output shown. The test script is attached.
>>
>>obey index_corrupter
>>create table t113b (uniq int not null,
+> c100k int, c10K int , c1K int, c100 int,
+> c10 int, c1 int, primary key (uniq) );
--- SQL operation complete.
>>
>>prepare s1 from upsert using load into t113b select
+>0 + (100000 * x100000) + (10000 * x10000) + (1000 * x1000) +
+> (100 * x100) + (10 * x10) +( 1 * x1),
+>0 + (10000 * x10000) + (1000 * x1000) + (100 * x100) +
+> (10 * x10) +( 1 * x1),
+>0 + (1000 * x1000) + (100 * x100) + (10 * x10) + (1 * x1),
+>0 + (100 * x100) + (10 * x10) + (1 * x1),
+>0 + (10 * x10) + (1 * x1),
+>0 + (1 * x1),
+>0
+>from (values(0)) t
+>transpose 0,1,2,3,4,5,6,7,8,9 as x100000
+>transpose 0,1,2,3,4,5,6,7,8,9 as x10000
+>transpose 0,1,2,3,4,5,6,7,8,9 as x1000
+>transpose 0,1,2,3,4,5,6,7,8,9 as x100
+>transpose 0,1,2,3,4,5,6,7,8,9 as x10
+>transpose 0,1,2,3,4,5,6,7,8,9 as x1;
--- SQL command prepared.
>>
>>explain options 'f' s1;
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------
9 . 10 root 1.00E+007
7 8 9 tuple_flow 1.00E+007
. . 8 trafodion_load T113B 1.00E+000
6 . 7 transpose 1.00E+006
5 . 6 transpose 1.00E+005
4 . 5 transpose 1.00E+004
3 . 4 transpose 1.00E+003
2 . 3 transpose 1.00E+002
1 . 2 transpose 1.00E+001
. . 1 values 1.00E+000
--- SQL operation complete.
>>
>>display qid for s1;
QID is MXID11000015197
QID details:
============
Segment Num: 0
Segment Name:
Cpu: 0
Pin: 15197
ExeStartTime: 212289139259874701= 2015/01/27 17:20:59.874701 LCT
SessionNum: 2
UserName: U33333
SessionName: NULL
QueryNum: 478
StmtName: S1
SessionId: MXID11000015197
>>
>>execute s1;
--- 1000000 row(s) inserted.
>>
>>get statistics for qid current;
Qid MXID11000015197
Compile Start Time 2015/01/27 17:21:18.824433
Compile End Time 2015/01/27 17:21:20.080504
Compile Elapsed Time 0:00:01.256071
Execute Start Time 2015/01/27 17:21:20.124949
Execute End Time 2015/01/27 17:22:24.244243
Execute Elapsed Time 0:01:04.119294
State CLOSE
Rows Affected 1,000,000
SQL Error Code 0
Stats Error Code 0
Query Type SQL_INSERT_
Sub Query Type SQL_STMT_NA
Estimated Accessed Rows 0
Estimated Used Rows 0
Parent Qid NONE
Parent Query System NONE
Child Qid NONE
Number of SQL Processes 1
Number of Cpus 1
Transaction Id -1
Source String upsert using load into t113b select 0 + (100000 * x100000) + (10000 * x10000) + (1000 * x1000) + (100 * x100) + (10 * x10) +( 1 * x1), 0 + (10000 * x10000) + (1000 * x1000) + (100 * x100) + (10 * x10) +( 1 * x1), 0 + (1000 * x1000) + (100 * x100) +
SQL Source Length 613
Rows Returned 0
First Row Returned Time -1
Last Error before AQR 0
Number of AQR retries 0
Delay before AQR 0
No. of times reclaimed 0
Cancel Time -1
Last Suspend Time -1
Stats Collection Type OPERATOR_STATS
SQL Process Busy Time 22,685,983
UDR Process Busy Time 0
SQL Space Allocated 618 KB
SQL Space Used 604 KB
SQL Heap Allocated 142 KB
SQL Heap Used 142 KB
EID Space Allocated 0 KB
EID Space Used 0 KB
EID Heap Allocated 0 KB
EID Heap Used 0 KB
Processes Created 0
Process Create Time 0
Request Message Count 0
Request Message Bytes 0
Reply Message Count 0
Reply Message Bytes 0
Scr. Overflow Mode UNKNOWN
Scr File Count 0
Scr. Buffer Blk Size 0
Scr. Buffer Blks Read 0
Scr. Buffer Blks Written 0
Scr. Read Count 0
Scr. Write Count 0
Table Name
Records Accessed Records Used Hbase Hbase Hbase IO Hbase IO
Estimated/Actual Estimated/Actual IOs IO MBytes Sum Time Max Time
TRAFODION.SCH.T113B
0 1
0 1,000,000 16,669 104 1,875,790 1,875,790
--- SQL operation complete.
>>
>>
>>obey index_corrupter
>>
>>prepare s1 from
+>create index idx2 on t113b(c1k);
--- SQL command prepared.
>>
>>display qid for s1;
QID is MXID11000015197
QID details:
============
Segment Num: 0
Segment Name:
Cpu: 0
Pin: 15197
ExeStartTime: 212289139259874701= 2015/01/27 17:20:59.874701 LCT
SessionNum: 2
UserName: U33333
SessionName: NULL
QueryNum: 524
StmtName: S1
SessionId: MXID11000015197
>>
>>sh sqlci -i"index_
>>
>>execute s1;
--- SQL operation complete.
>>
>>
>>
>>set parserflags 1;
--- SQL operation complete.
>>
>>select count(*) from table(index_table idx2);
(EXPR)
-------
--- 1 row(s) selected.
>>
>>select count(*) from t113b;
(EXPR)
-------
--- 1 row(s) selected.
>>exit;
End of MXCI Session
Changed in trafodion: | |
assignee: | nobody → Mike Hanlon (mike-hanlon) |
Changed in trafodion: | |
importance: | High → Medium |
Changed in trafodion: | |
assignee: | Mike Hanlon (mike-hanlon) → Sandhya Sundaresan (sandhya-sundaresan) |
milestone: | r1.1 → r2.0 |
This has been discussed in email:
From: Sharma, Anoop
Sent: Monday, February 02, 2015 1:15 PM
To: Hanlon, Mike; Sundaresan, Sandhya
Cc: Vasudev, Prashanth
Subject: RE: Index corruption during delete
hi
With ddl xns, an index creation and population will be inside of that transaction and index will be dropped if that creation fails.
So that will not be an issue.
But if index is being populated without being created, then you may run into the pbm Mike is talking about.
Xn will not help since index population is a non-transactional activity. Blowing away opens that were done on behalf of
the concurrent delete will work.
anoop
From: Hanlon, Mike
Sent: Friday, January 30, 2015 9:01 AM
To: Sundaresan, Sandhya; Sharma, Anoop
Cc: Vasudev, Prashanth
Subject: RE: Index corruption during delete
HI Sandhya –
The index population is done while the other sqlci session is doing a series of deletes. I think at first, the deletes do not have index maintenance inlined because there isn’t any index at the time the delete is compiled. The subsequent deletes are satisfied by the compiler query cache until the index is finished, and then the plan is recreated with index maintenance due to query invalidation.
In Seaquest, the index population would have locked the base table, so the deletes would have to wait, and when the index was finished, the open would have been blown away and AQR would cause the delete to be recompiled with index maintenance.
I’m not sure if DDL transaction will help in trafodion. Traf QI has most of the effect of the Seaquest blown-away open, and QI can be enhanced to interrupt executing queries just like blown-away open does. But it seems to me that we need locking too.
thanks
Mike
From: Sundaresan, Sandhya
Sent: Friday, January 30, 2015 10:11 AM
To: Hanlon, Mike; Sharma, Anoop
Cc: Vasudev, Prashanth
Subject: Index corruption during delete
Hi Mike, /bugs.launchpad .net/trafodion/ +bug/1415156
I noticed this LP that you created. https:/
Do you have an idea on what is happening or did you just notice it ?
I am wondering if this will be fixed if we have full ddl transaction support ? Ccing Anoop too for comments. Anything we should do in the interim ?
There are a couple more issues like this one that I’d like to link with the DDL transaction LP. Prashanth , is there an LP number for that ?
Thanks
Sandhya