DELETE concurrent with index creation causes corruption

Bug #1415156 reported by Mike Hanlon
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Trafodion
New
Medium
Sandhya Sundaresan

Bug Description

If queries delete rows from a table during CREATE INDEX there is a risk that the index will have more rows than the base table. See the example sqlci session quoted below. Note that the delete happens in the background with no output shown. The test script is attached.

>>
>>obey index_corrupter_traf(cr_table);
>>create table t113b (uniq int not null,
+> c100k int, c10K int , c1K int, c100 int,
+> c10 int, c1 int, primary key (uniq) );

--- SQL operation complete.
>>
>>prepare s1 from upsert using load into t113b select
+>0 + (100000 * x100000) + (10000 * x10000) + (1000 * x1000) +
+> (100 * x100) + (10 * x10) +( 1 * x1),
+>0 + (10000 * x10000) + (1000 * x1000) + (100 * x100) +
+> (10 * x10) +( 1 * x1),
+>0 + (1000 * x1000) + (100 * x100) + (10 * x10) + (1 * x1),
+>0 + (100 * x100) + (10 * x10) + (1 * x1),
+>0 + (10 * x10) + (1 * x1),
+>0 + (1 * x1),
+>0
+>from (values(0)) t
+>transpose 0,1,2,3,4,5,6,7,8,9 as x100000
+>transpose 0,1,2,3,4,5,6,7,8,9 as x10000
+>transpose 0,1,2,3,4,5,6,7,8,9 as x1000
+>transpose 0,1,2,3,4,5,6,7,8,9 as x100
+>transpose 0,1,2,3,4,5,6,7,8,9 as x10
+>transpose 0,1,2,3,4,5,6,7,8,9 as x1;

--- SQL command prepared.
>>
>>explain options 'f' s1;

LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------

9 . 10 root 1.00E+007
7 8 9 tuple_flow 1.00E+007
. . 8 trafodion_load T113B 1.00E+000
6 . 7 transpose 1.00E+006
5 . 6 transpose 1.00E+005
4 . 5 transpose 1.00E+004
3 . 4 transpose 1.00E+003
2 . 3 transpose 1.00E+002
1 . 2 transpose 1.00E+001
. . 1 values 1.00E+000

--- SQL operation complete.
>>
>>display qid for s1;
QID is MXID11000015197212289139259874701000000000206U3333300_478_S1

QID details:
============
  Segment Num: 0
  Segment Name:
  Cpu: 0
  Pin: 15197
  ExeStartTime: 212289139259874701= 2015/01/27 17:20:59.874701 LCT
  SessionNum: 2
  UserName: U33333
  SessionName: NULL
  QueryNum: 478
  StmtName: S1
  SessionId: MXID11000015197212289139259874701000000000206U3333300

>>
>>execute s1;

--- 1000000 row(s) inserted.
>>
>>get statistics for qid current;
Qid MXID11000015197212289139259874701000000000206U3333300_478_S1
Compile Start Time 2015/01/27 17:21:18.824433
Compile End Time 2015/01/27 17:21:20.080504
Compile Elapsed Time 0:00:01.256071
Execute Start Time 2015/01/27 17:21:20.124949
Execute End Time 2015/01/27 17:22:24.244243
Execute Elapsed Time 0:01:04.119294
State CLOSE
Rows Affected 1,000,000
SQL Error Code 0
Stats Error Code 0
Query Type SQL_INSERT_NON_UNIQUE
Sub Query Type SQL_STMT_NA
Estimated Accessed Rows 0
Estimated Used Rows 0
Parent Qid NONE
Parent Query System NONE
Child Qid NONE
Number of SQL Processes 1
Number of Cpus 1
Transaction Id -1
Source String upsert using load into t113b select 0 + (100000 * x100000) + (10000 * x10000) + (1000 * x1000) + (100 * x100) + (10 * x10) +( 1 * x1), 0 + (10000 * x10000) + (1000 * x1000) + (100 * x100) + (10 * x10) +( 1 * x1), 0 + (1000 * x1000) + (100 * x100) +
SQL Source Length 613
Rows Returned 0
First Row Returned Time -1
Last Error before AQR 0
Number of AQR retries 0
Delay before AQR 0
No. of times reclaimed 0
Cancel Time -1
Last Suspend Time -1
Stats Collection Type OPERATOR_STATS
SQL Process Busy Time 22,685,983
UDR Process Busy Time 0
SQL Space Allocated 618 KB
SQL Space Used 604 KB
SQL Heap Allocated 142 KB
SQL Heap Used 142 KB
EID Space Allocated 0 KB
EID Space Used 0 KB
EID Heap Allocated 0 KB
EID Heap Used 0 KB
Processes Created 0
Process Create Time 0
Request Message Count 0
Request Message Bytes 0
Reply Message Count 0
Reply Message Bytes 0
Scr. Overflow Mode UNKNOWN
Scr File Count 0
Scr. Buffer Blk Size 0
Scr. Buffer Blks Read 0
Scr. Buffer Blks Written 0
Scr. Read Count 0
Scr. Write Count 0

Table Name
   Records Accessed Records Used Hbase Hbase Hbase IO Hbase IO
   Estimated/Actual Estimated/Actual IOs IO MBytes Sum Time Max Time
TRAFODION.SCH.T113B
                  0 1
                  0 1,000,000 16,669 104 1,875,790 1,875,790

--- SQL operation complete.
>>
>>
>>obey index_corrupter_traf(cr_index);
>>
>>prepare s1 from
+>create index idx2 on t113b(c1k);

--- SQL command prepared.
>>
>>display qid for s1;
QID is MXID11000015197212289139259874701000000000206U3333300_524_S1

QID details:
============
  Segment Num: 0
  Segment Name:
  Cpu: 0
  Pin: 15197
  ExeStartTime: 212289139259874701= 2015/01/27 17:20:59.874701 LCT
  SessionNum: 2
  UserName: U33333
  SessionName: NULL
  QueryNum: 524
  StmtName: S1
  SessionId: MXID11000015197212289139259874701000000000206U3333300

>>
>>sh sqlci -i"index_corrupter_traf(uniq_deletes)" >/dev/null &;
>>
>>execute s1;

--- SQL operation complete.
>>
>>
>>
>>set parserflags 1;

--- SQL operation complete.
>>
>>select count(*) from table(index_table idx2);

(EXPR)
--------------------

              999960

--- 1 row(s) selected.
>>
>>select count(*) from t113b;

(EXPR)
--------------------

              999574

--- 1 row(s) selected.
>>exit;

End of MXCI Session

Tags: sql-exe
Revision history for this message
Mike Hanlon (mike-hanlon) wrote :
Changed in trafodion:
assignee: nobody → Mike Hanlon (mike-hanlon)
Revision history for this message
Mike Hanlon (mike-hanlon) wrote :

This has been discussed in email:

From: Sharma, Anoop
Sent: Monday, February 02, 2015 1:15 PM
To: Hanlon, Mike; Sundaresan, Sandhya
Cc: Vasudev, Prashanth
Subject: RE: Index corruption during delete

hi
  With ddl xns, an index creation and population will be inside of that transaction and index will be dropped if that creation fails.
So that will not be an issue.

 But if index is being populated without being created, then you may run into the pbm Mike is talking about.
Xn will not help since index population is a non-transactional activity. Blowing away opens that were done on behalf of
the concurrent delete will work.

anoop

From: Hanlon, Mike
Sent: Friday, January 30, 2015 9:01 AM
To: Sundaresan, Sandhya; Sharma, Anoop
Cc: Vasudev, Prashanth
Subject: RE: Index corruption during delete

HI Sandhya –
The index population is done while the other sqlci session is doing a series of deletes. I think at first, the deletes do not have index maintenance inlined because there isn’t any index at the time the delete is compiled. The subsequent deletes are satisfied by the compiler query cache until the index is finished, and then the plan is recreated with index maintenance due to query invalidation.

In Seaquest, the index population would have locked the base table, so the deletes would have to wait, and when the index was finished, the open would have been blown away and AQR would cause the delete to be recompiled with index maintenance.

I’m not sure if DDL transaction will help in trafodion. Traf QI has most of the effect of the Seaquest blown-away open, and QI can be enhanced to interrupt executing queries just like blown-away open does. But it seems to me that we need locking too.
thanks
Mike

From: Sundaresan, Sandhya
Sent: Friday, January 30, 2015 10:11 AM
To: Hanlon, Mike; Sharma, Anoop
Cc: Vasudev, Prashanth
Subject: Index corruption during delete

Hi Mike,
I noticed this LP that you created. https://bugs.launchpad.net/trafodion/+bug/1415156
Do you have an idea on what is happening or did you just notice it ?
I am wondering if this will be fixed if we have full ddl transaction support ? Ccing Anoop too for comments. Anything we should do in the interim ?

There are a couple more issues like this one that I’d like to link with the DDL transaction LP. Prashanth , is there an LP number for that ?
Thanks
Sandhya

Revision history for this message
Mike Hanlon (mike-hanlon) wrote :

As discuss in the previous comment, it may be possible to prevent the corruption by extending QI to cause the EXE of the DELETE to blow-away the open. Specifically, the EXE could check whether its query has been invalidated before committing its transaction and if so, raise an error. In many cases, the DELETE can be AQR'd with a recompiled plan which will contain theindex maintenance.

Changed in trafodion:
importance: High → Medium
Changed in trafodion:
assignee: Mike Hanlon (mike-hanlon) → Sandhya Sundaresan (sandhya-sundaresan)
milestone: r1.1 → r2.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.