Scalar Subqueries and their effect on execution plans
Scalar subqueries… I remember Tom extolling their virtues at the UKOUG last year in one of his presentations. They seem like a neat idea, except that they have an unpleasant side effect, which I came across the other day on a production system.
We had a situation where a piece of DML was running slowly and I was asked to take a look at it to see what it was doing. I was told that the query had undergone some modifications recently (can you hear the alarm bells ringing?) but that the execution plan had been checked and it was fine.
I first ran a little script called get_query_progress.sql, which I wrote to get the execution plan from V$SQL_PLAN and link it to the work area and long ops V$ views, to give a better picture of where a query is in its execution plan and, hopefully, an indication of how far through it is. The results showed that the plan was as I had expected for this piece of DML: it's a big operation using parallel query, full table scans and hash joins to process lots of rows in the most efficient way. So why was it going slowly if the execution plan looked good? Well, the progress script wasn't much use in this case, since it just showed the execution plan and there were no long ops to speak of, so I couldn't tell exactly where in the plan it had reached. Time for another tack.
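For anyone wanting to try something similar, here is a minimal sketch of the idea (not the actual get_query_progress.sql script); :sql_addr and :child are hypothetical bind placeholders for the address and child number of the cursor of interest:

```sql
-- The actual plan for a given cursor, in plan-line order.
SELECT p.id, p.operation, p.options, p.object_name, p.cost
FROM   v$sql_plan p
WHERE  p.address      = :sql_addr
AND    p.child_number = :child
ORDER  BY p.id
/
-- Any long-running operations still in flight for the same statement.
SELECT l.sid, l.opname, l.target, l.sofar, l.totalwork
FROM   v$session_longops l
WHERE  l.sql_address = :sql_addr
AND    l.sofar < l.totalwork
/
```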
I then ran a query against V$SESSION_WAIT_HISTORY to see if that would show what we were waiting for. Expecting scattered reads, and perhaps reads and writes on temp as it worked through the hash joins, I was surprised to see lots of sequential reads, which suggested the use of an index – but the execution plan above did not involve any indexes, so this seemed odd.
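The check itself is straightforward – something along these lines, with :sid standing in for the session of interest (for the "db file sequential read" event, P1 and P2 hold the file# and block#):

```sql
-- The last ten waits recorded for the session.
SELECT seq#, event, p1 file#, p2 block#, wait_time
FROM   v$session_wait_history
WHERE  sid = :sid
ORDER  BY seq#
/
```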
Next I figured I'd try to determine what the sequential reads were actually reading, by taking the File# and Block# from the wait information and using this query:
SELECT segment_name, segment_type, owner, tablespace_name
FROM dba_extents
WHERE file_id = &file#
AND &block# BETWEEN block_id AND (block_id + blocks - 1)
/
This showed that the sequential reads were in fact reading an index on one of the tables – but how could this be when the plan showed no sign of that step?
I took the DML and ran an EXPLAIN PLAN on it to see what the intended plan would be, and it showed no sign of this index read either. So that's EXPLAIN PLAN and V$SQL_PLAN (i.e. the actual plan in use) both showing no sign of this index read, and yet it was definitely happening.
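(The usual pattern, for reference – the actual DML isn't reproduced here, so a trivial statement stands in for it:)

```sql
EXPLAIN PLAN FOR
SELECT * FROM dual  -- substitute the statement under investigation
/
SELECT * FROM TABLE(dbms_xplan.display)
/
```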
Looking at the changes to the DML, my colleague Anthony and I noted that the recent modifications had involved adding some new code, including some scalar subqueries on additional tables – tables which were not showing in the intended/actual plans either. We decided to rewrite the query using an outer join approach rather than the scalar subqueries, and the plans then showed the access of these new tables, via full table scans. Executing the DML then resulted in an elapsed time in line with expectations.
I set up a simple test case and ran a 10053 trace on it to see what was in there –
DROP TABLE test1
/
CREATE TABLE test1(col1 number
,col2 number)
/
DROP TABLE test2
/
CREATE TABLE test2(col1 number
,col2 number)
/
BEGIN
FOR r IN 1..100000 LOOP
INSERT INTO test1 VALUES(r,100000+r);
IF MOD(r,2)=0 THEN
INSERT INTO test2 VALUES(100000+r,200000+r);
END IF;
END LOOP;
COMMIT;
END;
/
exec dbms_stats.gather_table_stats(ownname => USER,tabname => 'TEST1',estimate_percent => 100);
exec dbms_stats.gather_table_stats(ownname => USER,tabname => 'TEST2',estimate_percent => 100);
SELECT /*+ OUTER JOIN APPROACH */
COUNT(1) FROM (
SELECT t1.col1
, t1.col2
, t2.col2
FROM test1 t1
, test2 t2
WHERE t1.col2 = t2.col1(+)
ORDER BY t1.col1
)
/
alter session set events '10053 trace name context forever';
SELECT /*+ SCALAR SUBQUERY APPROACH */
COUNT(1) FROM (
SELECT t1.col1
, t1.col2
, (SELECT t2.col2
FROM test2 t2
WHERE t1.col2 = t2.col1
) col2
FROM test1 t1
ORDER BY t1.col1
)
/
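(If you want to find the resulting trace file, one way – an assumption on my part, not part of the original test – is to check where user_dump_dest points and look for the most recent trace file there:)

```sql
SELECT value
FROM   v$parameter
WHERE  name = 'user_dump_dest'
/
```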
In the trace file we first see the scalar subquery being evaluated…
****************
QUERY BLOCK TEXT
****************
SELECT t2.col2
FROM test2 t2
WHERE t1.col2 = t2.col1
*********************
QUERY BLOCK SIGNATURE
*********************
qb name was generated
signature (optimizer): qb_name=SEL$3 nbfros=1 flg=0
fro(0): flg=0 objn=55825 hint_alias="T2"@"SEL$3"
… then the costs for this subquery:
*********************************
Number of join permutations tried: 1
*********************************
Final - First Rows Plan: Best join order: 1
Cost: 27.6270 Degree: 1 Card: 1.0000 Bytes: 9
Resc: 27.6270 Resc_io: 26.0000 Resc_cpu: 10783378
Resp: 27.6270 Resp_io: 26.0000 Resp_cpu: 10783378
…then it works on the whole query:
****************
QUERY BLOCK TEXT
****************
SELECT /*+ SCALAR SUBQUERY APPROACH */
COUNT(1) FROM (
SELECT t1.col1
, t1.col2
, (SELECT t2.col2
FROM test2 t2
WHERE t1.col2 = t2.col1
) col2
FROM test1 t1
ORDER BY t1.col1
)
*********************
QUERY BLOCK SIGNATURE
*********************
qb name was generated
signature (optimizer): qb_name=SEL$51F12574 nbfros=1 flg=0
fro(0): flg=0 objn=55824 hint_alias="T1"@"SEL$2"
…the costs for which are:
*********************************
Number of join permutations tried: 1
*********************************
Final - All Rows Plan: Best join order: 1
Cost: 57.8271 Degree: 1 Card: 100000.0000 Bytes: 500000
Resc: 57.8271 Resc_io: 55.0000 Resc_cpu: 18737631
Resp: 57.8271 Resp_io: 55.0000 Resp_cpu: 18737631
And the (incomplete) plan it thinks it’s running:
============
Plan Table
============
------------------------------------------+-----------------------------------+
| Id | Operation           | Name   | Rows  | Bytes | Cost  | Time      |
------------------------------------------+-----------------------------------+
| 0  | SELECT STATEMENT    |        |       |       |    58 |           |
| 1  |  SORT AGGREGATE     |        |     1 |     5 |       |           |
| 2  |   TABLE ACCESS FULL | TEST1  |   98K |  488K |    58 |  00:00:01 |
------------------------------------------+-----------------------------------+
So, I guess you get more information in the 10053 trace file, including the breakdown for the subquery (or subqueries) – so in the case I was working on, that would have highlighted the use of the new table and its access path being (inefficiently) via an index. It still gets the overall plan wrong, though.
After a bit of reading, Anthony discovered that Tom had mentioned this issue of execution plans not reflecting actual processing in his Effective Oracle by Design book (pp. 504–514). Tom remarked that it was an issue in 9iR2 and that he hoped it would be fixed in a future release – well, that release is not 10gR2, since we're still seeing this anomaly.