Thursday, July 17, 2008

Analytics to the rescue - followup

I thought I'd do a follow-up post to yesterday's "Analytics to the rescue" post as the discussion in the comments brought up a couple of things that needed more info than could be easily included in the comment section. So...

What you'll see below is a more complete text of the tests I executed that were discussed in yesterday's post. First, the original query, then for the rewritten query. What I show for each is the actual STAT lines from the 10046 trace data, the EXPLAIN PLAN and some additional stats (collected with a SQL test harness set of scripts I use).

Test for original query

select utrsotp_desc
from utrsotp, ucbsvco
where ucbsvco_sotp_code = utrsotp_code
and ucbsvco_dispatch_date =
(select max(ucbsvco_dispatch_date)
from ucbsvco
where ucbsvco_cust_code = 1320908
and ucbsvco_prem_code = '507601'
and ucbsvco_stus_code = 'C'
and ucbsvco_dispatch_ind is not null)
and ucbsvco_stus_code = 'C'
and ucbsvco_dispatch_ind is not null


STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='NESTED LOOPS (cr=106693 r=59924 w=0 time=19518419 us)'
STAT #3 id=2 cnt=1 pid=1 pos=1 obj=74130 op='TABLE ACCESS FULL UCBSVCO (cr=106691 r=59924 w=0 time=19518381 us)'
STAT #3 id=3 cnt=1 pid=2 pos=2 obj=0 op='SORT AGGREGATE (cr=13 r=0 w=0 time=145 us)'
STAT #3 id=4 cnt=2 pid=3 pos=1 obj=74130 op='TABLE ACCESS BY INDEX ROWID UCBSVCO (cr=13 r=0 w=0 time=138 us)'
STAT #3 id=5 cnt=9 pid=4 pos=1 obj=82187 op='INDEX RANGE SCAN UCBSVCO_CUST_INDEX (cr=3 r=0 w=0 time=45 us)'
STAT #3 id=6 cnt=1 pid=1 pos=2 obj=74368 op='TABLE ACCESS BY INDEX ROWID UTRSOTP (cr=2 r=0 w=0 time=22 us)'
STAT #3 id=7 cnt=1 pid=6 pos=1 obj=81903 op='INDEX UNIQUE SCAN UTRSOTP_KEY_INDEX (cr=1 r=0 w=0 time=12 us)'


Explain Plan
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 38 | 1768 |
| 1 | NESTED LOOPS | | 1 | 38 | 1768 |
|* 2 | TABLE ACCESS FULL | UCBSVCO | 1 | 12 | 1767 |
| 3 | SORT AGGREGATE | | 1 | 20 | |
|* 4 | TABLE ACCESS BY INDEX ROWID| UCBSVCO | 1 | 20 | 1 |
|* 5 | INDEX RANGE SCAN | UCBSVCO_CUST_INDEX | 8 | | 1 |
| 6 | TABLE ACCESS BY INDEX ROWID | UTRSOTP | 1 | 26 | 1 |
|* 7 | INDEX UNIQUE SCAN | UTRSOTP_KEY_INDEX | 1 | | |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("UCBSVCO"."UCBSVCO_STUS_CODE"='C' AND "UCBSVCO"."UCBSVCO_DISPATCH_IND"
IS NOT NULL AND "UCBSVCO"."UCBSVCO_DISPATCH_DATE"= (SELECT /*+ */
MAX("UCBSVCO"."UCBSVCO_DISPATCH_DATE") FROM "UIMSMGR"."UCBSVCO" "UCBSVCO" WHERE
"UCBSVCO"."UCBSVCO_CUST_CODE"=1320908 AND "UCBSVCO"."UCBSVCO_PREM_CODE"='507601' AND
"UCBSVCO"."UCBSVCO_STUS_CODE"='C' AND "UCBSVCO"."UCBSVCO_DISPATCH_IND" IS NOT NULL))
4 - filter("UCBSVCO"."UCBSVCO_PREM_CODE"='507601' AND
"UCBSVCO"."UCBSVCO_STUS_CODE"='C' AND "UCBSVCO"."UCBSVCO_DISPATCH_IND" IS NOT NULL)
5 - access("UCBSVCO"."UCBSVCO_CUST_CODE"=1320908)
7 - access("UCBSVCO"."UCBSVCO_SOTP_CODE"="UTRSOTP"."UTRSOTP_CODE")


Statistics Snapshot

Type Statistic Name Value
----- ---------------------------------------- --------------
Latch cache buffers chains 2,142,103
library cache 116,725
row cache objects 19,592

Stats buffer is pinned count 8
consistent gets 106,693
db block gets 56
execute count 16
parse count (hard) 1
physical reads 59,924
physical writes 0
redo size 9,524
session logical reads 106,749
session pga memory 594,008
session pga memory max 0
sorts (disk) 0
sorts (memory) 0
sorts (rows) 0
table fetch by rowid 10
table scan blocks gotten 106,669

Time elapsed time (centiseconds) 2,016



Test for rewritten query

select utrsotp_desc
from utrsotp,
(
select ucbsvco_sotp_code sotp_cd, ucbsvco_dispatch_date dispatch_dt,
max(ucbsvco_dispatch_date) over (partition by ucbsvco_cust_code, ucbsvco_prem_code) max_dispatch_dt
from ucbsvco
where ucbsvco_cust_code = 1320908
and ucbsvco_prem_code = '507601'
and ucbsvco_stus_code = 'C'
and ucbsvco_dispatch_ind is not null
) svco
where svco.dispatch_dt = svco.max_dispatch_dt
and svco.sotp_cd = utrsotp_code


STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='NESTED LOOPS (cr=15 r=0 w=0 time=249 us)'
STAT #3 id=2 cnt=1 pid=1 pos=1 obj=0 op='VIEW (cr=13 r=0 w=0 time=228 us)'
STAT #3 id=3 cnt=2 pid=2 pos=1 obj=0 op='WINDOW BUFFER (cr=13 r=0 w=0 time=218 us)'
STAT #3 id=4 cnt=2 pid=3 pos=1 obj=74130 op='TABLE ACCESS BY INDEX ROWID UCBSVCO (cr=13 r=0 w=0 time=145 us)'
STAT #3 id=5 cnt=9 pid=4 pos=1 obj=82187 op='INDEX RANGE SCAN UCBSVCO_CUST_INDEX (cr=3 r=0 w=0 time=37 us)'
STAT #3 id=6 cnt=1 pid=1 pos=2 obj=74368 op='TABLE ACCESS BY INDEX ROWID UTRSOTP (cr=2 r=0 w=0 time=16 us)'
STAT #3 id=7 cnt=1 pid=6 pos=1 obj=81903 op='INDEX UNIQUE SCAN UTRSOTP_KEY_INDEX (cr=1 r=0 w=0 time=9 us)'


Explain Plan
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 48 | 4 |
| 1 | NESTED LOOPS | | 1 | 48 | 4 |
|* 2 | VIEW | | 1 | 22 | 3 |
| 3 | WINDOW BUFFER | | 1 | 25 | 3 |
|* 4 | TABLE ACCESS BY INDEX ROWID| UCBSVCO | 1 | 25 | 1 |
|* 5 | INDEX RANGE SCAN | UCBSVCO_CUST_INDEX | 8 | | 1 |
| 6 | TABLE ACCESS BY INDEX ROWID | UTRSOTP | 1 | 26 | 1 |
|* 7 | INDEX UNIQUE SCAN | UTRSOTP_KEY_INDEX | 1 | | |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("SVCO"."DISPATCH_DT"="SVCO"."MAX_DISPATCH_DT")
4 - filter("UCBSVCO"."UCBSVCO_PREM_CODE"='507601' AND
"UCBSVCO"."UCBSVCO_STUS_CODE"='C' AND "UCBSVCO"."UCBSVCO_DISPATCH_IND" IS NOT NULL)
5 - access("UCBSVCO"."UCBSVCO_CUST_CODE"=1320908)
7 - access("SVCO"."SOTP_CD"="UTRSOTP"."UTRSOTP_CODE")


Statistics Snapshot

Type Statistic Name Value
----- ---------------------------------------- --------------
Latch cache buffers chains 9,098
library cache 3,608
row cache objects 370

Stats buffer is pinned count 8
consistent gets 15
db block gets 53
execute count 15
parse count (hard) 1
physical reads 0
physical writes 0
redo size 9,008
session logical reads 68
session pga memory 0
session pga memory max 0
sorts (disk) 0
sorts (memory) 1
sorts (rows) 2
table fetch by rowid 10
table scan blocks gotten 0

Time elapsed time (centiseconds) 18


Difference Report

TYPE NAME original rewrite DIFFERENCE
----- --------------------------------------- --------------- --------------- ---------------
Latch cache buffers chains 2142103 9098 2133005
library cache 116725 3608 113117
row cache objects 19592 370 19222

Stats buffer is pinned count 8 8 0
consistent gets 106693 15 106678
db block gets 56 53 3
execute count 16 15 1
parse count (hard) 1 1 0
physical reads 59924 0 59924
physical writes 0 0 0
redo size 9524 9008 516
session logical reads 106749 68 106681
session pga memory 594008 0 594008
session pga memory max 0 0 0
sorts (disk) 0 0 0
sorts (memory) 0 1 -1
sorts (rows) 0 2 -2
table fetch by rowid 10 10 0
table scan blocks gotten 106669 0 106669


Time elapsed time (centiseconds) 2016 18 1998


Original

STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='NESTED LOOPS (cr=106693 r=59924 w=0 time=19518419 us)'
STAT #3 id=2 cnt=1 pid=1 pos=1 obj=74130 op='TABLE ACCESS FULL UCBSVCO (cr=106691 r=59924 w=0 time=19518381 us)'
STAT #3 id=3 cnt=1 pid=2 pos=2 obj=0 op='SORT AGGREGATE (cr=13 r=0 w=0 time=145 us)'
STAT #3 id=4 cnt=2 pid=3 pos=1 obj=74130 op='TABLE ACCESS BY INDEX ROWID UCBSVCO (cr=13 r=0 w=0 time=138 us)'
STAT #3 id=5 cnt=9 pid=4 pos=1 obj=82187 op='INDEX RANGE SCAN UCBSVCO_CUST_INDEX (cr=3 r=0 w=0 time=45 us)'
STAT #3 id=6 cnt=1 pid=1 pos=2 obj=74368 op='TABLE ACCESS BY INDEX ROWID UTRSOTP (cr=2 r=0 w=0 time=22 us)'
STAT #3 id=7 cnt=1 pid=6 pos=1 obj=81903 op='INDEX UNIQUE SCAN UTRSOTP_KEY_INDEX (cr=1 r=0 w=0 time=12 us)'


Rewrite

STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='NESTED LOOPS (cr=15 r=0 w=0 time=249 us)'
STAT #3 id=2 cnt=1 pid=1 pos=1 obj=0 op='VIEW (cr=13 r=0 w=0 time=228 us)'
STAT #3 id=3 cnt=2 pid=2 pos=1 obj=0 op='WINDOW BUFFER (cr=13 r=0 w=0 time=218 us)'
STAT #3 id=4 cnt=2 pid=3 pos=1 obj=74130 op='TABLE ACCESS BY INDEX ROWID UCBSVCO (cr=13 r=0 w=0 time=145 us)'
STAT #3 id=5 cnt=9 pid=4 pos=1 obj=82187 op='INDEX RANGE SCAN UCBSVCO_CUST_INDEX (cr=3 r=0 w=0 time=37 us)'
STAT #3 id=6 cnt=1 pid=1 pos=2 obj=74368 op='TABLE ACCESS BY INDEX ROWID UTRSOTP (cr=2 r=0 w=0 time=16 us)'
STAT #3 id=7 cnt=1 pid=6 pos=1 obj=81903 op='INDEX UNIQUE SCAN UTRSOTP_KEY_INDEX (cr=1 r=0 w=0 time=9 us)'


In the original query, there was some question as to whether or not the SELECT MAX() was happening for each row. I had not included the predicate information section for the EXPLAIN PLAN which would have cleared this up. Note the filter on line 2 of the plan for the full table scan.

2 - filter("UCBSVCO"."UCBSVCO_STUS_CODE"='C' AND
"UCBSVCO"."UCBSVCO_DISPATCH_IND" IS NOT NULL AND
"UCBSVCO"."UCBSVCO_DISPATCH_DATE"=
(SELECT /*+ */ MAX("UCBSVCO"."UCBSVCO_DISPATCH_DATE")
FROM "UIMSMGR"."UCBSVCO" "UCBSVCO"
WHERE "UCBSVCO"."UCBSVCO_CUST_CODE"=1320908
AND "UCBSVCO"."UCBSVCO_PREM_CODE"='507601'
AND "UCBSVCO"."UCBSVCO_STUS_CODE"='C'
AND "UCBSVCO"."UCBSVCO_DISPATCH_IND" IS NOT NULL))


The SELECT MAX query is part of the filter on this step. That means that it will get executed once for every row that is examined. One thing to note is that it is most likely that there will be subquery caching that occurs since the MAX query remains the same. In other words, the MAX query will be executed once and the answer stored in a cached array so that it really won't get executed every time, only when the inputs are new. And, in this case, the inputs are always the same so the answer will be retrieved from cache on every check except for the first one. I've got a really good example of how subquery caching works that I'll post on another day. :)

So, the plan chosen does show that the SELECT MAX would occur as a filter for the full table scan and that means it could possibly execute once for each row. It was just "lucky" that the MAX query always used the same inputs and therefore didn't really require an actual execution of the MAX query for each row evaluated by the WHERE clause due to subquery caching.

But, the thing to notice about the original plan is that it does require 2 access operations against the UCBSVCO table where one should suffice. The new analytic syntax does only one access operation on that table. That's good! If you can do something once instead of twice to get the same end result, once is better. It's like needing to go to the store to get bread and milk and you make one trip and get the bread and bring it home, then you make a second trip to get the milk. You could've gotten both items in a single trip and saved yourself a lot of time and effort. It works the same way with Oracle. Don't send Oracle to the store twice if once will do.

There were a couple of other comments I wanted to discuss as well.
From Bundit: Is there any reason to avoid using analytic function due to performance issue?

I'd say that analytic functions aren't a fix for everything. They're just another tool in the toolbox to choose from. I like the saying "just because you have a hammer doesn't mean every problem is a nail". You can't use analytic functions to fix everything, but you can use them to help reduce the resource requirements for certain classes of problems like the one I've demonstrated. They won't always be the answer but they should be one of the syntax options to test when/where appropriate. Only through proper testing would you be able to prove which way of writing a particular query would be most efficient.

From Craig M: Are the columns in the partition by necessary? It seems like it isn't adding anything since you are filtering on those columns.

In this case, you're right. Since the WHERE clause is specifying only a single cust_code and prem_code, the partition by isn't really necessary. For me, it's just a habit I'm in of always clearly indicating on which columns I want grouping to occur.

Also from Craig M: Also, I use analytic functions frequently, however I am not sure I would have ever thought to this problem in this way. Would it provide an advantage over something like

with max_ucbsvco as
(select sotp_cd
from
( select ucbsvco_sotp_code sotp_cd
from ucbsvco
where ucbsvco_cust_code = :b1
and ucbsvco_prem_code = :b2
and ucbsvco_stus_code = 'C'
and ucbsvco_dispatch_ind is not null
and ucbsvco_dispatch_date is not null
order by ucbsvco_dispatch_date desc
)
where rownum = 1
)
select utrsotp_desc
from utrsotp
where utrsotp_code =
( select sotp_cd
from max_ucbsvco);


The only way I can answer this now is to say we'd have to test it and see. But, I think the analytic version would still be better if this version did what I think you're expecting it to do and materialize a temp table with the result of the query against UCBSVCO. First, the plan for this version would require an "almost" sort. The use of ORDER BY to put the rows in descending order would mean the rows output from that step would need to come out in sorted order and Oracle would have to do that operation. Now, because of ROWNUM, it should do an in-line sort vs. a normal sort operation. In other words, as the rows are read, Oracle will check to see if the ucbsvco_dispatch_date column value is greater than the previous row's value. After all the rows are read, then it will have the one row which has the most current (MAX) value. So, I'd want to see how the bit of extra ordering work would be handled as compared to the analytic version.

The other thing is that if the result set is materialized for that query, there will be some small overhead to create that temp table and then read from it. That overhead doesn't exist in the analytic version.

So, my instinct on this one is that the analytic version would still win. I mean 15 LIOs and .18 seconds is likely going to be pretty hard to beat. But, it's just speculation unless I get a chance to visit that client again and test it out. :)


Thanks to everyone who commented on the original post and I hope this follow-up provides more info to clarify issues raised.

10 comments:

Craig Martin said...

Excellent set of posts. The additional information with the STATs and your explanations is much appreciated. I will be sure to test the analytic approach the next time I have a similar problem.

freekdhooge said...

Thanks for the update.

I always thought that when you have a subquery that is not linked to the parent query, oracle would always execute the subquery only once.

But from your post I understand that oracle would cache the result and recheck the input values. So even if the query itself is not executed there would be an overhead because of the input value checking.
Is my understanding correct?
Is there a way to know if the result caching occurs?
Could it happen that a result "dissappears" from the cache?

regards,

Freek

Karen said...

Freek,

The number of times a subquery is executed depends on several things. When you have a subquery in the WHERE clause (as in this example), it can be executed for each row that must pass through the test that subquery filter is performing.

I want to state that I'm not absolutely sure that this particular query would get cached. Normally, Oracle will provide a caching mechanism for scalar subqueries and I'm not absolutely positive that this subquery would qualify as such after giving it a bit more thought. But, if Oracle was able to use a cache for this, you are correct that there would be a bit of overhead to check the cache, but that (as you can imagine) would be much less resource intensive than executing the SQL every time.

Both Tom Kyte and Jonathan Lewis have written about scalar subquery caching and I'd suggest searching their sites for "scalar subquery caching" to get some great examples of how to verify caching occurs.

As far as values disappearing from the cache, the answer is "yes"...this can happen. If you have more distinct values in the cache than it can hold (limits of the cache size vary based on Oracle version), then the oldest values in the cache will be replaced with newer values.

Again, I'm now questioning whether or not the original SQL in this post would have used caching but if it did that would be "good" and if it didn't do it, then the behavior of the query would be to execute the subquery for each row that must pass the WHERE clause successfully.

freekdhooge said...

Karen,

Thanks for the info.
I have searched on asktom and Jonathan Lewis' blog for scalar subquery caching and the given examples where all about subqueries that where linked with the parent query or using functions.

In the query in the orginal post however the subquery was independent of the parent query, meaning it is sure to return the same value for each of the rows from the parent query.

When I did a quick test with the dbms_xplan.display_cursor function, the stats indicated that the subquery was only executing the subquery just once in my example.

But this could be either the result of the caching or because oracle knows that the subquery result will always be the same.

sys@GUNNAR> select count(*)
2 from T1
3 where veld2 =
4 ( select max(veld1)
5 from T2
6 );

COUNT(*)
----------
1000

sys@GUNNAR> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID gqf8p77hszd3a, child number 0
-------------------------------------
select count(*) from T1 where veld2 = ( select max(veld1) from T2
)

Plan hash value: 2872123968

---------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.03 | 26 |
|* 2 | TABLE ACCESS FULL | T1 | 1 | 1000 | 1000 |00:00:00.03 | 26 |
| 3 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 3 |
| 4 | TABLE ACCESS FULL| T2 | 1 | 10 | 10 |00:00:00.01 | 3 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("VELD2"=)


21 rows selected.

Oracle said...

Hi,

I am unable to decide whether to go with analytic functions or not. Your help is appreciated.

drop table t purge;
create table t (dt date, sno number, amt number);
alter table t add constraint t_pk primary key (dt, sno);
insert into t select trunc(sysdate)-1, level, round(dbms_random.value(1, 1000)) from dual connect by level <=10;
insert into t select trunc(sysdate), 10+level, round(dbms_random.value(1, 1000)) from dual connect by level <= 3;
insert into t select trunc(sysdate), level, round(dbms_random.value(1, 1000)) from dual connect by level <= 3;
commit;
select * from t;

select *
from t t1
Where dt = trunc(sysdate) - 1
AND NOT EXISTS (select 1
from t t2
where t2.dt = trunc(sysdate)
and t1.sno = t2.sno);

DT SNO AMT
--------- ---------- ----------
25-JUL-08 4 965
25-JUL-08 5 944
25-JUL-08 6 20
25-JUL-08 7 44
25-JUL-08 8 458
25-JUL-08 9 848
25-JUL-08 10 245

7 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 862936642

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 100 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS ANTI | | 4 | 100 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| T | 10 | 140 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | T_PK | 10 | | 1 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | T_PK | 3 | 33 | 0 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("DT"=TRUNC(SYSDATE@!)-1)
4 - access("T2"."DT"=TRUNC(SYSDATE@!) AND "T1"."SNO"="T2"."SNO")


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11 consistent gets
0 physical reads
0 redo size
672 bytes sent via SQL*Net to client
399 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
7 rows processed

SQL>


With analytic function:

select dt, sno, amt from (
select dt, sno, amt,
lead(dt) over (partition by sno order by dt) next_dt,
lead(sno) over (partition by dt order by sno) next_sno
from t t1
Where dt in ( trunc(sysdate), trunc(sysdate) - 1))
Where dt = trunc(sysdate) - 1
and next_dt is null;


DT SNO AMT
--------- ---------- ----------
25-JUL-08 4 965
25-JUL-08 5 944
25-JUL-08 6 20
25-JUL-08 7 44
25-JUL-08 8 458
25-JUL-08 9 848
25-JUL-08 10 245

7 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2843002491

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 16 | 704 | 3 (34)| 00:00:01 |
|* 1 | VIEW | | 16 | 704 | 3 (34)| 00:00:01 |
| 2 | WINDOW SORT | | 16 | 224 | 3 (34)| 00:00:01 |
| 3 | INLIST ITERATOR | | | | | |
| 4 | TABLE ACCESS BY INDEX ROWID| T | 16 | 224 | 2 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | T_PK | 16 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("DT"=TRUNC(SYSDATE@!)-1 AND "NEXT_DT" IS NULL)
5 - access("DT"=TRUNC(SYSDATE@!) OR "DT"=TRUNC(SYSDATE@!)-1)


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
624 bytes sent via SQL*Net to client
399 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
7 rows processed


I am extremely sorry as I do not know how to format code on blogspot.

Thanks for taking my question.

Regards

Karen said...

Oracle,

Given your test data, the optimal performing plan is the one derived by using the analytic. It goes back to the main point of this post, in that it's almost always better to access an object as few times as possible. With your NOT EXISTS query, the table "t" must have two access operations performed against it whereas the analytic syntax allows the optimizer to choose a plan which only requires one pass over the data.

Using your same test, simply increase the number of rows you add to your table and watch how much better the analytic version looks as the data volume grows. Your test showed consistent gets of 11 for the NOT EXISTS version and 4 for the analytic version. But, if you add a bunch more rows to the table, you'll see what looks pretty good in favor of the analytic look even better. A quick test I did to put about 500 records in the table, showed that the NOT EXISTS version used 578 consistent gets versus 7 for the analytic version. Try it yourself and see!

Peter said...

Hi Karen,

Are the 2 queries that you're comparing functionally equivalent? It seems the SQL that does not use the analytics should have 2 extra predicates in the outer query?

select utrsotp_desc
from utrsotp, ucbsvco
where ucbsvco_sotp_code = utrsotp_code
and ucbsvco_dispatch_date =
(select max(ucbsvco_dispatch_date)
from ucbsvco
where ucbsvco_cust_code = :b1
and ucbsvco_prem_code = :b2
and ucbsvco_stus_code = 'C'
and ucbsvco_dispatch_ind is not null)
and ucbsvco_cust_code = :b1 -- do you not need this predicate?
and ucbsvco_prem_code = :b2 -- do you not need this predicate?
and ucbsvco_stus_code = 'C'
and ucbsvco_dispatch_ind is not null;

Karen said...

Peter,

Good eye! Yes, the original query should (and does) have the two additional predicates. Apparently, when I copied the SQL, I missed the last two lines of the statement and didn't realize it. But, if you look at the plan output in the Predicate Information section, you'll see that in steps 4 & 5, those two predicates show up.

So, when I did the test, all the required predicates were there. I carelessly missed the last two lines when I copied the SQL into the blog post.

Sorry about that...and good catch!

robert said...

Sorry to repeat myself, I hadn't seen your update when posting the comment to v1.0: what about using a join with an inline view?

Cheers

Karen said...

Robert -

You asked about using a join with an inline view? I didn't test it as the results of the analytic were so great. It might certainly be another option. But, if the formulation of the inline view query required the main table to need to be accessed more than once, then my belief is that it wouldn't produce better results than the analytic.

Thanks.