Monday, December 29, 2008

AND-EQUAL vs BITMAP AND

My friend and colleague, Jan, sends me some great SQL performance issues from time to time. He and I had an email exchange around the Christmas holiday that was another interesting case so I thought I'd share it.

First a bit of background info. The problem query is running under both 9i and 10g database versions. On 9i, the query runs in about .5 secs and uses a AND-EQUAL operation. However, on version 10, the optimizer chooses a BITMAP AND operation and converts the b-tree indexes to bitmaps first. This is where the problem lies. The v10 optimizer choice to use the BITMAP AND operation increases the response time for the query to almost 1 minute (56.5 secs).

I recall that 10g deprecated the AND-EQUAL operation and will now choose this "index combine" operation using bitmap conversions instead most of the time. However, I also recall having read about performance issues occurring at times with this choice. I've seen a couple of tricks to work-around this problem. One is to set the optimizer_features_enable parameter to 9.2.0 in the session before executing the query (many times with a logon trigger). The other would be to split up the query into two queries using UNION ALL to put them together.

As you'll see below in the details, in v10 the majority of the time is spent in the plan step that converts the rowids of the b-tree index to bitmaps. But, the part that I'm not sure of (and haven't built a test case yet for review) is why the INDEX RANGE SCAN that provides the rowids to the bitmap conversion accesses so many more blocks than the range scan for the AND-EQUAL operation.

Now for the details:


select count(item_t.poid_id0)
from item_t -- num_rows 5,095,494
where 1=1
and item_t.ar_billinfo_obj_ID0 = :1 -- num_distinct 599,778
and item_t.ar_bill_obj_ID0 = :2 -- num_distinct 3,155,282

Bind variable values:
:1 = 53
:2 = 10

The query under 9i uses AND-EQUAL as shown in the STAT lines from collected trace data:

STAT #2 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=79 pr=23 pw=0 time=514521 us)'
STAT #2 id=2 cnt=6 pid=1 pos=1 obj=352589 op='TABLE ACCESS BY INDEX ROWID ITEM_T (cr=79 pr=23 pw=0 time=162835 us)'
STAT #2 id=3 cnt=6 pid=2 pos=1 obj=0 op='AND-EQUAL (cr=76 pr=20 pw=0 time=142399 us)'
STAT #2 id=4 cnt=21 pid=3 pos=1 obj=368525 op='INDEX RANGE SCAN I_ITEM_AR_BNFO_OBJ__ID (cr=36 pr=1 pw=0 time=13924 us)'
STAT #2 id=5 cnt=16 pid=3 pos=2 obj=368532 op='INDEX RANGE SCAN I_ITEM_AR_BILL_OBJ__ID (cr=40 pr=19 pw=0 time=101036 us)'

Under 10g, using default settings the query uses a BITMAP AND plan:

STAT #2 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=9688 pr=9519 pw=0 time=56554617 us)'
STAT #2 id=2 cnt=6 pid=1 pos=1 obj=352589 op='TABLE ACCESS BY INDEX ROWID ITEM_T (cr=9688 pr=9519 pw=0 time=56554575 us)'
STAT #2 id=3 cnt=6 pid=2 pos=1 obj=0 op='BITMAP CONVERSION TO ROWIDS (cr=9685 pr=9519 pw=0 time=56554533 us)'
STAT #2 id=4 cnt=1 pid=3 pos=1 obj=0 op='BITMAP AND (cr=9685 pr=9519 pw=0 time=56554480 us)'
STAT #2 id=5 cnt=1 pid=4 pos=1 obj=0 op='BITMAP CONVERSION FROM ROWIDS (cr=3 pr=0 pw=0 time=128 us)'
STAT #2 id=6 cnt=22 pid=5 pos=1 obj=368525 op='INDEX RANGE SCAN I_ITEM_AR_BNFO_OBJ__ID (cr=3 pr=0 pw=0 time=90 us)'
STAT #2 id=7 cnt=6 pid=4 pos=2 obj=0 op='BITMAP CONVERSION FROM ROWIDS (cr=9682 pr=9519 pw=0 time=46814337 us)'
STAT #2 id=8 cnt=530398 pid=7 pos=1 obj=368532 op='INDEX RANGE SCAN I_ITEM_AR_BILL_OBJ__ID (cr=9682 pr=9519 pw=0 time=67896016 us)'

Notice how both of them use INDEX RANGE SCAN I_ITEM_AR_BILL_OBJ__ID, but the bitmap plan is doing 9682 cr against 40 for the and-equal. Why would that be the case?

My guess for the increased cr's comes from seeing cnt=530398 in step 8 of the bitmap plan where the index range scan occurs. That step is returning over half a million rows from the index that then are converted by the parent step 7. The parent ends up with cnt=6. What that says to me is that the conversion happened for the half million rowids and then were filtered to end up with only 6. But in the and-equal plan, the index range scan only returned cnt=16. I can see the behavior, my question is why? What's really going on? If any of you have the answer to that question, I'd appreciate hearing from you. I promise to test it for myself too! ;)

5 comments:

Nigel said...

Karen

The index scans for AND-EQUAL only traverses the matching index entries (and then the AND-EQUAL gives you the necessary intersection). Is the conversion to bitmap constructing the entire bitmap for each of the selected values? ie a list of 0s and 1s for every row in the table - 5 million for each index, which have to be sorted into rowid order. So does that need a full index scan (and a sort) rather than a short range scan?

However the CR figures suggest it is the BITMAP AND that costs; presumably the same short index scans are used, but during the BITMAP AND the bitmaps are "normalised" so that they place the 1s in the appropriate bits (by generating 0s for all other rowids).

Just my guess from a sleepy post-Christmas break in Italy

Regards Nigel

Cheryl said...

I just want to say that I understood the last blog much better! Thanks for posting things that appeal to all of us from time to time!

Narendra said...

Karen,

Happy New Year!!!
Did you manage to find out the solution for change from AND-EQUAL to BITMAP CONVERSION ?
My apologies if you are enjoying your holidays or are busy with something more important. But I am actually curious to know the details. (not that I will understand all and unfortunately, I am still on 8i so can't test it myself...)

Connor McDonald said...

I don't have an answer, but instead of logon triggers et al, the optimizer features enable can be specified within the SQL itself with a OPT_PARAM hint.

hth
Connor

Unknown said...

Looks simular to bug id 3258746.

I ran into what looks like a simular issue during a 9 to 10 upgrade. In my case I created a 2 column index, dropped the index with the leading column but had to leave the index which was the trailer in the new index. In my situation it worked out well.