Thursday, June 30, 2011

Bug alert!

While testing Oracle 11.2.0.2 (on Red Hat Enterprise Linux), I recently bumped up against a problem with stats collections using DEGREE > 1 that results in an ORA-600.

20:56:41 SQL> BEGIN DBMS_STATS.GATHER_TABLE_STATS('TR_OWNER','TR_POL', Estimate_Percent=>dbms_stats.AUTO_SAMPLE_SIZE, method_opt=>'FOR ALL INDEXED COLUMNS SIZE AUTO',cascade=>FALSE, Degree=>dbms_stats.AUTO_DEGREE); END;
20:56:41 2 /
BEGIN DBMS_STATS.GATHER_TABLE_STATS('TR_OWNER','TR_POL', Estimate_Percent=>dbms_stats.AUTO_SAMPLE_SIZE, method_opt=>'FOR ALL INDEXED COLUMNS SIZE AUTO',cascade=>FALSE, Degree=>dbms_stats.AUTO_DEGREE); END;
*
ERROR at line 1:
ORA-20011: Approximate NDV failed: ORA-12801: error signaled in parallel query
server P024, instance db100.srv.com:TRXTPLT2 (2)
ORA-00039: error during periodic action
ORA-00600: internal error code, arguments: [17114], [0x2AE3C33489B8], [], [],
[], [], [], [], [], [], [], []
ORA-06512: at "SYS.DBMS_STATS", line 23112
ORA-06512: at "SYS.DBMS_STATS", line 23205
ORA-06512: at line 1

If I execute with an estimate_percent of anything other than auto_sample_size, the ORA-20011 goes away, but the ORA-600 remains. Interestingly enough, if I execute with Degree=>1, everything is fine.

After doing more research and finding that I could execute a stats collection on other objects without any problems, I narrowed down the problem to the existence of certain types of function-based indexes. In particular, if the table has an index that uses either a DECODE or a CASE expression or a user-defined function, the problem occurs. As long as the table doesn't have one of these, everything works fine.

The contact to Oracle Support in regard to this issue replied as follows:

It possible you're getting the problem described in Bug:11774077 which is closed as a duplicate of Bug:10013177. For more information please read Note:10013177.8. Functional index causes max decode group by sql to truncate values / can cause dump.
Please check if one of the workarounds will solve your problem:
- Set _replace_virtual_columns=false
OR
- Set _disable_function_based_index=true
OR
- Gathering table stats using degree 1.

Ouch! Turning off the use of virtual columns and function-based indexes is *not* an option. We rely quite heavily on them in our application. And, for the tables that have this problem, gathering stats without doing it in parallel (degree 1) is doable, but not desirable as we have a limited window of time to collect stats and have relied on being able to collect in parallel to reduce the total time it takes to complete the collections.

We're still researching other options or workarounds and pushing on Oracle Support for a fix for this problem, but are not making much progress on either front at this point.

I'll update again soon with more information as I have it.

4 comments:

Flado said...

Since you have multiple tables with this problem, I guess you could parallelize manually, running several tables in parallel, each with degree=>1.
Sure it is clunky, but it might just be enough for you to fit into the stats gathering window.

Cheers!
Flado

Karen said...

Doing that actually makes it worse (we've tried it). We're trying other alternatives and just heard back from Oracle that they've got a patch that might help us out. As soon as we get it applied and tested, I'll post back here with results.

Mohamed Houri said...

I could not reproduce the same error. See below

Best Regards

Mohamed Houri

mhouri@mhouri> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

mhouri@mhouri> create table mho_all_objects as select * from all_objects;

Table créée.

mhouri@mhouri> ed
écrit file afiedt.buf

1 CREATE INDEX fbi_mho_obj ON mho_all_objects
2 (
3 CASE
4 WHEN object_type = 'TABLE'
5 THEN NULL
6 ELSE OBJECT_NAME
7 END
8* )
mhouri@mhouri> /

Index créé.

mhouri@mhouri> BEGIN
2 DBMS_STATS.GATHER_TABLE_STATS(user
3 ,'MHO_ALL_OBJECTS'
4 ,ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE
5 ,METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE AUTO'
6 ,cascade => false
7 ,degree => 2);
8 END;
9 /

Procédure PL/SQL terminée avec succès.

mhouri@mhouri> BEGIN
2 DBMS_STATS.GATHER_TABLE_STATS( user
3 ,'MHO_ALL_OBJECTS'
4 , ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE
5 , METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE AUTO'
6 , cascade => false
7 , degree => DBMS_STATS.AUTO_DEGREE);
8 END;
9 /

Procédure PL/SQL terminée avec succès.

Karen said...

The bug doesn't kick in until you have a table with a significant number of rows. In my case, I found that the table had to be at least 3.25 million rows before the bug occurred. So, your mileage may vary, but since you created what I'd assume is a fairly small table (built on all_objects), it's likely not large enough to meet the conditions of the bug. Also, our environment is 11.2.0.2 on RHEL Linux 5.6 and yours is 11.2.0.1 on Windows. I don't know if the bug affects your environment. See my follow-up post for the link to the patch set fix that was supplied to read more about the bug and determine if your environment is one that is affected.