Showing posts with label Embarcadero. Show all posts
Showing posts with label Embarcadero. Show all posts

Friday, May 2, 2014

Upcoming Webinar: Writing SQL Right on May 20

Coming up on May 20, I'll be delivering a webinar entitled Oracle SQL Performance: Writing SQL Right. Registration is now open. Once again the webinar will be hosted by Embarcadero and there will be two sessions - one at 10am ET and one at 2pm ET.


Abstract:
There are many ways to write a SQL statement that may lead to the functionally correct answer. However, we often get stuck in a rut using the same SQL syntax over and over even when there may be a better way to write the SQL to enhance performance. Some techniques worked great in previous Oracle versions but changes in the optimizer with the later versions made the behavior of those techniques change and, in many cases, regress.

This session covers a number of common anti-patterns that can be rewritten to provide enhanced performance and will help you learn how to:

  • Recognize certain patterns that can be sub-optimal. 
  • Analyze the "performance footprint" of certain patterns. 
  • Rewrite the anti-patterns to use less resources and take less time to execute. 


I look forward to "seeing" you there!

Tuesday, March 18, 2014

Effective Indexing Webinar

Thanks to everyone for attending today's Effective Indexing webinar sponsored by IOUG with Embarcadero. For attendees, IOUG will likely send out a link to the recording and PDF of the presentation, but I also wanted to post it here.

Presentation PDF
Webinar recording



Please note that during the webinar I mentioned the new clustering factor calculation available in 12c and also available in some patchsets for 11g releases. The patch number I referred to is incorrect. Instead, please search for the Bug/Enhancement article in MOS. You can find it by searching for "Bug 13262857 - Enh: provide some control over DBMS_STATS index clustering factor computation (Doc ID 13262857.8)."

Thanks again and stay tuned for additional webinars coming soon!

Sunday, November 17, 2013

Webinar Follow-up: Everyday Oracle Pro

Thanks to everyone who attended the November 12 webinar!

Presentation PDF
Webinar recording


Wednesday, October 30, 2013

Becoming an Everyday Oracle Pro - November 12 Webinar

Register for my next webinar on November 12 entitled "Becoming an Everyday Oracle Pro".

About the webinar
Whether you are new to Oracle or a seasoned veteran, you want to do your job to the best of your ability. Each one of us can become an everyday Oracle pro if we strive towards one basic truth: doing something well isn't only about what you know, but about how you apply what you know. Even if you have memorized a lot of information, it's not much good if you can't apply it, and more importantly, understand how and when to apply that knowledge effectively.

You can become an everyday Oracle pro and make even greater contributions to the success and effectiveness of your organization by focusing on a few basic principles.

In this session, you will learn:

  • The 3 R's of being an everyday Oracle Pro (Research, Remember, Replicate)
  • The difference between memorization and knowledge
  • How to think clearly about problem solving
  • How to collect and grow your personal collection of helpful tools



This will be the sixth, and final, Embarcadero sponsored webinar for 2013. See you then!


Friday, September 20, 2013

Webinar Follow-up: Execution Plans - Learn by Example

Thanks to everyone who attended the September 17 webinar!

Presentation PDF
Webinar recording



Q&A
Q: Do all these methods of showing execution plans work (dbms_sqltune, dbms_xplan) with Oracle Standard Edition?  What about creating extended statistics in Oracle SE? Do all these tips work with SE? Is SQL Monitor a licensed product?

A: You cannot use SQL Monitor reports (dbms_sqltune) with Oracle Standard Edition as they are produced using elements included in the Tuning Pack license which is *not* available on SE. However, you can use dbms_xplan without restriction and all the tips for how to read and analyze plans are the same regardless of the method you use to display plan data.

Q: We have a big table with 80 partitions.. for analyze it is taking long time.. is there any easy way analyze can done quickly?

A: You could consider using incremental partition statistics if many of the partitions have data that changes infrequently. Generally speaking, make sure to use the default collection parameters (like estimate_percent=>auto_sample_size) and only collect stats when you really need to (after data changes by > 10%). The following two links to the Optimizer Development Team's blog may also be of some help:
https://blogs.oracle.com/optimizer/entry/maintaining_statistics_on_large_partitioned_tables
https://blogs.oracle.com/optimizer/entry/incremental_statistics_maintenance_what_statistics

Q: How can the order of filters, joins, etc in the where clause be controlled to manually keep a minimum dataset through the execution or just force a different one for educational / what if purposes?

A: You can control plan operations and the order in which they are executed using hints. Simply inject the hints that specify access operations (FULL, INDEX) and join methods (USE_HASH, USE_NL) and join order (LEADING). The more hints you provide, the more control you can apply to the plan operations.

Q: How do we know if the current execution plan is the best plan or if there is ANY execution plan  better  than the current execution plan?

A: Test! Remember that the optimizer has gone through numerous alternatives before settling in on the final plan. If the plan chosen isn't performing as well as you'd like, then you must try to determine alternatives (by using hints to force some choices or by rewriting the SQL or adjusting statistics...). There is also the Visual SQL Tuning method you could use to "map" the best order of operations for a SQL statement (see my July webinar for more on VST). The bottom-line is that you have to test to understand the performance of the chosen plan and then find the reasons why it under-performs and correct those root causes.

Q: Basically the SQL execution is in the AWR SQL history but not in shared pool. and we would like to see the execution plan using dbms_xplan. how can we do that?

A: DBMS_XPLAN.DISPLAY_AWR will do the trick. You'll provide the SQL_ID and PLAN_HASH_VALUE (optionally) and the FORMAT parameters you desire.

Q: is there a way to force FAST FULL INDEX SCAN instead of INDEX FULL SCAN?

A: Hints. The INDEX_FFS (table index) hint would force an INDEX FAST FULL SCAN. Don't use the hint indiscriminately as if the optimizer "thinks" the fast full scan would be better it would have costed it as such and selected that operation. If you believe you should be getting a fast full scan and are not, try and verify why before you hint the SQL.

Q: When will optimizer choose INDEX FAST FULL SCAN over INDEX FULL SCAN?

A: A fast full scan is similar to a full table scan in that it will read all the blocks (using multiblock reads) in the index without maintaining order. You often see this operation when the query result set can be satisfied from the index contents alone without having to do additional data block accesses. The bottom line (and somewhat cheeky response) is that it will be chosen when the optimizer thinks it is the best choice. If you find otherwise, it's going to be up to you to research and test to discover why the optimizer thought it was best when it actually wasn't.

Q: Is there a way to display the lines of execution plan by the order in which the lines are executed, instead of  using the indentation which can be VERY tideous for  100+ lines of SQL statement?

A: My favorite script for doing this from Randolf Geist and you can find it on his blog at http://oracle-randolf.blogspot.com/2011/12/extended-displaycursor-with-rowsource.html. Of course, you could write your own but no need to reinvent the wheel.

Q: In SQL monitor report what do the columns Time Active (s) and start active (S) mean. I never found a good documentation explaining these 2?

A: The Time Active(s) column shows how long the operation has been active (the delta in seconds between the first and the last active time). The
Start Active column shows, in seconds, when the operation in the execution plan started relative to the SQL statement execution start time.

Q: Can you please tell in which case FULL TABLE SCAN is OK to see in the Explain Plan? or IS Full Table Scan is always bad?

A: All operations have a good use case! It's critical to **NOT** assign a "good" or "bad" judgment to any of them. Each operation may be optimal given the context in which it is used. With full table scans, you typically hope to see them being used when a significant amount of data from an object is needed (as determined by the number of blocks that must be accessed in order to retrieve the needed rows).

Q: What are the trade-offs between SQL Tuning Advisor and Execution Plans?

A: I wouldn't say there are any "trade-offs". SQL Tuning Advisor can be used to identify possible changes that could be helpful to your query's performance. One of the options STA may offer is the option to create a SQL Profile. The Profile provides some additional statistical information to the optimizer so it can/should produce a more effective execution plan. I personally think of STA as a tool to point me towards things I need to investigate.

Q: When running an execution plan, I see "- dynamic sampling used for this statement (level=6)" even though all of the objects in the query (table and indexes) have good statistics and optimizer_dynamic_sampling=2.  Is there a way (without setting a 10053 trace) to find out why dynamic sampling was used, and at what part of the plan it was used?

A: From Oracle Database 11g Release 2 onwards the optimizer will automatically decide if dynamic sampling will be useful and what dynamic sampling level will be used for SQL statements executed in parallel. This decision is based on size of the tables in the statement and the complexity of the predicates. However, if the optimizer_dynamic_sampling parameter is explicitly set to a non-default value, then that specified value will be honored. When it does kick in at level 6, it is simply doing a 256 block sample to help the optimizer produce more accurate cardinality estimates on the objects being accessed in parallel.

Q: How do you find out about your extended stats, what query or view do I need?

A: I'm going to point you to a couple of blog articles from the Optimizer Development Team regarding extended stats which should answer this and any other questions you may have on extended stats.
https://blogs.oracle.com/optimizer/entry/extended_statistics
https://blogs.oracle.com/optimizer/entry/how_do_i_know_what_extended_statistics_are_needed_for_a_given_workload

Q: Does extended stats get maintained automatically or does it need to be manually collected again and again.

A: Please see the two links I provided for the previous question for more details. But, generally speaking, once you create an extended statistic, it will continue to be collected (if you use default collection parameters) until you specifically drop the extended stat.

Q: How to effectively trace an execution plan for a given SQL to understand why the Optimizer chose the specific execution plan?

A: You can capture a 10053 optimizer trace of the given SQL and review the trace file. However, that is not something I'd recommend as a primary method! You can "always" know that the optimizer chose a particular set of plan operations because those operations were the lowest costed options considered by the optimizer. So, if you really want to know why, you need to inspect the inputs the optimizer used to cost the various plans. The place to start is with object statistics and the execution plan rowsource statistics. You need to compare estimated cardinalities with the actual rows returned and find where discrepancies exist. When found, the discrepancies will lead you to the statistics you need to review or they can help you see where/how in the SQL the objects that have discrepancies are used. The bottom-line is that it's going to require you to research and evaluate the plan execution data to determine where the optimizer may have gone astray. Only in very rare cases would I resort to a 10053 trace.

Q: If the E-ROWS and A-ROWS differ a lot but still the execution plan did not change in comparing when E-rows and A-rows are matching, will there is a performance difference in the SQL?

A: You can't know that unless you find out why the estimates and actuals are different and take the necessary steps to correct things so that the difference is limited or eliminated. Once you find out why there was a difference and correct it, the plan may change and performance may improve. However, depending on the access paths and join methods available to the optimizer, it is entirely possible that the plan may not change after the estimates are improved and performance would therefore remain the same. 

Q: How to interpret the COST value/estimate of a given SQL?

A: Cost is the value computed by the optimizer that indicates the estimated amount of work (time and resources) required to produce the result set. It is computed based on statistical formulas utilized by the optimizer to assign a value to each viable set of plan operations possible for a given SQL statement. For the most part, cost is not something I focus on as it is a given that the cost of the plan selected by the optimizer was computed to be the lowest of all possible choices. Therefore, if the response time and resource usage for that plan doesn't meet my expectations, my next step is to determine where/how the optimizer "went wrong" in computing the selected option as the best/lowest cost.

Q: What is the difference between explain plan and execution plan?

A: An EXPLAIN PLAN is simply the proposed plan that the optimizer "might" choose when the query is executed. It is *NOT* a guarantee of the plan that will be used at runtime. The execution plan, on the other hand, is the actual plan that was selected by the optimizer and used to produce the query result set. So, the easiest way to differentiate the two is that EXPLAIN PLAN is the estimate, the execution plan is the actual.

Q: How to analyze or find which tables need histograms for the best execution plans depending on bind variable values?

A: When you're considering histograms, you're considering skew in your data. So, you're looking for columns that contain skewed data. If you simply collect statistics using the default METHOD_OPT parameter ('FOR ALL COLUMNS SIZE AUTO'), histograms will be collected automatically for you. Now, the collection may not be perfect, so you may need to use your own knowledge of the data to help you define specific columns that need histograms. Also, if you find that for queries that use binds your performance is wildly variable, that may be a red flag to point you towards columns that need histograms as well as being an indicator that you might need to adjust your use of bind variables to use some literals to help the optimizer make the best plan choices.

Q: Used Memory - what does (0) means?

A: This column displays the sum of the maximum amount of memory that was used in all execution of the specific plan operation.

Q: What are advantages using this tool over oracle RAT?

A: RAT, or Real Application Testing, is simply a way to do two things: 1) capture and replay executions of application code (SQL) for a representative period of workload and 2) compare the performance (i.e. plan changes and resulting differences in response times and resource usage) of an original workload and the workload after some change(s). The 2nd element (known as SQL Performance Analyzer) helps automate the process of comparing before and after execution plans and can highlight plans that change. The plans that change may change for the good (they improve) or for the bad (they regress). SPA captures the changes and helps you to focus in on the plans that regress so that you can do further work to find and correct the issues. STA simply compares performance and plans for the same SQL_IDs and displays the difference. You could do this yourself manually but SPA provides a separately licensable product to do much of the work for you. However, if you find a problem, you'll likely still have to do some work on your own to determine why the plans changed and how to correct them. So, this is not an "either/or" situation. RAT uses execution plans and simply automates a bit of the legwork for you.



Stay tuned for details of my November webinar coming soon! 

Saturday, September 7, 2013

SQL Tuning Fundamentals: Execution Plans - September 17 Webinar


Register now and join me for my next webinar entitled "SQL Tuning Fundamentals: Execution Plans".

Few tools are as critical to SQL optimization as the execution plan. The ability to read and understand an execution plan allows us to evaluate and optimize SQL performance. Unfortunately, complex plans often seem daunting and can be difficult to understand. During this webinar, I'll walk you through a set of guidelines for how to read an execution plan and how to make sense of the operations and statistics (both estimated and actual) the plan output provides.

Topics covered in this webinar include:
  • How to read an execution plan, and how various plan operations work 
  • How to use the plan to pinpoint performance problems 
  • Tools to accelerate your analysis of execution plan data

Tuesday, July 23, 2013

Follow-up: Visual SQL Tuning Webinar

Thanks to everyone who attended my Visual SQL Tuning webinar. My goal was to keep it simple and show the value of using VST to help you know what execution plans "should" do.

Presentation PDF
Webinar recording


I want to thank:
Kyle Hailey for his extensive work on the subject and his recent great detailed VST presentation from KScope13.
Craig Martin for his building join order diagram I used but incorrectly attributed authorship to Kyle (sorry Craig!).

I hope to see everyone in September for my next webinar! Stay tuned for details.


Tuesday, May 14, 2013

Follow-up: Using Optimizer Hints for Oracle Performance Tuning Webinar

Thanks to everyone who attended my webinar on using hints for Oracle testing and performance tuning. As usual, it was a great event and I appreciate the comments and questions.

Downloads:
Presentation PDF
Related scripts
Webinar recording


I'll be back in the saddle again in July so keep your eyes open for the announcement of that event. Thanks again and hope to see you then!



Monday, May 6, 2013

Using Optimizer Hints for Oracle Performance Tuning


My next Embarcadero sponsored webinar will be on May 14 and is entitled Using Optimizer Hints for Oracle Performance Tuning.

Register now!


Hints are excellent database performance tuning tools that direct the Oracle optimizer to utilize specific operations in SQL execution plans. We often use hints because the Oracle optimizer doesn’t always come up with the execution plan we want on its own. When used correctly, hints can help stabilize an execution plan to use the same operations over and over allowing the SQL to perform the way we desire.

In this webinar, I'll take a look at using hints specifically for testing. Hints are great, and often overlooked, testing tools.

Register for Optimizer Hints for Oracle Performance Tuning webinar to learn:
  • The basics of using optimizer hints to choose desired plan operations 
  • How to setup and compare tests using different optimizer hints for changes in response time and resource consumption 
  • How to create and maintain an information repository of testing results to be used for problem analysis in the future 
  • Tools for applying and testing a broad range of hints 
Thanks again to Embarcadero (@DBPowerStudio) for their sponsorship of these webinars.

I hope to see you there!

Tuesday, March 26, 2013

Back to the Future AWR Mining Webinar Followup



Thanks to everyone who attended my webinar and thanks to Embarcadero (@DBPowerStudio) for hosting it. The presentation and scripts can be downloaded from the following links:

Presentation (PDF)
Scripts (ZIP)
Webinar recording


Stay tuned for my next webinar coming in May!


Tuesday, March 12, 2013

Webinar - Back to the Future: Oracle SQL Performance Firefighting using AWR


It's webinar time again! On March 26, Embarcadero will once again provide sponsorship for my webinar entitled "Back to the Future: Mining AWR Data for Oracle SQL Performance".

Abstract
Most of us have been in the situation where, for no apparent reason, performance for key SQL takes a nose-dive after having previously performed well. So, how do you handle this situation and stabilize performance back to acceptable levels? One approach is to go back in time using execution data stored in AWR. In many cases, AWR may contain what you need to revert your problem SQL to a better performing alternative.

Register for the webinar now to learn:
  • How to mine and analyze AWR data to review the SQL's performance over time
  • How to validate that the SQL is using the "good" execution plan
  • Tools to accelerate your analysis of AWR data and SQL code
There are two sessions (one at 10am EDT and one at 2pm EDT). I hope to see you there!

Thursday, February 7, 2013

Wednesday, February 6, 2013

Understanding Oracle Optimizer Statistics Webinar Q&A

There were quite a few questions asked during the webinar yesterday that I didn't have time to answer online. So, here's the list of questions from the webinar transcripts with my answers. In some cases, I combined several similar questions and gave a single answer. Hope it's helpful!

1. Can you give your thoughts on bind peeking and how this relates to statistics?
Bind-peeking occurs during the hard parse of a SQL statement that includes a bind variable. At the time of the parse, the optimizer will "peek" at the contents of the bind variable and use that value in its cardinality calculation. The issue that can occur happens when the data is skewed (i.e. there are some frequently occurring values and some infrequently occurring values). If a histogram is present on that column and a literal is used in the SQL - instead of a bind variable - the optimizer is able to determine the cardinality exactly for that value because each query that uses a different literal value would require a new hard parse. Each parse could result in a different plan selection depending on the estimates derived for that specific value. But, the problem is that since the bind is present, there will be only one hard parse of the statement and thus whichever bind variable value is present during the initial parse, that's the plan that is in place for that specific SQL. Until that plan gets aged out of the library cache, it will continue to be used. So, if another bind variable value is used later and that value would be better serviced by a different plan operation, it would still use the original plan. This has changed in later versions of Oracle with the advent of features such as adaptive cursor sharing and cardinality feedback. But, the answer to your question is that bind peeking relates to statistics by making it a bit harder for the optimizer to create plans that are optimal for different bind values. Otherwise, how the optimizer uses the statistics is identical to a query that doesn't use binds.


2. How important are the system statistics (no workload statistics and workload statistics)?
Do you recommend to use workload statistics? When to use workload stats?
What about system statistics do you recommend to collect them?

Workload statistics are intended to help the optimizer understand workload characterizations that change over the course of the day (or whatever time period that is specific to you). For example, if your database is heavy OLTP between 8am and 6pm and heavy batch/reporting between 6pm and 8am, you could let the optimizer know about these different types of workload by implementing workload stats. The way it would work is that you would collect/capture workload stats twice: once during the daytime OLTP hours and once during the evening batch hours. The captured stats indicate how cpu, IO and throughput looks during that time period and could show the optimizer that the daytime hours are full of numerous fast-running queries whereas the evening hours are consumed by long-running queries. This information can help the optimizer choose plans more appropriate to each workload (for example, more index scans during the day and more full scans during the evening). Noworkload stats are in place by default and use fairly innocuous values for these stats. If you've never used workload stats, I advise caution before attempting to implement them. Different workload stats can/will effect the optimizer's plan choices and it may do so in ways that effect more plans than you expect (or want). So, while workload stats can provide the optimizer additional information from which to develop plans, make sure you are prepared to do in-depth and thorough testing before implementing them.


3. Do the order of predicates in the where clause influence in any way the order the optimizer will use them, i.e. if I put the filter for my largest table first, would it take it first?
The order of predicates used to make a difference in the long ago prior to the advent of cost-based optimization. But today's optimizer, using stats, will calculate which predicates will provide the best filtering and execute those first. So, the only reason to write your predicates in any order is so that you can verify what you expect with what the optimizer ends up choosing in the plan. If you think the order should be different than the optimizer chooses, you can quickly compare your written order with the steps in the plan and see where things are "off". Then, you can verify where the optimizer went wrong by reviewing the statistics and comparing the actual vs estimated values the optimizer made.


4. So are you stuck with no histogram to account for skew if you use bind variables?
No, you're not stuck. You just have to realize that skew and bind variables don't always play nicely together. As I mentioned in the earlier answer on bind peeking, today's optimizer is much better at handling binds and data skew. If, however, you find that a particular SQL's behavior is unpredictable and you have to have stability, you may have to consider writing the SQL specifically to accommodate certain cases of skew. For example, you may write two SQL statements and use IF/THEN logic in your code to execute the correct SQL based on the bind value to be used. It requires extra code and knowing about specific corner cases where skew is a problem, but when it's something that is important enough, that may be your best option.


5. Are terms "selectivity" and "density" used interchangeably?
The difference is that density refers to the computed answer to the expression 1/number-of-distinct-column-values. Each column has its own density. Selectivity is typically thought of as the combined densities of multiple columns (or predicates). Such that if you had a WHERE clause of WHERE gender = 'F' and district = 12, the density for gender would be 1/2, or .50, and the density for district would be 1/12, or .083. The predicate selectivity would be .50 x .083 = .0415. But, if you only have a single predicate, then you could use the terms interchangeably but I prefer to keep the two terms separate for clarity.


6. As a general default, what value would you recommend for the "method_opt" parameter of dbms_stats.gather_schema_stats, and would your recommendation be different for Oracle 11.2 vs. 9.2?
In v11, particularly 11.2, I'd recommend using the default method_opt of FOR ALL COLUMNS SIZE AUTO. There have been numerous improvements in how stats are collected and I think the defaults provide "close to perfect" results in most cases. I'd have to say that I didn't start using the defaults until this latest version. For pre-11 Oracle versions, I stuck with FOR ALL COLUMNS SIZE 1 and then did separate collections for tables/columns that I knew would benefit from histograms. The bottom-line is that there really is no "one size fits all" way to collect stats. Your data and your SQL have specific nuances that only you can know. But, if you're running the latest version of Oracle, I'd start with the defaults and modify/adjust from there to meet your specific needs.


7. Which version did Extended Statistics come out in?
11g release 1


8. Is there a way to see the transformed statement?
The transformed statement isn't emitted anywhere. The closest you can get is to use an optimizer trace, 10053, and review that information to see which transformations were considered and selected.


9. While SQL remains the definitive language of DBMS, increasingly, Java in the database through JDBC or other server side programming have increased with Oracle and DB2; are DBAs doing well with the complexity of integrating SQL, PL/SQL, JAVA and JEE in delivering effective DBMS performance and in ensuring enterprise application security?
The biggest issue I see is that there are usually multiple groups that each have expertise in the different disciplines. Java developers have certain biases and PL/SQL developers have theirs. Sometimes these biases, or preferences for how to do something, cause the final product to suffer. I think the key is that all groups, regardless of the tool they are using, must remember that the database is the core (and common) element. Understanding how the database does what it does (i.e. how it executes SQL) is critical. Then, each tool can be utilized to exploit its strengths while making sure to support what the database needs and can do best.


10. After using the hint with 2 predicates why is Optimizer estimating 100 rows? I am assuming the last row in stats is showing the optimizer estimated rows the SQL will return
When using a dynamic_sampling hint at level 4, the optimizer will be able to consider relationships/dependencies between columns it previously considered independently of one another. Since the two columns used in the example were identical (i.e. they contained identical data and thus either used alone would return the same answer), by default, they would be considered independently of one another and would cause the selectivity to be too low. However, when the hint is applied and the relationship between the columns is known, the optimizer computes the selectivity properly so that it is the same as if only a single predicate were used.


11. What would be the difference between letting the out of the box gather stats vs just creating the gather stats on the specific schemas that is used by the application?
Out of the box has multiple meanings. It can refer to the default settings and the default scheduled maintenance job where stale stats are gathered. In either case, I think using default settings (like method_opt FOR ALL COLUMNS SIZE AUTO) are a good place to start. But, as I already mentioned in a previous answer, I think each site needs to adjust stats collection parameters to properly handle their own unique needs. As for the default scheduled maintenance job that collects stale stats, I'm a little wary of that one. I prefer to control when stats are collected so the default job makes me feel less in control of things. So, once again I'll say that I think the defaults (all of them) are there because they are intended to suit the needs of most databases, most of the time. Only you can determine if the defaults "as is" work in your situation and if not, then you must adjust accordingly.


12. How often should the data dictionary stats be gathered?

I'll answer your question with another question: how often and by how much does the data dictionary in your database change significantly enough to require updated stats? You have to know the answer to that question before you can decide the best collection strategy for dictionary stats in your environment. I will say that if you are upgrading versions or doing any significant patching that effects data dictionary content, collecting dictionary stats should be done after that effort.


13. Does RULE hint prevent dynamic sampling?
Yes, if the RULE hint is used, dynamic sampling will not be used. The reason is because dynamic sampling is used by the cost-based optimizer to gather statistics that are used in the development of the plan. When the rule-based optimizer is used, then these statistics are irrelevant and will not be collected.


14. Does dynamic sampling occur every sql run even within the same session?
Dynamic sampling occurs during the hard parse of any SQL statement that either doesn't have stats on an object used in the SQL or if a dynamic_sampling hint exists or if the optimizer_dynamic_sampling instance parameter setting is high enough to "kick in". But, once a single SQL has been parsed and the plan chosen and loaded into the library cache, it will be used until the plan is aged out.


15. What do you do for stats on a highly changing table?
If the changes to the table cause the number of rows to increase enough or the distribution of values to change enough so that skewed values shifts within columns, then you need to gather stats frequently enough to allow the optimizer to adjust plans based on the updated stats so that you get (hopefully) the best, optimally performing plan. However, if the changes to number of rows and distribution of values doesn't really effect the plans the optimizer should choose, then I wouldn't collect as frequently and instead collect on a regular schedule that suits your needs (daily, weekly, bi-weekly, etc). In the end, the reason to collect stats is that the plans the optimizer is choosing using the current stats aren't adequate. If you need new plans to be derived, then you'd collect new stats. If you don't want plans to change, then don't collect.


16. You have asked dynamic sampling = 4 . Is there particular reason why 4?
Level 4 is the level at which dynamic sampling will consider complex predicates (an AND or OR operator between multiple predicates on the same table).


17. When would you want to lock stats?  Also if you have a transaction table that is cleared out each minute, would you want to prevent stats from being generated on that table?
I'd want to lock stats on a table if
1) the table is static and never or rarely changes
2) if the table is emptied and reloaded frequently but always contains basically the same amount and type of data
3) I don't want changes to stats on this table to cause plan changes
4) I want to delete all stats on a table and prevent any new stats from being collected

If I have a frequently cleared and reloaded table, I'd either want to collect a set of stats at a time when the contents of the table are representative of what is typically queried and then lock them. Or, I'd want to delete the stats completely and lock them so that future stats collections are not allowed but instead remain empty thus allowing dynamic sampling to kick in based on the optimizer_dynamic_sampling parameter setting (which I'd want to set to at least 2 and most likely 4 for that table).





Tuesday, February 5, 2013

Understanding Oracle Optimizer Statistics Webinar Follow-up

Thanks to everyone for attending today's webinar on Understanding Oracle Optimizer Statistics sponsored by Embarcadero Technologies. I appreciate everyone who took time to join me today and hope you found it informative.

The webinar recording will be posted within in the next couple of days, but you can download the presentation file now.

Understanding Oracle Optimizer Statistics - presentation

I'll provide the link to the recording as soon as it's available and will update this post with Q&A from the webinar sessions shortly.

Thanks and again and see you in March when I'll be presenting "Back to the Future: Oracle SQL Performance Firefighting using AWR."

Thursday, January 31, 2013

Webinar - Understanding Oracle Optimizer Statistics


Embarcadero is sponsoring another Oracle Community webinar on January 29  (rescheduled) February 5 conducted by yours truly. Register for Understanding Oracle Optimizer Statistics and join me as I cover
  • Oracle optimizer statistics fundamentals for computing cardinality/selectivity on single and multi-column predicates.
  • How statistics can help the optimizer understand data distribution patterns.
  • Reasons why certain ways of writing SQL limit the optimizer's ability and how query transformations improve the odds of getting a better execution plan.
Hope to "see" you there!




Thursday, September 27, 2012

Making Impactful Performance Changes Webinar Follow-up

Thanks everyone who attended the 3rd segment of my 3-part DBA Performance webinar series. I enjoyed delivering them and hope you found them informative and useful.

You can find the recording here and the presentation slide deck here.

There were two SQL statements in the deck that were used to produce execution plan output (either using dbms_xplan.display_cursor or dbms_sqltune.report_sql_monitor). So you won't have to type them if you want to use them, here they are:

-- SQL to produce execution plan using dbms_xplan.display_cursor
SELECT xplan.*
FROM  
(
select max(sql_id) keep
       (dense_rank last order by last_active_time) sql_id
     , max(child_number) keep
       (dense_rank last order by last_active_time) child_number
  from v$sql
 where upper(sql_text) like '%&1%'
   and upper(sql_text) not like
  '%FROM V$SQL WHERE UPPER(SQL_TEXT) LIKE %'
) sqlinfo,
table(DBMS_XPLAN.DISPLAY_CURSOR
(sqlinfo.sql_id, sqlinfo.child_number, 'ALLSTATS LAST')) xplan ;


-- SQL to produce a SQL monitor report
SELECT dbms_sqltune.report_sql_monitor
(
    SQL_ID => '&sql_id',
    SESSION_ID => '&session_id',
  SESSION_SERIAL => '&session_serial',
    TYPE => '&report_format'
)
FROM  dual ;

And since we didn't have enough time to answer all the questions that were posted, I wanted to provide those answers here.

Thanks again to everyone who attended any of the events and I look forward to doing more in the future. Stay tuned!



Questions and Answers

Q: Can spatial queries be optimized, queries that deal with large spatial data (land registry, geological data)
A: Any query can be optimized in one way or another, but I'd suggest that you check out the following link to a question posed on the Oracle Forum about spatial performance: https://forums.oracle.com/forums/thread.jspa?threadID=494768.


Q: Could you  pls repeat when an SQL will be monitored automatically?
A: Queries that run in parallel or any query that consumes 5 or more seconds of CPU or IO time.


Q: How did you arrive at 10ms time for one row index maintenance?
A: The 10ms time is from the book I referred to (Relational Database Index Design and the Optimizers). It is a combination of Queuing time, Seek time, Disk rotation time and Transfer time. Remember that it is intended to be used for doing estimated calculations; your mileage may vary.


Q: Do you need SQL tuning license to use MONITOR hint?
A: Well, you could use the MONITOR hint, but you just couldn't view the SQL Monitor reports of the collected data. So, I suppose you could imply that you couldn't/shouldn't use the MONITOR hint if you don't have the appropriate Tuning Pack license.


Q: STA will recommend new indexes in isolation for just one query. Doesn't it? That may not be useful when you have got to worry about 10 different queries accessing one table.
A: Yes, STA isn't looking at index creation from a system-wide view. It only looks at the one SQL statement and determines if a non-existing index might provide better performance. It cannot tell you what the creation of that index will do, either to the good or bad, for other queries. In the end, it is your responsibility to thoroughly test any index creation or modification in a representative environment to determine the effects.


Q: On a partitioned table , local index, if we want the load to be faster , the local index of the partition being loaded could be made unusable. but queries that are looking for partitions other than the one that was made unusable, don't use the index due to the fact that one of the partitions is unusable. Is there a way to tell the optimizer go ahead and use the index (does not work even with a profile)
A: I'd actually investigate other ways of performing loads, like using partition exchange, if possible. I do not know a way around the unusable index problem so I'd suggest investigating how to change your load process to avoid having to use that method.


Q: What is desired way to do benchmark and compare of SQL performance?
A: In part, the most appropriate way to benchmark SQL performance will depend on your specific circumstances. If you have an application that has been running in production for a while, you can use AWR data (or Statspack if you're not licensed for AWR) to retain execution information and do comparisons over time. If you're developing new code, you may have Performance SLAs (System Level Agreements) that specify the expected response times. In that case, testing would be evaluated against those metrics. But, regardless of your situation, you need a baseline measurement that is the expected/desired response time and then all tests must meet those minimum standards. You need to be able to measure performance under load and with representative data volumes to do this well. And, you have to keep in mind that response time isn't just about 'now', it's about the future as well. Testing must help ensure that performance will remain stable over time as user load and data volumes increase.


Q: Does join order still matter in 11gR2?
A: Not really and actually it (theoretically) mattered since the optimizer began using CPU costing and not just IO costing as early as Oracle version 9. The optimizer will use statistics to evaluate and weight various join orders and doesn't just evaluate joins in the order the SQL is written. For my own purposes, I typically try to write SQL in the order I "hope" it will get executed. If what I hope for and what actually occurs don't match, then I work my way through the differences. But, there is not a limit to what the optimizer will chose based on the order the SQL is written.


Q: Does Oracle engine have a process that collects statistics information near real time so that CBO can always get the latest and correct information?
A: There is a default stats collection job that runs in a nightly maintenance window. This job will check the staleness of the statistics and collect updated stats if needed (by default, statistics are considered stale when the table has changed by 10%). But, there is no automated way in place to collect "near real time" statistics. Of course, you could set up collections to occur as frequently as you wish, but due to the overhead required to do collections, most people don't collect stats except during non-peak times.


Q: What was the title of the database tuning/index book Karen mentioned?
A: Relational Database Index Design and the Optimizers by Tapio Lahdenmaki and Michael Leach


Q: In terms of Disk I/O versus Memory, have you seen an increase when adding more I/O or Memory to improve database performance? Or does it depend on a OLTP / OLAP?
A: Well, the only way to know if adding disk capacity or memory will help improve your performance footprint is to know if either disk IO or lack of memory is a bottleneck for you. If your bottleneck is CPU, then obviously adding disk or memory isn't going to fix your problem. So, my answer is the old consultant stand-by of "it depends". If you have a known issue with disk capacity or memory, then adding more of those resources should give you more breathing room. But, if you add those resources and something else is your main bottleneck, then you won't really see much improvements. 


Q: When looking at a SQL Query Execution Plan, what top 5 areas would you recommend for DBAs to look at initally. Thanks
A: I'd go so far as to say there's only one thing you need to look at and that is where is the most time and resources consumed in the plan? If you gather the appropriate execution plan data that includes rowsource execution statistics (using gather_plan_statistics or monitor hints for example), then you will be able to quickly find the lines of the plan that are your "big hitters". Once you have that, you must work towards reducing the time and resources consumed by that step. After you fix that, then you can check again and see where the most time is then spent. Keep going until you meet your SLA or can make no further improvement. Admittedly, just finding the "big hitter" steps in the plan is easy, but knowing what to do once you find them is the hard part. For that, I'd suggest learning as much about the optimizer and SQL execution fundamentals as you can so you'll be more adept at knowing what to do when you find a problem.






Tuesday, September 25, 2012

Making Impactful Performance Changes Webinar


Join me on Wednesday, September 26 as I present a webinar entitled "Making Impactful Performance Changes". This is the final segment in the 3-part DBA Performance Series sponsored by Embarcadero. Register even if you can't make it and you'll get an email with the link to the webcast video after it's over.

In this webinar, I'll cover
  • Common ways to rewrite SQL that make it perform better and more consistently
  • How and when to add or modify indexes 
  • How and when the SQL Tuning Advisor helps you and when it doesn't

Hope to see you!

Tuesday, September 4, 2012

Making SQL Performance Solutions Stick - Followup #2

During last week's webinar, there were several questions that I did not have a chance to answer so I wanted to follow-up by answering those questions here. I also wanted to provide links to previous webinar recordings as many people had asked where to find them.

Webinar Recordings
Best Practices for Developing Optimally Performing SQL

Diagnosing SQL Performance Problems (Part 1 of 3 part series)

Making SQL Performance Solutions Stick (Part 2 of 3 part series)


Question 1
The first question I wanted to address was something I provided an answer to but an audience member provided a follow-up that I wanted to comment on after having a chance to do some research.

Q: Is there any way to change the execution plan of an already running SQL? There are some articles on Internet that indicate that possibility by adding DBMS_RLS.ADD_POLICY which will call procedure that is using literals that will force hard parse.

Audience member follow-up: There is an article about using dbms_sqldiag_internal on optimizer blog about changing execution plan of an existing query on the fly.

My answer was that I didn't know of any way to change the execution plan of a query "on the fly" so I first wanted to clarify my understanding of the question. "On the fly" to me means that the questioner wanted to know if the execution plan could be changed while it was executing. My response to this was "no...not that I know of". There are several ways to influence/change the execution plan of a query after it has completed, but while it is executing, I do not know of any way to cause a plan change that would affect the query while it executes. The original questioner mentioned the use of DBMS_RLS.ADD_POLICY and I don't think that would do what was requested. That package/proc is used for Virtual Private Database and typically is called to provide the dynamic predicate needed to properly service VPD queries. The follow-up regarding dbms_sqldiag_internal also does not provide a way to change the execution plan "on the fly" but does offer a way to change the execution plan via a SQL Patch where you could add a hint, for example, that could change the plan the next time that same SQL was executed.

So, I'll stick by my original answer and say that I do not know of any way to change the execution plan for a SQL while it is currently executing. If anyone out there does know of a way to do so, I'd really like to hear from you.

Question 2
Q: Is the plan changing when the running node from clustered environment is evicted and the execution is failed over to another node?

A: On this one, I'll give the default answer of "it depends" as a starting point, but typically, the execution plan should not change. However, there is always a possibility the plan might change. So, I'll err on the side of caution and say that typically, the plan wouldn't change, but there is always a possibility.

Question 3
Q: Question on cardinality feedback-if child cursor is generated due to cardinality feedback for a given sql, will that new cursor be shared between users assuming that users are executing the same sql via application?

A: Yes, a child cursor generated due to cardinality feedback can be shared/used by other users executing the same SQL. The point of cardinality feedback is to be able to adjust row estimates derived in a plan so that they are more accurate. The new child cursor containing the updated cardinality estimates should then take precedence and be used.

Question 4
Q: Will reordering the table with respect to index help the CLUF (clustering factor) to be calculated correctly? Can you explain about it please?

A: Yes...and No. If you had an index on order_date (along with several other indexes on several other columns), and you reordered the table to store it in sorted order by order_date, the CLUF for that one index should certainly improve. However, what about all the other indexes on all the other columns? What happens to the index CLUF for those? The CLUF of those indexes would most likely be impacted as well causing some to get "worse".

Remember, CLUF is used to help weight the cost of using the index. But a table can only be physically stored in one order. The table data sort order can match one index order but it can't perfectly match the order of numerous indexes at the same time. So, the bottom-line is that you typically shouldn't reorder a table to try and get a better CLUF. You will likely adversely affect many more things that you fix in the long run.

Question 5
Q: Do you recommend running the following everyday dbms_stats.gather_schema_stats(ownname => , options =>'GATHER AUTO',estimate_percent => 40, method_opt => 'for all columns size 1', degree => 2, cascade => true ); ownername = sys, system

A: I'll answer this by saying that Oracle, particularly for the latest releases, recommends collecting statistics on dictionary objects (those owned by SYS) regularly as is evidenced by the fact that the automatic stats collection job includes a collection of these objects. So, if you have the automatic job enabled, you will be getting stats collections on these objects if/when needed.

There are specific procedures available for collecting dictionary stats (e.g. GATHER_FIXED_OBJECT_STATS, GATHER_DICTIONARY_STATS) and so I'd be more apt to use those than coming up with my own "home grown" method as you've shown here. However, since I am not familiar with your environment, then I can't say positively if your choice is "good" or "bad". I'd just encourage you to use defaults unless you have evidence you need something else that works better for your particular database application environment.

Question 6
Q: Can there be a negative impact on database performance of gathering stats for currently missing or null stats?

A: Gathering stats can have multiple impacts. First, there is the impact of the collection itself as it must use resources to complete. Second, there is the impact of how new statistics may affect the optimizer's plan choices. Plans could change based on the new statistics that cause performance to improve or decline. Finally, if you have *never* had stats on certain columns/objects, then don't just rush off and collect stats without thoroughly testing your application to discover the impacts. Ultimately, you won't know until you collect and test. Remember that you now have the option of collecting stats without publishing them (i.e. pending stats), so you can actually collect stats and do some specific testing with those stats before you apply them across the board.

Question 7
Q: Where do i check the column level stats?
A: Column stats can be found in the DBA_TAB_COL_STATISTICS view. I have a set of scripts I use to view all the various stats for a specific object and you can download those scripts here.

Wednesday, August 29, 2012

Making SQL Performance Solutions Stick Webinar Followup

Thanks again to everyone who attended part 2 of my Oracle SQL Performance webinar series entitled Making SQL Performance Solutions Stick. Embarcadero will be posting the recording within the next few days. In the meantime, I have uploaded the slides and scripts.

Thanks again and don't forget about the final installment in the series coming next month on September 29 entitled Making Impactful Performance Changes.

I hope to see you then!

Friday, August 10, 2012

Making SQL Performance Solutions Stick Webinar - August 29


Join me for the 2nd Embarcadero Community webinar in my Performance Series on August 29 entitled "Making SQL Performance Solutions Stick". Registration is now open and I'd love to see you there!

Don't worry if you didn't attend the first session in July. This session will have plenty of new information and does not have any dependency on having to attend the first event.

I'll be talking about:
  • How to adjust statistics collection to accommodate for "sensitive SQL".
  • The differences between SQL Profiles and SQL Baselines and how to implement them.
  • How and when you can fool the optimizer into using a desired execution plan without changing SQL text.

The 3rd and final session, "Making Impactful Performance Changes", will be on September 26 and you can also register for it now as well. I'll post again about that session after the August webinar is behind us.

Hope to see you there!

*** By the way, the last session had several SQL Server attendees, so I just wanted to make sure everyone knows these sessions are Oracle-centric.