How the Optimizer chooses Between Indexes and Full-Table Scan

advertisements

_____________________________________________________________________________________________________________________

The rule-based optimizer will almost always favor an access path involving an index to one involving full-table scan. This is because in the absence of any information about the sizes of tables, the index-based access path is the safer choice. A full-table scan might be faster than an index lookup in some circumstances- say 50% faster.
However, a full-table scan could be 10000% slower depending on the size of the the table. So the rule-based optimizer plays it safe and favors the index-based path.

The cost-based optimizer knows more – or thinks it knows more about the distribution of data in the tables. As a result, it will often choose a full0-table scan even where an index is available if it calculates that the full-table scan will be less expensive. Factors influencing the cost-based optimizer’s decisions include the following

Number of blocks allocated to the table that have ever contained data (that is the blocks below the high-water mark)
Size of data blocks.
Number of blocks that can be read in a single I/O operation (partially determined by the configuration setting DB_FILE_MULTIBLOCK_READ_COUNT)
Selectivity of the index-the number of rows returned for each index key
Depth of the index-number of I/O operations required to resolve a single index key

The decisions that the cost-based optimizer arrives at can be misguided if

A candidate index has only a few distinct values, but the value being queried is quite selective. Unless you have created a histogram on the column and explicitly coded the value being selected into the SQL, the cost-based optimizer will probably ignore the index.

The optimizer goal is set to ALL-ROWS or CHOOSE, but the requirement is really for response time. A full-table scan might result in the best response time to retrieve all the rows, but an index will usually be quicker when retrieving just the first row.

If you haven’t kept your optimizer statistics up to date (by using the ANALYZE command or DBMS_STATS package), The cost-based optimizer may think that the table is smaller than it really is. This might lead it to choose incorrectly the full-table scan.

The cost-based optimizer is heavily biased toward reducing I/o. CPU overheads such as those In sorting rows for a merge join, are not always accurately estimated. On some systems these overheads can be substantial.

TO encourage the cost-based optimize to use suitable indexes, you should therefore do the following

Create histograms on indexed columns where there are only a few distinct values but where most of the values are very rare. Hardcode these values in the SQL statement (don’t use bind variables) to allow the histogram information to be used.
Specify the FIRST_ROWS optimizer goal If interactive response time is your aim. This can be specified as hint, in a ALTER SESSION statement, or in the database configuration.
Use hints to direct the optimizer toward specific indexes.

_____________________________________________________________________________________________________________________

Website Stats

How the Optimizer chooses Between Indexes and Full-Table Scan

0 comments:

Post a Comment

Archive

Labels