Oracle Tips by Vikas Rajan Atrey
Also see these important notes on estimating the maximum results from a paginated Oracle SQL query.
Invariably while developing data intensive web applications we come across the requirement of pagination. The term "pagination" refers to the collection of a definite number of rows per page, along with a mechanism to navigate back-and-forth between the pages. We also have the concept of filtering, a technique to cut-short the result set based on some specified attribute抯 value. Let's take a closer look at doing pagination in Oracle SQL.
The number of rows per page can be specified in two ways:
By using application specific parameters (set by the application administrator)
Using hard-coded values e.g. 20 rows on a single page or asked by the user itself in run time.
Also, (for the user抯 convenience) options can also be coded to "group" or "sort" the SQL result set, based on some user specified attributes, on the fly.
The objective of this short article is to suggest a way to incorporate these two features in order to get best response time for the end user. We use the terms ?b>Delayed lookup? and ?b>Guided Search?to explain these two Oracle pagination techniques.
If we carefully examine the table list in the 揊ROM?clause of a SQL query, each of the tables in the list can be divided in to one or more of the below three categories.
For queries with lookup tables, we can defer the access of the lookup tables (category #3) and execute the remaining query. Once we have reduced the result set to only few dozen records (with all the required grouping and sorting) then only we will access the lookup tables and fetch the desired columns values.
Structurally, such a deferred SQL query might look like this. Note that the join into tables "b" and "c" are deferred until after the main result set is fetched. Again, the lookup tables 揵?and 揷 ?are deferred, and they will be joined only with 20 rows (number of records to be displayed in a page) corresponding to the main result set.
select
v2.* ,
b.desc ,
c.desc
from b ,
c ,
(select * from ( Select rownum rn , v1.* From (Select /*+ materialize no_merge */
t1.id1 , t2.id2, t1.col1 , t2.col2
from
t1 , t2 Where
<join clause> And
<filter clause> < group by clause> <Order by clause>)v1 where
rn >1 and rn <=20) v2 where
v2.id1= b.id1 and v2.id2=c.id2;
Note the use of the materialize hint for keeping the intermediate result set and the no_merge hint.
It is a well known fact that oracle can join only two tables at a time (The different join orders are evaluated by the oracle SQL optimizer, and trace event 10053 or an execution plan can be used to see which join orders are evaluated by oracle to arrive at the final join order) .
For example, if there are 5 tables in the FROM clause, then theoretically oracle may evaluate 120 join orders, a factorial expansion. For a 7-way table join, we have 5,040 possible table join combinations.
After evaluating certain join orders, the optimizer may decide that it is not worthy to evaluate more join orders because the query execution time is not going to be impacted much or the query execution itself will not take that long to justify evaluation of all the join orders.
The init.ora parameter ?a href="http://www.dba-oracle.com/oracle_news/2004_5_3_burleson.htm">optimizer_max_permutations? determines the maximum number of join orders that oracle will evaluate before deciding the final join order/execution plan. (It becomes a hidden parameter in Oracle 10g). The setting for optimizer_max_permutations may also force oracle to stop evaluating further join orders after certain number of join orders are evaluated resulting into sub optimal execution plan.
It is evident that number of tables in the 揻rom 揷lause of our main query will get reduced if we use above technique.
For example, tables 揵?and 揷?are not any filtering any records, so they can be joined into the main result set after we have completed the grouping , ordering and pagination operations (i.e. we have only 20 rows in main result set). This can result in a very efficient SQL join.
Typically. lookup tables do not contain more than few thousand records and joining them with only 20 records will not take much I/O.
Using my Oracle pagination technique we have reduced the parsing, joining, grouping and sorting work to a considerable extent, while not compromising the functionality. Hence, we see much faster query performance.
NOTE: One problem with the Oracle optimizer is that he cannot always guess the inter-join result set size, and histograms and dynamic sampling can help. See this important note on tuning with Oracle histograms.
Also, see determining the optimal table join order.
Sometimes users may not remember the exact filter (to get required result set), and specify a format mask, forcing our paginated application to use more than one wildcard in the query (e.g. the filter in the query may be: ?i>where name like ?P%挃 ). This rules out any possibility of effective index usage and forces a full-table scan.
Also, the number of records satisfying this search might be very huge, creating a burden on database server as well as on the network (not to mention the poor end user response time).
One obvious solution in this scenario could be to force user to enter at least certain number of characters as his search criterion so as to limit number of records satisfying his filter but that solution will not help as the user really does not know those many characters corresponding to his search.
A better approach would be to use guided search which helps the end-user to refine his search and then fetch the required information. To do this, we first present the end-user a list of values satisfying his search criterion (without any result set), and then use that refined search to fetch the required paginated information for the user. This approach will benefit the user, the application, Oracle and the entire network.
It is very unlikely that any business user will be interested to browse huge number of records (satisfying clauses similar to where 搉ame?like ?P%?/i>) that his unconstrained filter will fetch. Instead, they are interested only in few selected records but they can not find proper criteria. Once the end-user is provided with huge number of records, they can again refine his search to get to his candidate records.
We are only reversing the sequence of the steps in his two step search process and in turn user is getting convenience and much better response time not to mention the reduced load on the database and the network traffic.
The query will be executed in two parts. The first part of the query will help the user to refine his search. (e.g. user is searching house online, but they are not sure about the exact details. So, our first step will be help user to refine their search.
The small query is used in the first step of the search will access only an index, and it has no need to evaluate any business logic, causing it to execute much faster. Also the result set will usually be the single column, with much less data for the application to send across the network.
If there are only definite number of permutations (the set of search columns) allowed to be used by the search then these sets can be pre杙opulated at EOD (End of the day) or SOD (Start of the day) activities of the application. We also have the option of pre-populating these values from Oracle抯 Index Organized tables (IOTs) for speedier access.
Now, after the completion of the step 1, the user has already narrowed down his search and hence the query to fetch the main result set. At this point, the end-user will be in better position to use more selective indexes so as to fetch precise result set in much less time. Also such steps will improve database performance in general.