?? ch15.htm
字號:
<P>The best arrangement of elements within your query, particularly in the <TT>WHERE</TT>clause, really depends on the order of the processing steps in a specific implementation.The arrangement of conditions depends on the columns that are indexed, as well ason which condition will retrieve the fewest records.</P><P>You do not have to use a column that is indexed in the <TT>WHERE</TT> clause,but it is obviously more beneficial to do so. Try to narrow down the results of theSQL statement by using an index that returns the fewest number of rows. The conditionthat returns the fewest records in a table is said to be the <I>most restrictivecondition</I>. As a general statement, you should place the most restrictive conditionslast in the <TT>WHERE</TT> clause. (Oracle's query optimizer reads a <TT>WHERE</TT>clause from the bottom up, so in a sense, you would be placing the most restrictivecondition first.)</P><P>When the optimizer reads the most restrictive condition first, it is able to narrowdown the first set of results before proceeding to the next condition. The next condition,instead of looking at the whole table, should look at the subset that was selectedby the most selective condition. Ultimately, data is retrieved faster. The most selectivecondition may be unclear in complex queries with multiple conditions, subqueries,calculations, and several combinations of the <TT>AND</TT>, <TT>OR</TT>, and <TT>LIKE</TT>.<BLOCKQUOTE> <P><HR><FONT COLOR="#000077"><B>TIP:</B></FONT><B> </B>Always check your database documentation to see how SQL statements are processed in your implementation. <HR></BLOCKQUOTE><P>The following test is one of many we have run to measure the difference of elapsedtime between two uniquely arranged queries with the same content. These examplesuse Oracle7.3 relational database management system. Remember, the optimizer in thisimplementation reads the <TT>WHERE</TT> clause from the bottom up.</P><P>Before creating the <TT>SELECT</TT> statement, we selected distinct row countson each condition that we planned to use. Here are the values selected for each condition:</P><P><TABLE BORDER="1"> <TR> <TD VALIGN="TOP"><I>Condition</I></TD> <TD VALIGN="TOP"><I>Distinct Values</I></TD> </TR> <TR ALIGN="LEFT" rowspan="1"> <TD ALIGN="LEFT" VALIGN="TOP"><TT>calc_ytd = '-2109490.8'</TT></TD> <TD ALIGN="LEFT" VALIGN="TOP">13,000 +</TD> </TR> <TR ALIGN="LEFT" rowspan="1"> <TD ALIGN="LEFT" VALIGN="TOP"><TT>dt_stmp = '01-SEP-96'</TT></TD> <TD ALIGN="LEFT" VALIGN="TOP">15</TD> </TR> <TR ALIGN="LEFT" rowspan="1"> <TD ALIGN="LEFT" VALIGN="TOP"><TT>output_cd = '001'</TT></TD> <TD ALIGN="LEFT" VALIGN="TOP">13</TD> </TR> <TR ALIGN="LEFT" rowspan="1"> <TD ALIGN="LEFT" VALIGN="TOP"><TT>activity_cd = 'IN'</TT></TD> <TD ALIGN="LEFT" VALIGN="TOP">10</TD> </TR> <TR ALIGN="LEFT" rowspan="1"> <TD ALIGN="LEFT" VALIGN="TOP"><TT>status_cd = 'A'</TT></TD> <TD ALIGN="LEFT" VALIGN="TOP">4</TD> </TR> <TR ALIGN="LEFT" rowspan="1"> <TD ALIGN="LEFT" VALIGN="TOP"><TT>function_cd = '060'</TT></TD> <TD VALIGN="TOP">6</TD> </TR></TABLE><BLOCKQUOTE> <P><HR><FONT COLOR="#000077"><B>NOTE:</B></FONT><B> </B>The most restrictive condition is also the condition with the most distinct values. <HR></BLOCKQUOTE><P>The next example places the most restrictive conditions first in the <TT>WHERE</TT>clause:<H5>INPUT:</H5><PRE><FONT COLOR="#0066FF">SQL> <B>SET TIMING ON</B> 2 <B>SELECT COUNT(*)</B> 3 <B>FROM FACT_TABLE</B> 4 <B>WHERE CALC_YTD = '-2109490.8'</B> 5 <B> AND DT_STMP = '01-SEP-96'</B> 6 <B>AND OUTPUT_CD = '001'</B> 7 <B>AND ACTIVITY_CD = 'IN'</B> 8 <B> AND STATUS_CD = 'A'</B> 9 <B> AND FUNCTION_CD = '060';</B></FONT></PRE><H5><FONT COLOR="#000000">OUTPUT:</FONT></H5><PRE><FONT COLOR="#0066FF">COUNT(*)-------- 81 row selected.Elapsed: 00:00:15.37</FONT></PRE><P>This example places the most restrictive conditions last in the <TT>WHERE</TT>clause:<H5>INPUT/OUTPUT:</H5><PRE><FONT COLOR="#0066FF">SQL> <B>SET TIMING ON</B> 2 <B> SELECT COUNT(*)</B> 3 <B>FROM FACT_TABLE</B> 4 <B> WHERE FUNCTION_CD = '060'</B> 5 <B> AND STATUS_CD = 'A'</B> 6 <B> AND ACTIVITY_CD = 'IN'</B> 7 <B> AND OUTPUT_CD = '001'</B> 8 <B>AND DT_STMP = '01-SEP-96'</B> 9 <B> AND CALC_YTD = '-2109490.8';</B>COUNT(*)-------- 81 row selected.Elapsed: 00:00:01.80</FONT></PRE><H5>ANALYSIS:</H5><P>Notice the difference in elapsed time. Simply changing the order of conditionsaccording to the given table statistics, the second query ran almost 14 seconds fasterthan the first one. Imagine the difference on a poorly structured query that runsfor three hours!<H3><FONT COLOR="#000077"><B>Procedures</B></FONT></H3><P>For queries that are executed on a regular basis, try to use procedures. A <I>procedure</I>is a potentially large group of SQL statements. (Refer to Day 13, "AdvancedSQL Topics.")</P><P>Procedures are compiled by the database engine and then executed. Unlike an SQLstatement, the database engine need not optimize the procedure before it is executed.Procedures, as opposed to numerous individual queries, may be easier for the userto maintain and more efficient for the database.<H3><FONT COLOR="#000077"><B>Avoiding </B>OR</FONT></H3><P>Avoid using the logical operator <TT>OR</TT> in a query if possible. <TT>OR</TT>inevitably slows down nearly any query against a table of substantial size. We findthat <TT>IN</TT> is generally much quicker than <TT>OR</TT>. This advice certainlydoesn't agree with documentation stating that optimizers convert <TT>IN</TT> argumentsto <TT>OR</TT> conditions. Nevertheless, here is an example of a query using multiple<TT>OR</TT>s:<H5>INPUT:</H5><PRE><FONT COLOR="#0066FF">SQL> <B>SELECT *</B> 2 <B>FROM FACT_TABLE</B> 3 <B> WHERE STATUS_CD = 'A'</B> 4 <B> OR STATUS_CD = 'B'</B> 5 <B> OR STATUS_CD = 'C'</B> 6 <B> OR STATUS_CD = 'D'</B> 7 <B> OR STATUS_CD = 'E'</B> 8 <B> OR STATUS_CD = 'F'</B> 9 <B>ORDER BY STATUS_CD;</B></FONT></PRE><P>Here is the same query using <TT>SUBSTR</TT> and <TT>IN</TT>:<H5>INPUT:</H5><PRE><FONT COLOR="#0066FF">SQL> <B>SELECT *</B> 2 <B>FROM FACT_TABLE</B> 3 <B>WHERE STATUS_CD IN ('A','B','C','D','E','F')</B> 4 <B> ORDER BY STATUS_CD;</B></FONT></PRE><H5>ANALYSIS:</H5><P>Try testing something similar for yourself. Although books are excellent sourcesfor standards and direction, you will find it is often useful to come to your ownconclusions on certain things, such as performance.</P><P>Here is another example using <TT>SUBSTR</TT> and <TT>IN.</TT> Notice that thefirst query combines <TT>LIKE</TT> with <TT>OR</TT>.</P><H5>INPUT:</H5><PRE><FONT COLOR="#0066FF">SQL> <B>SELECT *</B> 2 <B>FROM FACT_TABLE</B> 3 <B>WHERE PROD_CD LIKE 'AB%'</B> 4 <B> OR PROD_CD LIKE 'AC%'</B> 5 <B> OR PROD_CD LIKE 'BB%'</B> 6 <B> OR PROD_CD LIKE 'BC%'</B> 7 <B> OR PROD_CD LIKE 'CC%'</B> 8 <B>ORDER BY PROD_CD;</B></FONT></PRE><PRE><FONT COLOR="#0066FF">SQL> <B>SELECT *</B> 2 <B>FROM FACT_TABLE</B> 3 <B>WHERE SUBSTR(PROD_CD,1,2) IN ('AB','AC','BB','BC','CC')</B> 4 <B>ORDER BY PROD_CD;</B></FONT></PRE><H5>ANALYSIS:</H5><P>The second example not only avoids the <TT>OR</TT> but also eliminates the combinationof the <TT>OR</TT> and <TT>LIKE</TT> operators. You may want to try this exampleto see what the real-time performance difference is for your data.<H2><FONT COLOR="#000077"><B>OLAP Versus OLTP</B></FONT></H2><P>When tuning a database, you must first determine what the database is being usedfor. An online analytical processing (OLAP) database is a system whose function isto provide query capabilities to the end user for statistical and general informationalpurposes. The data retrieved in this type of environment is often used for statisticalreports that aid in the corporate decision-making process. These types of systemsare also referred to as decision support systems (DSS). An online transactional processing(OLTP) database is a system whose main function is to provide an environment forend-user input and may also involve queries against day-to-day information. OLTPsystems are used to manipulate information within the database on a daily basis.Data warehouses and DSSs get their data from online transactional databases and sometimesfrom other OLAP systems.<H3><FONT COLOR="#000077"><B>OLTP Tuning</B></FONT></H3><P>A transactional database is a delicate system that is heavily accessed in theform of transactions and queries against day-to-day information. However, an OLTPdoes not usually require a vast sort area, at least not to the extent to which itis required in an OLAP environment. Most OLTP transactions are quick and do not involvemuch sorting.</P><P>One of the biggest issues in a transactional database is rollback segments. Theamount and size of rollback segments heavily depend on how many users are concurrentlyaccessing the database, as well as the amount of work in each transaction. The bestapproach is to have several rollback segments in a transactional environment.</P><P>Another concern in a transactional environment is the integrity of the <I>transactionlogs, </I>which are written to after each transaction. These logs exist for the solepurpose of recovery. Therefore, each SQL implementation needs a way to back up thelogs for use in a "point in time recovery." SQL Server uses dump devices;Oracle uses a database mode known as ARCHIVELOG mode. Transaction logs also involvea performance consideration because backing up logs requires additional overhead.<H3><FONT COLOR="#000077"><B>OLAP Tuning</B></FONT></H3><P>Tuning OLAP systems, such as a data warehouse or decision support system, is muchdifferent from tuning a transaction database. Normally, more space is needed forsorting.</P><P>Because the purpose of this type of system is to retrieve useful decision-makingdata, you can expect many complex queries, which normally involve grouping and sortingof data. Compared to a transactional database, OLAP systems typically take more space
?? 快捷鍵說明
復制代碼
Ctrl + C
搜索代碼
Ctrl + F
全屏模式
F11
切換主題
Ctrl + Shift + D
顯示快捷鍵
?
增大字號
Ctrl + =
減小字號
Ctrl + -