Previous | Table of Contents | Next |
Overriding the Optimizer
Every now and then, the server simply estimates wrong. In the example weve been looking at, the server grossly overstates the cost of using the sales table as the outer table, and it grossly understates the cost of using the titles table.
Note: I promise I didnt cook the example to show the server being wrong. I just wrote a simple query and was as surprised as you should be to see the estimation problems crop up here.
If the estimation errors were any more severe (for example, if the price/type conditions were to select 75 percent of the data instead of 8 percent), it would be appropriate to force a different join order than the one SQL Server chose. To do that, you use the forceplan option:
set forceplan {on|off}
This option forces SQL Server to use the order in which the tables appear in the join clause as the physical join order when resolving the query.
To execute the sample query and force an optimization plan with sales as the outer table, do this:
set forceplan on select s.stor_id, t.type, count(*) from sales s join titles t on t.title_id = s.title_id where s.ord_date >= 9/1/94 and s.ord_date < 10/1/94 and t.price between 15.00 and 20.00 and t.type in (children, drama, psychology) group by s.stor_id, t.type set forceplan off
Tip: Turn on forceplan right before the statement you want to affect; turn it off immediately afterwards. You dont want to dictate the join order for every statement you execute; only for those in which you have done sufficient research and testing to prove its the right thing to do.
Take note of the FROM clause:
from sales s join titles t on t.title_id = s.title_id
With the use of forceplan, the order of the tables dictates the join order for the query: sales is the outer table, titles is inner. Here is the FROM clause using the traditional join syntax:
from sales s, titles t
Here is a list of dos and donts that will help you get better join performance.
Do......Dont
|
Now that you understand join order, we can explore outer joins.
Outer joins allow you to see all the rows in the outer table, whether there are matching rows in the inner table or not.
Heres a query that lists book types and their corresponding sales records. (All the queries in the section on outer joins are drawn from the pubs database.)
select t.type, sum(s.qty) from titles t join sales s on t.title_id = s.title_id group by t.type
type ------------ ---------- business 90 mod_cook 50 popular_comp 80 psychology 193 trad_cook 80
Is that all of the types? Heres a complete list:
select distinct type from titles
type ------------ UNDECIDED business mod_cook popular_comp psychology trad_cook
The UNDECIDED type is not included in the first result set because there are no sales for unpriced, unpublished books. If you wanted to include the UNDECIDED titles, you would use an outer join. The outer join is indicated in the ANSI join syntax with the keywords, LEFT OUTER JOIN :
select t.type, sum(s.qty) from titles t left outer join sales s on t.title_id = s.title_id group by t.type
type ------------ ---------- UNDECIDED (null) business 90 mod_cook 50 popular_comp 80 psychology 193 trad_cook 80
The outer join includes a row in the final output for each value of type in the titles table. Notice that the value of the sum for UNDECIDED books is (null). In an outer join, whenever there is a row in the outer table that is unmatched in the inner table, all inner table values in the result set evaluate to null for that row.
The traditional SQL syntax is also capable of expressing an outer join. Simply add an asterisk (* ) on one side of the join condition to indicate which table is the outer table in an outer join. Heres the query joining sales and titles in the traditional join syntax:
select t.type, sum(s.qty) from titles t, sales s where t.title_id *= s.title_id group by t.type
The asterisk appears on the left side, next to the titles version of the title_id column, so the titles table is the outer table.
Previous | Table of Contents | Next |