Sams Teach Yourself Transact-SQL in 21:How the Server Uses Indexes

Analyzing Execution with statistics io

Another useful tool for analyzing query performance is the statistics io option. With this tool, you can determine how much disk and memory work is performed to resolve your query. This can help you assess whether the server is making correct choices. We’ll look at some output from statistics io now.

set statistics io on
select count(*) “count”, sum(qty) “total     sold”
from sales
where stor_id between “F” and #147;Gz”
and title_id like “BI2%”
go

Note: Unlike showplan, statistics io is an execution-time option. You can include it in a batch and even wrap it around the particular query you are interested in:
select count(*) from titles where title_id like “A%” set statistics io on select count(*) “count”, sum(qty) “total sold” from sales where stor_id between “F” and “Gz” and title_id like “BI2%” set statistics io off select count(*) from titles where title_id like “A%” go

This will report statistics for the second query, but not for the first or last.

count       total sold
----------- -----------
188         160224
Table: sales  scan count 1, logical reads: 567, physical reads: 31,
read ahead reads: 350

(1 row affected)

In addition to the query results, the server returns information about disk and memory reads performed when executing the query. You can compare these values to your expectations to determine whether SQL Server is finding the most effective way to resolve the query.

Table: sales

In a multitable query, there will be one statistics line per table. This element of the output allows you to figure out which table is being reported.

scan count 1,

In a single-table query, the table will be scanned only once. In multitable queries, SQL Server selects outer and inner tables. The inner tables are scanned one time per row in the outer table.

logical reads: 567, physical reads: 31,

Logical reads are reads from cache; physical reads are from disk. If a required page is not found in cache, SQL Server performs a physical read to place it in the cache and then a logical read to retrieve it from cache. So there is a logical read for every page required, whether that page was originally found in cache or not.

Tip: In my experience, it’s most useful to try to reduce logical reads when troubleshooting queries. In general, as this value goes down, query time goes down. The problem is that too many outside factors influence what’s in cache. If you work to minimize physical reads (which are more time-consuming than logical reads), you may find that a sudden increase in activity on the server will bump all your data out of cache, with a sudden surge in physical reads again.
As a programmer, in particular, the place where you can have the greatest impact is on logical, not physical reads. Focus on that, and your query performance will improve.

read ahead reads: 350

Read-aheads are used by SQL Server when it needs to retrieve a page from disk, but it expects to need additional pages in the same 16KB extent. SQL Server launches a task on a separate thread to perform a 16KB block read. Read-aheads are most effective when SQL Server is running on a system with multiple processors, but the efficiency of 16KB I/Os can also dramatically improve performance on large tables.

Overriding the Optimizer

Now that you know how to get information about your query, and how to read that information, what can you do with it? Let’s look one more time at our sample query and consider our options.

select count(*) “count”, sum(qty) “total sold”
from sales
where stor_id between “F” and “Gz”
and title_id like “BI2%”

With normal optimization, as you saw earlier, SQL Server chooses to use the clustered index. If you wanted to override that choice, you could try forcing the use of the nonclustered index by adding the name of the index after the table name in the FROM clause of the query.

select count(*) “count”, sum(qty) “total      sold”
from sales (index = titleidind)
where stor_id between “F” and “Gz”
and title_id like “BI2%”

STEP 1
The type of query is SELECT
Scalar Aggregate
FROM TABLE
sales
Nested iteration
Index : titleidind
STEP 2
The type of query is SELECT
count        total sold
----------  ------------
188         160224

(1 row(s) affected)

Table: sales scan count 1, logical reads: 1524, physical
reads: 440,
read ahead reads: 43

It worked: the optimization plan uses the nonclustered index, titleidind. Although the result is the same, notice how the cost of the query has increased in both logical and physical reads. The server was also less able to take advantage of read aheads. The choice of the clustered index was correct here.

Just to be complete, let’s force a table scan, too, by putting a zero after the table name:

select count(*) “count”, sum(qty) “total sold”
from sales (0)
where stor_id between “F” and “Gz”
and title_id like “BI2%”

STEP 1
The type of query is SELECT
Scalar Aggregate
FROM TABLE
sales
Nested iteration
Table Scan
STEP 2
The type of query is SELECT
count total sold
----------- -----------
188         160224

(1 row(s) affected)

Table: sales scan count 1, logical reads: 4536, physical reads: 159,
read ahead reads: 2415

Again, the query returned the correct results, but the table scan method cost about 4,000 additional logical reads and substantially more read-aheads. The index selection made by the optimizer was the correct one for this query.

Table of Contents

Используются технологии uCoz