Sams Teach Yourself Transact-SQL in 21:How the Server Uses Indexes

Table of Contents

Figure 15.4 displays the rest of the clustered index structure for this example. The next level of the index (level 2) includes one row per page at level 1, so it is considerably smaller than the previous level. The server continues to generate additional index levels until it is able to store pointers to all of the pages at the next level in a single page, called the root page. Each level in the index stores the first entry on each page of the next level down the tree. The root is the level at which the index is reduced to a single page. In this case, it required only two intermediate levels to get to the root page.

Figure 15.4. A three-level index structure.

The root page is the index page where all the rows in a single index level are recorded. Almost all index-based searches start with the root page. (See the next section, “Walking the Index: A Clustered Index in Action.”

Note: This clustered index in the diagram consists of the table itself, two intermediate levels, and a root level. There are two factors that could cause the number of index levels to increase.
First, the number of rows in the table could increase, which would generate more data pages, meaning more entries at the first index level, forcing more index pages, and so on.

Second, the size of the index key could increase. The index key here, stor_id, is defined as char(4). If it were increased to char(6), the number of index rows that could be recorded on a single page would drop by about 25 percent. This would require additional index pages at the first level, meaning more rows at the second level—each of which would also need more space—and so on.

The minimum number of levels an index may have is one, consisting of the root page. There is no maximum number of levels, but it takes a lot of rows to get a lot of levels. A clustered index on a 1-billion row table with a 4-byte integer key has only six levels!

Walking the Index: A Clustered Index in Action

To understand how a clustered index works, you need to consider how the server would use the index to resolve a query like this one:

select title_id
from sales
where stor_id = “7067”

Instead of reading all 9MB of the table, we’ll use the clustered index to find only the pages where rows matching the search criterion might be stored. The server’s first step is to find the “root” or starting point for the index, which in this case is page 2357.

Note: The address of the root page of each index is stored in the sysindexes table.

The server uses the index to identify the first page where a row with the value 7067 might be located. 7067 falls between the values 6380 and A232 appearing on the root page, so any instances of 7067 would be found under 6380. The server uses the page pointer 5555 to find the right page at the next level of the index. Again, it decides that 7067 would fall between 6380 and 8372, so it follows the chain for 6380 to the next level using page 4723. The same logic applies at this level, only now there is a match for 7067. The server follows the chain to page 8601, which is the table itself.

Once SQL Server has found the correct page, it is able to follow the chain, reading each row in turn. As soon as a row is retrieved that has a value that does not match the value 7067 (in this case, the next value is 7131), the server can stop the search immediately. A clustered index scan is effectively a table scan on the necessary pages. It’s a very efficient method of finding applicable rows.

Nonclustered Indexes

A nonclustered index is a little different from a clustered index. Because you can only have one clustered index on a table (you can only sort the actual data one way!), nonclustered indexes are able to build an independent tree of values in the table.

When you create a nonclustered index, SQL Server starts by building a sorted list of all the values for the index column, with page and row pointers to each value. Figure 15.5 shows the first, or leaf, level of a nonclustered index on the title_id column.

Figure 15.5. The data and the leaf level of a nonclustered index.

The leaf level of a nonclustered index holds one row per row in the table.

Notice that there is one row in the leaf level of a nonclustered index per row in the table itself. This is the principal difference between clustered and nonclustered indexes.

Figure 15.6 shows the rest of the nonclustered index, from the leaf level up to the root level. The figure shows a single intermediate level. If more rows were added to the table, or if the index were based on a longer index key, more levels might have been required.

Figure 15.6. A nonclustered index resembles a clustered index from the leaf level up.

A Nonclustered Index in Action

Use of a nonclustered index is similar to use of a clustered index. Let’s review briefly how the server would use a nonclustered index to resolve this query:

select sum(qty)
from sales
where title_id = “BU1032”

Again, the server finds the root page of the index (3721), and then follows the chain to pages 3643 and 3855. Page 3855 displays several values for the title_id value, BU1032.

At this point, the functional difference between the index types becomes clear. Instead of walking the table and reading one data page to retrieve many relevant rows, the server walks the index leaf level, reading one data page per row of data.

Table of Contents

Используются технологии uCoz