Previous Table of Contents Next


Using Aggregates on Nonnumeric Columns

The aggregates can’t all be used on just any old data type. Only the COUNT aggregate, in all its forms, can operate on any column type. Both the SUM and AVG are restricted to numeric types, which include decimal, real, float, integer, smallinteger, tinyinteger, money, smallmoney, and numeric types. The MIN and MAX operator can operate on numeric columns, too, but they also can retrieve aggregate information on date and character columns.

Let’s say you need someone to clean the windows around the office, and your cleaning service doesn’t do windows. If you want to find the rookie employee for this job, you would start with this query:

      select     max(hire_date)
      from     employee
      --------------------------
      May  1  1994  12:00AM

Now that you have the hire date, you can use it to find out which employee belongs to that hire date:

      select     fname,  lname
      from        employee
      where      hire_date = ‘May  1  1994  12:00AM’
      fname                      lname
      -----------------          ----------------
      Matti                      Karttunen

This request has to be done in two parts. You cannot use an aggregate in the WHERE clause. See the section titled “What You Can’t Do,” for discussion on that topic. You could have used a subquery, but those aren’t covered until Day 8. Anyway, all a subquery does is combine two SELECT statements in such a way that the result from one feeds another. Either way, you need to resolve the query in two phases.

First, you get the maximum hire date. This returns the highest date value in the hire_date column. Then, you use a simple SELECT to determine which employee has that hire date.

Aggregates with WHERE

You can use the WHERE clause with aggregate values in order to restrict the values upon which they function. You will still receive only a single value.

To illustrate this, let’s go back to the example of retrieving aggregate information from the price column, in the titles table. If you want the average price of all books, you use this query from earlier in the chapter:

select     avg(price)
from       titles
------------------
14.77

If you want to get the average of all business books, you need only tack on a WHERE clause:

      select      avg(price)
      from        titles
      where       type = ‘business’
      ------------------
      13.73

The WHERE clause limits the rows considered in the average to only business books. The average of the price for those rows is returned in the result set.

What You Can’t Do

Aggregates are resolved after the rows in the WHERE clause have been determined. The server looks for business books and then retrieves the price values and averages them. Because of this order of operations, you cannot include an aggregate in the WHERE clause. How could the server evaluate it? It would need to figure out which rows to include, perform the aggregate, and then go around again to see which rows to include. It can’t do that.

It’s certainly conceivable that you would want a query that allows you to put an aggregate in the WHERE clause. Suppose, for example, that you wanted to see all the books that were priced above average. To write this, you need to use a subquery. There will be a special section on aggregates and subqueries on Day 8. If you want a sneak peek, please feel free to skip ahead.

Expressions Inside Aggregates

A rarely used feature of T-SQL is the ability to evaluate expressions inside aggregates.

A column, all by itself, is an expression. When SUM(price) is evaluated at the server, here’s what actually happens for each row

1.  If there is a WHERE clause, the row is checked against it to see whether or not it should be included in the determination of the expression. If it should be included…
2.  The expression inside the aggregate is evaluated. For example, in the aggregate SUM(price), the expression is price. If you asked for SUM(price * price), the price would be squared.
3.  The evaluated expression is added to the sum.

The titles table gives you a scenario where this is really useful. You may have noticed that the titles table contains a ytd_sales column. This column contains the total number of books that have sold for each title. To find the total sales for a single book, you need to write a query such as this one:

      select     title_id,   price  *  ytd_sales  “Revenue”
      from        titles
      title_id      Revenue
      --------      --------
      BU1032        81,859.05
      BU1111       46,318.20
      […]

If you want to find out how many total dollars the books generated, you want the sum of the Revenue column in your last query. To get that, use this query:

      select     sum(price   *   ytd_sales)   “Total  Revenue”
      from        titles
      Total   Revenue
      ----------------
      1,045,508.72

For each column in the titles table, its price is multiplied by its ytd_sales. The product of each row is then added together to return the final sum.

Any expression is valid inside an aggregate function. You can perform calculations on columns, use constants (although it won’t mean much to get min(50)…), or use functions. One of the more useful functions with aggregates is the ISNULL() function.


Previous Table of Contents Next
Используются технологии uCoz