Previous | Table of Contents | Next |
Today you learn different ways to produce totals and subtotals in your query output.
In todays lesson, I cover the following topics:
First, I describe how to use the GROUP BY clause. GROUP BY enables you to group rows together for the purpose of viewing subtotals. GROUP BY extends the functionality of the aggregate functions from yesterday. The CUBE and ROLLUP operators, starting with MS SQL Server 6.5, work with GROUP BY to calculate superaggregates for grouped columns, and place this information in a regular row. COMPUTE and COMPUTE BY let you calculate total and subtotal information, respectively, but use a different way to display that data.
Finally, you learn about the HAVING clause, what it does and how to use it with the rest of what you learn today.
Yesterday you learned how to retrieve aggregate information from a table by using the aggregate functions. One of the limitations of aggregates is that, when you request an aggregate in your SELECT list, you cannot request other, non-aggregate information.
This query and error message is from yesterdays lesson, demonstrating the effect of requesting a normal column with an aggregate:
select type, avg(price) from titles Msg 8118, Level 16, State 1: Column type.price is invalid in the select list because it is not contained in an aggregate function and there is no GROUP BY clause
A GROUP BY enables you to group together rows sharing a common column value, for the purposes of calculating an aggregate. For example, to see the average prices of books, broken down by book type, use this GROUP BY query:
select type, avg(price) Average from titles group by type
type Average ------ -------------------------- UNDECIDED (null) business 13.73 mod_cook 11.49 popular_comp 21.48 psychology 13.50 trad_cook 15.96 (6 row(s) affected)
For each type of book, a separate subtotal is calculated. The average price for business books is $13.73, the average for modern cooking books is $11.49, and so on.
Note: Scalar and vector aggregates
You may hear the terms scalar aggregate and vector aggregate during your career with SQL. A scalar aggregate is an aggregate that is performed without a GROUP BY and returns a single row of data. A vector aggregate is an aggregate operating on a column appearing in the GROUP BY and may return one or more rows. The server uses different strategies to evaluate these two types of aggregates.
When grouping, just as when you ORDER a column or use it in a WHERE clause, the column by which you group need not appear in the result set. However, with grouping, it seldom makes sense to leave out the grouped column. To see the number of employees at all the different job levels, you could use the following query against the employee table:
select count(job_id) from employee group by job_id
---------- 1 1 1 7 4 4 4 4 4 4 3 3 3 (13 row(s) affected)
The output for this query shows the number of employees at each job level, but because the job levels arent listed, you cant tell which counts go with which job levels. This output isnt terribly useful. When you use GROUP BY, you should include the column by which you are grouping.
Previous | Table of Contents | Next |