Page 196
Listing 8.7 gives an example of a query that retrieves a list of the Department_IDs that exist in the Course table.
Listing 8.7. Example of the GROUP BY clause.
SQL> select Department_ID 2 from Course 3 group by Department_ID; DEPARTMENT_ID -------------------- ANTHRO BIO ECON ENG ENGL HIST MATH PHILO PSYCH 9 rows selected.
Listing 8.8 illustrates how to count the number of courses that are associated with each Department_ID.
Listing 8.8. Using the COUNT function with the GROUP BY clause.
SQL> select Department_ID, count(*) 2 from Course 3 group by Department_ID; DEPARTMENT_ID COUNT(*) -------------------- --------- ANTHRO 3 BIO 3 ECON 3 ENG 4 ENGL 4 HIST 3 MATH 4 PHILO 3 PSYCH 5 9 rows selected.
Now consider the use of both the GROUP BY and
HAVING clauses. You can use the HAVING clause to retrieve those departments that have exactly two courses, as displayed in Listing 8.9.
Page 197
Listing 8.9. Using the GROUP BY and HAVING clauses.
SQL> select Department_ID, count(*) 2 from Course 3 group by Department_ID 4 having count(*) = 4; DEPARTMENT_ID COUNT(*) -------------------- --------- ENG 4 ENGL 4 MATH 4
Group functions cannot be combined with columns in the select list unless you use the GROUP BY clause. For instance, you can obtain the average value for additional fees per department as shown in Listing 8.10.
Listing 8.10. Combining the AVG function with the GROUP BY clause.
SQL> select Department_ID, avg(Additional_Fees) 2 from Course 3 group by Department_ID; DEPARTMENT_ID AVG(ADDITIONAL_FEES) -------------------- -------------------- ANTHRO 20.833333 BIO 63.333333 ECON 258.33333 ENG 50 ENGL 0 HIST 0 MATH 2.5 PHILO 0 PSYCH 48 9 rows selected.
In the next example, you'll look at a different use for a database: a patient information
system for a hospital. Another use of the HAVING and
GROUP BY clauses is in the identification of duplicate rows. For example, you might not be able to add a primary key to a table
because the primary key column contains duplicate values, as illustrated in Listing 8.11. A query
that uses the GROUP BY and HAVING clauses identifies which Patient_IDs have more than one
value. A second query determines which row should be deleted. After it is deleted, the primary
key is successfully added to the table.
Page 198
Listing 8.11. Identifying duplicate rows.
SQL> alter table Patient add 2 constraint Patient_PK 3 primary key (Patient_ID); alter table Patient add * ERROR at line 1: ORA-02299: cannot add or enable constraint (TYO.PATIENT_PK)- duplicate keys found SQL> select Patient_ID 2 from Patient 3 having count(*) > 1 4 group by Patient_ID; PATIEN ------ GG9999 SQL> select * 2 from Patient 3 where 4 Patient_ID = `GG9999'; PATIEN BODY_TEMP_DEG_F FEVER_CLASS ------ --------------- -------------------- GG9999 107.6 LETHAL FEVER GG9999 107 SQL> delete from Patient 2 where Patient_ID = `GG9999' and Body_Temp_Deg_F = 107; 1 row deleted. SQL> alter table Patient add 2 constraint Patient_PK 3 primary key (Patient_ID); Table altered.
I've included the topic of hierarchical information in this lesson, rather than in one of the earlier lessons on SQL, to avoid confusion. An example of hierarchical data is a manufacturing bill of materials (BOM) that describes all the parts, subassemblies, and assemblies that compose a finished product; an aircraft manufacturer might have a BOM that consists of thousands of levels and millions of parts.
NEW TERM
Hierarchical data is information contained in a table that has, at a minimum, a column that specifies the parent or higher-level key and another column that specifies the primary key of the current row.
Page 199
Can relational databases in general and SQL in particular support this type of hierarchy? The answer is mixed. The good news is that SQL does provide some support for hierarchical data via the CONNECT BY clause. The bad news is that the support is quite limited, and the syntax is not intuitive. Nevertheless, I'll show you an example of how to use SQL to navigate through hierarchical data.
A three-column table illustrates the use of the CONNECT BY clause.
Product_ID | The product of interest |
Assembly_ID | The assembly or subassembly to which the Product_ID belongs |
Description | The product description |
This table design is based upon the concept that every product belongs to the finished product or an assemblyexcept for the finished product itself. The rows in the Product_Assembly table demonstrate this concept. A camera, the X1000, is composed of several parts and one subassembly, B200. Listing 8.12 displays the contents of the Product_Assembly table.
Listing 8.12. Contents of the Product_Assembly table.
SQL> select Assembly_ID, Product_ID, Description 2 from Product_Assembly 3 order by Assembly_ID, Product_ID; ASSEMBLY_ID PRODUCT_ID DESCRIPTION ------------ ------------ -------------------------------------------- B200 I101 Titanium alloy, teflon-coated iris B200 S42 Variable-speed shutter, standard B200 W123 Manual film advance, winder X1000 B200 Black body, stainless steel frame camera body X1000 F100 Blue filter - standard X1000 F55 Xenon flash unit X1000 L100 100mm lens - standard X1000 S04 4 foot camera strap, leather X1000 Complete camera 9 rows selected.
Given the small quantity of data presented, you can easily visualize the organization of the parts. However, the purpose of this example is to demonstrate how you can retrieve the organization of a hierarchy.
Listing 8.13 provides an example of how to retrieve hierarchical data with the
SELECT statement. The SELECT statement contains two new clauses:
START WITH and CONNECT BY. The START WITH clause identifies the top-level row for which Oracle should retrieve
subordinate rowsin this example, you want to find the subordinate rows of the X1000. The
CONNECT BY clause tells Oracle how to present the rows, and the
PRIOR operator tells Oracle to present the rows so that each child row's Assembly_ID is equal to its parent row's Product_ID.