Sams Teach Yourself Transact-SQL in 21:Joining Tables

Referential Integrity

I’m going to bring up referential integrity here because it will be useful during the exercises as you formulate joins. Don’t worry if these details are sketchy or you feel as though there’s more going on that I haven’t provided. You learn more about this on Day 9.

Referential integrity is an important but very simple concept. It has one rule.

Referential integrity exists when all values in foreign key columns have corresponding primary key values.

That’s it.

Referential integrity is useful for ensuring that the columns in your tables and the relationships they share with other tables contain meaningful data. If there were a record in sales for a title_id that didn’t exist in titles, what would that mean? (It would mean there was some kind of mistake! Did someone order the wrong title? Is a title missing?) Referential integrity is often enforced with triggers, discussed on Days 18 and 19.

At the beginning of the chapter, I started by telling you that “Usually, the server knows nothing about the relationships between tables.” The exception to this rule is when declarative referential integrity (DRI) is used. DRI can be defined when a table is created. You will learn more about exactly how to use it on Day 9. The reason I bring it up here is that the pubs database is using declarative referential integrity.

Note: Right now, don’t worry too much about the details of referential integrity, declarative or otherwise. I’ve skimmed over the basics, but haven’t discussed all the intricate details of implementing it on your own tables, dealing with it inside application programs, or managing it.
Because declarative referential integrity is defined on pubs, you can use it to your advantage when formulating joins.

The real point of my bringing up this topic is so that you can use the defined relationships in pubs to help you join tables together. The sp_help output for a table in pubs will display the keys defined on a table, along with any foreign keys referencing this table. Here is (abbreviated) output for the publishers table; it appears at the bottom of the sp_help output if you want to run this yourself:

sp_help publishers

constraint_type            constraint_name     constraint_keys
----------------           ---------           ---------
PRIMARY KEY (clustered)    UPKCL_pubind        pub_id
Table is referenced by
------------------------------------------------
pubs.dbo.employee: FK__employee__pub_id__337FAFDD
pubs.dbo.pub_info: FK__pub_info__pub_id__2AEA69DC
pubs.dbo.titles:   FK__titles__pub_id__0E4E2B2E

The defined primary key for the publishers table, shown first, is pub_id. The server has generated names for each of the “rules” or “constraints” that are defined with declarative referential integrity.

The tables that reference publishers are listed in the “Table is referenced by” section. This section lists the full table name (composed of the database, pubs, the owner, dbo, and the table names themselves, employee, pub_info, and titles). (There is a detailed section on object naming on Day 11.) All you need is that last, familiar part. This is followed by the name of the constraint, which includes the letters FK (for foreign key), the name of the table containing the foreign key, the column name (pub_id), and a system-generated hexadecimal key included for uniqueness. Here, all you need is the column name, pub_id.

Referential Integrity Summary

This output shows you where to go to find tables that reference publishers. Knowing this lets you directly join publishers to any of those tables. You can use this strategy to identify joins when declarative referential integrity is being used on your tables. If it is not, the server won’t know anything about those relationships and can’t help you out. Then, you’ll need either an ER diagram or someone helpful who is familiar with the database.

Joins and Performance

When two (or more) tables are joined, the server will consider the join keys between the tables, and formulate the most efficient way to generate the joined result set. To do this, it designates a join order for the tables.

The join order is the order in which tables are joined, where one table is designated as the inner table and the other is designated the outer table. Each row in the outer table generates a search for all joining rows in the inner table, so the choice the server makes is important to performance.

Technical Note: Indexes factor heavily into the performance of joins. There is a detailed discussion of indexes on Day 15. Indexes allow the server to find in a few milliseconds what would take many minutes or hours without indexes.
Because the tables in pubs are so small, the indexes are rarely used to resolve joins; it’s faster to simply read all the rows and do the join in memory. If you connect to the bigpubs database on the metis website, though, you can get some indication of join performance.

Generally speaking (and this is oversimplifying a very complex process that the coders of SQL Server have spent years refining), the inner table is the smaller table that will be searched once for every row in the outer table.

Table of Contents

Используются технологии uCoz