Page 35
For an entity to be in Second Normal Form (2NF), all of its columns must depend on the primary key only. Put simply, this rule means that a table must not contain extraneous information. For example, in addition to other information, the Class table identifies each course and the instructor for that course. The instructor is identified by the column Instructor ID. The primary key for the table is the Class ID. If the Class table also contained the instructor's position, the table would not be in 2NF; the instructor's position depends on Instructor ID only, not on the primary key.
To be in Third Normal Form (3NF), a table's columns must be entirely dependent on the primary key. The key word in that last sentence is entirely. Each column in the table must be dependent on the entire primary key, not just a portion of it.
For example, the Course table identifies each course and the department that offers it. The primary key consists of the Department ID and the Course ID; both columns are necessary to uniquely define a row in the table. If the Course table also contained a column for storing the department chairperson, the table would not be in 3NF; the department chairperson depends only on the Department ID, not on the entire primary key. So it can be said that Department Chairperson is dependent on part of the primary key, but it is not wholly dependent on the primary key.
You'll sometimes see this concept referred to as the derived column. In the previous example, Department Chairperson can be derived from Department ID. According to relational theory, a table should not contain any derived columns. In practice, tables frequently contain derived columns.
Normalization theory discusses normal forms beyond the Third Normal Form (3NF), but I won't delve into that topic here. However, I would like to discuss the application of normalization theory. If you read articles about relational database technology, you will encounter what can be called "the great debate." Relational purists say that all tables must be in at least 3NF, although practitioners argue that to achieve acceptable performance, you must denormalize a databasein other words, reduce the database design from 3NF to 2NF. My position is somewhere in the middle. Here are my recommendations.
Page 36
DO DON'T |
Do ensure that your design is in 3NF. |
A schema is the set of database objectstables, columns, primary keys, foreign keys, and other objectsthat implement a logical data model.
To assist you in the development of a logical data model, you should seriously consider obtaining an entity relationship (ER) modeling tool such as Designer/2000, Oracle Data Designer, LogicWorks ERwin, or Sybase S-Designer. All of these tools are available for Windows NT or Windows 95. Briefly, these tools enable you to construct an entity-relationship diagram by selecting objects from a toolbar and drawing relationships between them.
The advantages of these tools are as follows:
Page 37
Figure 3.1.
Using Oracle Data-
base Designer to
develop a data model.
The example used throughout this book is a database for a small, fictitious college. In 1884, Reginald Flugle, a wealthy manufacturer of cuff links and tie pins in the northern high desert of Southern California provided a sizable endowment to a small, dust-ridden college. Until now, the administration of Flugle College has been a paper-based system. The college dean has awarded you a contract to develop a computerized information system based on a client/server architecture.
The high-level requirements for the Flugle College Information System (or FCIS as it will be known), include the following:
Page 38
Using these high-level requirements, the following are the entities needed to support the information needs of Flugle College:
The next sections look at each table in greater detail.
The purpose of the Student table is to maintain information about each student at Flugle College. Each student is identified by a Student IDthe table is defined so that two students cannot have the same Student ID. In addition to the student's name, address, and other pertinent information, the Student table also contains an attribute named Year which indicates the year of a studentfor example, freshman. In this respect, the sample database may be somewhat unrealistic; it's probably more important to keep track of a student's accumulated units than the year they are in. Another note: this table doesn't indicate the student's statusfor example, active, leave of absence, inactive. There is no attribute to indicate if a student has received a degree. Nevertheless, the attributes shown in the following Student table list provide sufficient realism for our purposes.
Student
Student ID
Last Name
First Name
Middle Initial
Street Address
City
State
Zipcode