Previous Table of Contents Next


Service Dimension

The service dimension describes the services being sold. This table will have a relatively small number of rows, since fewer than 1,000 services are provided. It will usually be a table with a large number of columns. These columns detail everything that can be known about the service, such as the following:

  Service type
  Service name
  Location of service (onsite or offsite support)
  Internal group
  Date the service was first offered

Here are some questions that may be answered with the service dimension:

  What is the average sale in software support? In hardware support?
  What is the sales total for Oracle RDBMS support during the last quarter?
  Which service is the most profitable?

Salesperson Dimension

The salesperson dimension records information about the salesperson who achieved the sale. This dimension will often be used for spatial analysis. The columns in this dimension contain information about the salesperson that is easy to group, as shown in the following list. Notice that the name and address information of the salesperson is not recorded. That information is not relevant for high-level decisions.

  sales_person_id
  sales_manager_id
  salary
  birthday
  region
  number of years at the company

The following questions are the type that may be answered with the salesperson dimension.

  What is the sales total per month for salespeople who are making over $35,000 per year?
  What is the average age of our sales force? (This one doesn’t actually use the fact table.)
  Which region has the highest total sales?

Customer Dimension

The customer dimension is often the most valuable dimension in this particular type of star schema. It can record information about the customer that will allow management to better understand the customer base. Some of the data included would be the following:

  Customer name
  Zip code
  City
  State
  Number of employees
  Revenue
  Credit rating
  Repeat customer or new customer

Some of the questions the customer dimension might answer are the following:

  What is the average sales commission to companies with over 500 employees?
  What is the average length of a sales contract for companies in New York or New Jersey?
  What is the number of sales over two million dollars made to companies with a poor credit rating?

Using All the Dimensions

The true benefit of dimensions is that they can all be used together. The data from the fact table can be filtered using any of the dimension tables.

The following are examples of the kinds of questions that may be answered with multiple dimensions:

  What was the most profitable service in region 5 during the first quarter? (Time, Service, and Salesperson)
  Who was our largest customer (in sales) in the software support area? (Customer, Service)
  Which region has paid the most in sales commissions during December? (Salesperson, Time)
  What is the average length of sales contracts for companies with revenues of over 500 million dollars that purchased Oracle RDBMS onsite service contracts during the last fiscal year? (Time, Service, Customer)

The star schema is a data warehousing model that provides a flexible, simple architecture for decision support systems. It is the most proven and accepted model in this area.

Designing the Data Refresh Process

The data warehouse is typically designed to be updated on a periodic basis. The data warehouse design team must determine the periodic refresh interval for each group of data. Some tables will need to be updated nightly. The Sales table is a good candidate for a nightly refresh because it is constantly changing. Other tables, such as lookup tables, change much less frequently.

An example of this is the Shipping table. The Shipping table contains the shipper_id, name, address, phone number, and shipping code. The Shipping table has three rows (see Table 30.1).

Table 30.1. Shipping table

ID Name Address Phone Shipping Code

1 Federal Express 101 Lincoln Ave. 303-555-1112 A788D6
2 United Parcel Service 121 10th St. 303-555-4444 11118786657
3 U.S. Post Office 6556 Main Rd. 303-555-8878 U454IUYO

Because the company rarely adds or changes shipping companies, this table can be updated much less frequently inside the data warehouse.

Some data brought into the data warehouse is refreshed based on a schedule outside of the data warehouse design team’s control. Data derived from accounting cycles will be updated inside the data warehouse according to the accounting cycle dates. A company that completes all of its accounting for each month on the 15th of the following month will refresh the data on the 16th of the month. Other data will not be periodic in nature. Press releases are often random in nature and will need to be added as quickly as possible.

It’s important for the data warehouse design team to remember that not all the data is current at a single point in time.


Previous Table of Contents Next
Используются технологии uCoz