Previous | Table of Contents | Next |
What we need is a way to query the new sales data in our data warehouse; we need some sort of snapshot of current data to go along with our historical data. Luck is on our side because Oracle has the exact thing we are looking for, and it is called a snapshothow convenient.
Because the business expert only wants to view our sales data, and, of course, not change it, we decide to implement a read-only snapshot.
A snapshot is defined as a copy of a complete table(s) or a subset of a table(s) that reflects a recent state of that table that is defined periodically by the user.
Now we have all the data the business analyst needs because we have placed a snapshot of our recent sales database on our data warehouse server (see Figure 16.5).
Figure 16.5. By using a snapshot, we can incorporate current data into a data warehouse.
This type of snapshot is a transaction consistent snapshot in that it is being refreshed in a given period by actual sales data and therefore remains consistent with the reality of our sales transactions. The period of refresh might be hourly, daily, or even weekly, depending on our needs. When this period occurs, data is physically moved to our snapshot and we are said to be in a snapshot refresh. Obviously, this refresh takes time and resources so, in many cases, it is accomplished at night or when the system is being lightly used (see Figure 16.6).
Figure 16.6. A nightly refresh of a snapshot does not disrupt analysts or sales activity.
Lets create our snapshot for this fictitious sales company. In this case, we are doing it strictly because users of the data warehouse want to see current sales data. In this case, we have primary site replication because we are replicating a central sales database for another separate database, a data warehouse. Our sales database is themaster database that owns the master table; we will call it Sales.
Heres a simple Sales table that is logging the thousands of sales we receive each day:
Sales ----------------------------------------------------------- Transaction_id customer_id product_id location quantity_sold unit_price purchase_timestamp
In our data warehouse, the users simply want a table snapshot called current_sales that looks exactly the same and is refreshed from the master table, Sales, in our OLTP database. To create a snapshot like this, we simply use SQL as in the creation of any other object:
Create Snapshot current_sales Tablespace data_warehouse Storage (Initial 900m Next 100m Pctincrease 0) Refresh Fast Next sysdate + 7 As Select * From sales@sales.nj.com;
As seen here, we refer to our table in the above SQL as sales.nj.com. This implies that we have already installed Oracles SQL*Net, which needs to be present when we refer to any object on a different instance of Oracle using SQL. We have also created a database link using the create database link command.
Look at some of the options used in the previous SQL to see how they relate to replication principles:
Note:
For automatic refresh to function, the DBA or system administrator must refer to his operating system-specific startup file for local instances and start an additional snapshot refresh process that will run in memory as does the Oracle8 kernel processes. Many times this file is called dbstart.
Note that because of the power of a snapshot, we might decide to create a more sophisticated snapshot. We might only want sales data that is three months old because of the many returns. But further assume that Hong Kong has a new no-return policy, so now our data warehouse analysts need all sales data in Hong Kong, yet only sales data three months old from New York and London. In this case, our AS condition becomes more sophisticated:
AS select * from sales where (location != 'HONG KONG and purchase_timestamp < add_months(sysdate, -3)) or (location = 'HONG KONG);
This lends a great power to Oracle replication because we can create snapshots of subsets of an original table or tables. Our analysts can just as easily do an analysis of customer profiles in the data warehouse by creating a snapshot that joins a second customer table:
AS select sales.purchase_timestamp, sales.quantity_sold, sales.unit_price, customer.age, customer.gender, customer.zip_code from sales, customer where sales.customer_id = customer.customer_id
Previous | Table of Contents | Next |