Chapter 16
Replication

by Advanced Information Systems, Inc.

In This Chapter
•  Replication: “Why Ask Why?”
•  Replication and the Data Warehouse
•  Read-Only Replications and Snapshots
•  Complex Replication—Distributed Databases
•  Conflict Resolution
•  Survivability
•  The Advantages of Oracle8 and the Replication Manager

Replication: “Why Ask Why?”

Replication sounds like something out of a science fiction book about space aliens and DNA. At first glance, it might not appear to be relevant to your business; it is hardly an intuitive term. If we were to guess what Oracle’s replication might be, we would probably guess the replication of databases, because databases are Oracle’s business. We would be correct.

Replication means making a copy of something. In the world of databases, it means copying a database, or a subset of one, into one or more separate databases. You might at this point storm out of the room, asking yourself why in the world you would want to replicate your database when you can hardly manage it as it is now!

Good question! Let’s take the road of science fiction to answer it and hopefully entertain. Let’s beam onboard Star Trek’s Enterprise, a starship in the federation of planets.

On our starship we have a computer. Probably one that talks because we want good TV ratings. Now assume that as captain you discover a new planet, a new species, a new spin-off series! Now assume you enter all this new information in your ship’s database.

A month later, many light years away, I come flying by in my ship, also a member of the federation. I encounter this new race of beings that you discovered. I check my database and find out that they love Jolt Cola—a fact that you found out on your initial mission.

Now how did the data get from your computer into mine? That would be my first question. Assume that we are too far out in space to log in to a central computer. Our shipboard computers are all we have, so we need some way to update each other’s databases periodically. Well, what we need is replication, plain and simple.

In business, especially in larger companies, databases often operate in different parts of the world or simply perform different tasks. Maybe we are an international corporation and we do business in New York, London, and Hong Kong. We have three huge databases logging the sales of Star Trek trinkets in each of the three cities and a massive WAN (wide area network) linking the databases, as shown in Figure 16.1.


Figure 16.1.  Our Star Trek sales data center architecture.

In this situation, clerks in New York, London, and Hong Kong are entering thousands of orders a day for the alien dolls that like Jolt Cola. Each database is being pressed to the limit! On each node of our distributed database, we have executives who need to see not only sales for their region but total sales. How do we show them that? Well, we use replication techniques just like our friends from the future do.
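To make the idea concrete, here is a minimal sketch in Python. It is not how Oracle implements replication; it only models the shape of the problem. Each city is a node that holds its own orders plus read-only copies of the other cities' totals. The `Node` class, the `refresh_all` function, and the order amounts are all hypothetical names and figures invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One regional database (a hypothetical stand-in, not an Oracle object)."""
    name: str
    local_orders: list[float] = field(default_factory=list)
    # Read-only copies of the other regions' totals, filled in by refresh_all().
    replicated_totals: dict[str, float] = field(default_factory=dict)

    def local_total(self) -> float:
        return sum(self.local_orders)

    def worldwide_total(self) -> float:
        # Answered entirely from local data: no trip across the WAN.
        return self.local_total() + sum(self.replicated_totals.values())


def refresh_all(nodes: list[Node]) -> None:
    """Copy each node's current total to every other node."""
    for source in nodes:
        for target in nodes:
            if target is not source:
                target.replicated_totals[source.name] = source.local_total()


# Made-up order amounts for the three cities.
new_york = Node("New York", [100.0, 250.0])
london = Node("London", [75.0])
hong_kong = Node("Hong Kong", [300.0, 25.0])
nodes = [new_york, london, hong_kong]

refresh_all(nodes)
```

After a refresh, an executive in any city can read the worldwide total from the local node. Note that an order entered in London after the refresh is invisible in New York until the next refresh runs; the copies are only as current as the last refresh.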


Note:  
In an ideal world, whether for our real business or for our starships spread across the galaxy:

Replication is inferior to a sufficiently powerful computer that can be accessed quickly across any distance.


If you have this ability, replication is of no value.

In a second scenario, shown in Figure 16.2, the database server and network might be fast enough, or the sales volume light enough, that we can run our London, New York, and Hong Kong business from a single central server.


Figure 16.2.  A powerful server and network that is fully recoverable negates the need for replication.

Here we have a server that can handle all the requests and a network that can efficiently carry the volume of data we receive. Now say we also have the ultimate fail-over system for our dream server. If your business can do something like this, please do not consider implementing replication just to be one of the beautiful people.

Replication and the Data Warehouse

Here is an interesting caveat to my premise: Even with an efficient database server and fast enough network, we still might need replication. Imagine that we do have the dream database and the dream network—New York, London, and Hong Kong are all running effortlessly off our amazing central database.

Suppose this Star Trek thing has been going on since the 1960s. After a while, our perfectly-running database is going to fill up with a great deal of data. If we enter a modest 300 gigabytes per year of sales data, after 30 years we have 9 terabytes. Our year 2000 problem is going to be that our computer is going to choke itself to death and die if we don’t do something about our older data.
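The arithmetic behind that figure is short enough to check by hand or in a few lines of Python. The variable names are made up for the illustration, and the sketch assumes decimal units (1,000 gigabytes per terabyte), which is what the round number implies.

```python
GB_PER_YEAR = 300   # yearly sales data, from the example above
YEARS = 30          # how long the data has been piling up
GB_PER_TB = 1000    # assumption: decimal terabytes

total_gb = GB_PER_YEAR * YEARS
total_tb = total_gb / GB_PER_TB
print(f"{total_gb} GB = {total_tb:g} TB")  # prints "9000 GB = 9 TB"
```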

Furthermore, our Star Trek trinket company has hired a bunch of fancy business consultants, who are running long-term trend studies on our sales database! When these people log in to find out if our Jolt Cola alien action figures are losing popularity, suddenly the database slows down and salespeople all over the world have trouble entering data. Our perfect system is slowly dying of data overload and being choked to death by business consultants who took one too many statistics courses in college (see Figure 16.3).


Figure 16.3.  At some point, you need to know how to say “when” in the storage of historical data and statistical data.

If I were to enter the stage as your consultant, after I handed you my modest bill, I would suggest that you separate the old sales data into a data warehouse. A data warehouse is a large repository of data that is generally not changed day-to-day. It is a place where our business analysts can analyze as much historical data as they can get their hands on.

We might decide to keep only the last three years online, because returns and other common events still touch this newer sales data, and move the 27 years before that to a data warehouse. Our architecture now looks something like Figure 16.4.


Figure 16.4.  By moving analytic and historical data off an OLTP sales server, we improve performance.

As you can see, our OLTP (Online Transaction Processing) sales database is much smaller and faster, and it is also separate from the grueling business studies being done on the historical data.
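The split itself is conceptually simple: pick a cutoff date, keep everything on or after it in the OLTP database, and move everything older to the warehouse. The Python sketch below shows only that rule. The `Sale` record, the `split_at_cutoff` function, and the sample dates and amounts are hypothetical; a real migration would move rows between databases rather than between lists.

```python
from datetime import date
from typing import NamedTuple


class Sale(NamedTuple):
    """One sales record (hypothetical layout for the illustration)."""
    sold_on: date
    item: str
    amount: float


def split_at_cutoff(sales, cutoff):
    """Return (online, warehouse): rows on or after the cutoff stay online."""
    online = [s for s in sales if s.sold_on >= cutoff]
    warehouse = [s for s in sales if s.sold_on < cutoff]
    return online, warehouse


# Made-up sample rows and an arbitrary cutoff date.
sales = [
    Sale(date(1968, 5, 1), "alien doll", 4.0),
    Sale(date(1995, 7, 9), "alien doll", 12.0),
    Sale(date(1997, 3, 2), "starship model", 30.0),
]
online, warehouse = split_at_cutoff(sales, cutoff=date(1995, 1, 1))
```

Every row lands in exactly one of the two lists, so nothing is lost or duplicated by the split.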

At this point though, the business analyst—pocket-protector and all—might protest that the more recent data must be visible from the data warehouse because he is running a time-series analysis.

