Monday, August 15, 2016

General Overview of Multimaster Replication

Multimaster replication is a utility that allows data in multiple databases to be automatically kept in sync. For example, in a multimaster replication system, if a row gets inserted into one of the databases in the system, that row will be automatically propagated to all of the other databases in that system. Updates and deletes to the data in any of the databases will be propagated in the same way.
A multimaster replication environment is set up by configuring databases to be part of a “replication group”. One of the databases in the group is defined as the “master definition site,” and all of the other databases in the group are classified as “master sites.” The main difference between the two types of sites is that most of the replication administration commands must be invoked from the master definition site.

There are two basic ways that transactions get propagated to remote databases: “synchronously” and “asynchronously.” Synchronous replication applies each transaction to all of the master sites in a group immediately. It does this by using Oracle’s two-phase commit functionality, which ensures that every database in the group can apply a given transaction. If any site in the group cannot accept the transaction (for example, because the site’s database has crashed or the network connection to it is down), then none of the master sites in the replication group can accept it, and the transaction fails.
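The all-or-nothing behavior described above can be sketched with a toy model. This is only an illustration of the two-phase idea, not Oracle's implementation: the `Site` class, the `replicate_synchronously` function, and the site names are all hypothetical.

```python
class Site:
    """Toy stand-in for one master site (hypothetical, not an Oracle API)."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.rows = []        # transactions that have been committed here
        self.pending = None   # transaction prepared but not yet committed

    def prepare(self, txn):
        # Phase 1: the site votes "yes" only if it can be reached.
        if not self.reachable:
            return False
        self.pending = txn
        return True

    def commit(self):
        # Phase 2: make the prepared transaction permanent.
        self.rows.append(self.pending)
        self.pending = None

    def rollback(self):
        # Discard anything that was prepared but never committed.
        self.pending = None


def replicate_synchronously(sites, txn):
    """Apply txn to every site, or to none of them."""
    if all(site.prepare(txn) for site in sites):
        for site in sites:
            site.commit()
        return True
    for site in sites:
        site.rollback()
    return False


sites = [Site("sf"), Site("ny"), Site("london")]
ok_all_up = replicate_synchronously(sites, "insert row 1")

sites[1].reachable = False  # simulate a crash or network failure at one site
ok_one_down = replicate_synchronously(sites, "insert row 2")
```

With all three sites up, the first transaction is committed everywhere. Once one site is unreachable, the second transaction is rejected at every site, including the two that are still running.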
With asynchronous replication, all the transactions that occur on a site are temporarily placed in a buffer called the “deferred transaction queue,” or deftran queue. Periodically (for example, once per minute), “push” jobs send the transactions in a site’s deftran queue to all of the other sites. These jobs are created by calling the “schedule_push” procedure. Finally, the transactions in a deftran queue that have already been sent to the other sites must be purged periodically, to prevent the deftran queue from growing too large.
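The queue, push, and purge cycle can be sketched the same way. Again this is a toy model under stated assumptions, not Oracle's implementation: `AsyncSite` and its methods are hypothetical names, each queue entry simply remembers which destinations still need the transaction, and conflict resolution is not modeled at all.

```python
from collections import deque


class AsyncSite:
    """Toy master site with a deferred transaction queue (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.reachable = True
        self.rows = []           # transactions applied at this site
        self.deftran = deque()   # entries: (txn, destinations still pending)

    def local_write(self, txn, peers):
        # The write succeeds locally at once and is queued for every peer.
        self.rows.append(txn)
        self.deftran.append((txn, {peer.name for peer in peers}))

    def push(self, peers):
        # Send queued transactions, in order, to each reachable peer.
        for peer in peers:
            if not peer.reachable:
                continue
            for txn, pending in self.deftran:
                if peer.name in pending:
                    peer.rows.append(txn)
                    pending.discard(peer.name)

    def purge(self):
        # Drop entries that have been delivered to every destination.
        self.deftran = deque(entry for entry in self.deftran if entry[1])


sf, ny, london = AsyncSite("sf"), AsyncSite("ny"), AsyncSite("london")
peers = [ny, london]

ny.reachable = False                 # one site is down
sf.local_write("insert row 1", peers)
sf.push(peers)
sf.purge()
queued_while_down = len(sf.deftran)  # entry is kept because ny still needs it

ny.reachable = True                  # the site comes back
sf.push(peers)
sf.purge()
queued_after_recovery = len(sf.deftran)
```

In this sketch the write at `sf` succeeds even though `ny` is down. The entry stays in the queue until `ny` has received it, and only then does the purge remove it.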

The vast majority of customer sites that use multimaster replication use asynchronous replication rather than synchronous. One reason is that asynchronous replication has been available for much longer; the initial versions of multimaster replication only allowed for asynchronous propagation. The main reason, though, is that asynchronous replication has many advantages over synchronous replication.

First of all, asynchronous replication uses much less network bandwidth and provides higher performance than synchronous replication. The primary reason for this is that it is more efficient to store multiple transactions and then propagate them all as a group, rather than to propagate each transaction separately. This is particularly important when the sites in question are very far apart geographically (such as having one site in San Francisco and another in New York).

Another reason for these bandwidth and performance improvements is that there is much more overhead associated with synchronous replication, because each and every transaction requires that separate connections be established to all of the other sites in the replication group. With asynchronous replication, fewer connections need to be established, since transactions are propagated as a group.

The biggest advantage of asynchronous replication, though, is that it provides for high availability of the replication group. With asynchronous replication, if one of the sites in the replication group crashes, all of the other sites will still be able to accept updates—the transactions that are made on the remaining sites will just “stack up” in those sites’ deftran queues until the down site becomes available.

On the other hand, with synchronous replication, if any one of the sites becomes unavailable (for example, because of a database crash or a network failure), then none of the sites will be updatable. This is because each and every transaction must be applied immediately to all of the sites in the replication group, and of course no transaction can be applied to a site that is unreachable. This means that not only does synchronous replication not provide any higher database availability, it can actually provide lower availability than using a single database!
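A rough back-of-the-envelope calculation shows why. The sketch below assumes that each site is up a fixed fraction of the time and that sites fail independently of one another; both are simplifying assumptions, and the 99% figure and function names are purely illustrative.

```python
def sync_write_availability(site_availability, site_count):
    """Chance that every site is up at once, assuming independent failures.

    Synchronous replication can accept a write only when all sites are up.
    """
    return site_availability ** site_count


def single_db_write_availability(site_availability):
    """A lone database accepts writes whenever it is up."""
    return site_availability


p = 0.99  # illustrative: each site is up 99% of the time
single = single_db_write_availability(p)
three_site_sync = sync_write_availability(p, 3)

print(f"single database:    {single:.4f}")
print(f"3-site synchronous: {three_site_sync:.4f}")
```

Under these assumptions a three-site synchronous group can accept writes about 97.03% of the time, compared with 99% for a single database, and the gap widens as more sites are added.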
