
        
        Advanced Replication Design and 
          Setup Tips
        Is Replication Appropriate?
          Advanced Replication is not appropriate for all high availability 
          scenarios. Replication is generally most appropriate for the following 
          situations;
          1) The number of replicated tables is modest and manageable (< 1000, 
          not 10,000 or more).
          There is no specific limit imposed by Oracle concerning the number of 
          tables which can be replicated.
          The main issue will concern whether the number of replicated tables 
          will be manageable by the DBA in a timely manner.
          2) The DBA has full control over the design of the tables. 
          Every replicated table must have a primary key, and additional columns 
          and triggers need added for conflict resolution. Schemas such as Oracle 
          Financials or SAP do not lend themselves to replication because they 
          consist of 10,000 - 20,000 tables, which would represent too many tables 
          for a DBA to manage in a replicated environment. The DBA also does not 
          have full control over the schemas implemented by these applications, 
          and would not be able to safely modify the tables without affecting 
          the reliability or supportability of the application. For these types 
          of applications Clustering or a Standby database configuration (or both) 
          should be considered.
          3) Database availability requires a zero-failover timeframe. 
          The Tnsnames.ora Net8 configuration file can be configured for Transparent 
          Application Failover. TAF allows Net8 to automatically direct database 
          access requests to alternate servers. Standby databases require a manual 
          switchover process to be performed by the DBA in order to recover from 
          a failover, with an additional delay required for the standby database 
          to apply the remaining archivelog files. Users can be manually directed 
          to alternate servers for load balancing purposes one server does not 
          get overloaded.
          4) Dropping of a table at one site must not propagate to other sites. 
          
          DDL commands do not get replicated unless this feature is specifically 
          enabled. The scripts created by this application do not replicate DDL 
          commands in order to improve reliability due to this type of error. 
          Standby and Clustering solutions always transfer DDL commands between 
          sites. However standby databases can be configured with a delay interval 
          in an attempt to give the DBA the opportunity to prevent the DDL from 
          reaching the standby site.
          5) The number of transactions per second is manageable (does not exceed 
          400 transactions per second). This is a general guideline, which has 
          been mentioned by Oracle, and is very dependent upon the hardware being 
          used for the database servers. Replication generates a significant amount 
          of redo because all of the replicated data is written multiple times. 
          The data gets written into a deferred transaction queue at the source 
          site (causing redo), gets sent out across the network where it then 
          gets queued again (generating more redo) until it is finally applied 
          to the destination database.
          6) The data need replicated between servers running differing Oracle 
          versions or differing Operating Systems. 
          Standby and Clustered databases always require that the servers be running 
          the exact same version of Oracle and the exact same operating system 
          platform and version. Replication however may take place between differing 
          Oracle versions running on different platforms.
          7) The database servers need to be physically located far away from 
          each other.
          Standby databases can be remotely located from each other, but clustered 
          databases can only be as far away from each other as a short haul fiber 
          optic cable connection (probably not more than about 20 miles).
          Replicated databases can also be remotely located as long as the communications 
          link is reliable and fast enough to support the transaction rate which 
          needs supported.
        Setup Tips:
          1) The use of the Oracle Replication Manager application is not recommended 
          (even by Oracle) for the initial setup of Advanced Replication. Some 
          versions of the Replication Manager application contain a known bug 
          whereby the application logs into the database using the SYSTEM account 
          instead of the SYS account during the setup process. Use of the SYSTEM 
          account does not provide enough priviledges to complete the required 
          tasks, thus causing the setup process to fail. Oracle Worldwide Support 
          recommends that DBAs should build their own scripts to set up replication 
          (the reason for the existence of this application!). Oracle recommends 
          that DBAs should only use the Java-based Replication Manager application 
          to manage replication once it has been setup. Some previous versions 
          (8.1.6 for instance) of the Replication Manager also do not permit for 
          the saving of configuration SQL scripts. 
          2) When setting up replication on an Oracle 9i database sqlplus must 
          be used, because svrmgrl is no longer available. With previous versions 
          of Oracle it is generally advisable to use svrmgrl when setting up replication 
          in order to avoid a small number of issues with sqlplus when performing 
          DBA tasks such as shutting down or starting up databases. 
          3) Forward and reverse DNS must be setup for each server hostname in 
          order for the database servers to be registered into the OEM repository. 
          The servers must be registered into the OEM repository to allow the 
          Replication Manager to manage the servers.
        Performance:
          1) If 2 master sites are located physically close to each other, then 
          dedicating a network interface for the exclusive use of Replication 
          is recommended. Even if master sites are not physically close together 
          may benefit from the separation of replication traffic on a different 
          interface for non-replication related access to the database.
          2) Most UNIX systems will not route TCP/IP traffic to a 2nd network 
          interface unless it is assigned an IP address in another network or 
          subnet. For instance if the primary network interface used an IP address 
          of 10.1.0.10 with a subnet of 255.255.255.0, then the 2nd network interface 
          could potentually use an IP address of 10.2.0.10 with a subnet of 255.255.255.0. 
          However if the 2nd network interface were assigned an IP address of 
          10.1.0.11, then it would not appear to function because all of the network 
          traffic would get sent thru the 1st network interface.
          The network administrator should be consulted prior to assigning IP 
          addresses or subnetes in order to insure that the site's security policy 
          is followed and to insure that access lists in switches, routers and 
          firewalls are updated properly.
        Batch Operations:
          1) Oracle recommends that batch transactions which insert or update 
          more than a few hundred records should be performed with "Procedural 
          Replication". Procedural Replication is Oracle's term for running 
          a DBA-created PL/SQL stored procedure at each master site without allowing 
          the changes to propagate via replication to the other sites. Within 
          the PL/SQL procedure the DBA should first execute the DBMS_REPUTIL.REPLICATION_OFF 
          procedure before performing DML operations on replicated tables. The 
          DBMS_REPUTIL.REPLICATION_OFF procedure disables the replication process 
          during the current session. After the DML operations have been completed, 
          the DBA should execute the DBMS_REPUTIL.REPLICATION_ON procedure to 
          re-enable replication within the session.
          This process needs to be repeated at each of the remaining master sites 
          in order to insure that the data is in sync. 
          2) In addition to executing the DBMS_REPUTIL.REPLICATION_OFF procedure 
          the DBA should also disable the replication-related trigger on the date_modified 
          column of the table. Triggers do not execute within the same user session, 
          the replication process will take place based upon the modified data 
          and this will then cause data to be replicated to the other sites during 
          the batch operation.
        Conflict Resolution:
          1) Conflict resolution is an often-overlooked but important part of 
          any Advanced Replication setup. If data updates occur to the same record 
          simultaneously at more than one site, a conflict will occur. Oracle 
          will automatically handle the resolution of these types of conflicts 
          if conflict resolution methods are designed into the replication process. 
          The scripts generated by Repgen 
          automatically implement 3 conflict resolution methods on every replicated 
          table because Oracle recommends a minimum of 2 conflict resolution methods. 
          
          The 1st conflict resolution method is the LATEST TIMESTAMP method. This 
          method retains the most recently modified data as being the most up-to-date 
          information. 
          The 2nd conflict resolution method is SITE PRIORITY. If the LATEST TIMESTAMP 
          method fails to resolve a conflict, then the 2nd method is applied. 
          This method assigns a priority to each master site and causes data at 
          the site with the highest priority to be retained across all of the 
          sites. 
          There are some situations which will not be resolved by either of the 
          1st two conflict resolution methods. One of these situations involves 
          mis-convergence of the data between servers. If the number of records 
          within the tables at each of the servers is identical, but the actual 
          data contains differing values between the servers, the replication 
          transaction will fail with the "ORA-1403 Data Not Found" error. 
          According to the database the reason that the data was not found is 
          due to the same values not being found when comparing the before images 
          of data at each site. Oracle compares the data in each of the table 
          columns before the modification of the data. If any column at any one 
          of the sites contains data in the "before modification" image 
          of the data which is different from the site which is pushing the data, 
          then the database considers this to be a conflict requiring resolution. 
          Ideally this situation should never happen, but under real-world operating 
          conditions it does happen. One of the most common reasons this problem 
          occurs is due to the modification of data outside of the replication 
          process. In general it is not a good idea to modify data outside of 
          the replication process, but in fact it is required if you are performing 
          procedural replication due to batch updating requirements. The OVERWRITE 
          conflict resolution method will resolve this type of conflict by overriding 
          the conflict situation and forcing the propagation of the most recent 
          modification of the table to all of the sites. If the DBA is aware of 
          differing data within the tables (as a result of running the DBMS_RECTIFIER_DIFF 
          package), then the DBA can perform a harmless modification to data at 
          the site containing the "correct" data, thus pushing the "correct" 
          data to the rest of the sites and forcing convergence of the data. An 
          good example of a harmless modification is updating the date_modified 
          column of the table with the SYSDATE value. If the DBA does nothing, 
          the data will converge automatically during the normal course of updating 
          data within the tables. But ideally a person who is knowledgeable in 
          the design of the schema and overall dataflow should make this decision.
          2) An important conflict resolution technique is conflict avoidance.
          If multiple sites generate records containing the same primary key value, 
          a conflict will result. If no conflict resolution is implemented, then 
          the transactions will end up in the deferror queue. If the DBA tries 
          to re-apply the transactions manually, they will still fail due to the 
          duplicate primary keys which will fail while trying to insert data into 
          the table. The scripts generated by this graphical application are set 
          up to implement differing ranges of primary key values at each master 
          site. These scripts are designed around a primary key value consisting 
          of an incrementing integer value. Other methods can be implemented, 
          but would require modifying the reptriggers.sql code. It is recommended 
          that a large block of numbers be set aside for primary keys at each 
          master site. In most cases 1G blocks of primary key values will generally 
          be sufficient to prevent the DBA from having to re-allocate numbers 
          within a reasonable period of time. (The database isn't going to run 
          out of number anytime soon either!). The use of this straightforward 
          approach toward assigning primary key values can also pay benefits when 
          troubleshooting problems. By looking at the 1st digit in the primary 
          key value, the DBA can instantly determine the site which produced the 
          data.
        
        