In most environments one set of GoldenGate process (1 Extract & 1 Replicat process) is sufficient for change data synchronization. But if your source database generates a huge volume of data then a single process may not be sufficient to handle the data volume.
In such a scenario there may be need to split the workload, across multiple Extract and/or Replicat processes.
You can configure multiple GoldenGate Extract and Replicat process to split workload using one of the two options below.
• Break up current Extract/Replicat processes for one table, into multiple processes using the GoldenGate Range function.
• Break up current Extract/Replicat processes, into multiple processes containing different set of tables in different processes.
For a given table if there is high data volume generated, then you can make use of the first option using the GoldenGate @Range function to split the workload for heavily accessed tables across multiple Extract/Replicat processes.
How does GoldenGate Range Function @Range work?
The @Range function divides a workload into multiple, randomly distributed groups of data. It determines the group that the range falls in, by computing a hash against the primary key or user-defined columns. It guarantees that the same row is always processed by the same process group.
The @Range function can be used on either the TABLE or the MAP parameter of Extract and Replicat parameter respectively. The @Range function is used along with the FILTER clause.
Limitations of @Range
You can’t use @Range function, if primary key updates are performed on the database.
Demonstration
In this example we will configure multiple Replicats (3) on the target system to split workload for SCOTT.EMP and SCOTT.DEPT tables. Each Replicat Process will read the same trail file generated by the Pump Extract process but operate on different set of data for the same table.
Environment Details
Parameters Source/Primary Target/Secondary
DB Name orcl orcl
DB Version 11.2.0.3 11.2.0.3
Schema SCOTT SCOTT
Table Name EMP, DEPT EMP, DEPT
OGG Version 11.2.1.0.20 11.2.1.0.20
Enable Supplemental logging at Table level
GGSCI > dblogin userid ggs, password oracle
Successfully logged into database.
GGSCI> add trandata scott.emp
GGSCI> add trandata scott.dept
Extract and Pump on Source System
Primary Extract
$ cd $GG_HOME
$ ./ggsci
GGSCI> EDIT PARAMS Erange1
Extract Erange1
USERID GGS, PASSWORD oracle
EXTTRAIL ./dirdat/rt
TABLE Scott.emp;
TABLE Scott.dept;
GGSCI> ADD Extract Erange1, TRANLOG, BEGIN NOW
GGSCI> ADD EXTTRAIL ./dirdat/rt, Extract Erange1
GGSCI> Start Erange1
GGSCI> Info Erange1
Pump Extract
GGSCI> EDIT PARAMS Prange1
Extract Prange1
USERID GGS, PASSWORD oracle
RMTHOST 192.168.1.30, MGRPORT 7809
RMTTRAIL ./dirdat/rt
TABLE Scott.EMP;
TABLE Scott.DEPT;
GGSCI> ADD Extract Prange1, EXTTRAILSOURCE ./dirdat/rt
ADD RMTTRAIL ./dirdat/rt, Extract Prange1
GGSCI> Start prange1
GGSCI> Info prange1
Replicat processes on Target System
Here we will create 3 Replicat processes, with each Replicat group processing one-third of the data in the GoldenGate trail based on the primary key.
Replicat - 1
$ cd $GG_HOME
$ ./ggsci
GGSCI> EDIT PARAMS Rrange1
Replicat Rrange1
USERID GGS, Password oracle
DISCARDFILE ./dirrpt/Rrange1.dsc, Append
ASSUMETARGETDEFS
MAP Scott.EMP, TARGET Scott.EMP, FILTER (@RANGE (1,3));
MAP Scott.DEPT, TARGET Scott.DEPT, FILTER (@RANGE (1,3));
GGSCI> ADD Replicat Rrange1, EXTTRAIL ./dirdat/rt
GGSCI> Start Rrange1
GGSCI> Info Rrange1
Replicat - 2
GGSCI> EDIT PARAMS Rrange2
Replicat Rrange2
USERID GGS, Password oracle
DISCARDFILE ./dirrpt/Rrange2.dsc, Append
ASSUMETARGETDEFS
MAP Scott.EMP, TARGET Scott.EMP, FILTER (@RANGE (2,3));
MAP Scott.DEPT, TARGET Scott.DEPT, FILTER (@RANGE (2,3));
GGSCI> ADD Replicat Rrange2, EXTTRAIL ./dirdat/rt
GGSCI> Start Rrange2
GGSCI> Info Rrange2
Replicat – 3
GGSCI> EDIT PARAMS Rrange3
Replicat Rrange3
USERID GGS, Password oracle
DISCARDFILE ./dirrpt/Rrange3.dsc, Append
ASSUMETARGETDEFS
MAP Scott.EMP, TARGET Scott.EMP, FILTER (@RANGE (3,3));
MAP Scott.DEPT, TARGET Scott.DEPT, FILTER (@RANGE (3,3));
GGSCI> ADD Replicat Range3, EXTTRAIL ./dirdat/rt
GGSCI> Start Rrange3
GGSCI> Info Rrange3
Now when the data is loaded into the source database, each of the 3 Replicat processes will apply the trail data to target database to different sets groups of rows at the same time (in parallel).
Read more on VirtualBox For GoldenGate
• Provision VirtualBox using Pre-Built Developer VMs
• Adding Disk Storage to Oracle VirtualBox
• Connecting Applications from Desktop to your VirtualBox Database
Some issue with
ERROR DDL committed by one Replicat before waiting for all preceding DMLs
ERROR Delete occurs before the primary key is updated by another Replicat
Now you have other options Integrated and Coordinated Replicat
1) Coordinated Delivery Approach
REP.PRM MAP sales.acct, TARGET sales.acct, THREADRANGE(1-3, ID));
Creating Coordinated Replicat Groups
GGSCI> add replicat rep1, coordinated, exttrail ¡K, maxthreads 10
Creates 12 Replicats¨ in total
rep1 (Coordinator), rep1000 (Thread 0 that is created by default)
rep1001, rep1002, rep1003¡K. rep1010 (Threads 1, 2 ¡K etc.)
Each Replicat has its own checkpoint file, report file and is a fully functional replicat.
Processes will only be created for threads mentioned in the Parameter file MAXTHREADS defaults to 25 threads.
2) Integrated Replicat
• Integrated Replicat for Oracle target databases only 12.1.0.1 and 11.2.0.4
• Leverages database parallel apply servers for automatic dependency aware parallel apply
Single replicat parameter file for all tables
No split transaction semantics
can this be done at the schema level ?
Hi Prasanna,
What version of the database and GoldenGate are you using? In GoldenGate 12c there is better support for paralleling these large tables without manual setup of the @range functions. Hopefully this helps.