Introduction
GoldenGate technology is very useful for replicating databases between on-premises systems and any public cloud while the databases are being migrated between the two. This helps reduce downtime during cutover.
However, performance becomes a concern when replication does not keep up with the database changes and significant lag builds up in the extracts and, consequently, in the replicats as well. The solution in use is GoldenGate 19c Microservices.
The objective of this document is to provide solutions for reducing or eliminating the replication lag. It addresses issues with the Oracle Retail Application database, but the learnings are applicable to any database.
Recommendations
- Break Up Extracts and Replicats
Give each of the schemas with the largest transaction volume its own extract, combine the smaller ones into one or more extracts, and include tables without a unique key in a final extract. While the lack of a unique key will not impact extract performance, isolating these tables makes it easier to direct their transactions into a single replicat.
This assumes there is no referential integrity between tables in different extracts; tables with referential integrity between them should be included in the same extract. The replicats should then be built so that they match the extracts one for one. To make sure the extracts and replicats connect to the correct instance, a TNS alias must be created for each instance, and separate credentials must be added to the configuration for each one, as sketched below.
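As an illustration only, the following sketch shows one TNS alias and its matching credential alias; the alias names, host, service name, and the ggadmin user are assumptions, not values from this environment.

# tnsnames.ora on the GoldenGate hub: one alias per instance
OGGSRC1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = srcdb-node1)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = srcsvc1))
  )

-- GGSCI / Admin Client: add one credential per alias (prompts for the password;
-- the credential store must already exist, created with ADD CREDENTIALSTORE)
ALTER CREDENTIALSTORE ADD USER ggadmin@OGGSRC1 ALIAS oggsrc1

-- The extract or replicat parameter file then references the alias
USERIDALIAS oggsrc1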
- Streams Pool Size
For an extract, the streams pool size requirement is based on the MAX_SGA_SIZE parameter in the extract parameter file. In this case that is 1000M, which requires a streams pool size of at least 1.25GB.
If the "LogMiner preparer: memory" background wait is above 25%, it suggests that MAX_SGA_SIZE is not large enough. So, if MAX_SGA_SIZE is increased by 50% to 1500M, the streams pool size needs to be increased as well.
Increasing MAX_SGA_SIZE to 1500M and the streams pool size to 3G per instance will provide more headroom and should reduce these background waits, as in the sketch below.
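A minimal sketch of the two changes, assuming an integrated extract and that the source instances can accommodate the larger pool; the sizes are the ones discussed above, not tested values.

-- Extract parameter file: pass the larger LogMiner SGA to the capture process
TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 1500)

-- Source database, applied to every instance: grow the streams pool to match
ALTER SYSTEM SET STREAMS_POOL_SIZE=3G SCOPE=BOTH SID='*';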
For the replicats, the streams pool size should be at least (1GB x the number of replicats) x 1.25.
If the replicats are divided between the RAC nodes and each has parallelism set to 8, then the streams pool size in each instance should be at least 12GB. More is always better when it comes to replicats, and since the database is not being queried significantly, assigning a good portion of the SGA to the streams pool makes sense. If the SGA were to remain at 36GB, then 18GB would be a suitable size for the streams pool, as in the sketch below.
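A sketch of the target-side setting, assuming each RAC instance keeps a 36GB SGA and half of it is given to the streams pool as suggested above:

-- Target database, applied to every instance
ALTER SYSTEM SET STREAMS_POOL_SIZE=18G SCOPE=BOTH SID='*';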
- Improve Database Redo Log Read Performance
When using integrated extracts, the database LogMiner server’s read performance can be improved by increasing the number and size of log read buffers.
This is done by setting the following two database parameters:
_log_read_buffers = 64
_log_read_buffer_size = 128
It is always recommended to set these two parameters in the source database to the values listed above.
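These are hidden (underscore) parameters, so a sketch of how they might be set on the source database is shown below; it assumes an spfile is in use and that an instance restart is acceptable, since the new values take effect at the next startup.

-- Source database: increase the number and size of the log read buffers
ALTER SYSTEM SET "_log_read_buffers"=64 SCOPE=SPFILE SID='*';
ALTER SYSTEM SET "_log_read_buffer_size"=128 SCOPE=SPFILE SID='*';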
- Improve LogMiner Read Performance
To improve performance of the extract process, the "_LOGMINER_READ_BUFFERS" parameter can be introduced. This parameter controls the number of buffers used to read redo. It is recommended to set this parameter to 256.
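Assuming an integrated extract, one way this LogMiner setting is commonly passed is through the extract's TRANLOGOPTIONS INTEGRATEDPARAMS clause, combined with the MAX_SGA_SIZE value shown earlier; treat the line below as a sketch and confirm the exact mechanism for the GoldenGate and database versions in use.

-- Extract parameter file
TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 1500, _LOGMINER_READ_BUFFERS 256)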
- BatchSQL Parameter
The BATCHSQL parameter causes SQL statements across multiple transactions that affect the same table to be grouped together and then applied in arrays.
For some DML patterns this can provide additional performance. BATCHSQL works best for small row changes; for large row changes the benefit is lost, and BATCHSQL is not suitable when the row size exceeds 5K. A parameter-file sketch follows the list below.
There are a few conditions that stop BATCHSQL processing, such as:
- LOB and LONG data
- Row greater than 25k
- Tables having more than one unique key besides primary key
- Errors
In these cases the batch is rolled back and the replicat returns to the default apply mode. Note also that BATCHSQL's grouping changes the execution order of the SQL statements, which is useful to be aware of in some cases.
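A minimal sketch of enabling BATCHSQL in a replicat parameter file; the batch sizing values are illustrative assumptions, not tuned figures for this environment.

-- Replicat parameter file: group compatible operations into array executions
BATCHSQL BATCHESPERQUEUE 100, OPSPERBATCH 2000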
- Add Extract Performance Diagnostics
The REPORTCOUNT parameter allows the performance figures of the extract and replicat processes to be logged in their report files, which helps when investigating performance degradation.
It is suggested to log the performance figures for the replicat and extract processes every 20 minutes by adding the following line to each parameter file:
REPORTCOUNT EVERY 20 MINUTES, RATE
- Tables without Unique Keys
Tables without unique keys can present serious issues for GoldenGate. If there are updates or deletes against a table of any significant size without a unique key, the whole table must be scanned to find the row to be updated or deleted. Where in the source database a single DML statement might update all the rows in a table, GoldenGate must convert that into individual row updates, each of which scans the table.
To resolve this, artificial keys may need to be created for certain tables, as in the sketch below.
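One common GoldenGate-side way to provide such a key without changing the table is the KEYCOLS clause, which designates existing columns for GoldenGate to treat as a logical key on both the extract and replicat side; the schema, table, and column names below are hypothetical.

-- Extract parameter file
TABLE retail.sales_staging, KEYCOLS (store_id, txn_date, line_no);

-- Replicat parameter file
MAP retail.sales_staging, TARGET retail.sales_staging, KEYCOLS (store_id, txn_date, line_no);

The chosen columns should uniquely identify a row; ideally a matching index exists on the target so the apply does not fall back to full-table scans.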
- Network Performance
To improve the performance of the GoldenGate hub extract processes, Oracle Net connectivity should be optimized with a higher SDU size. The SDU size controls the size of the buffer that Oracle Net uses for sending data across the network, and a larger buffer should improve Oracle Net performance. For Oracle Database 12c Release 1 and later the maximum SDU size is 2MB, and it is recommended to use that value. The SDU size needs to be set on the client side as well as the server side, as in the sketch below.
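A sketch of the SDU settings, assuming they are applied through sqlnet.ora on both the GoldenGate hub (client) and the database servers; the connect descriptor reuses the hypothetical alias from the earlier example.

# sqlnet.ora on both client and server
DEFAULT_SDU_SIZE = 2097152

# Alternatively, per connect descriptor in tnsnames.ora (hypothetical alias)
OGGSRC1 =
  (DESCRIPTION =
    (SDU = 2097152)
    (ADDRESS = (PROTOCOL = TCP)(HOST = srcdb-node1)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = srcsvc1))
  )

Oracle Net negotiates the SDU down to the smaller of the client and server values, which is why both sides need the setting.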
- COMMIT_WAIT Parameter
This parameter can be added to a replicat parameter file if you have large volumes of inserts. It is particularly useful for a replicat that only contains high-throughput tables that are mainly inserts. It may impact ACID compliance but should improve performance.
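A sketch of one way this is sometimes done: issuing an ALTER SESSION from the replicat parameter file with a standalone SQLEXEC, which runs in the replicat's database session at startup. This assumes a classic (non-integrated) replicat, where that session is also the session applying the changes; for integrated or parallel replicat the apply servers run in their own sessions and this approach may not take effect.

-- Replicat parameter file: make commits asynchronous for this session only
SQLEXEC "ALTER SESSION SET COMMIT_WAIT = 'NOWAIT'"
SQLEXEC "ALTER SESSION SET COMMIT_LOGGING = 'BATCH'"

Because a NOWAIT commit returns before the redo is guaranteed on disk, this trades a small durability window for throughput, which is the ACID impact mentioned above.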
Benefits:
The GoldenGate implementation was very successful and reduced downtime during cutover by 60% with real-time extract and replicat.