Oracle STREAMS for database replication in LCG Openlab
Oracle STREAMS for database replication in LCG Openlab Technical Review 07 June @ CERN Eva Dafonte Pérez
Agenda n n STREAMS overview STREAMS on the TESTBED Configuration PROBLEMS STREAMS log mining configuration survey ¡ ¡ n n Downstream capture Split & Merge solution STREAMS monitoring Throughput TESTS 7 June 2006 Oracle Streams for database replication in LCG 2
STREAMS Overview n n Flexible feature for information sharing Basic elements: ¡ ¡ ¡ n n Capture Staging Consumption Replicate data from one database to one or more databases Databases can be non identical copies 7 June 2006 Oracle Streams for database replication in LCG 3
STREAMS Architecture CAPTURE PROCESS capture changes LCRs capture propagate events LCRs REDO LOG DESTINATION QUEUE APPLY PROCESS user changes 7 June 2006 apply SOURCE QUEUE log changes SOURCE DATABASE propagation Oracle Streams for database replication in LCG apply changes TARGET DATABASE (replica) 4
STREAMS on the TESTBED capture insert into employees values ( 03, “Joan”, …. ); propagation apply CERN LCR CNAF LCR CERN IN 2 P 3 LCR RAL 7 June 2006 LCR Sinica LCR BNL Oracle Streams for database replication in LCG 5
Configuration PROBLEMS spill over insert into employees values ( 03, “Joan”, …. ); capture propagation apply performance CERN LCR LCR LCR CNAF LCR CERN IN 2 P 3 LCR RAL 7 June 2006 LCR Sinica BNL Oracle Streams for database replication in LCG 6
STREAMS log mining survey n Objectives ¡ Source database (CERN) isolation against network or database problems at the replicas n ¡ Replica sites isolation against each other n 7 June 2006 Downstream Capture split & merge solution Oracle Streams for database replication in LCG 7
DOWNSTREAM Capture n n Capture process runs on a different database Redo log files are copied from source to downstream Use of fewer resources and data loss protection Definite latency in the replication process propagation jobs COPY redo log files downstream capture RAL CERN RAC SOURCE DATABASE CERN CNAF DOWNSTREAM DATABASE DESTINATION SITES https: //twiki. cern. ch/twiki/bin/view/PSSGroup/Downstream. Database. Configuration 7 June 2006 Oracle Streams for database replication in LCG 8
DOWNSTREAM Capture n Real-Time Downstream capture ¡ Redo transport services use the LGWR n ¡ ¡ The redo data is stored in the standby redo log at the destination db Capture process captures changes from n n ¡ 7 June 2006 records data in the online redo log at the source db standby redo log -> whenever possible archived redo log files -> whenever necessary Reduces the amount of time required to capture changes made at the source database Oracle Streams for database replication in LCG 9
Split & Merge solution n Split the capture process ¡ ¡ (original) Real-Time capture for sites “in good shape” (new) normal capture for site/s unstable/s n n new capture queue and propagation job original propagation job is dropped à n spilled LCRs are dropped from the original capture queue Merge the capture processes ¡ ¡ Real-Time capture is used for all the sites Resynchronization suggested by Patricia Mc. Elroy ( Principal Product Manager Distributed Systems/Replication) 7 June 2006 Oracle Streams for database replication in LCG 10
Split & Merge solution RAL Real-Time Downstream Capture propagation jobs RAL LCR propagation jobs CERN RAL … CNAF x propagation jobs CERN RAL … CNAF start scn minimum start scn normal Capture start scn: valid scn from which the capture process should start capturing changes propagation job RAL x x propagation job RAL with the help of Volker Kuhr (Oracle consulting) 7 June 2006 Oracle Streams for database replication in LCG 11
STREAMS monitoring n “Home-made” scripts ¡ ¡ ¡ n STRMMON: Oracle Streams monitor tool ¡ n capture, propagation and apply status queues status processes statistics overview of the Streams activity Health Check report ¡ 7 June 2006 information on the setup and operation of Streams Oracle Streams for database replication in LCG 12
STREAMS monitoring Display general information about each capture process -------------------------------------Redo Total Capture Serial Entries LCRs Name Number ID Number State Scanned Enqueued ------------------ ----------------- --------STRMADMIN_CAPTURE C 001 136 7 CAPTURING CHANGES 13394731 705854 STREAMS Monitor, v 2. 2 Copyright Oracle Corp. 2002, 2005. Interval = 3, Count=1000 Logon= @ ORACLE 10. 2. 0 Streams Pool Size = 752 M LOG : <redo generated per sec> NET: <client bytes per sec> <dblink bytes per sec> Cxxx: <lcrs captured per sec> <lcrs enqueued per sec> <capture latency> MEM : <percent of memory used> % <streams pool size> PRxx: <messages received per sec> Qx : <msgs enqueued per sec> <msgs spilled per sec> PSxx: <lcrs propagated per sec> <bytes propaged per sec> Axxx: <lcrs applied per sec> <txns applied per sec> <dequeue latency> <F>: flow control in effect <B>: potential bottleneck <x%I x%F x%xx>: <idle wait events percentage> <flow control wait events percentage> <other wait event percentage and name> xx->: database instance name 2006 -06 -6 16: 25: 26 || d 3 r 1 -> | | | MEM 6 % 752 M 2006 -06 -6 16: 25: 26 || d 3 r 1 -> | LOG 512 | NET 6 K 0 | <B> C 001 0 0 3 sec <0%I 0%F -> | Q 46190 0 0 | PS 01 0 0 0 <89%I 0%F -> | PS 02 0 0 0 <0%I 0%F -> MEM 6 % 752 M 2006 -06 -6 16: 25: 29 || d 3 r 1 -> | LOG 0 | NET 6 K 0 | <F> C 001 0 0 3 sec <0%I 0%F -> | Q 46190 0 0 | PS 01 0 0 0 <100%I 0%F -> | PS 02 0 0 0 <0%I 0%F -> MEM 6 % 752 M 7 June 2006 Oracle Streams for database replication in LCG 13
Throughput TESTS n n n Script written in python Based on condition data Insert only workload LCR CERN LCR CNAF LCR CERN IN 2 P 3 All databases are RAC systems 7 June 2006 Oracle Streams for database replication in LCG 14 …
Throughput TESTS n CERN to CERN replication n CERN to CNAF replication similar results CERN to IN 2 P 3 replication 7 June 2006 Oracle Streams for database replication in LCG 15
Throughput TESTS n n Preliminary numbers (~10 MB/min) Observed Apply process is the bottleneck ¡ n Propagation job stops due to FLOW CONTROL Queues @destinations are not sized appropriately ¡ ¡ n Apply parallelism does not help Improvements together with ORACLE Queue size Processes optimization ¡ ¡ in contact with Patricia Mc. Elroy (Principal Product Manager Distributed Systems/Replication) 7 June 2006 Oracle Streams for database replication in LCG 16
Questions & Answers 7 June 2006 Oracle Streams for database replication in LCG 17
7 June 2006 Oracle Streams for database replication in LCG 18
- Slides: 18