Oracle Data Guard Support Issues Brian Hitchcock OCP
Oracle Data. Guard Support Issues Brian Hitchcock OCP 10 g DBA Sun Microsystems brian. hitchcock@sun. com brhora@aol. com www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 1
Oracle Data. Guard Ÿ Focus on Logical Standby Support Issues Ÿ Maintains a standby database – – Archived redo logs on primary Sent to standby and applied Ÿ Simple idea Ÿ Many configuration options Ÿ Can become very complicated www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 2
Data. Guard Ÿ Must be SYS to make changes – Sqlplus / as sysdba Ÿ Changes to Data. Guard standby database – Some can’t be made while apply process running Ÿ Change Guard status Ÿ Support Issues – Create physical standby – Convert to logical standby After logical standby is running Refresh process – – www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 3
Data. Guard Errors Ÿ Data. Guard reports lot of errors – – Standby database alert log Many are for normal operation Ÿ Why reported as errors? – Monitoring of db alert log Ÿ Will report these ‘errors’ Ÿ Hard to filter out normal ‘errors’ www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 4
Create Physical Standby Ÿ On Primary database – – – Enable Forced Logging Create password file Setup init. ora/spfile parameters Ÿ Can’t connect to standby Ÿ SYS password – – – Verify archiving enabled Backup db (hot or cold) Create standby control file www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 5
Create Physical Standby Ÿ On Standby database – – Copy db backup files from primary Copy standby control file from primary Setup init. ora/spfile parameters Start physical standby db Ÿ Trace file – Verify physical standby working Ÿ May not ‘see’ redo logs, register them Ÿ Redo logs not deleted, use RMAN www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 6
Convert to Logical Standby Ÿ On Primary database – Build Log. Miner dictionary Ÿ On Standby database – – – Stop redo apply Ÿ Errors, no impact Convert database to logical standby Ÿ Two trace files Restart db Open resetlogs Verify logical standby working www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 7
Logical Standby is Running Ÿ Business requirements – – Standby frozen most of the day Standby catches up once per day Ÿ Alert log messages while catching up Ÿ Disk space for archived redo logs Ÿ Other issues – Apply process is slow Ÿ How to detect, resolve – Primary versus Standby backups Ÿ Impact, resolution www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 8
Logical Standby is Running Ÿ Other Issues – Constraint violations Ÿ Errors, resolution – No data found Ÿ Errors, resolution – ORA 16211 Ÿ Errors, Oracle Support – Primary db XDB schema issues Ÿ Fixed on primary, errors on standby www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 9
Logical Standby is Running Ÿ Other issues – ORA-07445 Ÿ Refresh cures all – Refresh process Ÿ After refresh – ORA-16211: unsupported record found in the archived redo log – Compile invalid objects Import into standby database – www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 10
Primary Can’t Connect Ÿ Standby not available – – Reported on primary production database “ORACLE not available” Ÿ Looks like production primary is down – Your monitoring may need to be adjusted Thu Oct 18 16: 59: 20 2007 Error 1034 received logging on to the standby Thu Oct 18 16: 59: 20 2007 Errors in file /shared/orahome 01/admin/BRHPROD/bdump/brhprod_arc 1_2635. trc: ORA 01034: ORACLE not available PING[ARC 1]: Heartbeat failed to connect to standby ‘BRHPRSB'. Error is 1034. www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 11
SYS Password Issue Ÿ Primary tries to connect to standby Mon Oct 8 15: 31: 36 2007 Error 1017 received logging on to the standby ------------------------------Check that the primary and standby are using a password file and remote_login_passwordfile is set to SHARED or EXCLUSIVE, and that the SYS password is same in the password files. returning error ORA-16191 ------------------------------Mon Oct 8 15: 31: 36 2007 Errors in file /orahome 01/admin/BRHBETA/bdump/brhbeta_arc 0_2309. trc: ORA-16191: Primary log shipping client not logged on standby PING[ARC 0]: Heartbeat failed to connect to standby ‘BRHBRSB'. Error is 16191. www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 12
SYS Password Issue Ÿ Verify SYS password is the same – – On primary and standby Sqlplus sys/<password> Ÿ Verify password file has same password – – On primary and standby Cat $ORACLE_HOME/dbs/orapw<SID> Ÿ Refresh password file – – Alter user SYS identified by <password> Update password file www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 13
Data. Guard Trace File Ÿ Physical Standby – – – Start log apply process Trace file created Stops when log apply process stops Ÿ See file contents later… www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 14
Can’t ‘See’ Redo Logs Ÿ Physical Standby Ÿ Creating or Refreshing standby – – Primary configured, sending redo logs Standby not yet created/running Ÿ Standby may not register redo logs – Our scripts maintain primary archived redo logs Ÿ Compress to save disk space, delete after 2 days Ÿ Manually register – – Alter database register logfile ‘<logfile>’; Data. Guard applies redo log www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 15
Can’t ‘See’ Redo Logs BRHBETA> select * from v$archive_gap; THREAD# LOW_SEQUENCE# HIGH_SEQUENCE# 1 1959 1976 BRHBETA> select sequence#, applied from v$archived_log order by sequence#; SEQUENCE# APP 1956 YES 1957 YES 1958 YES 1977 NO 1978 NO 1979 NO 1980 NO 1981 NO 1982 NO 1983 NO 1984 NO 1959 thru 1976 on standby 11 rows selected. www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 16
Redo Logs Not Deleted Ÿ Physical Standby – – After applied to standby Unlike logical standby Ÿ SQL apply process does delete them Ÿ Use RMAN Ÿ Possible disk space issues on standby – – How long will you need to store redo logs? Not an issue if converting to logical soon www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 17
Stop Physical Standby Ÿ Log Apply Process start – – – Starts trace file When physical standby first created Ends when log apply stops Ÿ Normal processing – Trace file looks like a problem www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 18
Turn Off Apply Process Ÿ Physical standby – Turn off apply process Ÿ Regular maintenance Ÿ Converting to Logical Standby – Generates ‘error’ Ÿ Why is this an error? Ÿ Typical of Data. Guard Ÿ Everything seems to be an ‘error’ Even when it is perfectly routine Ÿ Makes support more difficult When is an error something to worry about? www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 19
Standby Alert Log Tue Oct 9 16: 34 2007 Physical Standby Database mounted. Completed: ALTER DATABASE MOUNT Tue Oct 9 16: 34: 36 2007 alter database recover managed standby database disconnect from session Tue Oct 9 16: 34: 36 2007 Attempt to start background Managed Standby Recovery process (BRHBETA) MRP 0 started with pid=11, OS id=13474 Tue Oct 9 16: 34: 36 2007 MRP 0: Background Managed Standby Recovery process started (BRHBETA) Managed Standby Recovery not using Real Time Apply parallel recovery started with 7 processes Log apply process started when physical standby created www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 20
Standby Alert Log Wed Oct 10 10: 15 2007 alter database recover managed standby database cancel Wed Oct 10 10: 15: 19 2007 MRP 0: Background Media Recovery cancelled with status 16037 Wed Oct 10 10: 15: 19 2007 Errors in file /orahome 01/admin/BRHBETA/bdump/brhbeta_mrp 0_13474. trc: ORA-16037: user requested cancel of managed recovery operation Recovery interrupted! Wed Oct 10 10: 15: 20 2007 Errors in file /orahome 01/admin/BRHBETA/bdump/brhbeta_mrp 0_13474. trc: ORA-16037: user requested cancel of managed recovery operation Wed Oct 10 10: 15: 20 2007 MRP 0: Background Media Recovery process shutdown (BRHBETA) Wed Oct 10 10: 15: 21 2007 Managed Standby Recovery Canceled (BRHBETA) Wed Oct 10 10: 15: 21 2007 Completed: alter database recover managed standby database cancel Log apply process stopped – preparing to convert to logical standby www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 21
Trace File $ more /orahome 01/admin/BRHBETA/bdump/brhbeta_mrp 0_13474. trc Oracle Database 10 g Enterprise Edition Release 10. 2. 0 - 64 bit Production With the Partitioning, OLAP and Data Mining options ORACLE_HOME = /orahome 01/product/10. 2. 0 System name: Sun. OS Node name: brh-beta 1 -zone 04 Release: 5. 10 Version: Generic_118833 -36 Machine: sun 4 u Instance name: BRHBETA Redo thread mounted by this instance: 1 Oracle process number: 11 Unix process pid: 13474, image: oracle@beta 1 -zone 04 (MRP 0) *** SERVICE NAME: () 2007 -10 -09 16: 34: 36. 298 *** SESSION ID: (394. 1) 2007 -10 -09 16: 34: 36. 298 ARCH: Connecting to console port. . . *** 2007 -10 -09 16: 34: 36. 299 60639 kcrr. c Start applying redo logs to physical standby MRP 0: Background Managed Standby Recovery process started www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 22
Trace File *** 2007 -10 -09 16: 34: 41. 302 1018 krsm. c Managed Recovery: Initialization posted. *** 2007 -10 -09 16: 34: 41. 303 60639 kcrr. c Managed Standby Recovery not using Real Time Apply Recovery target incarnation = 2, activation ID = 0 Influx buffer limit = 27762 (50% x 55524) Successfully allocated 7 recovery slaves Using 158 overflow buffers per recovery slave Start recovery at thread 1 ckpt scn 8257757517457 logseq 1956 block 5 *** 2007 -10 -09 16: 34: 42. 124 Media Recovery add redo thread 1 *** 2007 -10 -09 16: 34: 42. 124 1018 krsm. c Recreating redo logs Managed Recovery: Active posted. ORA-00367: checksum error in log file header ORA-00305: log 1 of thread 1 inconsistent; belongs to another database ORA-00312: online log 1 thread 1: '/shared/oralogs 01/BRHBETA/redo 01 a. log' *** 2007 -10 -09 16: 34: 42. 147 60639 kcrr. c Clearing online redo logfile 1 /shared/oralogs 01/BRHBETA/redo 01 a. log *** 2007 -10 -09 16: 36: 15. 066 60639 kcrr. c Clearing online redo logfile 1 complete ORA-00367: checksum error in log file header ORA-00305: log 2 of thread 1 inconsistent; belongs to another database ORA-00312: online log 2 thread 1: '/shared/oralogs 01/BRHBETA/redo 02 a. log' www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 23
Trace File *** 2007 -10 -09 16: 36: 15. 100 60639 kcrr. c Clearing online redo logfile 2 /shared/oralogs 01/BRHBETA/redo 02 a. log *** 2007 -10 -09 16: 37: 51. 473 60639 kcrr. c Clearing online redo logfile 2 complete ORA-00367: checksum error in log file header ORA-00305: log 3 of thread 1 inconsistent; belongs to another database ORA-00312: online log 3 thread 1: '/shared/oradata 02/BRHBETA/redo 03 b. log' *** 2007 -10 -09 16: 37: 51. 479 60639 kcrr. c Clearing online redo logfile 3 /shared/oradata 02/BRHBETA/redo 03 b. log *** 2007 -10 -09 16: 39: 26. 048 60639 kcrr. c Clearing online redo logfile 3 complete ORA-00367: checksum error in log file header ORA-00305: log 4 of thread 1 inconsistent; belongs to another database ORA-00312: online log 4 thread 1: '/shared/oradata 02/BRHBETA/redo 04 b. log' *** 2007 -10 -09 16: 39: 26. 488 60639 kcrr. c Clearing online redo logfile 4 /shared/oradata 02/BRHBETA/redo 04 b. log *** 2007 -10 -09 16: 41: 00. 447 60639 kcrr. c Clearing online redo logfile 4 complete *** 2007 -10 -09 16: 41: 00. 469 60639 kcrr. c Media Recovery Waiting for thread 1 sequence 1956 www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 24
Trace File Applying redo logs to physical standby *** 2007 -10 -09 16: 41: 00. 469 60639 kcrr. c Fetching gap sequence in thread 1, gap sequence 1956 -1976 *** 2007 -10 -09 16: 41: 30. 782 -----------------------------Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps. -----------------------------*** 2007 -10 -09 16: 54: 31. 045 60639 kcrr. c Fetching gap sequence in thread 1, gap sequence 1956 -1956 *** 2007 -10 -09 16: 55: 01. 154 -----------------------------Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps. -----------------------------*** 2007 -10 -09 16: 56: 31. 179 Media Recovery Log /oraarch 01/BRHBETA/LOG_1956_1_629245032. arc *** 2007 -10 -09 16: 56: 33. 431 Media Recovery Log /oraarch 01/BRHBETA/LOG_1957_1_629245032. arc *** 2007 -10 -09 16: 56: 44. 495 www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 25
Trace File *** 2007 -10 -09 16: 56: 44. 495 60639 kcrr. c Media Recovery Waiting for thread 1 sequence 1958 *** 2007 -10 -09 16: 56: 44. 495 60639 kcrr. c Fetching gap sequence in thread 1, gap sequence 1958 -1976 *** 2007 -10 -09 16: 57: 14. 647 -----------------------------Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps. -----------------------------*** 2007 -10 -09 17: 05: 14. 785 Media Recovery Log /oraarch 01/BRHBETA/LOG_1958_1_629245032. arc *** 2007 -10 -09 17: 05: 18. 043 60639 kcrr. c Media Recovery Waiting for thread 1 sequence 1959 *** 2007 -10 -09 17: 05: 18. 043 60639 kcrr. c Fetching gap sequence in thread 1, gap sequence 1959 -1976 *** 2007 -10 -09 17: 05: 48. 284 -----------------------------Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps. ------------------------------ www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 26
Trace File *** 2007 -10 -09 17: 07: 18. 309 Media Recovery Log /oraarch 01/BRHBETA/LOG_1959_1_629245032. arc *** 2007 -10 -09 17: 07: 21. 114 Media Recovery Log /oraarch 01/BRHBETA/LOG_1960_1_629245032. arc *** 2007 -10 -09 17: 07: 22. 945 Media Recovery Log /oraarch 01/BRHBETA/LOG_1961_1_629245032. arc *** 2007 -10 -09 17: 07: 27. 300 Media Recovery Log /oraarch 01/BRHBETA/LOG_1962_1_629245032. arc *** 2007 -10 -09 17: 07: 29. 637 Media Recovery Log /oraarch 01/BRHBETA/LOG_1963_1_629245032. arc *** 2007 -10 -09 17: 07: 29. 709 60639 kcrr. c Media Recovery Waiting for thread 1 sequence 1964 *** 2007 -10 -09 17: 07: 29. 709 60639 kcrr. c Fetching gap sequence in thread 1, gap sequence 1964 -1976 *** 2007 -10 -09 17: 07: 59. 858 -----------------------------Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps. -----------------------------*** 2007 -10 -09 17: 08: 29. 866 Media Recovery Log /oraarch 01/BRHBETA/LOG_1964_1_629245032. arc www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 27
Trace File *** 2007 -10 -09 Media Recovery *** 2007 -10 -09 Media Recovery *** 2007 -10 -09 Media Recovery 17: 08: 31. 924 Log /oraarch 01/BRHBETA/LOG_1965_1_629245032. arc 17: 09: 12. 510 Log /oraarch 01/BRHBETA/LOG_1966_1_629245032. arc 17: 09: 21. 050 Log /oraarch 01/BRHBETA/LOG_1967_1_629245032. arc 17: 09: 40. 234 Log /oraarch 01/BRHBETA/LOG_1968_1_629245032. arc 17: 09: 45. 055 Log /oraarch 01/BRHBETA/LOG_1969_1_629245032. arc 17: 09: 50. 572 Log /oraarch 01/BRHBETA/LOG_1970_1_629245032. arc 17: 09: 58. 968 Log /oraarch 01/BRHBETA/LOG_1971_1_629245032. arc 17: 10: 03. 922 Log /oraarch 01/BRHBETA/LOG_1972_1_629245032. arc 17: 10: 13. 196 Log /oraarch 01/BRHBETA/LOG_1973_1_629245032. arc 17: 10: 21. 927 Log /oraarch 01/BRHBETA/LOG_1974_1_629245032. arc 17: 10: 34. 064 Log /oraarch 01/BRHBETA/LOG_1975_1_629245032. arc 17: 10: 42. 420 60639 kcrr. c Waiting for thread 1 sequence 1976 www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 28
Trace File *** 2007 -10 -09 17: 10: 42. 421 60639 kcrr. c Fetching gap sequence in thread 1, gap sequence 1976 -1976 *** 2007 -10 -09 17: 11: 12. 538 -----------------------------Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps. -----------------------------*** 2007 -10 -09 17: 12: 42. 563 Media Recovery Log /oraarch 01/BRHBETA/LOG_1976_1_629245032. arc *** 2007 -10 -09 17: 12: 45. 563 Media Recovery Log /oraarch 01/BRHBETA/LOG_1977_1_629245032. arc *** 2007 -10 -09 17: 12: 48. 534 Media Recovery Log /oraarch 01/BRHBETA/LOG_1978_1_629245032. arc *** 2007 -10 -09 17: 13: 00. 505 Media Recovery Log /oraarch 01/BRHBETA/LOG_1979_1_629245032. arc *** 2007 -10 -09 17: 13: 02. 054 Media Recovery Log /oraarch 01/BRHBETA/LOG_1980_1_629245032. arc *** 2007 -10 -09 17: 13: 03. 231 Media Recovery Log /oraarch 01/BRHBETA/LOG_1981_1_629245032. arc *** 2007 -10 -09 17: 13: 03. 902 Media Recovery Log /oraarch 01/BRHBETA/LOG_1982_1_629245032. arc www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 29
Trace File *** 2007 -10 -09 Media Recovery *** 2007 -10 -09 Media Recovery *** 2007 -10 -09 Media Recovery *** 2007 -10 -09 Media Recovery 17: 13: 04. 492 Log /oraarch 01/BRHBETA/LOG_1983_1_629245032. arc 17: 13: 08. 171 Log /oraarch 01/BRHBETA/LOG_1984_1_629245032. arc 17: 13: 26. 860 60639 kcrr. c Waiting for thread 1 sequence 1985 17: 16: 07. 172 Log /oraarch 01/BRHBETA/LOG_1985_1_629245032. arc 17: 16: 08. 067 Log /oraarch 01/BRHBETA/LOG_1986_1_629245032. arc 17: 16: 08. 131 Log /oraarch 01/BRHBETA/LOG_1987_1_629245032. arc 17: 16: 08. 195 60639 kcrr. c Waiting for thread 1 sequence 1988 17: 16: 13. 202 Log /oraarch 01/BRHBETA/LOG_1988_1_629245032. arc 17: 16: 13. 268 60639 kcrr. c Waiting for thread 1 sequence 1989 21: 14: 01. 119 Log /oraarch 01/BRHBETA/LOG_1989_1_629245032. arc 21: 14: 16. 922 60639 kcrr. c Waiting for thread 1 sequence 1990 www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 30
Trace File *** 2007 -10 -10 09: 32: 33. 399 60639 kcrr. c Fetching gap sequence in thread 1, gap sequence 1990 -1990 *** 2007 -10 -10 09: 33: 05. 187 Media Recovery Log /oraarch 01/BRHBETA/LOG_1990_1_629245032. arc *** 2007 -10 -10 09: 33: 22. 505 Media Recovery Log /oraarch 01/BRHBETA/LOG_1991_1_629245032. arc *** 2007 -10 -10 09: 33: 22. 570 Media Recovery Log /oraarch 01/BRHBETA/LOG_1992_1_629245032. arc *** 2007 -10 -10 09: 33: 22. 631 Media Recovery Log /oraarch 01/BRHBETA/LOG_1993_1_629245032. arc *** 2007 -10 -10 09: 33: 22. 693 Media Recovery Log /oraarch 01/BRHBETA/LOG_1994_1_629245032. arc *** 2007 -10 -10 09: 33: 22. 761 Media Recovery Log /oraarch 01/BRHBETA/LOG_1995_1_629245032. arc *** 2007 -10 -10 09: 33: 22. 807 Media Recovery Log /oraarch 01/BRHBETA/LOG_1996_1_629245032. arc *** 2007 -10 -10 09: 33: 22. 864 Media Recovery Log /oraarch 01/BRHBETA/LOG_1997_1_629245032. arc *** 2007 -10 -10 09: 33: 22. 918 Media Recovery Log /oraarch 01/BRHBETA/LOG_1998_1_629245032. arc *** 2007 -10 -10 09: 33: 23. 199 Media Recovery Log /oraarch 01/BRHBETA/LOG_1999_1_629245032. arc www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 31
Trace File Stop Log Apply Process Ready to convert to Logical Standby *** 2007 -10 -10 09: 33: 23. 255 60639 kcrr. c Media Recovery Waiting for thread 1 sequence 2000 *** 2007 -10 -10 10: 11: 07. 685 Media Recovery Log /oraarch 01/BRHBETA/LOG_2000_1_629245032. arc *** 2007 -10 -10 10: 11: 08. 422 60639 kcrr. c Media Recovery Waiting for thread 1 sequence 2001 *** 2007 -10 -10 10: 14: 48. 843 Media Recovery Log /oraarch 01/BRHBETA/LOG_2001_1_629245032. arc *** 2007 -10 -10 10: 14: 49. 013 60639 kcrr. c Media Recovery Waiting for thread 1 sequence 2002 *** 2007 -10 -10 10: 15: 19. 072 60639 kcrr. c MRP 0: Background Media Recovery cancelled with status 16037 ORA-16037: user requested cancel of managed recovery operation ----- Redo read statistics for thread 1 ----Read rate (ASYNC): 619732 Kb in 63640. 12 s => 0. 01 Mb/sec Total physical reads: 619732 Kb Longest record: 28 Kb, moves: 0/2001133 (0%) Change moves: 779641/4101685 (19%), moved: 141 Mb Longest LWN: 1023 Kb, moves: 117/175493 (0%), moved: 23 Mb Last redo scn: 0 x 0782. a 8 f 27 f 37 (8257761607479) -----------------------*** 2007 -10 -10 10: 15: 19. 088 Media Recovery drop redo thread 1 www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 32
Trace File *** 2007 -10 -10 10: 15: 20. 864 1018 krsm. c Managed Recovery: Not Active posted. ORA-16037: user requested cancel of managed recovery operation ARCH: Connecting to console port. . . *** 2007 -10 -10 10: 15: 20. 871 60639 kcrr. c MRP 0: Background Media Recovery process shutdown *** 2007 -10 -10 10: 15: 20. 871 1018 krsm. c oraarch 01/BRHBETA $ www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 33
Convert to Logical Standby Ÿ SQL Apply Process – – – When applying redo logs Generates 2 trace files What are they? Ÿ Trace files – – – One shows start of kcrrwkx Second shows end of kcrrwkx What are these for? Neither show up in alert log Both continue as long as SQL apply process runs www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 34
First Trace File /orahome 01/admin/BRHBETA/bdump $ more brhbeta_arc 0_13168. trc /orahome 01/admin/BRHBETA/bdump/brhbeta_arc 0_13168. trc Oracle Database 10 g Enterprise Edition Release 10. 2. 0 - 64 bit Production With the Partitioning, OLAP and Data Mining options ORACLE_HOME = /orahome 01/product/10. 2. 0 System name: Sun. OS Node name: brh-beta 1 -zone 04 Release: 5. 10 Version: Generic_118833 -36 Machine: sun 4 u Instance name: BRHBETA Redo thread mounted by this instance: 1 Oracle process number: 24 Unix process pid: 13168, image: oracle@beta 1 -zone 04 (ARC 0) *** SERVICE NAME: () 2007 -10 -10 10: 40: 26. 358 *** SESSION ID: (188. 2) 2007 -10 -10 10: 40: 26. 358 kcrrwkx: nothing to do (start) *** 2007 -10 -10 10: 45: 26. 240 kcrrwkx: nothing to do (start) *** 2007 -10 -10 10: 46: 35. 388 kcrrwkx: nothing to do (end). . . www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 35
Second Trace File /orahome 01/admin/BRHBETA/bdump $ more brhbeta_arc 1_13170. trc /orahome 01/admin/BRHBETA/bdump/brhbeta_arc 1_13170. trc Oracle Database 10 g Enterprise Edition Release 10. 2. 0 - 64 bit Production With the Partitioning, OLAP and Data Mining options ORACLE_HOME = /orahome 01/product/10. 2. 0 System name: Sun. OS Node name: brh-beta 1 -zone 04 Release: 5. 10 Version: Generic_118833 -36 Machine: sun 4 u Instance name: BRHBETA Redo thread mounted by this instance: 1 Oracle process number: 9 Unix process pid: 13170, image: oracle@beta 1 -zone 04 (ARC 1) *** SERVICE NAME: () 2007 -10 -10 10: 40: 26. 358 *** SESSION ID: (396. 1) 2007 -10 -10 10: 40: 26. 358 kcrrwkx: nothing to do (start) *** 2007 -10 -10 10: 41: 26. 315 kcrrwkx: nothing to do (end) *** 2007 -10 -10 10: 42: 26. 322 kcrrwkx: nothing to do (end) … … www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 36
Data. Guard Likes to Chat Ÿ Physical Standby – – While applying archived redo logs Trace file documents everything standby does Ÿ Logical Standby – – – Once converted to logical standby Two trace files generated Contain messages for start/stop of each log apply Ÿ Why are these generated? – – Why not have Data. Guard alert logs? Trace files tell me that something is wrong www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 37
Normal Operation Ÿ Logical Standby catching up to Primary – Apply process turned off during the day Ÿ Catches up at night – Apply process failed Ÿ Catch up after fix (skip table in the example) Ÿ Typical alert log messages – – – Redo log from primary registered with DG Redo logs applied to standby Redo logs deleted from standby www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 38
Standby Catching Up Tue Oct 16 15: 13: 22 2007 Stop SQL Apply process Completed: ALTER DATABASE STOP LOGICAL STANDBY APPLY Tue Oct 16 15: 14: 16 2007 Incremental checkpoint up to RBA [0 x 7. a 0 aa 2. 0], current log tail at RBA [0 x 7. b 8 e 2 c. 0] Tue Oct 16 15: 14: 45 2007 ALTER DATABASE START LOGICAL STANDBY APPLY Start SQL Apply process after skipping table Tue Oct 16 15: 14: 45 2007 ALTER DATABASE START LOGICAL STANDBY APPLY (BRHBETA) Tue Oct 16 15: 14: 45 2007 No optional part Attempt to start background Logical Standby process LSP 0 started with pid=21, OS id=5041 LOGSTDBY status: ORA-16111: log mining and apply setting up Tue Oct 16 15: 14: 46 2007 LOGMINER: Parameters summary for session# = 1 LOGMINER: Number of processes = 3, Transaction Chunk Size = 201 LOGMINER: Memory Size = 30 M, Checkpoint interval = 150 M Tue Oct 16 15: 14: 46 2007 Completed: ALTER DATABASE START LOGICAL STANDBY APPLY LOGMINER: session# = 1, builder process P 001 started with pid=7 OS id=10018 LOGMINER: session# = 1, reader process P 000 started with pid=34 OS id=10014 LOGMINER: session# = 1, preparer process P 002 started with pid=36 OS id=10020 LSP 2 started with pid=23, OS id=5043 www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 39
Standby Catching Up Tue Oct 16 15: 14: 48 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2048_1_629245032. arc LOGSTDBY Analyzer process P 003 started with pid=13 OS id=10051 Tue Oct 16 15: 14: 48 2007 LOGMINER: Turning ON Log Auto Delete LOGSTDBY Apply process P 004 started with pid=40 OS id=10054 LOGSTDBY Apply process P 006 started with pid=42 OS id=10062 LOGSTDBY Apply process P 007 started with pid=17 OS id=10064 LOGSTDBY Apply process P 005 started with pid=15 OS id=10060 Tue Oct 16 15: 22: 02 2007 Beginning log switch checkpoint up to RBA [0 x 8. 2. 10], SCN: 8295181217591 Thread 1 advanced to log sequence 8 Current log# 4 seq# 8 mem# 0: /shared/oradata 02/BRHBETA/redo 04 b. log Current log# 4 seq# 8 mem# 1: /shared/oralogs 01/BRHBETA/redo 04 a. log Tue Oct 16 15: 28 2007 Completed checkpoint up to RBA [0 x 8. 2. 10], SCN: 8295181217591 Tue Oct 16 15: 34: 32 2007 Incremental checkpoint up to RBA [0 x 8. 4 cbae. 0], current log tail at RBA [0 x 8. 65553. 0] Tue Oct 16 15: 42: 40 2007 LOGMINER: End mining logfile: /oraarch 01/BRHBETA/LOG_2048_1_629245032. arc Tue Oct 16 15: 42: 40 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2049_1_629245032. arc. . . www. brianhitchcock. net. . . Brian Hitchcock October 23, 2007 Processing redo logs Page 40
Standby Catching Up Tue Oct 16 17: 20: 48 2007 LOGMINER: End mining logfile: /oraarch 01/BRHBETA/LOG_2049_1_629245032. arc Processing redo logs Tue Oct 16 17: 20: 48 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2050_1_629245032. arc Tue Oct 16 17: 20: 54 2007 LOGMINER: End mining logfile: /oraarch 01/BRHBETA/LOG_2050_1_629245032. arc. . Deleting redo logs Tue Oct 16 18: 39: 13 2007 LOGMINER: Log Auto Delete - deleting: /oraarch 01/BRHBETA/LOG_2048_1_629245032. arc Deleted file /oraarch 01/BRHBETA/LOG_2048_1_629245032. arc. . Tue Oct 16 18: 43: 40 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2082_1_629245032. arc Tue Oct 16 18: 43: 59 2007 Processing redo logs LOGMINER: End mining logfile: /oraarch 01/BRHBETA/LOG_2082_1_629245032. arc Tue Oct 16 18: 43: 59 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2083_1_629245032. arc www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 41
Standby Catching Up Tue Oct 16 18: 44: 01 2007 LOGMINER: Log Auto Delete - deleting: /oraarch 01/BRHBETA/LOG_2056_1_629245032. arc Deleted file /oraarch 01/BRHBETA/LOG_2056_1_629245032. arc Tue Oct 16 18: 44: 01 2007 LOGMINER: Log Auto Delete - deleting: /oraarch 01/BRHBETA/LOG_2057_1_629245032. arc Deleted file /oraarch 01/BRHBETA/LOG_2057_1_629245032. arc Tue Oct 16 18: 44: 01 2007 LOGMINER: Log Auto Delete - deleting: /oraarch 01/BRHBETA/LOG_2058_1_629245032. arc Deleted file /oraarch 01/BRHBETA/LOG_2058_1_629245032. arc. . . … … Deleting redo logs www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 42
Standby Catching Up Standby is at 2087 Tue Oct 16 18: 44: 15 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2087_1_629245032. arc Tue Oct 16 18: 48: 37 2007 Completed checkpoint up to RBA [0 xa. 2. 10], SCN: 8295181577382 Tue Oct 16 18: 55: 18 2007 Incremental checkpoint up to RBA [0 xa. 12 dad. 0], current log tail at RBA [0 xa. 1314 b. 0] Tue Oct 16 19: 01: 31 2007 RFS[1]: No standby redo logfiles created RFS[1]: Archived Log: '/oraarch 01/BRHBETA/LOG_2153_1_629245032. arc' Primary is at 2153 Tue Oct 16 19: 01: 32 2007 RFS Log. Miner: Registered logfile [/oraarch 01/BRHBETA/LOG_2153_1_629245032. arc] to Log. Miner session id [1] Tue Oct 16 19: 15: 22 2007 Incremental checkpoint up to RBA [0 xa. 142 b 2. 0], current log tail at RBA [0 xa. 143 fe. 0] Tue Oct 16 19: 29: 01 2007 LSP 0: warning -- apply server 2, sid 384 waiting on user sid 196 for event (since 0 seconds): Tue Oct 16 19: 29: 01 2007 LOGMINER: End mining logfile: /oraarch 01/BRHBETA/LOG_2087_1_629245032. arc. . . … … www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 43
Standby Catching Up Unsupported DDL – Standby doesn’t execute Tue Oct 16 19: 30: 58 2007 LOGSTDBY stmt: CREATE PFILE = '/tmp/datatools/BRHBETA. PFILE. 19144. 1192413665' FROM SPFILE =… LOGSTDBY status: ORA-16226: DDL skipped due to lack of support LOGSTDBY id: XID 0 x 0003. 02 d. 00013 e 70, h. SCN 0 x 0782. a 9 c 2 fdb 8, l. SCN 0 x 0782. a 9 c 2 fdb 8, Thread 1, RBA … LOGSTDBY stmt: create pfile='/orahome 01/oradba/tmp/ora_adm_sqlbt_bkp. tmp 1. 17449. BRHBETA' from spfile … LOGSTDBY status: ORA-16226: DDL skipped due to lack of support LOGSTDBY id: XID 0 x 000 b. 001. 000126 cf, h. SCN 0 x 0782. a 9 c 2 fe 15, l. SCN 0 x 0782. a 9 c 2 fe 15, Thread 1, RBA… LOGSTDBY stmt: CREATE PFILE = '/tmp/datatools/BRHBETA. PFILE. 19695. 1192413687' FROM SPFILE = … LOGSTDBY status: ORA-16226: DDL skipped due to lack of support LOGSTDBY id: XID 0 x 0003. 00 c. 00013 e 62, h. SCN 0 x 0782. a 9 c 2 fe 4 a, l. SCN 0 x 0782. a 9 c 2 fe 4 a, Thread 1, RBA… LOGSTDBY stmt: ALTER DATABASE BACKUP CONTROLFILE TO '/tmp/datatools/dtodump_… LOGSTDBY status: ORA-16226: DDL skipped due to lack of support LOGSTDBY id: XID 0 x 0009. 007. 00011453, h. SCN 0 x 0782. a 9 c 2 feb 4, l. SCN 0 x 0782. a 9 c 2 feb 4, Thread 1, RBA… Tue Oct 16 19: 30: 58 2007 ALTER TABLESPACE "SYSTEM" BEGIN BACKUP Completed: ALTER TABLESPACE "SYSTEM" BEGIN BACKUP Tue Oct 16 19: 30: 58 2007 ALTER TABLESPACE "SYSTEM" END BACKUP Completed: ALTER TABLESPACE "SYSTEM" END BACKUP. . . … … www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 44
Standby Catching Up Standby catches up at 2158 Tue Oct 16 21: 29: 19 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2157_1_629245032. arc Tue Oct 16 21: 30: 03 2007 LOGMINER: End mining logfile: /oraarch 01/BRHBETA/LOG_2157_1_629245032. arc Tue Oct 16 21: 35: 52 2007 Incremental checkpoint up to RBA [0 xa. f 41 b 7. 0], current log tail at RBA [0 xa. f 41 cc. 0] Tue Oct 16 21: 55: 56 2007 Incremental checkpoint up to RBA [0 xa. f 43 b 5. 0], current log tail at RBA [0 xa. f 43 b 5. 0] Tue Oct 16 22: 11: 16 2007 RFS[1]: No standby redo logfiles created RFS[1]: Archived Log: '/oraarch 01/BRHBETA/LOG_2158_1_629245032. arc' Tue Oct 16 22: 11: 16 2007 RFS Log. Miner: Registered logfile [/oraarch 01/BRHBETA/LOG_2158_1_629245032. arc] to Log. Miner session id [1] Tue Oct 16 22: 11: 16 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2158_1_629245032. arc Tue Oct 16 22: 11: 20 2007 LOGMINER: End mining logfile: /oraarch 01/BRHBETA/LOG_2158_1_629245032. arc www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 45
Archived Redo Logs Ÿ Logical Standby – – After applied to standby SQL apply process does delete them Ÿ Unlike physical standby Ÿ Possible disk space issues on standby – – How long will you need to store redo logs? If standby frozen all day Ÿ Weekends? Holidays? – If standby fails Ÿ How many days to fix failures? www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 46
Archived Redo Logs Ÿ How long are redo logs available on primary? – – – If not on disk when needed for standby Recover from backup Dataguard may not see these redo logs Ÿ Register redo logs Ÿ Logical standby – – Also generates its own archived redo logs Needed to recover standby db Ÿ Unique standby db objects? www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 47
SQL Apply Process Slow Ÿ Detect long running transaction Ÿ Compute estimate of time to complete Ÿ Identify and skip problem table www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 48
Long Running Transaction Ÿ Standby Alert Log – – SQL apply process applying redo log 2049 Doesn’t move on within a few minutes Current time is Tue Oct 16 08: 09: 55 2007 Shows start time for this redo log Mon Oct 15 05: 52: 29 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2049_1_629245032. arc – Has been processing for over 24 hours www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 49
Long Running Transaction Ÿ What is apply process doing? Ÿ Check redo logs waiting to be applied Ÿ Where is processing in current redo log? – How long to complete current redo log? www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 50
Long Running Transaction alter session set nls_date_format = 'DD Mon YYYY hh 24: mi: ss'; column first_change# format 9999999999 column next_change# format 9999999999 column resetlogs_change# format 9999999999 select * from dba_logstdby_log; BRHBETA> SELECT TYPE, STATUS, HIGH_SCN FROM V$LOGSTDBY; TYPE STATUS HIGH_SCN COORDINATOR ORA 16116: no work available 8257767540953 READER ORA 16127: stalled waiting for additional transactions to be applied 8257767541085 BUILDER ORA 16127: stalled waiting for additional transactions to be applied 8257767540965 www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 51
Long Running Transaction PREPARER ORA 16127: stalled waiting for additional transactions to be applied 8257767540965 ANALYZER ORA 16117: processing 8257767540953 APPLIER ORA 16116: no work available 8257767539467 APPLIER ORA 16116: no work available 8257767512259 APPLIER ORA 16113: applying change to table or sequence "BRH". "XXSUN_BRH_COMPS_INT" 8257767539247 APPLIER ORA 16116: no work available 8257767512262 9 rows selected. BRHBETA> www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 52
Long Running Transaction BRHBETA> select * from dba_logstdby_log; THREAD# RESETLOGS_CHANGE# RESETLOGS_ID SEQUENCE# FIRST_CHANGE# NEXT_CHANGE# ----------------------------------FILE_NAME TIMESTAMP DIC APPLIED 1 8257200902826 629245032 2048 8257767447753 8257767534297 12 -Oct-2007 22: 41: 05 12 -Oct-2007 23: 18: 23 /oraarch 01/BRHBETA/LOG_2048_1_629245032. arc 12 -Oct-2007 22: 19: 08 NO NO CURRENT 1 8257200902826 629245032 2049 8257767534297 8257767754044 12 -Oct-2007 23: 18: 23 13 -Oct-2007 00: 11: 05 /oraarch 01/BRHBETA/LOG_2049_1_629245032. arc 12 -Oct-2007 23: 12: 18 NO NO CURRENT 1 8257200902826 629245032 2050 8257767754044 8257767922751 13 -Oct-2007 00: 11: 05 13 -Oct-2007 01: 11: 05 /oraarch 01/BRHBETA/LOG_2050_1_629245032. arc 13 -Oct-2007 00: 11: 06 NO NO NO. . 1 8257200902826 629245032 2140 8257781397314 8257781562968 16 -Oct-2007 07: 41: 15 16 -Oct-2007 08: 41: 15 /oraarch 01/BRHBETA/LOG_2140_1_629245032. arc 16 -Oct-2007 07: 41: 16 NO NO NO 93 rows selected. www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 53
Long Running Transaction Ÿ redo log 2049 goes from – SCN 8257767534297 to SCN 8257767754044 Ÿ Check again Tue Oct 16 15: 04: 29 MST 2007 APPLIER ORA-16113: applying change to table or sequence "BRH". "XXSUN_BRH_COMPS_INT" 8257767540857 Ÿ Compute Estimate – – – Tue Oct 16 08: 09: 55 Tue Oct 16 11: 17: 39 APPLIER has moved from 39247 to 39991 3 hours > roughly 750 SCNs, 250 per hour it still needs to go from 539991 to 754044 Ÿ over 200, 000 SCNs at 250 per hour, this would take 800 hours > 33 days www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 54
Long Running Transaction Ÿ Check again Tue Oct 16 15: 04: 29 MST 2007 APPLIER ORA-16113: applying change to table or sequence "BRH". "XXSUN_BRH_COMPS_INT" 8257767540857 Ÿ APPLIER has moved – – – 39991 to 40857 in the last 4 hours, 866 SCNs, roughly in line with 250/hr we computed earlier www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 55
Long Running Transaction Ÿ This is truly awful science! – Assumes all SCNs take same amount of time Ÿ If processing takes more than a few minutes – – – Compute estimate Confirm that it will take a long time Compare with business requirements for standby Ÿ Must be in synch once per day – Decide to skip table Ÿ If table required, must wait or full refresh www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 56
Long Running Transaction Ÿ Skip table ALTER DATABASE STOP LOGICAL STANDBY APPLY; EXECUTE DBMS_LOGSTDBY. SKIP (stmt => 'DML' , schema_name => 'BRH' , object_name => 'XXSUN_BRH_COMPS_INT', proc_name => null); ALTER DATABASE START LOGICAL STANDBY APPLY; – – SQL Apply Process restarts with redo 2048 Standby catches up quickly www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 57
Primary/Standby Interactions Ÿ Logical standby backup starts – Tablespaces put into backup mode Ÿ Apply process applies redo logs from primary – – Contain transactions for primary backup Tries to put tablespaces into backup mode Ÿ Apply process fails – – Wait for standby backup to finish Restart apply process Ÿ Disable standby backups when catching up – Apply process runs longer than normal www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 58
Unique Constraint Violation Ÿ Oracle calls this – Oscillating updates Ÿ Oracle docs explain this (I can’t…) – Or primary update really did fail Ÿ And was rolled back on primary db Ÿ Fails and rolls back in standby db Ÿ SQL apply process restarts – – Automatically No need to do anything www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 59
Unique Constraint Violation Tue Oct 16 21: 23: 42 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2147_1_629245032. arc. . Tue Oct 16 21: 24: 31 2007 LOGSTDBY stmt: insert into "APPLSYS". "WF_LOCAL_ROLES" values “COL 1" = 'Value 1', “COL 2" = 'Value 2', “COL 3" IS NULL, … LOGSTDBY status: ORA-00001: unique constraint (APPLSYS. WF_LOCAL_ROLES_U 1) violated LOGSTDBY id: XID 0 x 0009. 016. 00011548, h. SCN 0 x 0782. aa 32 b 533, l. SCN 0 x 0782. aa 32 b 533, Thread 1, RBA… Tue Oct 16 21: 25: 20 2007 LOGMINER: End mining logfile: /oraarch 01/BRHBETA/LOG_2147_1_629245032. arc SQL Apply Process continues processing www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 60
No Data Found Ÿ What does it mean? – When Data. Guard updates standby Ÿ Brings update from primary Ÿ Brings pre update data from primary – On standby, Data. Guard compares Ÿ Pre update data from primary Ÿ Current data on standby – If they don’t agree Ÿ Data. Guard won’t apply the update on standby Ÿ SQL apply process fails www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 61
No Data Found Wed Sep 19 12: 09: 23 2007 LOGSTDBY stmt: update "PO". "PO_LINE_LOCATIONS_ALL". . . SQL, values. . . LOGSTDBY status: ORA 01403: no data found LOGSTDBY id: XID 0 x 0008. 01 e. 0000 c 437, h. SCN 0 x 0789. eacde 6 c 1, l. SCN 0 x 0789. eacde 6 c 1… LOGSTDBY Apply process P 007 pid=29 OS id=3447 stopped Wed Sep 19 12: 09: 23 2007 Errors in file /shared/orahome 01/admin/BRHPRSB/bdump/brhprsb_lsp 0_12386. trc: ORA 12801: error signaled in parallel query server P 004 ORA 01403: no data found LOGSTDBY Analyzer process P 003 pid=24 OS id=3439 stopped LOGSTDBY Apply process P 006 pid=27 OS id=3445 stopped LOGSTDBY Apply process P 005 pid=26 OS id=3443 stopped www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 62
No Data Found Ÿ What happened? – – For some reason Table data not the same primary vs standby Ÿ How could this happen? – Logical standby is read write Ÿ SYS can change anything at any time – SYS left guard status at NONE Ÿ Other db users can make changes in standby Ÿ How to fix? – – Skip table Refresh logical standby www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 63
No Data Found Ÿ Logical Standby – – – No way to find out what happened No utility to verify primary, standby in synch Differences can exist for a long time Ÿ Won’t cause error until table updated on primary Ÿ Logical Standby for reporting? – – – Can you depend on this for your reports? How do you know what is in the standby? What has been skipped? www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 64
Primary Schema Issues Ÿ Primary db – XDB schema reinstalled Ÿ Create java class (loads java class from filesystem) Ÿ Standby db – – Transactions came through to standby Standby doesn’t have java class files Apply process fails Identify and skip transaction(s) www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 65
ORA-07445 Errors Ÿ SR opened – Results Ÿ Known bug fixed in 11 g Ÿ Apply patch on standby – Impact Ÿ None, no affect on standby – Apply patch? Ÿ No – refresh would wipe out patch Ÿ Don’t want to patch primary db Primary doesn’t have this error www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 66
ORA-07445 Errors SQL Apply Process stops Tue Oct 16 21: 27: 50 2007 Errors in file /orahome 01/admin/BRHBETA/bdump/brhbeta_p 004_6577. trc: ORA-07445: exception encountered: core dump [krvsmso()+1212] [SIGSEGV] [Address not mapped to object]… Tue Oct 16 21: 29: 06 2007 Errors in file /orahome 01/admin/BRHBETA/bdump/brhbeta_lsp 0_5041. trc: ORA-12805: parallel query server died unexpectedly Tue Oct 16 21: 29: 06 2007 Logical Standby is not for the faint of heart! TLCR process death detected. Shutting down TLCR logminer process death detected, exiting logical standby LOGSTDBY Analyzer process P 003 pid=13 OS id=10051 stopped LOGSTDBY Apply process P 005 pid=15 OS id=10060 stopped LOGSTDBY Apply process P 006 pid=42 OS id=10062 stopped LOGSTDBY Apply process P 007 pid=17 OS id=10064 stopped Tue Oct 16 21: 29: 06 2007 LOGSTDBY status: ORA-16222: automatic Logical Standby retry of last action LOGSTDBY status: ORA-16111: log mining and apply setting up SQL Apply Process automatically restarts www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 67
ORA-07445 Errors Tue Oct 16 21: 29: 07 2007 LOGMINER: Parameters summary for session# = 1 LOGMINER: Number of processes = 3, Transaction Chunk Size = 201 LOGMINER: Memory Size = 30 M, Checkpoint interval = 150 M LOGMINER: session# = 1, builder process P 001 started with pid=7 OS id=10018 LOGMINER: session# = 1, reader process P 000 started with pid=34 OS id=10014 LOGMINER: session# = 1, preparer process P 002 started with pid=36 OS id=10020 Tue Oct 16 21: 29: 10 2007 LOGMINER: Begin mining logfile: /oraarch 01/BRHBETA/LOG_2147_1_629245032. arc Tue Oct 16 21: 29: 10 2007 SQL Apply Process continues processing LOGMINER: Turning ON Log Auto Delete LOGSTDBY Analyzer process P 003 started with pid=13 OS id=10051 LOGSTDBY Apply process P 006 started with pid=42 OS id=10062 LOGSTDBY Apply process P 004 started with pid=30 OS id=10219 LOGSTDBY Apply process P 005 started with pid=15 OS id=10060 LOGSTDBY Apply process P 007 started with pid=17 OS id=10064 www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 68
Refresh Process Ÿ Export unique standby db objects – Scripts to recreate Ÿ Backup primary db – Create standby control file Ÿ Recover primary db backup on standby – Use standby control file Ÿ Create physical standby Ÿ Convert to logical standby Ÿ Import unique standby db objects – Recreate with scripts www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 69
Unsupported Record Ÿ ORA 16211 – SQL apply process fails Ÿ Must skip table or refresh standby Ÿ Oracle SR tells me to – – Add all column supplemental log group to table Rebuild standby Ÿ Or reinstantiate the table – Needed for each table Ÿ Not an easy process www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 70
Unsupported Record Thu Oct 11 10: 11: 58 2007 LOGMINER: Log Auto Delete - deleting: /oraarch 01/BRHBETA/LOG_2005_1_629245032. arc Deleted file /oraarch 01/BRHBETA/LOG_2005_1_629245032. arc Thu Oct 11 10: 15: 55 2007 ** LOGMINER WARNING - Invalidated 4 LCRs ** Thu Oct 11 10: 29 2007 LOGSTDBY stmt: "BRH". "XXSUN_INV_ITEMS_INT": unsupported LOGSTDBY status: ORA-16211: unsupported record found in the archived redo log ORA-06512: at "SYS. DBMS_INTERNAL_LOGSTDBY", line 4717 ORA-06512: at line 1 LOGSTDBY id: XID 0 x 0009. 02 e. 0001127 d, h. SCN 0 x 0782. a 9016545, l. SCN 0 x 0782. a 9016545, Thread 1… LOGSTDBY Apply process P 007 pid=23 OS id=16578 stopped Thu Oct 11 10: 29 2007 Errors in file /orahome 01/admin/BRHBETA/bdump/brhbeta_lsp 0_13625. trc: ORA-12801: error signaled in parallel query server P 007 ORA-16211: unsupported record found in the archived redo log LOGSTDBY Analyzer process P 003 pid=19 OS id=16570 stopped LOGSTDBY Apply process P 005 pid=21 OS id=16574 stopped LOGSTDBY Apply process P 006 pid=36 OS id=16576 stopped LOGSTDBY Apply process P 004 pid=34 OS id=16572 stopped www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 71
Unsupported Record Ÿ What causes this? – – Metalink 304061. 1 Possible causes Ÿ Direct path insert on partitioned table Ÿ Table has 500 columns Ÿ Is this a standby? – – At any time this error may happen How to predict/prevent? www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 72
Compile Invalid Objects Ÿ In Logical Standby – – – Execute utlrp. sql 2 hours go by Not much changed Ÿ Disable Guard for session – – – Alter session disable guard Ÿ alter database guard standby; Recompile runs in 2 minutes Alter session enable guard Ÿ When normal things don’t work – Perhaps guard enabled is the problem Ÿ Guard level is the problem (all vs standby) www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 73
Import Into Logical Standby Ÿ For recompile we used – Alter session disable guard; Ÿ Refresh Logical Standby – – Unique db objects exported before refresh Must be imported after refresh Ÿ Import doesn’t use SQL*Plus session – Alter database guard standby; www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 74
Conclusion Ÿ Logical standby – Lots of errors Ÿ Many require refreshing standby Ÿ Lots of DBA support needed – For all of this support Ÿ What do you have? Ÿ Do you know what is in the standby Reporting? www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 75
Conclusion Ÿ Physical standby – – Is solid, dependable No issues Ÿ Logical standby – – – Is it really a standby? Is it ready for failover? Is it providing complete data for reports? Lots of issues Is it worth the effort/risk? www. brianhitchcock. net Brian Hitchcock October 23, 2007 Page 76
- Slides: 76