DATABASE CORRUPTION By Stephen Muliokela ABOUT ME Name
DATABASE CORRUPTION By Stephen Muliokela
ABOUT ME Name; Stephen A Muliokela Work; American Credit Acceptance
DATABASE CORRUPTION. NOW WHAT? What to do when the database is corrupt. 1. Don’t panic. 2. Don’t detach the database. 3. Don’t restart SQL Server. 4. Don’t just run repair. 5. Run an integrity check. 6. Afterwards do a root cause analysis.
WHY DATABASE CORRUPTION OCCURS • Almost always the I/O subsystem • I/O subsystem means anything underneath SQL Server • Windows operating system • File system filter driver • E. g. antivirus, defrager, encryption • Network cards, switches, cables • SAN controllers, RAID controllers • Disk • Other; • High temperature, vibrations, power cycling not ideal • Could be version related/SQL Server bugs(Service pack missing) • Memory corruption; bad memory chips(data held will be corrupt). • Human Error • Manually edit the database e. g. sys tables. • Manually deleting the transaction log.
POSSIBLE REPAIR OPTIONS Restore from a clean backup. Restore corrupted page from a valid backup. Export data to different table/(s) or database. Use DBCC Repair options/ variation of repair. Run Chkdsk to repair any disk problems. Contact your hardware vendor(write caching should not be enabled) Reformat the disk drives. Reinstall operating system (gets rid of any corrupted. dll or. exe program).
DETECTING PAGE CORRUPTION CHECKSUM TORN PAGE § Checksum formula is applied to § Established by writing a 2 -bit the 8 -KB data page § The page header checksum value is updated –page header status set to CHECKSUM § Page is then written to stable media § Checksum examines all bits on the page when read § Page Header Status = CHECKSUM value in the lowest order 2 bits of each 512 -byte sector of the page (16 X 512 ) § Page header is updated with torn bit tracking information § Page Header Status = TORN
DETECTING PAGE CORRUPTION In-memory CHECKSUM § SQL Server extends protection of data pages by extending the PAGE_VERIFY CHECKSUM to allow for in-memory checksumming § Pages written to stable media with a CHECKSUM status are eligible for in memory checksumming – if the dynamic trace flag –T 831 is enabled. § The page must have received the initial checksum value during write to stable media to participate in the in-memory checksum. § To reduce performance affect, in memory checksum is only audit when – Page state (Dirty, Clean)
CHECKSUM § Checksum – is an error detection scheme that uses a formulated numeric value to represent grouping of bits § Example –Taking a 2 -byte string AB and apply formula § Assume that the string was stored on disk but when read back it was “AC”-(Value = Position +ASCII value) Character POS ASCII Value Formula Value A 1 65 67 B 2 66 68 Checksum 135 Character POS ASCII Value Formula Value A 1 65 67 C 2 67 69 Checksum 139
DBCC CHECKDB ISSUE “MSG 823 IN SQL SERVER” Page Read Resolution § Checksum is validated as soon as the read finishes § Check the suspect_pages in msdb Page Modification Request § Check consistency of db – DBCC CHECKDB § Dirty page: As soon as page has been dirtied, § Turn on PAGE_VERIFY CHECKSUM on § Clean Page: § Transition from clean to dirty triggers in-memory checksum validation § A failure during this transition indicates that the page was damaged during a period in which it was considered to be read-only § Review Windows Evet logs review errors reports- the checksum no longer maintained Discard § A page is termed ‘discarded’ when it is returned to the free list § Dirty page: cannot be discarded. Page must be written to stable media before it can be discarded § Clean Page: act of discarding a clean page triggers in memory checksum validation. databases-provides best option for verifying pages are they are written to disk e. g 824 errors and errors like “The driver detected a controller error on “DevceHarddisk 4DR 4” – evaluate if file is present § Use SQLIOsim utility to find if these 823 errors can be reproduced outside of regular SQL Server I/O requests § Work with you hardware vendor or device manufactore provided with any diagnostic utilities § Evaluate filter Drivers in the path of these I/O § Check if any updates to these filter drivers § Check if filter drivers are disabled -f problem can go away
DBCC CHECKDB SYNTAX
DBCC CHECKDB() SPECIAL CONSIDERATIONS • DBCC CHECKDB does not examine disabled indexes. • In versions before SQL Server 2005 SP 2, DBCC cleared the plan cache which would cause recompilations and cause the system to be slow. • DBCC CHECKDB obtains database level exclusive lock to prevent changes. • DBCC waits for 20 seconds or the configured timeout value for a database lock to be released then gives off this error-TABLOCK option (non snapshot ) • Exclusive locks are dropped when allocation units checks are completed at the database level-
DBCC CHECKDB() SPECIAL CONSIDERATIONS • DBCC CHECKDB operates on an internal read only database (database snapshot); to prevent blocking and concurrency issues. • If the snapshot cannot be created, DBCC runs WITH TABLOCK option. • An internal database snapshot cannot be created with DBCC in the following cases; 1. Against the database or instance of SQL Server is running in single-user mode 2. Against a ready-only database. 3. Against a database that is placed in EMERGENCY mode 4. Against tempdb. 5. WITH TABLOCK option is explicitly specified. 6. Database is created on a non-NTFS file system(snapshot creation depends on NTFS sparse – file technology) 7. DBCC commands defaults to a table lock instead of a database snapshot when run it against a read only FILEGROUP.
DBCC CHECKDB() SPECIAL CONSIDERATIONS • DBCC CHECKDB will run in parallel on Enterprise edition • You can limit parallelism using; • the trace flag 2528( increases run time but reduces resources used). • up to limit imposed by sp_configure MAXDOP or Resource governor workload group MAX_DOP setting • Ported for 2008, 2008 R 2, 2014, 2016 trace flags to improve DBCC processing Ported with the latest service packs. • Trace flag 2562; runs DBCC CHECKDB in single batch minimizes impact on tempdb and contention on DBCC_MULTIOBJECT_SCANNER latch. • May want to grow tempdb by 5%. • Indexes on computed columns slow down DBCC CHECKDB - bottleneck occurs on the DBCC_OBJECT_METADATA latch.
RUNNING A REPAIR • DBCC check DB will tell you which repair option will resolve the issue. • You can put a transaction around your repair operation and roll it back if you like what the repair suggestion. REPAIR_ALLOW_DATA_LOSS doesn’t account for (shouldn’t be the first option) • Foreign-Key constraints • Inherent business logic and data relations • Replication
DATABASE CORRUPTION; FATAL ERROR • Not all errors can be fixed using Repair_Allow_Data_Loss option repair option; Indications of fatal error • You have experienced an Objectid 99 (Non existent object) • Boot page corruption page(1: 9). • Other errors e. g. 7985 or check terminated error • When you have file header errors • PFS page, Corrupt metadata, Critical system table corruption 2540, 7991 • No minimum repair option is given by DBCC
AUTOMATIC PAGE REPAIR; DATABASE CORRUPTION Automatic page repair is supported by • Database Mirroring and Always. On Avaliability Group • Applies to errors 823, 824 and 829 Page types that cannot be repaired • Page 9, File header page • Allocation pages ; GAM, SGAM and PFS pages View Automatic Page-Repair Attempts • Always. On Availability Group; sys. dm_hadr_auto_page_repair • Database Mirroring sys. dm_mirroring_auto_page_repair
AUTOMATIC PAGE REPAIR; DATABASE CORRUPTION Handling I/O Errors on the Principal/Primary Database Handling I/O Errors on the Mirror/Secondary Database § Page and error recorded in suspect_pages when encountered § Request sent to secondary server § Page received –Page marked as restore pending (error 829) § After fix the page in suspect_pages is marked (event=5) suspect_pages when encountered – State changes to suspended § Request sent to the Primary server § Page received § After fix the page in suspect_pages is marked(event=4)
AUTOMATIC PAGE REPAIR; DATABASE CORRUPTION
SUSPECT_PAGES Automatic page repair is supported by • Suspect_pages table – contains rows up to a limit of 1000 rows • Repaired(event 5 –primary) Restored (event 4 –Secondary) • DBCC – event 5 (repaired) or event 7 (Deallocated by DBCC)
CORRUPTION USING SQL SERVER SERVICE PACKS. • Data corruption occurs in clustered index when you run an online index rebuild when concurrent queries are running against the same data. 1. Versions affected; SQL Server 2012 and SQL Server 2014 2. Latest service pack for SQL Server 2012 and SQL Server 2014 fixes this issue.
UNDOCUMENTED DBCC COMMANDS • DBCC IND • DBCC PAGE • DBCC DBINFO • DBCC DBTABLE • DBCC WRITEPAGE
DEMO; DBCC DBINFO
DEMO; DBCC DBINFO CONTINUED
EXAMPLES OF DATABASE CORRUPTION 1. Corruption only in the nonclustered indexes • • Disable the associated object nonclustered indexes Drop the disabled indexes Recreate the disabled indexes Other errors pointing to index corruption; 8915, 8952, 8956 2. Error 825 Read retry(Msg 823 or 824) • Read re-try up to four times before failing read/write • No action needed- Caution check I/O subsystem.
EXAMPLES OF DATABASE CORRUPTION CONTINUED 3. Memory corruption(error 832) • SQL Server 2012 on windows 2012 • Re-reads corrupted page into memory from disk • Cannot repair dirty page-Different from page in memory 4. Data Purity Errors(Error 2570) • e. g. out of range data exists in column
EXAMPLES OF DATABASE CORRUPTION CONTINUED 5. Error 605 • • Damaged page chain, corrupt Index Allocation MAP(IAM) Invalid entry in sys. objects for the object 605 could be transient 6. Deleted transaction log
Symptoms Resolution § Running DBCC CHECKDB gives Error 823 state: 2 § Check the suspect_pages in msdb § Check consistency of db – DBCC CHECKDB 2017 -03 -06 22: 41: 19. 55 spid 58 Error: 823, Severity: 24, State: 2. 2010 -03 -06 22: 41: 19. 55 spid 58 The operating system returned error 38(Reached the end of the file. ) to SQL Server during a read at offset 0 x 000000 a 72 c 0000 in file 'C: Program FilesMicrosoft SQL ServerMSSQL 10. SQL 2008MSSQLDATAmy_db. mdf'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online. Cause § One of the API’s used by SQL Server (E. g. Read. File, Write. File, Read. File. Scatter, Write. File. Gather for I/O operations experienced an error – Error 823 is issued § Sometimes the Windows API calls can succeed but the data transferred by the IO operation might have encountered a logical consistency problem – Error reports via Error 824 § 824 usually an indication of hardware errors § Turn on PAGE_VERIFY CHECKSUM on databases-provides best option for verifying pages are they are written to disk § Review Windows Evet logs review errors reports- e. g 824 errors and errors like “The driver detected a controller error on “DevceHarddisk 4DR 4” – evaluate if file is present § Use SQLIOsim utility to find if these 823 errors can be reproduced outside of regular SQL Server I/O requests § Work with you hardware vendor or device manufactore provided with any diagnostic utilities § Evaluate filter Drivers in the path of these I/O § Check if any updates to these filter drivers § Check if filter drivers are disabled -f problem can go away
Symptoms Resolution § The Write. File API call return as successful but an § Check the suspect_pages in msdb immediate, successful read of the same data block returns older data – Including data stored in the hardware cache § Usually happens because of read cache problem § You can use T 818 (DBCC TRACEON(818, -1) to diagnose the issue How it works § T 818 enables an in memory ring buffer that is used for tracking the last 2048 successful write operations that are performed by computer running SQL Server § When errors such as Error 605, 823 or 3448 occur, the incoming buffer’s log sequence number(LSN) value is compared to the recent write list § Check consistency of db – DBCC CHECKDB § Turn on PAGE_VERIFY CHECKSUM on databases-provides best option for verifying pages are they are written to disk § Review Windows Evet logs review errors reports- e. g 824 errors and errors like “The driver detected a controller error on “DevceHarddisk 4DR 4” – evaluate if file is present § Use SQLIOsim utility to find if these 823 errors can be reproduced outside of regular SQL Server I/O requests § Work with you hardware vendor or device manufactore provided with any diagnostic utilities § Evaluate filter Drivers in the path of these I/O § Check if any updates to these filter drivers § Check if filter drivers are disabled -f problem can go away
Symptoms Resolution § Running DBCC CHECKDB gives § Check the suspect_pages in msdb Error 824 state: 2 § Check consistency of db – DBCC CHECKDB § Turn on PAGE_VERIFY CHECKSUM on 2017 -11 -02 15: 46: 42. 90 spid 51 Error: 824, Severity: 24, State: 2. 2017 -11 -02 15: 46: 42. 90 spid 51 SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1: 43686; actual 0: 0). It occurred during a read of page (1: 43686) in database ID 23 at offset 0 x 0000001554 c 000 in file 'H: MSSQL. SQL 2008MSSQLDATAmy_db. mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online. databases-provides best option for verifying pages are they are written to disk § Review Windows Evet logs review errors reports- e. g 824 errors and errors like “The driver detected a controller error on “DevceHarddisk 4DR 4” – evaluate if file is present § Use SQLIOsim utility to find if these 824 errors can be reproduced outside of regular SQL Server I/O requests § Work with you hardware vendor or device manufactore provided with any diagnostic utilities § Evaluate filter Drivers in the path of these I/O § Check if any updates to these filter drivers One of the API’s used by SQL Server (E. g. Read. File, Write. File, § Check if filter drivers are disabled -f problem can Read. File. Scatter, Write. File. Gather for I/O operations experienced an error – Error 823 is issued go away Cause § § Sometimes the Windows API calls can succeed but the data transferred by the IO operation might have encountered a logical consistency problem – Error reports via Error 824 § 824 usually an indication of hardware errors
Error Type Checksum Torn Page Short Transfer Description • The read resulted in a checksum failure. • The checksum stored on the data page does not match the checksum as calculated after the read operation. • Data on the page has been damaged and will require a restore to correct it. • Extended Data: “incorrect checksum (expected: ##; actual: ##)” Contact your hardware manufacture for assistance. • The read resulted in a torn bits failure. The torn bits stored in the data page header do not match the torn bits stored in the individual sectors following the read operation. • Data on the page has been damaged and will require a restore to correct it. Extended Data: “torn page (expected signature: ##; actual signature: ##)” Contact your hardware manufacture for assistance. • The requested number of bytes were not read. • For example, if the read request was for 8 KB but the returned data was only 4 KB, the condition is flagged as a short transfer error. • This indicates that the file is damaged or the I/O subsystem has a severe problem transferring data to and from media. Extended Data: “insufficient bytes transferred”
Error Type Bad Page Id Stale Read Page Audit Failure Description • The page header does not contain the correct value for the expected page ID member. • The expected page ID can be calculated using the following formula: (page id = physical offset in file / 8192 bytes). • When the expected page is not returned, the bad page ID error is indicated. Extended Data: “incorrect pageid (expected ##: ##; actual ##: ##)” • This is frequently a condition where the I/O subsystem returns the incorrect data during the read. Contact your hardware manufacture for assistance • If a page has been recently written to disk and is still stored in the stale read hash table, the Log Sequence Number (LSN) stored in the hash table is compared to the LSN in the page header. • If they do not match then the page is flagged as incorrect. • Example message: “stale page (a page read returned a log sequence number (LSN) (##: ##) that is older than the last one that was written (##: ##))” • When trace flag –T 806 is enabled, a DBCC audit is performed on the page to test for logical consistency problems. If the audit fails, the read is considered to have experienced an error. Extended Data: “audit failure (a page read from disk failed to pass basic integrity checks)”
Symptoms Resolution § Running DBCC CHECKDB gives Error 832 The to find the cause of the error, consider these options: state: 1 § Error detected by background task often § Run any normal hardware or system checks to determine if a memory, CPU, or other hardware related problem exists. Ensure all system drivers, Operating system updates, and hardware updates have been applied to our system. § More Information § Only pages that are not currently modified in cache (or "dirty") are checked § Furthermore, the page was read in "clean" from disk because it has a checksum value on the page and has not encountered a checksum failure (Msg 824) § However, the page could be modified at some point after this error and then written to disk with the incorrect modification § If this occurs, a new checksum is calculated based on all modifications before it is written to disk. § Therefore, the page could be damaged on disk but a subsequent read from disk may not trigger a checksum failure. It is important to run DBCC CHECKDB on any database that is referenced by this error. § It is possible that even DBCC CHECKDB will not report an error for a page in this state after being written to disk. § This is because the incorrect modification could be at locations on the page that don't hold any data, nor contain any important page or row structure information, or could be modifications to data that CHECKDB cannot detect. referred to as Lazy. Writer- error is not returned to a client application § Error written to Windows Application Event Log as Event. ID=832 Error: 832, Severity: 24, State: 1. A page that should have been constant has changed (expected checksum: <expected value>, actual checksum: <actual value>, database <dbid>, file '<filename>', page <pageno>). This usually indicates a memory failure or other hardware or OS corruption. Cause § An external condition has caused a database page to be modified outside normal SQL Server engine code used to change database pages. § A thread running in the SQL Server process that incorrectly writes on a database page. This is often referred to as “scribbler” § A hardware or operating system problem where the memory backing the database page is incorrect modified or damaged
Lazy Writer How the Lazy writer works § The Lazy writer performs a clean(constant) buffer checksum validations § Does not require –T 831 to be enabled § Every second the lazy writer updates the buffer counter and performs various housekeeping activities § During the housekeeping, the lazy writer sweeps 16 buffers § When the lazy writer finds a clean buffer with a valid checksum, it validates the checksum § If an error is detected an 832 error message is logged Background in-memory checksum audit activity)
Symptoms Resolution § When I/O requests take longer than 15 seconds, § Resolution of this problem could involve updating Microsoft SQL Server reports "Msg 833" to the event log, and the following text is logged in the ERRORLOG file: § Other symptoms may occur, such as the following: § High wait times for PAGEIOLATCH waits § Errors in the System event log § Indications of disk latency problems from performance monitor counters such as disk secs/transfer § Cause § This problem is caused by some issue in the operating system, by device drivers, by firmware, or by hardware that is used to support the devices to store SQL Server database and/or transaction log files. § SQL Server records the time that it initiated an I/O request and records the time that the I/O was completed. § If that difference is 15 seconds or more, this condition is detected. § This also means that SQL Server is not the cause of a delayed I/O condition that this message describes and reports. all device drivers and firmware or performing other diagnostics that are associated with your disk system.
Symptoms Resolution § Running DBCC CHECKDB gives Error 825 § Since retry was successful- no § When Error 823 or 824 is encountered – § However the retry operation indication state: 2 SQL Server will retry the operation up to four times before failing the read or write operation § If the operation is successful after one of the retry attempt, query that required the read or write will not fail but you will see the rror below in the Error. Log § An information level message will be written into Windows Application Event Log with Event. ID =825 2010 -03 -31 21: 06: 54. 34 spid 52 A read of the file 'C: Program FilesMicrosoft SQL ServerMSSQL 10. SQL 2008MSSQLDATAmydb. mdf' at offset 0 x 0000132000 succeeded after failing 2 time(s) with error: incorrect checksum (expected: 0 xffd 99027; actual: 0 x 331 d 106 f). Additional messages in the SQL Server error log and system event log may provide more detail. This error condition threatens database integrity and must be corrected. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online. Cause § Cause the same as 823 or 824 errors immediate action is needed an underlying problems that should be investigated § Use same investigation techniques for MSG 823 or 824 § Was introduced in SQL Server 2005 because Micrrosoft identified that some read retries succeeded after they raised 823, 824 errors
Symptoms Cause § Running DBCC CHECKDB Fails –after § Occurs when DBCC CHECKDB § After you restore a database using CONTINUE_AFTER_ERROR § Restore from a 2008 or 2005 backup § You receive an error after error prevent you from restoring Msg 8967, Level 16, State 216, Server <server name>, Line 2 An internal error occurred in DBCC which prevented further processing. Please contact Customer Support. DBCC results for '<database name>'. Msg 8921, Level 16, State 1, Server <server name>, Line 1 Check terminated. A failure was detected while collecting facts. Possibly tempdb out of space or a system table is inconsistent. Check previous errors. Error Log 2007 -05 -26 07: 13: 49. 21 spid 58 DBCC encountered a page with an LSN greater than the current end of log LSN (<LSN>) for its internal database snapshot. Could not read page (file id: page id), database '<database name' (database ID database id>), LSN = (<LSN>), type = 32, is. In. Sparse. File = 1. Please re-run this DBCC command cannot perform the necessary checks to confirm consistency § State 216 – indicates that DBCC CHECKDB common read a page from internal snapshot that has larger log sequence number(LSN) than the end of log LSN Workaround § Use TABLOCK hint with the DBCC CHECKDB command. DBCC completes without any errors
DBCC CHECKDB REPORTS A 2576 ERROR AFTER YOU DROP A BLOB COLUMN IN SQL SERVER 2008 OR IN SQL SERVER 2008 R 2 Symptoms Cause § Running DBCC CHECKDB returns an error when run § The sequence of actions that is described in against a Blob table-text, ntext or image data type columns – after you perform following actions § Enable the table for File. Stream storage § Add a filestream column to the table (for example, [column name] [varbinary](max) FILESTREAM NULL) § Move the data from the existing binary large object column to the new filestream column § Drop the existing binary large object column § Rebuild the indexes on the table Msg 2576, Level 16, State 1, Line 1 The Index Allocation Map (IAM) page (0: 0) is pointed to by the previous pointer of IAM page (1: 158) in object ID 0, index ID -1, partition ID 0, alloc unit ID 72057594039828480 (type Unknown), but it was not detected in the scan. the "Symptoms" section leaves the first IAM page of the allocation unit for the dropped binary large object column in an orphaned state. § This condition generates the 2576 error message. Resolution § After you have verified that the 2576 error is the only error that is reported by DBCC CHECKDB, you can use the DBCC CHECKDB command together with the REPAIR_ALLOW_DATA_LOSS option to remove the orphaned IAM page. § This repair process deallocates the orphaned IAM page and generates the following message: § Repair: The page (1: 158) has been deallocated from object ID 0, index ID -1, partition ID 0, alloc unit ID 72057594039828480 (type Unknown).
Symptoms Cause § Running DBCC CHECKDB takes a long § You have a lot of partitions in § In addition high CPU usage occurs § When you run DBCC CHECKDB time to complete – in SQL Server 2012 or 2014. Microsoft SQL Server 2012 and SQL Server 2014. against the database, it takes much longer to complete than in earlier versions of SQL Server. Resolution Run the following command to check the number of partitions per object/index select COUNT(*) from system_internals_partitions GROUP BY object_id, index_id
EXAMPLES OF DATABASE CORRUPTION CONTINUED 7. B-Tree Errors • Error 8931; Level mismatch • Error 8934; Last Key in C 1!<lowest Key in C 2 • Error 8935; B 2 thinks C 4 is C 3’s left sibling • Error 8937; B 1 and B 2 both point to page C 2
TRANSACTION LOG CORRUPTION
DEMO; TRANSACTION LOG CORRUPTION; WITH CLEAN SHUTDOWN • Create the database; with one log file • Check the number of virtual Log Files
CORRUPTION WITH CLEAN SHUTDOWN CONTINUED • Detach the created database • Delete the log file
DEMO; TRANSACTION LOG CORRUPTION • Reattach the database using the FOR ATTACH option • Check the virtual Log files
DEMO; TRANSACTION LOG CORRUPTION • Create database with multiple log files
DEMO; TRANSACTION LOG CORRUPTION • Check the number of log files for the newly created database • Detach the database
DEMO; TRANSACTION LOG CORRUPTION • Try to re-attach the database using the FOR ATTACH option • Re attach the database using the FOR ATTACH_REBUILD_LOG
DEMO; TRANSACTION LOG CORRUPTION • Check the number of virtual log files created
CORRUPTION WITHOUT A CLEAN SHUTDOWN • Create the database
CORRUPTION WITHOUT A CLEAN SHUTDOWN • Create the Department table and insert records into the table
DEMO; TRANSACTION LOG CORRUPTION • Begin a transaction and update record. Force the updated page to be pushed to disk • In a separate window issue a shutdown
DEMO; TRANSACTION LOG CORRUPTION • Open up a Hex editor, open up the Log file in the Hex editor
DEMO; TRANSACTION LOG CORRUPTION • Restart SQL Server after corrupting the log • Check the status of the database who log is now corruption; SUSPECT
DEMO; TRANSACTION LOG CORRUPTION • Try to get into the database • Take the database into EMERGENCY mode
DEMO; TRANSACTION LOG CORRUPTION • Try to get into the database; SUCCESS • Check if transaction was rollback – Emergency mode(database still corrupted)
DEMO; TRANSACTION LOG CORRUPTION • After the fix the database is brought back ONLINE • Transaction that was not committed is rolled back; We lose the changes.
DEMO; TRANSACTION LOG CORRUPTION; UNREPAIRABLE ERROR • Create the database
DEMO; TRANSACTION LOG CORRUPTION; UNREPAIRABLE ERROR • Create the Department table and insert records into the table
DEMO; TRANSACTION LOG CORRUPTION; UNREPAIRABLE ERROR • We corrupt the PFS page • Check if transaction was rollback – Emergency mode(database still corrupted)
CORRUPTION; UNREPAIRABLE ERROR • Check the status of the database • Try to get into the database
DEMO; TRANSACTION LOG CORRUPTION; UNREPAIRABLE ERROR • Change the database to Emergency • Try to get into the database
CORRUPTION; UNREPAIRABLE ERROR • Change the database to Emergency
MY CUSTOM SOLUTION TO ERROR 2570
CONTINUED
MONITORING FOR I/O ERROR • Manual monitoring time consuming • Automated monitoring • SQL Agent alerts –severity 19 and above • Third party monitoring tools
VERIFY BACKUPS • Requires WITH CHECKSUM option when performing backups. • Verifies pages as they are read with checksum calculation-verify that every page is setup correctly(During the backup process). • Checks integrity of backups -Impact is negligible • Allows you to check the integrity of backups using Restore verify only.
VERIFY BACKUPS: EXAMPLE § Test 1; Backup database without Checksum option
VERIFY BACKUPS: EXAMPLE § Test 2; Backup database with the Checksum option
VERIFY BACKUPS: §EXAMPLE Test 3; Backup database with Checksum option With Error(Force Backup to succeed)
VERIFY BACKUPS: EXAMPLE § Test 4; Backup database with trace flag 3023 (same as using Checksum) (Force Backup success)
RESTORING FROM BACKUPS § System databases • Use backup from the same service pack level as SQL Server instance • • E. g. RTM backup of master will not restore on SP 1 Restore model and msdb same as user databases. • Version restore Enterprise; version of backup containing enterprise only features cant be restored to other features • Can be a setback • Use DMV sys. dm_db_persisted_sku feature to whether any of your databases are enterprise –only.
- Slides: 71