Disaster Recovery: Plan Development Mark Carlson, VP of Information Technology SMI – Carrollton, GA ERICSA 50th Annual Training Conference & Exposition ▪ May 19 – 23 ▪ Hilton Orlando Lake Buena Vista, Florida
Topics • Original versus Current BC Plan • Roles and People • Plan Development – Plan Creation – Tabletop Exercises – Recovery Tests – Disaster Events
Original Plan • FEMA Based – Full featured BCP with focus on “the plan” • Components – Risk Assessment – Emergency Communication Plan Quick Reference – Emergency Action Plan – Incident Prevention Plan – Testing Scripts … and more
Current Plan • “Operations” based – Simplified DR plan with focus on high risk and the highest probability outage scenarios • Primary Component – Scenario Tables (including onsite and offsite) • Associated Components – Tabletop exercises – Testing scripts
From Then to Now • Previous Plan – Formal BC plan perspective – Offsite recovery – Monolithic document – Created independently – Unknown shelf-ware • New Plan – Operational service perspective – Off and Onsite recovery – Bite-sized pieces – Shared among Testing, IM, SR, CM processes – Underpinning knowledge, shared during outages
Roles and People • Business Continuity Team – BC Plan Manager – BC Plan Testing Team • Incident Response Team – Incident Response Coordinator – Recovery Agents…
Roles and People Cont. • Specific Positions – Operational Management – Technical Management – Chief Operating Officer • Personalities
Plan Development: Workflow [diagram: Update, Plan Creation, Tabletop Exercises, Disaster Test, Disaster]
Plan Creation [workflow diagram: Update, Plan Creation, Tabletop Exercises, Recovery Test, Disaster]
Plan Creation • Survey Operational Environment • Define High Risk, Most Common Scenarios • Organize by Operational / Technical Services • Plug-in Recovery Activities into “Scenario Tables” • Create Tabletop & Testing Script Documents
Plan Creation: Table TOC
Scenario Table Structure • Scenario Setup – Description, Services Affected, Mitigation, Symptoms, Response Coordinator • Phase 1 – Impact Assessment • Phase 2 – Recovery • Phase 3 – Restoration
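The scenario table structure above can be sketched as a small data model. This is only an illustration, assuming one record per scenario; the class and field names (`ScenarioTable`, `Phase`, `phases_without_activities`) and the sample scenario are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Phase:
    """One phase of a scenario table, holding its ordered activities."""
    name: str
    activities: List[str] = field(default_factory=list)


def default_phases() -> List[Phase]:
    # The three phases named on the slide, in order.
    return [
        Phase("Phase 1 - Impact Assessment"),
        Phase("Phase 2 - Recovery"),
        Phase("Phase 3 - Restoration"),
    ]


@dataclass
class ScenarioTable:
    """Scenario setup fields plus the three phases (field names are assumptions)."""
    description: str
    services_affected: List[str]
    mitigation: str
    symptoms: List[str]
    response_coordinator: str
    phases: List[Phase] = field(default_factory=default_phases)

    def phases_without_activities(self) -> List[str]:
        """Names of phases that still need recovery activities plugged in."""
        return [p.name for p in self.phases if not p.activities]


# Hypothetical scenario, for illustration only.
table = ScenarioTable(
    description="Primary database server unavailable",
    services_affected=["Payment processing"],
    mitigation="Nightly backups",
    symptoms=["Application login errors"],
    response_coordinator="Incident Response Coordinator",
)
table.phases[1].activities.append("Restore latest backup to standby server")
print(table.phases_without_activities())
```

A check like `phases_without_activities` is one way a plan manager might spot scenario tables that are not yet ready for a tabletop exercise.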
Sample Scenario Table
Sample Scenario Table Cont.
Tabletop Exercises [workflow diagram: Update, Plan Creation, Tabletop Exercises, Recovery Test, Disaster]
Tabletop Exercise • Scenario Table Walkthrough – Mental exercise for DR team (Recovery Phase) – Notes appended to the scenario tables • Scenario Table Quality Improvements – BC Manager compiles feedback – Operational validation – Technical details may be added – Contact information verified
Tabletop Exercise Document
Recovery Test [workflow diagram: Update, Plan Creation, Tabletop Exercises, Recovery Test, Disaster]
Recovery Test Documents • Testing Scripts • Disaster Recovery Scorecard
Recovery Test Scripts
Recovery Test Scripts Cont.
Testing Scorecard • Setup & Dismantle Results • Application Results
Testing Scorecard: Infrastructure
Testing Scorecard: Applications
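A scorecard of this kind can be tallied per section with a few lines of code. The sketch below assumes each scorecard row is a (section, item, passed) tuple; the function name, the row layout, and the sample rows are hypothetical, since the actual scorecard appears only as an image in the slides.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def summarize_scorecard(
    results: List[Tuple[str, str, bool]]
) -> Dict[str, Tuple[int, int]]:
    """Return {section: (items passed, items tested)} for a recovery test."""
    counts = defaultdict(lambda: [0, 0])  # section -> [passed, total]
    for section, _item, passed in results:
        counts[section][1] += 1
        if passed:
            counts[section][0] += 1
    return {section: (ok, total) for section, (ok, total) in counts.items()}


# Hypothetical rows, for illustration only.
rows = [
    ("Infrastructure", "Network connectivity", True),
    ("Infrastructure", "Server restore", False),
    ("Applications", "Payment application login", True),
]
for section, (ok, total) in summarize_scorecard(rows).items():
    print(f"{section}: {ok}/{total} passed")
```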
Disaster Event [workflow diagram: Update, Plan Creation, Tabletop Exercises, Disaster Test, Disaster]
Disaster Event • Conference Call • Disseminate the Scenario • Involve the BC Manager • Treat it as a Tabletop Exercise • Update Plan
From Then to Now • Previous Plan – Formal BC plan perspective – Offsite recovery – Monolithic document – Created independently – Unknown shelf-ware • New Plan – Operational service perspective – Off and Onsite recovery – Bite-sized pieces – Shared among Testing, IM, SR, CM processes – Underpinning knowledge, shared during outages
Remember • Organization is Key – Define operational and technical services to drive plan scenarios and to integrate with processes • Keep it Simple and Practical – The most comprehensive Business Continuity plan is often the least used – Create & tabletop scenarios with a history FIRST • Leverage Personalities and People
Logistics of Disaster Recovery: The First 24-72 Hours John Popa, Xerox State and Local Solutions ERICSA 50th Annual Training Conference & Exposition ▪ May 19 – 23 ▪ Hilton Orlando Lake Buena Vista, Florida
Definitions • Business Continuity • Disaster Recovery • Business Recovery • Back up Hot Site • Back up Cold Site
Our Approach • Designate a primary DR site for each of our operational State Disbursement Units • PA SDU is the hot site for 8 of the Xerox-operated SDUs • Same or similar hardware, software and procedures • All systems are loaded, maintained and tested on a periodic basis
New York Preparedness Example • Two DR tests per year • Client attends and brings their own data set • Each task must be completed at the recovery site and match the test case outcomes predicted by the customer
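The pass criterion in this example (every task completed at the recovery site, with outcomes matching what the customer predicted) can be expressed as a simple comparison. This is a minimal sketch, assuming outcomes are recorded as plain strings keyed by task name; the function name and the sample tasks are hypothetical.

```python
from typing import Dict, Optional, Tuple


def compare_outcomes(
    expected: Dict[str, str], actual: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {task: (expected outcome, actual outcome)} for every mismatch.

    A task missing from `actual` is treated as not completed (actual is None).
    """
    mismatches = {}
    for task, want in expected.items():
        got = actual.get(task)
        if got != want:
            mismatches[task] = (want, got)
    return mismatches


# Hypothetical test cases, for illustration only.
predicted = {"Post payment batch": "25 payments posted", "Print checks": "10 checks"}
observed = {"Post payment batch": "25 payments posted"}
print(compare_outcomes(predicted, observed))
```

An empty result means the DR test met the criterion; anything else lists the tasks to investigate.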
LOGISTICS IN A TRUE DISASTER
A Disaster is Declared… What needs to be done in 24-72 hours? • Communicate with State Leadership • Communicate with the affected site personnel • Staffing the additional work • Ready technology to activate hardware and software • Initiate operations checklists • Refine the process where needed • Affected site begins business recovery analysis • Communicate with Internal Leadership • Commit to restore operations in the hot site
Staffing • Mobilize Xerox and local staffing resources • Transportation and lodging for staff relocating to the recovery site • Activate training for all staff working the DR Project • Activate local security for all new and visiting staff
Communications • Command Center coordinates disaster site recovery and hot site operation • Establish hour-by-hour communication between Command Center and all recovery functional elements • Establish local communications capability for visiting staff members • Ensure that connectivity and communications with the depository bank are established and working • Establish client-related communications, reporting, status, and program-related items • Ensure IVR or web-based communications reflect current disaster status, alternate resources and recovery timeframes to keep all stakeholders informed
Infrastructure • Facility – Ready office space and training facilities – Install/energize additional office equipment – Prepare storage/secure processing areas • Equipment – Prepare work stations – Printer access – Provide additional telecom services – Review and implement records retention requirements and storage areas • Technology – Establish and test connectivity internal/external – Prepare scanners, robotics, and ancillary systems – Prepare systems security – Energize and test processing equipment
Operations • Segregation of work from all other work performed for the host State • Payment Processing: scanning, payment recording, deposit preparation • Transmissions/Reconciliation Process • Disbursements Processing • Banking and Reconciliation • Program Reporting • Print Services (Coupons, Checks, Special Processing Instructions) • Customer Service
Refining the Process • Analyze / Communicate Site Performance and Recommend Improvements • Implement Improvements and Disaster Recovery Plan Updates • Stabilize Processing Environment • Rebuilding the Disaster Site – Assesses damage, plans for and rebuilds – Coordinates with DR Site for planning and cutover – Return to normal operations
Physical Disasters: Preventative Planning and Recovery Alisha A. Griffin, IV-D Director, New Jersey ERICSA 50th Annual Training Conference & Exposition ▪ May 19 – 23 ▪ Hilton Orlando Lake Buena Vista, Florida
Disaster: • a sudden calamitous event bringing great damage or destruction • a sudden or great misfortune or failure • Synonyms: apocalypse, catastrophe, debacle, tragedy
Goals: • Manage Effectively: a potential disaster or only an incident / failure • Ensure Quality Customer Service
Key Components • Ownership • Documentation and Planning • Disaster Recovery • Business Continuity
Ownership • It’s Everyone’s Responsibility – Technical – Program • Communication – Internal – External – Customers
Responsibility: • Technical – Create and Maintain Matrix: all infrastructures, all interfaces, all connections – Prioritize • Program – Key Managers, Central – Key Managers, Local
Communication: • Internal, Program / Team – Maintain a master list of: primary email, secondary email, primary phone number, secondary phone numbers – Agenda item at regular meetings – Service Level Agreements (SLAs) • Internal, Executive – Public Information Officer – Engage other Agency Heads – General
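A master contact list is only useful if it is complete, so a periodic check for gaps may be worth automating. The sketch below assumes the list is kept as a mapping from person to contact fields; the field names mirror the four items on the slide, while the function name and sample entries are hypothetical.

```python
from typing import Dict, List

# The four contact items named on the slide (key names are assumptions).
REQUIRED_FIELDS = (
    "primary_email",
    "secondary_email",
    "primary_phone",
    "secondary_phone",
)


def missing_contact_fields(
    master_list: Dict[str, Dict[str, str]]
) -> Dict[str, List[str]]:
    """Return {person: [missing fields]} for every incomplete entry."""
    gaps = {}
    for name, entry in master_list.items():
        missing = [f for f in REQUIRED_FIELDS if not (entry.get(f) or "").strip()]
        if missing:
            gaps[name] = missing
    return gaps


# Hypothetical entries, for illustration only.
contacts = {
    "Key Manager A": {
        "primary_email": "a@example.org",
        "secondary_email": "a.home@example.org",
        "primary_phone": "555-0100",
        "secondary_phone": "555-0101",
    },
    "Key Manager B": {"primary_email": "b@example.org", "primary_phone": " "},
}
print(missing_contact_fields(contacts))
```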
Communication • External – Contracts / Outside Agencies: incorporate into contracts, incorporate into Matrix, assign a point of communication • Customers – Notification: websites, IVR, notices, press releases, postings
Documentation and Planning • Every Day / Don’t Delay • Rigorous Attention and Priority • Schedule Regular Updates • QA
Disaster Recovery • Inventory of Systems – Main – Side components / key interfaces – Local operations • Continuity – Assess for vulnerability – Alternative options – Full and partial – Short and long term
Disaster Recovery • Biennial Requirements – Environments: Full, Partial – Tests: Regular / Episodic, Full / Partial
DR Readiness Checklist Responsible Duration Estimate 1. DR Systems Rob Cislak 2 weeks Team DR Coordinator or Deputy 2. Send e-mail to DR Distribution list of GO/NO GO decision. Notify Stakeholders of test DR Coordinator or Deputy, Program Owner Ensure accessibility and DR System privileges in NJKiDS DR Team servers including Autosys for Batch Operations 3. 4. 5. Verify Connectivity from SAFE, Datamation and Bull Mainframe DR System 6. Verify and E-mail- NJKiDS processing is complete NJKiDS Batch 7. Notify the DR Coordinator that batch processing complete. Verify all code is in synch between production and DR servers. NJKiDS Batch 8. 9. Verify all Interface files are in synch between production and DR servers. Resource Group Action Step Connectivity test using test page. 10. Notify Elapsed Estimate Date Time Status Description / Notes 10/5 Completed At least 2 weeks prior to test Rob Cislak 10/18 Completed At least 1 week before test Rob Cislak 10/15 Completed At least 1 week before test Jayakumar, Vijay Prabhu 10/16 Completed At least 1 week before test Greg Steen 10/19 Completed At least 1 week before test Jayakumar, Vijay Prabhu 10/20 Day of cutover Operations Jayakumar, Vijay Prabhu EBSU AI Gottsch 10/20 Day of cutover NJKiDS Batch Jayakumar, Vijay Prabhu 10/20 Day of cutover Ed Michalak, Rob Cislak 10/20 Day of Cutover Team Operations Restoration and DR Coordinator or Support Team (EBSU) to Deputy prepare servers
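The readiness checklist ties each step to a lead time ("At least 2 weeks prior to test", "At least 1 week before test", "Day of cutover"). A rough sketch of turning those notes into latest completion dates follows. It assumes the test date and the cutover date are the same day, and the function names, the lookup table, and the example date are hypothetical rather than taken from the checklist.

```python
from datetime import date, timedelta

# Lead-time notes as worded in the checklist, mapped to a time offset.
LEAD_TIMES = {
    "At least 2 weeks prior to test": timedelta(weeks=2),
    "At least 1 week before test": timedelta(weeks=1),
    "Day of cutover": timedelta(0),
}


def latest_completion_date(cutover: date, note: str) -> date:
    """Latest date a step may be completed, given its lead-time note."""
    return cutover - LEAD_TIMES[note]


def is_on_time(completed: date, cutover: date, note: str) -> bool:
    """True if the step was completed on or before its latest allowed date."""
    return completed <= latest_completion_date(cutover, note)


# Hypothetical cutover date, for illustration only.
cutover_day = date(2030, 10, 20)
for note in LEAD_TIMES:
    print(note, "->", latest_completion_date(cutover_day, note).isoformat())
```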
Failover to DR Checklist
Step | Action | Resource Group | Responsible | Duration Estimate | Elapsed Estimate | Date | Time | Status | Description / Notes
1. | Initial Damage assessment | Disaster Recovery Coordinator/Deputy DRC | Ed Michalak, Rob Cislak | | | 10/21 | 7:00 AM | |
2. | Confirm a Disaster has been Declared | Disaster Recovery Coordinator/Deputy DRC | Ed Michalak, Rob Cislak | | | 10/21 | 7:00 AM | |
3. | Contact all DR Team(s) Members | Disaster Recovery Coordinator/Deputy DRC | Ed Michalak, Rob Cislak | | | 10/21 | 7:00 AM | | EMAIL DR team after completion
 | Ensure Network Readiness | Emergency Management Team Restoration and Support Operations Team (Network) | Steve Pohler | | | 10/21 | 7:15 AM | | EMAIL DR team after completion
a. | Network monitoring tools (PRTG, Whatsup, etc) | Restoration and Support Operations Team (Network) | Vivek Bansal, Paul Bostwick | | | 10/21 | | |
b. | Wide Area Link | Restoration and Support Operations Team (Network) | Vivek Bansal, Paul Bostwick | | | 10/21 | | |
c. | Routers | Restoration and Support Operations Team (Network) | Vivek Bansal, Paul Bostwick | | | 10/21 | | |
d. | Switches | Restoration and Support | Vivek Bansal, Paul Bostwick | | | 10/21 | | |
Ensure DR Readiness 4. 30 mins
October 29, 2012 - Super Storm Sandy hits the east coast
Hurricane Sandy Keys
Hurricane Sandy Keys (continued) • Management • Lessons Learned – Flooding of the Delaware / Hurricane Irene / Anthrax • Communication – Emergency Management – Loss of contact • Business Continuity: Options – Assessing and reassessing regularly – Staff impact
Outcomes
Future Improvement
Future Improvement • Movement of checks (done) • Distributed Systems – Ability to prioritize and respond • Reliance on other Systems • Document Imaging
Questions?