SvalTech
Getting Started Using Database Archiving
Toronto DAMA Chapter Meeting, 16 September 2009
Jack E. Olson
jack.olson@SvalTech.com
www.svaltech.com
“Database Archiving: How to Keep Lots of Data for a Long Time,” Jack E. Olson, Morgan Kaufmann, 2008
Copyright SvalTech, Inc., 2009
Why This Presentation
• A common position of many IT shops is
  – We know we should be doing database archiving
  – We know it will be valuable to us
  – But we don’t know how to get started
• Database archiving is an enterprise technology: it can be used in many applications
• Not all database applications are suitable for database archiving
• Suitable applications have widely differing return-on-investment potential
The Database Archiving Survey
• organize survey team
• application enumeration
• first-cut feasibility
• data-life-cycle analysis
• operational analysis
• risk analysis
• metric gathering
• evaluate implementation options
• business case development
• prioritization
The Survey Organization
Mandate | People | Inputs
Mandate: a management directive that creates the database archiving survey task force and gives it the scope and objectives of the study.
• Scope: business units to include, organizational units (divisions, companies, campuses)
• Objectives: find the best candidates for cost reduction, fixing operational problems, and risk reduction
The Survey Team
• Chair
• Full-time members
  – IT/enterprise architect
  – storage administration
  – records retention
• Subject matter members
  – database architect
  – data management
  – business unit data analyst
  – database administration
• Incidental members
  – legal department
  – IT compliance
  – data governance
  – security administration
  – data analyst (BI type)
Starting Materials
• Enterprise data model
• Data classification results
• SLAs
• IT storage strategy
• Regulations/compliance rules
• Data governance mandates
Application Enumeration
• Limit search to those within mandate
  – business unit
  – location
  – enterprise
• Identify operational applications
  – classify as transactional vs. static data
  – include those already archiving to any extent
• Identify retired applications still retaining data
• Identify applications about to change
  – consolidations
  – planned or recent acquisitions
  – replacements/conversions/reengineering
  – identify any strategies for application retirement
Application Enumeration
For applications with potential:
• Capture application data model
• Identify business records within the data model
• Connect business records to records retention and legal categories
• Identify database information: system/DBMS/file/metadata
• Create a Database Topology chart
• Identify parallel applications within the corporation (even if out of scope)
• Identify operational replicates
• Identify backup/disaster recovery stores and strategies
• Identify recurring data extracts for BI, etc.
• Get a rough idea of database size and transaction rates
Database Topology Chart
[Diagram: nodes include create data, operational replicate, operational archive, offline storage, backup, BI stores, CRM, backup, and disaster recovery]
First-Cut Feasibility
Factors for continuing to consider an application:
• important data
• lots of individual business records
• simple data structures
• relatively stable data structures (little change)
• long retention requirement
• long inactive period within retention requirement
• low frequency access requirement in inactive period
• low performance requirement in inactive period
• simple access requirements in inactive period
Apply the criteria again after each subsequent step to further eliminate bad candidates.
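The first-cut screen can be sketched as a simple checklist count. This is only an illustration, not a method given in the presentation: the field names, the cut-off of 7 out of 9, and the sample answers are all assumptions made for the example.

```python
# The nine factors from the slide, under hypothetical field names.
FACTORS = (
    "important_data",
    "many_business_records",
    "simple_structures",
    "stable_structures",
    "long_retention",
    "long_inactive_period",
    "low_access_frequency_when_inactive",
    "low_performance_need_when_inactive",
    "simple_access_when_inactive",
)


def feasibility_score(answers: dict[str, bool]) -> int:
    """Count how many of the nine factors an application satisfies."""
    return sum(1 for factor in FACTORS if answers.get(factor, False))


def first_cut(apps: dict[str, dict[str, bool]], min_score: int = 7) -> list[str]:
    """Return the applications that stay on the candidate list, best first.

    min_score is an assumed cut-off; a survey team would choose its own.
    """
    scored = [(feasibility_score(answers), name) for name, answers in apps.items()]
    kept = [(score, name) for score, name in scored if score >= min_score]
    return [name for score, name in sorted(kept, key=lambda t: (-t[0], t[1]))]


if __name__ == "__main__":
    # Illustrative answers only; real answers come from the survey.
    apps = {
        "card_transactions": {factor: True for factor in FACTORS},
        "customer_master": {"important_data": True, "long_retention": True},
    }
    print(first_cut(apps))
```

Re-running the same function with updated answers after each later survey step mirrors the advice to reapply the criteria as more is learned.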
Examples
Good:
• Bank deposits and withdrawals
• Stock trades
• Credit card transactions
• Ticketmaster transactions
• Medical claim data
• Casualty claim data (auto, home)
• Retail sales inventory transactions
• Package tracking
• Passenger flight data
• Driver license records
• Sales tax records
• Property tax records
• Telephone call transactions
• Nuclear reactor monitoring records
• Auto warranty records
Not good:
• Customer master data
• Airplane manufacturing records
• HR records
• Felony records
• Home sales
Data Life Cycle Analysis
Create a database archiving DLCA for each business record type:
• Data Retention Chart
• Business Record Process Chart to determine inactive period
• Business Record SLA chart by age of record
Data Retention Chart
The requirement to keep data for a business object for a specified period of time. The object cannot be destroyed until the time for all such requirements applicable to it has passed.
[Chart: one requirement line per source, grouped as Business Requirements and Regulatory Requirements]
The data retention requirement is the longest of all requirement lines.
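The "longest line wins" rule is a maximum over the applicable requirements. A minimal sketch, assuming each requirement is expressed in whole years and that the retention clock starts on a known date; the requirement names, the year values, and the choice of start event are hypothetical.

```python
from datetime import date


def retention_years(requirements: dict[str, int]) -> int:
    """The retention requirement is the longest of all requirement lines."""
    if not requirements:
        raise ValueError("at least one retention requirement is needed")
    return max(requirements.values())


def earliest_destroy_date(retention_start: date, requirements: dict[str, int]) -> date:
    """First date on which every applicable requirement has passed."""
    target_year = retention_start.year + retention_years(requirements)
    try:
        return retention_start.replace(year=target_year)
    except ValueError:
        # 29 February start date and a non-leap target year: use 1 March.
        return retention_start.replace(year=target_year, month=3, day=1)


if __name__ == "__main__":
    # Hypothetical requirement lines for one business record type.
    lines = {"business_use": 3, "tax_regulation": 7, "industry_regulation": 5}
    print(retention_years(lines))
    print(earliest_destroy_date(date(2009, 9, 16), lines))
```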
Business Record Process Chart
For a single instance of a data object.
[Chart: events plotted against time across the retention requirement, which is divided into operational, reference, and inactive phases]
Events shown:
• Create PO
• Update PO
• Create Invoice
• Backorder
• Create Financial Record
• Update on Ship
• Update on Ack
• Weekly Sales Report
• Quarterly Sales Report
• Extract for data warehouse
• Extract for business analysis
• Common customer queries
• Common business queries
• Ad hoc requests
• Lawsuit e-Discovery requests
• Investigation data gathering
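One way to read phase boundaries off such a chart is sketched below. It rests on an interpretation that the slide does not spell out: the operational phase ends with the last event that changes the record, the reference phase ends with the last routine read, and whatever remains of the retention requirement is the inactive period. The event kinds and all day counts are invented for illustration.

```python
from typing import NamedTuple


class Event(NamedTuple):
    name: str
    kind: str      # "update", "routine_read", or "rare_read" (hypothetical labels)
    last_day: int  # latest record age, in days, at which the event normally occurs


def phase_boundaries(events: list[Event], retention_days: int) -> dict[str, tuple[int, int]]:
    """Return (start_day, end_day) for each phase of a single business record."""
    op_end = max((e.last_day for e in events if e.kind == "update"), default=0)
    ref_end = max((e.last_day for e in events if e.kind == "routine_read"), default=op_end)
    ref_end = max(ref_end, op_end)  # the reference phase cannot end before it starts
    inactive_end = max(retention_days, ref_end)
    return {
        "operational": (0, op_end),
        "reference": (op_end, ref_end),
        "inactive": (ref_end, inactive_end),
    }


if __name__ == "__main__":
    # Invented timings for a purchase-order record with a 7-year retention.
    events = [
        Event("Create PO", "update", 0),
        Event("Update on Ship", "update", 30),
        Event("Extract for data warehouse", "routine_read", 95),
        Event("Quarterly Sales Report", "routine_read", 120),
        Event("Ad hoc requests", "rare_read", 2000),
    ]
    for phase, span in phase_boundaries(events, retention_days=7 * 365).items():
        print(phase, span)
```

Rare reads such as ad hoc or e-Discovery requests are deliberately ignored when placing the boundaries, since they can occur at any point in the inactive period.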
Business Record SLA Chart by Age
For a single instance of a data object.
[Chart: service levels plotted against time across the retention requirement, which is divided into operational, reference, and inactive phases. Labels: query response time; transaction volume; create/update; security (no users); read]
Operational Analysis
• Don’t assume there are no problems.
• Talk to DBAs and users.
• Look for trends.
• Look for escalating operational costs.
• Get numbers.
Operational Analysis
• Performance Issues
  – Not meeting response time SLA
  – Longer time to run extracts
  – Longer time to run backups
  – Longer time to run database reorganizations
  – Running reorganizations more frequently
  – More difficult to tune
• Risk Issues
  – Longer estimated time to run recovery
  – Longer estimated time to run disaster recovery
• Cost Issues
  – Higher annual hardware costs
  – Higher annual MIPS-based software cost
  – Adding expensive DASD to support database and backups
Risk Analysis
• Data Loss Risk
  – Isolation from internet hackers
  – Prevent ANY updates or deletes
  – Preserve data through multi-site backups and periodic pings
• Data Quality Risk
  – Changing data structures and column semantics
  – Changing reference data
• Unauthorized Access Risk
  – Reduced (or different) user set
  – Audit trail of access
• Legal Risk
  – Preserve authenticity of data in archive
  – Reduce cost and time to produce data for discovery requests
Metric Gathering
• Data
  – bytes stored per business object
  – new transactions created per day
  – bytes for backups, replicates
  – growth in transaction rates
  – any sudden expected additions
  – past history plus future projections
• Storage Costs
  – cost per byte: operational
  – cost per byte: backup
  – cost per byte: archive
  – compression ratios
• System Costs
  – MIPS required to process
  – software license fees
  – staff for operational
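Once gathered, these metrics combine into simple volume and cost estimates. The sketch below is one possible arrangement, not a formula from the presentation: it prices storage per gigabyte per year rather than per byte for readability, compounds the daily transaction rate once a year, and uses made-up figures throughout.

```python
def projected_records(current: int, new_per_day: float, annual_growth: float, years: int) -> int:
    """Project the record count, growing the daily creation rate once per year."""
    total = current
    rate = new_per_day
    for _ in range(years):
        total += round(rate * 365)
        rate *= 1 + annual_growth
    return total


def annual_storage_cost(
    records: int,
    bytes_per_record: int,
    cost_per_gb_year: float,
    copies: int = 1,
    compression_ratio: float = 1.0,
) -> float:
    """Yearly storage cost for one store (operational, backup, or archive).

    copies covers backups and replicates; compression_ratio is
    uncompressed size divided by stored size.
    """
    stored_gb = records * bytes_per_record * copies / compression_ratio / 1e9
    return stored_gb * cost_per_gb_year


if __name__ == "__main__":
    # Made-up inputs: 1 million records today, 1,000 new per day, no growth.
    records = projected_records(1_000_000, 1_000, 0.0, years=2)
    print(records)
    print(annual_storage_cost(records, 1_000, cost_per_gb_year=2.0, copies=3))
```

Running the cost function once per store with that store's own unit cost gives the operational, backup, and archive lines that the business case later compares.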
Metric Gathering
For retired applications, concentrate on:
• displaced system cost
• displaced software cost
• displaced staff cost
[Diagram: two stacks labeled "NOT shared" and "Shared". One stack: IBM mainframe, IMS DBMS, CICS, DBA/SYSPROG. The other: Linux server, archive software, JDBC, archive admin]
Evaluate Implementation Options
• Software
  – Vendor-provided software
  – Custom-built solution
• Access tools
  – Original application
  – Generic report generation/query tools
  – Custom-built
• Storage for archive
  – Storage subsystem
  – Hosted storage
Architecture of Database Archiving
[Diagram: an operational system (application program, operational database, archive extractor) and an archive server (archive catalog, archive storage). Components shown: Archive Extractor, Archive Administrator, Archive Designer, Archive Data Manager, Archive Access Manager]
Estimate Implementation Time and Cost
• Archiving systems required
  – Servers
  – Storage systems (hosted storage?)
  – Licensed software
• Application design
• Implementation
• Test
• Deployment
• Ongoing operation and administration
Business Case Development
  – Lower IT costs
  – Improved operational efficiency
  – Risk reduction
Lower IT Costs
• Systems
  – Reduce size/cost of operational systems
  – Put off or eliminate need for system upgrades
• Software
  – Eliminate or reduce cost of expensive system software
    • DBMS
    • Transaction system
  – Eliminate or reduce cost of application software
• Storage costs
  – Switch to lower cost storage
  – Impact on backups/disaster recovery stores
  – Reduction in byte count stored
• Staff
  – Eliminate or reduce legacy system staff
Chart It All
[Chart: size today, comparing "all data in operational db" (a single operational bar) with "inactive data in archive db" (an operational bar plus an archive bar)]
• Operational: most expensive system, most expensive storage, most expensive software
• Archive: least expensive system, least expensive storage, least expensive software
• In a typical operational database, 60-80% of the data is inactive. This percentage is growing.
Lower IT Costs
• First year impact
• Time to recover project costs
• Chart cost savings over time
  – Plot data growth over time for operational
  – Plot data growth over time of archive
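The savings chart and the time to recover project costs can be approximated with a small projection. This is a sketch under stated assumptions, not the presenter's model: costs are per gigabyte per year, data grows at a fixed annual rate, and the inactive share is held constant even though the deck notes that it tends to grow. All figures are invented.

```python
from typing import Optional


def yearly_costs(
    total_gb: float,
    annual_growth: float,
    inactive_share: float,
    op_cost_per_gb: float,
    archive_cost_per_gb: float,
    years: int,
) -> list[tuple[int, float, float]]:
    """Return (year, cost without archiving, cost with archiving) per year."""
    rows = []
    for year in range(1, years + 1):
        size = total_gb * (1 + annual_growth) ** year
        without = size * op_cost_per_gb
        with_archive = (
            size * (1 - inactive_share) * op_cost_per_gb
            + size * inactive_share * archive_cost_per_gb
        )
        rows.append((year, without, with_archive))
    return rows


def payback_year(rows: list[tuple[int, float, float]], project_cost: float) -> Optional[int]:
    """First year in which cumulative savings cover the project cost, if any."""
    cumulative = 0.0
    for year, without, with_archive in rows:
        cumulative += without - with_archive
        if cumulative >= project_cost:
            return year
    return None


if __name__ == "__main__":
    # Invented example: 1,000 GB, 70% inactive, archive storage at a tenth of the cost.
    rows = yearly_costs(1_000, 0.0, 0.7, 10.0, 1.0, years=5)
    for year, without, with_archive in rows:
        print(year, round(without), round(with_archive))
    print(payback_year(rows, project_cost=15_000))
```

The first row of the output corresponds to the first-year impact, and the two cost columns are the series to plot over time.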
Operational Improvements
• Itemize improvements expected
  – Performance of operations
  – Reduction of utility times
  – Reduction of recovery times
  – Reduction of disaster recovery times
  – Reduction of DBA workload
• Provide cost savings where appropriate
Risk Reduction
• Itemize improvements expected
  – Less risk of failing e-Discovery request
  – Enhanced data quality of older data
  – Less exposure to loss of data authenticity
  – Better access control
  – Better compliance
  – Better data governance
  – Less dependence on legacy systems
• Provide cost savings where appropriate
Prioritization
  – Determine prioritization criteria
    • Cost is the most common primary factor
  – First archiving project may have other goals
    • Lower risk of failure
    • Faster implementation
    • Faster return on investment
    • Usually a retired application project
  – Risk may override other factors
    • Preserve data authenticity
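These criteria can be turned into a sort order, as in the sketch below. The ordering rules are one reading of the slide rather than a prescribed procedure: a data-authenticity risk always ranks first, cost savings is the default primary factor, and a first project favors retired applications and short implementations. The class, its fields, and the sample candidates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    annual_savings: float
    months_to_implement: int
    retired: bool = False
    authenticity_at_risk: bool = False


def prioritize(candidates: list[Candidate], first_project: bool = False) -> list[str]:
    """Order candidate applications; earlier in the list means higher priority."""

    def key(c: Candidate):
        if first_project:
            # Risk override, then retired applications, then speed, then savings.
            return (not c.authenticity_at_risk, not c.retired,
                    c.months_to_implement, -c.annual_savings)
        # Risk override, then cost savings as the primary factor.
        return (not c.authenticity_at_risk, -c.annual_savings)

    return [c.name for c in sorted(candidates, key=key)]


if __name__ == "__main__":
    # Invented candidates for illustration.
    pool = [
        Candidate("claims", 2_000_000, 18),
        Candidate("legacy_billing", 500_000, 6, retired=True),
        Candidate("trade_history", 100_000, 12, authenticity_at_risk=True),
    ]
    print(prioritize(pool))
    print(prioritize(pool, first_project=True))
```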
Final Thought
• Always do a survey to find the best applications to start with.
• Always do a survey to separate the applications that make sense to proceed with from those that do not: don’t waste time on apps that are too hard to implement or that will have little value.
• A good database archive application can save millions of dollars per year, increase performance of operational systems, and reduce risk, all at the same time. The trick is identifying such applications and proving it.
• Repeat the Database Archiving Survey from time to time.