EMC Next Generation Backup Data DeDuplication High Level
EMC Next Generation Backup & Data De-Duplication High Level Overview and Strategy Joe Staiber EMC Corporation Data De-Duplication Product Manager Backup, Recovery and Archive Division © Copyright 2009 EMC Corporation. All rights reserved. Joe Staiber 1
Typical Issues with Traditional Backup: Long Backup Windows; Backup Servers; Affecting Production; Server Cost / Licensing; Tape Cost; VM Guest Proliferation; License Cost; Off-site storage; Cost to use Disk Technology; Iron Mountain / Transport; Client Licensing; Tape Rotation and Changes; VMware Resources; Restore times; VCB Infrastructure; Restore complexity; Tape Drive Failure; Multiple Solutions; Tape Read/Write Errors; Remote office backup; Tape Drive Maintenance; DR / Business Continuity; Intraday Restore needs; GROWTH / TIME; Retention.
What is Data De-Duplication? An Analogy. How many times does the word "THE" appear in a sentence, a chapter, an entire book, a library? Data is not unlike words in print, only instead of words, data uses strings of 1's and 0's. A book may contain 4 million words but only 200,000 different words; the other 3.8 million words are repeats, some of them hundreds or thousands of times. The amount of de-duplication possible in your data center is in line with these numbers. It's staggering. "Would you rather copy 200,000 or 4 million words every day?"
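The analogy can be sketched in a few lines of Python. This is only an illustration of the word-counting idea, not of any product's algorithm; the function name `dedup_stats` and the sample sentence are made up for the example, while the 4 million and 200,000 figures come from the slide.

```python
from collections import Counter


def dedup_stats(text: str) -> tuple[int, int]:
    """Return (total_words, unique_words) for a whitespace-separated text."""
    words = text.lower().split()
    return len(words), len(Counter(words))


# Hypothetical sample sentence, used only to exercise the function.
sample = "the cat sat on the mat and the dog sat on the rug"
total, unique = dedup_stats(sample)
print(f"{total} words, {unique} unique, {total - unique} repeats")

# The slide's book example: 4 million words, 200,000 of them distinct.
book_total, book_unique = 4_000_000, 200_000
book_repeats = book_total - book_unique   # words that never need copying again
book_ratio = book_total / book_unique     # total words per distinct word
print(f"{book_repeats:,} repeats, roughly {book_ratio:.0f}:1")
```

With the slide's numbers, the repeats come to 3.8 million words, which is a ratio of about 20 total words for every distinct word.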
How it Works: Simple Example of Global, Source Data De-duplication. Data Center, First Instance (segments A, B, C, D): only unique data segments are backed up. Remote Site 1, Duplicate Instance (segments A, B, C, D): data already backed up, so only unique IDs are stored (20-byte pointers). Remote Site 2, Modified Instance (segment E): new data segment identified and backed up. De-duplication Server (stored backup data).
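The flow on this slide can be sketched as a toy segment store. Everything here is an assumption made for illustration: the class name `DedupServer`, the 1 KB segment size, and the use of SHA-1 for the 20-byte ID (SHA-1 happens to produce 20-byte digests, but the slide does not name the hash). The modified instance is modeled as the original four segments plus a new segment E.

```python
import hashlib


class DedupServer:
    """Toy global segment store: each unique segment is kept once, keyed by its hash."""

    def __init__(self) -> None:
        self.segments: dict[bytes, bytes] = {}

    def backup(self, segments: list[bytes]) -> tuple[list[bytes], int]:
        """Back up segments; return (list of 20-byte IDs, payload bytes actually sent)."""
        ids, sent = [], 0
        for seg in segments:
            seg_id = hashlib.sha1(seg).digest()  # 20-byte ID (assumed hash)
            if seg_id not in self.segments:      # only new segments travel
                self.segments[seg_id] = seg
                sent += len(seg)
            ids.append(seg_id)
        return ids, sent

    def restore(self, ids: list[bytes]) -> bytes:
        """Rebuild the original data from its list of segment IDs."""
        return b"".join(self.segments[i] for i in ids)


server = DedupServer()
A, B, C, D, E = (bytes([x]) * 1024 for x in b"ABCDE")  # five 1 KB segments

_, sent_dc = server.backup([A, B, C, D])             # data center: first instance
_, sent_r1 = server.backup([A, B, C, D])             # remote site 1: duplicate
ids_r2, sent_r2 = server.backup([A, B, C, D, E])     # remote site 2: modified
print(sent_dc, sent_r1, sent_r2)
```

In this sketch the first backup sends all four segments, the duplicate sends no segment data at all, and the modified instance sends only segment E; every backup is still fully restorable from its ID list.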
Where can De-Duplication Occur? IT'S NOT JUST IN BACKUP!!! De-Dup is theoretically possible ANYWHERE, but it comes with a price: processing, latency, bandwidth, and most importantly TIME. Who does the actual processing? Storage Array? Software? Backup Server? Tape Device / VTL? (Diagram labels: Backup Server, De-duplication Device.)
De-Duplication Concepts: Prominent Use Cases. Where is De-Duplication being applied today? Backup: address significant inefficiency & cost due to redundant data; integrated end-to-end backup software stack; B2D H/W target component for incumbent backup environments. Archive Applications and Platforms: efficient retention over time; low cost, "acceptable performance" secondary storage for mid-term retention, where regulatory compliance is not required; as an efficiency feature in compliant archive (e.g. Centera). Primary Storage ("Capacity Optimized" ILM tier): block and file for tier 2 applications; different performance and cost characteristics. Replication: save bandwidth & time by moving less data; inherent in most storage use case solutions; also found in WAAS/WAFS solutions.
De-Duplication in PRIMARY Storage will Change the GAME!!! Technologies like Flash drives and NAS subfile de-dup are HERE. (Product images: EMC Centera, EMC Centera Gen 4 LPNode; Celerra NS20, NS40, NS40G, NS80, NS80G, NSX; CLARiiON CX3 UltraScale Series, AX150; Fibre Channel and iSCSI; Symmetrix DMX-4, DMX-4 950, new DMX-4 and DMX-3 Flash Drives; EMC Disk Library DL210, DL4x00, DL6000; Invista; Connectrix.)
Different Vendors De-Dup in Different Places. Let's look at the vendors who play in this equation. Primary Storage SAN/NAS data de-dupe: EMC, HP, NetApp, IBM, etc. Backup Application data de-dupe: Symantec, Commvault, etc. Target Device data de-dupe: Data Domain, Exagrid, Quantum, etc. What happens when the backup application does the de-dup (such as Commvault)? Do we need DD or Exagrid to do it again? No, we don't. What happens when the primary SAN does it (NetApp & EMC Celerra)? Do we need Commvault or DD to do it again? No, we don't. And if they did, they would have to "un-dedup" (rehydrate) the data to even be able to read it!!! SOFTWARE: Commvault. TARGET BASED DE-DUP: Data Domain / PureDisk / Exagrid.
BUYER BEWARE!!! EMC IS THE ONLY COMPANY THAT MANUFACTURES PRODUCTS IN EVERY SECTION OF THE DEDUPLICATION MARKET: Primary Storage SAN/NAS data de-dupe, Backup Application data de-dupe, and Target Device data de-dupe. EMC is ready and capable of leveraging deduplication across the spectrum. What happens to vendors like Data Domain and Commvault when the data is already de-duplicated? Other vendors see De-Dup as a product, not a technology…
What is Most Impactful to You TODAY? Backup is still the best and most efficient application for De-Dup today. It is proven and available. It is out of the production window. There are several ways to de-duplicate data in a backup environment. But first, let's define the backup challenge we all are facing… (Diagram labels: TARGET DE-DUPLICATION with Backup Server and De-Dup Device; SOURCE DE-DUPLICATION with De-duplication Device.)
Backup De-duplication: Media Impact. Traditional Backup v. EMC Avamar. (Chart: cumulative media required at 4 weeks and 8 weeks, traditional backup with 2:1 compression versus EMC Avamar.) Avamar makes backup to disk more economical.
The Avalanche. Would you rather stop the avalanche here? Or here? The goal is to de-duplicate as close to the SOURCE as possible.
The Power of Avamar and De-Duplication. What has Avamar resulted in for customers: 70-hour backup down to 4 hours; 400 servers backed up in 5 hours over T1 or less bandwidth; eliminated tape; eliminated 40 backup servers; improved restore times; centralized all backup operations; 300 GB of backup stored in 10 GB; 99.8% de-dup rate in Windows; 99% de-dup rate in SQL. 10x faster backups; 500:1 reduction in network bandwidth; 50:1 reduction in backup infrastructure; elimination of off-site tape storage.
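The percentages and ratios on this slide are two ways of stating the same kind of figure. As a hedged worked example, assume a "de-dup rate" means the fraction of data that does not have to be sent or stored; the slide does not define the term, and the function names below are hypothetical. Under that assumption a 99.8% rate corresponds to a 500:1 reduction.

```python
def ratio_from_rate(rate: float) -> float:
    """Convert a de-dup rate (fraction eliminated, 0 <= rate < 1) into an N:1 ratio."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    return 1.0 / (1.0 - rate)


def rate_from_sizes(logical_gb: float, stored_gb: float) -> float:
    """Fraction of the logical backup size that never had to be stored."""
    return 1.0 - stored_gb / logical_gb


windows_ratio = ratio_from_rate(0.998)    # slide: 99.8% de-dup rate in Windows
sql_ratio = ratio_from_rate(0.99)         # slide: 99% de-dup rate in SQL
stored_rate = rate_from_sizes(300, 10)    # slide: 300 GB of backup stored in 10 GB

print(f"99.8% rate is about {windows_ratio:.0f}:1")
print(f"99% rate is about {sql_ratio:.0f}:1")
print(f"300 GB in 10 GB is 30:1, a rate of about {stored_rate:.1%}")
```

The point of the conversion is that small changes in a rate near 100% are large changes in the ratio: moving from 99% to 99.8% takes the ratio from about 100:1 to about 500:1.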
BC & Disaster Recovery. Traditional: Primary Data Storage 50 TB; Daily Cumulative 8 TB; Weekly Cumulatives 48 TB; Weekly Full Backups 50 TB; 98 TB in total. Avamar: Primary Data Storage 50 TB; Axion Daily Snapups .5 TB; Axion Weekly Snapups 3.5 TB; Weekly Full Backups N/A; 3.5 TB in total. 95+% less. 70-hour staged full backup window reduced to 4 hours. Cost-effective replication to two sites. "Avamar has a game changing solution. Through their innovative technology, we have been able to rethink our backup, recovery and replication infrastructure, providing Morgan Stanley with better local and remote recovery at a greatly reduced TCO." (Guy Chiarello, CTO/CIO, Morgan Stanley)
Expected Results for Manufacturing Co. Backup Exec: current full backups 5 TB; backup window 28 hrs; media used (1 year) = 106 TB. Avamar: current full backups 5 TB; backup window 1.2 hours; media used (1 year) = 6 TB. 28-hour staged full backup window reduced to 1.2 hours. Cost savings estimated at $23,851 for 3 years. New functionality, centralized, faster backups, streamlined. Avamar starts at 17k and goes up from there based on capacity and retention periods.
Avamar Customers (Notable): Verizon Wireless Home Depot The Limited Ann Taylor GE Cardinal Health Nationwide Pepsi AT&T VMware Nexon Travelers Corporate Express Wellesley College Sterilite CRI Technologies Danvers Bank CISCO Churchill Downs Arizona Dept of Education New Albany PPG Bank of New York Medco Dell Kelley Drye & Warren City of Kirkland Brooks Automation Univ of CA Chrysler Kiewit Morriston Forester Komatsu Lexis Nexis Iowa Dept of Transportation Kroger Duoline Baker & McKenzie Citizens Bank Rob Roy Nodaway Bank Reckitt Benckiser Plymouth Rock Steamship Authority Chadwick Martin La Quinta Auto Owners 21st Century Mile High Banks Oklahoma Turnpike Farallon
Eastern Regional Avamar Installed Customers (Commercial): Montgomery County Public Schools Howard County Public Schools Country Meadow Associates Arraya Solutions IPR Evolve IP HydroGeneLogic American Healthways NetTelCos Expedient ADLCM Kirklands Retail Restaurant Services Inc SEA Medical Center DCH Informed Medical Orange Lake Resorts Welbro Construction FCCJ Seminole Community College Avocent Debartolo Properties First Bank GPX Leesburg Regional Hospital Northside Hospital Lithonia Lighting Manatee County Palm Beach County Parker Hudson Rainer & Dobbs LLC Miles & Stockbridge Reynolds Smith and Hills Sarasota County Clerk Satilla Regional Medical Southern Bone & Joint Success For All CGI Mecklenburg County Barlowworld ABNB Federal Credit Union Wunderlich Microstrategy
Where is Avamar MOST common? Avamar is used in nearly every industry, in every type of infrastructure, and across most platforms. Its biggest value comes in areas where backup time / bandwidth are limited: Remote Offices / Branch Offices; Data Centers / Enterprise Backup Management; VMware & File Systems; NAS.
Remote Office Backup Via WAN. Without Avamar, WAN challenges: WAN blockage; poor reliability; decentralized; untrained IT staff. With Avamar, WAN advantages: automated; encrypted; centralized; outstanding ROI. Target approach requires hardware at every site. (Diagram labels: Clients, Data De-dupe, WAN, Central Data Center, Data De-dupe Server.)
Real Example from Avamar MD Public School System (WAN)
Virtualization Creates New Backup Challenges. OLD PARADIGM: low overall utilization and plenty of bandwidth for backup. NEW PARADIGM: high overall server utilization, but low bandwidth for backup.
Backup Built for VMware Infrastructure: Avamar Efficiently Protects Virtual Machines. Traditional moves ~200% weekly; Avamar moves ~2% weekly. Up to 95% reduction in data moved; up to 90% reduction in backup times; up to 50% reduction in disk impact; up to 95% reduction in NIC usage; up to 80% reduction in CPU usage; up to 50% reduction in memory usage. All backups stored as "virtual full backups," ready for immediate restore. Maintain effective consolidation ratios without over-taxing CPU utilization.
EMC Avamar Solutions for VMware Infrastructure: Flexible, Fast, Efficient and Reliable Backup and Recovery. AVAMAR CLIENT BACKUP SOLUTIONS: Guest; VCB; Service Console. AVAMAR SERVER BACKUP SOLUTIONS: Avamar Software; Avamar Virtual Edition; Avamar Data Store.
Lightweight Agents / Reduced CPU Utilization. (Chart: total CPU utilization by event over time elapsed, Avamar efficient full backups versus traditional incremental + full backups.) Avamar reduces backup times by up to 90% weekly. CPU utilization slightly higher during backup operation (~15%). Reduced time = weekly CPU utilization reduced by up to 85%. Avamar backups set in "nice mode" or low priority: minimizes CPU contention.
EMC Avamar Data Store Gen 2: SUSTAINABLE GRID (RAIN) TECHNOLOGY. Avamar Data Store: multi-node configuration starts at 4 TB and scales to support up to 32 TB licensable de-duplicated capacity; equivalent of up to 1.1 PB of cumulative traditional disk or tape backup storage*; backup media requirement reduced 20 to 40 times; high availability and reliability with RAIN architecture, RAID, daily integrity checks, and redundant power. Avamar Data Store, Single Node: supports 1 TB and 2 TB licensable de-duplicated storage capacity configurations; equivalent of up to 70 TB of cumulative traditional disk or tape backup storage*; designed for easy deployment at remote offices; offers fast, local recovery without dependence on a WAN connection. *Note: Equivalent traditional backup capacity assumptions: 100 percent MS Office file data, weekly full and daily incremental backups, no compression, 10 percent daily change rate, 90-day retention.
Avamar's Major Competitive Differentiations: Who is REALLY less expensive? Symantec PureDisk (software only): purchase hardware; must have NetBackup and license agents; not pure source de-dup, occurs at media server. Data Domain and Exagrid (hardware only): purchase software; starts over with each additional box, target only. Commvault (software only): purchase hardware; must license each agent; not pure source de-dup, occurs at media server. AVAMAR (HW & SW): all components included; all agents included; grid architecture allows for true global de-dup, scalable by adding nodes; no hardware or software required to back up remote offices; SIGNIFICANTLY improves backup times; no additional servers required. Competing approaches, at additional $$: requires HW and SW at remote offices; does NOT significantly improve backup times; requires separate backup servers.
Why an Integrated Solution? HARDWARE ONLY SOLUTIONS (Data Domain / Exagrid): as software is now performing the de-duplication, the hardware de-duplication is NO LONGER required. (This is now the case with Symantec, Commvault v8, and Avamar.) SOFTWARE ONLY SOLUTIONS (Symantec, Commvault, etc.): as primary storage arrays begin to utilize data de-duplication technologies, the backup software is not aware, and its value diminishes if the data is already in a de-duplicated state. Re-hydration would be required. (This is already the case with NetApp and Celerra NAS-based de-dup, and there are more to come.) EMC is the ONLY vendor in the de-duplication space that manufactures Primary Storage, Backup Software and Backup Hardware. Regardless of where the de-duplication occurs, EMC is ready and capable to leverage and optimize it. EMC|Avamar is the only vendor to utilize variable length segments when deduplicating data. It will ALWAYS store less, send less and back up faster! "What vendor do you want to make a strategic investment in?" Ask these vendors what their strategy is, as data is already de-duplicated before it gets to their product…
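Variable-length segmentation is commonly implemented as content-defined chunking, where segment boundaries are chosen from the data itself rather than at fixed offsets. The sketch below is a generic illustration of that idea and is not Avamar's actual algorithm, which the slide does not describe. The rolling hash, the chunk-size parameters, and all function names are assumptions chosen for the example. It compares how many segments survive a one-byte insertion under variable-length and fixed-length segmentation.

```python
import hashlib
import random

# Hypothetical parameters, chosen only for illustration.
_rng = random.Random(0)  # fixed seed so chunk boundaries are reproducible
GEAR = [_rng.getrandbits(64) for _ in range(256)]  # one random value per byte value
MASK64 = (1 << 64) - 1


def variable_chunks(data: bytes, avg_bits: int = 6,
                    min_size: int = 16, max_size: int = 256) -> list[bytes]:
    """Cut data where a rolling hash of the most recent bytes matches a pattern."""
    boundary_mask = (1 << avg_bits) - 1
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Shift-and-add rolling hash: older bytes move out of the low bits.
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        if (size >= min_size and (h & boundary_mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def fixed_chunks(data: bytes, size: int = 64) -> list[bytes]:
    """Cut data at fixed offsets, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def shared(old: list[bytes], new: list[bytes]) -> int:
    """Count chunks in new whose hash already exists among the old chunks."""
    known = {hashlib.sha1(c).digest() for c in old}
    return sum(hashlib.sha1(c).digest() in known for c in new)


original = bytes(_rng.getrandbits(8) for _ in range(4096))
modified = b"!" + original  # one byte inserted at the front

var_old, var_new = variable_chunks(original), variable_chunks(modified)
fix_old, fix_new = fixed_chunks(original), fixed_chunks(modified)
print("variable-length chunks reused:", shared(var_old, var_new), "of", len(var_new))
print("fixed-length chunks reused:   ", shared(fix_old, fix_new), "of", len(fix_new))
```

Because boundaries follow the content, an insertion disturbs only the segments near the change, and later segments line up again. With fixed offsets, the same insertion shifts every later segment, so none of them match what was stored before.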
The Economics of Backup & Recovery. Traditional Backup Solution: $35,000 investment; 1 year included HW and maint; no software, use existing, 9k per year in maint; $3500 per year for new clients; $9000 for VCB SW (vRanger) + a server; $2700 per year for additional media; 11,500 per year in offsite costs; no data growth factored in; new media / upgrade required year 2 at 18k. New server too?; HW maint of 12k years 2 and 3; no significant backup improvements. Avamar Example: $90,000 investment; no recurring tape spend; all client software/agents incl.; 3 years all inclusive (HW, SW, Maint); offsite replication included; all data retained on disk and all media included for the 3 years of retention; 20% growth rate of data was factored into the system; backup window reduced by 300%; restore times improved; time to first byte of restore within minutes. (Chart: TOTAL COST for each option.)
Avamar SOLVES issues: Long Backup Windows; Backup Servers; Affecting Production; Server Cost / Licensing; Tape Cost; VM Guest Proliferation; License Cost; Off-site storage; Cost to use Disk Technology; Iron Mountain / Transport; Client Licensing; Tape Rotation and Changes; VMware Resources; Restore times; VCB Infrastructure; Restore complexity; Tape Drive Failure; Multiple Solutions; Tape Read/Write Errors; Remote office backup; Tape Drive Maintenance; DR / Business Continuity; Intraday Restore needs; GROWTH / TIME.
Intuitive, Policy-Based Management Console
In Summary. File system and VMware benefits of source de-dup alone justify the investment. You can start SMALL with Avamar (single use) and grow it easily into a fully integrated enterprise solution. Source-based de-duplication makes the difference; beware of the values of a target-based de-dup. Competitive solutions around de-dup have value, but understand the differences: they are Band-Aids, not long-term solutions. EMC has the ONLY broad-based de-dup strategy that will grow and continue to add value as de-dup stretches into new areas.
Next Steps. Live demos provided every FRIDAY at 11 am EST: performed by an Avamar engineer; live via web; ask questions, see the product in action. Solution sizing: how much data is transferred in a full backup today; % of data that is FS/Exchange/DB/Images/VMware; retention periods on disk; replication? Avamar Virtual Demo. Configuration / Pricing / Cost Justifications. Commonality Analysis. Proof of Concept / Evaluations.
where YOUR information should live; where PRIMARY information lives; where TIERED information lives; where VIRTUAL information lives; where BACKUP information lives; where REPLICATED information lives; where DE-DUPLICATED data lives; where ARCHIVED data lives