Your Data Any Place Any Time Data Warehouse

  • Slides: 40
Download presentation
Your Data Any Place, Any Time Data Warehouse Platform

Your Data Any Place, Any Time Data Warehouse Platform

Agenda • Data Warehouse Challenges • Microsoft Data Warehouse Platform – Build Your DW

Agenda • Data Warehouse Challenges • Microsoft Data Warehouse Platform – Build Your DW faster – Manage your DW – Deliver Insights • Analysts Reviews • Customer References

Data Warehouse Challenges Build Inability to incorporate legacy systems and unstructured data Dirty data

Data Warehouse Challenges Build Inability to incorporate legacy systems and unstructured data Dirty data from sources Cant make current decisions based on old data Complex development tools Manage Deliver Cant keep pace with growing data volumes and users Concurrent workloads on the system Disappearing batch windows Unable to identify bottlenecks Low usage of data warehouse Steep learning curve for users Inability to gain insight from data Unable to collaborate and share data

Accelerate Business Insight with SQL Server 2008 for Data Warehousing Deliver Build Visual Development

Accelerate Business Insight with SQL Server 2008 for Data Warehousing Deliver Build Visual Development Environment Enterprise Data Integration platform Connectivity to heterogeneous data sources Clean and profile data End to End Business Intelligence Manage Enterprise Scalability Visibility across all data warehouse workloads Continuous Availability Simplified Manageability Deep analytical capability Enterprise Scale Reporting Close integration with Office System Collaborate and share information with Office Share. Point Server

Agenda • Data Warehouse Challenges • Microsoft Data Warehouse Platform – Build Your DW

Agenda • Data Warehouse Challenges • Microsoft Data Warehouse Platform – Build Your DW faster – Manage your DW – Deliver Insights • Analysts Reviews • Customer References

Build Your DW faster § BI Development Studio to build your data warehouse §

Build Your DW faster § BI Development Studio to build your data warehouse § Visual Development environment to build your reports, cubes and ETL packages § Powerful debugging capabilities § Development lifecycle support § SQL Server Integration Services provides an Enterprise Data Integration platform § § Connect to heterogeneous data sources Clean and validate your data Load directly into partitioned tables and cubes Enterprise Scalability to load larger volumes of data in smaller batch windows § Supports continuous loading with SQL Server Service Broker connectivity

BI Development Studio § Complete, Integrated development tool in BI Development Studio § Build

BI Development Studio § Complete, Integrated development tool in BI Development Studio § Build end to end DW solution § § Create reports Build OLAP cubes and KPI Create Data Mining models Create your end to end data integration packages § Development Lifecycle support with Visual Studio TFS § Integrated into Visual Studio

Breakthrough ETL Capabilities § Change Data Capture (CDC) § Identifies operational changes § MERGE

Breakthrough ETL Capabilities § Change Data Capture (CDC) § Identifies operational changes § MERGE Transact-SQL keyword § Addresses common DW load scenarios § SQL Server Integration Services § § § Connectivity to any source Scalable pipeline architecture Comprehensive set of transformations Improved performance of lookup tasks DW smarts handling slowly changing dimensions and direct loading of cubes § Clean and validate using data mining and data quality § Built-in data profiling for source and destination § Extensible platform through pluggable custom tasks and Componentized development using. NET

Change Data Capture Generalize log-reading capability for all applications, notably DW • Improves DW

Change Data Capture Generalize log-reading capability for all applications, notably DW • Improves DW scale OLTP Change Tables – Provide change information upfront for efficient processing of dimensions – Eliminate expensive lookups – DW can be loaded throughout the day or at specific intervals • Changes are captured and placed in change tables Data warehouse – Log based capture with minimal impact on source – Captures complete content of changes & maintains cross table consistency – Change Metadata stored as part of the changes – ETL process calls functions to retrieve changes – Works even across schema changes!

Agenda § Data Warehouse Challenges § Microsoft Data Warehouse Platform § Build Your DW

Agenda § Data Warehouse Challenges § Microsoft Data Warehouse Platform § Build Your DW faster § Manage your DW § Deliver Insights § Analysts Reviews § Customer References

Manageability § Partition your large data sets into manageable chunks using partitioning § Simplify

Manageability § Partition your large data sets into manageable chunks using partitioning § Simplify aggregation of your data using Indexed views § Take advantage of simplified Index Management § Use Resource Governor to manage concurrent workloads § Use data compression to reduce the storage costs and increase query performance by reducing I/O § Use backup compression to reduce storage costs

Resource Governor SQL Server Backup OLTP Activity Admin Tasks Executive Reports Ad-hoc Reports High

Resource Governor SQL Server Backup OLTP Activity Admin Tasks Executive Reports Ad-hoc Reports High Admin Workload OLTP Workload Report Workload Min Memory 10% Max Memory 20% Max CPU 90% Admin Pool Application Pool § Ability to differentiate workloads § e. g. app_name, login § Per-request limits § Max memory % § Max CPU time § Grant timeout § Max Requests § Resource monitoring

Data Partitioning § Partitioning breaks a single object into multiple manageable pieces § §

Data Partitioning § Partitioning breaks a single object into multiple manageable pieces § § § Transparent to the application Allows easy management of very large tables and indexes The row is the unit of partitioning All partitions run on a single SQL Server database Table fully available while loading, indexing a new partition Create new and drop old partition quickly Sliding Window Time Obsolete Near-Line Read-Only Archived Read-Mostly Hot

Simplify Aggregate Management Using Indexed Views § Aggregations can be precomputed to minimize query

Simplify Aggregate Management Using Indexed Views § Aggregations can be precomputed to minimize query execution time § Tables can pre-joined with stored results sets § Automatic query rewrite to use Indexed Views as appropriately reducing response times § Proactive manageability using Database Tuning Advisor to suggest appropriate views

Simplified Index Management § SQL Server 2008 includes Alter Index for online Index Management

Simplified Index Management § SQL Server 2008 includes Alter Index for online Index Management § Rebuild Indexes without taking data offline § Allows Indexes to be rebuilt at partition level § Significant Improvement in Adding and Dropping Clustered Indexes § Parallel Index Management § Identify new or unused Indexes using Missing Indexes feature – DMV and Show Plan

Manageability Tools § Management Studio to manage your entire BI infrastructure § Declarative Management

Manageability Tools § Management Studio to manage your entire BI infrastructure § Declarative Management Framework provides policy-based configuration management § Dynamics Management Views to get real time insight into the database § SQL Query Analyzer provides prescriptive guidance on tuning query performance § Management Objects allows integration into existing management infrastructure

Enterprise Scalability § Sophisticated Indexing functionality to query large amounts of data § Parallel

Enterprise Scalability § Sophisticated Indexing functionality to query large amounts of data § Parallel Query Execution to handle large numbers of concurrent users § Partitioned Query Processing that enables queries to span multiple partitions § Increase query performance in large data warehouses by enabling Star Schema Query Optimizations

Star Join Improvement § Pain points: § Slow star join query with grouping and

Star Join Improvement § Pain points: § Slow star join query with grouping and aggregation § No fact table partition elimination through join on date dimension. § Bad cardinality estimates due to “Gatekeeper row” problem: presence of one row in filtered result from dimension table drastically affects # of matching rows in fact table.

Star Join Enhancement Features • Improved recognition of star join pattern in query (available

Star Join Enhancement Features • Improved recognition of star join pattern in query (available in some cases since SQL Server 7. 0) • Enhanced rules to generate query plan alternatives for Star Join • Improved costing of Star Join plans • Multiple bitmap filter query execution strategy added

Data Compression § Key goal: shrink DW fact tables § Secondary goal: improve query

Data Compression § Key goal: shrink DW fact tables § Secondary goal: improve query performance § Less I/O (but higher CPU) § Higher buffer hit rates § 2 X to 7 X compression ratio for real DW fact data anticipated, depending on data § Orthogonal to other features § Works for data and indexes § Techniques: page level dictionary, differential encoding, variable-length encoding of ints, dates, don’t store 0 s.

High Availability § Use Snapshot Isolation to take read snapshots of your data §

High Availability § Use Snapshot Isolation to take read snapshots of your data § Use Database Mirroring to minimize data loss and reduce down time § Backup and Restore partitions to reduce downtime

Agenda § Data Warehouse Challenges § Microsoft Data Warehouse Platform § Build Your DW

Agenda § Data Warehouse Challenges § Microsoft Data Warehouse Platform § Build Your DW faster § Manage your DW § Deliver Insights § Analysts Reviews § Customer References

Deliver Insights § Deep Analytical Capabilities using SQL Server Analysis Services § § Unified

Deliver Insights § Deep Analytical Capabilities using SQL Server Analysis Services § § Unified Dimension Models Key Performance Indicators Predictive Analysis using integrated Data Mining Scalable Analytical Engine that can handle large analytical data sets with larger number of users § Present the information to the organization using SQL Server Reporting Services § Ad-hoc reporting using Report Builder § Supports OLAP and relational sources § Deliver rich information via Office System 2007 § Better Integration with Excel 2007, including full predictive analysis through data mining § Rich Visualization using Microsoft Office Performance. Point Server § Collaborate and share Excel sheets, reports and dashboards using Office Share. Point Server 2007

Analysis Services Enhanced OLAP and Data Mining Capabilities Unified Dimension Model Metadata layer for

Analysis Services Enhanced OLAP and Data Mining Capabilities Unified Dimension Model Metadata layer for analytical applications Provides single version of truth for attributes and calculations Enable deep analysis using Aggregation Engine Advanced Business Intelligence Key Performance Indicators Drive processes using Actions Open Interface to End User Tools via Web Services plus… Decision Trees Clustering Time Series Naïve Bayes Sequence Clustering Association Neural Net • Logistic Regression • Linear Regression • Text Mining

OLAP Leadership Unprecedented 3 rd year as market growth leader Highest growth Rate in

OLAP Leadership Unprecedented 3 rd year as market growth leader Highest growth Rate in industry http: //www. olapreport. com

Reporting Services Enhanced Reporting Capabilities § Enterprise Reporting § Build Complex reports using integrated

Reporting Services Enhanced Reporting Capabilities § Enterprise Reporting § Build Complex reports using integrated Visual Designer § Present data from both relational and OLAP sources § Multiple delivery Options § Enable Adhoc reporting via Report Builder § Collaborate and share reports via with Office Share. Point Server § Provide reports over the Internet § Scalable Reporting Engine to drive reports of any size or complexity

Business Intelligence With Office 2007 Analyze data in Excel Reuse spreadsheets, reports, and documents

Business Intelligence With Office 2007 Analyze data in Excel Reuse spreadsheets, reports, and documents in dashboards and business applications Best with SQL Server Publish spreadsheets and documents to Share. Point Access and explore data in spreadsheets, reports, and documents in the Portal

Demo § End to End BI Demo

Demo § End to End BI Demo

Industry Accolades “The 2005 results reveal significant strengths for Microsoft SQL Server in the

Industry Accolades “The 2005 results reveal significant strengths for Microsoft SQL Server in the very large database (VLDB) market for both data warehouse (DW) and transaction processing (OLTP) systems. ” “They can scale and handle data warehouses of any size; we know that now, ”… “I routinely talk to SQL Server customers that have built 7 TB data warehouses or Windows Datacenter [systems] with 3, 000 users. ” Donald Feinberg VP & Distinguished Analyst “The recent release of the 2006 Gartner DW DBMS Magic Quadrant demonstrates the significant progress Microsoft has made in the Data Warehouse (DW) Database Management Systems (DBMS) market, moving from deep placement in the challenger’s quadrant to riding the line between leader and challenger. ”. "Microsoft's BI Tools revenue growth in 2005 was more than 25 percent, growing at more than twice the rate of the overall market. Microsoft has seen strong growth over the past several years as it has expanded and enhanced its database-embedded BI features …. Microsoft's impact on the BI tools market cannot be overemphasized. “ Dan Vesset

SQL Server 2005 TB+ Customers 2. 2 TB SAP Business Warehouse system In the

SQL Server 2005 TB+ Customers 2. 2 TB SAP Business Warehouse system In the process of migrating SAP R/3 OLTP to SS 2005 3 TB Retail Data Warehouse on HP Superdome Uses SQL Server OLAP, SSIS, SSRS 5. 1 TB Credit Card DW, 5 Mil card holders 300 power users, complex query, OLAP, SSIS, SSRS, Office BSM 4 TB Consumer Packaged Goods (CPG) DW 350 users, complex relational query, SSIS & SSRS 2 TB DW, originally migrated from Informix 300 users, complex query, OLAP & Data Mining 2 TB of Clinical Data & growing, 50 TB storage US Dept of Veterans Affairs, 1200+ facilities, 40+ OLAP Marts

Resources § Build Your Data Warehouse faster § § § Manage your Data Warehouse

Resources § Build Your Data Warehouse faster § § § Manage your Data Warehouse § § § Strategies for Partitioning Relational Data Warehouses in Microsoft SQL Server Managing Aggregations with SQL Server 2005 Indexed Views Scale Out using SQL Server 2005 for DW Workloads Microsoft Data Warehouse Deliver insights to your organization § § § Learn for free, SQL Server BI technologies Develop ETL packages with SQL Server Integration Services Project REAL — Business Intelligence in Practice Data Warehousing Webcasts – presented by Kimball Group Microsoft BI SQL Server Reporting Services SQL Server Analysis Services Predictive Analysis using Data Mining Learn more about Microsoft’s presence in BI & DW § § § Microsoft Data Warehouse customer references Microsoft BI Market Share info Winter Corp Survey results

© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only.

© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

Appendix

Appendix

SQL Server Momentum – Winter Corp Survey § Great Progress in Winter Corporation’s Data

SQL Server Momentum – Winter Corp Survey § Great Progress in Winter Corporation’s Data Warehousing Top. Ten 2005 Survey § § Largest data warehousing entry at 19. 5 terabytes Eighth largest data warehousing database in the world Eight entries at or more than 4 terabytes The largest showing ever: § 23 data warehousing entries in all § Five times more data warehousing entries compared with previous round http: //www. microsoft. com/sql/prodinfo/compare/wintercorp-survey. mspx

Customer: Cox Communications Leading network service provider – telephony, cable TV, internet Customer Business

Customer: Cox Communications Leading network service provider – telephony, cable TV, internet Customer Business Challenge Event data logged differently for various areas of network Dispersed among separate databases, Oracle, My. SQL and SQL Server Needed an integrated BI platform for all data Needed a single version of network health Solution Developed Net. Mon based on the SQL Server BI platform SQL SSIS to integrate heterogeneous data into a single data warehouse Analysis Services for building and analyzing OLAP cubes Reporting Services to instantly report on network health Customer Results/Benefits Instant insight to health of entire network Operations now 30 -50% more efficient Improved network troubleshooting and manageability Developers not taxed with report generation Faster delivery of all types of reports “Net. Mon should helps us be 30 -50% more efficient…we’ll be able to shift a good amount of energy toward new proactive projects” –Tim Winebarger, Manager NOC Tools Group, Cox Comm.

Project Objective Capture transaction level sales and store stock position to increase sales and

Project Objective Capture transaction level sales and store stock position to increase sales and improve profitability Customer Business Challenge Instances of store out of stock Inappropriate stock levels Difficult to perform analysis of large data volumes Overburdened operational systems Solution: Microsoft BI End to end business intelligence with: SQL Server, Reporting Services, Analysis Services, Integration Services and Data Mining 1. 6 TB Raw Data, 64 -bit, 20 -way server Customer Results/Benefits 50% reduction in development time over Oracle SQL Server Integration Services shows up to 400% performance improvement Deliver more meaningful, up-to-date info

Customer: Comp. USA Largest North American Retailer Customer Business Challenge 150+ disparate, disconnected legacy

Customer: Comp. USA Largest North American Retailer Customer Business Challenge 150+ disparate, disconnected legacy systems Integrated reporting system for 200+ stores, districts, and corporate Maintain competitive advantage through customer innovation Agile and flexible infrastructure with immediate, demonstrable ROI Solution SQL Server data warehouse Analysis Services cubes with 15+ dimensions Pro. Clarity – Web-based reports for stores and desktop client for corporate Solution Decision Factors: Time to market and cost Customer Results/Benefits Single version of the truth for store managers and corporate executives Improved productivity for store managers Enhanced attention to margin drivers New opportunities aftermarket sales (technical assistance program) and better inventory tracking to reduce fraud ROI estimated at $6 million “We're looking for an ROI next year of about $6 million, and that's a very conservative estimate—one that just looks at taking advantage of the aftermarket sales these reports will help generate. ” –Cathy Witt, Vice President and CIO, Comp. USA

Edcon DW System § Scenario/Business § Top retailer in South Africa, 650 stores, 11

Edcon DW System § Scenario/Business § Top retailer in South Africa, 650 stores, 11 K employees § Diversified retailing conglomerate with centralized back-end ops § Multiple store lines § Won data warehouse from Oracle § Workload/Architecture § 9 TB of total data in DW § 5 production servers, multiple instances, based on retail store line § Growth of approximately 1 TB per month total § All ETL uses DTS § 4 TB single instance for customer data § Front-ended by SQL Server OLAP cubes § All ad hoc query workload with Pro. Clarity § 300 users total growing to 500 by YE 04 § DB Platform § Currently running on 32 -bit systems § Currently evaluating 64 -bit HW platforms § Storage plans to grow to 70 TB total, with 15 TB allocated for DW

Clalit Health Services Patient DW § Scenario/Business § Oldest and largest HMO in Israel,

Clalit Health Services Patient DW § Scenario/Business § Oldest and largest HMO in Israel, covers 60% (3. 7 mil) of the population § 15 hospitals, 1300 clinics, 400 pharmacies, 32 K employees (6 K doctors) § IBM Informix on time Sun Solaris migration in 2002 § 67% TCO savings, 25% perf improvements: 15 x in some OLAP queries § Workload/Architecture § 5 TB of total data, multiple instances § 1 TB in largest single SQL Server instance § Ad hoc query workload § 15 -20 concurrent users at this time, 20 concurrent in-flight queries often § OLAP cubes using Microsoft Analysis Services § ISV front-end app: Pro. Clarity § ETL: Ascential Software’s Datastage ETL § Patient data collected from 2 S/390, 2 AS 400, 150 Unix servers § Also runs main clinic system OLTP on SQL Server § Migrating 1000 local SQL Servers in clinics to central DB, 2 TB expected § DB Platform § Windows Server 2003, Unisys 8 -way, 64 -bit, 32 GB RAM, EMC SAN

© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only.

© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.