Your Data Any Place Any Time Data Warehouse
- Slides: 40
Your Data Any Place, Any Time Data Warehouse Platform
Agenda • Data Warehouse Challenges • Microsoft Data Warehouse Platform – Build Your DW faster – Manage your DW – Deliver Insights • Analysts Reviews • Customer References
Data Warehouse Challenges Build Inability to incorporate legacy systems and unstructured data Dirty data from sources Cant make current decisions based on old data Complex development tools Manage Deliver Cant keep pace with growing data volumes and users Concurrent workloads on the system Disappearing batch windows Unable to identify bottlenecks Low usage of data warehouse Steep learning curve for users Inability to gain insight from data Unable to collaborate and share data
Accelerate Business Insight with SQL Server 2008 for Data Warehousing Deliver Build Visual Development Environment Enterprise Data Integration platform Connectivity to heterogeneous data sources Clean and profile data End to End Business Intelligence Manage Enterprise Scalability Visibility across all data warehouse workloads Continuous Availability Simplified Manageability Deep analytical capability Enterprise Scale Reporting Close integration with Office System Collaborate and share information with Office Share. Point Server
Agenda • Data Warehouse Challenges • Microsoft Data Warehouse Platform – Build Your DW faster – Manage your DW – Deliver Insights • Analysts Reviews • Customer References
Build Your DW faster § BI Development Studio to build your data warehouse § Visual Development environment to build your reports, cubes and ETL packages § Powerful debugging capabilities § Development lifecycle support § SQL Server Integration Services provides an Enterprise Data Integration platform § § Connect to heterogeneous data sources Clean and validate your data Load directly into partitioned tables and cubes Enterprise Scalability to load larger volumes of data in smaller batch windows § Supports continuous loading with SQL Server Service Broker connectivity
BI Development Studio § Complete, Integrated development tool in BI Development Studio § Build end to end DW solution § § Create reports Build OLAP cubes and KPI Create Data Mining models Create your end to end data integration packages § Development Lifecycle support with Visual Studio TFS § Integrated into Visual Studio
Breakthrough ETL Capabilities § Change Data Capture (CDC) § Identifies operational changes § MERGE Transact-SQL keyword § Addresses common DW load scenarios § SQL Server Integration Services § § § Connectivity to any source Scalable pipeline architecture Comprehensive set of transformations Improved performance of lookup tasks DW smarts handling slowly changing dimensions and direct loading of cubes § Clean and validate using data mining and data quality § Built-in data profiling for source and destination § Extensible platform through pluggable custom tasks and Componentized development using. NET
Change Data Capture Generalize log-reading capability for all applications, notably DW • Improves DW scale OLTP Change Tables – Provide change information upfront for efficient processing of dimensions – Eliminate expensive lookups – DW can be loaded throughout the day or at specific intervals • Changes are captured and placed in change tables Data warehouse – Log based capture with minimal impact on source – Captures complete content of changes & maintains cross table consistency – Change Metadata stored as part of the changes – ETL process calls functions to retrieve changes – Works even across schema changes!
Agenda § Data Warehouse Challenges § Microsoft Data Warehouse Platform § Build Your DW faster § Manage your DW § Deliver Insights § Analysts Reviews § Customer References
Manageability § Partition your large data sets into manageable chunks using partitioning § Simplify aggregation of your data using Indexed views § Take advantage of simplified Index Management § Use Resource Governor to manage concurrent workloads § Use data compression to reduce the storage costs and increase query performance by reducing I/O § Use backup compression to reduce storage costs
Resource Governor SQL Server Backup OLTP Activity Admin Tasks Executive Reports Ad-hoc Reports High Admin Workload OLTP Workload Report Workload Min Memory 10% Max Memory 20% Max CPU 90% Admin Pool Application Pool § Ability to differentiate workloads § e. g. app_name, login § Per-request limits § Max memory % § Max CPU time § Grant timeout § Max Requests § Resource monitoring
Data Partitioning § Partitioning breaks a single object into multiple manageable pieces § § § Transparent to the application Allows easy management of very large tables and indexes The row is the unit of partitioning All partitions run on a single SQL Server database Table fully available while loading, indexing a new partition Create new and drop old partition quickly Sliding Window Time Obsolete Near-Line Read-Only Archived Read-Mostly Hot
Simplify Aggregate Management Using Indexed Views § Aggregations can be precomputed to minimize query execution time § Tables can pre-joined with stored results sets § Automatic query rewrite to use Indexed Views as appropriately reducing response times § Proactive manageability using Database Tuning Advisor to suggest appropriate views
Simplified Index Management § SQL Server 2008 includes Alter Index for online Index Management § Rebuild Indexes without taking data offline § Allows Indexes to be rebuilt at partition level § Significant Improvement in Adding and Dropping Clustered Indexes § Parallel Index Management § Identify new or unused Indexes using Missing Indexes feature – DMV and Show Plan
Manageability Tools § Management Studio to manage your entire BI infrastructure § Declarative Management Framework provides policy-based configuration management § Dynamics Management Views to get real time insight into the database § SQL Query Analyzer provides prescriptive guidance on tuning query performance § Management Objects allows integration into existing management infrastructure
Enterprise Scalability § Sophisticated Indexing functionality to query large amounts of data § Parallel Query Execution to handle large numbers of concurrent users § Partitioned Query Processing that enables queries to span multiple partitions § Increase query performance in large data warehouses by enabling Star Schema Query Optimizations
Star Join Improvement § Pain points: § Slow star join query with grouping and aggregation § No fact table partition elimination through join on date dimension. § Bad cardinality estimates due to “Gatekeeper row” problem: presence of one row in filtered result from dimension table drastically affects # of matching rows in fact table.
Star Join Enhancement Features • Improved recognition of star join pattern in query (available in some cases since SQL Server 7. 0) • Enhanced rules to generate query plan alternatives for Star Join • Improved costing of Star Join plans • Multiple bitmap filter query execution strategy added
Data Compression § Key goal: shrink DW fact tables § Secondary goal: improve query performance § Less I/O (but higher CPU) § Higher buffer hit rates § 2 X to 7 X compression ratio for real DW fact data anticipated, depending on data § Orthogonal to other features § Works for data and indexes § Techniques: page level dictionary, differential encoding, variable-length encoding of ints, dates, don’t store 0 s.
High Availability § Use Snapshot Isolation to take read snapshots of your data § Use Database Mirroring to minimize data loss and reduce down time § Backup and Restore partitions to reduce downtime
Agenda § Data Warehouse Challenges § Microsoft Data Warehouse Platform § Build Your DW faster § Manage your DW § Deliver Insights § Analysts Reviews § Customer References
Deliver Insights § Deep Analytical Capabilities using SQL Server Analysis Services § § Unified Dimension Models Key Performance Indicators Predictive Analysis using integrated Data Mining Scalable Analytical Engine that can handle large analytical data sets with larger number of users § Present the information to the organization using SQL Server Reporting Services § Ad-hoc reporting using Report Builder § Supports OLAP and relational sources § Deliver rich information via Office System 2007 § Better Integration with Excel 2007, including full predictive analysis through data mining § Rich Visualization using Microsoft Office Performance. Point Server § Collaborate and share Excel sheets, reports and dashboards using Office Share. Point Server 2007
Analysis Services Enhanced OLAP and Data Mining Capabilities Unified Dimension Model Metadata layer for analytical applications Provides single version of truth for attributes and calculations Enable deep analysis using Aggregation Engine Advanced Business Intelligence Key Performance Indicators Drive processes using Actions Open Interface to End User Tools via Web Services plus… Decision Trees Clustering Time Series Naïve Bayes Sequence Clustering Association Neural Net • Logistic Regression • Linear Regression • Text Mining
OLAP Leadership Unprecedented 3 rd year as market growth leader Highest growth Rate in industry http: //www. olapreport. com
Reporting Services Enhanced Reporting Capabilities § Enterprise Reporting § Build Complex reports using integrated Visual Designer § Present data from both relational and OLAP sources § Multiple delivery Options § Enable Adhoc reporting via Report Builder § Collaborate and share reports via with Office Share. Point Server § Provide reports over the Internet § Scalable Reporting Engine to drive reports of any size or complexity
Business Intelligence With Office 2007 Analyze data in Excel Reuse spreadsheets, reports, and documents in dashboards and business applications Best with SQL Server Publish spreadsheets and documents to Share. Point Access and explore data in spreadsheets, reports, and documents in the Portal
Demo § End to End BI Demo
Industry Accolades “The 2005 results reveal significant strengths for Microsoft SQL Server in the very large database (VLDB) market for both data warehouse (DW) and transaction processing (OLTP) systems. ” “They can scale and handle data warehouses of any size; we know that now, ”… “I routinely talk to SQL Server customers that have built 7 TB data warehouses or Windows Datacenter [systems] with 3, 000 users. ” Donald Feinberg VP & Distinguished Analyst “The recent release of the 2006 Gartner DW DBMS Magic Quadrant demonstrates the significant progress Microsoft has made in the Data Warehouse (DW) Database Management Systems (DBMS) market, moving from deep placement in the challenger’s quadrant to riding the line between leader and challenger. ”. "Microsoft's BI Tools revenue growth in 2005 was more than 25 percent, growing at more than twice the rate of the overall market. Microsoft has seen strong growth over the past several years as it has expanded and enhanced its database-embedded BI features …. Microsoft's impact on the BI tools market cannot be overemphasized. “ Dan Vesset
SQL Server 2005 TB+ Customers 2. 2 TB SAP Business Warehouse system In the process of migrating SAP R/3 OLTP to SS 2005 3 TB Retail Data Warehouse on HP Superdome Uses SQL Server OLAP, SSIS, SSRS 5. 1 TB Credit Card DW, 5 Mil card holders 300 power users, complex query, OLAP, SSIS, SSRS, Office BSM 4 TB Consumer Packaged Goods (CPG) DW 350 users, complex relational query, SSIS & SSRS 2 TB DW, originally migrated from Informix 300 users, complex query, OLAP & Data Mining 2 TB of Clinical Data & growing, 50 TB storage US Dept of Veterans Affairs, 1200+ facilities, 40+ OLAP Marts
Resources § Build Your Data Warehouse faster § § § Manage your Data Warehouse § § § Strategies for Partitioning Relational Data Warehouses in Microsoft SQL Server Managing Aggregations with SQL Server 2005 Indexed Views Scale Out using SQL Server 2005 for DW Workloads Microsoft Data Warehouse Deliver insights to your organization § § § Learn for free, SQL Server BI technologies Develop ETL packages with SQL Server Integration Services Project REAL — Business Intelligence in Practice Data Warehousing Webcasts – presented by Kimball Group Microsoft BI SQL Server Reporting Services SQL Server Analysis Services Predictive Analysis using Data Mining Learn more about Microsoft’s presence in BI & DW § § § Microsoft Data Warehouse customer references Microsoft BI Market Share info Winter Corp Survey results
© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
Appendix
SQL Server Momentum – Winter Corp Survey § Great Progress in Winter Corporation’s Data Warehousing Top. Ten 2005 Survey § § Largest data warehousing entry at 19. 5 terabytes Eighth largest data warehousing database in the world Eight entries at or more than 4 terabytes The largest showing ever: § 23 data warehousing entries in all § Five times more data warehousing entries compared with previous round http: //www. microsoft. com/sql/prodinfo/compare/wintercorp-survey. mspx
Customer: Cox Communications Leading network service provider – telephony, cable TV, internet Customer Business Challenge Event data logged differently for various areas of network Dispersed among separate databases, Oracle, My. SQL and SQL Server Needed an integrated BI platform for all data Needed a single version of network health Solution Developed Net. Mon based on the SQL Server BI platform SQL SSIS to integrate heterogeneous data into a single data warehouse Analysis Services for building and analyzing OLAP cubes Reporting Services to instantly report on network health Customer Results/Benefits Instant insight to health of entire network Operations now 30 -50% more efficient Improved network troubleshooting and manageability Developers not taxed with report generation Faster delivery of all types of reports “Net. Mon should helps us be 30 -50% more efficient…we’ll be able to shift a good amount of energy toward new proactive projects” –Tim Winebarger, Manager NOC Tools Group, Cox Comm.
Project Objective Capture transaction level sales and store stock position to increase sales and improve profitability Customer Business Challenge Instances of store out of stock Inappropriate stock levels Difficult to perform analysis of large data volumes Overburdened operational systems Solution: Microsoft BI End to end business intelligence with: SQL Server, Reporting Services, Analysis Services, Integration Services and Data Mining 1. 6 TB Raw Data, 64 -bit, 20 -way server Customer Results/Benefits 50% reduction in development time over Oracle SQL Server Integration Services shows up to 400% performance improvement Deliver more meaningful, up-to-date info
Customer: Comp. USA Largest North American Retailer Customer Business Challenge 150+ disparate, disconnected legacy systems Integrated reporting system for 200+ stores, districts, and corporate Maintain competitive advantage through customer innovation Agile and flexible infrastructure with immediate, demonstrable ROI Solution SQL Server data warehouse Analysis Services cubes with 15+ dimensions Pro. Clarity – Web-based reports for stores and desktop client for corporate Solution Decision Factors: Time to market and cost Customer Results/Benefits Single version of the truth for store managers and corporate executives Improved productivity for store managers Enhanced attention to margin drivers New opportunities aftermarket sales (technical assistance program) and better inventory tracking to reduce fraud ROI estimated at $6 million “We're looking for an ROI next year of about $6 million, and that's a very conservative estimate—one that just looks at taking advantage of the aftermarket sales these reports will help generate. ” –Cathy Witt, Vice President and CIO, Comp. USA
Edcon DW System § Scenario/Business § Top retailer in South Africa, 650 stores, 11 K employees § Diversified retailing conglomerate with centralized back-end ops § Multiple store lines § Won data warehouse from Oracle § Workload/Architecture § 9 TB of total data in DW § 5 production servers, multiple instances, based on retail store line § Growth of approximately 1 TB per month total § All ETL uses DTS § 4 TB single instance for customer data § Front-ended by SQL Server OLAP cubes § All ad hoc query workload with Pro. Clarity § 300 users total growing to 500 by YE 04 § DB Platform § Currently running on 32 -bit systems § Currently evaluating 64 -bit HW platforms § Storage plans to grow to 70 TB total, with 15 TB allocated for DW
Clalit Health Services Patient DW § Scenario/Business § Oldest and largest HMO in Israel, covers 60% (3. 7 mil) of the population § 15 hospitals, 1300 clinics, 400 pharmacies, 32 K employees (6 K doctors) § IBM Informix on time Sun Solaris migration in 2002 § 67% TCO savings, 25% perf improvements: 15 x in some OLAP queries § Workload/Architecture § 5 TB of total data, multiple instances § 1 TB in largest single SQL Server instance § Ad hoc query workload § 15 -20 concurrent users at this time, 20 concurrent in-flight queries often § OLAP cubes using Microsoft Analysis Services § ISV front-end app: Pro. Clarity § ETL: Ascential Software’s Datastage ETL § Patient data collected from 2 S/390, 2 AS 400, 150 Unix servers § Also runs main clinic system OLTP on SQL Server § Migrating 1000 local SQL Servers in clinics to central DB, 2 TB expected § DB Platform § Windows Server 2003, Unisys 8 -way, 64 -bit, 32 GB RAM, EMC SAN
© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
- What is data mining and data warehousing
- Contoh data warehouse dan data mart
- Data warehouse components
- Contoh data mart pada departemen penjualan
- Introduction to data warehouse
- Arsitektur data mining
- Data mining dan data warehouse
- Data warehouse and olap technology for data mining
- What is data acquisition in data warehouse
- Data warehouse vs data mart
- Two tier architecture of data warehouse
- Data warehouse dan data mining
- Datamart olap
- Millions billions trillions chart
- What is disturbance that transfers energy
- A wave is a disturbance that transfers energy
- Start time end time and elapsed time
- Give us your hungry your tired your poor
- There is there are
- Any to any connectivity
- Seknder
- Visio sql
- Collier county schools data warehouse
- Sql server parallel data warehouse
- Populating data warehouse
- Epm data warehouse
- Olam in data warehouse
- Mount sinai data warehouse
- Kimball bus architecture
- Data warehouse requirements gathering template
- Azure sql data warehouse
- Azure sql data warehouse mpp
- Data warehouse empresarial
- Indexing in data warehouse
- Query driven approach in data warehouse
- Data warehouse and olap technology
- Basic concept of data warehousing
- Slide data warehouse
- Benefits of a data warehouse
- Building blocks of data warehouse
- Pengertian data warehouse