Introduction to OLAP and Analysis Services from Microsoft

  • Slides: 20
Download presentation
Introduction to OLAP and Analysis Services from Microsoft (ONLY for INTERNAL USE) n n

Introduction to OLAP and Analysis Services from Microsoft (ONLY for INTERNAL USE) n n n Josef Schiefer IBM Watson Research Center josef. schiefer@us. ibm. com

What is OLAP? n n n Slide 2 Online Analytical Processing coined by EF

What is OLAP? n n n Slide 2 Online Analytical Processing coined by EF Codd in 1994 paper contracted by Arbor Software* Generally synonymous with earlier terms such as Decisions Support, Business Intelligence, Executive Information System OLAP = Multidimensional Database MOLAP: Multidimensional OLAP (Arbor Essbase, Oracle Express) ROLAP: Relational OLAP (Informix Meta. Cube, Microstrategy DSS Agent)

OLAP is FASMI n Fast n Analysis n Shared n Multidimensional n Information Nigel

OLAP is FASMI n Fast n Analysis n Shared n Multidimensional n Information Nigel Pendse, Richard Creath - The OLAP Report Slide 3

Cubes Slide 4 n A cube stores information in a multidimensional structure and is

Cubes Slide 4 n A cube stores information in a multidimensional structure and is the central object in a multidimensional database. n Each cube contains a set of dimensions and measures. n Dimensions are derived from the tables and columns in your data that provide the categories you want to analyze. n Measures are the quantitative data derived from your data columns

Dimensions n n Slide 5 The dimensions you build should be distinct categories you

Dimensions n n Slide 5 The dimensions you build should be distinct categories you want to add to cubes in your OLAP database. example: geography, time, or employee dimensions represented in the picture

Cube and Dimensions: Hierarchical summarization paths eg io n Product, Region, Time Product R

Cube and Dimensions: Hierarchical summarization paths eg io n Product, Region, Time Product R W S N Juice Cola Milk Cream Gel Soap 1 2 3 45 6 7 Slide 6 Month Product Industry Region Country Time Category Region Quarter Product City Office Year Month Week Day

Dimensions and Hierarchy n n Dimensions are the categories used to organize or describe

Dimensions and Hierarchy n n Dimensions are the categories used to organize or describe analysis information Dimensions are used to navigate the information and to summarize the details into more aggregate data. Frequently used dimensions include time periods, geography, products, organization, and so on. Often dimensions are hierarchical (World - Continents - Countries) Slide 7

Measures = numercial Values n n n Slide 8 Measures are the quantitative data

Measures = numercial Values n n n Slide 8 Measures are the quantitative data in an OLAP database. For example, values such as sales, budget, cost, and so on, are all examples of measures. Measure values are organized in data cubes according to dimensions

Aggregations n n Slide 9 Aggregations greatly improve query efficiency and response time. A

Aggregations n n Slide 9 Aggregations greatly improve query efficiency and response time. A cube can hold a number of aggregations. The aggregation amount is based on several factors - the size of the data, the amount of storage space you allocate for aggregation storage, the mode of storage you select, and how much you want to optimize the aggregations design.

Primary OLAP Problems n Rigid, inflexible architectures – MOLAP or ROLAP n Significant scalability

Primary OLAP Problems n Rigid, inflexible architectures – MOLAP or ROLAP n Significant scalability problems – Data explosion and sparsity – Poor distributed client/server implementation n Separation of data warehousing from OLAP tools – Lack of integration between user tools and OLAP n Difficult to prototype, develop, deploy – Time and expense Slide 10

MS-AS: Architecture Microsoft Analysis Services are optimized for all OLAP architectures and offers seamless

MS-AS: Architecture Microsoft Analysis Services are optimized for all OLAP architectures and offers seamless integration n MOLAP: MOLAP aggregations & details managed in an efficient multidimensional store ROLAP: ROLAP aggregations created in relational store HOLAP: HOLAP different things to different vendors – Aggregations: Aggregations details in relational, aggregations in MOLAP store – Partitions: Partitions single logical cube physically divided into multiple MOLAP and ROLAP partitions – Virtual cubes: cubes “view-like” join of multiple MOLAP and ROLAP cubes Slide 11

MS-AS: Scalability n MS-AS offer major innovation – Data explosion managed by partial preaggregation

MS-AS: Scalability n MS-AS offer major innovation – Data explosion managed by partial preaggregation – Automatic elimination of sparse storage n Partitioned cubes – parallel query processing across clustered servers – fine tuning of aggregations, to better manage performance and disk space trade-offs Slide 12

MS-AS: Scalability n Cooperative client/server query management and caching – network traffic minimized –

MS-AS: Scalability n Cooperative client/server query management and caching – network traffic minimized – server queries processed efficiently n Microsoft Data Cube Service – desktop component ships with next release of Office – used with Excel, Access, and Web – supports local, offline usage Slide 13

Microsoft Data Cube Service n Basic architecture: – Cache query results and metadata, not

Microsoft Data Cube Service n Basic architecture: – Cache query results and metadata, not disk pages. n Algorithms deduce missing data and transform queries – Aggregation – Filtering – Combination n Slide 14 Instant reply to cached queries

MS Data Cube Benefits n n n Slide 15 Efficient distribution of query and

MS Data Cube Benefits n n n Slide 15 Efficient distribution of query and calculation processing across client & server Single component spans Microsoft desktop and server platforms & products Unifies the MD data access story across Excel, MS-AS, and SQL Server Enables Microsoft to establish industry standard for MD data access Basis for MS-AS and Excel mobile story

MS-AS: Integration n The Microsoft Analysis Services integrate the maintenance of OLAP with the

MS-AS: Integration n The Microsoft Analysis Services integrate the maintenance of OLAP with the underlying data warehouse – Design the DW structure – Create the DW tables/cubes – Populate the DW tables/cubes – Maintain by incremental loads – Optimize by actual usage patterns – Manage users, scripts, usage, metadata – Multiple data sources (not just SQLS) Slide 16

MS-AS: Integration n OLE DB for OLAP & ADO MD – based upon existing

MS-AS: Integration n OLE DB for OLAP & ADO MD – based upon existing data access technology – establishes industry standard for MD data access n n Slide 17 OLE DB/ODBC enable MS-AS to access data in all major RDBMs Third party client applications

OLAP Problem: Complexity n OLAP products are traditionally difficult to configure, develop, and deploy

OLAP Problem: Complexity n OLAP products are traditionally difficult to configure, develop, and deploy – Arcane tools – Heavy consulting – Poor integration Slide 18

3 Tier Architecture & Components Client Tier • Excel • Active. X Controls •

3 Tier Architecture & Components Client Tier • Excel • Active. X Controls • Third Party Applications ADO MD OLE DB for OLAP DCube MS-AS Server Data Warehouse Tier • MS-AS Server • SQL Server DTS OLTP Source Tier • RDBMs Slide 20 MOLAP HOLAP OLE DB ROLAP • Data selection & navigation • Presentation and charting • What-if formulas • Client side caching • Desktop object model • Offline usage • Multidimensional calcs • MOLAP/ROLAP/HOLAP data Modeling/aggregations • Security • Metadata management • Server side caching • Administrative tools • Server object model • Query distribution

Let‘s go to the demonstration. . . Slide 21

Let‘s go to the demonstration. . . Slide 21