Analysis Services Choosing between Multidimensional and Tabular Helena
Analysis Services Choosing between Multidimensional and Tabular Helena Cabral - BI 4 ALL hcabral@bi 4 all. pt
Our Main Sponsors:
Say Thank you to Volunteers: § They spend their FREE time to give you this event. § Because they are crazy. § Because they want YOU to learn from the BEST IN THE WORLD.
Paulo Matos:
Pedro Simões:
André Batista:
Paulo Borges:
André Melancia:
Murilo Miranda:
Quilson Antunes:
5 Sponsor Sessions at 15: 10 § Don’t miss them, they might be getting distributing some awesome prizes! § § § Rumos BI 4 ALL Bold Int Cozy. Roc Pythian
Important Activities: WIT – Women in Technology § 15: 10 at Biz. Spark Room (Ground Floor) SQLClinic Challenges § 10: 00 DEV (Neil Hambly) § 11: 50 DBA (Uwe Ricken) § 17: 00 BI (Steph Locke)
Agenda § SSAS Evolution § Tabular: What’s it and how does it work? § Multidimensional vs Tabular § § § § Development experience Data modeling Scalability Query performance Process performance Security Client applications support § Conclusions
v 12 SQL Server 2014 v 11 SQL Server 2012 v 10 SQL Server 2008 (and R 2) v 9 SQL Server 2005 v 8 SQL Server 2000 v 7 SQL Server 7. 0 • First release with OLAP Services • MOLAP, ROLAP and HOLAP architectures • MDX query language • SSDT-BI • Power View for Multidimensional Models (using DAX) • x. Velocity engine • Power Pivot and Tabular Model • DAX language • Business Intelligence Semantic Model (BISM) • Improved Scalability and Performance • Attribute and User Hierarchies • Business Intelligence Development Studio (BIDS) • Unified Dimension Model (UDM) • Data Mining services • Improved dimension design (changing dimensions, parent-child, virtual dimensions) • Improved calculation engine (custom rollups, cell calculations) • Dimension security • Connectivity over HTTP SSAS Evolution
SSAS Evolution § Why two approaches? § Only one data modeling tool for different needs § Similar development experience § Easy migration between Self. Service and Corporate BI § New column-oriented and inmemory storage technology § Multidimensional concepts are still difficult to learn
What is it? How does it work? Does it still have dimensions and measure groups? SSAS TABULAR
SSAS Tabular § What is it? § Relational modeling § Uses x. Velocity which is an in-memory engine with column-oriented storage § Provides higher compression and faster in-memory operations such as aggregations § Supports for both MDX and DAX query languages § Available in SQL Server 2012/14 Enterprise and Business Intelligence editions § Also supports a Direct. Query mode
§ You can add calculated columns and measures to any table in the model § You can create hierarchies in any table in the model § You can partition any table in the model § Unlike Multidimensional models, there’s no difference between a dimension table and a measure group table. § Sometimes we don’t need a dimensional model! Dimensions vs Fact Tables
Which development tools can I use? How long is the learning curve? DEVELOPMENT EXPERIENCE
Development tools TABULAR § SQL Server Data Tools for Visual Studio § Open an existing Power Pivot model MULTIDIMENSIONAL § SQL Server Data Tools for Visual Studio
Development process TABULAR § Easy to wrap a model over raw sources or DW for prototyping purposes § Model can be created from a Self-Service experience § Smaller learning curve (Excel like) § Need to learn DAX MULTIDIMENSIONAL § Model should be created over a DW source § Not so easy prototyping § Higher learning curve § IT required § Wide knowledge base § Need to learn MDX
Development experience TABULAR § Workspace database directly related with the project § WYSIWYG experience, but also can make the development experience too slow § Single developer at a time (one model per solution) MULTIDIMENSIONAL § Need to be deployed and processed to server to see the results § Support for multiple developers
Does Tabular have the same advanced model options? Do I have to learn DAX? DATA MODELING
Data Modeling TABULAR MULTIDIMENSIONAL Relational Modeling: § Tables and more tables Dimensional Modeling: § Facts and Dimensions § Basic Relationships (…and advanced relationships through DAX calculations) § Advanced Relationships natively supported (parentchild, many-to-many, attribute relationships, …) § No role playing dimension (two tables can only have one active relationship) § Support multiple granularity § DAX calculations § Do not support multiple granularity in the same measure group § MDX calculations
Other (un)supported features TABULAR MULTIDIMENSIONAL § Some features (as Actions, Translations and Display Folders) are possible by editing the generated XMLA or by using BIDS Helper tool available at Code. Plex § Named-Sets, Scopes, Actions, Translations § Scopes can be defined by using calculated columns § Display Folders, custom format of measures, default values for dimensions attributes § Write-back, Data Mining, Custom Assemblies, Custom Rollups § Process behavior configuration
Relational model Sales. Amount. With. Reason : = CALCULATE ( SUM ( 'Internet Sales'[Sales Amount] ); FILTER ( Fact. Internet. Sales. Reason; CALCULATE ( COUNTROWS ( Dim. Sales. Reason ) > 0 ) ) ) § Many to many relationships are not supported by Tabular (natively) § But can be achieved by using some advanced DAX calculations! DAX Calculation DEMO #1
Relational model DAX Query EVALUATE SUMMARIZE ( CALCULATETABLE ( 'Product Inventory'; 'Product Category'[Product Category Name] = "Clothing" ); 'Product'[Product Name]; 'Date'[Date]; "Total Units Movement", [Total Units Movement] ) ORDER BY 'Product'[Product Name]; 'Date'[Date] § Calculating balances by using transactions § Need to sum all transactions up to a given day § A common solution is to create a snaphost table with the quantity (calculated in ETL) for each product/day § Calculate balances on the fly using Tabular/DAX! DEMO #2
Does Tabular support a large volume of data? SCALABILITY AND DEPLOY
Operation modes TABULAR MULTIDIMENSIONAL In-memory/cached: § All data is loaded in memory and all queries are answered from there MOLAP: § Data is pulled from relational store, is aggregated and stored in a multidimensional format Direct Query: § Bypass the in-memory model, queries are answered directly from database source (SQL Server only) ROLAP: § All data stay in the relational store and additiona objetcs are created for calculated aggregations
Model scalability TABULAR MULTIDIMENSIONAL § Data volume mostly limited to physical memory § Data volumes can scale to multiple terabytes § Small/medium size models § Medium/large size models Processor centric: § Memory is the most critical resource Disk centric: § Disk performance is important because is used as the primary data storage for preaggregated data § Disk performance is not an important factor
How fast is Tabular compared to Multidimensional? Is always faster? QUERY PERFORMANCE
Querying scenarios TABULAR § Always read columnar data from RAM § Aggregate data in memory § Second (warm) run is just as fast the first (cold) one MULTIDIMENSIONAL § With no aggregations defined: read atomic data from disk and aggregate data in memory § With predefined aggregations: § Cold Cache: read aggregated data from disk § Warm Cache: read aggregated data from RAM
Querying preparation TABULAR MULTIDIMENSIONAL § Use DAX for detailed reports § Use MDX or DAX for querying § Use MDX for aggregated data § Need to define aggregations § No special tuning is required § Benefits from warmup cache mechanisms Outperforms Multidimensional: § Report on low granularity data Outperforms Tabular: § Report on aggregated data with predefined aggregations on Warm Cache § Report on aggregated data with no predefined aggregations
MDX Query SELECT { [Measures]. [Discount Amount] , [Measures]. [Reseller Freight Cost] , [Measures]. [Reseller Gross Profit] , [Measures]. [Reseller Total Product Cost] , [Measures]. [Reseller Sales Amount] , [Measures]. [Reseller Tax Amount] } ON COLUMNS, NON EMPTY ( [Geography]. [City] *[Product] *[Sales Territory]. [Sales Territory Region] *[Promotion] ) ON ROWS FROM [Adventure Works] § Detailed queries are normally an headache § Most of the times, we opt to query directly the DW relational source § With Tabular, this detailed queries perform better! DEMO #3
Relational model § Distinct Count (DC) operations are expensive in Multidimensional § Moreover, need to create and maintain multiple measure groups (one for each DC measure) § With Tabular, DC measures performs better without need special modeling! DAX Query New. Customers : = CALCULATE ( DISTINCTCOUNT ( 'Internet Sales'[Customer. Key] ); FILTER ( ALL ( Date ); 'Date'[Date. Key] <= MAX ( 'Date'[Date. Key] ) ) ) - CALCULATE ( DISTINCTCOUNT ( 'Internet Sales'[Customer. Key] ); FILTER ( ALL ( Date ); 'Date'[Date. Key] < MAX ( 'Date'[Date. Key] ) ) ) DEMO #4
What are the process options? Is there any dependencies? PROCESS PERFORMANCE
Processing preparation TABULAR § Every table can be partitioned § No processing dependencies (one table can be processed without processing related tables) MULTIDIMENSIONAL § Only measure groups can be partitioned § Processing dependencies (need to process dimensions first) § Don’t support parallel partition processing § Support for parallel partition processing § Data in other tables/partitions can be queried during processing § Aggregations must be recalculated when dimensions processing occurs
Processing options TABULAR § Full process of a whole DB might consume too much memory § Process Defrag: rebuilds the table dictionary § Process Recalc: recalculates calculated columns, rebuilds relationships, user hierarchies and other internal engine structures MULTIDIMENSIONAL • Process Index: creates or rebuilds indexes and aggregations for all processed partitions
Does Tabular have the same security options as Multidimensional? SECURITY
Security TABULAR MULTIDIMENSIONAL § Don’t support cell security, only table and row level security • Advanced security features that can go up to cell-level § Drill through permission is set to all roles • It’s possible to set drill through permission at role level § Dynamic Security • Dynamic Security § Only supports visual totals • Supports both visual totals and non-visual totals § Use DAX to restrict access to Row Sets
Do I have to change my client applications? CLIENT APPLICATIONS SUPPORT
Client Applications Support TABULAR Query languages: MULTIDIMENSIONAL Query languages: § In Cached Mode: MDX and DAX § In Direct. Query mode: only DAX • MDX and DAX (as of SQL Server 2012 SP 1 CU 4) Client tool Language Microsoft Excel Pivot. Table MDX Power View DAX SQL Server Reporting Services MDX / DAX Performance. Point MDX Other applications (ADO. NET of OLE DB provider) MDX / DAX
So. . What do I choose? CONCLUSIONS
You should choose Multidimensional! § You have an extremely large cube that exceeds server memory § You need some of the features not supported by Tabular (even with DAX) § You have a complex model (budgeting/forecasting, multiple many -to-many, parent-child…) § You are an MDX expert § You hate workarounds § Your reporting needs are typically aggregated § You have all your models in Multidimensional mode and want to migrate to Tabular just because!
You should choose Tabular! § You have small/medium models that fit in server memory § You need a non-multidimensional model § You have simple models (and not need neither of the unsupported features) § You really never understood MDX (just give a try to DAX, although is not so simple than they say) § You are an enthusiastic about workarounds! § You have detailed and ad-hoc reports that takes to long to run § You have a new model with no special needs and it will be great if you can show it just after lunch (prototyping!)
Our Main Sponsors:
- Slides: 46