DAX Gently A fast paced but sequential approach
DAX Gently A fast paced but sequential approach to learn Data Analysis e. Xpressions Alnis Bajars. alnis@bajars. com @alnisb
Agenda DAX Gently Recap of Advantages of Power Pivot Fundamentals of Power Pivot Environment Calculated Columns First Pivot Table ex Power Pivot Calculated Fields aka Measures Functions and their use NOTE: Download Adventure Works DW 2014 (or recent version) to do the demos yourself. Load Fact. Internet. Sales, selected Related Tables, select Product Sub. Cat and Cat.
Assumptions So we don’t get bogged down. • Excel 2013 only (not 2010), and you can enable Power Pivot. (Note: Power BI Desktop emerging) • You know where Power Pivot fits into Power BI • You know the difference between Personal BI, Team BI and Corporate BI • You know the basic Power Pivot environment. • You know how to acquire data in Power Pivot • You understand the concept of the Data Model
Hot off the Press! Power BI Desktop coming July 24. • Power BI Designer Preview going live • Separates Power BI from Excel (corporate friendly) • No impediment to using 64 bit version http: //blogs. msdn. com/b/powerbi/archive/2015/07/10/announcing-power-bigeneral-availability-coming-july-24 th. aspx
References Power Pivot Succinctly by James Beresford • Best possible price • Great overview of Power Pivot ecosystem. Explains DAX well. Engine deep dive. DAX Patterns 2015 • By Marco Russo and Alberto Ferrari (The Italians) • Fast paced yet rigorous, does not over explain concepts Other Books by the Italians • SQL Server 2012 Analysis Services: The BISIM Tabular Model • Excel 2013: Building Data Models with Power. Pivot • Very rigorous and detailed, can be hard to digest Books by Rob Collie, Bill Jelen, Kapser de Jonge • Widely available on Kindle. Some easy to digest explanations, lacks rigour in places
Recap Advantages of Power Pivot
Advantages of Power Pivot • Breaks Excel 1 million row barrier • Fast and compressed x. Velocity (ex Vertipaq), cf SSAS Tabular) • Eclectic range of data sources • Data modelling, no cell references • No Vlookup! The streets are safe again. • Can do a lot of things Excel formulas can’t easily do. • Prototype SSAS Tabular
But be warned Looks like simple Excel functions. . But be warned. • Mismatch of expectations by Microsoft • Subtleties of row context and filter context are not easy to master • Not necessarily a power user tool And this is the focus of this presentation. Fast track over the bleeding obvious and focus on the more challenging concepts.
MDX vs DAX Power Pivot Succinctly by James Beresford.
Fundamentals of Power Pivot Environment
Getting Started • Enable Power Pivot in Excel 2013. • Power Pivot: Switch to Advanced Mode
Demo: Relationships View • Free us from the tyranny of cells and Vlookups • Auto create known relationships • Manually create relationships • Cardinality is important: 1 -many vs many-1 • No direct support for many-many, simple workarounds later • Multiple relationships – but only one active (no direct role playing)
Demo: Hierarchies • As per OLAP • Create in Diagram View • Either drag drop members, or right mouse click
Data Manipulation Notes • Limited # of data types. Text, Decimal, Whole, Currency, TRUE/FALSE • Compression good, except for Decimal/Currency • Note BLANK() function. Behaviour not always obvious for arithmetic and Boolean. • DIVIDE function handles div by zero, optional parameter to override BLANK(). Cleaner code and performs better than IF.
Functions: The Bleeding Obvious
Demo: Cleanse Data • Delete columns not needed cf. Filter at Preview cf filter at data source • Helper columns not needed by user: Hide From Client Tools • Format columns
Demo: Mark as Date Table • Essential prerequisite for advanced time intelligence functions • Design -> Mark as Date Table • Nominate a Unique Column • Also sort Month Name by Month Number of Year
Calculated Columns vs Calculated Fields Calculated Columns • Custom columns • Applies row by row , individual cells cannot be edited • By default, ROW CONTEXT Calculated Fields Called Measures in Excel 2010 (and Power BI Designer!) • Must be aggregates of some sort • By default, FILTER CONTEXT Warning • Filters can be enforced, overridden or transitioned • Context must be understood to understand many of the powerful functions. This is the KEY takeaway slide (hence Gold colour heading). .
Context Overview Row Context • • Custom columns Applies row by row Individual By default, ROW CONTEXT Filter Context Called Measures in Excel 2010 • Must be aggregates of some sort • By default, FILTER CONTEXT
Calculated Columns
Calculated Columns Basics An extra column! • Format [Column Name] = <Formula> Row Context • Applies row by row • See later contrasts with Filter Context Table Qualification • Can use [Field Name] • Can use Table[Field Name} for disambiguation cf SQL
Demo: Calculated Columns • Note the ways to name the column • Note references to other columns cf. “official tables” • Note behavior of aggregations
Demo: Reference Other Tables • Premature look at first functions • RELATED() when direction from many to 1 • RELATEDTABLE() when direction 1 to many • Will traverse relationships for as long as relationships of the same type • Sneak preview. CALCULATE auto works relationships 1 -many
EARLIER • Each pass of calculation engine creates a row context level • Acts like “CURRENTROW”, remembers previous row context • EARLIER(<column>, <skip_levels> ) • skip_levels defaults to 1 Eg. List. Price. Rank = COUNTROWS ( FILTER ( Dim. Product, Dim. Product[List. Price] > EARLIER ( Dim. Product[List. Price] ) )) + 1
First Pivot Table From Power Pivot
Demo: Create Pivot Table • Note all tables visible, except “Hide From Client Tools” • Otherwise a lot like Pivot Tables … so far
Calculated Fields And Filter Context
Demo: First Calculated Fields • Enter below the line • Note Pascal like assignment [Calculated Column] : = <Formula> • Enter from Calculated Fields button Note the default behaviour of filter context!
Aggregation Function Types Standard Functions • AVERAGE, MAX, MIN, SUM • Can only pass one column as an argument Append with A • AVERAGEA, MAXA, MINA • If text, returns 0 instead of error Append with X • • Solves the one column problem <Function>X(Table, Expression) SUMX( Sales, Sales[Order. Quantity] * Sales[Unit. Price] ) AVERAGEX, MAXX, MINX, SUMX
Count Functions • COUNT, COUNTA • COUNTX, COUNTAX • COUNTBLANK • COUNTROWS • COUNTDISTINCT
Essential Functions And their use
FILTER • Supply a table and filters • Returns table subset (cf. WHERE clause), still a table • FILTER(<table>, <filter 1>, <filter 2> …> Eg. Sum. Hi. Value. Prod : = SUMX( FILTER(Fact. Internet. Sales, Fact. Internet. Sales[Unit. Price]>100), Fact. Internet. Sales [Order. Quantity] * Fact. Internet. Sales [Unit. Price] )
CALCULATE • Probably the most important DAX function • Roll your own filter context, including none at all • Still obeys active filters • CALCULATE(<expression>, <filter 1>, <filter 2> …> Eg. [Sales 2011] : = CALCULATE ( SUM(Fact. Internet. Sales[Sales. Amount]), Dim. Date[Calendar. Year] = 2011)
ALL • Effectively removes filter each time a calculated field is executed Eg. [Sales 2011 ALL] : = CALCULATE ( SUM(Fact. Internet. Sales[Sales. Amount]), Dim. Date[Calendar. Year] = 2011, ALL(Dim. Date))
HASONEVALUE • TRUE if current context filtered to just one value, otherwise FALSE Eg. [Reseller. Sales compared to 2011] : = IF(HASONEVALUE(Date. Time[Calendar. Year]), SUM(Reseller. Sales[Sales. Amount])/ CALCULATE(SUM(Reseller. Sales[Sales. Amount]), Date. Time[Calendar. Year]=2011) , BLANK())
USERELATIONSHIP • Make one of multiple relationships between tables active • Workaround for lack of support for role playing dimensions Eg. [Sales by Ship Date] =CALCULATE(SUM(Fact. Internet. Sales[Sales. Amount]), USERELATIONSHIP(Fact. Internet. Sales[Ship. Date. Key], Dim. Date[Date. Key]))
First Look at Time Intelligence • Functions very difficult to do in Excel • Huge choice https: //msdn. microsoft. com/en-us/library/ee 634763. aspx Eg 1 [Sales. YTD] : = TOTALYTD( Sum(Fact. Internet. Sales[Sales. Amount]), Dim. Date[Full. Date. Alternate. Key] ) Eg 2 [PYSales] : = CALCULATE(Sum(Fact. Internet. Sales[Sales. Amount]), SAMEPERIODLASTYEAR(Dim. Date[Full. Date. Alternate. Key]) )
Many to Many Relationship A number of workarounds, eg. SUMMARIZE( Bridge_Account. Number, Dim. Account[ID_Account])
Summary Basic DAX simple and intuitive …. But there a lot of subtle tricks. Lots of in depth material from The Italians ie. Marco Russo and Alberto Ferrari Future Gently Talks If you think this format works well…. • M Gently (Power Query) • R Gently (Machine Learning/ Predictive Analysis) Alnis Bajars. Email: alnis@bajars. com Twitter: @alnisb
- Slides: 39