Building a Dynamic OLAP Environment Dr John Tunnicliffe

Building a Dynamic OLAP Environment Dr. John Tunnicliffe Independent BI Architect Mobile: 07771 818770 john@bovi. co. uk www. bovi. co. uk

Sparks from the Anvil l Blog URL l http: //sqlblogcasts. com/blogs/drjohn l See speaker profile on www. SQLBits. com l Tiny URLs in presentation to specific blog entries l e. g. tinyurl. com/4 yolm 6 k

Agenda l l Why build an dynamic OLAP environment? Technical architecture The build sequence Tools & techniques deep dive: l l l ! ! ! g n n o i i s n s r e a s l W leve SSIS infrastructure Creating the relational database structure Handling extension tables OLAP cube deployment Modifying cube attributes Dealing with unnatural hierarchies 0 40

Introducing the OLAP data mart l l Data Mart = subset of data to meet specific business needs OLAP = specific technology allowing users to access data OLAP data mart = specific subset of data to meet specific business needs + presented using correct business terminology and attributes + delivered using OLAP: allows data to be analysed using familiar tools such as Excel or 70+ other client tools (see www. ssas-info. com: client tools) KIS – keep it simple!

Why build a Dynamic OLAP Environment? l Market Data / Market Research Providers l l l Sell data specific to each client or research group Require cube to support analysis by special attributes Add value to data by providing better analysis tools e. g. IMS Health Portfolio Analysis in Financial Services l l Institutional investors Private clients / high net worth individuals Traders e. g. Bank of America / Merrill Lynch

Why build a Dynamic OLAP Environment? Characteristics of the need l l Common core data model Common analysis needs l Ultimately can design and build l l Basic database structure Basic cube ‘template’ l Users with specific need to extend core data model l Multiple core data models l l Multiple ‘templates’ Specific security concerns

Technical Architecture l l All databases created on-the-fly by control system Single database used for staging and data mart

ETL Architecture

Overall Build Package

Tools & techniques deep dive l l l SSIS infrastructure Creating the relational database Handling extension tables l l name/value pairs OLAP cube deployment Modifying OLAP cube attributes Dealing with unnatural hierarchies

SSIS Infrastructure SSIS Logging l l Log activity to control database Implement custom logging using event handlers l l Write customer order number to log to allow analysis See Jamie’s blog: tinyurl. com/3 c 4 dg 2 g

SSIS Infrastructure Setting Package Configurations - Options 1. Use dtexec. exe command line options l 2. Use standard package configurations held in database l l 3. Problem with overly long and complex command lines Easy to alter connection strings using SQL Limited to one simultaneous build Invoke SSIS using custom C# program l l l Use SSIS API to set connection strings etc. Provides fully scalable solution – multiple simultaneous builds Allows for full logging (see dtloggedexec. codeplex. com)

SSIS infrastructure Package configurations l l l Local dts. Config file provides connection to control database All others read from SSISConfiguration table in control database Connection strings altered by Prepare. Order SQL stored proc

Creating databases with SSIS

Creating databases with SSIS l Use connection manager to master database l l Create database in SIMPLE recovery mode l l l Connect with sysadmin privileges Point of failure recovery is unnecessary Unwilling to incur overhead of fully recoverable transaction log Use expression on SSIS variable to create SQL IF EXISTS (SELECT name FROM sys. databases WHERE name = 'My. Database') DROP DATABASE My. Database; GO CREATE DATABASE My. Database GO ALTER DATABASE My. Database SET RECOVERY SIMPLE GO

Creating the database structure

Creating the database structure Managing the SQL Scripts l Using SSMS database projects l l Use numbers in file names to set order Use batch file to concatenate files l l l Use type command Ensure files have carriage return at end! Write to file using redirection > and >> See my blog: tinyurl. com/67 wueu 5

Creating the database structure Executing the SQL Script using SSIS l Use Execute SQL task to execute concatenated script l l l Set SQL Source Type to File connection Set package configuration on File. Connection manager Script creates all l l Schemas Tables Views Stored procs etc. See my blog: tinyurl. com/67 wueu 5

Handling extension tables

Handling extension tables The design problem l The Problem l Users want an extensible data model l SSIS ‘breaks’ whenever the input file or table schema changes Solution 1: l Dynamically generate SSIS package in memory using C# l l i. e. add attributes to any dimension Costly development & hard to maintain Solution 2: l l Import data as name/value pairs PIVOT data using dynamically generated SQL Hold pivoted data in separate tables, linked to core table by ID ‘Sit’ cube on views and dynamically change views

Handling extension tables Target table structure

Handling extension tables Importing name/value pairs l l Use Unicode file formats for international support Input file & matching table contains three columns l l l ID Name Value ID identifies record from core table Does not ‘break’ SSIS

Handling extension tables Metadata l Table metadata identifies: l l l extension table pivot table ID column name dimension view dimension name Field metadata identifies: l l l extension table field names data type of each field include in cube include as cube property

Handling extension tables Table to hold pivoted data l Option 1: l l Dynamically create CREATE TABLE statement Option 2: l ‘Stub’ table already exists e. g. l Dynamically create ALTER TABLE statement

Handling extension tables Pivoting name/value pairs l Option 1: l l Use stored proc to read metadata & dynamically generate INSERT statements Write INSERT statements to a SQL script file Have SSIS execute the generated SQL script file Option 2: l Use stored proc to read metadata & dynamically generate INSERT INTO … SELECT … PIVOT statements

Handling extension tables Pivoting name/value pairs l Query metadata and dynamically generate PIVOT SQL l l Note: MAX function works with strings too! Generate SQL dynamically or hold stub of PIVOT query in table INSERT INTO dbo. Event. Pivot. Ext (Event. ID, FIELD_LIST) SELECT Event. ID, FIELD_LIST FROM dbo. Event. Ext ps PIVOT ( MAX(Field. Value) FOR Field. Name IN ( FIELD_LIST )) AS pvt;

Handling extension tables Data architecture l Cube ‘sits’ on views l l l One view for each dimension & fact Use vw. Dim & vw. Fact prefixes Dimension views l l Denormalise table structure Views can be dynamically changed to include extension attributes
![Handling extension tables Dynamically altering views l Obtain current view definition SELECT @cmd=[definition] FROM Handling extension tables Dynamically altering views l Obtain current view definition SELECT @cmd=[definition] FROM](http://slidetodoc.com/presentation_image_h2/d87a95b285cc4fa400496810950dfc1b/image-28.jpg)
Handling extension tables Dynamically altering views l Obtain current view definition SELECT @cmd=[definition] FROM sys. sql_modules WHERE OBJECT_NAME([object_id])=@Dimension. View. Name; l l Read metadata tables to obtain FIELD_LIST Insert FIELD_LIST before FROM operator Add LEFT JOIN to pivoted extension table Run commands l l DROP VIEW CREATE VIEW

OLAP cube manipulation

OLAP cube manipulation SSIS package to Deploy, Alter and Process OLAP database

OLAP cube manipulation Deploying the OLAP database l Background l Visual Studio (BIDS) solution consists of lots of small XML files l l . dim. cube. partitions Visual Studio (BIDS) builds a single. asdatabase file Use Analysis Services Deployment Wizard to deploy Deploy using C# Script Task l l No examples on internet of deploying OLAP database using C# Thomas Kejser’s blog had only example of automating process l l l Run Deployment Wizard in “answer mode” Use Deployment Wizard to generate XMLA Use Ascmd command to deploy See Thomas Kejser’s blog: tinyurl. com/5 wjjlke

OLAP cube manipulation Deploy using C# Script Task l Option 1: l l Preparation: Developer to create XMLA file using Analysis Services Deployment Wizard during ‘release’ process SSIS tasks reads XMLA and modifies XML to change server and database names Deploy XMLA using AMO Execute Option 2: l l SSIS tasks reads. asdatabase file, adds XMLA wrapper and modifies XML to change server and database names Deploy XMLA using AMO Execute

OLAP cube manipulation Deploy using C# Script Task l Working with AMO in SSIS Script Task l Add reference to Microsoft. Analysis. Services. dll l C: Program FilesMicrosoft SQL Server100SDKAssemblies Select File->Save All from menu!!!!! Connect to OLAP server using Microsoft. Analysis. Services; … using (Server svr = new Server()) { svr. Connect(olap. Server. Name); if (svr. Connected) { See my blog: tinyurl. com/6 jqta 8 q

OLAP cube manipulation Deploy using C# Script Task l l Create XMLA command wrapper in Xml. Document Read. asdatabase file into Xml. Document l l Merge into XMLA command Remove read-only attributes l l l l l Created. Timestamp Last. Schema. Update Last. Processed State Impersonation. Info. Security Current. Storage. Mode Processing. State Connection. String. Security See my blog: tinyurl. com/6 jqta 8 q

OLAP cube manipulation Deploy using C# Script Task l Modify XML to change l l l database name connection string to relational data mart Execute XMLA command Xmla. Result. Collection o. Results = svr. Execute(xmla. Command. Inner. Xml); Do not process cube yet! See my blog: tinyurl. com/6 jqta 8 q

OLAP cube manipulation Adding attributes to dimensions l l Read attribute metadata with OLE DB source component Use script component as destination transformation l Simplifies data access Input columns

OLAP cube manipulation Adding attributes to dimensions l Pre. Execute() method l l Process. Input. Row(Attribute. Buffer Row) l l Connect to OLAP server Add / Remove attributes Post. Execute() method l Disconnect from OLAP server

OLAP cube manipulation Process. Input. Row logic l Check if the attribute appears in the DSV Data. Source. View dsv = db. Data. Source. Views[0]; … string table. Name = "dbo_" + Row. Dimension. View. Name; Data. Table data. Table = dsv. Schema. Tables[table. Name]; … Data. Column data. Column = data. Table. Columns[Row. Attribute. Name]; … data. Column = data. Table. Columns. Add(Row. Attribute. Name); data. Column. Max. Length = Row. Field. Length; dsv. Update();

OLAP cube manipulation Process. Input. Row logic l Find dimension Dimension dim = db. Dimensions. Find. By. Name(Row. Dimension. Name); l Check if the attribute appears in dimension Dimension. Attribute attr = dim. Attributes. Find. By. Name(attr. Name); … attr = dim. Attributes. Add(attr. Name); attr. Order. By = Order. By. Name; attr. Key. Columns. Add(new Data. Item(data. Table. Name, Row. Attribute. Name, Ole. Db. Type. Converter. Get. Restricted. Ole. Db. Type(data. Column. Data. Type))); dim. Update(Update. Options. Expand. Full | Update. Options. Alter. Dependents);

OLAP cube manipulation Process. Input. Row logic l Removing attributes Dimension. Attribute attr = dim. Attributes. Find. By. Name(attr. Name); … dim. Attributes. Remove(attr); dim. Update(Update. Options. Expand. Full | Update. Options. Alter. Dependents);

OLAP cube manipulation Process & backup OLAP database task l Process OLAP database Database db = svr. Databases. Find. By. Name(olap. Database. Name); … db. Process(Process. Type. Process. Full); l Backup OLAP database db. Backup(olap. Backup. File. Path, true); // allow overwrite

Dealing with unnatural hierarchies l Problem: l l SSAS user hierarchies easily broken by ‘bad’ data Often caused by same child having multiple parents l l Solution 1: l l Use key collection in SSAS Solution 2: l l Hierarchy seems logical to user as child taken in context with parents Use CHECKSUM to generate key Solution 3: l Use RANK() to generate key

Dealing with unnatural hierarchies Using RANK() SELECT DISTINCT RANK() OVER (ORDER BY Event. Class) AS Event. Class. Key, Event. Class, RANK() OVER (ORDER BY Event. Sub. Class, Event. Class) AS Event. Sub. Class. Key, Event. Sub. Class, RANK() OVER (ORDER BY Event. Description, Event. Sub. Class, Event. Class) AS Event. Description. Key, Event. Description FROM dbo. Event

Building a Dynamic OLAP Environment Dr. John Tunnicliffe Independent BI Architect Mobile: 07771 818770 john@bovi. co. uk www. bovi. co. uk
- Slides: 44