Andrzej Kukua Easy ETL with and SSIS Sponsors
- Slides: 29
Andrzej Kukuła Easy ETL with and SSIS
Sponsors
Session will begin very soon : ) Please complete the evaluation form from your pocket after the session. Your feedback will help us to improve future conferences and speakers will appreciate your feedback! Enjoy the conference!
About me 20+ years of professional experience in IT Trainer, Consultant Architect of HA/DR Solutions, BI Solutions, Data Platform Solutions, Cloud Solutions Building proficiency in AI/ML and Data Science MCSE, MCITP, MCTS Certified Polish Data Community (former PLSSUG) – PASS Chapter Leader Conference and user group organizer, speaker Andrzej@Kukula. pl, @Andrzej. Kukula
Agenda The ETL SSIS and its problems Introduction to Biml. Script and its features Code generation described Interesting functionalities Digressions Demos Summary
The ETL is a significant cost in DW project But it‘s not going to disappear tomorrow It is still the foundation of many big Enterprise Data Warehouses Thus the goal: simplify ETL to lower cost SSIS is very good ETL solution but. . .
SSIS and its problems Manual package creation using visual editor + drag & drop Slow, boring, error prone Not adaptable to requirements (think about changing logic in 10/100/… packages) Not developer-friendly Not generic Version control is hardly possible No easy API No code templates/design patterns/DRY No CI/CD Should we really spend our time fighting this?
Example package
Example DTSX Is this reasonable amount of code to accomplish the task? …
Enter
What’s Biml? Business Intelligence Markup Language A really easy XML-based language to describe BI assets Connections, Tables, Views SSIS Packages, SSIS Projects Dimensions, Measure Groups, Cubes and more… Smart default values and behavior Don’t bother with SSIS metadata (most of the time) Easy to integrate with VC (TFS, Git, and more) Easy to organize code into reusable libraries Available for free in Visual Studio (SSDT) with Biml. Express (formerly BIDS Helper) Demo!
Example Biml elements Quite easy to remember, aren’t they?
Biml and Biml. Script Biml = XML Domain Specific Language Biml. Script = Biml + C#/VB code nuggets called template Demo!
Biml. Script We have full power of. NET at our disposal Execute code during compilation Read files Access remote data Even create database objects (e. g. semi-temporary tables) Reference external. NET assemblies Supports including and calling other Biml files Extensive. NET Biml API built-in
Biml. Script expansion First, Biml. Script is expanded to Biml Then Biml is compiled to DTSX
The process • Biml + Biml. Script source files • C#/VB compiler • Biml. Script expansion to Biml • Population of in-memory Object Model • Single, in-memory, expanded, compiled and merged Biml Code generator …
Biml compilation Biml is compiled to DTSX + DTProj The process is repeatable Can be directly opened in SSDT (BIDS) Execution using DTExec / SSISDB / SSDT èNo additional runtime or runtime license required to execute package Compilation is just automation of SSIS The same rules apply Objects must exist in database (so that code generator can create SSIS metadata and mappings)
Root. Node The. NET object which represents all assets and metadata Populated during parsing of Biml files Useful in Biml. Script Allows creation of very dynamic solutions Root. Node. Connections Root. Node. Databases Root. Node. Tables Root. Node. Packages and many more
Code Reuse Can be used anywhere in Biml file Expands file given in argument and return Biml string Can specify additional arguments Also Call. Biml. Script. With. Output Include (expand) specified Biml file (or. NET resource) before current file
Interesting features <Annotation></Annotation> Lightweight metadata, can be associated with any asset Documentation annotations Interesting use case: SQL extended properties Object. Tag Property of all Biml. Script. NET objects Accessed programmatically from Biml. Script Very convenient and extensible object storage
Extending Biml. Script object model Contains plain C# code (no templating) Can interact with Biml object model Can use extended methods Can define new classes We can subclass original Biml classes to change/extend their behavior
More features Transformers and „Frameworks” Ability to change Biml after it is already expanded Can „reverse engineer” DTSX packages into Biml It’s possible to write custom code generators Different target runtime than DTSX
Source object types (digression) Views in custom schema, e. g. BI. Person Isolation Doesn’t need to modify production objects Can be in different database than production on the same instance (if absolutely no modification of production is allowed) Abstraction Rename columns Convert data, create new columns Model data – e. g. join with other sources Filter data Add extended properties Security Dedicated ETL user can be granted access only to views in this schema, doesn’t need access to underlying tables
Separation of duties External metadata Business analysts don’t touch the packages Programmers don’t model the business Easy storage MDS – Excel client! Extended properties custom
Complete demo Complete ETL solution Storage of metadata (connections, mappings) outside of code Dynamic import of database table definitions Dynamic generation of stage tables Annotations and SQL extended properties as a metadata source Execute SQL during Biml. Script expansion Logical. Display. Folder as a metadata source Conditional logic Type conversion on the fly Nonstandard column mapping Master package Dynamic package project Dynamic rename of objects during ETL Package to create stage tables Tiers MSBuild Demo!
How to benefit it? BI Project Decision Makers Faster initial delivery Lower cost of change Immediate Biml/Mist ROI BI Architects Reusable Design Patterns with Biml/Biml. Script One project compatible with SQL Server 2005 -2014 Easily manage large BI code base, tasks, and issues using TFS Plan for BI Continuous Integration/Continuous Delivery
How to benefit it? ETL developers Fast Biml learning curve Generate your DTSX faster with Biml instead of drag’n’drop Embrace DRY in BI development Use proper version control to manage your sources BI consultants Increased productivity Build your Biml/Biml. Script code library and reuse it in different projects No runtime license costs for customers
Questions? Andrzej@Kukula. pl @Andrzej. Kukula
- Inductive reasoning definition
- Inductive vs deductive geometry
- Every quiz has been easy. therefore the quiz will be easy
- Laws governing medicare parts c and d
- Etl design and development
- Etl in data cleaning and preprocessing stands for
- Types of sponsors
- Thank you to all the sponsors
- Rmit fc
- Calgary flames sponsors
- Jeopardy sponsors
- Notre dame suter
- It etl
- Power bi etl
- Etl prosessi
- Etl vs middleware
- Data modeling best practices for data warehousing
- Source:systems, process
- Enterprise etl
- White rabbit ohdsi
- Etl service manager
- Golden gate etl
- Etl process flow
- Acronimo etl
- Metadata driven etl framework
- Etl extract transform load
- Realtime etl
- Herramienta etl
- Etl components
- Airflow embulk