Arne Bartels SSIS Custom Pipeline Component A stepbystep
Arne Bartels SSIS Custom Pipeline Component A step-by-step guide #SQLSat. Denmark
BIG Thanks to SQLSat Denmark sponsors #SQLSat. Denmark
Agenda Who is Vestas Whoami Motivation Goal Loong Demo Outlook #SQLSat. Denmark
Who is Vestas, what are we doing #SQLSat. Denmark
§ § § Vestas is currently the world biggest wind turbine manufacturer Surveillance of 2400+ plants with 33000+ turbines Supercomputer Mindstorm § § § Datawarehouse/Data Mart server § § § § 16848 cores 41600 GB Memory 481. 788 TFlop/s currently Rank 440 Six servers Dev/Test/Prod ETL(”DSA”) DM(”DPA”) SQL 2016 Prod and Test 40 core 3. 5 GHz 1. 5 TB Memory Dev 20 core 3 GHz 256 GB Memory Storage raw 160 TB, all SSD, 800 TB effective capacity Hardware located in Copenhagen DK Core Team of a handfull SQL/SSIS-Developers from five countries mainly in Aarhus DK #SQLSat. Denmark
Whoami § Arne Bartels § Working for Vestas Wind Systems A/S in Aarhus DK. Since approx. 2007 as Lead Software Development Engineer § Software developer in wind industry since 2001 § Physicist by education, programmer by profession § Worked in C/C++, Pascal, Linux scripts/PHP, My. SQL… § Recent: T-SQL, SSIS, C# § arbar@vestas. com § arne. bartels 2@gmail. com § Linked. In #SQLSat. Denmark
Motivation § Inspired by real world challenge to import files into a DW § Make that automizable/Biml-able § Show you to overcome quite some of the tripwires § Motivate to play around § Show off what I can do #SQLSat. Denmark
Goal Build a reusable component to scan folders for file patterns within date ranges and batch sizes. Sort by write date and pathname. Optionally delete old data. #SQLSat. Denmark
What is a Custom Pipeline Component? § Class. Library (DLL) implementing the necessary interfaces for SSIS. § To use it in development of SSIS, it has to be copied at the ”right” place (DEV). § To be executed, it has to be added to the assembly (DEV and PROD), which requires admin rights. All the above has to be correct otherwise the SSIS-toolbox won’t show the component. There is no debug option; you have to try, and try and… #SQLSat. Denmark
What is needed? § Visual Studio e. g. 2015, Community edtion is enough § +SSDT for SSIS-package editing § Optionally Icon editor e. g. Gimp § Ideas and patience #SQLSat. Denmark
Documentation § § Microsoft: Developing a Custom Data Flow Component http: //blogs. perficient. com/microsoft/2016/04/developingcustom-ssis-component-9 -lessons-learned/ § http: //www. sqlis. com/sqlis/ § § https: //sqlbits. com/Sessions/Event 2/Extending_SSIS_with_custom_Da ta_Flow_components How to add an icon to your component SQLSaturday DK 2015 Wolfgang Strasser https: //www. red-gate. com/simple-talk/sql/ssis/developing-acustom-ssis-source-component/ #SQLSat. Denmark
Demo: how to get started Have a SSIS test project Start new project of type Class. Library Sign assembly Add Reference to Microsoft. SQLServer. Pipeline. Host. dll § Inherit class from Pipeline. Component § Add Dts. Pipeline. Component. Attribute § For convenience setup build events and debug § § #SQLSat. Denmark
The bare minimum #SQLSat. Denmark
Add Icon [Dts. Pipeline. Component(Display. Name = "Folder Source" , Description = "Display folder information as data stream" , Icon. Resource = "Folder_Source. Resources. Folder. Source. ico" , Component. Type = Component. Type. Source. Adapter )] To find the right string, ildasm is helpful #SQLSat. Denmark
Override design time methods … #region design time methods public override void Provide. Component. Properties() { base. Provide. Component. Properties(); Remove. All. Inputs. Outputs. And. Custom. Properties(); //todo: setup Custom. Properties an Output } public override DTSValidation. Status Validate() { return base. Validate(); } #endregion design time methods #SQLSat. Denmark
… and run time methods #region run time methods public override void Pre. Execute() { //do work } public override void Prime. Output(int outputs, int[] output. IDs, Pipeline. Buffer[] buffers) { //do output } //public override void Process. Input(int input. ID, Pipeline. Buffer buffer) //{ // no Input to process => nothing to do //} #endregion run time methods #SQLSat. Denmark
Add columns… #region Output Declaration IDTSOutput 100 output = Component. Meta. Data. Output. Collection. New(); output. Name = "Output"; output. Synchronous. Input. ID = 0; output. Delete. Output. On. Path. Detached = true; output. Is. Sorted = true; IDTSOutput. Column 100 output. Column = null; output. Column = output. Output. Column. Collection. New(); output. Column. Name = "Full. Name"; output. Column. Set. Data. Type. Properties(Data. Type. DT_WSTR, 260, 0, 0, 0); output. Column. Sort. Key. Position = 2; //. . . #endregion Output Declaration #SQLSat. Denmark
…and custom properties #region Properties Declaration IDTSCustom. Property 100 property = null; property = Component. Meta. Data. Custom. Property. Collection. New(); property. Name = "Folder. Patterns"; property. Description = "Files matching will be displayed. "; property. Expression. Type = DTSCustom. Property. Expression. Type. CPET_NOTIFY; property. Type. Converter = typeof(String). Assembly. Qualified. Name; property. Value = @""; //. . . #endregion Properties Declaration #SQLSat. Denmark
Debugging § Design time methods: straightforward, hit F 5 § Run time methods: Attach to process § Hit F 5, stop SSIS project at breakpoint, note processid § Go back Class. Library debug and ”Attach to process” Dts. Debug. Host. exe to the ”other” Process. ID. If needed use ”taskkill /f /im dtsdebughost. exe” in adm command prompt . #SQLSat. Denmark
Or use ”inline debugging” #region Members private bool fireagain = false; #endregion Members … public override void Pre. Execute() { Component. Meta. Data. Fire. Information(0, "Pre. Execute", "start", null, 0, ref fireagain); foreach (IDTSCustom. Property 100 property in Component. Meta. Data. Custom. Property. Collection) Component. Meta. Data. Fire. Information(0, "Property: "+ property. Name, property. Value, null, 0, ref fireagain); foreach (IDTSOutput 100 output in Component. Meta. Data. Output. Collection) { Component. Meta. Data. Fire. Information(0, "Output: "+output. Name, "has: "+output. Output. Column. Collection. Count. To. String()+" columns", null, 0, ref fireagain); foreach(IDTSOutput. Column 100 output. Column in output. Output. Column. Collection) Component. Meta. Data. Fire. Information(0, "Output column: " + output. Column. Name, output. Column. Data. Type. To. String(), null, 0, ref fireagain); } Component. Meta. Data. Fire. Information(0, "Pre. Execute", "end", null, 0, ref fireagain); } #SQLSat. Denmark
Read and store column mapping § Store all column indices in Key=>Value Array: private Sorted. List output. Column. Indices = null; … #region Output column index mapping output. Column. Indices = new Sorted. List(); try {//remember the indices of the actual remaining columns for later use IDTSOutput 100 output = Component. Meta. Data. Output. Collection["Output"]; if(output!=null) foreach (IDTSOutput. Column 100 output. Column in output. Output. Column. Collection) output. Column. Indices. Add(output. Column. Name, Buffer. Manager. Find. Column. By. Lineage. ID(output. Buffer, output. Column. Lineage. ID)); } catch (Exception e) { … } #endregion Output column index mapping #SQLSat. Denmark
Read and store custom properties private List<String> folder. Patterns = new List<String>(); … #region Properties mapping try { foreach (IDTSCustom. Property 100 property in Component. Meta. Data. Custom. Property. Collection) { switch (property. Name) { case "Folder. Patterns": folder. Patterns. Add. Range(((String)property. Value). Split('|')); break; … } } } catch (Exception e) { … } #endregion Properties mapping #SQLSat. Denmark
Parse folder structure recursively § Split folder strings of the form C: folder 1folder 2** recursively and check for each recusing step matching subfolders until leading string empty. E. g. C: folder 1folder 2**, . folder 1folder 2**, C: folder 1 **, C: folder 1folder 2folder 3 afolder 4 aa C: folder 1folder 2folder 3 afolder 4 ab *, C: folder 1folder 2folder 3 b C: folder 1folder 2folder 3 afolder 4 ba C: folder 1folder 2folder 3 afolder 4 bb § § Then check for matching files after coming back from recursion. Sort after date #SQLSat. Denmark
Outlook § Add customized userinterface (Windows Forms) § Add connection manager (again with Windows Forms) § Add other source systems ftp/sftp/… #SQLSat. Denmark
Wanted! Data Engineering & Analytics is looking for consultants in the greater DK area, preferably close to Aarhus DK. To work with SQL/SSIS Data warehousing and supercomputing. Contact: maqis@vestas. com sedha@vestas. com #SQLSat. Denmark
BIG Thanks to SQLSat Denmark sponsors #SQLSat. Denmark
- Slides: 26