Custom Auditing in SSIS Meghana Vasavada Contents A

  • Slides: 39
Download presentation
Custom Auditing in SSIS - Meghana Vasavada

Custom Auditing in SSIS - Meghana Vasavada

Contents A bit about Myself Ø SQL Server Business Intelligence Ø SQL Server Integration

Contents A bit about Myself Ø SQL Server Business Intelligence Ø SQL Server Integration Services (SSIS) Ø v. Essential Aspects in Data Warehousing v. Key Components of Package Auditing v. Performance tuning Demo : Custom Auditing Ø Take away Ø References/Resources Ø Questions Ø Evaluation Ø

Briefing Myself SQL Server Database Developer and Analyst Big Data Certified M. S. Bioinformatics

Briefing Myself SQL Server Database Developer and Analyst Big Data Certified M. S. Bioinformatics (2013)

SQL Server Business Intelligence: A set of processes, tools, technologies and methodologies that consolidate

SQL Server Business Intelligence: A set of processes, tools, technologies and methodologies that consolidate raw data scattered across functional areas and transform the data into meaningful information in order to gain insight for driving strategic or tactical decision making.

SQL Server Business Intelligence Microsoft Integration Services Analysis Services Reporting Services SQL Server Integrate

SQL Server Business Intelligence Microsoft Integration Services Analysis Services Reporting Services SQL Server Integrate l l Data acquisition from source systems and integration Data transformation and synthesis SQL Server Analyze l l Data enrichment, with business logic, hierarchical views Data discovery via data mining Report l l Data presentation and distribution Data access for the masses

SQL Server Business Intelligence How does BI help Business? ◦ Fast and fact based

SQL Server Business Intelligence How does BI help Business? ◦ Fast and fact based strategic, tactical, informed decision making in timely manner ◦ Align the organization towards its key objectives ◦ Keep track of progress over the period of time ◦ Decreasing operational costs and improve operational efficiency ◦ Improving operational efficiency of each operational area, for example to understand what your true manufacturing costs are and how to optimize them ◦ Identifying cross-selling and up-selling opportunities for the business ◦ Understanding how the business is doing over a period of time, i. e. where it has been, where it is now and where it is going

SQL Server Business Intelligence Who needs Business Intelligence? Top Level Executive and Business Decision

SQL Server Business Intelligence Who needs Business Intelligence? Top Level Executive and Business Decision Makers Middle Level Management Line Managers BUSINESS INTELLIGENCE Business Analysts External Customers /Vendors

SQL Server Integration Services (SSIS) SQL Server Integration Services? �A component of the Microsoft

SQL Server Integration Services (SSIS) SQL Server Integration Services? �A component of the Microsoft SQL Server database software. �A Microsoft SQL Server Business Intelligence Application. �A fast and flexible data warehousing tool used for data extraction, transformation, and loading (ETL) �The Platform for a new generation of high performance Data Integration Technologies.

SQL Server Integration Services (SSIS) Customer benefits of SSIS § Performance: § Data flows

SQL Server Integration Services (SSIS) Customer benefits of SSIS § Performance: § Data flows process large volumes of data efficiently § Facility: § Many prebuilt adapters and transformations reduce hand coding § Highly productive visual environment § Data cleansing features § Data mining

SQL Server Integration Services (SSIS) ETL Package Configuration s Package Deployment Basic aspects in

SQL Server Integration Services (SSIS) ETL Package Configuration s Package Deployment Basic aspects in Datawarehouse Integration Logging / Auditing Error Handling Event Handling / tracking Events

SQL Server Integration Services (SSIS) ETL = Extract – Transform – Load Sourc e

SQL Server Integration Services (SSIS) ETL = Extract – Transform – Load Sourc e Destination Get the data from source system as efficiently as possible Load the data in the target storage Perform Calculations on the data

SQL Server Integration Services (SSIS) Dimensional Data Modeling: q Data structure technique optimized for

SQL Server Integration Services (SSIS) Dimensional Data Modeling: q Data structure technique optimized for Data warehousing tools. q The concept developed by Ralph Kimball q Comprised of "fact" and "dimension" tables. q Designed to read, summarize, analyze numeric information like values, balances, counts, weights, etc. in a data warehouse. q Used in data warehouse systems

SQL Server Integration Services (SSIS) Components of Dimensional Data Modeling: Star Schema Dim_Product_ID Product_nam

SQL Server Integration Services (SSIS) Components of Dimensional Data Modeling: Star Schema Dim_Product_ID Product_nam e Attributes Fact Table Dim_Customer_ID Name Title Phone Relationships Fact _Sales Customer_ID Product_ID Order_ID Shipping_ID Sales Amount Dim_Shipment Shipping_ID Ship_date Ship_country Dim_Order_ID Unit_Price Order_date Dimension Tables

SQL Server Integration Services (SSIS) Slowly Changing Dimensions (SCDs)

SQL Server Integration Services (SSIS) Slowly Changing Dimensions (SCDs)

SQL Server Integration Services (SSIS) SSIS Package Configuration: v A configuration is a property/value

SQL Server Integration Services (SSIS) SSIS Package Configuration: v A configuration is a property/value pair that you add to a completed package. v Used to update the values of properties at run time. v For example, by using a configuration, you can change the connection string of a connection manager, or update the value of a variable.

SQL Server Integration Services (SSIS) Types of SSIS Package Configuration: Configuration Types Use XML

SQL Server Integration Services (SSIS) Types of SSIS Package Configuration: Configuration Types Use XML Configuration File Saves the configuration settings in XML file. Most common approach while deploying reports. Environment Variable Allows you to use the Environment variables to save the configuration settings, or package variables values. Registry Entry Allows to save the configuration settings in Registry. Parent Package Variable Saves the configuration settings in a variable. You can use this variable to update properties in child packages. Example SQL Server Configuration settings will be saved in a table present in the SQL Server Database.

SQL Server Integration Services (SSIS) Error Handling in SSIS Package task failed !!!! Reasons

SQL Server Integration Services (SSIS) Error Handling in SSIS Package task failed !!!! Reasons Failure of ancestors control � Truncation or source/destination connection issue � Conversion failure � Issues from migrating files and files data � Package has an error due to privileges of OS controls � Other failure reasons �

SQL Server Integration Services (SSIS) Error Handling in SSIS Control Flow: ü Add a

SQL Server Integration Services (SSIS) Error Handling in SSIS Control Flow: ü Add a failure constraint and redirects the workflow to an alternate task. Data Flow: ü Send the row out to an error path by configuring the error output of the Source/ Destination/Transformation as redirect to error row and save it to review later. ü Use Event Handler: On. Error event in a separate window. Write custom script or just send an email to a team to notify the error.

SQL Server Integration Services (SSIS) Logging in SSIS: Log providers Log Providers Description Text

SQL Server Integration Services (SSIS) Logging in SSIS: Log providers Log Providers Description Text File log provider writes log entries to ASCII text files in a comma-separated value (CSV) format (. log) SQL Server Profiler log provider writes traces that you can view using SQL Server Profiler (. trc) SQL Server log provider writes log entries to the sysssislog table in a SQL Server database Windows Event log writes entries to the Application log in provider the Windows Event log on the local

SQL Server Integration Services (SSIS) Event Handling in SSIS Integration Services packages are event-driven.

SQL Server Integration Services (SSIS) Event Handling in SSIS Integration Services packages are event-driven. This means we can specify routines to execute when a particular event occurs. An event can be the completion of a task or an error that occurs during task execution. Event Handlers Description On. Error Generated as the result of an error condition On. Pre. Validate Fired before Validation process starts On. Query. Cancel Fired when user clicks on cancel or during an executable to determine whether it should stop running On. Task. Failed Signals the failure of a task and typically follows On. Error Event On. Pre. Execute Indicates that an executable component is about to be launched On. Post. Execute Takes place after an executable component finishes running

SQL Server Integration Services (SSIS) SSIS Auditing � Add task(s) to the Event Handlers

SQL Server Integration Services (SSIS) SSIS Auditing � Add task(s) to the Event Handlers of the package. � Select auditing for the entire package or for a specific task. � Select events On. Error, On. Post. Execution, On. Variable. Value. Changed etc. � Inside every dataflow task add row count components after source and before target to track extracted and loaded row count. � Add Variables at package level scope to store rowcount for each dataflow. � Add Variables in the On. Post. Execute event handler scope to store certain information about Data. Flow source/target (e. g. Query, Table. Name. . . )

SQL Server Integration Services (SSIS) SSIS Auditing: Helps to answer the following questions �

SQL Server Integration Services (SSIS) SSIS Auditing: Helps to answer the following questions � Which package was run and for how long? � Who owns the package? Or who modified the package? � When was the package executed? � What kind of data and how many records were written or changed by ETL? � What kind of errors and how many errors occurred?

SQL Server Integration Services (SSIS) Benefits that Custom Auditing and Logging can bring to

SQL Server Integration Services (SSIS) Benefits that Custom Auditing and Logging can bring to your ETL process Help you provide regulatory compliance Ø Provide a deep understanding of database activity and additional insight into data anomalies that may occur Ø Can help answer important questions like, “When was that row last updated? ” Ø help you identify specific data for targeted rollbacks Ø

SQL Server Integration Services (SSIS) Logging vs Custom Auditing Logging o Captures Metadata- Information

SQL Server Integration Services (SSIS) Logging vs Custom Auditing Logging o Captures Metadata- Information about package execution itself Custom Auditing o Captures Information about data, along with package metadata o Errors encountered, Execution o Row counts of Extracts, Inserts, Time for package, Data bytes, Updates, Deletes and Errors, you data flow buffer details, Machine can default the status of the name, Package name, Task name changed rows and package execution o Choose the Log providers and its o Use Execute SQL task to define location variables , parameter binding, and assigning values to the parameters o Provides limited information on the o Designed for DBAs/ Users who package can query and request for more

SQL Server Integration Services (SSIS) Question: If Change Data Capture (CDC) reads and tracks

SQL Server Integration Services (SSIS) Question: If Change Data Capture (CDC) reads and tracks every Historical data and net changes from SQL Server transaction logs, then whey not use CDC for Auditing?

SQL Server Integration Services (SSIS) Answer: There’s a downside to this 1. The amount

SQL Server Integration Services (SSIS) Answer: There’s a downside to this 1. The amount of history data can become huge fast 2. Does not return all information about the changes you might need, for e. g. who made the change, when and how? (when a record is deleted or updated) 3. Delay possible: The history data takes some time to catch up, because it is based on the transaction logs and the operation is asynchronous. 4. It depends on the SQL Server Agent. If the Agent is not running or crashes, no history is being tracked.

SQL Server Integration Services (SSIS) Longer running package? How would you optimize the Package

SQL Server Integration Services (SSIS) Longer running package? How would you optimize the Package Execution? ü ü ü ü Parallel execution of SSIS tasks In case of Incremental load, Use Execute SQL task instead of OLEDB Command transformation to process the updated/ new inserts Avoid processing the redundant columns in the data flow task Keep notice of buffers and execution tree Avoid using checkpoints while auditing SSIS package as they cannot store variables as Type objects and cannot integrate with, or are most often ignored by event handlers Use Lookup, Conditional split to customize the SCD work flow Enable Error handling and logging on package failure

SQL Server Integration Services (SSIS) Key Components in SSIS Package Custom Auditing ETL: Data

SQL Server Integration Services (SSIS) Key Components in SSIS Package Custom Auditing ETL: Data warehouse tables (Staging, Dimensions, Facts, Data marts) Audit SSIS Package Slowly Changing Dimensions Type 1/2 Extract Meta data and Row counts (DML Operations) Parent-Package Configuration On-Error Event Handler to Capture Error message

Demo An Audit table was structured by defining the components with the required information

Demo An Audit table was structured by defining the components with the required information regarding the Metadata and the transactional records. Attributes Definition Audit Key A global unique ID assigned to every execution of ETL package or packages in the target table. Usually it is an auto identity integer starting from ‘ 1’ Parent. Auditkey Surrogate ID assigned to the execution of child packages as metadata. ID inherited from the audit key of master package. Batch/ Load ID mapped with the process of every loads/ updates Pkg. Name assigns the Name of every Corresponding Package executed including Master package and child packages. Pkg. ID Internal GUID of every SSIS package

Demo (contd. ) Attributes Definition Exec. Stop. DT End time of package execution Table.

Demo (contd. ) Attributes Definition Exec. Stop. DT End time of package execution Table. Name Assigns the table name when package executes to define or populate a table. Pkg. Name assigns the Name of every Corresponding Package executed including Master package and child packages. Execution. Instance. GUID Contains the Global Unique ID for every process, generated by SSIS Extract. Row. Cnt Contains the Count of records extracted from the source file Insert. Row. Cnt Contains the Count of Inserted Records in Staging and Dimension tables by the ETL process. Update. Row. Cnt: Contains the count of updated records, especially in dimension Tables which uses the SCD Type II functionality.

Demo (contd. ) Attributes Definition Error. Row. Cnt Contains the records which were erroneous

Demo (contd. ) Attributes Definition Error. Row. Cnt Contains the records which were erroneous or not processed in ETL Table. Initial. Row. Cnt Contains the number of records initially existing in any staging and Dimension table Table. Final. Row. Cnt Contains the total number of records in any staging and Dimension table after inserted , updated or deleted in the execution process Delete. Row. Cnt Contains the number records deleted in the ETL process Successfully. Processing. In determines the status of execution of every d ETL. If successfully executed then sets ‘Y’. Default is set to ‘N’.

Demo (contd. ): Workflow 1. Master package (Capturing Meta data) Execute SQL task that

Demo (contd. ): Workflow 1. Master package (Capturing Meta data) Execute SQL task that checks for the number of rows affected and inserts a Temporary Dummy row in the beginning of the Audit table Add Audit key and Parent Audit Key variables and add Execute SQL task that Stores the highest value of Audit key in the parameter. Add Execute SQL task which populates the Meta data in the Audit table. In the Task Editor connect to the target database and write a T-SQL query to insert the Meta data to the defined Parameters including Audit and Parent audit keys

Demo (contd. ): Workflow 1. Master package (calling child packages in data warehouse) Add

Demo (contd. ): Workflow 1. Master package (calling child packages in data warehouse) Add the Execute Package Task to call the Child package (e. g Audit package of a Staging table) which processes the loading and auditing of the first Staging table Add the Execute Package Task to call the Child package (e. g Audit package of a Dimension table) which processes the loading and auditing of the firs Dicmension table updates the Metadata specifically the end time of the execution and success status of the execution process in the audit table

Take Aways § SQL Server Integration Services is an exceptionally high performance integration and

Take Aways § SQL Server Integration Services is an exceptionally high performance integration and transformation tool § Customize the Auditing using Execute SQL tasks, Row counts, parameters, system package variables to capture transactional information and metadata (DMLs) § Recommended using an alternative to SCD transform component to preserve historical record and counts in Dimension tables if implementing SCD Type 2 § Implement Error capturing strategies in Data flow and Control flow tasks § Unless required, limit the use of Event handlers to On. Error and/ or On. Task. Failed events as it carries a large I/O overhead and can slow the application performance dramatically § Use Script component and Script task to customize the error information at the package level and at individual task flow

Resources/ References v Microsoft SQL Server community on the web (www. sqlservercentral. com )

Resources/ References v Microsoft SQL Server community on the web (www. sqlservercentral. com ) v SQL Server Integration Services on the web (www. sqlis. com) v Custom Logging and Auditing (https: //key 2 consulting. com/custom-ssis-logging-and- auditing/ v https: //www. mssqltips. com/sqlservertip/5679/capturingsql-server-integration-services-package-errors-usingonerror-event-handlers/ v https: //docs. microsoft. com/en-us/sql/integrationservices/performance/integration-services-ssislogging? view=sql-server-2017

Questions Email: msvasavada@gmail. com Linkedin: https: //www. linkedin. com/in/meghana-vasavada 4 a 654 b 2

Questions Email: msvasavada@gmail. com Linkedin: https: //www. linkedin. com/in/meghana-vasavada 4 a 654 b 2 a/

Evaluation How did we do? Please complete an Evaluation to provide feedback to our

Evaluation How did we do? Please complete an Evaluation to provide feedback to our speakers!

Sponsors and Raffle Please make sure to visit and thank our fantastic sponsors for

Sponsors and Raffle Please make sure to visit and thank our fantastic sponsors for their valuable assistance in putting on this event.