SSIS Best Practices Basic best practices for SQL
SSIS Best Practices Basic best practices for SQL Server Integration Services (SSIS) development
About the speaker Nate Locklin Senior Database Analyst, PPG Industries nate@locklin. org 2 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Best Practice #1: Logging § Two often used log providers § Text Files § SQL Server (sysssislog in 2008, sysdtslog 90 in 2005) – created when first used § Many events possible to log § On. Error and On. Task. Fail events most common § Can reduce detail to Computer, Source, and Message § Here you will find all the error messages: § http: //msdn. microsoft. com/en-us/library/ms 345164. aspx 3 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Best Practice #2: Configuration files § Properties of objects can be changed at runtime § No need to change the package itself to promote from test to production § Connection string is all that needs to change § Several formats: § XML configuration file most commonly used § For SQL Server 2012, Project Deployment Model is added 4 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Best Practice #3: Checkpoints § Checkpoints let a package start where it left off previously, such as when an error occurs in the middle of your package § Lets you fix the error, then restart the package, without it re-running the parts already done Demo developed by: Stephen Tirone 5 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Setting up for Checkpoints § Set the following properties of the Package § Checkpoint. File. Name = <some name> § Checkpoint. Usage = If. Exists § Save. Checkpoints = True § Set the following properties on all the tasks: § Fail. Package. On. Failure = True 6 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Checkpoint Caveats § Most importantly: § “A package can be restarted only at the control flow level. You cannot restart a package in the middle of a data flow. ” § Also: § “When a package is restarted the Foreach Loop containers and its child containers are run again. If a child container in the loop runs successfully, it is not recorded in the checkpoint file, instead it is rerun. ” 7 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Error handling § The On. Error handler gives you great flexibility in addressing problems: 8 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Design considerations § Get in & get out: Limit the data you are pulling from your source systems to just the columns and rows you need. § Stage your data: Copy your data to a staging data warehouse database and then perform your transformations. 9 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Other design considerations § Retain. Same. Connection § Making lots of connections? Use this, please! § Can be found as a property of the Connection Mgr § Packet size § Consider increasing (32, 767) for larger data § Row Per Batch Size § Sometimes speed is enhanced when setting the “Rows per batch” to something other than blank, like 1000, 5000 or 10, 000 (particularly when run through BIDS over the network) 10 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Retain same connection § Property of the Connection Manager: 11 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Packet size § On All page of the Connection Manager: 12 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Additional resources § Pragmatic. Works § www. pragmaticworks. com § BI Developer Network (BIDN) § www. bidn. com § Jamie Thompson’s SSIS blog § http: //sqlblog. com/blogs/jamie_thomson/ § Matt Masson’s blog § http: //mattmasson. com/ 13 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
Any questions? Please feel free to contact me at: nate@locklin. org Slides and demo files will be available for download at: http: //tinyurl. com/SSISBest. Practices 250 Also available for download on the SQLSaturday #250 website: http: //www. sqlsaturday. com/250/schedule. aspx 14 | 9/14/2013 | SQL Saturday #250 – Pittsburgh
- Slides: 14