Tracking the Enterprise Data Landscape Todd Sicard DAMA-MN

- Slides: 29
Tracking the Enterprise Data Landscape Todd Sicard DAMA-MN May 19, 2010 © 2010 Blue Cross and Blue Shield of Minnesota
Today’s Purpose > Keeping track of hundreds of databases is difficult. Without a map, you face constant rediscovery (at best) or mistakes (at worst). > Today I’ll outline the metamodel of a rich topographic map of an enterprise-level data landscape that keeps track of those hundreds of databases. 2
Who Am I? > Todd Sicard – Todd_Sicard@BlueCrossMN.com > Started at Blue Cross in 1993 > Enterprise Data Architect 2004–2009 > Enterprise Architect 2010+ > CDMP 3
Goal Create an overall model of - what data is stored where, - whose data it is, - when it arrives, - where it came from, and - which technology it uses. >Collect, don’t forget >Keep it one-person simple >Useful, Usable, Used 4
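The five questions in the goal can be sketched as one record per datastore. This is a minimal Python illustration, not the presenter's metamodel: the class `DatastoreProfile`, its field names, the helper `owned_by`, and the sample values "Commercial", "Medicare", "Claims Intake", and "Claims Warehouse" are all hypothetical. Only ARC-DB comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class DatastoreProfile:
    """One database-level record; every field name here is an assumption."""
    name: str
    subject_areas: set = field(default_factory=set)      # what data is stored
    lines_of_business: set = field(default_factory=set)  # whose data it is
    states: set = field(default_factory=set)             # when it arrives
    sources: set = field(default_factory=set)            # where it came from
    technology: str = "unknown"                          # which technology it uses


def owned_by(profiles, line_of_business):
    """Return the sorted names of datastores holding data for one line of business."""
    return sorted(p.name for p in profiles if line_of_business in p.lines_of_business)


# Hypothetical catalog entries, for illustration only.
catalog = [
    DatastoreProfile("ARC-DB", {"Claim"}, {"Commercial"},
                     {"Creation"}, {"Claims Intake"}, "RDBMS"),
    DatastoreProfile("Claims Warehouse", {"Claim", "Member"},
                     {"Commercial", "Medicare"}, {"Maturity"}, {"ARC-DB"}, "RDBMS"),
]

print(owned_by(catalog, "Medicare"))
```

Keeping the record this flat is in the spirit of "one-person simple": each facet is a small set of names rather than a detailed sub-model.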
This isn’t column level or table level… This is database-level metadata. Breadth before depth Accuracy before precision It had to be one-person-able. 5
By doing this, you will be able to… 1) Understand 2) Manage 3) Leverage > You can’t leverage what you don’t manage, and you can’t manage what you don’t understand! 6
Datastore > A datastore is any electronic (?) repository of structured (?) information. (Not all structured data is in a database) (Not all important data is always electronic) (Not all important data is structured) > The datastore list holds every logical name, using the most common and accurate vernacular > Data System: A collection of datastores. – Composition: Essential to the definition. – Aggregation: Non-essential to the definition, usually a collection of independent datastores. 7
Datastores ARC-DB “As-Received Claim Database” Started in 1995 as an MS Access DB, then converted to RDBMS. Contains 24 rolling months of claims data. Business Owner: Warren Buffet Business SME: Blarfengaar B. Technical Owner: Bill Gates Technical SME: Steve Hoberman 8
Information Models > “What” data does it contain? Data Models: Domain - DDM, Subject - SAM, Concept - CDM, Entity - LDM, Table - PDM 9
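The difference between composition and aggregation in a data system can be sketched as follows. This is a hedged Python illustration: `Membership`, `Datastore`, `DataSystem`, and the "Ad hoc extract" member are hypothetical names chosen for the example, not part of the presented model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Membership(Enum):
    """How a datastore belongs to a data system (hypothetical names)."""
    COMPOSITION = "composition"  # essential to the system's definition
    AGGREGATION = "aggregation"  # independent datastore, grouped for convenience


@dataclass(frozen=True)
class Datastore:
    name: str                # logical name in the common vernacular
    electronic: bool = True  # not all important data is electronic
    structured: bool = True  # not all important data is structured


@dataclass
class DataSystem:
    name: str
    members: dict = field(default_factory=dict)  # datastore name -> (Datastore, Membership)

    def add(self, store, membership):
        """Register a datastore together with the way it belongs to this system."""
        self.members[store.name] = (store, membership)

    def essential(self):
        """Sorted names of the datastores the system cannot be defined without."""
        return sorted(name for name, (_, kind) in self.members.items()
                      if kind is Membership.COMPOSITION)


claims = DataSystem("Claims")
claims.add(Datastore("ARC-DB"), Membership.COMPOSITION)
claims.add(Datastore("Ad hoc extract", structured=False), Membership.AGGREGATION)
print(claims.essential())
```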
Information Models Data Domain Model The Claim Subject Area Model 10
Information Models: “Scope + 1” The Claim Subject Area Model “Plus One” 11
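One way to read “Scope + 1” is: take the in-scope items and add everything exactly one relationship away. The Python sketch below assumes that reading. The function name `scope_plus_one` and the subject areas other than Claim are hypothetical.

```python
def scope_plus_one(in_scope, relationships):
    """Return the in-scope items plus every item directly related to one of them."""
    scope = set(in_scope)
    result = set(scope)
    for a, b in relationships:
        # A relationship is treated as undirected: either end can pull in the other.
        if a in scope:
            result.add(b)
        if b in scope:
            result.add(a)
    return result


# Hypothetical subject-area relationships around Claim.
relationships = [
    ("Claim", "Member"),
    ("Claim", "Provider"),
    ("Member", "Group"),
    ("Product", "Group"),
]

print(sorted(scope_plus_one({"Claim"}, relationships)))
```

Group and Product are two or more hops from Claim, so they stay off the picture. That keeps the diagram bounded while still showing what the scope touches.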
A Datastore’s Subject Area Model (SAM) 12
“Line of Business” >“Whose” data is it? A poor name for a mix of stuff: –Industry Subtypes –Corporate Legal Entity Structure –Product Lines –Market Segments –External Data Actors 13
LOBs… 14
Datastore LOB 15
Lineage - “Database, Flow.” > “Where” does the data come from? No matter: – How it moves, – How it’s transformed, – How it’s rolled up, – How big it is, – How mangled it becomes… I don’t care! …It’s just a data flow 16
Lineage = “Database, Flow.” Information moves from A to B… that’s all that matters! 17
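If every flow is recorded as nothing more than a (source, target) pair, lineage questions reduce to graph traversal. A minimal Python sketch follows. The function `upstream` and all datastore names except ARC-DB are hypothetical.

```python
def upstream(flows, target):
    """Return every datastore whose data eventually reaches `target`."""
    # Index the flows by destination so each lookup is a single dict access.
    sources = {}
    for src, dst in flows:
        sources.setdefault(dst, set()).add(src)

    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for src in sources.get(node, ()):
            if src not in seen:  # the seen-set also guards against cycles
                seen.add(src)
                stack.append(src)
    return seen


# Each flow says only "information moves from A to B".
flows = [
    ("Claims Intake", "ARC-DB"),
    ("ARC-DB", "Claims Warehouse"),
    ("Membership DB", "Claims Warehouse"),
    ("Claims Warehouse", "Finance Mart"),
]

print(sorted(upstream(flows, "Claims Warehouse")))
```

Nothing about transport, transformation, or volume is stored, which is the point of the slide: the pair alone is enough to answer “where does this data come from?”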
State > “When” does the data arrive? > The relevant lifecycle of a piece of important data with lots of processing. > Generic lifecycle: 1. Creation, 2. Formation, 3. Maturity, 4. Destruction 18
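The generic lifecycle can be written down as an ordered enumeration, and each datastore tagged with the states it holds. This is a sketch under assumptions: `DataState`, `datastores_holding`, and the sample holdings are hypothetical, and a real lifecycle would use business-specific state names.

```python
from enum import IntEnum


class DataState(IntEnum):
    """Generic lifecycle states, numbered in the order data passes through them."""
    CREATION = 1
    FORMATION = 2
    MATURITY = 3
    DESTRUCTION = 4


def datastores_holding(state, holdings):
    """Return the sorted names of datastores that hold data in the given state."""
    return sorted(name for name, states in holdings.items() if state in states)


# Hypothetical mapping of datastore name -> lifecycle states it contains.
holdings = {
    "ARC-DB": {DataState.CREATION},
    "Claims Warehouse": {DataState.FORMATION, DataState.MATURITY},
    "Archive": {DataState.MATURITY},
}

print(datastores_holding(DataState.MATURITY, holdings))
```

Using `IntEnum` keeps the states comparable, so `DataState.CREATION < DataState.MATURITY` holds and “which datastores see the data earliest” becomes a simple comparison.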
Datastore State 19
Technology 20
Datastore Technology 21
Deployment: Servers, Instances, etc. > I didn’t go there > Why not? – “One-person-able” – Breadth before depth. – Accuracy before precision. – Understand, Manage, then Leverage – Manage information at the Enterprise-level > But it sure would be nice… maybe later 22
The Metamodel (UML Model) 23
The Metamodel (ER Model) 24
Drawing the Pictures > Datastore-centric: LOB, SAM, Lineage, Tech, State, Composition > Reference (Process POV) Claims Data Flow > Project POV In-scope Datastores (Scope + 1) 25
Potential Users > Warehouse architects > Data modelers > Data stewards > DBAs > Data leadership > Enterprise architects > Business continuity planners > Disaster recovery planners > Testers > Internal audit > Corporate attorneys > Security architects 26
Enough talking… let’s see it. 27
The Tool But only from a vendor-neutral perspective… > Sparx Enterprise Architect – Corporate Edition, Standard License – www.sparxsystems.com.au 28
Thank you! Todd_Sicard@BlueCrossMN.com 29