Tracking the Enterprise Data Landscape
Todd Sicard
DAMA-MN, May 19, 2010
© 2010 Blue Cross and Blue Shield of Minnesota

Today’s Purpose
> Keeping track of hundreds of databases is difficult. Without a map, it means constant rediscovery (at best) or making mistakes (at worst).
> Today I’ll outline the metamodel of a rich topographic map of an enterprise-level data landscape, used to keep track of hundreds of databases.

Who Am I?
> Todd Sicard – Todd_Sicard@BlueCrossMN.com
> Started at Blue Cross in 1993
> Enterprise Data Architect 2004-2009
> Enterprise Architect 2010+
> CDMP

Goal
Create an overall model of
- what data is stored where,
- whose data it is,
- when it arrives,
- where it came from, and
- which technology it uses.
> Collect, don’t forget
> Keep it one-person simple
> Useful, Usable, Used

This isn’t column level or table level…
This is database-level metadata.
Breadth before depth
Accuracy before precision
It had to be one-person-able.

By doing this, you will be able to…
1) Understand
2) Manage
3) Leverage
> You can’t leverage what you don’t manage, and you can’t manage what you don’t understand!

Datastore
> A datastore is any electronic (?) repository of structured (?) information.
(Not all structured data is in a database)
(Not all important data is always electronic)
(Not all important data is structured)
> A list of all the logical names using the most common and accurate vernacular
> Data System: A collection of datastores.
– Composition: Essential to the definition.
– Aggregation: Non-essential to the definition, usually a collection of independent datastores.
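The composition/aggregation distinction can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names, the per-member `Membership` flag, and the example datastore names are all hypothetical and are not taken from the talk's metamodel.

```python
from dataclasses import dataclass, field
from enum import Enum


class Membership(Enum):
    COMPOSITION = "composition"  # member is essential to the system's definition
    AGGREGATION = "aggregation"  # member is independent; the grouping is a convenience


@dataclass(frozen=True)
class Datastore:
    name: str                # logical name in the common vernacular
    electronic: bool = True  # not all important data is electronic
    structured: bool = True  # not all important data is structured


@dataclass
class DataSystem:
    name: str
    members: dict[str, tuple[Datastore, Membership]] = field(default_factory=dict)

    def add(self, store: Datastore, membership: Membership) -> None:
        self.members[store.name] = (store, membership)

    def essential(self) -> list[str]:
        """Names of the datastores the system cannot be defined without."""
        return sorted(
            name
            for name, (_, membership) in self.members.items()
            if membership is Membership.COMPOSITION
        )


# Hypothetical names, for illustration only.
claims = DataSystem("Claims Processing")
claims.add(Datastore("Claims Intake DB"), Membership.COMPOSITION)
claims.add(Datastore("Paper Claim Archive", electronic=False), Membership.AGGREGATION)
print(claims.essential())
```

Keeping the `electronic` and `structured` flags on the datastore leaves room for the question marks in the definition: paper archives and unstructured stores can still be listed.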

Datastores
ARC-DB: "As-Received Claim Database"
Started in 1995 as an MS Access DB, then converted to an RDBMS. Contains 24 rolling months of claims data.
Business Owner: Warren Buffet
Business SME: Blarfengaar B.
Technical Owner: Bill Gates
Technical SME: Steve Hoberman
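As a rough sketch, a datastore entry like the one above could be held in a small record type with one field per question from the Goal slide (what, whose, when, where from, which technology). The field names and the `unanswered` helper are assumptions made for illustration; only the ARC-DB values come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class DatastoreRecord:
    """Database-level metadata for one datastore (no tables, no columns)."""
    name: str
    long_name: str
    description: str
    business_owner: str
    business_sme: str
    technical_owner: str
    technical_sme: str
    # Hypothetical facet fields, one per question the map should answer.
    subject_areas: list[str] = field(default_factory=list)      # "what"
    lines_of_business: list[str] = field(default_factory=list)  # "whose"
    states: list[str] = field(default_factory=list)             # "when"
    sources: list[str] = field(default_factory=list)            # "where from"
    technologies: list[str] = field(default_factory=list)       # "which technology"

    def unanswered(self) -> list[str]:
        """Questions that still have no recorded answer for this datastore."""
        facets = {
            "what": self.subject_areas,
            "whose": self.lines_of_business,
            "when": self.states,
            "where from": self.sources,
            "which technology": self.technologies,
        }
        return [question for question, values in facets.items() if not values]


arc_db = DatastoreRecord(
    name="ARC-DB",
    long_name="As-Received Claim Database",
    description="Started in 1995 as an MS Access DB, then converted to an RDBMS. "
                "Contains 24 rolling months of claims data.",
    business_owner="Warren Buffet",
    business_sme="Blarfengaar B.",
    technical_owner="Bill Gates",
    technical_sme="Steve Hoberman",
    technologies=["RDBMS"],
)
print(arc_db.unanswered())
```

A helper like `unanswered` supports "Collect, don’t forget": gaps in the map stay visible instead of being silently skipped.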

Information Models, Data Models
> “What” data does it contain?
Domain - DDM
Subject - SAM
Concept - CDM
Entity - LDM
Table - PDM

Information Models
Data Domain Model
The Claim Subject Area Model

Information Models: “Scope + 1”
The Claim Subject Area Model “Plus One”

A Datastore’s Subject Area Model (SAM)

“Line of Business”
> “Whose” data is it?
A poor name for a mix of stuff:
– Industry Subtypes
– Corporate Legal Entity Structure
– Product Lines
– Market Segments
– External Data Actors

LOBs…

Datastore LOB

Lineage - “Database, Flow.”
> “Where” does the data come from?
No matter:
– How it moves,
– How it’s transformed,
– How it’s rolled up,
– How big it is,
– How mangled it becomes…
I don’t care!
…It’s just a data flow

Lineage = “Database, Flow.”
Information moves from A to B… that’s all that matters!
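Treating lineage as nothing more than "A flows to B" means the whole landscape can be held as a list of pairs. The sketch below is one possible way to answer "where does the data come from?" from such a list. Apart from ARC-DB, the datastore names are hypothetical, and the function name is an assumption.

```python
from collections import defaultdict

# Each pair means "data flows from the first datastore to the second".
# How it moves or how it is transformed is deliberately not recorded.
flows = [
    ("Claims Intake", "ARC-DB"),
    ("ARC-DB", "Claims Warehouse"),
    ("Membership DB", "Claims Warehouse"),
    ("Claims Warehouse", "Finance Mart"),
]


def upstream(store: str, flows: list[tuple[str, str]]) -> set[str]:
    """Every datastore whose data eventually reaches `store`."""
    sources = defaultdict(set)
    for src, dst in flows:
        sources[dst].add(src)

    seen: set[str] = set()
    stack = [store]
    while stack:
        current = stack.pop()
        for src in sources.get(current, ()):
            if src not in seen:  # also guards against cycles in the flow list
                seen.add(src)
                stack.append(src)
    return seen


print(sorted(upstream("Finance Mart", flows)))
```

Because a flow carries no detail beyond its two endpoints, one person can keep the list current, which matches the "one-person-able" constraint.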

State
> “When” does the data arrive?
> The relevant lifecycle of a piece of important data with lots of processing.
> Generic lifecycle: 1. Creation, 2. Formation, 3. Maturity, 4. Destruction
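The generic lifecycle maps naturally onto an ordered enumeration. The following is a minimal sketch, assuming each datastore is tagged with the lifecycle states of the data it holds. The mapping and the datastore names other than ARC-DB are hypothetical.

```python
from enum import IntEnum


class LifecycleState(IntEnum):
    """The generic lifecycle, in order."""
    CREATION = 1
    FORMATION = 2
    MATURITY = 3
    DESTRUCTION = 4


# Hypothetical assignment of states to datastores, for illustration only.
datastore_states = {
    "Claims Intake": {LifecycleState.CREATION},
    "ARC-DB": {LifecycleState.CREATION, LifecycleState.FORMATION},
    "Claims Warehouse": {LifecycleState.MATURITY},
}


def stores_holding(state: LifecycleState, mapping: dict) -> list[str]:
    """Datastores that hold data in the given lifecycle state."""
    return sorted(name for name, states in mapping.items() if state in states)


def earliest_state(name: str, mapping: dict) -> LifecycleState:
    """The earliest lifecycle state at which data arrives in a datastore."""
    return min(mapping[name])


print(stores_holding(LifecycleState.CREATION, datastore_states))
print(earliest_state("ARC-DB", datastore_states).name)
```

Using `IntEnum` keeps the states comparable, so "when does the data arrive?" reduces to taking the minimum state recorded for a datastore.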

Datastore State

Technology

Datastore Technology

Deployment: Servers, Instances, etc.
> I didn’t go there
> Why not?
– “One-person-able”
– Breadth before depth.
– Accuracy before precision.
– Understand, Manage, then Leverage
– Manage information at the enterprise level
> But it sure would be nice… maybe later

The Metamodel (UML Model)

The Metamodel (ER Model)

Drawing the Pictures
> Datastore-centric: LOB, SAM, Lineage, Tech, State, Composition
> Reference (Process POV): Claims Data Flow
> Project POV: In-scope Datastores (Scope + 1)
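The project view could be derived from the lineage data rather than drawn by hand. The sketch below assumes that "Scope + 1" means the in-scope datastores plus every datastore one flow away from them. That reading is an interpretation, not something the slides spell out, and the names other than ARC-DB are hypothetical.

```python
def scope_plus_one(in_scope, flows):
    """Return (in-scope datastores, their direct neighbours outside the scope)."""
    in_scope = set(in_scope)
    plus_one = set()
    for src, dst in flows:
        if src in in_scope and dst not in in_scope:
            plus_one.add(dst)  # downstream consumer just outside the scope
        elif dst in in_scope and src not in in_scope:
            plus_one.add(src)  # upstream source just outside the scope
    return in_scope, plus_one


# Hypothetical flow list: each pair means "data flows from A to B".
flows = [
    ("Claims Intake", "ARC-DB"),
    ("ARC-DB", "Claims Warehouse"),
    ("Membership DB", "Claims Warehouse"),
    ("Claims Warehouse", "Finance Mart"),
]

scope, neighbours = scope_plus_one({"ARC-DB"}, flows)
print(sorted(scope), sorted(neighbours))
```

Drawing the neighbours alongside the scope shows a project what it touches at its edges without pulling the whole landscape into the picture.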

Potential Users
> Warehouse architects
> Data modelers
> Data stewards
> DBAs
> Data leadership
> Enterprise architects
> Business continuity planners
> Disaster recovery planners
> Testers
> Internal audit
> Corporate attorneys
> Security architects

Enough talking… let’s see it.

The Tool
But only from a vendor-neutral perspective…
> Sparx Enterprise Architect
– Corporate Edition, Standard License
– www.sparxsystems.com.au

Thank you!
Todd_Sicard@BlueCrossMN.com