Office of the Chief Information Officer NLIT 08

Office of the Chief Information Officer
NLIT 08 Summit: National Labs Information Technology 2008 Summit
EA Mesodata: The Key to Unlocking EA Benefits
Bruce Gras, Contractor, DOE OCIO EA Team
May 12, 2008

Agenda
  • Enterprise Architecture at DOE
  • EA Mesodata is a “Rosetta Stone”
  • EA Mesodata DOE Example: The IT Portfolio – The Analysis in 5 Phases
  • EA Mesodata DOE Example: The Bottom Line – The EA Mesodata Decision Matrix and List
  • Summary: The EA Benefits derived from EA Mesodata in DOE Example

Enterprise Architecture at DOE
The DOE Target EA will be built up over time by assembling DOE Segment Architectures.

A Current Dilemma
Our different Architectures need to communicate, as shown by the arrows, but they use different expressions to describe the SAME or SIMILAR artifacts.
[Diagram: Segment BASELINE Architectures “A” through “D” and Segment TARGET Architectures “A” through “D”, together with the DOE BASELINE EA and the DOE TARGET EA. Arrow labels: Segment Baselines to DOE Baseline; Segment Baselines to Segment Targets; Segment Targets to DOE Target; DOE Baseline to DOE Target.]

The Result of this Current Dilemma
It is very hard to use our Architectures to lower costs, avoid costs or improve mission performance by:
  1. Reusing artifacts (instead of buying new ones that do the same thing)
  2. Repurposing artifacts (instead of disposing of them)
  3. Refitting artifacts to upgrade them (instead of buying whole new ones)
  4. Reducing redundancies (by identifying shared resource opportunities)
  5. Retiring legacy systems safely (by identifying dependencies)
JUST FITTING TOGETHER IS NOT ENOUGH; THINGS HAVE TO WORK TOGETHER TOO

A Possible Solution to the Current Dilemma
We could use a “Rosetta Stone” that reliably translates the expressions of one Architecture into those of the other when they are mapped for comparison.
[Diagram: SEGMENT “B” and SEGMENT “C”]

EA Mesodata is a “Rosetta Stone”
EA Metadata is:
  • an IDENTIFIER kind of data (i.e., a “reference” data type)
  • ATTRIBUTES of individual artifacts
  • STATIC in nature
EA Mesodata is:
  • a RELATIONSHIP kind of data (i.e., a “tuple” data type)
  • STATES of artifact relationships
  • DYNAMIC in nature
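One way to picture the metadata/mesodata contrast above is as two record types. This is only an illustrative sketch: the class names, field names, and sample values are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactMetadata:
    """Metadata: attributes of ONE artifact, reached through its identifier."""
    artifact_id: str    # hypothetical identifier (the "reference")
    mission_area: str   # hypothetical attribute of the artifact


@dataclass(frozen=True)
class Mesodatum:
    """Mesodata: the state of a relationship BETWEEN two artifacts (a tuple)."""
    source_id: str
    relation: str       # e.g. "SIMILAR TO"
    target_id: str


# Two artifacts described by metadata, and one mesodatum relating them.
a = ArtifactMetadata("A", "Science")
b = ArtifactMetadata("B", "Science")
link = Mesodatum(a.artifact_id, "SIMILAR TO", b.artifact_id)
print(link)
```

The metadata records stay the same as long as the artifacts do; the mesodatum would be re-derived whenever the comparison between the two artifacts changes.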

EA Mesodata is a class of “Latent Variables”
  • Latent Variables cannot be directly observed or measured, but rather are inferred from other variables that are directly observable and measurable.
  • Typical Examples: Quality of Life, Morale, Consumer Confidence, Happiness

Basic EA Mesodata Analysis is based on CHAID
  • Latent Class Modeling (LCM) relates a set of observed discrete multivariate variables to a set of Latent Variables.
  • Probabilistic Latent Semantic Analysis (PLSA) is a form of LCM, adding a sounder probabilistic model.
  • Chi-Squared Automatic Interaction Detection (CHAID) is a recursive partitioning statistical method for categorical dependent variables (latent or otherwise), and the current best of breed version is a hybrid with LCM and PLSA.
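For reference, the textbook forms behind two of the methods named above can be written as follows. These are the standard formulations, not formulas taken from the slides, and the symbols are introduced here for illustration: $c$ is a latent class, $x_1,\dots,x_J$ are the observed categorical variables, and $O_{ij}$, $E_{ij}$ are the observed and expected counts in a contingency table with row totals $R_i$, column totals $C_j$ and grand total $N$.

```latex
% Latent class model: observed variables are independent given the latent class
P(x_1,\dots,x_J) \;=\; \sum_{c=1}^{C} P(c)\,\prod_{j=1}^{J} P(x_j \mid c)

% Pearson chi-squared statistic used by CHAID to rate a candidate split
\chi^2 \;=\; \sum_{i}\sum_{j} \frac{(O_{ij}-E_{ij})^2}{E_{ij}},
\qquad E_{ij} \;=\; \frac{R_i\,C_j}{N}
```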

Advanced EA Mesodata Analysis at the University of Idaho
Dr. Milos Manic and Kevin McCarty
Research Program Highlights
  • Adaptive Query Engine for deeper analyses including:
      - Complex semantic transformations
      - Polymorphic hermeneutic equivalences
  • GUI Development for intuitive user data visualization including:
      - “in-use” graphical workspaces of findings for evaluations
      - “ad-hoc” orientations and associations of data for discovery
  • Advanced Data Mining Techniques for resilient analyses providing:
      - Noise reduction while finding hidden relationships (simultaneously decreasing incidents of Type I & II errors)
      - Natural Language Processing with Fuzzy Classifiers to automate hermeneutic resolutions
      - Bayesian Networks to specifically assume semantic non-stationarity

Basic EA Mesodata Analytical Process
I. ANALYTICAL PROBLEM STATEMENT: specify what it is that the analysis is expected to accomplish
II. ANALYTICAL METHODOLOGY PHASES:
  1. Discovery: identify effective latent variable(s)
  2. Definition: establish possible variable categories
  3. Design: determine potential observable attributes
  4. Development: create partitioning with CHAID
  5. Deployment: implement in decision tree

EA Mesodata DOE Example: The Analysis in 5 Phases
I. ANALYTICAL PROBLEM STATEMENT: specify what it is that the analysis is expected to accomplish.
OUR DOE EXAMPLE FOR THIS DISCUSSION: IDENTIFY POTENTIAL REUSE, REPURPOSING, REFITTING, REDUNDANCY, AND RETIREMENT IN THE IT INVESTMENT PORTFOLIO

EA Mesodata DOE Example & the DOE Strategic Portfolio Review (SPR)
This is where this EA Mesodata DOE Example fits into the SPR process.

EA Mesodata DOE Example Scope
  • In the DOE IT BY 2009 portfolio, there are 1,061 individual investments.
  • As a result, there are 1,125,721 pairwise relationships to assess.
  • Even if we cut that in half by using bi-lateral relationship definitions, we still have 561,798 pairwise relationships to figure out.
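The scale of the problem can be checked with a few lines of arithmetic. The sketch below is an illustration, not the presenter's calculation: it counts ordered pairs (1,061 × 1,061, which matches the 1,125,721 quoted above) and unordered pairs without self-comparisons. The latter comes to 562,330, slightly more than the 561,798 on the slide; the slide does not say how its figure was derived, so the difference may reflect exclusions that are not stated. The function name is hypothetical.

```python
def pair_counts(n: int) -> dict:
    """Count pairwise comparisons among n investments."""
    return {
        # every (A, B) combination, including A compared with itself
        "ordered_with_self": n * n,
        # each unordered pair {A, B} counted once, no self-pairs
        "unordered_without_self": n * (n - 1) // 2,
    }


counts = pair_counts(1061)
print(counts["ordered_with_self"])       # 1125721
print(counts["unordered_without_self"])  # 562330
```

Either way, the number of comparisons grows roughly with the square of the portfolio size, which is why manual assessment is impractical.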

Phase 1 – DOE Example: Identify effective latent variable(s)
Step 1 - Specify “comparative” business decision
DOE Example: For any given pairwise comparison:
  - Can system “A” be reused, repurposed or refitted to satisfy system “B”?
  - Is system “A” redundant with “B”?
  - Could system “A” be safely retired with respect to “B”?
Step 2 - Describe the elements being “compared”
DOE Example: Investment Projects in the DOE IT Investment Portfolio in the FEA Mapping Spreadsheet from SPR of BY 2009 IT Investment Portfolio.

Phase 2 – DOE Example: Establish possible variable categories
Step 1 - Determine range of business decisions
DOE Example: System “A” can be the “same as” System “B” at one extreme or “completely unlike” System “B” at the other.
Step 2 - Establish category gradient within range
DOE Example: System “A” can be:
  - EQUIVALENT TO System “B”
  - PART OF System “B” or vice versa
  - SIMILAR TO System “B”
  - DISSIMILAR TO System “B”

Phase 3 – DOE Example: Determine potential observable attributes
Step 1 - List possible attributes for decision elements
Step 2 - Create comparative attributes for pairs of decision elements

DOE Example: Possible attributes for     DOE Example: Comparative attributes for
EACH INDIVIDUAL IT Investment            EACH PAIR of IT Investments
  • DOE Mission Area                       • Same Mission Area or not
  • PSO Sponsor                            • Same PSO or not
  • FEA Mappings                           • Difference of Mapping Indices
  • New Investment Amount                  • Difference of New Amounts
  • Maint Investment Amount                • Difference of Maint Amounts
  • etc.
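The step from individual attributes to comparative attributes can be sketched as a small function over two investment records. This is an assumption-laden illustration: the dictionary keys, the sample values, and the idea that an FEA mapping is available as a single numeric index are all hypothetical and not taken from the DOE spreadsheet.

```python
def comparative_attributes(a: dict, b: dict) -> dict:
    """Derive pair-level attributes from two individual investment records."""
    return {
        "same_mission_area": a["mission_area"] == b["mission_area"],
        "same_pso": a["pso"] == b["pso"],
        # assumes each FEA mapping has been encoded as one numeric index
        "fea_index_diff": abs(a["fea_index"] - b["fea_index"]),
        "new_amount_diff": abs(a["new_amount"] - b["new_amount"]),
        "maint_amount_diff": abs(a["maint_amount"] - b["maint_amount"]),
    }


# Made-up records for two investments "A" and "B".
inv_a = {"mission_area": "Science", "pso": "PSO-1", "fea_index": 120,
         "new_amount": 1_500_000, "maint_amount": 400_000}
inv_b = {"mission_area": "Science", "pso": "PSO-2", "fea_index": 310,
         "new_amount": 500_000, "maint_amount": 650_000}

pair = comparative_attributes(inv_a, inv_b)
print(pair)
```

Using absolute differences makes the result the same whichever investment is treated as “A”, which fits the bi-lateral relationship definitions mentioned in the scope.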

Phase 4 - DOE Example: Create partitioning with CHAID
Step 1 - Create “training” set of decision cases
DOE Example: Extract a random sample of Decision Cases as “training” cases.
Step 2 - Assign “known” decisions to “training” cases
DOE Example: Manually “classify” each “training” case according to the category gradient.
Step 3 - Run modeling software with “training” cases
DOE Example: Use the classified “training” cases to run CHAID to derive the partitioning relative to the desired comparative business decision.
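The core idea in Step 3 can be illustrated with a simplified split-selection routine: for each comparative attribute, compute the Pearson chi-squared statistic between the attribute and the manually assigned category, and split on the attribute with the largest statistic. This is a minimal sketch and not the modeling software used at DOE; full CHAID also merges attribute categories and compares Bonferroni-adjusted p-values rather than raw statistics. The function names, attribute names, and training cases are hypothetical.

```python
from collections import Counter


def chi_squared(rows):
    """Pearson chi-squared statistic for a list of (attribute_value, label) pairs."""
    n = len(rows)
    cell = Counter(rows)                             # observed count per cell
    row_tot = Counter(value for value, _ in rows)    # totals per attribute value
    col_tot = Counter(label for _, label in rows)    # totals per label
    stat = 0.0
    for value in row_tot:
        for label in col_tot:
            expected = row_tot[value] * col_tot[label] / n
            observed = cell[(value, label)]
            stat += (observed - expected) ** 2 / expected
    return stat


def best_split(cases, attributes, label_key="label"):
    """Return the attribute with the largest chi-squared statistic, plus all scores."""
    scores = {
        attr: chi_squared([(case[attr], case[label_key]) for case in cases])
        for attr in attributes
    }
    return max(scores, key=scores.get), scores


# Four manually classified "training" cases (made-up data).
training = [
    {"same_mission_area": True,  "same_pso": True,  "label": "EQUIVALENT TO"},
    {"same_mission_area": True,  "same_pso": False, "label": "EQUIVALENT TO"},
    {"same_mission_area": False, "same_pso": True,  "label": "DISSIMILAR TO"},
    {"same_mission_area": False, "same_pso": False, "label": "DISSIMILAR TO"},
]

best, scores = best_split(training, ["same_mission_area", "same_pso"])
print(best, scores)
```

In this toy data the mission-area attribute separates the two categories perfectly, so it scores highest; applying the same selection recursively within each branch is what produces the partitioning.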

Phase 5 Steps – DOE Example: Implement in decision tree
Step 1 - Transform partitioning into a decision tree
DOE Example: Put the rules and conditional probabilities into one decision tree with likelihoods.
Step 2 - Test decision tree for accuracy rate
DOE Example: Run the unclassified “training” set through the tree to find the error rate.
Step 3 - Embed decision tree into enterprise DSS
DOE Example: Export the decision tree to an application service in the enterprise DSS.
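Steps 1 and 2 can be sketched with a decision tree held as nested rules and a function that measures how often the tree disagrees with the manual classification. The tree shape, the threshold of 100, the attribute names, and the test cases below are hypothetical; the likelihoods mentioned in Step 1 are omitted to keep the example short.

```python
def classify(pair, node):
    """Walk the nested-rule tree until a leaf (a category string) is reached."""
    while isinstance(node, dict):
        outcome = node["test"](pair)
        node = node["branches"][outcome]
    return node


# A hypothetical two-level tree over comparative attributes.
tree = {
    "test": lambda p: p["same_mission_area"],
    "branches": {
        True: {
            "test": lambda p: p["fea_index_diff"] < 100,
            "branches": {True: "EQUIVALENT TO", False: "SIMILAR TO"},
        },
        False: "DISSIMILAR TO",
    },
}


def error_rate(cases, tree):
    """Fraction of cases where the tree disagrees with the known label."""
    wrong = sum(1 for case in cases if classify(case, tree) != case["label"])
    return wrong / len(cases)


# Made-up cases with manually assigned labels.
cases = [
    {"same_mission_area": True,  "fea_index_diff": 40,  "label": "EQUIVALENT TO"},
    {"same_mission_area": True,  "fea_index_diff": 350, "label": "SIMILAR TO"},
    {"same_mission_area": False, "fea_index_diff": 10,  "label": "DISSIMILAR TO"},
    {"same_mission_area": False, "fea_index_diff": 20,  "label": "SIMILAR TO"},
]

print(error_rate(cases, tree))  # 0.25
```

Because the tree is plain data plus small predicates, exporting it to another service (Step 3) amounts to serializing the rules in whatever form that service accepts.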

DOE Example: The EA Mesodata Decision Tree
CHAID selects and uses only a small set of Comparative Attributes that will optimally classify each pair under consideration.
[Illustrative diagram only. Root node: DOE Mission Area, with branches “A & B are in the same area” and “A & B are in different areas”. Lower nodes: Difference DME BY Amount (branches |A - B| < $1M, $1M < |A - B| < $2M, |A - B| > $2M); Difference FEA Level III Index (branches |A - B| < 100, 100 < |A - B| < 700, |A - B| > 700); Difference FEA Level II Index; Difference FEA Level I Index; etc. Leaves: “A” IS EQUIVALENT TO “B”; “A” IS PART OF “B” or “B” IS PART OF “A”; “A” IS DISSIMILAR TO “B”.]

DOE Example: The Bottom Line – The EA Mesodata Decision Matrix
It is a Huge Matrix of 561,798 [“A” is what to “B”] pairwise combinations for comparison.
[Illustrative diagram only: EA Mesodata Analysis]

DOE Example: The Bottom Line – The EA Mesodata Decision List
The Matrix becomes a Short Decision List of 100 to 200 [“A” is what to “B”] pairwise candidates for cost savings with a confidence level of 80% or higher.
[Illustrative diagram only: EA Mesodata Analysis]
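Reducing the matrix to a short list can be sketched as a filter over classified pairs. Two assumptions are made here and are not stated on the slide: that the “confidence level” is the likelihood the tree attaches to its chosen category, and that DISSIMILAR TO pairs are dropped because they offer no savings opportunity. The field names and sample records are hypothetical.

```python
def decision_list(scored_pairs, min_confidence=0.80):
    """Keep non-dissimilar pairs at or above the confidence cut-off, best first."""
    keep = [
        p for p in scored_pairs
        if p["category"] != "DISSIMILAR TO" and p["confidence"] >= min_confidence
    ]
    return sorted(keep, key=lambda p: p["confidence"], reverse=True)


# Made-up output of running the decision tree over four pairs.
scored = [
    {"pair": ("A", "B"), "category": "EQUIVALENT TO", "confidence": 0.92},
    {"pair": ("A", "C"), "category": "SIMILAR TO",    "confidence": 0.65},
    {"pair": ("B", "C"), "category": "DISSIMILAR TO", "confidence": 0.97},
    {"pair": ("C", "D"), "category": "PART OF",       "confidence": 0.80},
]

shortlist = decision_list(scored)
print([p["pair"] for p in shortlist])  # [('A', 'B'), ('C', 'D')]
```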

Summary: The EA Benefits derived from EA Mesodata in DOE Example
  • EA ATTRIBUTES PLAY THE KEY ROLE IN BUILDING THE MESODATA DECISION TREE: EA artifacts and relationships are the basis for the EA Mesodata used to possibly combine and reduce DOE IT investments by:
      - Reuse of EQUIVALENT TO applications and/or data needed elsewhere,
      - Retirement of EQUIVALENT TO applications and/or data that are redundant,
      - Refitting of applications and/or data with existing modules shown to be a PART OF them, or
      - Repurposing of SIMILAR TO applications and/or data
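The pairing of relationship categories with cost-saving actions listed above can be written down as a simple lookup. The mapping for the first three categories follows the summary; treating DISSIMILAR TO as offering no action is an assumption, and the names used are hypothetical.

```python
# Category -> candidate actions, following the summary list above.
ACTIONS = {
    "EQUIVALENT TO": ["Reuse", "Retire"],
    "PART OF": ["Refit"],
    "SIMILAR TO": ["Repurpose"],
    "DISSIMILAR TO": [],   # assumption: no savings action for unlike systems
}


def candidate_actions(category: str) -> list:
    """Return the actions worth reviewing for a classified pair."""
    return ACTIONS.get(category, [])


print(candidate_actions("EQUIVALENT TO"))  # ['Reuse', 'Retire']
print(candidate_actions("PART OF"))        # ['Refit']
```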

Summary: The EA Benefits derived from EA Mesodata in DOE Example
  • FINDING COST SAVINGS/AVOIDANCE CANDIDATES IS AN AUTOMATED PROCESS: Once built, the EA Mesodata Decision Tree automatically evaluates 561,798 possible pairwise candidates, each in turn individually.
  • SUSTAINMENT OF THE AUTOMATED PROCESS IS ALSO MOSTLY AN AUTOMATED PROCESS: Periodically, the EA Mesodata Decision Tree needs to be “refreshed” and “updated,” and this is done with a new set of Training Cases that are manually scored.

Thank You
Questions? For more information, please contact:
Denise Hill
Senior Technical Advisor
IT Capital Planning, Architecture and E-Gov
denise.hill@hg.doe.gov
202-586-5848