In the Data Lake: Not Waving but Drowning
Dr. Barry Devlin, 9sight Consulting
www.9sight.com
@BarryDevlin
What is a Data Lake?
§ Words have meanings
§ Metaphors make images
"If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples."
James Dixon, CTO, Pentaho (via Forbes, 2011)
Copyright © 2014, 9sight Consulting
Data Lake – definitions and questions
§ Is all data of equal value?
§ Is quality and consistency no longer needed?
§ Should we really store everything?
§ Build it and they will come?
§ What problem are we trying to solve?
"A data lake is a large object-based storage repository that holds data in its native format until it is needed."
Margaret Rouse, WhatIs.com
"A data lake is a massive, easily accessible, centralized repository of large volumes of structured and unstructured data."
Cory Janssen, Techopedia.com
The Data Lake Fallacy: All Water and Little Substance
§ Gartner report, G00264950, 23 July 2014, Nick Heudecker, Andrew White
§ The main risk of using data lakes is the absence of metadata and an underlying mechanism to maintain it… the lack of which can turn a data lake into a “data swamp”
§ https://www.gartner.com/doc/2805917
Image: anaxi.deviantart.com/art/Lostless-Swamp-Concept01-173098108
Do we need a new architecture?
§ Yes!
§ Original data warehouse is too restrictive
§ Business needs agility, speed and consistency
§ Emerging biz-tech ecosystem
  - Business / IT symbiosis
[Figure: drivers of the biz-tech ecosystem: Speed of decision and appropriate action; Market flexibility and uncertainty; Competition; Externally-sourced information; Customer interaction and technical savvy; Mobile devices; Information abundance and variety]
One more time, let’s do architecture
§ The IDEAL architecture consists of three conceptual “thinking spaces”.
§ Characteristics
  - Integrated
  - Distributed
  - Emergent
  - Adaptive
  - Latent
§ Also read as a story: People process information
[Figure: the three thinking spaces: People, Process, Information]
The tri-domain information model
§ Process-mediated data
  - “Traditional” operational & informational data
  - Via data entry & cleansing processes
§ Machine-generated data
  - Output of machines and sensors
  - The Internet of Things
§ Human-sourced information
  - Subjectively interpreted record of personal experiences
  - From Tweets to Videos
[Figure: the three domains (Machine-generated data, Process-mediated data, Human-sourced information) plotted on two axes. Structure/Context: Raw, Atomic, Derived, Compound, Textual, Multiplex. Timeliness/Consistency: In-flight, Live, Stable, Reconciled, Historical.]
Introducing information pillars
§ One architecture for all types of information
  - Mix/match technology as needed
  - Relational, NoSQL, Hadoop, etc.
§ Integration of sources and stores
  - Instantiation gathers inputs
  - Assimilation integrates stored info.
§ Data flows as fast as needed and reconciled when necessary
  - No unnecessary storage or transformations
§ Distinct data management / governance approaches as required
[Figure: information pillars: Machine-generated (data), Process-mediated, Human-sourced (information); Context-setting (information); Transactional (data); Assimilation; Instantiation; inputs: Measures, Events, Messages, Transactions]
From metadata to context-setting information
§ Metadata is two four-letter words!
  - Information (not data)
  - Describes all “stuff” (not just data)
  - Indistinguishable (mostly) from “business information”
§ Context-setting information (CSI)
  - New image – describes what it is and does
  - Provides the background to each piece of information, to every process component and to all the people that constitute the business
  - All information adds context to something else; it is all context setting
What was the most expensive metadata error in history? The Mars Climate Orbiter, lost in 1999, at a cost of $325M, due to metadata error
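To make the idea of context-setting information concrete, here is a minimal Python sketch in which every number carries its unit with it. The Orbiter loss is widely attributed to a mix-up between pound-force seconds and newton-seconds; the `Impulse` class, the unit labels and the sample readings below are illustrative assumptions, not anything taken from the slides or from the mission software.

```python
from dataclasses import dataclass

# Conversion factors to newton-seconds. The unit labels are hypothetical
# names chosen for this sketch.
TO_NEWTON_SECONDS = {
    "N*s": 1.0,
    "lbf*s": 4.4482216152605,  # 1 pound-force = 4.4482216152605 newtons
}


@dataclass(frozen=True)
class Impulse:
    """A measurement that keeps its context (the unit) next to its value."""
    value: float
    unit: str  # context-setting information travelling with the number

    def in_newton_seconds(self) -> float:
        # Refuse to guess: a value whose unit is unknown is rejected
        # instead of being silently treated as newton-seconds.
        try:
            return self.value * TO_NEWTON_SECONDS[self.unit]
        except KeyError:
            raise ValueError(f"unknown unit: {self.unit!r}") from None


def total_impulse(readings):
    """Sum readings only after normalising each one to newton-seconds."""
    return sum(r.in_newton_seconds() for r in readings)


# Hypothetical readings from two sources that use different units.
readings = [Impulse(10.0, "lbf*s"), Impulse(5.0, "N*s")]
total = total_impulse(readings)
```

Adding the bare numbers would give 15.0; normalising first gives roughly 49.48 N·s. The design choice is simply that a value without its context is never accepted, which is the point the slide makes about metadata.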
m3: the modern meaning model
§ Ackoff’s DIKW pyramid is no longer viable
§ Information precedes data
  - Data is simply information optimized for computers
  - The Web has fully devalued “facts”
  - People process information
[Figure: the modern meaning model. Axes: Locus (Physical, Mental, Interpersonal) and Structure (Strict, Loose). Labels: Data, Hard Information, Soft Information, Content, Information, Explicit Knowledge, Tacit Knowledge, Knowledge, Understanding, Insight, Wisdom, Meaning, “The stories we tell ourselves”; Objective / universal versus Subjective / unique; From Physical World, From Human World; activities: Observing, Videoing, Modeling, Interpreting, Documenting, Learning, Articulation, Practice, Sensemaking, Mentoring]
Human, social and collaborative dimension
§ Meaning is a personal/social interpretation based (loosely) on information and knowledge
  - Rationality is only one part
  - Gut-feel may be more effective than rationality in decision making
  - Emotional state plays an important role
§ Intention drives understanding and action
§ We are social animals
  - Business is a social enterprise
§ Innovation is often team-based
From BI to Business unIntelligence
§ Rationality of thought and far beyond it
§ Logic of process, predefined and emergent
§ Information, knowledge and meaning
§ The confluence of
  - Reason and inspiration, emotion and intention
  - Collaboration and competition
  - All that comprises the human and social milieu that is business
§ Not business intelligence… Business unIntelligence
§ http://bit.ly/BunI-Technics : 25% discount with code “BIInsights25”
Conclusions
1. Speed, flexibility and quality vital in modern business
  - Biz-tech ecosystem shows direction
  - Data Lake driven by “Big Data blindness”
2. Modern information architecture is highly diverse
  - Structure and consistency where needed
  - Agility and speed when required
  - Data Lake ignores need for structure and consistency
3. Context and meaning are keystone concepts
  - Flexibility & quality bridged via context-setting information
  - Business unIntelligence provides overall structure
Not Waving but Drowning

Nobody heard him, the dead man,
But still he lay moaning:
I was much further out than you thought
And not waving but drowning.

Poor chap, he always loved larking
And now he’s dead
It must have been too cold for him his heart gave way,
They said.

Oh, no no no, it was too cold always
(Still the dead one lay moaning)
I was much too far out all my life
And not waving but drowning.

Stevie Smith (1957)