Data and Knowledge Management 1

Data Management: A Critical Success Factor
• The difficulties and the process
• Data sources and collection
• Data quality
• Multimedia and object-oriented databases
• Document management

Difficulties
• The amount of data increases exponentially
• Data comes from multiple sources
• Only a small portion of the data is useful for specific decisions
• Increased need for external data

Difficulties (continued)
• Differing legal requirements among countries
• Selection of a data management tool from a large number of options
• Data security, quality, and integrity

Data Life Cycle Process and Knowledge Discovery
• Data collected and stored in databases
• Processed and stored in data warehouses
• Transformation - ready for analysis
• Data mining tools - knowledge
• Presentation

Data Sources and Collection
• Internal data
• Personal data
• External data
• Internet and commercial database services

Data Quality (DQ)
• Intrinsic DQ – Accuracy, objectivity, believability, and reputation
• Accessibility DQ – Accessibility and access security

Data Quality (continued)
• Contextual DQ – Relevancy, value added, timeliness, completeness
• Representation DQ – Interpretability, ease of understanding, concise representation, and consistent representation
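
Some of the DQ dimensions above can be measured directly. The following is a minimal sketch, not a standard DQ toolkit: the records, field names, and thresholds are all hypothetical, and it covers only completeness, timeliness, and consistent representation.

```python
from datetime import date

# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "email": "a@example.com", "country": "DE", "updated": date(2024, 5, 1)},
    {"id": 2, "email": None, "country": "de", "updated": date(2021, 1, 15)},
    {"id": 3, "email": "c@example.com", "country": None, "updated": date(2024, 4, 20)},
]

def completeness(rows, field):
    """Share of rows in which `field` is present (contextual DQ: completeness)."""
    return sum(r[field] is not None for r in rows) / len(rows)

def stale_ids(rows, as_of, max_age_days):
    """IDs of rows not updated within `max_age_days` (contextual DQ: timeliness)."""
    return [r["id"] for r in rows if (as_of - r["updated"]).days > max_age_days]

def inconsistent_ids(rows, field):
    """IDs of rows whose value is not upper-case (representation DQ: consistency)."""
    return [r["id"] for r in rows
            if r[field] is not None and r[field] != r[field].upper()]

print(completeness(records, "email"))
print(stale_ids(records, date(2024, 6, 1), 365))
print(inconsistent_ids(records, "country"))
```

Intrinsic dimensions such as believability and reputation are judgments about the source and are not captured by checks like these.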

Complex Databases
• Object-oriented database
• Multimedia database
• Document management

Data Warehousing, Mining, and Analysis
• Transaction versus analytical processing
• Data warehouse and data marts
• Knowledge discovery, analysis, and mining

Good Data Delivery System
• Easy data access by end users
• Quicker decision making
• Accurate and effective decision making
• Flexible decision making

Processing Solutions
• Business representation of data for end users
• Client-server environment - end-user query and reporting capability
• Server-based repository (data warehouse)

Data Warehouse and Marts
The purpose of a data warehouse is to establish a data repository that makes data accessible in a form readily acceptable for analytical processing activities. A data mart is a subset of a warehouse, dedicated to a functional or regional area.

Data Warehouse
• A data warehouse contains historical data rather than operational data
• It contains data from a number of databases, so the data must be ‘cleaned’ to ensure that the data definitions are consistent
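
As an illustration of what ‘cleaning’ can involve, the sketch below maps rows from two hypothetical source databases onto one shared definition before loading them into a warehouse list. The field names, the date formats, and the fixed exchange rate are assumptions made for the example, not part of the slides.

```python
# Two hypothetical operational databases that describe the same kind of sale
# with different field names, units, and date formats.
sales_eu = [{"cust": "C1", "amount_eur_cents": 125000, "day": "2024-03-05"}]
sales_us = [{"customer_id": "C2", "amount_usd": 900.0, "date": "03/07/2024"}]

USD_PER_EUR = 1.10  # assumed fixed rate, for illustration only

def clean_eu(row):
    """Rename fields and convert euro cents to US dollars."""
    return {
        "customer_id": row["cust"],
        "amount_usd": round(row["amount_eur_cents"] / 100 * USD_PER_EUR, 2),
        "date": row["day"],  # already in YYYY-MM-DD form
    }

def clean_us(row):
    """Convert MM/DD/YYYY dates to YYYY-MM-DD; other fields already match."""
    month, day, year = row["date"].split("/")
    return {
        "customer_id": row["customer_id"],
        "amount_usd": row["amount_usd"],
        "date": f"{year}-{month}-{day}",
    }

# After cleaning, every row shares one definition of each field.
warehouse = [clean_eu(r) for r in sales_eu] + [clean_us(r) for r in sales_us]
print(warehouse)
```

Real cleaning jobs also have to reconcile duplicate entities and conflicting codes, which this sketch leaves out.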

Characteristics of Data Warehousing
• Organization
• Consistency
• Time variant
• Nonvolatile
• Relational

The Data Warehouse and Marts
• Benefits
• Cost
• Architecture
• Putting the data warehouse on the internet
• Suitability

Knowledge Discovery, Analysis, and Mining
• Foundations of knowledge discovery in databases (KDD)
• Tools and techniques of KDD
• Online analytical processing (OLAP)
• Data mining

The Foundations of Knowledge Discovery in Databases (KDD)
• Massive data collection
• Powerful multiprocessor computers
• Data mining algorithms

OLAP Queries
• Access very large amounts of data
• Analyze the relationships between many types of business elements
• Involve aggregated data
• Compare aggregated data over hierarchical time periods

OLAP Queries (continued)
• Present data in different perspectives
• Involve complex calculations between data elements
• Able to respond quickly to user requests
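
Aggregation over hierarchical time periods is often called a roll-up. The sketch below shows the idea on a handful of hypothetical monthly sales facts; the data, the `roll_up` function, and its level names are illustrative assumptions, and a real OLAP server would precompute or index such aggregates rather than scan a list.

```python
from collections import defaultdict

# Hypothetical monthly sales facts: (year, month, region, revenue).
facts = [
    (2023, 1, "North", 100), (2023, 2, "North", 120), (2023, 4, "North", 90),
    (2023, 1, "South", 80), (2023, 5, "South", 110), (2024, 1, "North", 150),
]

def quarter(month):
    """Map a month number (1-12) to its quarter number (1-4)."""
    return (month - 1) // 3 + 1

def roll_up(rows, level):
    """Sum revenue per region at the time level 'month', 'quarter', or 'year'."""
    totals = defaultdict(int)
    for year, month, region, revenue in rows:
        if level == "month":
            period = (year, month)
        elif level == "quarter":
            period = (year, f"Q{quarter(month)}")
        elif level == "year":
            period = (year,)
        else:
            raise ValueError(f"unknown level: {level}")
        totals[(region, period)] += revenue
    return dict(totals)

by_quarter = roll_up(facts, "quarter")
by_year = roll_up(facts, "year")
print(by_quarter)
print(by_year)
```

Comparing `by_year[("North", (2023,))]` with `by_year[("North", (2024,))]` is the kind of period-over-period comparison the slide refers to.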

Data Mining
• Automated prediction of trends
• Automated discovery of previously unknown patterns
• Example: People who buy Barbie dolls also buy a particular chocolate bar
  – What can we do with that information?
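
A pattern such as "people who buy X also buy Y" is usually quantified with support, confidence, and lift. The sketch below computes all three on a few hypothetical baskets; the baskets are invented for illustration and do not reproduce any real finding.

```python
# Hypothetical market baskets; item names echo the slide's example.
baskets = [
    {"doll", "chocolate bar"},
    {"doll", "chocolate bar", "batteries"},
    {"doll", "juice"},
    {"chocolate bar"},
    {"juice", "batteries"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated probability of `consequent` given `antecedent`."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """Confidence relative to how common `consequent` is overall."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))

rule = ({"doll"}, {"chocolate bar"})
print(support(rule[0] | rule[1], baskets))
print(confidence(*rule, baskets))
print(lift(*rule, baskets))
```

A lift above 1 suggests the two items occur together more often than chance alone would predict, which is what makes the rule a candidate for actions such as joint promotions or shelf placement.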

Data Mining Characteristics and Objectives
• Data often buried deep within large databases
• Data may be consolidated in a data warehouse or kept on internet and intranet servers
• Usually client-server architecture

Data Mining Characteristics and Objectives (continued)
• Data mining tools extract information buried in corporate files or archived public records
• The “miner” is often an end user
• “Striking it rich” usually involves finding unexpected, valuable results
• Parallel processing

Data Mining Characteristics and Objectives (continued)
• Data mining yields five types of information
• Data miners can use one or several tools

Data Mining Yields Five Types of Information
• Association
• Sequences
• Classifications
• Clusters
• Forecasting

Data Mining Techniques
• Case-based reasoning
• Neural computing
• Intelligent agents
• Others: decision trees, genetic algorithms, nearest neighbor method, and rule reduction
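
Of the techniques listed, the nearest neighbor method is the simplest to show in a few lines. The sketch below classifies a new customer by copying the label of the closest known customer; the features, labels, and data points are hypothetical, and the features are left unscaled to keep the example short.

```python
import math

# Hypothetical labelled customers: (age, yearly spend in thousands) -> segment.
training = [
    ((25, 2.0), "occasional"),
    ((30, 2.5), "occasional"),
    ((45, 9.0), "loyal"),
    ((50, 8.5), "loyal"),
]

def nearest_neighbor(point, examples):
    """Return the label of the training example closest to `point` (1-NN)."""
    _, label = min(examples, key=lambda ex: math.dist(point, ex[0]))
    return label

print(nearest_neighbor((28, 2.2), training))
print(nearest_neighbor((48, 9.2), training))
```

In practice the features would normally be rescaled first, since otherwise the one with the largest numeric range (here, age) dominates the distance.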

Data Visualization Technologies
• Data visualization
• Multidimensionality
• Geographical information systems (GIS)

Data Visualization
Data visualization refers to the presentation of data by technologies such as digital images, geographical information systems, graphical user interfaces, multidimensional tables and graphs, virtual reality, three-dimensional presentations, and animation.

Multidimensionality
• Major advantage – data can be organized the way managers prefer to see the data
• Three factors – dimensions, measures, and time

Examples
• Dimensions – Products, salespeople, market segments, business units, geographical locations
• Measures – Money, sales volume, head count, inventory, profit, actual versus forecasted
• Time – Daily, weekly, monthly, quarterly, yearly
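
The three factors can be made concrete with a small cross-tabulation. In the sketch below, each hypothetical fact row carries two dimensions (product, region), a time period, and one measure (sales); the `pivot` helper is an illustrative assumption that rearranges the same facts into whichever view a manager asks for.

```python
from collections import defaultdict

# Hypothetical fact rows: two dimensions, a time period, and one measure.
rows = [
    {"product": "Widget", "region": "East", "quarter": "Q1", "sales": 40},
    {"product": "Widget", "region": "West", "quarter": "Q1", "sales": 25},
    {"product": "Gadget", "region": "East", "quarter": "Q1", "sales": 30},
    {"product": "Widget", "region": "East", "quarter": "Q2", "sales": 55},
]

def pivot(facts, row_key, col_key, measure):
    """Cross-tabulate `measure` by two chosen keys, summing over everything else."""
    table = defaultdict(lambda: defaultdict(int))
    for f in facts:
        table[f[row_key]][f[col_key]] += f[measure]
    return {r: dict(cols) for r, cols in table.items()}

# The same facts, viewed from two different perspectives.
by_product_quarter = pivot(rows, "product", "quarter", "sales")
by_region_product = pivot(rows, "region", "product", "sales")
print(by_product_quarter)
print(by_region_product)
```

Swapping the keys passed to `pivot` changes the perspective without touching the underlying data, which is the advantage the slide attributes to multidimensionality.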

Geographical Information Systems (GIS)
A GIS is a computer-based system for capturing, storing, checking, integrating, manipulating, and displaying data using digitized maps.

Components of a GIS
• Software
• Data
• Emerging GIS applications

Emerging GIS Applications
• Integration of GIS and GPS – Reengineer aviation and shipping industries
• Intelligent GIS (integration of GIS and ES)
• User interface – Multimedia, 3D graphics, animated and interactive maps
• Web applications

Knowledge Management
• Knowledge management, or managing knowledge databases
• A knowledge base is a database that contains information or organizational know-how.

Accenture’s Learning Organization Knowledge Base
• Global best practices
• These data combined with ongoing research identify areas to be developed
• Research analysis team with content experts to develop best practices
• Qualitative and quantitative information and tools on the intranet for corporate-wide access

Accenture’s Knowledge Base (continued)
• Best company profiles
• Relevant Accenture engagement experience
• Top 10 case studies and articles
• World-class performance measures
• Diagnostic tools

Accenture’s Knowledge Base (continued)
• Customizable presentations
• Process definitions
• Directory of internal experts
• Best control practice
• Tax implementations

Conclusion
• Cost-benefit analysis
• Where to store data physically
• Disaster recovery
• Internal or external
• Data security and ethics
• Data purging

Conclusion (continued)
• The legacy data problem
• Data delivery
• Privacy – especially customer information
• What to do? When to do it?