Introduction to Data, Information and Knowledge Management
Dr. Bhavani Thuraisingham
The University of Texas at Dallas
Data, Information and Knowledge Management
January 2014
Data Management
- Concepts in database systems
- Types of database systems
- Distributed data management
- Heterogeneous database integration
- Federated data management
An Example Database System (adapted from C. J. Date, Addison Wesley, 1990)
Metadata
- Metadata describes the data in the database
  - Example: Database D consists of a relation EMP with attributes SS#, Name, and Salary
- The metadatabase stores the metadata
  - Could be physically stored with the database
- The metadatabase may also store constraints and administrative information
- Metadata is also referred to as the schema or data dictionary
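The idea that the schema is itself stored and queryable can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: SQLite stands in for the DBMS, its sqlite_master catalog and PRAGMA table_info stand in for the metadatabase, and the column types are assumed because the slide names only the attributes.

```python
import sqlite3

# In-memory database D holding the relation EMP from the slide.
# Column types are assumptions; the slide gives only attribute names.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE EMP ("SS#" INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)'
)

# SQLite keeps its schema (the metadata) in the catalog table sqlite_master.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per attribute:
# (cid, name, type, notnull, dflt_value, pk)
attributes = [(row[1], row[2])
              for row in conn.execute("PRAGMA table_info(EMP)")]

print(tables)
print(attributes)
```

Here the metadata lives in the same file as the data, which matches the slide's remark that the metadatabase could be physically stored with the database.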
Functional Architecture
- Data Management: User Interface Manager; Schema (Data Dictionary) Manager (metadata); Query Manager; Security/Integrity Manager; Transaction Manager
- Storage Management: File Manager; Disk Manager
DBMS Design Issues
- Query Processing
  - Optimization techniques
- Transaction Management
  - Techniques for concurrency control and recovery
- Metadata Management
  - Techniques for querying and updating the metadatabase
- Security/Integrity Maintenance
  - Techniques for processing integrity constraints and enforcing access control rules
- Storage Management
  - Access methods and index strategies for efficient access to the database
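Two of these issues, integrity maintenance and transaction recovery, can be illustrated together in a small sketch. It assumes SQLite (via Python's sqlite3 module) as the DBMS; the CHECK constraint on salary and the helper add_employees are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity constraint (assumed for illustration): salaries cannot be negative.
conn.execute(
    "CREATE TABLE EMP (ss INTEGER PRIMARY KEY, name TEXT, "
    "salary INTEGER CHECK (salary >= 0))"
)
conn.commit()


def add_employees(conn, rows):
    """Insert all rows atomically; return False if any row breaks a constraint."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


first = add_employees(conn, [(1, "John", 20), (2, "Paul", 30)])
# The second batch contains one invalid row, so the whole batch is undone.
second = add_employees(conn, [(3, "James", 40), (4, "Jill", -5)])
count = conn.execute("SELECT COUNT(*) FROM EMP").fetchone()[0]
print(first, second, count)
```

The valid row for James is not kept, because the transaction manager rolls back the whole unit of work when the integrity check fails.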
Types of Database Systems
- Relational Database Systems
- Object Database Systems
- Deductive Database Systems
- Other
  - Real-time, Secure, Parallel, Scientific, Temporal, Wireless, Functional, Entity-Relationship, Sensor/Stream Database Systems, etc.
Relational Database: Example
Relation S:
S#  SNAME  STATUS  CITY
S1  Smith  20      London
S2  Jones  10      Paris
S3  Blake  30      Paris
S4  Clark  20      London
S5  Adams  30      Athens
Relation P:
P#  PNAME  COLOR  WEIGHT  CITY
P1  Nut    Red    12      London
P2  Bolt   Green  17      Paris
P3  Screw  Blue   17      Rome
P4  Screw  Red    14      London
P5  Cam    Blue   12      Paris
P6  Cog    Red    19      London
Relation SP:
S#  P#  QTY
S1  P1  300
S1  P2  200
S1  P3  400
S1  P4  200
S1  P5  100
S1  P6  100
S2  P1  300
S2  P2  400
S3  P2  200
S4  P2  200
S4  P4  300
S4  P5  400
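A relational query over these relations can be sketched as a join between S and SP. The sketch below assumes SQLite through Python's sqlite3 module, renames S# and P# to SNO and PNO to avoid quoting, and uses a hypothetical helper suppliers_of; the rows follow C. J. Date's suppliers-and-parts example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S  (SNO TEXT PRIMARY KEY, SNAME TEXT, STATUS INTEGER, CITY TEXT);
CREATE TABLE SP (SNO TEXT, PNO TEXT, QTY INTEGER, PRIMARY KEY (SNO, PNO));
""")
conn.executemany("INSERT INTO S VALUES (?, ?, ?, ?)", [
    ("S1", "Smith", 20, "London"), ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"), ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
])
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", [
    ("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
    ("S1", "P4", 200), ("S1", "P5", 100), ("S1", "P6", 100),
    ("S2", "P1", 300), ("S2", "P2", 400), ("S3", "P2", 200),
    ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400),
])


def suppliers_of(conn, part_no):
    """Names of suppliers that ship the given part, ordered by supplier number."""
    rows = conn.execute(
        "SELECT S.SNAME FROM S JOIN SP ON S.SNO = SP.SNO "
        "WHERE SP.PNO = ? ORDER BY S.SNO", (part_no,))
    return [row[0] for row in rows]


print(suppliers_of(conn, "P2"))
print(suppliers_of(conn, "P6"))
```

The join matches each shipment in SP to its supplier in S through the shared supplier number, which is how the relational model links relations without pointers.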
Example Class Hierarchy
- Document Class (instances D1, D2)
  - Attributes: ID, Name, Author, Publisher
  - Method 1: Print-doc-att(ID)
  - Method 2: Print-doc(ID)
- Subclass Journal (instance J1)
  - Attribute: Volume #
- Subclass Book (instance B1)
  - Attribute: # of Chapters
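The hierarchy can be sketched as Python classes, where subclasses inherit the attributes and methods of Document. This is an illustrative sketch, not the slide's system: the method bodies, the Python attribute names, and the sample values are assumptions, and the ID argument of the slide's methods is carried here by the object itself.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    name: str
    author: str
    publisher: str

    def print_doc_att(self) -> str:
        """Method 1: return the document's attributes as text."""
        return f"{self.doc_id}: {self.name}, {self.author}, {self.publisher}"

    def print_doc(self) -> str:
        """Method 2: return a printable form of the document (body assumed)."""
        return f"Document {self.doc_id}"


@dataclass
class Journal(Document):
    volume: int  # the slide's "Volume #"


@dataclass
class Book(Document):
    chapters: int  # the slide's "# of Chapters"


# Sample instances; all values other than the IDs are placeholders.
j1 = Journal("J1", "Journal title", "An author", "A publisher", volume=3)
b1 = Book("B1", "Book title", "An author", "A publisher", chapters=12)
print(j1.print_doc_att())
print(b1.print_doc())
```

Both J1 and B1 answer the methods defined once on Document, which is the point of placing shared attributes and methods in the superclass.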
Example Composite Object
- Composite Document Object
  - Section 1 Object
  - Section 2 Object
  - Paragraph 2 Object
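A composite object is one whose parts are themselves objects. The sketch below assumes that paragraphs belong to sections and sections to the document; the slide does not spell out this nesting, and the class names, fields, and word_count method are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Paragraph:
    text: str


@dataclass
class Section:
    title: str
    paragraphs: List[Paragraph] = field(default_factory=list)


@dataclass
class CompositeDocument:
    title: str
    sections: List[Section] = field(default_factory=list)

    def word_count(self) -> int:
        """Operate on the whole composite by walking its component objects."""
        return sum(len(p.text.split())
                   for s in self.sections for p in s.paragraphs)


doc = CompositeDocument("Example", [
    Section("Section 1", [Paragraph("first paragraph"),
                          Paragraph("second paragraph here")]),
    Section("Section 2"),
])
print(doc.word_count())
```

An operation on the composite, such as counting words, is answered by traversing the component objects rather than by storing the document as one flat value.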
Distributed Database System
- Site 1: Database 1, DBMS 1, Distributed Processor 1
- Site 2: Database 2, DBMS 2, Distributed Processor 2
- Site 3: Database 3, DBMS 3, Distributed Processor 3
- The sites are connected by a Communication Network
Data Distribution
SITE 1
EMP1:
SS#  Name   Salary  D#
1    John   20      10
2    Paul   30      20
3    James  40      20
4    Jill   50      20
5    Mary   60      10
6    Jane   70      20
DEPT1:
D#  Dname    MGR
10  C. Sci.  Jane
30  English  David
40  French   Peter
SITE 2
EMP2:
SS#  Name    Salary  D#
9    Mathew  70      50
7    David   80      30
8    Peter   90      40
DEPT2:
D#  Dname    MGR
50  Math     John
20  Physics  Paul
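Because EMP is split across the two sites, a global query has to run against each fragment and merge the partial results. The following sketch assumes plain Python lists as the site fragments, keeps only SS#, Name, and Salary for brevity, and uses a hypothetical helper global_select.

```python
# Horizontal fragments of EMP as (SS#, Name, Salary); D# is omitted for brevity.
EMP1 = [(1, "John", 20), (2, "Paul", 30), (3, "James", 40),
        (4, "Jill", 50), (5, "Mary", 60), (6, "Jane", 70)]
EMP2 = [(9, "Mathew", 70), (7, "David", 80), (8, "Peter", 90)]
SITES = {"site1": EMP1, "site2": EMP2}


def global_select(sites, predicate):
    """Run the same selection at every site and merge the partial results."""
    result = []
    for site_name in sorted(sites):
        result.extend(row for row in sites[site_name] if predicate(row))
    return sorted(result)  # order by SS# so the answer hides the fragmentation


high_paid = global_select(SITES, lambda row: row[2] >= 70)
print(high_paid)
```

The caller sees one logical EMP relation; where each row is stored is a detail handled inside global_select, which plays the role of the distributed processor.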
Interoperability of Heterogeneous Database Systems
- Database System A (Relational)
- Database System B (Object-Oriented)
- Database System C (Legacy)
- The systems are connected by a Network
- Goal: transparent access to heterogeneous databases for both users and application programs; query and transaction processing
Different Data Models
- Node A: Database, Relational Model
- Node B: Database, Network Model
- Node C: Database, Hierarchical Model
- Node D: Database, Object-Oriented Model
- The nodes are connected by a Network
- Developments: tools for interoperability; commercial products
- Challenges: global data model
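One way to picture the global data model challenge is a pair of translators that map records from two local models into one shared representation. This is a hedged sketch: the flat-dictionary global model, the function names, and the sample records are all assumptions made for illustration.

```python
def from_relational(columns, rows):
    """Map relational tuples to the assumed global model: flat dictionaries."""
    return [dict(zip(columns, row)) for row in rows]


def from_hierarchical(parent_key, tree):
    """Flatten parent -> children records, copying the parent key to each child."""
    records = []
    for parent in tree:
        for child in parent["children"]:
            record = dict(child)
            record[parent_key] = parent[parent_key]
            records.append(record)
    return records


# Illustrative data held under two different local models.
node_a = from_relational(("SS#", "Name", "D#"),
                         [(7, "David", 30), (8, "Peter", 40)])
node_c = from_hierarchical("D#", [
    {"D#": 50, "children": [{"SS#": 9, "Name": "Mathew"}]},
])

# Once both sources share one model, a single query spans them.
global_view = node_a + node_c
names_in_50 = [r["Name"] for r in global_view if r["D#"] == 50]
print(names_in_50)
```

The hard part in practice is agreeing on the shared model and on what each local construct means in it; the translators here are trivial only because the example data is.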
Federated Database Management
- Database System A, Database System B, Database System C
- Federation F1, Federation F2
- Cooperating database systems that still maintain some degree of autonomy
Federated Data and Policy Management
- Data/Policy for Federation
- Export Data/Policy
- Component Data/Policy for Agency A
- Component Data/Policy for Agency B
- Component Data/Policy for Agency C
Outline of Part I: Information Management
- Information Management Framework
- Information Management Overview
- Some Information Management Technologies
- Knowledge Management
What is Information Management?
- Information management analyzes data and makes sense of it
- Several technologies have to work together for effective information management
  - Data Warehousing: extracting relevant data and putting it into a repository for analysis
  - Data Mining: extracting previously unknown information from the data
  - Multimedia: managing different media, including text, images, video and audio
  - Web: managing the databases and libraries on the web
Data Warehouse
- Users query the warehouse
- Data Warehouse: data correlating employees with medical benefits and projects
- Sources: Oracle DBMS for employees; Sybase DBMS for projects; Informix DBMS for medical
- The warehouse could be any DBMS; it is usually based on the relational data model
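The correlation step can be sketched as a small integration routine that combines extracts from the three sources into one record per employee. All sample records, field names, and the build_warehouse helper below are hypothetical; real source systems would be reached through their own database drivers.

```python
# Hypothetical extracts from the three source systems, keyed by employee ID.
employees = {1: "John", 2: "Paul"}                    # employee source
projects = [(1, "Alpha"), (2, "Beta"), (1, "Gamma")]  # (employee ID, project)
medical = {1: "Plan A", 2: "Plan B"}                  # employee ID -> benefit plan


def build_warehouse(employees, projects, medical):
    """Correlate the three sources into one record per employee."""
    warehouse = {}
    for emp_id, name in employees.items():
        warehouse[emp_id] = {
            "name": name,
            "plan": medical.get(emp_id),
            "projects": [],
        }
    for emp_id, project in projects:
        if emp_id in warehouse:  # ignore projects for unknown employees
            warehouse[emp_id]["projects"].append(project)
    return warehouse


warehouse = build_warehouse(employees, projects, medical)
# Users now query the warehouse instead of the three source systems.
on_plan_a = [rec["name"] for rec in warehouse.values() if rec["plan"] == "Plan A"]
print(warehouse[1]["projects"], on_plan_a)
```

Queries that span employees, benefits, and projects are answered from the integrated copy, so the operational source systems are not touched at query time.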
Data Mining
- Also known as: Information Harvesting, Knowledge Mining, Knowledge Discovery in Databases, Data Dredging, Data Archaeology, Data Pattern Processing, Database Mining, Siftware, Knowledge Extraction
- Definition: the process of discovering meaningful new correlations, patterns, and trends, often previously unknown, by sifting through large amounts of data using pattern recognition technologies and statistical and mathematical techniques (Thuraisingham 1998)
Multimedia Information Management
- Broadcast News Editor (BNE): Video Source; Segregate Video Streams (Imagery, Audio, Closed Caption Text); Scene Change Detection; Frame Classifier; Silence Detection; Speaker Change Detection; Closed Caption Preprocess; Token Detection; Named Entity Tagging; Correlation; Broadcast Detection; Commercial Detection; Story Segmentation; Key Frame Selection; Story GIST; Theme
- Analyze and Store Video and Metadata: Multimedia Database Management System (Video and Metadata)
- Broadcast News Navigator (BNN): Web-based Search/Browse by Program, Person, Location, ...
Image Processing Example: Change Detection
- Trained a neural network to predict the "new" pixel from the "old" pixel
  - Neural networks are good for multidimensional continuous data
  - Multiple nets give a range of "expected values"
- Identified pixels where the actual value is substantially outside the range of expected values
  - Anomaly if three or more bands (of seven) are out of range
- Identified groups of anomalous pixels
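The anomaly rule for a single pixel can be sketched as follows. The sketch does not train any network: it assumes the per-band predictions of several nets are already available, and it interprets "substantially outside" as exceeding the predicted range by a margin. The margin value, function names, and sample numbers are all assumptions.

```python
def out_of_range_bands(actual, predictions, margin=0.0):
    """Count bands whose actual value falls outside the predicted range.

    actual:      one value per band for the "new" pixel
    predictions: one list per net, each with one predicted value per band
    margin:      assumed tolerance standing in for "substantially outside"
    """
    count = 0
    for band, value in enumerate(actual):
        band_predictions = [p[band] for p in predictions]
        low = min(band_predictions) - margin
        high = max(band_predictions) + margin
        if value < low or value > high:
            count += 1
    return count


def is_anomalous(actual, predictions, margin=0.0, min_bands=3):
    """Flag the pixel if at least min_bands bands are out of range."""
    return out_of_range_bands(actual, predictions, margin) >= min_bands


# Two hypothetical nets predicting seven bands for one pixel.
predictions = [[10.0] * 7, [12.0] * 7]
changed = [11, 11, 11, 11, 30, 30, 30]  # three bands far from expectation
stable = [11, 11, 11, 11, 11, 30, 30]   # only two bands out of range
print(is_anomalous(changed, predictions, margin=2.0))
print(is_anomalous(stable, predictions, margin=2.0))
```

Grouping the flagged pixels into regions, the last step on the slide, would be a separate pass over the image and is not shown here.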
Semantic Web (adapted from Tim Berners-Lee’s description of the Semantic Web)
- Layers, top to bottom: Logic, Proof and Trust; Rules/Query; RDF, Ontologies; XML, XML Schemas; URI, UNICODE
- Other Services
- Trust and Privacy
- Some challenges: security and privacy cut across all layers
Knowledge Management Components
- Components of Knowledge Management: Components, Cycle and Technologies
- Components: Strategies, Processes, Metrics
- Cycle: Knowledge Creation, Sharing, Measurement and Improvement
- Technologies: Expert systems, Collaboration, Training, Web