ER 2016 Tutorial on Abstractions in conceptual modelling
ER 2016 - Tutorial on Abstractions in conceptual modelling and surroundings. Part 2. Abstraction as a representation
© Carlo Batini, 2015. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/
Types (and their subtypes) of abstractions with >= 2 occurrences
Aggregation 27; Filtering 26; Generalization 25; Granularity 19; Module (in ontol.) 19; Hierarchization 16; Classification 12; Relation to function 12; Pattern 11; Folding 10; Contextualization 10; Mapping 9; Selective 7; Still image 6; Clustering 6; Grouping-Association 6; Sampling 5; Segmentation 5
Metamodel Smoothing Summarization Temporal Contextualization Axiomatic mapping Bayesian Conceptual Deep Instantiation Integration 2 Materialization Multi-level objects Multi-level relationships Multi-level temporal cl. Naming Parametrization Perceptual Powertype 4 4 3 3 3 2 2 2
Types of abstractions with one occurrence (selection…): Behavioural (abstraction), Caricaturing, Change of geometry type, Clique, Codification, Conceptualization, Contraction, Dataflow, Dilatation, Domain, Effect, Empirical, Enlargement, Exoskeleton, Formal, Generic, Generation, Geographical data reduction, Geometric, Geometry enhancing, Graphic, Ground, Information, Iteration, Layout, Lossy compression, Navigational, Normalization (normal case first), Normalization (in the rel. model), Numerical, Ownership, Recursion, Reflective, Regularization, Realization, Role, Schematizing, Sensor annotated hierarchical, Simplification, Squaring, Task, Topological, Versioning, Well definedness
Abstractions in visualization, mathematics, automatic layout of diagrams, machine learning, software engineering
• In visualization: Categorization, 3D, Multi-scale based, Importance adaptive, Regularization, Non linear diffusion, Symmetry based, Shape, Structure a., Parametrizing, Lowest common a. of a set of graphs, Stylistic, Non photorealistic rendering
• In Mathematics: Decontextualization, Reification, Equivalence relation, Lexicalization, Nominalization, Register, Generic example, Elicitation of form over content
• In automatic layout of diagrams: Topological, Shape, Metric
• In Machine Learning (except Bayesian): Feature selection, Instance selection, Feature discretization, F. construction, Predicate invention, Term abstraction, Propositionalization
• In software engineering: A. by parametrization, A. by specification, Procedural, Polymorphic abstr., Control a., Iteration a., Abstract data type, Procedural data type, Interface abstr., Subroutine, Function, Procedure, Pointer, Metamodel, …
1.1 Generic abstractions. Abstraction by forgetting vs abstraction by hiding
Two different abstractions: abstraction by forgetting and abstraction by hiding [Figure: the same ER schema, with entities Associate professor, Full professor and Professor and attributes Id, Date of birth and Place of birth, shown once for each abstraction]
Abstraction by forgetting [2015 Palmonari] Dacena tool … only paths whose length n <= 3
Abstraction by hiding • Conceptual DB Schema • Logical DB Schema • Physical DB Schema [Figure: conceptual schema with entities Professor, Associate professor and Full professor and attributes Id, Date of Birth and Place of Birth; table Professor with columns Id, Type, Date of Birth, Place of Birth]
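The difference between the two abstractions can be sketched in a few lines of Python. This is only an illustration: the assignment of Date of birth to Associate professor and Place of birth to Full professor is an assumption about the figure, and the names `forget`, `HidingProfessor` and `reveal` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssociateProfessor:
    id: int
    date_of_birth: str   # assumed attribute placement


@dataclass(frozen=True)
class FullProfessor:
    id: int
    place_of_birth: str  # assumed attribute placement


@dataclass(frozen=True)
class Professor:
    """Abstract entity that keeps only the shared attribute Id."""
    id: int


def forget(record) -> Professor:
    # Abstraction by forgetting: the other attributes are discarded and
    # cannot be recovered from the abstract object.
    return Professor(id=record.id)


class HidingProfessor:
    """Abstraction by hiding: the detail is kept, but not exposed."""

    def __init__(self, record):
        self._record = record  # lower-level representation, hidden

    @property
    def id(self) -> int:
        return self._record.id

    def reveal(self):
        # The more detailed level is still reachable on request.
        return self._record


full = FullProfessor(id=7, place_of_birth="Milano")
forgotten = forget(full)
hidden = HidingProfessor(full)
```

After `forget`, the place of birth is gone for good; behind `HidingProfessor` it is merely out of sight, in the same way a conceptual schema hides the logical and physical schemas without deleting them.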
Abstraction in mathematics vs abstraction in computer science
Primary products of Maths. & Comp Science [Colburn 2007] • We can compare the nature of mathematics and computer science by comparing their primary products. • We argue that the primary product of mathematics is inference structures, while the primary product of computer science is interaction patterns. • This is a crucial difference, and it shapes their use of formalism and the kind of abstraction used in the two disciplines.
Thesis about Abstractions in Mathematics & Abstractions in Computer Science • Mathematics, being primarily concerned with developing inference structures, has abstraction by forgetting (information neglect) as its abstraction objective. • Computer science, being primarily concerned with developing interaction patterns, has abstraction by hiding (information hiding) as its abstraction objective.
Conclusions • Any science, including computer science, succeeds by constructing formal mathematical models of its subject matter that eliminate inessential details. • We might call such elimination information neglect. • However, computer science is distinguished from mathematics in the use of a kind of abstraction that computer scientists call information hiding. • The complexity of behaviour of modern computing devices makes the task of programming them impossible without abstraction tools that hide, but do not neglect, details that are essential in a lower-level processing context but inessential in a software design and programming context (principle of separation of concerns). Why – Information hiding as separation of concerns enables the design of representations
1.2 Abstractions in Economics & Knowledge Representation
Abstractions as economic enablers – the approach in [Boisot 2005] Abstractions allow us to organize knowledge assets as accumulations that – produce a stream of useful services over time while – economizing on the consumption of physical resources, i.e. minimizing the rate of entropy consumption.
Two mechanisms to organize knowledge assets: 1. Abstraction • Abstraction – Knowledge that is set out in documents or in people’s heads has, of necessity, to be more abstract than knowledge that is embedded in artifacts. • A crucial difference between concrete and abstract knowledge is that the first type is confined to specific applications in space and time, whereas the second type is more general and less restricted in its scope. • E.g. Maxwell’s equations find application in a far larger number of contexts than can the more concrete knowledge embedded in the security code of an alarm system.
Two mechanisms to organize knowledge assets: 2. Codification • Codification – Mass-produced artifacts embed knowledge that, in order to be transmitted or reused, has to be systematically formalized and codified. • Knowledge that is expressed discursively, by contrast, usually allows for a greater degree of informality. Yet even discursive knowledge must be codifiable to some minimum degree if it is to be transmissible.
Moving in the abstraction-codification two-dimensional space – an example. We aim to manage in a database departments, employees, sales persons, for each employee the city where he/she was born, etc. [Figure: the requirements sentence above and ER schemas with labels Employee, Department, Sales person, City, Order, Item, Purchaser, Emp#, Emp Name, Dept#, Dept Name, Addr. and % of time, placed along the Codification and Abstraction axes]
Abstraction and Codification together as founding concepts • In summary, Abstraction and Codification constitute two founding concepts for the analysis of knowledge assets. • They represent two quite different though interrelated ways to lower the cost of converting potentially usable knowledge into knowledge assets. • Furthermore, they are crucial in getting value from knowledge asset exchange.
Value, utility, scarcity • The two concepts that characterize information/knowledge value are – utility and – scarcity.
First and second dimension of the I-Space: codification and abstraction • According to the neoclassical theory of value, utility measures what an individual economic agent (person or firm) gets out of consuming a given quantity of an economic good (value in use). • In Boisot’s approach, the utility that can be extracted from an information good is a function of codification and abstraction, two of the three dimensions in the I-Space.
The I-Space [Boisot 1995] [Figure: axes Codification and Abstraction]
The third dimension: diffusion • Value is partly relational: it measures not only the utility of a good but also its scarcity relative to demand and to alternative offerings. • The third dimension of the I-Space, diffusion, allows us to establish the scarcity of information products.
The I-Space (figure from Boisot 1995): max and min value of knowledge [Figure: axes Codification, Abstraction and Diffusion] Why – Abstraction enables sharing knowledge for (in this case) economic exchange purposes
Selective abstractions. Fisheye abstraction
Goals of Fisheye abstractions [Furnas 1992], [Gansner 2003] The Steinberg poster – A "fisheye" lens can show places nearby in great detail while still showing the whole world -- simply by showing the more remote regions in successively less detail.
Focus on… • Distortion – Graphically distorts the view, where items shrink as they move away from the focus point. • Filtering – A distance function determines whether or not items should appear on the display. • Topology driven fisheye view – The fisheye is determined by the topology.
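The first two mechanisms can be sketched as two small functions. This is a minimal illustration under stated assumptions: the one-dimensional positions, the `falloff` parameter and the particular shrinking formula are hypothetical choices, not the formulas used in [Furnas 1992] or [Gansner 2003].

```python
def distorted_size(base_size: float, distance: float, falloff: float = 0.5) -> float:
    """Distortion: the drawn size shrinks as the distance from the focus grows."""
    return base_size / (1.0 + falloff * distance)


def filtered(items: dict[str, float], focus: float, max_distance: float) -> list[str]:
    """Filtering: an item is displayed only if it is close enough to the focus."""
    return [name for name, pos in items.items() if abs(pos - focus) <= max_distance]


# Hypothetical items placed on a line; the focus sits on item "A".
places = {"A": 0.0, "B": 1.0, "C": 4.0, "D": 9.0}
focus = 0.0

visible = filtered(places, focus, max_distance=4.0)
sizes = {name: distorted_size(10.0, abs(places[name] - focus)) for name in visible}
```

Here `D` is filtered out entirely, while `A`, `B` and `C` stay on the display at decreasing sizes.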
Example of distortion by variable scale mapping [Harrie 2002]
Example of filtering [Gansner 2003] The 4elt graph, |N| = 15,606, |E| = 45,878. Approximating the 4elt graph at three different scales of decreasing size and accuracy: 4394-node approximation, 1223-node approximation, 341-node approximation
Topology driven fisheye view. Views are formed by superposition of several approximations of the graph. The focus area from the finest graph is in red. The three examples focus on • the right hand side • the small central hole • the left hand side. What – Remove detail, while enhancing relevant information
2. Selected abstraction types
Elementary vs multitype abstractions • Elementary a. – abstraction types that cannot be specialized into other abstraction types • Multitype a. – abstraction types that can be specialized into other (more elementary) abstraction types
Elementary abstractions. Generalization in conceptual modeling vs ontologies
Generalization in conceptual modeling: intension [Figure: ER schemas with entities Professor, Associate professor and Full professor and attributes Id, Date of birth and Place of birth, before and after generalization]
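Working on the intension means working on attribute sets. The sketch below moves the attributes shared by the specific entities up to the generic entity; the function name `generalize` and the attribute placement are assumptions made for illustration.

```python
def generalize(parent: str, children: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return a hierarchy in which the parent holds the common attributes
    and each child keeps only the attributes specific to it."""
    common = set.intersection(*children.values())
    hierarchy = {parent: common}
    for name, attributes in children.items():
        hierarchy[name] = attributes - common  # inherited attributes are not repeated
    return hierarchy


hierarchy = generalize(
    "Professor",
    {
        "Associate professor": {"Id", "Date of birth"},  # assumed placement
        "Full professor": {"Id", "Place of birth"},      # assumed placement
    },
)
```

The full intension of a specific entity is recovered as the union of its own attributes and those of the generic entity.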
Generalization in ontologies: intension & extension [Calegari 2009]
Elementary abstractions. Generalization and Aggregation in Conceptual Modeling vs GIS
Generalization and Aggregation Abstractions in the ER Model. Constructs of the ER model, each with its diagrammatic representation: Entity, Attribute of entity, Relationship, Attribute of relationship, Is-a hierarchy, Generalization hierarchy, Identifier
Metamodeling of GIS & Maps with ISO 191xx Standards. Example of Road-bridge application schema [2008 Shekhar]
Generalization in an old map…
Differently shaped generalizations in a geographical directory
Different metaphors for generalization
Instance vs class
Generalization and Aggregation
Elementary Abstractions. Generalization and Aggregation in Data Bases and Service Science
DB design and service design
• Database design: States of real world; Conceptual design produces the Conceptual schema; Logical design produces the Logical schema
• Service design: Changes of states of real world; Conceptual design produces the Abstract service; Logical design produces the Concrete services
Example of abstract service representation: Hotel reservation
• Service name: Hotel Reservation
• Functional property: a. Reserve a room in a hotel b. Change of state in Petri Net Notation
• Non-functional properties: a. List of non-functional properties: 1. Price 2. Payment method
• Data Schema: Reservation, Person, Hotel, check-in date, check-out date
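The four parts of the representation map naturally onto a record type. The sketch below is one possible encoding, not the notation of the tutorial: the class `AbstractService`, its field names, and the choice to attach the two dates to Reservation are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class AbstractService:
    name: str
    functional_property: str
    non_functional_properties: list[str] = field(default_factory=list)
    # Data schema as a mapping from concept to its attributes (assumed shape).
    data_schema: dict[str, list[str]] = field(default_factory=dict)


hotel_reservation = AbstractService(
    name="Hotel Reservation",
    functional_property="Reserve a room in a hotel",
    non_functional_properties=["Price", "Payment method"],
    data_schema={
        "Reservation": ["check-in date", "check-out date"],  # assumed placement
        "Person": [],
        "Hotel": [],
    },
)
```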
A list of conceptually related services. A list of abstract services: • Change of address • Change of addr. between two municipalities • Change of addr. between Italy and abroad • Change of addr. between two foreign countries • Update residence address • Update addr. in driver’s licence • Choose new doctor • Choose new electricity supplier
Example of conceptual schema for services related to change of home address [Batini 2015]. From a list of abstract services to a semantically structured repository [Figure: the services Change of address, Change of addr. between two municipalities, Change of addr. between Italy and abroad, Change of addr. between two foreign countries, Update residence address, Update addr. in driver’s licence, Choose new doctor and Choose new electricity supplier, connected by is-a and part-of relationships] What – to design service systems
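Moving from a flat list to a structured repository amounts to storing typed links between services. The sketch below is a hypothetical encoding: the class `ServiceRepository` is invented for illustration, and which service is linked to which by is-a or part-of is an assumption, not a transcription of the schema in [Batini 2015].

```python
from collections import defaultdict


class ServiceRepository:
    """Stores is-a and part-of links between abstract services."""

    RELATIONS = ("is-a", "part-of")

    def __init__(self):
        self._links = defaultdict(list)  # (relation, parent) -> list of children

    def add(self, relation: str, parent: str, child: str) -> None:
        if relation not in self.RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self._links[(relation, parent)].append(child)

    def children(self, relation: str, parent: str) -> list[str]:
        return list(self._links.get((relation, parent), []))


repo = ServiceRepository()
# Assumed reading: the three variants specialize the generic service ...
for variant in (
    "Change of addr. between two municipalities",
    "Change of addr. between Italy and abroad",
    "Change of addr. between two foreign countries",
):
    repo.add("is-a", "Change of address", variant)
# ... and the remaining services are components of it.
for component in ("Update residence address", "Update addr. in driver's licence"):
    repo.add("part-of", "Change of address", component)
```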
Elementary abstractions. Summarization in Big Data – the case of Linked Open Data
The LOD Cloud, 2007 and 2015 [Figure: the LOD cloud in 2007 and in 2015] What types of resources are there in a data set? How are they described? What types of resources are linked by a certain property and how frequently?
Two types of summarization • Fine grain summarization • Coarse grain summarization
Fine grain summarization in LOD
Example of fine grain summarization at different levels starting from a keyword
Coarse grain summarization
The Abstat approach [Spahiu 2015]: ABstraction and STATistics
• Relevance Based Summarization – Identifying subsets of data sets or ontologies that are considered to be more relevant.
• Pattern Based Approaches – Aim at extracting knowledge patterns for a complete representation of the dataset.
• Schema Induction – Induces a schema from the data and aims at extracting stronger assertions.
• Statistics about the dataset – Aims at reporting statistics about the usage of different vocabularies, properties and types in the data.
Abstat • ABSTAT (http://abstat.disco.unimib.it) is an ontology-driven linked data summarization framework • A summary provides a complete but compact schema-level representation of a data set: – A set of Abstract Knowledge Patterns (AKPs) – Statistics • For example, an AKP represents the fact that there are instances of type Person linked with instances of type Settlement by the property birthplace • Statistics: how many times does this pattern occur in the data set; how many times does a certain type occur as minimal type; how many times does the property occur in the dataset. What – to improve comprehension of the LOD cloud
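The idea of a pattern with occurrence counts can be sketched as follows. This is not the ABSTAT implementation: it assumes every instance carries exactly one type (standing in for its minimal type, which ABSTAT derives from the ontology), and the triples, instance names and the function `summarize` are hypothetical.

```python
from collections import Counter


def summarize(triples, types):
    """Count (subject type, property, object type) patterns, types and properties."""
    patterns = Counter()
    property_counts = Counter()
    type_counts = Counter(types.values())
    for subject, prop, obj in triples:
        property_counts[prop] += 1
        if subject in types and obj in types:
            patterns[(types[subject], prop, types[obj])] += 1
    return patterns, type_counts, property_counts


# Hypothetical instance-level data.
types = {
    "dante": "Person",
    "galileo": "Person",
    "florence": "Settlement",
    "pisa": "Settlement",
    "italy": "Country",
}
triples = [
    ("dante", "birthplace", "florence"),
    ("galileo", "birthplace", "pisa"),
    ("florence", "country", "italy"),
]

patterns, type_counts, property_counts = summarize(triples, types)
```

Two instance-level triples collapse into the single pattern (Person, birthplace, Settlement) with frequency 2, which is the compaction the summary aims at.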
Remark • I skip the following, I will come back in case I have time
Elementary abstractions. Granularity abstraction
Granular computing • Granular Computing is an emerging conceptual and computing paradigm of information processing. • A central notion is an information-processing pyramid with different levels of clarifications. Each level is usually represented by ‘chunks’ of data or granules, also known as information granules.
Introduction to granularity • Granularity deals with articulating something (hierarchically) according to certain criteria, the granular perspective, where a lower level within a perspective contains knowledge (i.e. entities, concepts, relations, constraints) or data (measurements, laboratory experiments etc.) that is more detailed than the adjacent higher level. • Conversely, a higher level ‘abstracts away’ – simplifies or makes indistinguishable – finer-grained details. • A granular level is also called grain size and contains one or more entities and/or instances.
Example of granularity process - 1
Example of granularity process - 2
Gestalt effect • Granularity can be investigated in terms of the gestalt effect: our brains and psychology, as human beings, are tuned to the perception of entities in a quite narrow range of spatial and temporal scales. • For example, we do not immediately perceive atoms or galaxies, although we are constantly “seeing” them. We don’t perceive geological eras, although we are in one.
Gestalt effect - example - Walk with a friend to the edge of a forest and ask him/her what s/he sees. The response will probably be “trees”, or perhaps “a forest” but it is unlikely to be “leaves”, even though this is also correct. The response is even more unlikely to be “cells” or “part of the biosphere”. - A tree frog probably perceives the world differently: its brain is most likely tuned to the leaf concept.
[Keet 2006] taxonomy. cG: consisting of the basic elements common to all types of granularity, such as • the domain demarcation, • a granular perspective (e.g. time, human structural anatomy), • levels within a perspective, • relations between the levels, and • constraints, such as that a granular perspective must have at least two levels
High level taxonomy • sG: scale-dependent granularity, where the contents are structured according to a (more or less obvious) arbitrary scale; consists of the cG and additional constraints. • E.g. calendar hierarchy, rounding off of altitude lines on a cartographic map. Why – find structures and regularities
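Both examples of scale-dependent granularity can be sketched in a few lines. The function names, the three calendar levels and the 100 m contour interval are assumptions chosen for illustration.

```python
from datetime import date


def calendar_levels(d: date) -> dict[str, str]:
    """Express one date at three levels of a calendar hierarchy."""
    return {
        "day": d.isoformat(),
        "month": f"{d.year}-{d.month:02d}",
        "year": str(d.year),
    }


def round_altitude(metres: float, interval: int = 100) -> int:
    """Round an altitude to the nearest contour line of the given interval."""
    return round(metres / interval) * interval


first = calendar_levels(date(2016, 11, 14))
second = calendar_levels(date(2016, 11, 17))
contour = round_altitude(1234.0)
```

The two dates differ at the day level but become indistinguishable at the month level, which is what a higher granular level does to finer-grained detail.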
2.2 Multitype Abstractions
Multi-type abstractions • Multi-level objects and multi-level relationships [Neumayr 2008], [Neumayr 2009], [Neumayr 2011] • Materialization • Powertypes • Deep instantiation
Characteristics of an m-object. An m-object can concretize another m-object, which is referred to as its parent. A concretize-relationship comprises classification-, generalization- and aggregation-relationships between the levels of an m-object and the levels of its parent, as follows: • Classification - Instantiation: Each m-object can be regarded as an instance of its parent m-object. In particular, the top-level of an m-object is an instance of the second top-level of its parent m-object. • Generalization - Specialization: The level descriptions of an m-object correspond to subclasses of the corresponding levels of its parent. The m-object can define new levels or add attributes to levels. • Aggregation - Decomposition: The concretization path between m-objects of different levels expresses an aggregation hierarchy at the instance level. At the schema level the aggregation hierarchy is given by the order of levels of an m-object.
Multi-faceted nature of m-objects. A concretization relationship between two m-objects does not reflect that one m-object is at the same time an instance of, component of, and subclass of another m-object as a whole. Rather, a concretization relationship between two m-objects, such as Car and Product in the figure, is to be interpreted in a multi-faceted way. • M-object Car is an instance of m-object Product with respect to that m-object's second-top level, Category, and gives values for attributes of that level, e.g., taxRate. • M-object Car is a component of m-object Product considered as an object of that m-object's top-level, Catalog. • Each level of m-object Car can be seen as a subclass of the corresponding level of m-object Product.
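The level constraint behind concretization can be sketched with a small class. This is a simplified illustration, not the formalization in [Neumayr 2009]: the class `MObject`, the extra level `Model` and the value given to `taxRate` are hypothetical.

```python
class MObject:
    """An m-object with an ordered list of levels and an optional parent."""

    def __init__(self, name, levels, parent=None, values=None):
        if parent is not None:
            inherited = parent.levels[1:]  # parent's levels below its top level
            # Classification facet: the top level of the child must be the
            # second-top level of the parent.
            if not inherited or levels[0] != inherited[0]:
                raise ValueError("top level must be the parent's second-top level")
            # The child may add levels, but must keep the parent's in order.
            remaining = iter(levels)
            if not all(level in remaining for level in inherited):
                raise ValueError("parent levels must be kept, in order")
        self.name = name
        self.levels = list(levels)
        self.parent = parent
        self.values = dict(values or {})  # values for attributes of the top level

    def ancestors(self):
        """Concretization path upwards: the aggregation facet at instance level."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


# "Model" is a hypothetical third level; 20 is a hypothetical tax rate.
product = MObject("Product", ["Catalog", "Category", "Model"])
car = MObject("Car", ["Category", "Model"], parent=product, values={"taxRate": 20})
```

Car is accepted because its top level, Category, is the second-top level of Product, and it supplies a value for an attribute of that level.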
Product catalog modeled with m-objects and m-relationships