Mapping & Transforming Data for Semantic Integration
The X3ML Toolkit
Syntactic and Semantic Conversion of Metadata
RDA 17th Plenary Meeting, Edinburgh (Virtual), 20-23 April 2021
Maria Theodoridou
Foundation for Research & Technology – Hellas (FORTH), Institute of Computer Science (ICS)
maria@ics.forth.gr
www.ics.forth.gr/isl/cci

Outline
❑ Motivation - Goals
❑ Requirements
❑ Data Transformation Workflow
❑ X3ML toolkit
❑ X3ML Mapping Definition Language
❑ 3M Editor
❑ X3ML Engine
❑ Exploitation
❑ Pros & Cons

Motivation - Goals
❑ Institutions such as galleries, libraries, archives and museums curate different types of collections. Even among similar institutions, these collections are documented in different ways and in different languages, are influenced by different disciplines, objectives and geography, and are encoded using different metadata schemas.
❑ Handling such metadata as a unified whole is vital for advancing new fields of research and discovery, and for enabling better-informed information retrieval and (meta)data exchange.
❑ A unified source is the result of data aggregation and integration.
❑ Data aggregation and integration can create rich resources useful for a range of purposes, from research and data modelling to education and engagement. It is accomplished by combining several tools for modelling, cleaning, normalizing and transforming data.

Requirements
To facilitate data transformations while preserving or even enhancing the semantics of data.
❑ In terms of conceptualization
• Definition of guidelines and best practices
• Reliance on standards
• Specifications for supporting schema mappings
❑ In terms of technology
• Software for assisting users in describing their transformations
• Software for supporting data transformations, with emphasis on
❖ Configurability
❖ Extensibility
❖ Scaling
❖ Automation
❖ Ease of Use

Data Transformation Workflow
[Diagram, labels from top to bottom]
❑ Aggregator: Semantic Integration & Interoperability (CIDOC CRM & family of models)
❑ Transformation - X3ML engine
❑ Data Transformation Tools: Schema Matching, Terminology Mapping, URI generation specification - LOD
❑ Data Providers: Heterogeneous Data Collections

Example
[Same workflow diagram, labels from top to bottom]
❑ Aggregator: Semantic Integration & Interoperability
❑ Transformation - X3ML engine
❑ Data Transformation Tools: Schema Matching, Terminology Mapping, URI generation specification - LOD
❑ Data Providers: Heterogeneous Data Collections (Documents, Photos, Persons, Places, Objects)

X3ML Toolkit
A set of small, open source microservices with open interfaces, easily customized and adapted to complex environments, that assist the data provisioning process for information integration using X3ML, a mapping definition language:
❑ X3ML mapping definition language, an XML-based declarative language that describes schema mappings in such a way that they can be collaboratively created and discussed by experts.
❑ 3M (Mapping Memory Manager), a tool for managing mapping definitions.
❑ 3M Editor, a web application suite that assists users during the mapping definition process, using a human-friendly user interface and a set of sub-components that either suggest or validate the user input.
❑ X3ML Engine, a tool that transforms data resources to a target format according to an X3ML mapping definition.

X3ML Mapping Definition Language
❑ Specification: X3ML is a declarative, XML-based language that describes schema mappings in such a way that they can be collaboratively created and discussed by experts.
❑ Key Features
• It provides a declarative way of describing schema mappings
• Focuses on properly mapping schema resources
• Decoupled from the URI and value generation process
• Mappings are described using XML serialization
https://github.com/isl/x3ml/blob/master/docs/x3ml-language.md

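The idea of a declarative mapping that is kept apart from URI and value generation can be illustrated with a minimal Python sketch. This is not X3ML syntax and not the toolkit's API: the source element names, the `MAPPING` structure and the helpers `describe` and `matched_nodes` are hypothetical, and the CIDOC CRM identifiers are used only as illustrative targets.

```python
import xml.etree.ElementTree as ET

# Hypothetical source data; element names are invented for illustration.
SOURCE_XML = """
<collection>
  <object id="42"><title>Amphora</title></object>
  <object id="43"><title>Kylix</title></object>
</collection>
"""

# A declarative mapping held as plain data: it states WHAT maps to WHAT
# and says nothing about how URIs or literal values are produced.
MAPPING = {
    "domain": {"source_node": "object", "target_class": "crm:E22_Man-Made_Object"},
    "links": [
        {
            "source_relation": "title",
            "target_property": "crm:P102_has_title",
            "target_class": "crm:E35_Title",
        }
    ],
}


def describe(mapping):
    """Render the mapping as readable lines that experts can review."""
    domain = mapping["domain"]
    lines = [f"{domain['source_node']} -> {domain['target_class']}"]
    for link in mapping["links"]:
        lines.append(
            f"{domain['source_node']}/{link['source_relation']} -> "
            f"{link['target_property']} -> {link['target_class']}"
        )
    return lines


def matched_nodes(mapping, xml_text):
    """Count the source nodes selected by the domain of the mapping."""
    root = ET.fromstring(xml_text.strip())
    return len(root.findall(mapping["domain"]["source_node"]))


if __name__ == "__main__":
    print("\n".join(describe(MAPPING)))
    print(matched_nodes(MAPPING, SOURCE_XML), "source nodes matched")
```

Because the mapping is plain data, it can be reviewed, versioned and discussed without touching any transformation code, which is the property the key features above emphasize.
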
3M Editor
❑ Enables the creation of mapping definitions (X3ML) between source and target schemata
❑ Supports guided mappings by analyzing source resources and target schemata
❑ Provides user space and mapping storage
❑ Transforms data (to RDF format) using the X3ML Engine
http://www.ics.forth.gr/isl/3M

3M Editor
• Implemented using modern and responsive technologies
• Faster and lightweight (on the client side)
• Allows concurrent edits of mappings by different users (à la Google Docs)
• Beta version to be announced early 2020

X3ML Engine
Transforms data resources to a target format according to an X3ML mapping definition.
Main principles:
❑ Simplicity by design
❑ Transparency in terms of expected output
❑ Re-use of standards and technologies as much as possible
❑ Facilitating the instance matching process
❑ Available as: API, executable (console-based & GUI), service
History:
✔ Designed by FORTH.
✔ Initial development by DELVING B.V. with the support and contribution of FORTH (until v. 1.3).
✔ FORTH has led the full development since 3/2015.
✔ 24 releases (latest: v. 1.9.4, 8/2020)
[Diagram labels: X3ML, Input, X3ML Engine, Generator Policy, Terminology, Ontology-based descriptions]
https://github.com/isl/x3ml

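The separation between the transformation step and the generator policy can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the engine's implementation or API: the `example.org` namespace, the input record, and the functions `readable_uri`, `opaque_uri` and `transform` are all hypothetical, and the mapping is hard-coded for brevity.

```python
import uuid
import xml.etree.ElementTree as ET
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not from the toolkit

# Hypothetical input record; element names are invented for illustration.
RECORDS = "<collection><object id='42'><title>Amphora</title></object></collection>"


def readable_uri(kind, key):
    """Policy 1: human-readable URI built from the record identifier."""
    return f"{BASE}{kind}/{quote(key, safe='')}"


def opaque_uri(kind, key):
    """Policy 2: opaque but deterministic URI derived from a name-based UUID."""
    return f"{BASE}{kind}/{uuid.uuid5(uuid.NAMESPACE_URL, kind + ':' + key)}"


def transform(xml_text, make_uri):
    """Apply a fixed object/title mapping; URI creation is delegated to make_uri."""
    triples = []
    for obj in ET.fromstring(xml_text).findall("object"):
        obj_id = obj.get("id")
        subject = make_uri("object", obj_id)
        triples.append((subject, "rdf:type", "crm:E22_Man-Made_Object"))
        title = obj.findtext("title")
        if title:
            title_uri = make_uri("title", obj_id)
            triples.append((subject, "crm:P102_has_title", title_uri))
            triples.append((title_uri, "rdf:type", "crm:E35_Title"))
            triples.append((title_uri, "rdfs:label", title))
    return triples


if __name__ == "__main__":
    for triple in transform(RECORDS, readable_uri):
        print(triple)
```

Swapping `readable_uri` for `opaque_uri` changes only the identifiers in the output; the structure of the generated description stays the same, which mirrors the idea of keeping the schema mapping independent of the URI policy.
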
Exploitation / Assets Matrix
Columns: X3ML, 3M Editor, X3ML Engine
FP7 ARIADNE ✔ ✔ ✔
H2020 BlueBRIDGE ✔
H2020 BlueCloud ✔ ✔ ✔
H2020 VRE4EIC ✔ ✔ ✔
H2020 PARTHENOS ✔ ✔ ✔
H2020 SSHOC ✔ ✔ ✔
H2020 ARIADNEplus ✔ ✔ ✔
H2020 SeaLiT ✔ ✔ ✔
British Museum ✔ ✔

X3ML Toolkit pros & cons
Pros
❑ Simple model for defining mappings
❑ Supports incremental changes of source & target schemata
❑ Shallow learning curve
❑ Supports customized URI generation policies
❑ Decouples schema mapping from URI specification
❑ Scalability (good for small and large datasets, memory issues with huge datasets - big data)
❑ Easily deployed in different environments
❑ Promotes the collaborative work of experts
Cons
❑ Currently only XML → RDF
❑ URI specification needs technical skills

Useful links
The X3ML Toolkit: https://www.ics.forth.gr/isl/x3ml-toolkit
The source code is open source and available on GitHub:
• 3M - Mapping Memory Manager: https://github.com/isl/Mapping-Memory-Manager
• 3M Editor: https://github.com/isl/3MEditor
• X3ML Engine: https://github.com/isl/x3ml
Free-to-use deployment of 3M: https://isl.ics.forth.gr/3M/

Thank you for your attention!
Maria Theodoridou
Foundation for Research and Technology – Hellas (FORTH)
maria@ics.forth.gr