Metadata and semantic web
Arto Vitikka, Arctic Centre, University of Lapland
www.arcticcentre.org

Contents
Metadata
• introduction to metadata
• metadata on scientific data
• Open Archives Initiative
• examples
Semantic web tools and technologies in Finland
• introduction to semantic web technology
• development work in Finland
• examples

Introduction to metadata
• Data about data
• Used in several domains: research, geographical information systems, libraries and social media (tags in Flickr, Del.icio.us)
Sources used here:
• Introduction to Metadata, Online Edition, version 3.0, Tony Gill, Anne J. Gilliland, Maureen Whalen, and Mary S. Woodley, edited by Murtha Baca. http://www.getty.edu/research/conducting_research/standards/intrometadata/
• Wikipedia

Primary Functions of Metadata
• Organization and description. A primary function of metadata is the description and ordering of original objects or items in a repository or collection, as well as of the information objects relating to the originals.
• Creation, multiversioning and reuse of information objects. Multiple versions of the same object may be created for preservation, research, exhibit and dissemination. Administrative and descriptive metadata should be included by the creator or digitizer, especially if reuse is envisaged.
• Searching and retrieval. Good descriptive metadata is essential to users’ ability to find and retrieve relevant metadata and information objects.
• Validation. To ascertain the authoritativeness and trustworthiness of the information.

Primary Functions of Metadata /2
• Utilization and preservation. Metadata on information objects related to user annotations, rights tracking, and version control may be created. Digital objects also need to be subject to a continuous preservation regime and undergo processes such as refreshing, migration, and integrity checking to ensure their continued availability and to document any changes that might have occurred to the information object during preservation processes.
• Disposition. Metadata is a key component in documenting the disposition (e.g., accessioning, deaccessioning) of original objects and items in a repository, as well as of the information objects relating to those originals.

Benefits of structured metadata
The more highly structured an information object is, the more that structure can be exploited for searching, manipulation, and interrelating with other information objects and systems. Metadata then:
• certifies the authenticity and degree of completeness of the content;
• establishes and documents the context of the content;
• identifies and exploits the structural relationships that exist within and between information objects;
• provides a range of intellectual access points for an increasingly diverse range of users; and
• enables building new services that integrate and reuse existing information sources.

The Open Archives Initiative Protocol for Metadata Harvesting - OAI-PMH
• The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content.
• The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. The essence of the open archives approach is to enable access to Web-accessible material through interoperable repositories for metadata sharing, publishing and archiving.
• The OAI-PMH gives a simple technical option for data providers to make their metadata available to services, based on the open standards HTTP (Hypertext Transfer Protocol) and XML (Extensible Markup Language).
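Because OAI-PMH runs over plain HTTP, a request is just a URL with a few query parameters. The sketch below builds two such URLs. The verbs Identify and ListRecords and the oai_dc metadata prefix come from the OAI-PMH specification; the repository address and the helper name build_oai_request are hypothetical and used only for illustration.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"


def build_oai_request(base_url, verb, **arguments):
    """Return the URL of an OAI-PMH request that can be sent as an HTTP GET."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask the repository to describe itself.
identify_url = build_oai_request(BASE_URL, "Identify")

# Ask for all records in the Dublin Core format.
list_url = build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(list_url)
```

The repository answers each request with an XML document, so any HTTP client and XML parser are enough to act as a harvester.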

Definitions
• Data Provider: maintains one or more repositories (web servers) that support the OAI-PMH as a means of exposing metadata.
• Service Provider: issues OAI-PMH requests to data providers and uses the metadata as a basis for building value-added services. In this way, a Service Provider "harvests" the metadata exposed by Data Providers.
• Harvesting: refers specifically to the gathering together of metadata from a number of distributed repositories into a combined data store.
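The three roles above can be sketched as a small harvesting loop. This is a minimal illustration, not a complete harvester: the two repository addresses and their canned XML replies are invented, the fetch function stands in for a real HTTP request, and paging with resumption tokens is left out. The element names (header, identifier, datestamp) and the namespace follow the OAI-PMH response format.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Canned replies stand in for two hypothetical data providers.
CANNED_RESPONSES = {
    "https://repo-a.example.org/oai": """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:repo-a:1</identifier><datestamp>2010-01-15</datestamp></header>
    <header><identifier>oai:repo-a:2</identifier><datestamp>2010-02-01</datestamp></header>
  </ListIdentifiers>
</OAI-PMH>""",
    "https://repo-b.example.org/oai": """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:repo-b:7</identifier><datestamp>2010-03-03</datestamp></header>
  </ListIdentifiers>
</OAI-PMH>""",
}


def fetch(base_url):
    """Stand-in for an HTTP GET of a ListIdentifiers request to base_url."""
    return CANNED_RESPONSES[base_url]


def harvest(base_urls):
    """Service provider side: gather record headers into one combined store."""
    store = []
    for base_url in base_urls:
        root = ET.fromstring(fetch(base_url))
        for header in root.iterfind(".//oai:header", OAI_NS):
            store.append({
                "source": base_url,
                "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
                "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
            })
    return store


combined = harvest(sorted(CANNED_RESPONSES))
for entry in combined:
    print(entry["identifier"], entry["datestamp"])
```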

Services and applications
• The metadata that is harvested may be in any format that is agreed by a community. Dublin Core is specified to provide a basic level of interoperability.
• Thus, metadata from many sources can be gathered together in one database, and services can be provided based on this centrally harvested, or "aggregated" data.
• The link between this metadata and the related content is not defined by the OAI protocol.
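As an illustration of the Dublin Core baseline, the sketch below turns one oai_dc metadata record into a plain dictionary that could be stored in an aggregated database. The record content is invented for this example; the two namespace URIs are the ones used by the oai_dc format. Each field is kept as a list because Dublin Core elements may repeat. Placing a link to the content in dc:identifier is a common convention and is treated here as an assumption, since the protocol itself does not define it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented sample record in the oai_dc format.
SAMPLE_RECORD = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Snow depth observations (illustrative record)</dc:title>
  <dc:creator>Example, Author</dc:creator>
  <dc:subject>snow</dc:subject>
  <dc:subject>Arctic</dc:subject>
  <dc:identifier>https://repository.example.org/item/1</dc:identifier>
</oai_dc:dc>"""


def dublin_core_to_dict(xml_text):
    """Collect every Dublin Core element of a record as name -> list of values."""
    root = ET.fromstring(xml_text)
    prefix = "{" + DC_NS + "}"
    fields = defaultdict(list)
    for element in root.iter():
        if element.tag.startswith(prefix) and element.text:
            name = element.tag[len(prefix):]
            fields[name].append(element.text.strip())
    return dict(fields)


record = dublin_core_to_dict(SAMPLE_RECORD)
print(record["title"])
print(record["subject"])
```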

Services and applications / 2
• OAI-PMH does not provide a search across this data; it simply makes it possible to bring the data together in one place. In order to provide services, the harvesting approach must be combined with other mechanisms.
• OAI-PMH is technically very simple, but building coherent services that meet user requirements remains complex.
• A number of software systems support the OAI-PMH: Fedora, GNU EPrints, Open Journal Systems, DSpace, DigiTool and MetaLib among others.
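One such additional mechanism is a search index built on top of the harvested records. The sketch below is a deliberately naive keyword index, assuming the records have already been harvested into a list of dictionaries; the records, identifiers and function names are invented for illustration, and a real service would more likely use a dedicated search engine.

```python
from collections import defaultdict

# Invented records standing in for an already harvested, aggregated store.
HARVESTED = [
    {"identifier": "oai:repo-a:1", "title": "Reindeer herding and climate change"},
    {"identifier": "oai:repo-b:7", "title": "Climate change in Arctic policy"},
    {"identifier": "oai:repo-b:9", "title": "Sami languages in education"},
]


def build_index(records):
    """Map each lower-cased title word to the identifiers of records using it."""
    index = defaultdict(set)
    for record in records:
        for word in record["title"].lower().split():
            index[word].add(record["identifier"])
    return index


def search(index, query):
    """Return identifiers of records whose titles contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


index = build_index(HARVESTED)
print(search(index, "climate change"))
print(search(index, "sami"))
```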

Open Archives in Finland
• Doria contains digital collections of Finnish universities and polytechnics.
• The University of Lapland is now starting to implement an open archives system, integrated into Doria.
• Work starts with master's theses and will later extend to staff publications and the Lapland University Press.
• https://oa.doria.fi/

Examples
• Map of OA repositories: http://maps.repository66.org/
• Registry of Open Access Repositories: http://roar.eprints.org/
• The aim of ROAR is to promote the development of open access by providing timely information about the growth and status of repositories throughout the world.
• Arctic Open Archives application to serve the UArctic and the arctic science community?

More information
Sources used here and more information:
• Open Archives Forum - http://www.oaforum.org/tutorial/
• The Open Archives Initiative Protocol for Metadata Harvesting - http://www.openarchives.org/pmh/

Metadata on research data
• Description of research data
• Answers to questions: who, what, where, when, how and how to obtain the data
• International metadata standard:
  – Directory Interchange Format (DIF)
  – used in Global Change Master Directory
• Required, Highly recommended and Recommended fields
  – title, parameters, data center, summary, personnel, instrument, resolution, temporal and spatial coverage, etc.
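The three field levels lend themselves to a simple completeness check before a description is submitted to a portal. The sketch below is only an illustration: the slide names the fields and the levels but not which field belongs to which level, so the grouping in FIELD_LEVELS is an assumption, the field names are informal rather than official DIF element names, and the sample record is invented.

```python
# Assumed, illustrative grouping; the source lists the fields and the three
# levels but does not say which field belongs to which level.
FIELD_LEVELS = {
    "required": ["title", "parameters", "data center", "summary"],
    "highly recommended": ["personnel", "temporal coverage", "spatial coverage"],
    "recommended": ["instrument", "resolution"],
}


def missing_fields(record):
    """Return the empty or absent fields of a record, grouped by level."""
    report = {}
    for level, fields in FIELD_LEVELS.items():
        absent = [name for name in fields if not record.get(name)]
        if absent:
            report[level] = absent
    return report


def is_acceptable(record):
    """A record is acceptable when no required field is missing."""
    return "required" not in missing_fields(record)


# Invented sample description of a data set.
sample = {
    "title": "Snow depth observations (illustrative)",
    "parameters": ["snow depth"],
    "data center": "Example Data Centre",
    "summary": "Daily snow depth measurements from one station.",
    "temporal coverage": "2008-2009",
}

print(is_acceptable(sample))
print(missing_fields(sample))
```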

Data portals
• Global Change Master Directory (GCMD)
  – maintained by NASA
  – Earth science data sets and services relevant to global change
  – more than 30 000 descriptions of data and services
  – gcmd.nasa.gov
• Antarctic Master Directory
  – part of GCMD
  – about 6 400 data descriptions (3.3.2010)
  – national Antarctic data portals
• IPY Metadata Portal
  – part of GCMD
  – 363 descriptions (3.3.2010)

Benefits of metadata
• Facilitate access to data and maximise the use of data
• Avoid duplication of research and data collection
• Improve efficiency of scientific data management
• Facilitate new research through access to existing scientific data
• Improve cooperation and interoperability between disciplines
• Data may be valued more than the immediate publications it has generated
• Scientists cannot be expected to know how their data may be used in the future

Semantic web
• The Semantic web - the Internet of meanings - is the next generation of the Internet.
• The idea of the semantic web is to make content understandable for machines by binding it to some formal and meaningful description.
• It enables user communities to put machine-understandable content on the web, which can be shared and processed both by automated tools and by people.
• Integration and reuse of the information in new, unforeseen applications and domains becomes possible.

Semantic web / 2
• Ontologies are the infrastructure of the semantic web.
• Ontologies serve to make metadata understandable by computers; they define the way descriptive terms are interrelated and used in a given domain of interest.
• The semantic web concept makes finding the correct data and information more effective, while also ensuring the validity of the information and enabling language independence.
• For example, does "Nokia" refer to the town, rubber boots, car tires, the Nokia company or a Nokia phone?
• Or an annotation on ‘Paris’ in a web page can tell the computer explicitly that, in this context, the information is about the town of Paris, Texas, US.
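The Nokia and Paris examples can be sketched with a handful of subject-predicate-object statements, the basic data model of the semantic web. This is a toy illustration in plain Python rather than a real RDF toolkit: all identifiers under example.org are hypothetical, and the predicate names are written as short strings for readability.

```python
# Hypothetical identifier prefix; each distinct thing gets its own identifier,
# even when several things share the same human-readable label.
EX = "http://example.org/id/"

TRIPLES = [
    (EX + "nokia-town", "rdf:type", EX + "Town"),
    (EX + "nokia-town", "rdfs:label", "Nokia"),
    (EX + "nokia-company", "rdf:type", EX + "Company"),
    (EX + "nokia-company", "rdfs:label", "Nokia"),
    (EX + "paris-texas", "rdf:type", EX + "Town"),
    (EX + "paris-texas", "rdfs:label", "Paris"),
    (EX + "paris-texas", EX + "locatedIn", EX + "texas"),
]


def resources_with_label(label):
    """All things that carry the given label: the ambiguous keyword view."""
    return sorted(s for s, p, o in TRIPLES if p == "rdfs:label" and o == label)


def resources_of_type(label, type_uri):
    """Things with the given label that are also of the given type."""
    typed = {s for s, p, o in TRIPLES if p == "rdf:type" and o == type_uri}
    return [s for s in resources_with_label(label) if s in typed]


# A keyword alone is ambiguous; adding the type picks out the town.
print(resources_with_label("Nokia"))
print(resources_of_type("Nokia", EX + "Town"))
```

A page annotated with the identifier of the town, rather than with the bare word, leaves no room for confusion with the company.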

Semantic web / 3
• The development of the Semantic Web started about ten years ago.
• The European Commission has funded related research and development projects.
• The Semantic Computing Research Group at Aalto University has conducted Semantic Web technology development projects for several years.
• A variety of Semantic Web infrastructure services, such as the Finnish Ontology Library Service, and open source semantic tools for creating applications are available.
• Now we are at a stage where the Semantic Web is moving from being a vision to becoming reality.

Semantic web in Finland
Services
Finnish Ontology Library Service ONKI, http://www.yso.fi/
The ONKI service contains Finnish and international ontologies, vocabularies and thesauri needed for publishing your content cost-efficiently on the Semantic Web.
Ontologies are conceptual models identifying the concepts of a domain. They contain machine "understandable" descriptions of the relations between the concepts.
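To illustrate what a machine can do with described relations between concepts, the sketch below uses a tiny broader/narrower hierarchy to widen a search: a query for a general concept also finds content indexed with more specific ones. The concepts, documents and function names are invented for this example and are not taken from ONKI or any real ontology.

```python
# Hypothetical mini-ontology: each concept maps to its direct narrower concepts.
NARROWER = {
    "animals": ["mammals", "birds"],
    "mammals": ["reindeer", "arctic fox"],
    "birds": ["ptarmigan"],
}

# Invented documents, each indexed with a set of concepts.
DOCUMENTS = {
    "doc1": {"reindeer"},
    "doc2": {"ptarmigan"},
    "doc3": {"snow"},
}


def expand(concept):
    """Return the concept followed by every concept below it in the hierarchy."""
    result = [concept]
    for child in NARROWER.get(concept, []):
        result.extend(expand(child))
    return result


def search(concept):
    """Find documents indexed with the concept or any narrower concept."""
    wanted = set(expand(concept))
    return sorted(doc for doc, tags in DOCUMENTS.items() if tags & wanted)


print(expand("mammals"))
print(search("animals"))
```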

Semantic web in Finland /2
• Finnish General Upper Ontology (YSO) with ca. 20 000 concepts
• Besides the general ontology there are several special ontologies
• Ontologies have been created either based on existing vocabularies or from scratch
• The Finnish General Upper Ontology has been made available for users (ontology developers, content indexers, information searchers) by setting up an ontology server and providing applications for integrating the ontology into existing content management systems
• http://www.seco.tkk.fi/ontologies/

Semantic web open source tools
Semantic Portal Building Tools
• Lightweight multifaceted search engine for RDF data
• Browser-based semantic annotation tool
• Tool for Creating Semantic View-Based Search and Browsing Portals
• Generic View-Based RDF Search Engine
• A tool for creating static web sites based on semantic content
Semantic Information Extraction
• A framework for automatic annotation
• Automatic Information Retrieval Ontologically

Ontology services
• National Ontology Service ONKI
• Ontology repository
• Ontology server for publishing vocabularies
• Ontology Service for Geographical Data
• Ontology Service for Finding People and Organizations
• Ontology-based Annotation Assistant

Applications
CultureSampo
• Semantic web portal and a publication channel for Finnish cultural heritage.
• Content comes from over 20 different Finnish museums, libraries, archives and other sources, as well as from the Getty Foundation and Wikipedia.
• The system aggregates cross-domain content of various kinds including artifacts, paintings, sculpture, drawings, abstract art, novels, comics, web pages, folklore and runes of different kinds, fictive persons and places, folk music, photos, persons, organizations, historical events, videos, buildings, and cultural sites.

Applications / 2
TerveSuomi - HealthFinland - Demo
• Metadata, ontology, and service infrastructure - based on W3C semantic web recommendations, a domain-specific metadata schema (Dublin Core application), and a set of ontologies and services provided within the National Ontology Service.
• Semantic content creation process - for producing semantically annotated contents, based on the shared metadata model and ontologies.
• Semantic portal HealthFinland - material is published via a semantic portal that creates a single national entry-point for health information, health promotion and health-related news.
  – The information is collected from a diverse group of sources including expert organizations, governmental institutes and NGOs.
  – A quality control process

Conclusions
• Metadata enables the creation of new intelligent web services
• Reuse and integration of information
• Standards and tools exist
• Open source does not mean that the tools are free
• Building services still requires a lot of work and good funding
• Tools to build services for the arctic communities

Kiitos paljon! Thank You! Tack så mycket!