When terminology and semantic web meet
Denis Dechandon, Eugeniu Costeţchi, Anikó Gerencsér, Anne Waniart
Metadata Sector, Unit A.1 - Standardisation, Directorate A - Information Management
Publications Office of the European Union
Translating & the Computer 40, London, 15-16/11/2018
Outline
• Publications Office
  • Our Raison d’être
  • Linguistic tools and their purposes
  • Around controlled vocabularies
• EuroVoc
  • A few milestones and figures
  • Advantages and limitations
  • A tool ready for the semantic web
  • A tool supporting multilingualism
  • A tool supporting the access to information
• Different animals
  • Controlled vocabularies vs. terminological resources
  • Terminology resources and controlled vocabularies – Compared use
  • Some particularities
• VocBench 3
  • An authoring tool
  • Or how to push the frontiers
  • Next steps
• Perspectives
• Take-aways
Publications Office – Our Raison d’être
• Production
• Access and re-use
• Long-term preservation
Publishing EU law and other information from EU institutions
https://publications.europa.eu/en/home
Linguistic tools and their purposes
• Interinstitutional Style Guide: http://publications.europa.eu/code/en/en-000100.htm
• Authority lists: https://publications.europa.eu/en/web/eu-vocabularies/authority-tables
• Taxonomies, e.g. CORDIS (not published), CPV
• A thesaurus: EuroVoc
• Ontologies: e-Procurement (under construction), ELI
Around controlled vocabularies (1)
Alignments with EuroVoc
• EU projects
• Ongoing:
  • Thesaurus of the Court of Justice of the European Union
  • STW: German National Library of Economics
  • LegiLux: Ministry of State, Luxembourg
  • European Commission DG HR – Functions and Domains Taxonomy
• Foreseen:
  • Publications Office of the EU, CORDIS – EuroSciVoc (Field of Science Taxonomy)
EU Member States – N-Lex project
• N-Lex project: single entry point to the national law databases on individual EU countries
  • http://eur-lex.europa.eu/n-lex//index_en
  • Establishing links between Member State law and EU law information
• Ministry of State, Luxembourg: LegiLux
  • Presentation: Law via the Internet conference
  • Alignment with EuroVoc
• France (EC-funded project): LegiVoc http://lynx-project.eu/
• Foreseen: Croatia, Finland
Around controlled vocabularies (2)
• SEMIC
• N-Lex
• ISA²
• Joinup
• BARTOC.org
• EU Vocabularies
• IMSB and DGs of the European Commission
• EU institutions, bodies and agencies
EuroVoc – A few milestones and figures (1)
• 1982
  • Comparative study
  • Decision to construct a multilingual thesaurus
  • Compliant with the relevant international standards
• 1984
  • First edition in seven languages (Danish, Dutch, German, Greek, English, French and Italian)
  • Immediately put into use at the European Parliament and the Publications Office (respectively owner and manager of the thesaurus)
• 1995
  • descriptors in their semantic context and includes references for non-descriptors. Based on this edition, EuroVoc has been used as an indexing tool for the documentary databases of the Publications Office for the production of catalogues and tables of the Official Journal.
EuroVoc – A few milestones and figures (2)
• 1999
  • Two new interinstitutional committees made up of representatives of the institutions involved in the project
    • The Steering Committee
    • The Maintenance Committee
• 2016
  • EuroVoc downloadable in SKOS/RDF
• 2018
  • Available in all EU official languages + 3 others
  • Re-used by EU institutions, national and regional parliaments, and economic operators
  • 7,180 concepts, each of them consisting of
    • preferred and non-preferred terms
    • semantic relations to other concepts (hierarchical conceptual network)
  • And containing in most cases
    • definitions, scope notes, synonyms and quasi-synonyms
    • linguistic equivalents in all EU official languages
  • 400,000 labels in multiple natural languages
  • Split into 21 domains and 127 microthesauri represented as SKOS concept schemes
  • Available on the EU Vocabularies website and on the EU Open Data Portal
EuroVoc – Advantages and limitations
• Advantages
  • Terminological standardisation of indexing vocabularies: more accurate documentary searches
  • Multilingualism: documents are indexed in the language of the documentalist, while searches can be made in the user’s language
  • Available on the web
• Limitations
  • Designed to meet the needs of systems of general documentation on the activities of the European Union
  • Does not cover the various national situations at a sufficiently detailed level
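The multilingual advantage can be illustrated with a small Python sketch. It is not how the Publications Office systems work internally; the concept identifiers, labels and document IDs below are invented for illustration. The point is only that documents are indexed against language-neutral concepts, so a label entered in any language resolves to the same set of documents.

```python
from collections import defaultdict

# Hypothetical concepts with one preferred label per language (all values invented).
LABELS = {
    "c_0001": {"en": "fishing industry", "fr": "industrie de la pêche", "de": "Fischereiindustrie"},
    "c_0002": {"en": "water policy", "fr": "politique de l'eau", "de": "Wasserpolitik"},
}


def build_label_lookup(labels):
    """Map (language, lower-cased label) to a concept identifier."""
    lookup = {}
    for concept_id, by_lang in labels.items():
        for lang, label in by_lang.items():
            lookup[(lang, label.lower())] = concept_id
    return lookup


class ConceptIndex:
    """Documents are attached to concepts, never to language-specific terms."""

    def __init__(self, labels):
        self._lookup = build_label_lookup(labels)
        self._docs = defaultdict(set)

    def index(self, doc_id, label, lang):
        # The documentalist tags a document using a label in their own language.
        concept_id = self._lookup[(lang, label.lower())]
        self._docs[concept_id].add(doc_id)

    def search(self, label, lang):
        # The searcher uses their own language; unknown labels return no documents.
        concept_id = self._lookup.get((lang, label.lower()))
        return sorted(self._docs.get(concept_id, set()))


idx = ConceptIndex(LABELS)
idx.index("doc-1", "industrie de la pêche", "fr")
idx.index("doc-2", "Fischereiindustrie", "de")
print(idx.search("fishing industry", "en"))  # documents indexed in French and German
```

Because the index key is the concept identifier, adding a new language only means adding labels; the indexed documents are untouched.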
EuroVoc – A tool ready for the semantic web
• Built according to the ISO 25964-1:2011 standard
  • Maintenance and development of thesauri intended for information storage and retrieval
  • Concept-oriented and not term-oriented
• Implemented using the Simple Knowledge Organization System (SKOS) model (W3C), which
  • Provides a model for expressing the basic structure and content of concept schemes
  • Supports interoperability
    • Linking of concepts from different datasets (mappings)
    • Reuse and sharing of concepts and their descriptions
  • Concepts are identified with URIs
  • Mappings
    • Exact match, close match
    • Broader match, narrower match or related match
EuroVoc – A tool supporting multilingualism
• A fundamental principle of the European Union
• Role of the Directorate-General for Translation (European Commission)
• Not terms but concepts (not translating but rather providing equivalents in other languages)
• Provision of equivalents of non-preferred labels
Should the structure be the same for each language?
EuroVoc – A tool supporting the access to information
• Guides the indexer and the searcher to choose the same term for the same concept
• Lists all the relevant concepts
• Provides corresponding preferred terms for each concept
• Allows easy navigation between concepts, for example from broader to narrower or to related ones
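The following Python sketch shows, on a toy thesaurus, how a non-preferred term can be resolved to the preferred term of its concept and how narrower and related concepts can be reached from there. The three concepts and their labels are invented for illustration, and the function names are assumptions, not part of any EuroVoc tool.

```python
# Toy thesaurus (invented data): each concept has one preferred term,
# optional non-preferred terms, and links to broader and related concepts.
THESAURUS = {
    "c1": {"pref": "agriculture", "alt": ["farming"], "broader": [], "related": []},
    "c2": {"pref": "crop production", "alt": ["plant production"], "broader": ["c1"], "related": ["c3"]},
    "c3": {"pref": "fertiliser", "alt": ["fertilizer"], "broader": ["c1"], "related": ["c2"]},
}


def find_concept(term):
    """Return the concept id whose preferred or non-preferred term matches, else None."""
    wanted = term.lower()
    for concept_id, entry in THESAURUS.items():
        candidates = [entry["pref"]] + entry["alt"]
        if wanted in (c.lower() for c in candidates):
            return concept_id
    return None


def preferred_term(term):
    """Indexer and searcher both end up with the same preferred term."""
    concept_id = find_concept(term)
    return THESAURUS[concept_id]["pref"] if concept_id else None


def narrower(concept_id):
    """Narrower concepts are those that list this concept as broader."""
    return sorted(cid for cid, entry in THESAURUS.items() if concept_id in entry["broader"])


def related(concept_id):
    return sorted(THESAURUS[concept_id]["related"])


print(preferred_term("Farming"))
print([THESAURUS[cid]["pref"] for cid in narrower("c1")])
print([THESAURUS[cid]["pref"] for cid in related("c2")])
```

Storing only the broader links and deriving the narrower ones keeps the hierarchy consistent in this sketch; a real thesaurus tool may store both directions.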
Controlled vocabularies vs. terminological resources
Terminology resources and controlled vocabularies – Compared use
• Find terms
• Understand them
• Get their linguistic equivalents
• Tag pages
• Index content
• Find (search, access) information
• Retrieve it
• Reuse them
• Produce meaningful and fit for purpose content in various languages
Some particularities
Controlled vocabularies
• Available online
• Downloadable
• Multilingual
• Controlled
• Used by indexers, linguists… and machines (semantic web)
• Maintained with VocBench 3
  • open-source solution
  • web-based
  • Collaborative
  • Multilingual
  • Fully compliant with W3C standards
• EuroVoc wiki
• GIL
• For an improved discoverability and interoperability
Terminology resources
• Downloadable
• Multilingual
• Used by linguists… and machines (term recognition/CAT tools)
• Maintained with IATE, with the additional use of Excel and the EurTerm wiki (interinstitutional portal for terminology)
• Coordination
VocBench 3 – An authoring tool
• VocBench
  • Open-source web-based platform
  • Collaborative editing and management of controlled vocabularies (ontologies, thesauri, taxonomies, authority lists and glossaries)
  • Supports multilingualism
• Fully compliant with W3C standards
• Production and publication of Linked Open Data
• Used by several public administrations in the EU Member States as well as EU institutions and international organisations
https://joinup.ec.europa.eu/solution/vocbench3
VocBench 3 – Or how to push the frontiers (1)
URI
VocBench 3 – Or how to push the frontiers (3)
ECLAS, GEMET, Agrovoc, UNESCO Thesaurus, GND, INSPIRE, EuroVoc, UNBIS, STW, RAMEAU, LCSH, UMTHES, MESH
VocBench 3 – Or how to push the frontiers (4)
LEGILUX, International organisations, FDK, National legal vocabularies, EuroVoc, European Commission vocabularies, LEGIVOC, EU institutions, EU agencies, DET
VocBench 3 – Or how to push the frontiers (5)
VocBench 3 – Or how to push the frontiers (6)
VocBench 3 – Or how to push the frontiers (7)
VocBench 3 – Next steps
A 1st perspective
“The domain of each IATE entry should be identified by the most relevant EUROVOC descriptor. If the descriptor is from one of the top three levels of EUROVOC, it appears in the IATE Domain field. If it is from a lower level of EUROVOC, it appears (in English) in the Domain Note field, alongside the corresponding third-level domain” (IATE Handbook, Appendix B)
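The rule quoted from the IATE Handbook can be read as a short procedure, sketched here in Python. The hierarchy is invented, each descriptor is assumed to have a single parent, and "level" is taken to mean depth counted from the root; none of these assumptions comes from the Handbook, and the field names in the returned dictionary are illustrative only.

```python
# Hypothetical single-parent hierarchy: child -> parent (None marks the top level).
PARENT = {
    "domain": None,
    "microthesaurus": "domain",
    "top descriptor": "microthesaurus",
    "narrower descriptor": "top descriptor",
    "specific descriptor": "narrower descriptor",
}


def path_from_root(descriptor):
    """Return the chain of descriptors from the top level down to the given one."""
    path = []
    node = descriptor
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return list(reversed(path))


def iate_domain_fields(descriptor):
    """Apply the quoted rule under the assumptions stated above."""
    path = path_from_root(descriptor)
    if len(path) <= 3:
        # Top three levels: the descriptor itself goes in the Domain field.
        return {"domain": descriptor, "domain_note": None}
    # Lower levels: the descriptor goes in the Domain Note field,
    # alongside its third-level ancestor as the domain.
    return {"domain": path[2], "domain_note": descriptor}


print(iate_domain_fields("microthesaurus"))
print(iate_domain_fields("specific descriptor"))
```

With a polyhierarchical vocabulary, where a descriptor can have several broader concepts, the third-level ancestor would no longer be unique and a selection rule would be needed.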
More concretely
• EuroVoc domains
• Improved discoverability
A 2nd perspective – Move from Excel to VocBench
• VocBench
  • Open-source web-based platform
  • Collaborative editing and management of controlled vocabularies (ontologies, thesauri, taxonomies, authority lists and glossaries)
  • Supports multilingualism
• Fully compliant with W3C standards
• Production and publication of Linked Open Data
• Used by several public administrations in the EU Member States as well as EU institutions and international organisations
A 3rd perspective – Make them 5-star Linked Open Data
https://www.w3.org/TR/ld-glossary/
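As a reminder of what the five stars mean, the Python sketch below rates a dataset against the cumulative criteria of the 5-star open data scheme: on the web under an open licence, structured, in a non-proprietary format, using URIs to identify things, and linked to other data. The `Dataset` class and its field names are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    open_licence_on_web: bool      # 1 star: available on the web under an open licence
    structured: bool               # 2 stars: machine-readable structured data
    non_proprietary_format: bool   # 3 stars: open, non-proprietary format
    uses_uris: bool                # 4 stars: things are identified with URIs
    linked_to_other_data: bool     # 5 stars: links to other people's data


def star_rating(ds):
    """Stars are cumulative: counting stops at the first criterion that is not met."""
    criteria = [
        ds.open_licence_on_web,
        ds.structured,
        ds.non_proprietary_format,
        ds.uses_uris,
        ds.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# A glossary published online as a spreadsheet in a proprietary format.
spreadsheet = Dataset(True, True, False, False, False)
# The same resource published as SKOS with URIs and mappings to other vocabularies.
skos_with_mappings = Dataset(True, True, True, True, True)

print(star_rating(spreadsheet), star_rating(skos_with_mappings))
```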
Take-aways
• Publish your terminological resources
• SKOSify them
• Opt for pURIs
• Align your terminological resources with others
• Map concepts…
Make them 5-Star Linked Open Data!
Thank you! Do you have any questions?
Denis Dechandon, Eugeniu Costeţchi, Anikó Gerencsér, Anne Waniart
Publications Office of the European Union
A1.002 - Metadata Sector, Head; Unit A.1 - Standardisation; Directorate A – Information Management
+352 29 29 1
Name.Surname@publications.europa.eu
https://publications.europa.eu/en/home