Knowledge Organization in the Light of Intertextual Semantics

Knowledge Organization in the Light of Intertextual Semantics: A Natural-Language Analysis of Controlled Vocabularies
Yves MARCOUX, Élias RIZKALLAH
GRDS – EBSI, Université de Montréal
ISKO 2008 - Montréal

Overview
• Intertextual semantics (IS)
• IS's view of controlled vocabularies (CVs)
• Example
• Consequences of IS view
• Future work

Intertextual semantics (IS)
• A way to envision how meaning is conveyed by information-bearing objects
• Based on natural language (NL)
• Not a semantics for natural language
• Rather a natural-language semantics for artificial information-bearing objects
• Goal: design "better" information-bearing objects (more effective and usable)

Scope of IS reflection
• Information-bearing objects
  – Primarily structured documents (e.g., XML)
  – Any data structure designed to hold information in an information system
    • Ex.: database table / record / field
• Communication of meaning to human persons interacting with the object through any kind of interface

IS – Background (1/2)
• Introduced at Extreme Markup Languages (EML) 2006
  – valid XML documents only
  – modeler-author communication
  – further development (EML 2007)
• Applied to classical data structure for information exchange (SIGDOC 2007)

IS – Background (2/2)
• One in a series of semiotics-based approaches to improve systems design
  – Knuth (1984), De Souza (2005)
• One in a series of semantic frameworks for structured documents (XML, etc.)
  – Sperberg-McQueen et al. (2000), Renear et al. (2002), Wrightson (2005)

Example
Facts about some US cities
City | Population | Annual snowfall (inches)
Denver | 850,000 | 23
Rochester | 240,000 | 88
Palm Spring | 48,000 | 0

Modeler prepares “peritext” segments
Element | text-before | text-after
facts-about-US-cities | "Here are facts about some US cities. " | empty
city | " The city " | ". "
name | "named " | empty
population | " has a population of " | " inhabitants "
annual-snowfall-in-inches | " and an annual snowfall of " | " inches"

Possible “semantic” (or IS) view for authors
Here are facts about some US cities.
The city named Denver has a population of 850,000 inhabitants and an annual snowfall of 23 inches.
The city named Rochester has a population of 240,000 inhabitants and an annual snowfall of 88 inches.
The city named Palm Spring has a population of 48,000 inhabitants and an annual snowfall of 0 inches.
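The way such a view can be derived from the peritext table is easy to sketch in code. The following is a minimal illustration, not the authors' implementation: it assumes the facts are stored in an XML document whose element names match the peritext table (the slides do not show that document, so `DOC` is a guess), and the names `PERITEXTS`, `render`, and `is_view` are hypothetical. Each element is rendered as its text-before, its own content, its rendered children, and its text-after.

```python
import xml.etree.ElementTree as ET

# Peritext table from the slide: element name -> (text-before, text-after).
PERITEXTS = {
    "facts-about-US-cities": ("Here are facts about some US cities. ", ""),
    "city": (" The city ", ". "),
    "name": ("named ", ""),
    "population": (" has a population of ", " inhabitants "),
    "annual-snowfall-in-inches": (" and an annual snowfall of ", " inches"),
}

# Assumed XML encoding of the table of facts (not shown in the slides).
DOC = """
<facts-about-US-cities>
  <city>
    <name>Denver</name>
    <population>850,000</population>
    <annual-snowfall-in-inches>23</annual-snowfall-in-inches>
  </city>
  <city>
    <name>Rochester</name>
    <population>240,000</population>
    <annual-snowfall-in-inches>88</annual-snowfall-in-inches>
  </city>
  <city>
    <name>Palm Spring</name>
    <population>48,000</population>
    <annual-snowfall-in-inches>0</annual-snowfall-in-inches>
  </city>
</facts-about-US-cities>
"""


def render(element):
    """Concatenate text-before, element content, rendered children, text-after."""
    before, after = PERITEXTS.get(element.tag, ("", ""))
    parts = [before, (element.text or "").strip()]
    # Text that follows a child element (its "tail") is ignored in this sketch.
    parts.extend(render(child) for child in element)
    parts.append(after)
    return "".join(parts)


def is_view(xml_source):
    """Return the natural-language view, with runs of whitespace collapsed."""
    raw = render(ET.fromstring(xml_source.strip()))
    return " ".join(raw.split())


print(is_view(DOC))
```

Collapsing whitespace at the end is a convenience of this sketch: it lets the peritexts keep the leading and trailing spaces shown in the table without producing double spaces in the output.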

Example
• Raw XML document:
  <billing>
    <amount-burial>1205.47</amount-burial>
    <payable-burial>D</payable-burial>
    <amount-cremation>788.00</amount-cremation>
    <payable-cremation>F</payable-cremation>
  </billing>

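IS view
This section gives the billing information for this order.
Amount charged for the burial service: 1205.47 canadian dollars; this amount is payable by: D (D = Funeral director; F = Family).
Amount charged for the cremation service: 788.00 canadian dollars; this amount is payable by: F (D = Funeral director; F = Family).
End of billing information section.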

IS specification of the model (peritexts prepared by modeler)
Element | text-before | text-after
billing | "This section gives the billing information for this order. " | " End of billing information section. "
amount-burial | "Amount charged for the burial service: " | " canadian dollars; "
payable-burial | "this amount is payable by: " | " (D = Funeral director; F = Family). "
amount-cremation | "Amount charged for the cremation service: " | " canadian dollars; "
payable-cremation | "this amount is payable by: " | " (D = Funeral director; F = Family). "

IS – Key ideas
• The semantic (IS) view is the reference interpretation and should convey, in NL, to humans, all the meaning intended / expected by the modeler
• The semantic (IS) view can (and should) contain hyperlinks to material not already known by the target community of users, but necessary to make sense of the data structure

IS – Hypothesis (ISH-1)
• The IS view of a document is one of the most workable incarnations of its meaning
  – Wittgensteinian position
• The (human) task of interpreting the IS view of a document is representative of the task of "understanding" the document

IS – Consequences on design
• An intricate structure of the prose in the IS view, or a high number of hyperlink traversals, indicates that the document (or data structure) is hard to understand
  – Gaps imply an incomprehensible document!
• Design goals for modelers are thus:
  – Prose as simple as possible (but no more)
  – Low number of hyperlink traversals

IS – Notes
• The network of resources anchored (via hyperlinks) in the semantic view suggests an actual interpretation (sense-making) path, but does not impose it
• Any specific reading of a document yields more information than the IS view, but the IS view is considered a minimum for all readings, and thus serves as a reference

Controlled vocabularies (CVs)
• Same scope as SKOS concept schemes:
  – Thesauri, classification schemes, subject heading systems, subject indexes, taxonomies
• CVs are data structures
  – Designed by information professionals
  – Populated by corpus analysts ("authors")
  – Used by document analysts to index documents, and by users to find documents

CVs in IS
• SKOS allows CVs to be expressed as XML documents
  – Eases the thought experiment of applying IS
• A CV can be expressed as a single XML document
  – Not as reductive as it sounds...
  – Example will concentrate on designer-author communication

SKOS example
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:skos="http://www.w3.org/2004/02/skos/core#">
    <skos:Concept rdf:about="http://www.my.com/#canals">
      <skos:definition>Manmade waterway used by watercraft or for drainage, irrigation, or water power</skos:definition>
      <skos:scopeNote>A feature type category for places such as the Erie Canal</skos:scopeNote>
      <skos:prefLabel>canals</skos:prefLabel>
      <skos:altLabel>drainage canals</skos:altLabel>
      <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
    </skos:Concept>
    <skos:Concept rdf:about="http://www.my.com/#hydrographic%20structures">
      <skos:prefLabel>hydrographic structures</skos:prefLabel>
    </skos:Concept>
  </rdf:RDF>

IS view of same example
[… Introductory section for the whole CV: background, purpose, scope, etc. (omitted) …]
Section for concept with formal identifier: http://www.my.com/#canals
This concept can be defined as Manmade waterway used by watercraft or for drainage, irrigation, or water power.
It can be used as A feature type category for places such as the Erie Canal.
The official accepted word or expression for referring to this concept is canals.
Another word or expression commonly used to refer to this concept is drainage canals are special cases of hydrographic structures.
End of section
Section for concept with formal identifier: http://www.my.com/#hydrographic%20structures
The official accepted word or expression for referring to this concept is hydrographic structures.
End of section

IS specification
• Table of text-before and text-after for all SKOS elements and attributes
• Specified by designer (modeler) of CV before it is populated
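As a rough illustration of what such a table could look like for the SKOS example, here is a sketch under stated assumptions. The peritexts for skos:Concept, skos:definition, skos:scopeNote, skos:prefLabel, and skos:altLabel follow the wording of the IS view shown earlier; the wording for skos:broader is hypothetical, as are the `{id}` placeholder convention and the helper names `skos`, `rdf`, `render`, and `is_view`. Attribute values (rdf:about, rdf:resource) are substituted into the text-before of the element that carries them.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def skos(name):
    """Qualified name in ElementTree's '{namespace}local' notation."""
    return "{%s}%s" % (SKOS_NS, name)


def rdf(name):
    return "{%s}%s" % (RDF_NS, name)


# Element -> (text-before, text-after). "{id}" stands for the value of the
# element's rdf:about or rdf:resource attribute (a convention of this sketch).
PERITEXTS = {
    skos("Concept"): ("Section for concept with formal identifier: {id} ",
                      "End of section "),
    skos("definition"): ("This concept can be defined as ", ". "),
    skos("scopeNote"): ("It can be used as ", ". "),
    skos("prefLabel"): ("The official accepted word or expression for "
                        "referring to this concept is ", ". "),
    skos("altLabel"): ("Another word or expression commonly used to refer "
                       "to this concept is ", ". "),
    # Hypothetical wording, not taken from the slides.
    skos("broader"): ("This concept is a special case of the concept with "
                      "formal identifier: {id}", ". "),
}

DOC = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:definition>Manmade waterway used by watercraft or for drainage, irrigation, or water power</skos:definition>
    <skos:scopeNote>A feature type category for places such as the Erie Canal</skos:scopeNote>
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:altLabel>drainage canals</skos:altLabel>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://www.my.com/#hydrographic%20structures">
    <skos:prefLabel>hydrographic structures</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>
"""


def render(element):
    """Render one element: text-before (with {id} filled in), content, children, text-after."""
    before, after = PERITEXTS.get(element.tag, ("", ""))
    ident = element.get(rdf("about")) or element.get(rdf("resource")) or ""
    parts = [before.replace("{id}", ident), (element.text or "").strip()]
    parts.extend(render(child) for child in element)
    parts.append(after)
    return "".join(parts)


def is_view(xml_source):
    """Natural-language view of the whole document, whitespace collapsed."""
    return " ".join(render(ET.fromstring(xml_source.strip())).split())


print(is_view(DOC))
```

The root rdf:RDF element has no entry in the table here, so it contributes no text; a real specification would presumably attach the introductory section for the whole CV to it.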

IS specification
• Makes explicit the often hidden complexity of the CV model for users
• Is an opportunity for specifying extra semantics of the CV model, over and above SKOS semantics
  – Ex.: "is-a" instead of just "broader term"
• Clearly shows the cognitive price of using artificial codes, e.g., numbers instead of names to identify concepts

Extensions
• If SKOS extensions are used (e.g., custom relationships), IS specification is even more useful, because there is no "standard" interpretation of extensions

Future work (1/2)
• Development of IS framework
  – From intertexts to geometrized text
  – Application to interface / interaction design
• Application to CVs
  – IS analysis of other uses of CVs, e.g., for indexing and searching
  – Work out an IS specification for a real CV and experiment

Future work (2/2)
• Integration of IS in SKOS
  – IS peritexts are not to be introduced by refinement of SKOS documentation properties
  – Rather through domain-specific XML elements and/or attributes

Thank you! Questions?
yves.marcoux@umontreal.ca