
  • Slides: 99
VT 10/7/2020 1


Ontology and the Integration of Medical Knowledge
Barry Smith

IFOMIS: Institute for Formal Ontology and Medical Information Science, Faculty of Medicine, University of Leipzig

http://ifomis.de
Institut für formale Ontologie und Medizinische Informationswissenschaft

Ontology as a branch of philosophy: the science of the kinds and structures of objects, qualities, processes, events, functions and relations in all domains of reality

Aristotle: the first ontologist

A biological ontology

Linnaeus 1763: Genera Morborum (nosology, or the ontology of kinds of disease)

Q: Why ‘ontology’ in computer science?
A: The Tower of Babel problem of information systems

The Tower of Babel problem
Every hospital information system uses its own systems of terms and categories to organize the information entered. How can we resolve the incompatibilities that arise when information from different sources is brought together?
Compare: How can we fuse chemistry and biology? How can we fuse anatomy and physiology?

How do (for example) medical students solve this problem? Through the encounter with the patient.
The patient and the processes taking place within him serve as a crystallization point for a meaningful ordering of otherwise isolated (learned) facts. Integration arises through the formation of practical knowledge (knowing-that becomes knowing-how).

Computers are dumb
Analogously, in medical information systems isolated knowledge artifacts must be integrated into unified and applicable knowledge. But how?

The original dream of ontology in computer science
A single all-encompassing taxonomy of all kinds of objects, which would serve as the central, unified category system for all information systems. This dream is over...

Current solutions
Standardized terminologies: UMLS, SNOMED, HL7, ICD-10, etc.

UMLS: Unified Medical Language System, National Library of Medicine, Bethesda, MD
A compilation of various machine-readable source terminologies

Example 1: UMLS
134 semantic types, 800,000 concepts, 10 million interconcept relationships

Example 2: SNOMED-RT
Systematized Nomenclature of Medicine, a Reference Terminology of the College of American Pathologists

Example 2: SNOMED-RT
121,000 concepts, 340,000 relationships
“common reference point for comparison and aggregation of data throughout the entire healthcare process”

Standardized terminologies are meant to facilitate access to the biomedical literature and to factual databases. A new kind of medical research is thereby to be made possible.

Blood

Representation of Blood in UMLS: blood as tissue

Representation of Blood in MeSH: blood as body fluid

Database standardization is desperately needed in medicine … to enable the huge amounts of data resulting from clinical trials by different groups working on the same drugs/therapies/diagnostic methods … to be fused together

How do we make ONE SYSTEM out of this?
To reap the benefits of standardization we need to resolve such incompatibilities.

Deficits of traditional coding systems (SNOMED)
1 DB-62110 Diabetic nephropathy
2 DB-61000 Diabetes mellitus G-C025 Causing D7-11000 Nephropathy
3 DB-61000 Diabetes mellitus G-C025 Causing DF-00000 Disease G-C006 Located-in T-71000 Kidney
Lack of a formal language
Source: Medizinische Begriffs- und Dokumentationssysteme, WS 2000/2001, Barbara Heller, IMISE, Uni Leipzig, 16.01.2001, slide 7

Deficits of traditional coding systems (SNOMED)
4 DB-62110 Diabetic nephropathy G-C006 Located-in T-71000 Kidney
5 DB-62110 Diabetic nephropathy G-C006 Located-in T-11000 Bone
6 DB-62110 Diabetic nephropathy
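The deficit can be made concrete with a small sketch. It is a hypothetical illustration, not SNOMED tooling: the tuple encoding, the function names and the table ALLOWED_SITES are assumptions introduced here, and only the codes and labels come from the slides (with the extraction spacing inside codes removed). Without a formal language, alternative encodings of the same disease are just different strings, and nothing rejects "located in bone" until an explicit constraint is supplied from outside.

```python
# Hypothetical sketch: post-coordinated code expressions as plain tuples.
# Codes and labels come from the slides; the encoding itself is illustrative.

EXPR_1 = ("DB-62110",)                       # Diabetic nephropathy
EXPR_2 = ("DB-61000", "G-C025", "D7-11000")  # Diabetes mellitus / Causing / Nephropathy
EXPR_3 = ("DB-61000", "G-C025", "DF-00000", "G-C006", "T-71000")
EXPR_KIDNEY = ("DB-62110", "G-C006", "T-71000")  # ... Located-in Kidney
EXPR_BONE = ("DB-62110", "G-C006", "T-11000")    # ... Located-in Bone

LOCATED_IN = "G-C006"

# Assumed constraint table (NOT part of SNOMED): admissible sites per disorder.
ALLOWED_SITES = {"DB-62110": {"T-71000"}}  # diabetic nephropathy -> kidney


def same_meaning_naive(a, b):
    """Purely syntactic comparison, all that a code list without semantics offers."""
    return a == b


def is_plausible(expr):
    """Reject 'disorder Located-in site' triples whose site is not admissible."""
    if len(expr) == 3 and expr[1] == LOCATED_IN:
        disorder, _, site = expr
        allowed = ALLOWED_SITES.get(disorder)
        if allowed is not None:
            return site in allowed
    return True  # no constraint known, so nothing can be rejected


naive_equal = same_meaning_naive(EXPR_1, EXPR_2)  # same disease, different strings
bone_ok = is_plausible(EXPR_BONE)
kidney_ok = is_plausible(EXPR_KIDNEY)
```

The constraint table is exactly the kind of knowledge the coding system itself does not contain, so in this sketch it has to be added by hand.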

It will develop medical ontologies at different levels of granularity:
cell ontology
drug ontology *
protein ontology
gene ontology *
* = already exists (but in a variety of mutually incompatible forms)

and also:
anatomical ontology *
epidemiological ontology
disease ontology
therapy ontology
pathology ontology *

together with:
physician’s ontology
patient’s ontology
and even hospital management ontology *

Presentation overview
– Problem description: patient eligibility for clinical trial
– Meaning theories
– Required technology for natural language understanding
– Implementation of a realist ontology for medical natural language understanding
– Conclusions
– If enough time: a guided tour of LinkFactory

The Medical Informatics Dogma: everything should be structured
Fact: computers can only deal with structured representations of reality:
– structured data:
  • relational databases, spreadsheets
– structured information:
  • XML simulates context
– structured knowledge:
  • rule-based knowledge systems
Typical conclusion (Dogma?):
– there is a need for structured data, hence …
– … there is a need for structured data entry

Structured data entry
Current technical solutions:
– rigid data entry forms
– coding and classification systems
But:
– the description of biological variability requires the flexibility of natural language, and it is generally desirable not to interfere with the traditional manner of medical recording (Wiederhold, 1980)
– initiatives to facilitate the entry of narrative data have focused on the control rather than the ease of data entry (Tanghe, 1997)

Drawbacks of structured data entry
Loss of information:
– qualitatively
  • limited expressiveness and inherent defects of coding and classification systems, controlled vocabularies, and “traditional” medical terminologies
  • use of purpose-oriented systems: don’t use data for another purpose than originally foreseen (J VDL)
– quantitatively
  • too time-consuming to code all information manually
Speech recognition and forms for structured

Areas for application of medical natural language understanding
– Coding patient data
– Structured information extraction from unstructured clinical notes
– Clinical protocols and guidelines
– Assessing patient eligibility for clinical trial entry
– Triggering and alerts
– Linking case descriptions to scientific literature
– Easy access to content
... towards a medical semantic web

Clinical history description
Mr. Kovács is an 83-year-old man with a past medical history of hypertension, congestive heart failure, atrial fibrillation, hypercholesterolemia, and a history of CVA who presented himself to Budapest Emergency Room on April 25 with primary complaint of right-sided chest pain since April 24. The patient was in his usual state of health until April 24 when he experienced right-sided chest pain after 10 minutes of bicycling exercise at the YMCA. He described the chest pain as a dull ache in the right side of his chest radiating posteriorly to the right scapular area. He rated the intensity as 7 out of 10. The chest pain lasted about 3 minutes and resolved with rest. That same night, the patient once again experienced right-sided chest pain while lying in bed just before he went to sleep. He describes the pain as right-sided chest pain with same radiation to posterior at an intensity of 6-7 out of 10. The chest pain lasted about 10 minutes and resolved spontaneously.

Inclusion criteria of the INVEST study
1. Male or female
2. Age 50 to no upper limit
3. a) Hypertension documented according to the 6th report of the Joint National Committee on Detection and Evaluation of the treatment of high BP (JNC VI), and b) the need for drug therapy (previously documented hypertension in patients currently taking antihypertensive agents is acceptable)
4. Documented CAD (e.g., classic angina pectoris; stable angina pectoris; Heberden angina pectoris), myocardial infarction three or more months ago, abnormal coronary angiography, or concordant abnormalities on two different types of stress tests

Do they match?
Patient: Mr. Kovács is … an 83-year-old man with past medical history of hypertension, congestive heart failure, atrial fibrillation, hypercholesterolemia, history of CVA who presented to Budapest Emergency Room on April 25 with chief complaint of right-sided chest pain since April 24. The patient was in his usual state of health until April 24 when he experienced right-sided chest pain after 10 minutes of bicycling exercise at YMCA. He described the chest pain as a dull ache in the right side of his chest radiating posteriorly to the right scapular area. He rated the intensity as 7 out of 10. The chest pain lasted about 3 minutes and resolved with rest. That same night, the patient once again experienced right-sided chest pain while lying in bed right before he went to sleep. He describes the pain as right-sided chest pain with same radiation to posterior at an intensity of 6-7 out of 10. The chest pain lasted about 10 minutes and resolved spontaneously.
Criteria:
1. Male or female
2. Age 50 to no upper limit
3. Hypertension documented according to the 6th report of the Joint National Committee on Detection and Evaluation of the treatment of high BP (JNC VI) and the need for drug therapy (previously documented hypertension in patients currently taking antihypertensive agents is acceptable)
4. Documented CAD (e.g., classic angina pectoris (stable angina pectoris; Heberden angina pectoris), myocardial infarction three or more months ago, abnormal coronary angiography, or concordant abnormalities on two different types of stress tests)
5. Willingness to sign informed consent

If the computer is to make this deduction, that is, to decide whether the clinical history of Mr. Kovács satisfies the five INVEST inclusion criteria, ... it must be able to understand!
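A minimal sketch of why the deduction is hard, assuming the narrative has already been reduced to a hypothetical structured record (the Patient class, its field names and the criterion labels are illustrative and not taken from the INVEST protocol or any L&C product). Criteria 1 and 2 reduce to simple comparisons; for the others a lookup of history terms leaves the verdict open, which is where understanding would be needed.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    # Hypothetical structured record; field names are illustrative.
    age: int
    sex: str
    history: set = field(default_factory=set)


def check_criteria(p: Patient) -> dict:
    """Return True/False per criterion, or None when plain facts cannot decide it."""
    results = {}
    results["1 male or female"] = p.sex in {"male", "female"}
    results["2 age 50 or older"] = p.age >= 50
    # Criterion 3 asks for documented hypertension AND a need for drug therapy.
    # A history term settles at most the first half, so the verdict stays open.
    results["3 hypertension needing drug therapy"] = (
        None if "hypertension" in p.history else False
    )
    # Criteria 4 (documented CAD) and 5 (consent) cannot be read off the
    # listed history terms by string lookup.
    results["4 documented CAD"] = None
    results["5 informed consent"] = None
    return results


kovacs = Patient(
    age=83,
    sex="male",
    history={
        "hypertension",
        "congestive heart failure",
        "atrial fibrillation",
        "hypercholesterolemia",
        "CVA",
    },
)
verdict = check_criteria(kovacs)
```

Here `verdict` holds True for the first two criteria and None for the rest: the record alone does not say whether the patient is eligible.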

What is understanding?
To understand something is to know what its significance is. What 'knowing significance' amounts to may be very different in different contexts: thus understanding a piece of music requires different things of us than understanding a sentence in a language we are learning, for instance. It would be useful, then, for theorists to look at the different kinds of understanding that there are, and examine them in detail and without prejudice, rather than looking for the essence of understanding. (Tim Crane, philosopher of mind)

The etymology of “understanding”
Latin “substare”, literally: “to stand under”
Webster’s Dictionary (1961): understanding = the power to render experience intelligible by bringing perceived particulars under appropriate concepts.
“particulars” = what is NOT SAID of a subject (Aristotle)
– substances: this patient, that tumor, ...
– qualities: the red of that patient’s skin, his body temperature, blood pressure, ...
– processes: that incision made by that surgeon, the rise of that patient’s temperature, ...
“concepts”: may be taken in the above definition as Aristotle’s “universals” = what is SAID OF a subject
– substantial concepts: patient, tumor, ...
– quality concepts: white, temperature
– ...

What is natural language understanding?
NLU is the construction of meaning from “written” language; the degree of understanding depends on a multifaceted meaning-making process that draws on knowledge about language and knowledge about the world (cf. “reading comprehension” by humans).
But then: what is “meaning”?

Why are concepts not enough? Why must our theory address also the referents in reality?
– Because referents are observable fixed points in relation to which we can work out how the concepts used by different communities relate to each other;
– Because only by looking at referents can we establish the degree to which concepts are good for their purpose.

But why then this fixation on normative “concepts” in Medical Informatics (standards)?
CEN/TC 251 ENV 12264:
– This ENV is applicable to the description of the categorial structure of systems of concepts supporting computer-based terminological systems, including coding systems, for health-care.
– concept: “unit of thought constituted through abstraction on the basis of properties common to a set of one or more referents”
BUT THEY NEVER IN FACT LOOK AT THE REFERENTS AT ALL!
ISO/TC 215/N 142: Health informatics – Vocabulary of terminology
– The purpose of this International Standard is to define a set of basic concepts required to describe and discuss formal representation of concepts and characteristics, for use especially in formal computer based concept representation systems.
– concept: “unit of knowledge created by a unique combination of characteristics”
THEY ARE ALREADY TWO LEVELS REMOVED FROM THE REFERENT!

Why existing “ontologies” don’t match OUR needs for a “core” ontology
– MeSH: inconsistency in hierarchical relationships
– MedDRA: no difference between concepts and terms
– UMLS: integrates various source terminologies without taking different meanings of terms, different structures, different purposes, etc. into account
– SNOMED: formal system, but lacks sufficient depth of the ontology
– GALEN: very detailed ontology for some parts of healthcare but very poor coverage over healthcare as a whole. The ontology is independent from language as medium of communication (the ontology does not accept language as part of reality).
– ...
Most important: all of them deal with alternative realities or possible worlds and none is focused on the referents in THIS world!

Based on formal ontology
[Diagram: hierarchy of spatial relations, including HAS-OVERLAPPING-REGION, HAS-PARTIAL-SPATIAL-OVERLAP, IS-SPATIAL-PART-OF, IS-PROPER-SPAT.-PART-OF, HAS-SPATIAL-PART, HAS-PROPER-SPATIAL-PART, HAS-DISCRETED-REGION, HAS-DISCONNECTED-REGION, HAS-CONNECTING-REGION, HAS-SPATIAL-POINT-REFERENCE, IS-PARTLY-IN-CONVEX-HULL-OF, IS-INSIDE-CONVEX-HULL-OF, IS-OUTSIDE-CONVEX-HULL-OF, IS-GEO-INSIDE-OF and IS-TOPO-INSIDE-OF, together with tangential and non-tangential spatial-part relations]

Example: joint anatomy
joint HAS-HOLE joint space
joint capsule IS-OUTER-LAYER-OF joint
meniscus
– IS-INCOMPLETE-FILLER-OF joint space
– IS-TOPO-INSIDE joint capsule
– IS-NON-TANGENTIAL-MATERIAL-PART-OF joint
– IS-CONNECTOR-OF bone X
– IS-CONNECTOR-OF bone Y
synovia
– IS-INCOMPLETE-FILLER-OF joint space
synovial membrane IS-BONAFIDE-BOUNDARY-OF joint space
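The relations on this slide can be written down as subject-relation-object triples and queried. The following is only a sketch under assumptions: the triple encoding and the helper names are introduced here for illustration and say nothing about how LinKBase stores its links, and the grouping of relations under each anatomical entity follows one reading of the flattened slide.

```python
# Relations as read from the slide, as (subject, relation, object) triples.
TRIPLES = [
    ("joint", "HAS-HOLE", "joint space"),
    ("joint capsule", "IS-OUTER-LAYER-OF", "joint"),
    ("meniscus", "IS-INCOMPLETE-FILLER-OF", "joint space"),
    ("meniscus", "IS-TOPO-INSIDE", "joint capsule"),
    ("meniscus", "IS-NON-TANGENTIAL-MATERIAL-PART-OF", "joint"),
    ("meniscus", "IS-CONNECTOR-OF", "bone X"),
    ("meniscus", "IS-CONNECTOR-OF", "bone Y"),
    ("synovia", "IS-INCOMPLETE-FILLER-OF", "joint space"),
    ("synovial membrane", "IS-BONAFIDE-BOUNDARY-OF", "joint space"),
]


def subjects_of(relation, obj):
    """All entities that stand in `relation` to `obj`, sorted for stable output."""
    return sorted({s for s, r, o in TRIPLES if r == relation and o == obj})


def objects_of(subject, relation):
    """All entities that `subject` is related to via `relation`."""
    return sorted({o for s, r, o in TRIPLES if s == subject and r == relation})


fillers = subjects_of("IS-INCOMPLETE-FILLER-OF", "joint space")
connected = objects_of("meniscus", "IS-CONNECTOR-OF")
```

With explicit relation names, a question such as "what partly fills the joint space?" becomes a lookup rather than a matter of interpreting free text.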

Linking external ontologies
[Diagram: the external terms MESH-2001 “Seizures”, MESH-2001 “Convulsions”, Snomed-RT “Convulsion” and Snomed-RT “Seizure” are linked via IS-A, IS-narrower-than and Has-CCC to the L&C concepts Health crisis, Seizure, Convulsion and Epileptic convulsion, which are themselves connected by IS-A links]

Linguistic and domain ontologies
[Diagram: Generalised Possession, with Has-possessor and Has-possessed links, specialised by IS-A to Having a healthcare phenomenon; Human, Patient and Cancer patient on the possessor side (Is-possessor-of); Healthcare phenomenon, Malignant neoplasm and lung carcinoma on the possessed side (Has-Healthcare-phenomenon). Example sentence: “Mr. Kovács has a pulmonary carcinoma”]

From text to meaning
[Diagram with three levels, Concept, Instance and Text: the domain entities human, male human and name; the instance Mr Kovács; the text string “Kovács”; connected by Is-assigned-name-of links]

“Mr Kovács” analysed syntactically, and features used to drive mapping. The Orth slot of a word gives its surface string. The append( ) operator joins together its arguments as a single string.
[Diagram: type hierarchy for human names. HUMAN-NAME-TYPE and TITLED-HUMAN-TYPE with the subtypes female-titled (Title: female-title, Sem: female-human), male-titled (Title: male-title, Sem: male-human), genderless-titled, titled-human, untitled-human-name (Sem: human), human-surname and human-firstname]
prename-provided (Prename: human-firstname): Sem.Assigned_Name = append(Prename.Orth, Orth)
prename-not-provided: Sem.Assigned_Name = Orth
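The two Assigned_Name rules can be sketched as a single function. This is an illustrative reading, not L&C's implementation: the function name, the space separator and the first name used in the example are assumptions, since the slide only says that append( ) joins its arguments into a single string.

```python
from typing import Optional


def assigned_name(orth: str, prename_orth: Optional[str] = None, sep: str = " ") -> str:
    """Compute Sem.Assigned_Name for a surname token.

    prename provided:     append(Prename.Orth, Orth)
    prename not provided: Orth
    The separator `sep` is an assumption; the slide does not specify one.
    """
    if prename_orth is None:
        return orth
    return sep.join([prename_orth, orth])


# "Mr Kovács": a title but no prename, so the assigned name is the surname string.
titled_only = assigned_name("Kovács")

# Hypothetical first name, used only to exercise the prename-provided branch.
with_prename = assigned_name("Kovács", "János")
```

Note that the title "Mr" does not enter the assigned name in this reading; it only contributes the semantic feature male-human.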

Conclusions
“Understanding a message” comes down to identifying what parts of that message correspond to reality, and what parts are expressions of what doesn’t exist.
If a machine has to understand, it must be based on algorithms that use a realist ontology that takes the world, language(s) and the relationship amongst them, properly into account.
The medical informatics community (specifically that part dealing with “concept systems”) must become aware that most current approaches confuse “what is real” with “what is thought to be

The Reference Ontology Community
– IFOMIS (Leipzig)
– Laboratories for Applied Ontology (Trento/Rome, Turin)
– Foundational Ontology Project (Leeds)
– Ontology Works (Baltimore)
– Ontek Corporation (Buffalo/Leeds)
– Language and Computing (L&C) (Belgium/Philadelphia)

Domains of Current Work
– IFOMIS Leipzig: Medicine, Bioinformatics
– Laboratories for Applied Ontology, Trento/Rome: Ontology of Cognition/Language; Turin: Law
– Foundational Ontology Project: Space, Physics
– Ontology Works: Genetics, Molecular Biology
– Ontek Corporation: Biological Systematics
– Language and Computing: Natural Language Understanding

Testing the BFO/MedO approach: collaboration with Language and Computing nv (www.landcglobal.be)

L&C’s ‘Semantic Indexing for Smart Information Retrieval and Extraction’ solution allows companies to more efficiently and accurately manage and retrieve documents. L&C also offers solutions for information analysis, document mining, information extraction, and coding.

L&C Technology
– FreePharma®, L&C’s natural language analyzer for converting free text (spoken or typed) prescription and pharmacology information into XML.
– FastCode®, L&C’s automated clinical coding product for translation of free text strings into ICD, SNOMED, MedDRA, etc.
– LinKBase®, the largest formal medical knowledge base in the world, representing medicine in such a way that it is understandable for a computer.
– LinKFactory®, L&C’s product suite for developing and managing large formal multilingual ontologies.

L&C statistical technology unearthed errors in SNOMED via pattern recognition of semantic connections

The Project: collaborate with L&C to show that an ontology constructed on the basis of philosophical principles can help in overhauling and validating the large terminology-based medical ontology LinkBase® used by L&C for NLP

L&C
LinKBase®: world’s largest terminology-based ontology, with mappings to UMLS, SNOMED, etc.
+ LinKFactory®: suite for developing and managing large terminology-based ontologies

LinKBase
BFO and MedO are designed to add better reasoning capacity:
– by tagging LinKBase domain-entities with corresponding BFO/MedO categories
– by constraining links within LinKBase according to the theory of granular partitions

Three levels of ontology
1) formal ontology, which seeks the construction of a framework of the categories (object, event, whole, part) employed in every domain,
2) domain ontology, a top-level system applying the structure of formal ontology to a particular domain, such as medicine or genetics,
3) terminology-based ontology, a very large, lower-level system dealing with the complete terminology of a given domain.

L&C’s long-term goal: transform the mass of unstructured patient records into a gigantic medical experiment

IFOMIS’s long-term goal: build a robust high-level BFO-MedO framework, THE WORLD’S FIRST INDUSTRIAL-STRENGTH PHILOSOPHY, which can serve as the basis for an ontologically coherent unification of medical knowledge and terminology

The solution: “ONTOLOGY!”
But what does “ontology” mean?

Example: The Gene Ontology (GO)
hormone ; GO:0005179
%digestive hormone ; GO:0046659
%peptide hormone ; GO:0005180
%adrenocorticotropin ; GO:0017043
%glycopeptide hormone ; GO:0005181
%follicle-stimulating hormone ; GO:0016913
% = subsumption (lower term is_a higher term)

As a tree
[Tree diagram with the nodes hormone, digestive hormone, peptide hormone, adrenocorticotropin, glycopeptide hormone and follicle-stimulating hormone]
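The subsumption hierarchy can be sketched as a child-to-parent map with a walk up the is_a links. The terms and GO identifiers are taken from the slide; the nesting is an assumption, because the indentation of the original listing and the layout of the tree did not survive extraction, and the function names are illustrative.

```python
# GO identifiers as listed on the slide.
GO_ID = {
    "hormone": "GO:0005179",
    "digestive hormone": "GO:0046659",
    "peptide hormone": "GO:0005180",
    "adrenocorticotropin": "GO:0017043",
    "glycopeptide hormone": "GO:0005181",
    "follicle-stimulating hormone": "GO:0016913",
}

# child -> parent (is_a). ASSUMED nesting, not recoverable from the flattened slide.
IS_A = {
    "digestive hormone": "hormone",
    "peptide hormone": "hormone",
    "adrenocorticotropin": "peptide hormone",
    "glycopeptide hormone": "peptide hormone",
    "follicle-stimulating hormone": "glycopeptide hormone",
}


def ancestors(term):
    """Follow is_a links upward and return the chain of more general terms."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def is_a(child, ancestor):
    """True if `ancestor` subsumes `child` (transitively)."""
    return ancestor in ancestors(child)


fsh_chain = ancestors("follicle-stimulating hormone")
```

This is all the structure a pure term hierarchy provides: the map says which term is more general, but nothing about what the terms mean or in which context the subsumption holds.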

GO is very useful for purposes of standardization in the reporting of genetic information, but it is not much more than a telephone directory of standardized designations organized into hierarchies

First Problem
GO can in practice be used only by trained biologists (know-how): whether a GO-term stands in the subsumption relationship depends on the context in which the term is used (for example, on the type of organism)

Second Problem
GDB: a gene is a DNA fragment that can be transcribed and translated into a protein
GenBank: a gene is a DNA region of biological interest with a name and that carries a genetic trait or phenotype
GO uses ‘gene’ in its term hierarchy, but it does not tell us which of these definitions is correct

GO has
– no robust formal organization
– no capability to be aligned with systems which would have the power to use it to reason with genetic information

SNOMED RT (2000) already has description logic definitions, but it also has some bad coding, which derives from failure to pay attention to ontological principles, e.g. both testes is_a testis

How to resolve such incompatibilities?
Enforce terminological compatibility via standardized term hierarchies, with standardized definitions of terms

Problem: People are lazy
Half the pages on Geocities are called “Please title this page”

Problem: People are stupid
The vast majority of the Internet's users (even those who are native speakers of English) cannot spell or punctuate. Will internet users learn to accurately tag their information with whatever hierarchy they're supposed to be using?

Solutions in the medical domain
Problem 1: People lie
Problem 2: People are lazy
Problem 3: People are stupid
None of these is true in the world of medical informatics

Solutions in the medical domain
Problem 1: People lie
Problem 2: People are lazy
Problem 3: People are stupid
Achieve quality control via division of labour

Division of Labour
1. Clinical activities
2. Structured data representation
3. Software coding (e.g. for NLP)

Division of Labour
1. Clinical activities
2. Structured data representation
3. Software coding
4. Ontology building
Use 4. to constrain 2. and 3. to achieve better data processing via quality control

Problem: Multiple descriptions
Requiring everyone to use the same vocabulary to describe their material is not always medically practicable

Clinicians often do not use category systems at all; they use unstructured text from which usable data has to be extracted in a further step.
Why? Because every case is different, and much patient data is context-dependent

David J. Rothwell (one of the two original authors of SNOMED)
The notion of a standard vocabulary and in particular a coding system to serve as the answer to the ills besetting adoption of an Electronic Medical Record is, in my view, quite wrong. Traditional coding schemes, SNOMED included, are a nineteenth century idea, that despite 100 years of effort have failed. There are narrowly defined areas where codes function well, but these areas must be precisely defined, e.g. ICD-O, Drugs, organisms. It is my belief that natural language is the "code" that, despite its difficulties, we must learn to work with to address the issues encountered in a medical record.

SARS is NOT Severe Acute Respiratory Syndrome
it is THIS collection of instances of Severe Acute Respiratory Syndrome associated with THIS coronavirus and ITS mutations

BFO
ontology: not the ‘standardization’ or ‘specification’ of concepts (not a branch of knowledge or concept engineering) but an inventory of the types of entities existing in reality

BFO goal: to remove ontological impedance by constraining terminology systems with good ontology

BFO
not a computer application but a reference ontology (not a reference terminology in the sense of SNOMED)

Recall:
GDB: a gene is a DNA fragment that can be transcribed and translated into a protein
Genbank: a gene is a DNA region of biological interest with a name and that carries a genetic trait or phenotype
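
The mismatch between the two definitions can be made concrete with a small sketch. Everything below is an illustrative assumption: the record fields, the two predicate functions and the sample loci are hypothetical and are not taken from GDB or Genbank; they only mirror the wording of the two definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnaRecord:
    # Hypothetical fields, chosen only to mirror the two quoted definitions.
    name: str
    encodes_protein: bool         # can be transcribed and translated into a protein
    of_biological_interest: bool  # counts as a region of biological interest
    carries_trait: bool           # carries a genetic trait or phenotype


def is_gene_gdb(record):
    """GDB-style reading: a gene must be translatable into a protein."""
    return record.encodes_protein


def is_gene_genbank(record):
    """Genbank-style reading: a named region of interest that carries a trait."""
    return bool(record.name) and record.of_biological_interest and record.carries_trait


def disagreements(records):
    """Names of records that the two definitions classify differently."""
    return [r.name for r in records if is_gene_gdb(r) != is_gene_genbank(r)]


# Made-up sample data: locus_B is named and trait-carrying but non-coding.
records = [
    DnaRecord("locus_A", True, True, True),
    DnaRecord("locus_B", False, True, True),
]
conflicting = disagreements(records)
```

Under these assumptions the two sources agree on `locus_A` but not on `locus_B`, so data merged from both would disagree about what counts as a gene.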

Ontology
‘fragment’, ‘region’, ‘name’, ‘carry’, ‘trait’, ‘type’ ... ‘part’, ‘whole’, ‘function’, ‘inhere’, ‘substance’ … are ontological terms in the sense of traditional (philosophical) ontology

UMLS Semantic Network
entity
event
physical
conceptual entity
idea or concept

Idea or Concept
  Functional Concept
  Qualitative Concept
  Quantitative Concept
  Spatial Concept
    Body Location or Region
    Body Space or Junction
    Geographic Area
    Molecular Sequence
      Amino Acid Sequence
      Carbohydrate Sequence
      Nucleotide Sequence

UMLS has ontological problems, too
Idea or Concept
  Functional Concept
  Qualitative Concept
  Quantitative Concept
  Spatial Concept
    Body Location or Region
    Body Space or Junction
    Geographic Area
    Molecular Sequence
      Amino Acid Sequence
      Carbohydrate Sequence
      Nucleotide Sequence

Sachsen-Anhalt is an Idea or Concept

UMLS has ontological problems, too
Idea or Concept
  Functional Concept
  Qualitative Concept
  Quantitative Concept
  Spatial Concept
    Body Location or Region
    Body Space or Junction
    Geographic Area
    Molecular Sequence
      Amino Acid Sequence
      Carbohydrate Sequence
      Nucleotide Sequence
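
How the Sachsen-Anhalt conclusion follows from the listing can be sketched by walking the is-a chain. This is a minimal illustration, assuming the nesting of the listed types matches the UMLS Semantic Network hierarchy (Geographic Area under Spatial Concept under Idea or Concept) and assuming Sachsen-Anhalt is typed as a Geographic Area; the dictionary and function names are hypothetical.

```python
# Parent links for the fragment of the Semantic Network listed on the slide.
# Assumption: this nesting reflects the Semantic Network's is-a hierarchy.
PARENT = {
    "Functional Concept": "Idea or Concept",
    "Qualitative Concept": "Idea or Concept",
    "Quantitative Concept": "Idea or Concept",
    "Spatial Concept": "Idea or Concept",
    "Body Location or Region": "Spatial Concept",
    "Body Space or Junction": "Spatial Concept",
    "Geographic Area": "Spatial Concept",
    "Molecular Sequence": "Spatial Concept",
    "Amino Acid Sequence": "Molecular Sequence",
    "Carbohydrate Sequence": "Molecular Sequence",
    "Nucleotide Sequence": "Molecular Sequence",
}

# Assumption for illustration: Sachsen-Anhalt is typed as a Geographic Area.
INSTANCE_TYPE = {"Sachsen-Anhalt": "Geographic Area"}


def ancestors(semantic_type):
    """Return all supertypes of a semantic type, nearest first."""
    chain = []
    while semantic_type in PARENT:
        semantic_type = PARENT[semantic_type]
        chain.append(semantic_type)
    return chain


def is_a(instance, semantic_type):
    """True if the instance's type is, or inherits from, the given type."""
    direct = INSTANCE_TYPE[instance]
    return semantic_type == direct or semantic_type in ancestors(direct)


chain = ancestors("Geographic Area")
sachsen_anhalt_is_idea = is_a("Sachsen-Anhalt", "Idea or Concept")
```

Because is-a is transitive, anything typed as a Geographic Area inherits Idea or Concept, which is exactly the classification the slides object to.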

Testing the BFO/MedO approach
collaboration with Language and Computing nv (www.landcglobal.be)

The Project
collaborate with L&C to show that an ontology constructed on the basis of philosophical principles can help in overhauling and validating the large terminology-based medical ontology LinKBase® used by L&C for NLP

L&C
LinKBase®: world’s largest terminology-based ontology with mappings to UMLS, SNOMED, etc.
+
LinKFactory®: suite for developing and managing large terminology-based ontologies

LinKBase
BFO and MedO designed to add better reasoning capacity
• by tagging LinKBase domain-entities with corresponding BFO/MedO categories
• by constraining links within LinKBase according to theory of granular partitions
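
The two steps above (tagging entities with top-level categories, then constraining links by those categories) can be sketched as follows. This is only an illustration under stated assumptions: the category labels, the link names, the constraint table and the sample entities are all hypothetical, and the sketch does not implement the theory of granular partitions or reflect LinKBase's actual data model.

```python
from dataclasses import dataclass

# Hypothetical top-level category tags (BFO distinguishes continuants from occurrents).
CONTINUANT = "continuant"
OCCURRENT = "occurrent"


@dataclass(frozen=True)
class Entity:
    name: str
    category: str  # the top-level tag assigned to this domain entity


# Hypothetical constraint table: for each link type, the permitted
# (source category, target category) pairs.
ALLOWED = {
    "part_of": {(CONTINUANT, CONTINUANT), (OCCURRENT, OCCURRENT)},
    "participates_in": {(CONTINUANT, OCCURRENT)},
}


def link_is_valid(source, link, target):
    """Check a single link against the category constraints."""
    return (source.category, target.category) in ALLOWED.get(link, set())


def invalid_links(links):
    """Return the links that violate the constraints, as name triples."""
    return [(s.name, l, t.name) for s, l, t in links if not link_is_valid(s, l, t)]


# Made-up sample entities and links.
heart = Entity("heart", CONTINUANT)
system = Entity("cardiovascular system", CONTINUANT)
heartbeat = Entity("heartbeat", OCCURRENT)

links = [
    (heart, "part_of", system),
    (heart, "participates_in", heartbeat),
    (heartbeat, "part_of", heart),  # a process cannot be part of an organ
]
rejected = invalid_links(links)
```

Once every entity carries a category tag, ill-formed links such as the last one can be flagged mechanically rather than found by manual review.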

L&C’s long-term goal
Transform the mass of unstructured patient records into a gigantic medical experiment

IFOMIS’s long-term goal
Build a robust high-level BFO-MedO framework
THE WORLD’S FIRST INDUSTRIAL-STRENGTH PHILOSOPHY
which can serve as the basis for an ontologically coherent unification of medical knowledge and terminology

END
http://ontologist.com
http://ifomis.de