Ontologies
CSE 4095/5810

Prof. Steven A. Demurjian, Sr.
Computer Science & Engineering Department
The University of Connecticut
371 Fairfield Road, Box U-255
Storrs, CT 06269-2155

steve@engr.uconn.edu
http://www.engr.uconn.edu/~steve
(860) 486-4818

Motivation

• Ontologies: Biomedical and Clinical
  - What are they?
  - How are they used?
• What is the issue facing ontologies in the future?
  - Each HIT system has its own ontology
  - HIE requires:
    - Integration of patient data
    - Dealing with semantic differences (one EMR has weight in lbs, one in kg)
    - Reconciling ontologies
      - Each HIT system has its own ontology for the same information
      - Ontology + data impacts integration
      - How do we resolve dramatic differences?
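
The semantic difference mentioned above (one EMR stores weight in lbs, another in kg) can be sketched as a normalization step applied before patient data are merged. This is only an illustration under stated assumptions: the record layout and the names `WeightReading` and `to_kg` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass

LB_TO_KG = 0.45359237  # definition of the international avoirdupois pound


@dataclass(frozen=True)
class WeightReading:
    """Hypothetical EMR weight entry; real systems use richer structures."""
    value: float
    unit: str  # "kg", "lb" or "lbs"


def to_kg(reading: WeightReading) -> float:
    """Return the weight in kilograms regardless of the source unit."""
    unit = reading.unit.strip().lower()
    if unit == "kg":
        return reading.value
    if unit in ("lb", "lbs"):
        return reading.value * LB_TO_KG
    raise ValueError(f"unknown weight unit: {reading.unit!r}")


emr_a = WeightReading(176.0, "lbs")  # first EMR records pounds
emr_b = WeightReading(79.8, "kg")    # second EMR records kilograms
merged_kg = [round(to_kg(r), 1) for r in (emr_a, emr_b)]
print(merged_kg)
```

Rejecting unknown units instead of guessing keeps a silent unit mismatch from entering the integrated record.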

Placing Ontologies into Perspective

• Historical Evolution of WWW
• Ontology
  - Definition and Description
  - RDF and OWL
• Present Biomedical Ontology
• Applications of Biomedical Ontologies
  - Clinical Trials
  - OASIS: Integration Technique
  - Clinical Decision Support System

Current Information Systems on WWW

• First Generation:
  - Raw data, largely hand-coded by the user, was published online
  - For example, static web pages
• Second Generation:
  - Dynamic content generation driven by MDA and databases
  - Machines generate the respective HTML
• Third Generation: Semantic Web
  - Content is published as machine-processable, machine-understandable information, enabling intelligent services such as information brokers, search agents and information filters to process domain-related information

What other Advances have Taken Place?

• XML
  - XML was designed to store and transport data, and to be both human- and machine-readable
  - W3C recommended XML 1.0 on 2/10/1998
• HTML5
  - Fifth revision of HTML
  - Markup language used for structuring and presenting content on the World Wide Web
  - Published by W3C in October 2014

What are Ontologies?

• Definition (from Philosophy):
  - Ontology is the study of being or existence and forms the basic subject matter of metaphysics. It seeks to describe the basic categories and relationships of being or existence in order to define entities and types of entities within its framework.
• Definition (from Computer Science):
  - In computer science, an ontology is a "specification of a conceptualization": a data model that represents a set of concepts within a domain and the relationships between those concepts.

Advantages of Ontology

• Semantic way of representing knowledge of the domain
• Intelligent systems can reason and make inferences within the ontology
• Two main objectives
  - Share the common structure of information
  - Reuse a similar ontology in another domain

Development of Ontology

• Determine the domain and scope (range) of the knowledge
• Look for an existing ontology in a similar domain
  - Reuse without change (will it be possible?)
  - Basis to evolve to a domain-specific solution
• List all of the terminologies or concepts of the domain
• List all of the classes and instances to be created in the ontology
• Create the properties that will relate these concepts in the ontology

Example of Ontology

[Figure: wine ontology diagram. Labels: Class, Individual, Properties; Wine, Grape, Australian Yellow Tail; Color: Yellow; Flavor: Delicate; Maker: Australia, German]

Parkinson’s Disease Management Ontology
[Figure slides ONTO-10 to ONTO-12]

Parkinson’s Treatment Ontology
[Figure slides ONTO-13 and ONTO-14]

Neurological-Disease Ontology
[Figure slides ONTO-15 and ONTO-16]

Excerpt of Medical Condition Ontology
[Figure slide ONTO-17]

Patient Ontology
[Figure slide ONTO-18]

Skeleton Ontology

• What is phenotypic?
  - A phenotype is the composite of an organism's observable characteristics or traits

How do Ontologies Relate to Other Models?

• UML Model
  [Figure]
• Entity Relationship Diagram
  [Figure]

How do Ontologies Relate to Other Models?

• XML Schema

    <xs:element name="Patient">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="id" type="xs:integer"/>
          <xs:element name="ethnicity" type="xs:string"/>
          <xs:element name="race" type="xs:string"/>
          ……….
          <xs:element name="tel" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

    <xs:element name="Substance">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="id" type="xs:integer"/>
          <xs:element name="name" type="xs:string"/>
          <xs:element name="statusCode" type="xs:string"/>
          ……….
          <xs:element name="repeatNumber" type="xs:integer"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

    <xs:element name="Observation">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="id" type="xs:integer"/>
          <xs:element name="name" type="xs:string"/>
          <xs:element name="value" type="xs:string"/>
          <xs:element name="statusCode" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

    <xs:element name="takesPrescribedMedication">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="Patient"/>
          <xs:element ref="Substance"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

    <xs:element name="hasMedicalObservation">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="Patient"/>
          <xs:element ref="Observation"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

How do we Model Ontologies?

• Researchers proposed the Semantic Web Stack, a hierarchy of languages in which each layer exploits and uses the capabilities of the layers below
• OWL and RDF belong to the family of knowledge representation languages
  - RDF: Resource Description Framework
    - http://www.w3.org/RDF/
  - OWL: Web Ontology Language
    - http://www.w3.org/TR/owl-features/
• RDF is reminiscent of Semantic Networks, which were popular in the 1970s

Introduction to RDF / OWL

RDF: Resource Description Framework

• RDF represents knowledge in triple format: Subject – Predicate – Object
• For example:
      Students   –  registerTo   –  Classes
      (Subject)     (Predicate)     (Object)
• One triple in RDF is referred to as a statement
• RDF is a grammar-based language with a syntax similar to XML
• RDFS (RDF Schema) has a syntax similar to RDF and provides a schema grammar for RDF, for example rdfs:Class, rdfs:subClassOf, etc.
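
The subject, predicate, object model can be sketched with plain tuples. The snippet below is a minimal in-memory illustration, not an RDF library: the `match` helper and the two extra `rdf:type` statements are assumptions added for the example.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

statements: List[Triple] = [
    ("Students", "registerTo", "Classes"),    # the statement from the slide
    ("Students", "rdf:type", "rdfs:Class"),   # assumed for illustration
    ("Classes", "rdf:type", "rdfs:Class"),    # assumed for illustration
]


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return every triple that agrees with the given terms; None is a wildcard."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


print(match(statements, p="registerTo"))
print(match(statements, o="rdfs:Class"))
```

Leaving any position as a wildcard is the same idea that triple-pattern query languages build on.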

RDF: Resource Description Framework

RDF syntax of the above example:

    <rdfs:Class rdf:about="http://www.example.com/examle#Students" rdfs:label="Students">
    </rdfs:Class>
    <rdfs:Class rdf:about="http://www.example.com/examle#Classes" rdfs:label="Classes">
    </rdfs:Class>

• All the concepts described in RDF are identified using a URI
  - (e.g., http://www.example.com/examle#Students)
• RDF can be viewed as a standardized framework for providing metadata about domain concepts
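
To show how the URI identifies each concept, the two class declarations above can be read back with Python's standard XML parser. This is a sketch under assumptions: the enclosing `rdf:RDF` element with its namespace declarations is added here so the fragment parses on its own, and `list_classes` is a hypothetical helper. A dedicated RDF toolkit would normally be used instead.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"

# The slide's two declarations, wrapped in an rdf:RDF root (added for this sketch).
DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:rdfs="{RDFS_NS}">
  <rdfs:Class rdf:about="http://www.example.com/examle#Students" rdfs:label="Students"/>
  <rdfs:Class rdf:about="http://www.example.com/examle#Classes" rdfs:label="Classes"/>
</rdf:RDF>"""


def list_classes(rdf_xml: str) -> dict:
    """Map each declared class URI to its rdfs:label attribute."""
    root = ET.fromstring(rdf_xml)
    found = {}
    for node in root.findall(f"{{{RDFS_NS}}}Class"):
        uri = node.get(f"{{{RDF_NS}}}about")
        found[uri] = node.get(f"{{{RDFS_NS}}}label")
    return found


for uri, label in list_classes(DOC).items():
    print(label, "->", uri)
```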

OWL: Web Ontology Language

• OWL is placed on top of the Semantic Web Stack, utilizing the features offered by the layers below (RDF, RDFS, XML)
• OWL's design has been influenced by description logic and knowledge representation paradigms
  - SHIQ, Semantic Networks, Frames, SHOE, DAML, OIL, DAML+OIL
• OWL provides richer semantic capabilities than its predecessor RDF
  - For example, in the previous example, the predicate registerTo is of type rdf:Property

OWL: Web Ontology Language

• OWL differentiates between properties by defining
  - owl:ObjectProperty, for connecting two concepts (registerTo), and
  - owl:DatatypeProperty, for connecting a concept to a datatype (utilized from XML)
• These two property types inherit from rdf:Property
• OWL also defines owl:AnnotationProperty for embedding metadata onto classes, rules and axioms
• The following slide illustrates the use of OWL, RDF and RDFS (taken from a cardiac ontology built in OWL using the Protégé tool)

OWL: Web Ontology Language

    <owl:Class rdf:ID="Veins">
      <rdfs:subClassOf>
        <owl:Class rdf:ID="Heart"/>
      </rdfs:subClassOf>
    </owl:Class>
    <Veins rdf:ID="Pulmonary_Vein"/>

[Figure: hierarchy of Heart, Vein, Pulmonary Vein]

• Veins is declared a subclass of Heart, and Pulmonary_Vein is declared an individual of type Veins (a typed node asserts rdf:type rather than rdfs:subClassOf)
• The next slide illustrates OWL properties and the expressive power of OWL to restrict the domain and range values accepted by these properties
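
The kind of inference a reasoner draws from the fragment above can be sketched in a few lines. In the listing, Veins is a subclass of Heart and the element `<Veins rdf:ID="Pulmonary_Vein"/>` types Pulmonary_Vein as Veins, so Pulmonary_Vein is also a member of Heart. The dictionaries and function names below are hypothetical, and single inheritance is assumed for brevity.

```python
SUBCLASS_OF = {"Veins": "Heart"}        # rdfs:subClassOf from the listing
TYPE_OF = {"Pulmonary_Vein": "Veins"}   # typed node <Veins rdf:ID="Pulmonary_Vein"/>


def superclasses(cls: str) -> list:
    """Walk rdfs:subClassOf links upward (single inheritance assumed)."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def inferred_types(individual: str) -> list:
    """Asserted type followed by every superclass reachable from it."""
    asserted = TYPE_OF[individual]
    return [asserted] + superclasses(asserted)


print(inferred_types("Pulmonary_Vein"))
```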

OWL: Web Ontology Language

    <owl:ObjectProperty rdf:ID="Complications">
      <rdfs:domain rdf:resource="#Cardiology_Diseases"/>
      <rdfs:range>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="#Cardiology_Complications"/>
            <owl:Class rdf:about="#Cardiology_Diseases"/>
            <owl:Class rdf:about="#Cardiology_Causes"/>
          </owl:unionOf>
        </owl:Class>
      </rdfs:range>
    </owl:ObjectProperty>

• The object property "Complications" takes domain values from the class "Cardiology_Diseases" and range values from the union of the classes listed
• OWL combined with RDF/RDFS provides an environment for developing domain ontologies by organizing and describing the domain concepts
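
The domain and union-valued range of "Complications" can be illustrated with a small conformance check. Note the simplification: OWL reasoners use domain and range to infer the types of subjects and objects rather than to reject statements, so the closed-world check below is only a teaching sketch. The flat typing of the three individuals is an assumption made for the example.

```python
DOMAIN = {"Complications": {"Cardiology_Diseases"}}
RANGE = {  # owl:unionOf of three classes
    "Complications": {
        "Cardiology_Complications",
        "Cardiology_Diseases",
        "Cardiology_Causes",
    }
}

# Hypothetical, flattened typing of individuals used only for this illustration.
TYPE_OF = {
    "Atrial_septal_defect": "Cardiology_Diseases",
    "Arrhythmia": "Cardiology_Complications",
    "Dyspnea": "Cardiology_Symptoms",
}


def conforms(subject: str, prop: str, obj: str) -> bool:
    """True if the subject's class is in the domain and the object's class is in the range."""
    return TYPE_OF.get(subject) in DOMAIN[prop] and TYPE_OF.get(obj) in RANGE[prop]


print(conforms("Atrial_septal_defect", "Complications", "Arrhythmia"))
print(conforms("Atrial_septal_defect", "Complications", "Dyspnea"))
```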

Disease Ontology

[Figure: hierarchical organization of Cardiology Diseases, showing subclasses of Cardiology Diseases and instances of Mitral_Valve_Disorders]

Disease Ontology

[Figure: property-defined representation of "Mitral_Valve_Prolapse" knowledge using properties and instances]

Implemented Ontology in OWL Format

    ………….
    <Congenital_Heart_Disease rdf:ID="Atrial_septal_defect">
      <Complications>
        <Cardiac_Arrhythmias rdf:ID="Arrhythmia">
          <Has_Intervention rdf:datatype="http://www.w3.org/2001/XMLSchema#string">defibrillation</Has_Intervention>
          <Have_Symptoms>
            <Cardiology_Symptoms rdf:ID="Dyspnea"/>
          </Have_Symptoms>
          <Has_Diagnosis_Test>
            <Cardiology_Diagnosis_Test rdf:ID="Coronary_Angiography">
              <Has_Synonyms rdf:datatype="http://www.w3.org/2001/XMLSchema#string">coronary catheterization</Has_Synonyms>
              ……………….

Bio-Medical Ontologies

• Review a wide range of available ontologies and standards:
  - OpenCyc
  - WordNet
  - Galen
  - UMLS
  - SNOMED-CT
  - FMA
  - Gene Ontology

Sample EHR Model in UML via HL7 CDA

OWL Equivalent for Observation

The listings use the slides' schematic notation, in which properties are written inside the class they describe. Standard OWL would declare each one as an owl:DatatypeProperty or owl:ObjectProperty with rdfs:domain and rdfs:range.

    <owl:Class rdf:ID="IVL_TS">
      <owl:DatatypeProperty rdf:ID="Low"/>
      <owl:DatatypeProperty rdf:ID="High"/>
      <owl:DatatypeProperty rdf:ID="width"/>
      <owl:DatatypeProperty rdf:ID="center"/>
      <owl:DatatypeProperty rdf:ID="lowClosed"/>
      <owl:DatatypeProperty rdf:ID="highClosed"/>
    </owl:Class>

    <owl:Class rdf:ID="Observation">
      <owl:DatatypeProperty rdf:ID="id"/>
      <owl:DatatypeProperty rdf:ID="hasStatusCode"/>
      <owl:Attribute rdf:ID="hasEffectiveTime"/>
      <owl:Attribute rdf:ID="hasCode"/>
      <owl:Attribute rdf:ID="hasValue"/>
      <owl:Attribute rdf:ID="hasTargetSite"/>
    </owl:Class>

    <owl:Class rdf:ID="CD">
      <owl:Attribute rdf:ID="text"/>
      <owl:DatatypeProperty rdf:ID="code"/>
      <owl:DatatypeProperty rdf:ID="codeSystem"/>
      <owl:DatatypeProperty rdf:ID="codeSystemName"/>
      <owl:DatatypeProperty rdf:ID="codeSystemVersion"/>
      <owl:DatatypeProperty rdf:ID="displayName"/>
    </owl:Class>

    <owl:Attribute rdf:ID="hasEffectiveTime">
      <owl:Domain rdf:ID="Observation"/>
      <owl:Range rdf:ID="IVL_TS"/>
    </owl:Attribute>
    <owl:Attribute rdf:ID="hasCode">
      <owl:Domain rdf:ID="Observation"/>
      <owl:Range rdf:ID="CD"/>
    </owl:Attribute>
    <owl:Attribute rdf:ID="hasValue">
      <owl:Domain rdf:ID="Observation"/>
      <owl:Range rdf:ID="ANY"/>
    </owl:Attribute>
    <owl:Attribute rdf:ID="hasTargetSiteCode">
      <owl:Domain rdf:ID="Observation"/>
      <owl:Range rdf:ID="CD"/>
    </owl:Attribute>

Sample OWL Ontology Model

Ontology Example: OpenCyc

• OpenCyc is an upper-level ontology developed by Cycorp Inc.
• OpenCyc contains 6,000 concepts and 60,000 hand-coded assertions that capture "common sense language", so that AI algorithms can perform human-like reasoning

Example of OpenCyc

Ontology Example: WordNet

• WordNet is an electronic lexical database developed at Princeton University that serves as a resource for applications in natural language processing and information retrieval.
• Example senses of "cancer":
  - cancer, malignant neoplastic disease: any malignant growth or tumor caused by abnormal and uncontrolled cell division; it may spread to other parts of the body through the lymphatic system or the blood stream
  - Cancer, Crab: (astrology) a person who is born while the sun is in Cancer
  - Cancer: a small zodiacal constellation in the northern hemisphere; between Leo and Gemini
  - Cancer, Cancer the Crab, Crab: the fourth sign of the zodiac; the sun is in this sign from about June 21 to July 22
  - Cancer, genus Cancer: type genus of the family Cancridae

Unified Medical Language System

• UMLS was developed by the National Library of Medicine
• Disease is a semantic type with around 392 relations (109 semantic relations and 22 other relations)
• Pneumonia is categorized under one semantic type, Disease, but has hundreds of relations

Example Ontology: SNOMED-CT

• SNOMED CT stands for Systematized Nomenclature of Medicine Clinical Terms
• SNOMED-CT is the result of merging two ontologies: SNOMED-RT and Clinical Terms

Example Ontology: Clinical Trials

• Low participation in clinical trials is the major problem in the clinical and translational research area
• Matching patient records to clinical trials is presently a manual procedure, and it is tedious
• Need a semantic bridge between clinical ontologies (SNOMED CT, etc.) and raw patient data for
  - retrieving matching patient records, clinical guidelines and clinical decision support systems (CDSS)

Technical Challenges

• Challenges to be faced in real-time scenarios:
  - Knowledge engineering
  - Scalability
  - Noisy or incomplete data
• Knowledge engineering
  - A clinical ontology has the concept "Drug", which describes the active composition of the various drugs
  - However, the patient record contains a list of vendor-specific drug names
• A clinical ontology describes the cause of a disorder, whereas patient records only specify the presence or absence of the disorder and where the clinical test was conducted

Architecture of Solution

[Figure: architecture diagram. Labels: Clinical Trials, Patient Data, SNOMED-CT, Query, Ontology, ABox, TBox, Reasoner]

Implementation Approach

• Mapping patient data terminology to SNOMED-CT
  - Using UMLS as an intermediate target
  - NLP mapping techniques
  - Manual mapping
• Map the raw patient data to SNOMED-CT terminology
  - Example: Cerner Drug: Lactulose Syrup 20 G/30 ml
  - SNOMED-CT: administeredSubstance
• Allow the user to specify which terms in the definition are to be matched
• The last bullet means ontology matching is NOT fully automated! This is a real problem for interoperating data!
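
The manual-mapping step can be sketched as a lookup table plus a review queue for everything the table does not cover, which is one way to picture why the matching is not fully automated. Everything here is a hypothetical illustration: the table contents, the substance label "Lactulose", the second drug name and the function names are assumptions, and no real SNOMED-CT identifiers are used.

```python
# Hypothetical hand-built table: normalized vendor drug string -> substance label.
MANUAL_MAP = {
    "lactulose syrup 20 g/30 ml": "Lactulose",
}


def map_drug(vendor_name: str):
    """Return (property, substance) for a known drug string, or None if unmapped."""
    key = " ".join(vendor_name.lower().split())  # case- and whitespace-insensitive
    substance = MANUAL_MAP.get(key)
    return ("administeredSubstance", substance) if substance else None


def split_records(drug_names):
    """Separate drug strings that map automatically from those needing manual review."""
    mapped, needs_review = [], []
    for name in drug_names:
        result = map_drug(name)
        if result:
            mapped.append((name, result))
        else:
            needs_review.append(name)
    return mapped, needs_review


mapped, needs_review = split_records(
    ["Lactulose Syrup 20 G/30 ml", "Brand X 5 mg tablet"]  # second entry is invented
)
print(mapped)
print(needs_review)
```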

Contrast in Representation

• Example:
  - SNOMED-CT: Disease 1 hasAgent Virus 007
      Infection due to Bacteria 001
      Infection due to MicroBacteria 007
  - Patient Record: Disease 1 Positive
• Because the patient record carries so little information, the query reasoner cannot find records that hold only partial data

How are Observations Reconciled?

Clinical Trial   Description
NCT00084266      Patients with MRSA
NCT00288808      Patients with warfarin
NCT00298870      Patients on steroids
NCT00304382      Patients with Pneumonia, source of Blood or Sputum

Ontology expressions:
• ∃ associatedObservation.MRSA
• ∃ associatedObservation.PneumococcalPneumonia ⊓ ∃ hasSpecimanSource.(Blood ⊔ Sputum)
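
A rough sketch of how such criteria could be evaluated against a patient record follows. Each criterion is a conjunction of existential restrictions, and a restriction holds when the patient has at least one of the allowed values. This flat set-membership test stands in for a description-logic reasoner, which would also use the class hierarchy. Pairing the two expressions with NCT00084266 and NCT00304382, the patient record layout and the function name are all assumptions.

```python
# Trial -> list of (property, allowed values); every pair must hold (conjunction).
CRITERIA = {
    "NCT00084266": [("associatedObservation", {"MRSA"})],
    "NCT00304382": [
        ("associatedObservation", {"PneumococcalPneumonia"}),
        ("hasSpecimanSource", {"Blood", "Sputum"}),  # union of two classes
    ],
}


def eligible_trials(patient: dict) -> list:
    """Return the trials whose every restriction is met by at least one patient value."""
    hits = []
    for trial, conjuncts in CRITERIA.items():
        if all(patient.get(prop, set()) & allowed for prop, allowed in conjuncts):
            hits.append(trial)
    return hits


# Hypothetical patient records.
patient_a = {
    "associatedObservation": {"PneumococcalPneumonia"},
    "hasSpecimanSource": {"Sputum"},
}
patient_b = {"associatedObservation": {"MRSA"}}
print(eligible_trials(patient_a))
print(eligible_trials(patient_b))
```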

Clinical Decision Support System

• Clinical Decision Support Systems (CDSS) are
  - Interactive computer programs
  - Designed to assist physicians and other health professionals with decision-making tasks
• Components of CDSS:
  - Knowledge Base
  - Rule-Based Engine
  - Case Base
  - Business Models

Example of Usage of Rules

General form:

    IF "RULE 1" & "RULE 2" & "RULE 3" ….. "Rule n"
    THEN "INTERVENTION 1 or Rule M"

Example:

    IF p.getGender() = "male" & p.getAge() = 34 & p.getBP() < 140 & p.getInsulinLevel() < 20
    THEN "Asthma Intervention Level 2"

Ontology expression:

    Class Patient
      HasGender "male" ⊓ hasAge "34" ⊓ hasBP MoreThan 140 ⊓ hasInsulinLevel MoreThan 20
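
The IF/THEN example can be written as an ordinary function, which is roughly what a rule-based engine evaluates for each patient. The sketch follows the thresholds of the IF/THEN form (blood pressure below 140, insulin level below 20). The `Patient` fields are hypothetical stand-ins for the slide's getGender(), getAge(), getBP() and getInsulinLevel() accessors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    """Hypothetical patient record with only the fields the rule needs."""
    gender: str
    age: int
    bp: int
    insulin_level: float


def recommend(p: Patient) -> Optional[str]:
    """Return the intervention from the slide when every condition holds, else None."""
    if p.gender == "male" and p.age == 34 and p.bp < 140 and p.insulin_level < 20:
        return "Asthma Intervention Level 2"
    return None


print(recommend(Patient("male", 34, 120, 15)))
print(recommend(Patient("male", 34, 150, 15)))
```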

Ontology Integration

• All ontologies developed have a common aim: describing the domain knowledge
• Integration of ontologies is becoming critical
  - Applications tend to use multiple ontologies
  - Concepts in the various ontologies overlap, or the same concept is described in multiple ways
• For example, the concept "Blood" is described differently:
  - "Fluid" in one ontology
  - "Substance" in another ontology
  - "Semi-solid" in a third ontology
• Need to reconcile these differences when attempting to "combine" data that originates from different ontologies

Example of Conflicting Ontologies

• Ontology 1:
  - Disease references Symptoms, which references Treatments
  - Hierarchy of:
    - Disease: Respiratory Disease, Cardio Disease, Nervous Disease
    - Symptoms: General Symptoms, Behavioral Symptoms
    - Treatment: General Treatment, Surgical Treatments
• Ontology 2:
  - Symptoms references Diseases, which references Treatments
  - Hierarchy of:
    - Symptoms: General Symptoms, Behavioral Symptoms
    - Disease: Respiratory Disease, Cardio Disease, Nervous Disease
    - Treatment: General Treatment, Surgical Treatments
• Previously discussed issues:
  - How do you integrate ontologies across HIT to support HIE and a virtual chart?
  - How do you merge data-intensive conflicting ontologies?
  - How do you query from inside out?
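
One small piece of the structural conflict above is the direction of the references: Ontology 1 goes from Disease to Symptoms, while Ontology 2 goes from Symptoms to Disease. The sketch below inverts one direction into the other so that both can be queried the same way. The symptom names and the `invert` helper are invented for illustration, and real ontology merging involves much more than flipping a mapping.

```python
from collections import defaultdict

# Ontology 1 style: each disease references its symptoms (sample data is invented).
disease_to_symptoms = {
    "Respiratory Disease": ["Cough", "Fatigue"],
    "Cardio Disease": ["Fatigue"],
}


def invert(mapping: dict) -> dict:
    """Turn disease -> symptoms into symptom -> diseases (Ontology 2 style)."""
    inverted = defaultdict(list)
    for disease, symptoms in mapping.items():
        for symptom in symptoms:
            inverted[symptom].append(disease)
    return dict(inverted)


symptom_to_diseases = invert(disease_to_symptoms)
print(symptom_to_diseases)
```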

Ontology Integration

• Semantic vs. structural integration?
• Difficulties arise when integrating ontologies that are similar, the same, or complementary

[Figure: labels include Ontology B]

OASIS

• Ontology Mapping and Integration Framework

Summary - Ontologies

• Ontology
  - Definition and Descriptions
  - Many Examples in Practice
  - OWL and RDF
• Biomedical Ontology
  - OpenCyc
  - WordNet
  - SNOMED-CT
• Application of Biomedical Ontology
  - Clinical Trials
  - OASIS: Integration Technique
  - Clinical Decision Support System
• Integration of Ontologies