R T U New York State Center of Excellence in Bioinformatics & Life Sciences

Sharing Data while Keeping Control
Werner CEUSTERS, MD
Director, Ontology Research Group
Center of Excellence in Bioinformatics and Life Sciences
University at Buffalo, NY, USA

Short personal history
[Timeline figure, 1959 - 2010; years marked: 1977, 1989, 1992, 1993, 1995, 1998, 2002, 2004, 2006, and an open ‘?’]

Outline
• The ontology of data
  – representations of reality
• Realism-based ontology
  – keeping track of what is generic
• Referent Tracking
  – keeping track of what is specific
• Referent Tracking for data management
  – keeping track of data and of what they are about

Data generation and use
[Figure labels: observation & measurement; data organization; model development; Generic beliefs; application; use; add; verify; Δ = outcome; further R&D (instrument and study optimization)]

A crucial distinction: data and what they are about
[Figure labels: First-Order Reality; is about; Representation; observation & measurement; data organization; model development; Generic beliefs; application; use; add; verify; Δ = outcome; further R&D (instrument and study optimization)]

A non-trivial relation
[Figure labels: Referent; Reference]

A non-trivial relation
[Figure labels: Concept ?; Referent; Reference]

Some key questions
[Figure labels: Referent; Reference]
• What referents, if any at all, are depicted by a putative reference?
• How do changes at the level of the referents correspond with changes in the collection of references?
• If references are transmitted, how can the receiver know what referents are depicted?

Some answers are in Realism-based Ontology
• There is an external reality which is ‘objectively’ the way it is;
• That reality is accessible to us;
• We build in our brains cognitive representations of reality;
• We communicate with others about what is there, and what we believe is there.
Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, Biomedical Ontology in Action, November 8, 2006, Baltimore MD, USA.

Realism makes crucial distinctions
• Between data and what data are about:
  – Level 1 entities (L1):
    • everything that exists or existed
    • some are referents
    • some are L2, some are L3, none are both L2 and L3 (‘are’ used informally)
  – Level 2 entities (L2): beliefs
    • all are L1
    • some are about other L1-entities but none about themselves
  – Level 3 entities (L3): expressions
    • all are L1, none are L2
    • some are about other L1-entities and some about themselves

Realism makes crucial distinctions
• Between data and what data are about;
• Between continuants and occurrents:
  – obvious differences:
    • a person versus his life
    • an elevator versus its going up and down
    • space versus time
  – more subtle differences (nonexistent in flawed models, e.g. HL7-RIM):
    • observation (data-element) versus observing
    • diagnosis versus making a diagnosis
    • message versus transmitting a message

Realism makes crucial distinctions
• Between data and what data are about;
• Between continuants and occurrents;
• Between what is generic and what is specific
…

Data and what they are about
[Figure labels: First-Order Reality; is about; Representation; observation & measurement; data organization; model development; Generic beliefs; application; use; add; verify; Δ = outcome; further R&D (instrument and study optimization)]

Generic versus specific referents
[Figure labels: specific; generic; is about (three arrows); observation & measurement; data organization; model development; Generic beliefs; use; add; verify; Δ = outcome]

Using generic representations for specific entities is inadequate

Pt. ID  Date        SNOMED CT code  Narrative
5572    04/07/1990  26442006        closed fracture of shaft of femur
5572    04/07/1990  81134009        Fracture, closed, spiral
5572    12/07/1990  26442006        closed fracture of shaft of femur
5572    12/07/1990  9001224         Accident in public building (supermarket)
5572    04/07/1990  79001           Essential hypertension
0939    24/12/1991  255174002       benign polyp of biliary tract
2309    21/03/1992  26442006        closed fracture of shaft of femur
2309    21/03/1992  9001224         Accident in public building (supermarket)
47804   03/04/1993  58298795        Other lesion on other specified region
5572    17/05/1993  79001           Essential hypertension
298     22/08/1993  2909872         Closed fracture of radial head
298     22/08/1993  9001224         Accident in public building (supermarket)
5572    01/04/1997  26442006        closed fracture of shaft of femur
5572    01/04/1997  79001           Essential hypertension
0939    20/12/1998  255087006       malignant polyp of biliary tract

The representational square

                           Generic                              Specific
L3. Representation         ‘person’, ‘drug’, ‘insulin’          ‘W. Ceusters’, ‘my sugar’
L2. Beliefs (knowledge)    DIAGNOSIS, INDICATION                my doctor’s work plan, my doctor’s diagnosis
L1. First-order reality    PATHOLOGICAL STRUCTURE, DRUG,        my doctor, me, my doctor’s computer,
                           MOLECULE, PERSON, DISEASE,           my NIDDM, my blood glucose level
                           PORTION OF INSULIN
Covered by                 Basic Formal Ontology                Referent Tracking

A division of labor
• Basic Formal Ontology (BFO), a specific embodiment of realism-based ontology, aims to represent what is generic.
• Referent Tracking aims to represent what is specific.

Referent Tracking
• A paradigm under development since 2005,
  – based on Basic Formal Ontology,
  – designed to keep track of relevant portions of reality and what is believed and communicated about them,
  – enabling adequate use of realism-based ontologies, terminologies, thesauri, and vocabularies,
  – originally conceived to track particulars on the side of the patient and his environment denoted in his EHR,
  – but since then studied in and applied to a variety of domains,
  – and now evolving towards tracking absolutely everything.

Stead and Lin’s ‘Principles for Success’ in Health IT
• Evolutionary change
• Radical change:
  • Principle 6: Architect Information and Workflow Systems to Accommodate Disruptive Change
    » Organizations should architect health care IT for flexibility to support disruptive change rather than to optimize today’s ideas about health care.
  • Principle 7: Archive Data for Subsequent Re-interpretation
    » Vendors of health care IT should provide the capability of recording any data collected in their measured, uninterpreted, original form, archiving them as long as possible to enable subsequent retrospective views and analyses of those data.
    NOTE: ‘See, for example, Werner Ceusters and Barry Smith, “Strategies for Referent Tracking in Electronic Health Records”, Journal of Biomedical Informatics 39(3): 362-378, June 2006.’
William W. Stead and Herbert S. Lin, editors; Committee on Engaging the Computer Science Research Community in Health Care Informatics; National Research Council. Computational Technology for Effective Health Care: Immediate Steps and Strategic Directions (2009).

Key feature of Referent Tracking
• The assignment of an Instance Unique Identifier (IUI) to each entity in reality (i.e. a referent) about which a representation (i.e. a reference) is maintained in some system.
• Representations are linked by means of these IUIs following principles established in realism-based ontology.
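The key feature above can be illustrated with a minimal sketch. It is not the Referent Tracking System itself: the class name `ReferentRegistry`, its methods, and the use of a free-text `referent_key` to stand in for the (human) judgment that two references concern the same entity are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ReferentRegistry:
    """Toy registry (hypothetical): one IUI per referent, many references per IUI."""

    _iui_by_referent: dict = field(default_factory=dict)
    _references: dict = field(default_factory=dict)

    def assign_iui(self, referent_key: str) -> str:
        # A referent that is already known keeps its IUI; a new one gets a fresh IUI.
        if referent_key not in self._iui_by_referent:
            self._iui_by_referent[referent_key] = f"IUI-{uuid.uuid4()}"
        return self._iui_by_referent[referent_key]

    def add_reference(self, iui: str, reference: str) -> None:
        # Representations are linked to each other only through the shared IUI.
        self._references.setdefault(iui, []).append(reference)

    def references_for(self, iui: str) -> list:
        return list(self._references.get(iui, []))


registry = ReferentRegistry()
fracture = registry.assign_iui("fracture of patient 5572's femur, 1990")
registry.add_reference(fracture, "26442006 closed fracture of shaft of femur")
registry.add_reference(fracture, "81134009 Fracture, closed, spiral")
print(registry.references_for(fracture))
```

Both coded statements end up attached to one identifier, which is the point of the slide: the codes say what kind of thing is meant, the IUI says which one.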

Codes for ‘types’ AND identifiers for instances

Pt. ID  Date        SNOMED CT code  IUI      Narrative
5572    04/07/1990  26442006        IUI-001  closed fracture of shaft of femur
5572    04/07/1990  81134009        IUI-001  Fracture, closed, spiral
5572    12/07/1990  26442006        IUI-001  closed fracture of shaft of femur
5572    12/07/1990  9001224         IUI-007  Accident in public building (supermarket)
5572    04/07/1990  79001           IUI-005  Essential hypertension
0939    24/12/1991  255174002       IUI-004  benign polyp of biliary tract
2309    21/03/1992  26442006        IUI-002  closed fracture of shaft of femur
2309    21/03/1992  9001224         IUI-007  Accident in public building (supermarket)
47804   03/04/1993  58298795        IUI-006  Other lesion on other specified region
5572    17/05/1993  79001           IUI-005  Essential hypertension
298     22/08/1993  2909872         IUI-003  Closed fracture of radial head
298     22/08/1993  9001224         IUI-007  Accident in public building (supermarket)
5572    01/04/1997  26442006        IUI-012  closed fracture of shaft of femur
5572    01/04/1997  79001           IUI-005  Essential hypertension
0939    20/12/1998  255087006       IUI-004  malignant polyp of biliary tract

7 distinct disorders
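A small sketch of what the IUI column makes computable. The rows are transcribed from the table above; treating the rows coded 9001224 (the accident) as something other than a disorder is an assumption made here to reproduce the slide's count, and the variable names are illustrative only.

```python
from collections import defaultdict

# (patient ID, SNOMED CT code, IUI), transcribed row by row from the table above.
ROWS = [
    ("5572", "26442006", "IUI-001"),
    ("5572", "81134009", "IUI-001"),
    ("5572", "26442006", "IUI-001"),
    ("5572", "9001224", "IUI-007"),
    ("5572", "79001", "IUI-005"),
    ("0939", "255174002", "IUI-004"),
    ("2309", "26442006", "IUI-002"),
    ("2309", "9001224", "IUI-007"),
    ("47804", "58298795", "IUI-006"),
    ("5572", "79001", "IUI-005"),
    ("298", "2909872", "IUI-003"),
    ("298", "9001224", "IUI-007"),
    ("5572", "26442006", "IUI-012"),
    ("5572", "79001", "IUI-005"),
    ("0939", "255087006", "IUI-004"),
]

# Assumption: the accident is an event, not a disorder, so it is left out of the count.
ACCIDENT_CODE = "9001224"

iuis_by_code = defaultdict(set)  # one code, possibly many instances
codes_by_iui = defaultdict(set)  # one instance, possibly many codes
for _patient, code, iui in ROWS:
    iuis_by_code[code].add(iui)
    codes_by_iui[iui].add(code)

disorder_iuis = {iui for _patient, code, iui in ROWS if code != ACCIDENT_CODE}

print(len(disorder_iuis))                  # number of distinct disorder instances
print(sorted(iuis_by_code["26442006"]))    # femur-fracture code: which fractures?
print(sorted(codes_by_iui["IUI-004"]))     # one polyp: which codes over time?
```

With codes alone, the four femur-fracture rows cannot be told apart; with IUIs they resolve to three fractures, and the benign and malignant polyp codes resolve to one and the same polyp.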

The shift envisioned
• From:
  – ‘this human being is a 40 year old patient with a stomach tumor’
• To (something like):
  – ‘this-1 on which depends this-2 and this-3 has this-4’, where
    • this-1  instanceOf  human being       at t1
    • this-2  instanceOf  age-of-40-years   at t2
    • this-2  qualityOf   this-1            at t2
    • this-3  instanceOf  patient-role      at t3
    • this-3  roleOf      this-1            at t3
    • this-4  instanceOf  tumor             at t4
    • this-4  partOf      this-5            at t6
    • this-5  instanceOf  stomach           at t7
    • this-5  partOf      this-1            at t8
    • …

The shift envisioned (annotations)
In each tuple above:
• ‘this-1’ … ‘this-5’ are denotators for particulars (specific entities);
• ‘instanceOf’, ‘qualityOf’, ‘roleOf’ and ‘partOf’ are denotators for appropriate relations;
• the term after the relation is a denotator for a universal or class (what is generic) or for a particular;
• ‘at t1’ … ‘at t8’ are the time periods (for continuants) when the relationships hold.
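The tuples of this slide can be written down as plain data, which is one way to see the four roles named in the annotations. This is only a sketch: the `Assertion` type and the helper functions are hypothetical, and the time labels are copied from the slide as opaque strings without any assumed ordering.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    subject: str   # denotator for a particular
    relation: str  # denotator for a relation
    obj: str       # denotator for a universal/class or for another particular
    time: str      # period during which the relationship holds


ASSERTIONS = [
    Assertion("this-1", "instanceOf", "human being", "t1"),
    Assertion("this-2", "instanceOf", "age-of-40-years", "t2"),
    Assertion("this-2", "qualityOf", "this-1", "t2"),
    Assertion("this-3", "instanceOf", "patient-role", "t3"),
    Assertion("this-3", "roleOf", "this-1", "t3"),
    Assertion("this-4", "instanceOf", "tumor", "t4"),
    Assertion("this-4", "partOf", "this-5", "t6"),
    Assertion("this-5", "instanceOf", "stomach", "t7"),
    Assertion("this-5", "partOf", "this-1", "t8"),
]

PARTICULARS = {a.subject for a in ASSERTIONS}


def types_of(particular: str) -> list:
    """Universals the particular is asserted to instantiate."""
    return [a.obj for a in ASSERTIONS
            if a.subject == particular and a.relation == "instanceOf"]


def dependents_of(particular: str) -> list:
    """Qualities and roles asserted to depend on the particular."""
    return sorted(a.subject for a in ASSERTIONS
                  if a.obj == particular and a.relation in {"qualityOf", "roleOf"})


print(types_of("this-4"))
print(dependents_of("this-1"))
```

Every object term that is not itself in `PARTICULARS` is a generic term, which is where an ontology such as BFO would supply the meaning.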

Relevance: the way RT-representations interact with representations of generic portions of reality
[Figure labels: instance-of at t; caused by; #105]

Referent Tracking based data warehousing

Referent Tracking System Environment

Networks of Referent Tracking systems

General principles of RT-enabled data warehousing (1)
• Unique identifier for:
  – each data-element and combinations thereof (L3),
  – what the data-element is about (L1),
  – each generated copy of an existing data-element (L3),
  – each transaction involving data-elements (L1);
• Identifiers centrally managed in RTS;
• Exclusive use of ontologies for type descriptions following OBO-Foundry principles;
• Centrally managed data dictionaries, data-ownership, exchange criteria.

General principles of RT-enabled data warehousing (2)
• Central inventory of ‘attributes’ but peripheral maintenance of ‘values’;
• Identifiers function as pseudonyms:
  – centrally known that for person IUI-1 there are values about instances of UUI-2 maintained by researcher/clinician IUI-3 for periods IUI-4, IUI-5, …
• Disclosure of what the identifiers stand for based on need and right to know;
• Generation of off-line datasets for research with transaction-specific identifiers for each element.
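A rough sketch of the split between a central inventory of ‘attributes’ and peripheral maintenance of ‘values’, with fresh identifiers for every off-line export. The class names, method names and the `TX-` prefix are assumptions for illustration; disclosure rules (need and right to know) and the period identifiers are deliberately not modeled.

```python
import uuid


class CentralInventory:
    """Knows which attributes exist for whom and who keeps them, never the values."""

    def __init__(self):
        self._holders = {}  # (person IUI, attribute UUI) -> set of custodian IUIs

    def register(self, person_iui, attribute_uui, custodian_iui):
        self._holders.setdefault((person_iui, attribute_uui), set()).add(custodian_iui)

    def holders(self, person_iui, attribute_uui):
        return set(self._holders.get((person_iui, attribute_uui), set()))


class PeripheralStore:
    """Kept by one researcher/clinician; the only place where values live."""

    def __init__(self, custodian_iui, inventory):
        self.custodian_iui = custodian_iui
        self._inventory = inventory
        self._values = {}  # (person IUI, attribute UUI) -> value

    def record(self, person_iui, attribute_uui, value):
        self._values[(person_iui, attribute_uui)] = value
        self._inventory.register(person_iui, attribute_uui, self.custodian_iui)

    def export_offline(self):
        """Return (rows, key): rows carry only transaction-specific identifiers;
        the key that maps them back to IUIs stays with the custodian."""
        rows, key = [], {}
        for (person_iui, attribute_uui), value in self._values.items():
            element_id = f"TX-{uuid.uuid4()}"
            key[element_id] = (person_iui, attribute_uui)
            rows.append({"element": element_id, "attribute": attribute_uui, "value": value})
        return rows, key


inventory = CentralInventory()
store = PeripheralStore("IUI-3", inventory)
store.record("IUI-1", "UUI-2", 72.5)
print(inventory.holders("IUI-1", "UUI-2"))
rows, key = store.export_offline()
print(rows[0]["attribute"], rows[0]["value"])
```

Because each export draws new identifiers, two off-line datasets cannot be joined on the element identifier without the custodian's key.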

Feedback to clinical care
• Finding ‘similar’ patient cases:
  – suggestions for prevention, investigation, treatment;
• ‘Outbreak’ detection;
• Comparing outcomes:
  – related to disorders, providers, treatments, …
• Links to literature;
• Clinical trial selection;
• …

Assigning IUIs
#1: this lady
#2: “Simpson”
#3: “Smith”
#4: #1’s mass
#5: representation of #4 at 2010-03-31: 08.30
#6: representation of #4 at 2010-04-14: 09.57
#7: #1’s last name
#8: this spreadsheet
#9: this column of #8
#10: format of entries in #9
#11: owner of #8
#12: copy of #8 sent to #13
…

Using IUIs
Relations among the entities identified on the previous slide:
#1 has-name #7 at …
#7 represented-by #2 at t1
#7 represented-by #3 at t2
#4 inheres-in #1 since …
#4 represented-by #5 since …
…
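As a sketch of how such statements could be queried, the relations of this slide are stored below as tuples and used to look up which string represented the lady's last name at a given time. The function name is hypothetical, the elided times (‘…’) are kept as `None` rather than guessed, and times are matched by label only, since the slide does not say how t1 and t2 relate.

```python
# (subject, relation, object, time); None marks a time the slide leaves as "…".
STATEMENTS = [
    ("#1", "has-name", "#7", None),
    ("#7", "represented-by", "#2", "t1"),
    ("#7", "represented-by", "#3", "t2"),
    ("#4", "inheres-in", "#1", None),
    ("#4", "represented-by", "#5", None),
]

# The two IUIs that denote character strings.
STRINGS = {"#2": "Simpson", "#3": "Smith"}


def last_name_at(person, time):
    """String representing the person's name at the given time label, if stated."""
    names = [o for s, r, o, _ in STATEMENTS if s == person and r == "has-name"]
    for name in names:
        for s, r, o, t in STATEMENTS:
            if s == name and r == "represented-by" and t == time and o in STRINGS:
                return STRINGS[o]
    return None


print(last_name_at("#1", "t1"))
print(last_name_at("#1", "t2"))
```

The name itself (#7) stays the same entity throughout; only its representation changes from one string to the other, which is what the separate IUIs for #2, #3 and #7 make explicit.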

Conclusion: a general framework for unambiguous representation
[Figure labels: L1; R; L2 (beliefs); L3 (symbolizations); ‘about’]