Randi Vita, M.D.
Better living through ontologies at the Immune Epitope Database
La Jolla Institute for Allergy & Immunology, Division of Vaccine Discovery, La Jolla, California
The Immune Epitope Database
• Free online resource of experimentally derived epitope information
• IEDB has >99% of all published epitope data
• Covers allergy, infectious diseases, autoimmune diseases, and transplant/alloantigens
• New data is added every week
• 18,300 references
• >1,000 experimental assays
• 275,000 peptidic epitopes
• 2,400 nonpeptidic epitopes
What is an epitope?
• The epitope is the portion of a pathogen, allergen, or autoantigen that the immune system recognizes
• Antibodies and T cells bind to epitopes to trigger an immune response
• Antibodies typically bind to discontinuous residues of proteins
• T cells recognize epitopes (typically peptides) presented by MHC molecules
[Figure: T cell and APC]
Why do epitopes matter?
• Vaccine development
• Allergy immunotherapy
• Immunogenicity
• Transplantation
Collaborations with existing resources and ontologies (OBO)
• Ensures consistency and accuracy
• Provides standardized nomenclature
• Provides definitions, synonyms, and hierarchical relationships for database terms
• Makes curation easier
• Enhances user experience
• Facilitates interoperability
Embedded Finders driven by external resources and ontologies (OBO)
[Diagram labels: Structure; Epitope; Source Molecule; Organism source; Host; Immunization; Process(es); Immunogen; Disease; Assay Type; Assay; T cells; MHC restriction; MRO]
UniProt Molecule Finder
[Diagram labels: Peptidic Epitope; Amino acid sequence; Protein source; Organism source]
Users can see all proteins expressed by an organism as well as processed fragments.
NCBI Organism Finder
[Diagram labels: Peptidic Epitope; Amino acid sequence; Protein source; Organism source]
Organism example in IEDB: NCBI hierarchy
Organism example in IEDB: immunologist-friendly hierarchy
Disease Ontology Finder
[Diagram labels: Host; Immunization; Immunogen; Process(es); Disease]
Model, Reason and Infer
We use logical reasoning to create validation rules that can identify errors within the data, enforce accurate curation, and infer data fields. For example, Timothy grass allergy is logically defined as has_allergic_trigger (pollen produced_by Phleum pratense), so it should not be curated as being caused by a chicken egg. Using the logical definitions to drive validation:
• existing data with this kind of mismatch is flagged as an error
• the curation interface will not allow new entries with this type of error
• the allergen can be inferred
[Figure labels: Curation Fields; Logical Definition; invalid]
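The validation idea on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not the IEDB implementation: the `AllergenDefinition` and `CurationRecord` classes, the `LOGICAL_DEFINITIONS` table, and both functions are hypothetical names, and the logical definition is reduced to a plain lookup rather than an ontology reasoner.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AllergenDefinition:
    """Simplified stand-in for: has_allergic_trigger (material produced_by organism)."""
    material: str
    produced_by: str


# Hypothetical table of logical definitions, keyed by disease name.
LOGICAL_DEFINITIONS: Dict[str, AllergenDefinition] = {
    "Timothy grass allergy": AllergenDefinition("pollen", "Phleum pratense"),
}


@dataclass
class CurationRecord:
    disease: str
    allergen_organism: Optional[str] = None  # may be left empty by the curator


def validate(record: CurationRecord) -> List[str]:
    """Return error messages for a record that contradicts its disease definition."""
    definition = LOGICAL_DEFINITIONS.get(record.disease)
    if definition is None or record.allergen_organism is None:
        return []  # nothing to check against
    if record.allergen_organism != definition.produced_by:
        return [
            f"{record.disease} requires an allergen produced by "
            f"{definition.produced_by}, not {record.allergen_organism}"
        ]
    return []


def infer_allergen(record: CurationRecord) -> Optional[str]:
    """Fill in the allergen source organism from the disease definition, if known."""
    definition = LOGICAL_DEFINITIONS.get(record.disease)
    return definition.produced_by if definition else None


bad = CurationRecord("Timothy grass allergy", "Gallus gallus")  # chicken
good = CurationRecord("Timothy grass allergy", "Phleum pratense")
blank = CurationRecord("Timothy grass allergy")

print(validate(bad))          # one error message
print(validate(good))         # no errors
print(infer_allergen(blank))  # inferred source organism
```

The same check can serve both purposes named on the slide: run over existing records to flag errors, and run at entry time to reject a new record before it is saved.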
Gained Knowledge
The curator only has to specify ‘Occurrence of allergy’ and the allergen as benzylpenicillin (CHEBI:18208)
• benzylpenicillin allergy is inferred as the disease
• drug allergy, antibacterial drug, beta-lactam antibiotic, etc. are gained knowledge
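A rough sketch of how knowledge could be gained from a single curated allergen follows. The `IS_A` table is a toy, simplified chain chosen to match the terms on this slide; it is not the actual ChEBI hierarchy, and building disease names by string concatenation is an illustrative shortcut rather than how the ontology defines them.

```python
from typing import Dict, List

# Toy is_a hierarchy (assumption: heavily simplified, not real ChEBI structure).
IS_A: Dict[str, List[str]] = {
    "benzylpenicillin": ["beta-lactam antibiotic"],
    "beta-lactam antibiotic": ["antibacterial drug"],
    "antibacterial drug": ["drug"],
}


def ancestors(term: str, is_a: Dict[str, List[str]]) -> List[str]:
    """Collect every broader class reachable from term, nearest first."""
    found: List[str] = []
    queue = list(is_a.get(term, []))
    while queue:
        parent = queue.pop(0)
        if parent not in found:
            found.append(parent)
            queue.extend(is_a.get(parent, []))
    return found


def infer_from_allergen(allergen: str) -> Dict[str, object]:
    """Derive the disease and the broader classes implied by one allergen."""
    classes = ancestors(allergen, IS_A)
    return {
        "disease": f"{allergen} allergy",
        "allergen_classes": classes,
        "broader_diseases": [f"{c} allergy" for c in classes],
    }


gained = infer_from_allergen("benzylpenicillin")
print(gained["disease"])
print(gained["allergen_classes"])
print(gained["broader_diseases"])
```

The curator supplies one term; everything else comes from walking the hierarchy, which is why a record curated against benzylpenicillin can later be found by a search for drug allergy.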
OBI Assay Finder
[Diagram labels: Assay Type; Assay; T cells; MHC restriction]
Users can search on all T cell assays, all cytokine assays, or selectively on IL-2 assays.
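Searching at different levels of an assay hierarchy can be illustrated as below. The `CHILDREN` table is a made-up three-level tree, not the real OBI structure, and the record layout and the `search` helper are hypothetical.

```python
from typing import Dict, List, Set

# Toy assay hierarchy (assumption: illustrative only, not taken from OBI).
CHILDREN: Dict[str, List[str]] = {
    "T cell assay": ["cytokine assay", "proliferation assay"],
    "cytokine assay": ["IL-2 assay", "IFN-gamma assay"],
}


def descendants(term: str) -> Set[str]:
    """Return the term itself plus every more specific assay type beneath it."""
    result = {term}
    for child in CHILDREN.get(term, []):
        result |= descendants(child)
    return result


def search(records: List[dict], term: str) -> List[dict]:
    """Select records whose assay type is the term or any of its descendants."""
    wanted = descendants(term)
    return [r for r in records if r["assay"] in wanted]


records = [
    {"id": 1, "assay": "IL-2 assay"},
    {"id": 2, "assay": "IFN-gamma assay"},
    {"id": 3, "assay": "proliferation assay"},
]

for query in ("T cell assay", "cytokine assay", "IL-2 assay"):
    print(query, [r["id"] for r in search(records, query)])
```

Each record stores only its most specific assay type; the broad queries work because the search expands the query term through the hierarchy before matching.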
Interoperability
As more resources represent information using formal ontologies (especially OBO Foundry):
• Interoperability between data sources is facilitated
• Queries across resources become possible
Example: What are the shared features of chemicals causing allergic responses?
Thanks
James Overton
ChEBI, DO, OBI, Gaz – PRO, MGI, IPTM
- Slides: 16