GO and OBO: an introduction
• What is the Gene Ontology? • What is OBO? • OBO-Edit demo & practical Jane Lomax EMBL-EBI
Gene Ontology • Built for a very specific purpose: “annotation of genes and proteins in genomic and protein databases” • Applicable to all species
Evolution of GO • Original GO created in 2000 • Three databases involved: – FlyBase (Drosophila) – MGI (Mouse) – SGD (S. cerevisiae) • Used immediately
Evolution of GO • Later databases: – TAIR (Arabidopsis) – TIGR (microbes including prokaryotes) – SWISS-PROT (several thousand species inc. human) – PSU (P. falciparum) • Recent additions: – ZFIN (zebrafish) – PAMGO (plant pathogens)
Evolution of GO • GO development traditionally annotation-driven – development directed by use • Terms added as new species annotated • Terms added on an as-needed basis
Evolution of GO • Developed by an international consortium of biologists and computer scientists – members from individual databases – central office at EBI • Development involves collaboration with domain experts from different biological fields – also formal ontologists
Evolution of GO • Resulted in ‘organic’ structure, little formality • Ontological formality added subsequently – philosophical and logical
Growth of GO
How does GO work? What information might we want to capture about a gene product? • What does the gene product do? • Where and when does it act? • Why does it perform these activities?
GO structure • GO terms divided into three parts: – cellular component – molecular function – biological process
Cellular Component • where a gene product acts
Cellular Component • Enzyme complexes in the component ontology refer to places, not activities.
Molecular Function • activities or “jobs” of a gene product • examples: – glucose-6-phosphate isomerase activity – insulin binding – insulin receptor activity – drug transporter activity
Molecular Function • A gene product may have several functions; a function term refers to a single reaction or activity, not a gene product. • Sets of functions make up a biological process.
Biological Process • a commonly recognized series of events • examples: – cell division – transcription – regulation of gluconeogenesis – limb development – courtship behavior
Ontology Structure • Terms are linked by two relationships – is-a – part-of
Ontology Structure (diagram labels: cell membrane, mitochondrial membrane, chloroplast membrane; relationship labels: is-a, part-of)
Ontology Structure • Ontologies are structured as a hierarchical directed acyclic graph (DAG) • Terms can have more than one parent and zero, one or more children
Ontology Structure • Directed Acyclic Graph (DAG): multiple parentage allowed (diagram labels: cell membrane, mitochondrial membrane, chloroplast membrane)
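The DAG idea can be sketched in a few lines of Python. This is only an illustration: the term names echo the slide's membrane example, but the specific edges, the extra terms "mitochondrion" and "chloroplast", and the helper names `ancestors` and `is_acyclic` are assumptions made for the sketch, not read off the diagram or taken from GO tooling.

```python
from collections import deque

# Hypothetical toy graph: child term -> list of (relationship, parent term).
# Edges are assumed for illustration; they are not copied from the slide.
PARENTS = {
    "mitochondrial membrane": [("is_a", "membrane"), ("part_of", "mitochondrion")],
    "chloroplast membrane": [("is_a", "membrane"), ("part_of", "chloroplast")],
    "mitochondrion": [("part_of", "cell")],
    "chloroplast": [("part_of", "cell")],
    "membrane": [("part_of", "cell")],
    "cell": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for _relationship, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def is_acyclic(parents=PARENTS):
    """Check the 'acyclic' property: no term is its own ancestor."""
    return all(term not in ancestors(term, parents) for term in parents)
```

Here "mitochondrial membrane" has two parents, which a strict tree would not allow, and `ancestors("mitochondrial membrane")` collects "membrane", "mitochondrion" and "cell".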
Open Biomedical Ontologies (OBO) • GO is a member of OBO • An umbrella project for grouping different ontologies in the biological/medical field – a repository for ontologies with a defined set of standards • Available from a single source: http://obo.sourceforge.net/
Why do we need OBO? • GO covers a small area of biology: – molecular function of a protein – biological function of a protein – cellular location of a protein
Why do we need OBO? • Many other aspects also need to be captured, e.g.: – phenotype – anatomy – genomic – taxonomy
Why do we need OBO? • Many groups develop their own ontologies – e.g. plant ontology, anatomies for specific organisms • No standardisation of ontologies with respect to: – format – scope – relationships • No way of knowing whether such ontologies already exist • No mechanism of distribution for other groups
Why do we need OBO? • Creating ontologies takes a lot of work – Makes sense to reuse existing ontologies where possible • Improves data integration where a small set of ontologies is used • Allows ontologies to be made available from a single place
Why do we need OBO? • Ultimate aim: a complete set of integrated ontologies completely covering the biomedical domain
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint
OBO requirements: open • Ontologies can be used by anyone without any constraints, except that: – original authors must be acknowledged – ontologies cannot be edited and then released under the same name
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax
OBO requirements: syntax • Usually the OBO format, same as primary GO format – and adaptations of OBO format • OWL (Web Ontology Language) format is also accepted • Allows the same tools to be applied, facilitating shared software implementations
Anatomy of an OBO term
• unique ID – id: GO:0006094
• term name – name: gluconeogenesis
• ontology – namespace: process
• definition – def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]
• synonym – exact_synonym: glucose biosynthesis
• database ref – xref_analog: MetaCyc:GLUCONEO-PWY
• parentage – is_a: GO:0006006
• parentage – is_a: GO:0006092
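To show how the tag-value layout of a term can be read by a program, here is a minimal Python sketch that splits the gluconeogenesis term into a dictionary. It assumes the simplified one-tag-per-line layout shown on the slide. It is not a full OBO format parser, and `parse_stanza` is a hypothetical helper name, not part of OBO-Edit or any GO library.

```python
# The term from the slide, written one "tag: value" pair per line.
STANZA = """\
id: GO:0006094
name: gluconeogenesis
namespace: process
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
is_a: GO:0006006
is_a: GO:0006092
"""


def parse_stanza(text):
    """Split 'tag: value' lines into a dict mapping tag -> list of values."""
    term = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at the first ": " only, so ids such as GO:0006094 stay whole.
        tag, _, value = line.partition(": ")
        term.setdefault(tag, []).append(value)
    return term


term = parse_stanza(STANZA)
```

Values are kept in lists because a tag such as is_a can occur more than once; `term["is_a"]` holds both parent ids.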
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax • Not overlap with other ontologies in OBO
OBO requirements: overlapping • Ontologies can (and should) overlap partially, but large overlap should be avoided • Idea is that terms from different ontologies can be combined to form new terms • Striving for accepted standards rather than competition
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax • Not overlap with other ontologies in OBO • Have a unique identifier space
OBO requirements: id space • So, for example, the GO identifier space is “GO”: – No other OBO ontology could use this id space • Prevents problems where multiple ontologies are used together
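The id space rule can be checked mechanically when several ontologies are loaded together. The sketch below is a hedged illustration: `id_space` and `find_clashes` are hypothetical helper names, and the ontology names and the "XX" ids in the usage note are made up for the example.

```python
def id_space(term_id):
    """Return the prefix before the first colon, e.g. 'GO' for 'GO:0006094'."""
    prefix, sep, local = term_id.partition(":")
    if not sep or not prefix or not local:
        raise ValueError(f"not a prefixed identifier: {term_id!r}")
    return prefix


def find_clashes(ontologies):
    """Map each id space used by more than one ontology to the ontologies using it.

    `ontologies` maps an ontology name to an iterable of its term ids.
    """
    owners = {}
    for name, term_ids in ontologies.items():
        for term_id in term_ids:
            owners.setdefault(id_space(term_id), set()).add(name)
    return {space: names for space, names in owners.items() if len(names) > 1}
```

For instance, `find_clashes({"gene_ontology": ["GO:0006094"], "other": ["XX:0000001"]})` returns an empty dict, while two ontologies that both issue "GO:" ids would be reported under the key "GO".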
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax • Not overlap with other ontologies in OBO • Have a unique identifier space • Include text definitions of their terms
OBO requirements • In addition, OBO includes an ontology of relationships – all ontologies should use these definitions of relationships • For example: – part_of – develops_from – regulates
What’s available • demo: http://obo.sourceforge.net/
Editing ontologies • GO is edited using OBO-Edit – stand-alone Java application – available for all platforms – browse, create or edit any ontology in OBO format
OBO-Edit demo • Browsing ontologies: – loading ontologies (including loading multiple ontologies) – graph viewer – reasoner/single relationship views – searching/filtering/rendering – help • Creating/editing ontologies: – creating a new ontology – adding terms – copying/moving/deleting terms – adding definitions, dbxrefs etc – verification plugin – saving ontologies