Curation of the EcoCyc Database: The EcoCyc Update Project
Martha Arnaud, Scientific Database Curator
Bioinformatics Research Group, SRI International
http://www.ecocyc.org
http://www.biocyc.org
SRI International Bioinformatics
EcoCyc Organization
EcoCyc collects information about multiple types of database objects:
- Pathway *
- Reaction *
- Compound *
- Protein
- Gene *
- Transcription Unit *
* hierarchies
Diagram labels: Genes, Proteins, Pathway, Reactions, Compounds
EcoCyc Statistics
- 176 pathways
- 992 enzymes
- 1006 enzymatic reactions
- 169 transporters
- 828 transcription units
- 1929 proteins have a comment (598 > 300 characters)
EcoCyc Pathway Information
http://biocyc.org:1555/ECOLI/new-image?type=PATHWAY&object=ALANINE-VALINESYN-PWY&detail-level=2
…viewed with “More Detail”
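The pathway page address shown on the slides above is a base URL followed by three query parameters (`type`, `object`, `detail-level`). A minimal Python sketch of assembling that address is given here. The helper name `pathway_url` is hypothetical and not part of EcoCyc, and the sketch does not assume that this endpoint is still served or that other `detail-level` values are accepted.

```python
from urllib.parse import urlencode

# Base address exactly as it appears on the slide.
BASE = "http://biocyc.org:1555/ECOLI/new-image"


def pathway_url(object_id: str, detail_level: int = 2) -> str:
    """Build a pathway-display URL (hypothetical helper for illustration)."""
    query = urlencode({
        "type": "PATHWAY",
        "object": object_id,
        "detail-level": detail_level,
    })
    return f"{BASE}?{query}"


# Reproduces the address shown on the slide.
print(pathway_url("ALANINE-VALINESYN-PWY"))
```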
EcoCyc Protein Information
Screenshot callouts: reaction, comment, citations
EcoCyc Gene Information
EcoCyc Metabolic Overview
Static or animated views of expression data
http://biocyc.org/ov-expr.shtml
EcoCyc Curation
- names and synonyms
- gene classes
- subunit composition of protein complexes
- location of gene product
- protein or complex molecular weight
- enzyme activity name
- enzyme properties (activators, inhibitors, cofactors)
- comment fields
- evidence
- citations
- reactions catalyzed
- pathway information
Build a new MOD or add a “Pathway Module”!
Pathway Tools Software
- Takes annotated genome
- Generates database, including pathway predictions
Freely available (academics/non-profits)
Pathway Tools: software environment for creation, curation, analysis, and Web publishing of MODs
http://bioinformatics.ai.sri.com/ptools/
ptools-info@ai.sri.com
Current Pathway Tools Users
- Saccharomyces cerevisiae: SGD, Stanford University
- Arabidopsis thaliana: Carnegie Institution of Washington
- Plasmodium falciparum: Stanford University
- Mycobacterium tuberculosis: Stanford University
- Synechocystis: Carnegie Institution of Washington
- Methanococcus jannaschii: EBI
EcoCyc Strengths
- Metabolism
- Transport
- Transcription regulation
EcoCyc into the Future:
“EcoCyc is not just metabolism anymore!”
…an integrated, review-level information resource on E. coli genomics and biochemistry…
The EcoCyc Update Project:
- What do we need to do? (Goals)
- Can we possibly get it done? (Quantification)
- Where do we start? (Priorities)
- How is it going? (Progress)
EcoCyc Update: Curation Goals
- Curate every gene product: literature-based descriptions and comprehensive reference lists
- Expand database scope beyond metabolism, transporters, and transcription
- Curate associated reactions and pathways
- Stay current with the latest papers
EcoCyc Update: Quantification
4405 genes
- 175 transcription factors
- 168 transporters
= 4062 genes to curate
Staffing: full-time curator (4 days/week on curation) + part-time curator (70%) in Years 2-4
Year 1: 1600 hours
Year 2: 3000 hours
Year 3: 3000 hours
Year 4: 3000 hours
Total: 10,600 hours / 4062 genes = 2.6 hours per gene
Curation of abstracts
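The arithmetic on this slide can be checked with a short Python sketch. All figures come from the slide itself; the variable names are illustrative only.

```python
# Figures taken from the quantification slide.
total_genes = 4405
transcription_factors = 175
transporters = 168
hours_per_year = [1600, 3000, 3000, 3000]  # Years 1 through 4

# Genes left after excluding transcription factors and transporters.
genes_to_curate = total_genes - transcription_factors - transporters

# Total curation hours over the four-year plan.
total_hours = sum(hours_per_year)

# Average time available per gene.
hours_per_gene = total_hours / genes_to_curate

print(genes_to_curate)           # 4062
print(total_hours)               # 10600
print(round(hours_per_gene, 1))  # 2.6
```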
EcoCyc Update: Priorities
1. Problems raised by users and advisors
2. Gene products that have new characterizations published in the literature
3. Gene products that have not yet been thoroughly curated
4. Gene products that have been curated, but have not been updated lately
Where are we now?
807 gene products curated.
807/4062 = 19.9% of the total (excluding transport and transcription factors)
4-year plan: Curate 615 genes in Year 1
We are meeting our goal!
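A small Python sketch of the progress figures, using only numbers stated on the slides. The last line is an observation rather than something the slides state: 1600 Year 1 hours at the rounded rate of 2.6 hours per gene comes to about 615 genes, which matches the Year 1 goal, although the slides do not say how that goal was derived.

```python
# Figures taken from the slides.
genes_to_curate = 4062
curated = 807
year1_goal = 615

# Share of the target gene set curated so far.
fraction_done = curated / genes_to_curate
print(f"{fraction_done:.1%}")   # 19.9%

# Progress against the Year 1 goal.
print(curated >= year1_goal)    # True
print(curated - year1_goal)     # 192

# Observation (not stated on the slides): Year 1 hours / rounded hours per gene.
print(round(1600 / 2.6))        # 615
```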
The EcoCyc Collaboration
SRI
- Peter Karp, PI
- Suzanne Paley, Software Engineer
- John Pick, Software Engineer
- Martha Arnaud, Curator
UNAM
- Julio Collado-Vides, Project Leader
- Socorro Gama-Castro, Curator
- Martin Peralta, Curator
TIGR
- Ian Paulsen, Project Leader
- Mark Hance, Curator
UCSD
- Milton Saier, Project Leader
- Can Tran, Curator
UCD
- John Ingraham, Project Leader
MBL
- Monica Riley, Editor Emerita
Funding: NIH National Center for Research Resources
Pathway/Genome DBs Created by External Users
- Saccharomyces cerevisiae, Stanford University: pathway.yeastgenome.org/biocyc/
- Plasmodium falciparum, Stanford University: plasmocyc.stanford.edu
- Mycobacterium tuberculosis, Stanford University: BioCyc.org
- Arabidopsis thaliana and Synechocystis, Carnegie Institution of Washington: Arabidopsis.org:1555
- Methanococcus jannaschii, EBI: Maine.ebi.ac.uk:1555
- Other PGDBs in progress by 40 other users
- Software freely available
- Each PGDB owned by its creator