Creating a … Community Database Organism-Specific Database Model-Organism Database
SRI International Bioinformatics
Why Create a PGDB?
- Perform pathway analyses as part of a genome project
- Analyze omics data
- Create a central information resource for the organism
- Create an FBA model
- Perform comparative analyses
Model Organism Databases
- DBs that describe the genome and other information about an organism
- Curated by experts for that organism
- No one group can curate all the world’s genomes
- Distribute workload across a community of experts to create a community resource
- Every sequenced organism with an active experimental community requires a MOD
- Integrate genome data with information about the biochemical and genetic network of the organism
- Integrate literature-based information with computational predictions
Rationale for MODs
- Each “complete” genome is incomplete in several respects:
  - 40%-60% of genes have no assigned function
  - Roughly 7% of those assigned functions are incorrect
  - Many assigned functions are non-specific
- MODs are platforms for global analyses of an organism
  - Interpret omics data in a pathway context
  - In silico prediction of essential genes
  - Characterize systems properties of metabolic and genetic networks
What is Curation?
- Ongoing updating and refinement of a PGDB
- Correct false-positive and false-negative predictions
- Incorporate information from experimental literature
- Update genome sequence
- Update gene functions, gene positions, gene names
- Author comments and citations
- Add new pathways, modify existing pathways
- Enter information about regulatory networks
Issues in Creating Public MODs
- Obtaining funding
- Scoping the project
- Identify user community
- Obtain buy-in and help from scientific community
- IT: Set up database server, Web server
- Hire and train curators
Questions
- Do you intend to make your PGDB public and to update it on an ongoing basis?
- To create a Model Organism Database?
Administering Pathway Tools
Obtaining Pathway Tools
- Free to non-commercial organizations
- To obtain a license agreement, go to BioCyc.org and click on Software/Database Download
- Follow the Installation Guide
- ptools-local directory
  - Locate in a common directory
  - PGDBs created by all users who use this ptools installation
  - PGDBs downloaded via the registry
  - ptools-init.dat for this ptools installation
New Pathway Tools Releases
- Major releases = external software releases
  - Twice per year
  - Announced on ptools-users mailing list
- Minor releases twice per year affect only our BioCyc.org Web site and flatfile distributions
- We support one prior release only
- Releases announced on ptools-users@ai.sri.com
- Read release notes at
  - http://brg.ai.sri.com/ptools/release-notes.html
- Install process:
  - Upgrade schema of your DB (software assisted)
PGDB Storage: File or Relational Database
- File storage:
  - Advantages:
    - No RDBMS installation and configuration
  - Disadvantages:
    - Must be loaded and saved in its entirety
    - No transaction history
    - No concurrent access for multiple users
- Oracle/MySQL storage:
  - Advantages:
    - Faster read access, faster saves
    - Concurrent update access for multiple users
    - Stores history of all PGDB updates
  - Disadvantages:
    - RDBMS must be installed and configured
Multiuser Access to PGDBs
- PGDB stored within one Oracle or MySQL server
- Each curator installs PTools on their workstation
  - Different curators can use different software platforms
- Workstations query RDBMS server via internet
- Local disk cache speeds access
- For each frame access, PTools queries
  - In-memory cache, disk cache, RDBMS server
- After curator saves changes, all changes made by other users are loaded into curator’s session
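
The lookup order on this slide (in-memory cache, then disk cache, then RDBMS server) can be sketched as a tiered read-through cache. This is only an illustration of the general pattern, not Pathway Tools code: the class name, the dictionary standing in for the disk cache, and the `fetch_from_server` callback are all hypothetical.

```python
from typing import Callable, Dict, Optional


class TieredFrameStore:
    """Hypothetical sketch: look in memory first, then disk, then the server."""

    def __init__(
        self,
        disk_cache: Dict[str, dict],
        fetch_from_server: Callable[[str], Optional[dict]],
    ) -> None:
        self.memory: Dict[str, dict] = {}            # fastest tier
        self.disk = disk_cache                       # stand-in for a local disk cache
        self.fetch_from_server = fetch_from_server   # stand-in for an RDBMS query
        self.server_hits = 0                         # counts round trips to the server

    def get(self, frame_id: str) -> Optional[dict]:
        # Tier 1: in-memory cache.
        if frame_id in self.memory:
            return self.memory[frame_id]
        # Tier 2: disk cache; promote the frame into memory on a hit.
        if frame_id in self.disk:
            frame = self.disk[frame_id]
            self.memory[frame_id] = frame
            return frame
        # Tier 3: ask the server, then fill both caches.
        self.server_hits += 1
        frame = self.fetch_from_server(frame_id)
        if frame is not None:
            self.disk[frame_id] = frame
            self.memory[frame_id] = frame
        return frame
```

Repeated reads of the same frame reach the server only once, which is why a local disk cache helps when workstations query the server over the internet.
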
How to Release a PGDB?
- Decide on release frequency and schedule
  - Don’t wait until it’s perfect to release it!
- Freeze curation for 1 week
- Quality assurance
  - Run consistency checker
    - Tools -> Consistency Checker
    - Also updates organism-summary statistics
- Update publications, authors in organism frame
  - Update via Organism editor
- Create new version of PGDB
  - ptools-local/pgdbs/yeastcyc/1.0/kb/yeastbase.ocelot
  - Edit against the new version, release the old version
- Author release notes
- Register PGDB in SRI PGDB registry
  - Will allow SRI to include it in BioCyc
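
The versioned path on this slide suggests a layout of `pgdbs/<pgdb>/<version>/kb/<name>.ocelot` under ptools-local. The sketch below only builds such paths and lists the version directories already on disk; it assumes that layout holds in general, and the function names are hypothetical. It does not create a new PGDB version; Pathway Tools itself does that.

```python
from pathlib import Path
from typing import List, Tuple


def kb_file(ptools_local: Path, pgdb_dir: str, version: str, kb_name: str) -> Path:
    """Build the path of a PGDB knowledge-base file (layout assumed from the slide)."""
    return ptools_local / "pgdbs" / pgdb_dir / version / "kb" / f"{kb_name}.ocelot"


def _version_key(version: str) -> Tuple[Tuple[int, object], ...]:
    # Sort numeric components numerically so that "10.0" comes after "2.0".
    return tuple((0, int(p)) if p.isdigit() else (1, p) for p in version.split("."))


def versions_on_disk(ptools_local: Path, pgdb_dir: str) -> List[str]:
    """List version directories that contain a kb/ subdirectory, oldest first."""
    base = ptools_local / "pgdbs" / pgdb_dir
    if not base.is_dir():
        return []
    found = [p.name for p in base.iterdir() if p.is_dir() and (p / "kb").is_dir()]
    return sorted(found, key=_version_key)
```

For example, `kb_file(Path("ptools-local"), "yeastcyc", "1.0", "yeastbase")` reproduces the path shown on the slide.
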
Pathway Tools Data Import/Export
- File->Export, File->Import
- Export/import to/from tab-delimited files
- Export to GenBank, SBML, BioPAX
- Export to attribute-value files
- Attribute-value files can be imported into BioWarehouse
  - Relational database system for bioinformatics database integration
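
For downstream processing of exported attribute-value files, a minimal parser might look like the sketch below. It assumes the layout of one `ATTRIBUTE - value` pair per line, records terminated by `//`, comment lines starting with `#`, and continuation lines starting with `/`. Those format details are not stated on the slide, so check them against the header of an actual export. The function name and sample identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def parse_attribute_value(lines: Iterable[str]) -> List[Dict[str, List[str]]]:
    """Parse attribute-value text into a list of records (format assumed, see above)."""
    records: List[Dict[str, List[str]]] = []
    current: Dict[str, List[str]] = {}
    last_key: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip() or line.startswith("#"):
            continue                      # skip blank lines and comments
        if line.strip() == "//":          # end of the current record
            if current:
                records.append(current)
            current, last_key = {}, None
            continue
        if line.startswith("/") and last_key is not None:
            # Assumed continuation line: append to the previous value.
            current[last_key][-1] += "\n" + line[1:]
            continue
        key, sep, value = line.partition(" - ")
        if not sep:
            continue                      # not an attribute line; ignore it
        current.setdefault(key, []).append(value)   # attributes may repeat
        last_key = key
    if current:                           # tolerate a missing final "//"
        records.append(current)
    return records
```

Each record maps an attribute name to a list of values, because the same attribute can appear more than once within a record.
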
Napster Comes to Bioinformatics
- Public sharing of Pathway/Genome Databases
- PGDB registry maintained by SRI at URL http://biocyc.org/registry.html
- Registry operations
  - List contents of registry
  - Download PGDBs listed in the registry
  - Register PGDBs you have created
Registry Details
- Why register your PGDB?
  - Declare existence of your PGDB in a central location
  - Facilitate its download by other scientists
  - Facilitate its inclusion in BioCyc.org
- Why download a PGDB?
  - Desktop Navigator provides more functionality than Web
  - Comparative operations
  - Programmatic querying and processing of PGDB
- Registration process
  - Registered PGDBs have open availability by default
  - Authors can provide their own license agreements
  - Registered PGDBs reside on the authors’ FTP or HTTP site
Desktop versus Web Mode
- Pathway Tools runs in two different modes:
  - Desktop mode
  - Web mode (e.g., BioCyc.org)
- Desktop vs Web functionality in Pathway Tools: http://biocyc.org/desktop-vs-web-mode.shtml
- You can run both desktop and web modes at your site
- Your PTools web server need not be open to the public