PubChem: An Open Repository for Chemical Structure and Biological Activity Information … Steve Bryant … The NIH Biowulf Cluster: 10 Years of Scientific Supercomputing … February 3, 2009

PubChem Overview … NIH “Molecular Libraries” … Basic design / approach … Current discovery tools / example … Planned discovery tools … New discovery tools?

NIH Molecular Libraries Program … Technology Development … Instrumentation … Assay Development … Screening … Molecular Libraries Screening Centers Network (MLSCN) … Chemical Diversity … Predictive ADMET … Compound Repository (MLSMR) … Informatics … Cheminformatics Research Centers

Molecular Libraries BioAssays … Peer review … Assay … Customized Assay … Compound Repository … Investigator … Screen … Optimization Chemistry … Hit List … Hit picking, confirmation, secondary screens

Molecular Libraries Components …

MLSCN Created … 2005

MLPCN Created … 2008

PubChem Overview … NIH “Molecular Libraries” overview … Basic design / approach … Current discovery tools / example … Planned discovery tools … New discovery tools?

PubChem Approach … “GenBank model” … direct depositions by investigators … highly automated (low database cost) … 25 years of precedent in biology … less precedent in chemical biology

Growth in PubChem Contributing Organizations

PubChem Contents … Contributed substance records … with chemical structure … chemical names and comments … links to contributor web sites … contributed links to other NCBI biomedical databases

Growth in PubChem Substances / Compounds

PubChem Standardization...

PubChem Standardization...

PubChem Contents … Contributed bioassay records … with assay description / protocol … links to tested substances … summary and detailed test results … links to contributor web sites and other NCBI databases

Growth in PubChem BioAssays

Growth in PubChem Tested Substances

PubChem Overview … NIH “Molecular Libraries” overview … Basic design / approach … Current discovery tools / example … Planned discovery tools … New discovery tools?

PubChem Retrieval System … Optimize “discoverability” for molecular biologists by integrating PubChem into NCBI’s Entrez / PubMed search engine … Chemical structure search … Bioassay result search … Structure-activity tools
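A minimal sketch of how the Entrez integration can be reached from a script, assuming NCBI's public E-utilities interface and the `pccompound` Entrez database name. The helper `esearch_url` and the example search term are illustrative only and do not come from the slides; the function just builds a URL and performs no network request.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="pccompound", retmax=20):
    """Build an ESearch URL (hypothetical helper; no network I/O)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Text search of PubChem Compound for a compound name.
print(esearch_url("quercetin"))
```

Swapping `db` for `pcsubstance` or `pcassay` would target the substance or bioassay records instead; fetching the URL returns the matching record identifiers.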

NCBI’s Entrez Search Engine...

Entrez Links and Neighbors... … 2,000 users... 60,000 hits... per day … PubChem Small Molecules: Chemical Structure Similarity … Protein 3D Structure: VAST Structure Similarity … Bioactivity Screens: Activity Profile Similarity … Protein Sequences: Target Sequence Similarity … PubMed Literature: Term Frequency Statistics

PubChem Users per Day

Search for “Shoichet inhibitors”...

PubMed Article Retrieved...

Link to PubChem Records...
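The step from a PubMed article to its linked PubChem records can be sketched with the E-utilities ELink call. This is an illustration under assumptions: the helper `elink_url` is hypothetical, and the PMID shown is a placeholder, not the article from the example search.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def elink_url(record_id, dbfrom="pubmed", db="pccompound"):
    """Build an ELink URL from one Entrez database to another (no network I/O)."""
    params = {"dbfrom": dbfrom, "db": db, "id": record_id}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


# "12345678" is a placeholder PMID used only to show the URL shape.
print(elink_url("12345678"))
```

Reversing `dbfrom` and `db` follows the same link in the other direction, from a compound record back to the literature.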

“Kaempferol” in PubChem...

Similar Compounds in PubChem...
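The slides do not say how "similar" is scored. A common choice for 2D structure similarity is the Tanimoto coefficient over binary structural fingerprints, and the sketch below assumes that measure. The fingerprints are toy sets of bit positions, not real data for kaempferol or quercetin.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: define similarity as 0
    return len(a & b) / len(union)


# Toy fingerprints, each a set of "on" bit positions (illustrative only).
fp_query = {1, 4, 7, 9, 12}
fp_candidate = {1, 4, 7, 10, 12, 15}

print(round(tanimoto(fp_query, fp_candidate), 3))  # 4 shared / 7 total -> 0.571
```

A similarity search then amounts to keeping the candidates whose score against the query exceeds a chosen threshold.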

“Quercetin” in PubChem...

Compare Protein / Ligand Complexes...

Link to Another Structure...

Tyrosine Kinase Family Member...

Links from “Quercetin” to PubMed...

PubMed Records...

Links from “Quercetin” to BioAssays...

BioAssay records...

BioAssay where “Active”...

BioAssay where “Active”...

BioAssay where “Active”...

BioAssay where “Active”...

Entrez Links and Neighbors... … 2,000 users... 60,000 hits... per day … PubChem Small Molecules: Chemical Structure Similarity … Protein 3D Structure: VAST Structure Similarity … Bioactivity Screens: Activity Profile Similarity … Protein Sequences: Target Sequence Similarity … PubMed Literature: Term Frequency Statistics

PubChem Retrieval System … Optimize “discoverability” for molecular biologists by integrating PubChem into NCBI’s Entrez / PubMed search engine … Chemical structure search … Bioassay result search … Exploratory structure-activity tools

Compounds Similar to Quercetin...

PubChem Bioactivity Analysis...

PubChem Bioactivity Analysis...
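One way to picture a bioactivity analysis across several screens is to pivot test results into per-compound activity profiles and compare them. The sketch below is an assumption about the general idea, not the tool shown on the slides; the compound and assay labels are made-up placeholders.

```python
from collections import defaultdict


def activity_profiles(results):
    """Map each compound to the set of assays in which it was reported active."""
    profiles = defaultdict(set)
    for compound, assay, outcome in results:
        actives = profiles[compound]  # ensures inactive-only compounds appear too
        if outcome == "active":
            actives.add(assay)
    return dict(profiles)


# Placeholder (compound, assay, outcome) rows; not real PubChem data.
rows = [
    ("cmpd_A", "assay_1", "active"),
    ("cmpd_A", "assay_2", "active"),
    ("cmpd_B", "assay_1", "active"),
    ("cmpd_B", "assay_2", "inactive"),
    ("cmpd_C", "assay_2", "inactive"),
]

profiles = activity_profiles(rows)
# Assays in which both compounds were active.
print(profiles["cmpd_A"] & profiles["cmpd_B"])
```

Compounds with overlapping profiles are candidates for grouping together, which is the kind of relationship the "Activity Profile Similarity" link describes.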

PubChem Structure-Activity...

Active Compound Cluster...

BioAssay Cluster...

Another BioAssay Cluster...

PubMed Connection...

PubChem Structure-Activity...

PubChem Overview … NIH “Molecular Libraries” overview … Basic design / approach … Current discovery tools / example … Planned discovery tools … New discovery tools?

Planned Discovery Tools … Bottom-line “Summaries” of multi-step Molecular Libraries screens … “Chemical Reagent” links for gene and protein records when possible … Add 3D-conformer similarity to structure-activity analysis … Support multi-target “panel” screens

“Quercetin” in PubChem...

“Quercetin” Similar Conformers...

PubChem Overview … NIH “Molecular Libraries” overview … Basic design / approach … Current discovery tools / example … Planned discovery tools … New discovery tools?

New Discovery Tools? … Systems-biology “pathway” links among chemical biology screens / results … Links to bioactivity information derived from scientific literature, literature abstraction, and other sources

“Quercetin” in PubChem...

“Quercetin” NLM Toxicology...

“Quercetin” NLM Toxicity...

http://pubchem.ncbi.nlm.nih.gov … Evan Bolton, Tugba Suzek, Jie Chen, Paul Thiessen, Svetlana Dracheva, Valery Tkachenko, Lewis Geer, Jiyao Wang, Lianyi Han, Yanli Wang, Jane He, Jewen Xiao, Siqian He, Jian Zhang, Karen Karapetian, Vahan Simonyan, Ben Shoemaker, Wenyao Shi