The ToxCast Chemical Landscape: Paving the Road to 21st Century Toxicology

Ann Richard
National Center for Computational Toxicology, US Environmental Protection Agency, RTP, NC
CompTox Community of Practice Webinar, December 15th, 2016
The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA

Resources
Open Access Perspectives article and Supporting Info files available for free download at: http://pubs.acs.org/doi/abs/10.1021/acs.chemrestox.6b00135
DOI: 10.1021/acs.chemrestox.6b00135
Chem. Res. Toxicol., 2016, 29, 1225−1251

Purpose of ToxCast library
• To probe chemical-biological activity space potentially relevant to a broad spectrum of toxicological outcomes of regulatory concern
• To generate HTS chemical-activity profiles to be used for developing predictive models of toxicity

I. History of library construction
• What were the main drivers and inputs?
• How did the library expand in phases over time?
• To what extent is the physical library limited by practical constraints (i.e., procurable, testable)?
• What are the quality concerns & how are they being addressed?

II. What’s in the library?
Chemical identifiers
• Chemical names
• CASRN
Physical samples
• Bottles
• Solutions
Chemical representations
• Structures
• Features

III. Is library “fit for purpose”?
• Does library provide sufficient coverage of chemicals of interest to EPA & stakeholders?
• Does library include sufficient chemical diversity to span full range of toxicity mechanisms and outcomes of concern?
• Does library provide sufficient coverage of local regions of chemistry to enable local model development?
… relative to the “chemical universe” and target inventories of greatest interest and concern to EPA

I. History - ToxCast inventory through end of Testing Phase II
Moving from Phase I to Phase II:
Ø eliminated 17 chemicals
Ø reprocured ph1 inventory (v2), run in new assays
Ø full assay coverage of ph2, new & old assays
Ø limited assay coverage of e1k (endocrine only)
Ø broader chemical and less assay coverage in tox21

Expanding ToxCast Library into Phase II & Tox21
• EPA & stakeholder inputs: EPA ACToR, EPA DSSTox, EPA Program Offices, External Nominations
• Exclusively CAS-name lists
• Limited by practical constraints
[Figure: nomination list categories; legible labels include high production volume, endocrine, FDA, reference lists, industrial chemicals, in vivo data sets, food additives, antimicrobials, fragrances, plasticizers, drugs, pesticides, water contaminants]
[Figure: procurement funnel, NUMBER OF CHEMICALS]
EPA’s Tox21/ToxCast Phase II Chemical Nominations (>100 lists): ~19000
Candidates for procurement: ~7000
Able to procure: ~4400
EPA Tox21: 3726
ph1_v2, ph2, e1k: 1860
Excluded before procurement: complex mixtures, polymers; ill-defined substances; no structure available; insoluble (est. LogP); volatile (est. Vapor Pressure); too reactive, explosive; inorganics, radioactive, etc.
Excluded at procurement: unable to procure or cost prohibitive
Excluded after procurement: DMSO insoluble, volatile
Other inputs: ph1_v2, e1k reference chemicals; donated chemicals (incl. 135 failed drugs)

Expanding into Phase III

Testing Phase       Chemical Set   Chemicals   Assay Endpoints   Completion
                    ph1_v1         310         ~700              2011
ToxCast Phase II    ph1_v2         293         ~200
                    ph2            768, 767    ~700, ~900        03/2013
                    E1k            799, 800    ~120, ~50         03/2013
Tox21               tox21          ~8900       ~80               Ongoing
ToxCast Phase III   ph3            ~2000       ~300              Ongoing

Chemical sources: pesticides, antimicrobials, food additives, green alternatives, HPV, MPV, endocrine reference cmpds, tox reference cmpds, NTP in vivo, FDA GRAS, FDA PAFA, EDSP, water contaminants, exposure data, industrial, failed drugs, marketed drugs, fragrances, flame retardants, …
[Figure: assays (>600) vs. chemicals]

ToxCast chemical x assay counts (Top 5 assay providers & Tox21, as of Jan 2016)
• >4000 EPA chemicals x up to 800 assay endpoints
• >9000 Tox21 chemicals x 60 or more assay endpoints
What’s in the bottle?

How are we addressing quality concerns?
• 5% missing CAS or name, 1% missing both (structure only)
• Count/type of CAS-name conflicts from suppliers similar in frequency to public lists
• 22% of supplier structures conflicted with DSSTox structure after COA/curation
Ø 11% salt/hydrate discrepancies
Ø 8% stereo/geometric isomer differences

[Figure: Numbers of Samples/Substances/Representations (0 to 16,000) at each level]
Modeling Representations: structure-derived chemical descriptors, fragments & fingerprints for use in modeling
DSSTox QSAR-ready Structures (computational processing): desalting, tautomer & functional group normalization; optional: de-duplicate, de-stereo, remove metal-containing compounds
Unique Structures (structure normalization): InChIKey checks; includes salt/hydrate form, stereo, stoichiometric complexes
Generic Substances (DSSTox chemical curation & registration): 1:1 CASRN-name-structure, unique CASRN-name; QC levels DSSTox 1&2 (highest quality manual curation)
Physical Samples (ChemTrack): stock solutions, supplier/lot bottles; COA/MSDS supplier documentation review; analytical chemistry QC: ID, purity, stability, concentration
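As a rough illustration of the InChIKey checks mentioned on this slide, the sketch below compares a supplier key with a registry key block by block. It relies only on the published layout of a standard InChIKey (three hyphen-separated blocks of 14, 10, and 1 characters). The function names, the category labels, and the example keys are hypothetical and are not taken from the DSSTox workflow.

```python
from typing import NamedTuple


class InChIKeyParts(NamedTuple):
    constitution: str  # 14 chars: hash of the core constitution (formula/connectivity)
    details: str       # 10 chars: hash of remaining layers (e.g. stereo, isotopes) + flag + version
    protonation: str   # 1 char: protonation indicator


def split_inchikey(key: str) -> InChIKeyParts:
    """Split an InChIKey into its three blocks, validating block lengths."""
    parts = key.strip().upper().split("-")
    if [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return InChIKeyParts(*parts)


def classify_conflict(supplier_key: str, registry_key: str) -> str:
    """Label the kind of disagreement between two keys (labels are illustrative)."""
    a, b = split_inchikey(supplier_key), split_inchikey(registry_key)
    if a == b:
        return "match"
    if a.constitution != b.constitution:
        # Different first block: e.g. salt/hydrate form or a different compound.
        return "constitution"
    if a.details != b.details:
        # Same first block, different second block: e.g. stereo or isotope difference.
        return "stereo_or_isotope"
    return "protonation"


# Hypothetical placeholder keys, not real compounds.
registry = "AAAAAAAAAAAAAA-BBBBBBBBSA-N"
print(classify_conflict("AAAAAAAAAAAAAA-CCCCCCCCSA-N", registry))  # stereo_or_isotope
```

A first-block mismatch cannot by itself distinguish a salt form from an unrelated compound, so flagged pairs would still need manual review.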

What’s in the library & is it “fit for purpose”?
[Diagram: ToxCast Chemical Library (DSSTox TOXCST SD File) and the analyses applied to it]
• Substance IDs (CASRN, Name): CASRN overlaps with list categories (Regulatory, Risk assessment, Pesticides/Drugs, Exposure data, Industrial / HPV, In vivo tox data)
• Physicochemical properties: DMSO insolubles, volatiles, not amenable to HTS
• Structures: structure similarity, chemical feature profiling, structural alerts
• Environmental-Exposure Landscape: EDSP Universe, Regulatory, Drugs
• Knowledge-Prediction Landscape: Toxicity, Metabolism

What’s in the library?
• Unique list of DSSTox substances (e.g., CAS, names)
• Structures (mol, InChI, SMILES) annotated to salt/hydrate/stereospecific form
• Inventory (ph1_v1, v2, p2, etc.) and Testing Phase (I, III) labels
Available at: ftp://ftp.epa.gov/dsstoxftp/DSSTox_TOXCST_20160129.zip

What’s in the library?
• CAS lists used to nominate chemicals for Phase II and Tox21
Evaluate TOXCST coverage of high priority CAS lists (CASRN overlaps)
Ø Overlap requires exact CAS matches
Ø Chemical structure not considered
Ø NOCAS substances not considered
CAS list categories: Regulatory, Risk assessment, Pesticides/Drugs, Exposure data, Industrial / HPV, In vivo tox data
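The exact-match overlap described on this slide can be sketched in a few lines. The version below is only an illustration under stated assumptions: it treats any entry that fails the standard CASRN format or check-digit test (including NOCAS placeholders) as absent, and the function names `normalize_casrn` and `casrn_coverage` are hypothetical.

```python
import re
from typing import Iterable, Optional

_CASRN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def normalize_casrn(value: str) -> Optional[str]:
    """Return a trimmed CASRN if it passes the check-digit test, else None."""
    text = value.strip()
    match = _CASRN.match(text)
    if not match:
        return None  # e.g. NOCAS placeholders or free-text names
    digits = match.group(1) + match.group(2)
    # Check digit: digits weighted 1, 2, 3, ... from the right, summed mod 10.
    checksum = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return text if checksum % 10 == int(match.group(3)) else None


def casrn_coverage(library: Iterable[str], target_list: Iterable[str]) -> float:
    """Fraction of valid CASRNs on target_list that exactly match a library CASRN."""
    lib = {c for c in map(normalize_casrn, library) if c}
    tgt = {c for c in map(normalize_casrn, target_list) if c}
    if not tgt:
        return 0.0
    return len(lib & tgt) / len(tgt)


# Small made-up example: two of the three valid target CASRNs are in the library.
library = ["50-00-0", "71-43-2", "7732-18-5"]
target = ["50-00-0", " 71-43-2 ", "NOCAS_123", "64-17-5"]
print(round(casrn_coverage(library, target), 2))  # 0.67
```

Because matching is purely on identifiers, two salt forms of the same parent structure count as different substances here, which is the limitation the slide points out.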

What’s in the library? Data & Usage List coverage
• Increasing list coverage moving from Phase I to II, III
• High coverage of in vivo, exposure, & risk assessment data lists

What’s in the library? Data & Usage List coverage
[Figure: Chemical List Frequency (0 to 15) vs. ToxCast Chemicals (0 to 4000, sorted by Testing Phase and inventory); series: ph1, +ph2, +e1k, +ph3, tox21; phase labels: Phase I, Phase III]
Lists shown: UseDB_16: Industrial; UseDB_17: Consumer; UseDB_18: Antimicrobial; UseDB_20: Colorant; UseDB_23: Fragrance; UseDB_25: Personal Care; UseDB_26: Pesticide; UseDB_28: Pharmaceutical; UseDB_29: Inert; ACToR: FDA GRAS; ACToR: FDA EAFUS; ACToR: EPA_IUR_2002, 2006; ACToR: NHANES 2001-2, IV; DSSTox_IRISTR; DSSTox_FDAMDD*; DSSTox_HPVCSI; DSSTox_NTPBSI; DSSTox_CPDBAS; DSSTox_TOXREF
• Highest frequency of ph2 chemicals across toxicity data & usage lists (xCAS)
• Greatest in vivo data (ToxRefDB) and usage coverage in Phases I & II, tapering off in Phase III
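The "chemical list frequency" plotted here is, in essence, a count of how many data and usage lists each chemical appears on. A minimal sketch of that tally follows; the list names and CASRN contents in the example are made up for illustration and do not reproduce the actual lists.

```python
from collections import Counter
from typing import Iterable, Mapping


def list_frequency(lists: Mapping[str, Iterable[str]]) -> Counter:
    """Count, for each chemical ID, the number of lists it appears on."""
    freq: Counter = Counter()
    for members in lists.values():
        # set() so a chemical repeated within one list is counted once for that list.
        freq.update(set(members))
    return freq


# Hypothetical lists keyed by name, holding CASRN strings.
example_lists = {
    "usage_pesticide": ["1912-24-9", "50-00-0"],
    "usage_industrial": ["50-00-0", "71-43-2", "50-00-0"],
    "in_vivo_tox": ["50-00-0", "1912-24-9"],
}
freq = list_frequency(example_lists)
print(freq["50-00-0"], freq["1912-24-9"], freq["71-43-2"])  # 3 2 1
```

Sorting chemicals by inventory and plotting these counts would give a profile of the kind summarized in the bullets above.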

What’s not in the library?
• To what extent is HTS library bounded by practical constraints?
Ø DMSO solubility
Ø Volatility
Physicochemical properties: DMSO insolubles, volatiles, not amenable to HTS

What’s not in the library?
[Figure: histograms of # chemicals vs. logP and vs. Molecular Weight for Volatiles (3%) and DMSO Insolubles (8%), with panels “In TOXCST” and “Not in TOXCST”; percentage annotations in the figure: 50%, 72%, 85%, 18%]
Ø Physchem properties help to define regions enriched with “problem” chemicals
Ø HTS results in problem regions should be more closely examined
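One way to locate property regions enriched with "problem" chemicals is to bin a computed property and report the problem fraction per bin. The sketch below does this for logP; the bin width, the function name, and the example records are assumptions chosen for illustration, not values from the analysis.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def problem_fraction_by_bin(
    records: Iterable[Tuple[float, bool]], bin_width: float = 2.0
) -> Dict[float, float]:
    """Map each bin's lower edge to the fraction of chemicals flagged as problems.

    records: (property_value, is_problem) pairs, e.g. (logP, DMSO-insoluble flag).
    """
    counts = defaultdict(lambda: [0, 0])  # lower edge -> [total, problems]
    for value, is_problem in records:
        lower = math.floor(value / bin_width) * bin_width
        counts[lower][0] += 1
        counts[lower][1] += int(is_problem)
    return {lower: problems / total for lower, (total, problems) in sorted(counts.items())}


# Made-up records: (estimated logP, flagged as DMSO insoluble?)
records = [(-1.0, False), (0.5, False), (1.5, False), (6.2, True), (7.9, False), (7.0, True)]
for lower, fraction in problem_fraction_by_bin(records).items():
    print(f"logP [{lower:g}, {lower + 2:g}): {fraction:.2f}")
```

Bins with a high problem fraction mark the regions where, as the slide suggests, HTS results deserve closer scrutiny.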

Evaluate coverage of potential “target” inventories
Ø ToxPrints: 729 “chemotype” features designed to cover environmental-exposure landscape (Yang et al., 2015)
Ø FDA_Drugs: ~7K marketed & discontinued drugs
Ø BMDHHA: ~800 chemicals with benchmark dose human health assessments (Wignall et al., 2014)
Ø CERAPP: ~35K structures spanning EDSP “universe” and putative exposure landscape (Mansouri et al., 2016)

ToxPrint vs TOXCST: Assessing coverage & diversity
TOXCST: 4056 structures; ToxPrints: 729 total
• 99% of TOXCST structures (4032) contain ToxPrints
• 84% of non-metal ToxPrints are in TOXCST
• 93% of TOXCST structures contain 5 or more ToxPrints
• 57% contain 10 or more ToxPrints
[Figure: distribution of # ToxPrints/chemical]
ToxPrints provide excellent “coverage” and suggest large structural diversity of TOXCST inventory
Yang et al., 2015: https://toxprint.org/
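Coverage figures of this kind can be derived from a per-chemical set of matched features. The sketch below assumes each structure has already been fingerprinted elsewhere and is represented as a set of feature indices; the function name and the toy data are hypothetical.

```python
from typing import Dict, Mapping, Set


def fingerprint_coverage(fingerprints: Mapping[str, Set[int]], n_features: int) -> Dict[str, float]:
    """Summarize how well a feature set covers an inventory, and vice versa."""
    n = len(fingerprints)
    if n == 0 or n_features == 0:
        raise ValueError("need at least one structure and one feature")

    def fraction_with_at_least(k: int) -> float:
        return sum(len(features) >= k for features in fingerprints.values()) / n

    features_seen = set().union(*fingerprints.values())
    return {
        "structures_with_any_feature": fraction_with_at_least(1),
        "structures_with_5_or_more": fraction_with_at_least(5),
        "structures_with_10_or_more": fraction_with_at_least(10),
        "features_represented": len(features_seen) / n_features,
    }


# Toy inventory of four structures over a 20-feature set.
toy = {"A": set(range(12)), "B": set(range(5)), "C": {0}, "D": set()}
print(fingerprint_coverage(toy, n_features=20))
```

The first three values describe the inventory (how feature-rich its structures are) and the last describes the feature set (how much of it the inventory exercises), which are the two directions of coverage the slide reports.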

Coverage of ToxPrints across testing phases
[Figure: Number of chemicals containing ToxPrint chemotype (0 to 400), by testing phase (legend includes Phase III), for the chemotypes below]
bond:C(=O)N_carbamate
bond:C(=O)O_carboxylicEster_aromatic
bond:C=O_aldehyde_generic
bond:CC(=O)C_ketone_generic
bond:CN_amine_sec-NH_generic
bond:COC_ether_aliphatic__aromatic
bond:COC_ether_aromatic
bond:COH_alcohol_aromatic_phenol
bond:CX_halide_alkenyl-X_generic
bond:CX_halide_alkyl-X_generic
bond:CX_halide_generic-X_dihalo_(1_2-)
bond:metal_metalloid_Si_organo
bond:N=N_azo_generic
bond:NC=O_urea_generic
bond:PC_phosphorus_organo_generic
bond:quatN_alkyl_acyclic
bond:S(=O)O_sulfonate
chain:alkaneLinear_decyl_C10
ring:fused_steroid_generic_[5_6_6_6]
ring:hetero_[6]_N_pyridine_generic
ring:hetero_[6]_O_pyran_generic

ToxPrint inventory profile comparisons (scaled)
[Figure: Number of chemicals (x scaling factor), 0 to 800, per ToxPrint chemotype; series: TOXCST, CERAPP (x0.13), FDA_Drugs (x0.54)]
Ø Similar global ToxPrint profiles
Ø Some local feature distinctions:
• features enriched in drugs, e.g. pyridine, pyran rings
• CERAPP features not well represented in TOXCST, e.g. azo, sulfonate bonds, decyl chains

ToxPrints in CERAPP not present in ToxCast Library
Ø Are the missing features present in environmental chemicals?
Ø Why were these chemicals not included in ToxCast?
Ø Use to expand ToxCast chemical coverage moving forward

Coverage of 3 chemical classes: ToxCast vs CERAPP
Inventory sizes: TOXCST (4056), CERAPP (32468)
[Figure: class membership counts; TOXCST: “Phthalates” (31), “Perfluoro alkanes” (20), “Biphenyls” (43); CERAPP counts shown: 112, 707, 199]
Ø TOXCST provides good coverage of environmentally important chemical classes in CERAPP
Ø Opportunities for local SAR models

Comparison to potential target inventories based on computed properties
[Figure: inventories plotted in computed property space; axis directions: greater complexity, less polar]
• TOXCST more similar to CERAPP & BMDHHA inventories than to FDA_Drugs in physchem property space
• Donated_pharma not representative of drug space as a whole

Nearest neighbor similarity comparisons (Tanimoto)
• 75% of CERAPP chemicals have a >75% similar TOXCST “analog”
• 58% of BMDHHA chemicals have a >75% similar TOXCST “analog”
• 61% of FDA_Drugs chemicals have a >75% similar TOXCST “analog”
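The nearest-neighbor statistic can be sketched as follows: for each target chemical, find the most similar library chemical by Tanimoto coefficient and count the targets whose best match exceeds the threshold. The slide does not say which fingerprint was used, so the set-of-feature-indices representation, the function names, and the toy data below are assumptions.

```python
from typing import Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared features / features present in either set."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def fraction_with_analog(
    targets: Sequence[Set[int]], library: Sequence[Set[int]], threshold: float = 0.75
) -> float:
    """Fraction of targets whose nearest library neighbor exceeds the threshold."""
    if not targets:
        return 0.0
    hits = sum(
        1
        for target in targets
        if max((tanimoto(target, member) for member in library), default=0.0) > threshold
    )
    return hits / len(targets)


# Toy fingerprints as sets of feature indices.
library = [{1, 2, 3, 4}, {10, 11}]
targets = [{1, 2, 3, 4}, {1, 2, 3, 4, 5}, {10, 12}]
print(round(fraction_with_analog(targets, library), 2))  # 0.67
```

This brute-force comparison is quadratic in inventory size, which is workable for a few thousand library structures but would need indexing or vectorization for much larger target sets.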

Evaluate coverage of historical SAR “alerts” knowledge
• OECD Toolbox:
Ø 61 HESS Repeat-dose toxicity category “alerts”
Ø 136 DART Developmental & Reproductive Toxicity “alerts”
• Lhasa Knowledge-base:
Ø 157 Meteor biotransformation alerts (33 enzyme modules)
Ø 280 Derek toxicity alerts (43 tox endpoint models)

Incidence of HESS repeat-dose toxicity alerts in ToxCast

Incidence of DART toxicity alerts in ToxCast
Alert “classes” define local regions of chemical space for targeted enrichment studies

How well does ToxCast cover historical SAR toxicity “alerts”?
TOXCST: 4056 total structures
HESS: 61 total alerts, 44 in TOXCST (72%); 30% of TOXCST structures (1213) contain HESS alerts
DART: 136 total alerts, 124 in TOXCST (91%); 26% of TOXCST structures (1018) contain DART alerts
9% of TOXCST structures (391) contain both DART & HESS alerts
• 72% of HESS & 91% of DART alerts detected in TOXCST chemicals
• 47% of TOXCST chemicals contain either HESS or DART alert

Incidence of Derek (Rat) alerts and endpoint predictions in ToxCast
[Figure panels: Derek (Rat) Alerts; Derek (Rat) Endpoints; axis: Number of TOXCST chemicals]
Alert & endpoint “classes” define local regions of chemical space for targeted enrichment studies

How well does TOXCST cover historical SAR toxicity & biotransformation “alerts”?
TOXCST: 4056 structures; 1935 analyzed, 2121 not analyzed
Derek (Species: Rat): 280 total alerts, 228 in TOXCST (80%); 43 total endpoints, 41 triggered (95%); 58% of TOXCST analyzed structures (1127) contain Derek alerts triggering 1 or more Derek endpoints
Meteor (Species: Rat): 157 total alerts, 125 in TOXCST (80%); 33 total enzymes, 31 triggered (94%); 84% of TOXCST analyzed structures (1634) contain Meteor alerts
• 80% of Derek & Meteor alerts detected in TOXCST chemicals
• 95% of Derek endpoints & 94% of Meteor enzymes triggered
• 58% of TOXCST chemicals contain Derek toxicity alert
• 84% of TOXCST chemicals contain Meteor biotransformation alert

Is library “fit for purpose”?
• Does library provide sufficient broad coverage of chemicals of interest to EPA & stakeholders? YES!
• Does library include sufficient broad chemical diversity to span full range of toxicity mechanisms and outcomes of concern? YES!
• Does library provide sufficient broad coverage of local regions of chemistry to enable local model development? YES!
… relative to the “chemical universe” and target inventories of greatest interest and concern to EPA

Current & future work
ToxCast:
Ø Develop automated workflows to support chemotype (e.g., ToxPrint) analyses in local chemistry domains and apply to ToxCast assay data sets (individually and globally)
Ø Strategic expansion of chemical library into local chemical domains
Tox21:
Ø Landscape paper: history, content of library
Ø Analysis of Tox21 analytical chemistry data
ExpoCast:
Ø Chemical library support for Non-targeted Screening (NTS) International Mixture Challenge (10 mixtures, 100-400 chems)
Ø Chemical library support for generating publicly releasable high-resolution mass spectra by 7 companies & collaborators

Coauthors & Acknowledgements
• Richard Judson
• Keith Houck
• Chris Grulke
• Patra Volarath
• Indira Thillainadarajah
• Chihae Yang & Jim Rathman
• Matt Martin
• John Wambaugh
• Tom Knudsen
• Jayaram Kancherla
• Kamel Mansouri
• Grace Patlewicz
• Tony Williams
• Kevin Crofton
• Rusty Thomas
Thanks also to past and present ToxCast team members, Evotec contractors, ACToR curators, and special thanks to Bob Kavlock, David Dix, and Marty Wolf.