Getting Past Diversity in Assessing Virtual Library Designs
Bob Clark
Tripos, Inc., St. Louis, Missouri, USA
bclark@tripos.com
www.tripos.com
2001 Tripos, Inc.
Where be the dragons?
Stylized data sets
• pyridine, pyrimidine & cyclohexane libraries
• semi-homologous “series”
Nearest-neighbor profiles
• problems & advantages of subsetting
4-Ureidopiperidine Sulfonamides
• combinatorial sub-libraries
OptiSim™ design
Fingerprint visualization
• horizon NLM
Cyclohexane, Pyrimidine and Pyridine Library Compositions*
[Figure: Chex, Pym and Pyr scaffolds with substituent positions R1 to R4]
Columns: Position; All libraries; Chex & Pym; Pyr only
R1: F, Br, NO2, Et; NMe2, Ac, COCF3; SPh, OPh, CH2Ph; H, Cl, CF3; Me, i-Pr, SMe; Ph; none
R2: F, Et, CF3, COCF3; OPh, CH2Ph; Br, NO2, NMe2; Ac, SPh; Cl, Me, SMe, Ph
R3: CF3, Ac, COCF3; F, Br, NO2; CN, CO2Me, CONH2; Et, NMe2, Ac; SPh, OPh, CH2Ph
R4: none; F, i-Pr, CF3, SMe; Ac, COCF3, Ph; SPh, OPh, CH2Ph
(semicolons mark the line breaks of the original table cells; the column boundaries within each row are not recoverable)
*RD Clark. J Chem Inf Comput Sci 1997, 37, 1181-1188.
Nearest Neighbor Database Comparisons
[Figure: frequency (%) plotted against NN similarity, with respect to UNITY 2D substructural fingerprints*]
Chex / Pyr: 0.311 ± 0.04
Chex / Pym: 0.271 ± 0.05
* RD Clark. Relative and Absolute Diversity Analysis of Combinatorial Libraries. In: Combinatorial Library Design and Evaluation, pp 337-362; AK Ghose & VN Viswanadhan, Eds.; Marcel Dekker, New York, in press.
Asymmetry of Nearest Neighbor Profiles
[Figure: frequency (%) plotted against NN similarity]
Pyr5500 / Pyr500: 0.932 ± 0.05
Pyr500 / Pyr5500: 0.834 ± 0.08
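A minimal sketch of why a nearest-neighbor profile depends on which set supplies the queries and which supplies the neighbors. It assumes fingerprints are stored as sets of on-bit indices and compared by Tanimoto similarity; the function names and the toy fingerprints are hypothetical and are not taken from the slides or from UNITY.

```python
from statistics import mean

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def nn_profile(queries, references):
    """For each query fingerprint, the similarity to its nearest neighbor in references."""
    return [max(tanimoto(q, r) for r in references) for q in queries]

# Toy fingerprints, made up for illustration only.
big = [{1, 2, 3}, {1, 2, 4}, {5, 6, 7}, {5, 6, 8}, {9, 10}]
small = [{1, 2, 3}, {5, 6, 7}]  # here the small set is drawn from the big one

small_in_big = nn_profile(small, big)  # each small-set member finds itself in big
big_in_small = nn_profile(big, small)  # some big-set members have no close match

mean_small_in_big = mean(small_in_big)
mean_big_in_small = mean(big_in_small)
```

In this toy case the two means differ simply because the two sets differ in size and coverage, which is the kind of asymmetry the slide title refers to.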
Nearest Neighbor Profiles Using Maximally Diverse Subsets*
[Figure: panels C and D, frequency (%) plotted against NN similarity]
Pyr*: 0.544 ± 0.02
Pyr* / Pyr: 0.722 ± 0.08
Pyr2K*: 0.560 ± 0.02
Pyr2K* / Pyr2K: 0.729 ± 0.09
* RD Cramer, DE Patterson, RD Clark, F Soltanshahi & MS Lawless. J Chem Inf Comput Sci 1998, 38, 1010-1023.
4-Ureidopiperidine Sulfonamide Library*
                  Primary Amines        Sulfonyl chlorides
Property          cut-off    passed     cut-off    passed
structure         --         436        --         178
mol. weight       200        361        350        163
mol. volume       190 Å³     363        255 Å³     165
cLogP             2.6        370        5.0        168
aromatic rings    1          394        2          171
combined          --         308        --         154
*RD Clark, DE Patterson, F Soltanshahi, JF Blake & JB Matthew. J Mol Graph Modelling 2000, 18, 404-411.
Ureidopiperidine Sulfonamide Sublibraries
All were constructed using an extension of “standard” OptiSim™ selection technology
• subsample size k = 5
• exclusion radius 0.10
• incremental pivot method
Sublibrary 1: Cherry picked
• 200 diverse representative products
Sublibrary 2: four blocks, 10 x 5 each
• 32 amines + 20 sulfonyl chlorides
Sublibrary 3: single 20 x 10 block
• 20 amines + 10 sulfonyl chlorides
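A simplified sketch of OptiSim-style cherry picking, to show how a subsample size k and an exclusion radius interact. It is not the Tripos implementation: it does not cover the combinatorial block extension or the incremental pivot method named above, it assumes fingerprints are sets of on-bit indices, and it assumes the exclusion radius is a Tanimoto dissimilarity (1 - similarity). The function names are hypothetical.

```python
import random

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def optisim_select(fps, n_select, k=5, radius=0.10, seed=0):
    """Return indices of up to n_select fingerprints chosen OptiSim-style.

    Each round draws candidates until k of them lie at least `radius`
    (assumed: Tanimoto dissimilarity) from everything already selected,
    then keeps the candidate that is most dissimilar to the selected set.
    """
    if not fps or n_select <= 0:
        return []
    rng = random.Random(seed)
    pool = list(range(len(fps)))
    rng.shuffle(pool)
    selected = [pool.pop()]  # arbitrary starting compound
    recycle = []             # valid candidates that lost a round
    while len(selected) < n_select and (pool or recycle):
        if not pool:
            # Pool exhausted: give recycled candidates another chance.
            pool, recycle = recycle, []
            rng.shuffle(pool)
        subsample = []  # (min dissimilarity to selected set, index)
        while pool and len(subsample) < k:
            cand = pool.pop()
            d = min(1.0 - tanimoto(fps[cand], fps[s]) for s in selected)
            if d >= radius:
                subsample.append((d, cand))
            # Otherwise the candidate is inside the exclusion radius
            # and is dropped for good.
        if not subsample:
            continue
        subsample.sort()
        _, best = subsample.pop()  # largest minimum dissimilarity wins
        selected.append(best)
        recycle.extend(idx for _, idx in subsample)
    return selected

# Toy usage with made-up fingerprints.
toy_fps = [{1, 2, 3}, {1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {7, 8, 10}, {20, 21}]
picks = optisim_select(toy_fps, 3, k=2, radius=0.10, seed=1)
```

With k = 1 this scheme reduces to random selection subject to a minimum dissimilarity, and with k as large as the pool it approaches maximum-dissimilarity selection; intermediate values such as k = 5 trade diversity against representativeness.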
OptiSim Design Scheme
[Figure: design scheme diagram with reagent labels A1 to A5 and B1 to B5 (upper case) and a21 to a53, b22 to b53 (lower case)]
Ureidopiperidine Sulfonamide Nearest Neighbor Profiles
[Figure: frequency (%) plotted against NN similarity for cherry picked / single block and single block / cherry picked]
NN similarity: 0.74 ± 0.09 (median 0.72) and 0.81 ± 0.09 (median 0.80)
Self-similarity Profiles for Diverse Subsets from Sub-libraries
[Figure: frequency (%) plotted against NN similarity, 20 compound subsets]
cherry-picked: 0.52 ± 0.02 (median 0.515)
four-block: 0.55 ± 0.02 (median 0.545)
single block: 0.60 ± 0.05 (median 0.615)
Nearest Neighbor Profiles for Diverse Subsets are Symmetric
[Figure: two panels, frequency (%) plotted against NN similarity; cherry picked compared with single block, and cherry picked compared with four block]
NN similarity: 0.61 ± 0.09 (median 0.61), 0.62 ± 0.09 (median 0.61)
NN similarity: 0.63 ± 0.10 (median 0.58), 0.62 ± 0.11 (median 0.58)
[Figure: two panels, PCA (Euclidean) and NLM (Tanimoto)]
Effect of Horizon Distance (cyclohexanes)
[Figure: plots with points labeled 1 to 4]
Homolosine Projection
source: Cartography Laboratory, Indiana State University, www.indstate.edu/gga_cart
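The slides do not spell out how the horizon enters the non-linear map (NLM). One plausible reading, in keeping with the interrupted-map analogy, is that pairs farther apart than the horizon are simply left out of the stress that the mapping minimizes, so only local distances have to be preserved. The sketch below computes a Sammon-style stress under that assumption; the function name, the cutoff rule and the toy numbers are all hypothetical.

```python
import math

def sammon_stress(orig_dist, coords, horizon=None):
    """Sammon-style stress of a 2D layout.

    orig_dist[i][j] is the original dissimilarity (e.g. 1 - Tanimoto) and
    coords[i] is the (x, y) position of point i in the map.
    Assumption: pairs whose original dissimilarity exceeds `horizon`
    are skipped entirely.
    """
    num = 0.0
    norm = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d_orig = orig_dist[i][j]
            if d_orig <= 0.0:
                continue  # identical points carry no weight
            if horizon is not None and d_orig > horizon:
                continue  # beyond the horizon: ignored (assumed rule)
            d_map = math.dist(coords[i], coords[j])
            num += (d_orig - d_map) ** 2 / d_orig
            norm += d_orig
    return num / norm if norm else 0.0

# Toy example: A and B are close, C is far from both.
toy_dist = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.9],
    [0.9, 0.9, 0.0],
]
toy_xy = [(0.0, 0.0), (0.2, 0.0), (0.5, 0.0)]

stress_all = sammon_stress(toy_dist, toy_xy)                 # C is badly placed
stress_local = sammon_stress(toy_dist, toy_xy, horizon=0.5)  # only the A-B pair counts
```

Under this reading, shortening the horizon lets the map reproduce local neighborhoods faithfully at the cost of saying nothing about how far apart distant clusters really are.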
PCA and NLM with Horizon
[Figure: PCA and NLM with horizon panels, shown on two slides]
Comparison of Sub-Libraries
[Figure: cherry picking, four blocks and single block sub-libraries, shown on three slides]
Acknowledgements
NIH SBIR grant 1 R43 GM58919
David Patterson • Sr. Fellow
Fred Soltanshahi • Technologist
Trevor Heritage, VP Software R&D
1999 Tripos, Inc.
Take-home: fingerprint similarity is biologically relevant (good neighborhood behavior)