Quantitative Structure-Activity Relationships and Quantitative Structure-Property Relationships (QSAR & QSPR)
Alexandre Varnek, Faculté de Chimie, ULP, Strasbourg, France

History of QSAR

Discoverer of the Periodic Table, an early “Chemoinformatician”: Dmitry Mendeleev (1834 – 1907)
• Russian chemist who arranged the 63 known elements into a periodic table based on atomic mass, which he published in Principles of Chemistry in 1869.
• Mendeleev left space for new elements and predicted three elements that were discovered later: Ga (1875), Sc (1879) and Ge (1886).

Periodic Table
Chemical properties of the elements vary gradually along the two axes.

History of QSAR
• 1869, D. Mendeleev: the Periodic Table of Elements.
• 1868, A. Crum-Brown and T. R. Fraser suggested that the physiological activity of molecules depends on their constitution: Activity = F(structure). They studied a series of quaternized strychnine derivatives, some of which show curare-like activity in paralyzing muscle.
• 1869, B. J. Richardson: the narcotic effect of primary alcohols varies in proportion to their molecular weights.

History of QSAR
• 1893, C. Richet showed that the toxicities of some simple organic compounds (ethers, alcohols, ketones) are inversely related to their solubility in water.
• 1899, H. Meyer, and 1901, E. Overton, found that the potencies of narcotic compounds vary with log P.
• 1904, J. Traube found a linear relation between narcosis and surface tension.

History of QSAR
• 1937, L. P. Hammett studied the chemical reactivity of substituted benzenes: the Hammett equation, a Linear Free Energy Relationship (LFER).
• 1939, J. Ferguson formulated a concept linking narcotic activity, log P and thermodynamics.
• 1952 - 1956, R. W. Taft devised a procedure for separating polar, steric and resonance effects.

History of QSAR
• 1964, C. Hansch and T. Fujita: the biologist’s Hammett equation.
• 1964, Free and Wilson: QSAR on fragments.
• 1970s – 1980s: development of 2D QSAR (descriptors, mathematical formalism).
• 1980s – 1990s: development of 3D QSAR (pharmacophores, CoMFA, docking).
• 1990s – present: virtual screening.

1934 - Hammett
R       H      CH3    OCH3   F      Cl     NO2
ortho   6.27   12.3   8.06   54.1   11.4   671
meta    6.27   5.35   8.17   13.6   14.8   32.1
para    6.27   4.24   3.38   7.22   10.5   37.0

1934 - Hammett
Substituent   Meta              Para
O-            -0.708            -1.00
OH            +0.121            -0.37
OCH3          +0.115            -0.268
NH2           -0.161            -0.660
CH3           -0.069            -0.170
(CH3)3Si      -0.121            -0.072
C6H5          +0.06             -0.01
H             0.000             0.000
SH            +0.25             +0.15
SCH3          (not recovered)   0.00
F             +0.337            +0.062
Cl            +0.373            +0.227
CO2H          +0.355            +0.406
COCH3         +0.376            +0.502
CF3           +0.43             +0.54
SO2Ph         +0.61             +0.70
NO2           +0.710            +0.778
+N(CH3)3      (not recovered)   +0.82
N2+           +1.76             +1.91
+S(CH3)2      +1.00             +0.90

Steric effects
Taft quantified steric (spatial) effects using the hydrolysis of esters. Here, the size of R affects the rate of reaction by blocking nucleophilic attack by water. The steric effect is quantified by the Taft parameter Es, which is defined through k, the rate constant for ester hydrolysis. The expression is analogous to the Hammett equation.
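The defining expression did not survive in the slide text. A commonly quoted form is given here as an assumption, not as the slide's own equation. It compares the hydrolysis rate constant of the substituted ester, $k_X$, with that of a reference ester, $k_{\mathrm{ref}}$ (both symbols are introduced here for illustration):

```latex
E_s = \log k_X - \log k_{\mathrm{ref}} = \log\frac{k_X}{k_{\mathrm{ref}}}
```

Under this convention the reference substituent has $E_s = 0$, and bulkier groups, which slow the hydrolysis, take negative values.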

Es Values for Various Substituents
Substituent   Es
H              0.0
Me            -1.24
Pr            -1.60
t-Bu          -2.78
F             -0.46
Cl            -0.97
Br            -1.16
OH            -0.55
SH            -1.07
NO2           -2.52
C6H5          -3.82
CN            -0.51
NH2           -0.61
Compare some extreme values:
• H, 0.00: the reference substituent in the Taft equation
• Me, -1.24: little steric resistance to hydrolysis
• t-Bu, -2.78: large resistance to hydrolysis
Note: H is usually used as the reference substituent (Es = 0), but when another group such as methyl (Me) is used as the reference, as in the chemical equation above, the value for H becomes 1.24.

Steric effects
Es may be used for other chemical reactions and to explain biological activities, for example the hydrolysis of inhibitors of acetylcholine esterase. Organophosphates must be hydrolysed to be active, and their biological activity is observed to be directly related to the Taft steric parameter Es of the substituent R.
[Equation not recovered from the slide.]

Octanol/water partition coefficient
Usually log P is used instead of P.
• log P > 0: the compound prefers hydrophobic (nonpolar) media
• log P < 0: the compound prefers polar media

Biological activity as a function of log P

Hansch Analysis
Biological Activity = f(EL, ST, HPh) + constant
• Biological activity is expressed as log 1/C, where C is the drug concentration that produces a defined effect (EC50, GI50, etc.).
• EL (electronic descriptor): Hammett constants (σm, σp0, σp+, σp-, R, F)
• HPh (hydrophobicity descriptor): hydrophobic substituent constant, log P (octanol/water partition coefficient)
• ST (steric descriptor): Taft steric constant Es
$\log(1/C) = a(\log P)^2 + b\log P + \rho\sigma + \delta E_s + \mathrm{const}$
Hansch, C.; Fujita, T. J. Am. Chem. Soc., 1964, 86, 1616.
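To make the regression behind a Hansch equation concrete, here is a minimal sketch that fits the five coefficients by ordinary least squares. Everything in it is an assumption made for illustration: the function names, the coefficient labels (a, b, rho, delta, const) and above all the data, which are synthetic and noise-free, generated from made-up coefficients rather than measured.

```python
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_hansch(log_p, sigma, e_s, activity):
    """Fit log(1/C) = a*logP^2 + b*logP + rho*sigma + delta*Es + const."""
    rows = [[lp * lp, lp, s, e, 1.0] for lp, s, e in zip(log_p, sigma, e_s)]
    k = 5
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, activity)) for i in range(k)]
    names = ["a", "b", "rho", "delta", "const"]
    return dict(zip(names, solve_linear(xtx, xty)))

# Synthetic descriptors and activities (hypothetical, not experimental data)
rng = random.Random(0)
log_p = [rng.uniform(-1.0, 4.0) for _ in range(30)]
sigma = [rng.uniform(-0.7, 0.8) for _ in range(30)]
e_s = [rng.uniform(-3.0, 0.0) for _ in range(30)]
true = {"a": -0.2, "b": 1.1, "rho": 0.8, "delta": 0.3, "const": 2.0}
activity = [true["a"] * lp * lp + true["b"] * lp + true["rho"] * s
            + true["delta"] * e + true["const"]
            for lp, s, e in zip(log_p, sigma, e_s)]

fitted = fit_hansch(log_p, sigma, e_s, activity)
# With a < 0 the parabola in log P has a maximum at -b / (2a)
optimum_log_p = -fitted["b"] / (2 * fitted["a"])
print(fitted, optimum_log_p)
```

The negative quadratic coefficient is what gives the parabolic dependence of activity on log P; with real data the fit would of course need noise handling and validation.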

Hansch Analysis
Biological Activity = f(physicochemical properties) + constant
Physicochemical properties can be broadly classified into three general types:
• Electronic
• Steric
• Hydrophobic

Descriptors

Quantitative Structure Activity Relationship (QSAR)
Quantitative structure-activity relationships correlate, within congeneric series of compounds, their chemical or biological activities, either with certain structural features or with atomic, group or molecular descriptors.
Scheme: molecular structure → representation → descriptors → feature selection & mapping → activities
Katritzky, A. R.; Lobanov, V. S.; Karelson, M. Chem. Soc. Rev. 1995, 24, 279-287.

Definition of molecular descriptor The molecular descriptor is the final result of a logic and mathematical procedure which transforms chemical information encoded within a symbolic representation of a molecule into a useful number or the result of some standardized experiment. Roberto Todeschini and Viviana Consonni

A complete description of all the molecular descriptors is given in:
Handbook of Molecular Descriptors, Roberto Todeschini and Viviana Consonni. Methods and Principles in Medicinal Chemistry, Volume 11. Edited by R. Mannhold, H. Kubinyi and H. Timmerman. WILEY-VCH, Weinheim, Germany, 2000.

Descriptors from Codessa Pro
Descriptors: calculable molecular attributes that govern particular macroscopic properties.
Descriptor families:
• Topological
• Fragments
• Receptor surface
• Structural
• Information-content
• Spatial
• Electronic
• Thermodynamic
• Conformational
• Quantum mechanical

Molecular Descriptors
Classification based on the dimensionality of the structure representation:
• 1D (atom counts, MW, number of functional groups, …)
• 2D (topological indices, BCUT, TPSA, Shannon entropy, …)
• 3D (geometrical parameters, molecular surfaces, parameters calculated in quantum chemistry programs, …)

Molecular Descriptors 1D

Constitutional descriptors
• number of atoms
• absolute and relative numbers of C, H, O, S, N, F, Cl, Br, I, P atoms
• number of bonds (single, double, triple and aromatic bonds)
• number of benzene rings, number of benzene rings divided by the number of atoms
• molecular weight and average atomic weight
• number of rotatable bonds (all terminal H atoms are ignored)
• Hbond acceptor: number of hydrogen bond acceptors
• Hbond donor: number of hydrogen bond donors
These simple descriptors reflect only the molecular composition of the compound, without using the geometry or electronic structure of the molecule.
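As a rough illustration of how the composition-only descriptors in this list can be obtained, the sketch below derives atom counts, relative atom counts, molecular weight and average atomic weight from a molecular formula string. The function name, the simple formula syntax (element symbols followed by optional counts, no brackets) and the rounded atomic weights are assumptions made for the example; bond, ring and hydrogen-bond counts need a full structure and are not covered.

```python
import re

# Rounded standard atomic weights for the elements named on the slide
ATOMIC_WEIGHT = {
    "C": 12.011, "H": 1.008, "O": 15.999, "S": 32.06, "N": 14.007,
    "F": 18.998, "Cl": 35.45, "Br": 79.904, "I": 126.904, "P": 30.974,
}

def constitutional_descriptors(formula):
    """Composition-only descriptors from a formula such as 'C2H6O'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in ATOMIC_WEIGHT:
            raise ValueError(f"unsupported element: {symbol}")
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    n_atoms = sum(counts.values())
    weight = sum(ATOMIC_WEIGHT[s] * n for s, n in counts.items())
    return {
        "n_atoms": n_atoms,
        "absolute_counts": counts,
        "relative_counts": {s: n / n_atoms for s, n in counts.items()},
        "molecular_weight": weight,
        "average_atomic_weight": weight / n_atoms,
    }

ethanol = constitutional_descriptors("C2H6O")
print(ethanol)
```

Because only the formula is used, isomers such as ethanol and dimethyl ether receive identical values, which is exactly the limitation the slide points out.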

Molecular Descriptors 2D

Topological Descriptors
Topological descriptors based on the molecular graph representation are widely used in QSPR and QSAR studies because they help to differentiate molecules, mostly according to their size, degree of branching, flexibility and overall shape.

TI based on the adjacency matrix
• Total adjacency index: $A = \tfrac{1}{2}\sum_i \sum_j a_{ij}$
• For G1 and G2, A = 5.
• This TI can only distinguish between structures having a different number of cycles (for cyclohexane, A = 6).
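The following sketch evaluates the total adjacency index for three hydrogen-suppressed graphs. The drawings of G1 and G2 are not in the text, so their bond lists are an assumption: G1 is taken as 2,3-dimethylbutane and G2 as n-hexane, which is consistent with the vertex degrees used in the Zagreb calculation on the later slide. Function and variable names are illustrative.

```python
def adjacency_matrix(n_atoms, bonds):
    """Symmetric 0/1 adjacency matrix of a hydrogen-suppressed graph."""
    a = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        a[i][j] = a[j][i] = 1
    return a

def total_adjacency_index(a):
    """A = (1/2) * sum of all matrix entries, i.e. the number of edges."""
    return sum(sum(row) for row in a) // 2

# Assumed structures (not given explicitly in the slide text)
G1_BONDS = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]   # 2,3-dimethylbutane
G2_BONDS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]   # n-hexane
CYCLOHEXANE_BONDS = G2_BONDS + [(5, 0)]               # ring closure

for name, bonds in [("G1", G1_BONDS), ("G2", G2_BONDS),
                    ("cyclohexane", CYCLOHEXANE_BONDS)]:
    print(name, total_adjacency_index(adjacency_matrix(6, bonds)))
```

Both acyclic six-atom graphs have five edges, so A cannot tell the branched isomer from the linear one; only the extra ring-closure bond changes the value.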

TI based on the adjacency matrix: Zagreb group indices
$M_1 = \sum_i d_i^2$
$M_2 = \sum_{(i,j)\,\in\,\mathrm{edges}} d_i d_j$
where the vertex degree $d_i$ is the number of bonds involving atom i, excluding bonds to H atoms. The Zagreb group indices were introduced to characterize branching.

Zagreb group indices
$M_1 = \sum_i d_i^2$, $M_2 = \sum_{(i,j)\,\in\,\mathrm{edges}} d_i d_j$
• M1(G1) = 4*1² + 2*3² = 22
• M1(G2) = 2*1² + 4*2² = 18
• M2(G1) = 4*(1*3) + 1*(3*3) = 21
• M2(G2) = 2*(1*2) + 3*(2*2) = 16
Randić’s molecular connectivity index
Randić introduced a connectivity index similar to M2:
$R = \sum_{(i,j)\,\in\,\mathrm{edges}} (d_i d_j)^{-1/2}$
M. Randić, J. Am. Chem. Soc., 97, 6609 (1975).
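The arithmetic above can be checked with a few lines of code. This is a sketch under the assumption that G1 is 2,3-dimethylbutane (four vertices of degree 1, two of degree 3) and G2 is n-hexane (two vertices of degree 1, four of degree 2); the structures themselves are not given in the text, and all names below are illustrative.

```python
from math import sqrt

def vertex_degrees(n_atoms, bonds):
    """Number of non-hydrogen neighbours of each atom."""
    deg = [0] * n_atoms
    for i, j in bonds:
        deg[i] += 1
        deg[j] += 1
    return deg

def zagreb_m1(n_atoms, bonds):
    """M1: sum of squared vertex degrees."""
    return sum(d * d for d in vertex_degrees(n_atoms, bonds))

def zagreb_m2(n_atoms, bonds):
    """M2: sum over edges of the product of the end-vertex degrees."""
    deg = vertex_degrees(n_atoms, bonds)
    return sum(deg[i] * deg[j] for i, j in bonds)

def randic_index(n_atoms, bonds):
    """R: sum over edges of (d_i * d_j) ** -0.5."""
    deg = vertex_degrees(n_atoms, bonds)
    return sum(1.0 / sqrt(deg[i] * deg[j]) for i, j in bonds)

# Assumed structures (hydrogen-suppressed graphs)
G1_BONDS = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]   # 2,3-dimethylbutane
G2_BONDS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]   # n-hexane

for name, bonds in [("G1", G1_BONDS), ("G2", G2_BONDS)]:
    print(name, zagreb_m1(6, bonds), zagreb_m2(6, bonds),
          round(randic_index(6, bonds), 4))
```

Under these assumptions the branched graph has the larger M1 and M2 and the smaller Randić index, in line with the statement that these indices characterize branching.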

TI based on the Distance Matrix: the Wiener Index
The entry $d_{ij}$ of the distance matrix is the number of edges in the shortest path between vertices i and j. The Wiener index (the first TI!) accounts for branching:
$W = \tfrac{1}{2}\sum_i \sum_j d_{ij}$
W(G1) = 29, W(G2) = 35
Reference: H. Wiener, J. Am. Chem. Soc., 69, 17 (1947)
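A short sketch of the Wiener index follows, using breadth-first search to obtain the shortest-path distances. As before, the identification of G1 with 2,3-dimethylbutane and G2 with n-hexane is an assumption (the drawings are missing from the text), chosen because it reproduces the quoted values; the function name is illustrative.

```python
from collections import deque

def wiener_index(n_atoms, bonds):
    """Half the sum of all shortest-path distances in the molecular graph."""
    neighbours = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)
    total = 0
    for start in range(n_atoms):
        # Breadth-first search gives edge-count distances from `start`
        dist = {start: 0}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbours[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total // 2  # every pair was counted twice

# Assumed structures (hydrogen-suppressed graphs)
G1_BONDS = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]   # 2,3-dimethylbutane
G2_BONDS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]   # n-hexane

print(wiener_index(6, G1_BONDS), wiener_index(6, G2_BONDS))
```

Branching shortens the paths between atoms, so the more compact G1 has the smaller Wiener index.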

TPSA - Topological Polar Surface Area
Peter Ertl, Bernhard Rohde, and Paul Selzer, J. Med. Chem. 2000, 43, 3714-3717

TPSA - Topological Polar Surface Area
3D PSA vs TPSA for 34,810 molecules from the World Drug Index

Geometrical descriptors
• Moments of inertia (rigid rotator approximation): the moments of inertia characterize the mass distribution in the molecule.
• Shadow indices [1]: surface area projections
• Radius of gyration
• Area: molecular surface area descriptor; describes the van der Waals area of the molecule; related to binding, transport, and solubility
1. Rohrbaugh, R. H.; Jurs, P. C. Anal. Chim. Acta 1987, 199, 99-109.

Molecular Descriptors 3D

Steric parameters
• Length-to-breadth ratio L/B [1]
• Molecular thickness B
• Ovality [2] (ratio of the actual surface area to the minimum surface area)
• Molecular volume
• Sterimol parameters [3]
• Taft steric parameter Es
1. Janini, G. M.; Johnston, K.; Zielinski, W. L. Anal. Chem. 1975, 47, 670.
2. Verloop, A.; Tipker, J. In Biological Activity and Chemical Structure; Buisman, J. A. K., Ed.; Elsevier: Amsterdam, Netherlands, 1977; p 63.
3. Kourounakis, A.; Bodor, N. Pharm. Res. 1995, 12(8), 1199.

Quantum Chemical Descriptors
Quantitative values calculated in quantum-mechanical calculations (semi-empirical, ab initio HF or DFT):
• Atomic charges
• LUMO: lowest unoccupied molecular orbital energy
• HOMO: highest occupied molecular orbital energy
• DIPOLE: dipole moment
• Components of the dipole moment along the inertia axes (Dx, Dy, Dz)
• Hf: heat of formation
• Mean polarizability: $\alpha = \tfrac{1}{3}(\alpha_{xx} + \alpha_{yy} + \alpha_{zz})$
• EA: electron affinity
• IP: ionization potential
• E: energy of protonation
• Electrostatic potential

Lipophilic Descriptors (2D and 3D)

Lipophilic Descriptors
log P(octanol/water), log P(alkane/water), log P(chloroform/water), log P(dichloroethane/water)
Octanol-water partition coefficient:
• Hansch-Leo method (ClogP)
• Rekker's method
• Ghose-Crippen method (calculated log P based on summing contributions of atom types)
• Molecular lipophilicity potential (MLP)
The MLP describes how lipophilicity is distributed over the different parts of a molecule (lipophilicity maps and determination of the hydrophilic and lipophilic regions of a molecule).

Lipophilic Descriptors

Some log Po/w Extremes in Therapy

What do these Drugs have in Common?
• Chloroform: log Po/w = 1.97
• Irsogladine: log Po/w = 1.97
• Secobarbital: log Po/w = 1.97
• Trandolapril: log Po/w = 1.97
• Acetyldigitoxine: log Po/w = 1.97

3D Hydrophobicity
All molecules have the same log P (~1.5) but different 3D MLP patterns (colour scale in the figure: hydrophobic to hydrophilic).

Example of oral administration:
– The drug is exposed to a large variety of pH values:
• Saliva: pH 6.4
• Stomach: pH 1.0 – 3.5
• Duodenum: pH 5 – 7.5
• Jejunum: pH 6.5 – 8
• Colon: pH 5.5 – 6.8
• Blood: pH 7.4
– “Liver first-pass effect”
Image source: www.3dscience.com

Lipophilic Descriptors
• log D
• log PN: log P of the neutral form
• log PI: log P of the ionized form

log D: The Calculation
log D may simply be calculated from the predicted log P and the pKa of the singly ionized species at a given pH:
• For acids: $\log D(\mathrm{pH}) = \log P - \log\left[1 + 10^{(\mathrm{pH} - \mathrm{p}K_a)}\right]$
• For bases: $\log D(\mathrm{pH}) = \log P - \log\left[1 + 10^{(\mathrm{p}K_a - \mathrm{pH})}\right]$
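The two formulas translate directly into code. The sketch below is illustrative only: the function names are made up, and the example compound (an acid with log P = 3.0 and pKa = 4.4) is hypothetical, not taken from the slides. The pH values are those quoted for stomach and blood in the oral administration example.

```python
from math import log10

def log_d_acid(log_p, pka, ph):
    """log D of a monoprotic acid: ionization increases as pH rises above pKa."""
    return log_p - log10(1 + 10 ** (ph - pka))

def log_d_base(log_p, pka, ph):
    """log D of a monoprotic base: ionization increases as pH falls below pKa."""
    return log_p - log10(1 + 10 ** (pka - ph))

# Hypothetical acid, evaluated at stomach-like and blood pH
LOG_P, PKA = 3.0, 4.4
for label, ph in [("stomach", 2.0), ("blood", 7.4)]:
    print(label, round(log_d_acid(LOG_P, PKA, ph), 3))
```

At pH = pKa half of the compound is ionized, so log D lies log 2 (about 0.30) below log P; three pH units above the pKa of an acid, log D is roughly three units lower than log P.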

Fragment Descriptors
Fragments considered: Cl, amide, COOH, Br, Phenyl
For the example molecule: Cl = 1, amide = 1, COOH = 1, Br = 0, Phenyl = 0

ISIDA Fragment descriptors
Types of fragments: I. Sequences; II. Augmented atoms
Example of type I(AB, 2-4), sequences of 2 to 4 atoms described by atoms and bonds: C-N=C-H, C-N=C, N=C-N, N=C, C-H

ISIDA Fragment descriptors Type of Fragments I. Sequences II. Augmented Atoms II(A) (no hybridization) II(Hy) (hybridization of neighbours is taken into account)

Calculation of Descriptors
Figure: ISIDA FRAGMENTOR converts the data set of structures into the pattern matrix, with one row per compound and one column of occurrence counts per fragment. Rows shown on the slide:
0  10  1  5  0
0   8  1  4  0
0   4  1  2  4
etc.

Figure: model-building workflow
• Input: pattern matrix and property values
• Learning stage: building of models
• Validation stage: filtering of the QSAR models and selection of the most predictive ones
• Output: QSAR models

Example: linear QSPR model
PROPERTYcalc = -0.36 * N(C-C-C-N-C-C) + 0.27 * N(C=O) + 0.12 * N(C-N-C*C) + …
where N(fragment) is the number of occurrences of the fragment in the molecule.
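Applying such a model amounts to a weighted sum of fragment counts. In the sketch below only the three coefficients come from the slide; the terms hidden behind the "…" are unknown and simply left out, and the fragment counts of the example molecule, the zero intercept and the function name are hypothetical.

```python
# Coefficients of the three terms shown on the slide; further terms are elided there
COEFFICIENTS = {
    "C-C-C-N-C-C": -0.36,
    "C=O": 0.27,
    "C-N-C*C": 0.12,
}

def predict_property(fragment_counts, coefficients=COEFFICIENTS, intercept=0.0):
    """Linear QSPR estimate: intercept + sum(coefficient * fragment count)."""
    return intercept + sum(
        coef * fragment_counts.get(fragment, 0)
        for fragment, coef in coefficients.items()
    )

# Hypothetical row of a pattern matrix (fragment -> occurrence count)
molecule = {"C-C-C-N-C-C": 2, "C=O": 1, "C-N-C*C": 3}
print(round(predict_property(molecule), 2))
```

Fragments absent from a molecule contribute nothing, so the same function works for every row of the pattern matrix.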

Software

DRAGON
The DRAGON software calculates 1664 molecular descriptors divided into 20 blocks.

CODESSA Pro
• calculates a large variety of molecular descriptors on the basis of the 3D geometrical structure and/or quantum-chemical parameters;
• develops (multi)linear and non-linear QSPR models

ISIDA program
• calculates fragment descriptors;
• develops (multi)linear and non-linear QSPR models