Modelling Proteomes Ram Samudrala University of Washington
Rationale for understanding protein structure and function
Protein sequence (large numbers of sequences, including whole genomes) -> ? -> Protein function
Protein structure: three-dimensional, complicated, mediates function
Sequence to structure: structure determination, structure prediction
Structure to function: homology, rational mutagenesis, biochemical analysis, model studies
Protein function:
- rational drug design and treatment of disease
- protein and genetic engineering
- build networks to model cellular pathways
- study organismal function and evolution
Protein folding
DNA (codons shown in RNA notation): …-CUA-AAA-GGU-GUU-AGC-AAG-GUU-…
protein sequence: …-L-K-E-G-V-S-K-D-… (each letter is one amino acid)
unfolded protein: not unique, mobile, inactive, expanded, irregular
spontaneous self-organisation (~1 second)
native state: unique shape, precisely ordered, stable/functional, globular/compact, helices and sheets
Ab initio prediction of protein structure
sample: sample conformational space such that native-like conformations are found. Difficulty: astronomically large number of conformations (5 states per residue over 100 residues = 5^100 ≈ 10^70).
select: choose native-like conformations from the sampled set. Difficulty: hard to design functions that are not fooled by non-native conformations ("decoys").
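The size of the search space quoted on this slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes, as the slide does, 5 discrete states per residue and a 100-residue chain; the function name `conformation_count` is illustrative only.

```python
import math

def conformation_count(states_per_residue: int, n_residues: int) -> int:
    """Count conformations when each residue independently takes one of a fixed number of states."""
    return states_per_residue ** n_residues

# Values from the slide: 5 states per residue, 100 residues.
count = conformation_count(5, 100)

# Express the count as a power of ten: log10(5^100) = 100 * log10(5).
exponent = 100 * math.log10(5)

print(f"5^100 has {len(str(count))} digits, i.e. roughly 10^{exponent:.1f}")
```

The exact integer is kept alongside the logarithm because Python integers do not overflow, so both forms are available for comparison.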
Semi-exhaustive segment-based folding
Input sequence: EFDVILKAAGANKVAVIKAVRGATGLGLKEAKDLVESAPAALKEGVSKDDAEALKKALEEAGAEVEVK
generate: fragments from database, 14-state φ, ψ model
minimise: Monte Carlo with simulated annealing, conformational space annealing, GA
filter: all-atom pairwise interactions, bad contacts, compactness, secondary structure
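The "minimise" step names Monte Carlo with simulated annealing. The sketch below illustrates that general technique on a toy problem; it is not the method used in the talk. Each residue takes one of 14 discrete states (the count on the slide), while the energy function `toy_energy`, the cooling schedule, and all parameter values are hypothetical stand-ins chosen only to make the example runnable.

```python
import math
import random

N_STATES = 14  # discrete (phi, psi) states per residue, as on the slide

def toy_energy(conformation):
    """Hypothetical energy: penalise neighbouring residues whose states differ."""
    return sum(abs(a - b) for a, b in zip(conformation, conformation[1:]))

def anneal(n_residues, n_steps=5000, t_start=5.0, t_end=0.05, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule (assumed parameters)."""
    rng = random.Random(seed)
    current = [rng.randrange(N_STATES) for _ in range(n_residues)]
    current_e = toy_energy(current)
    best, best_e = list(current), current_e
    for step in range(n_steps):
        # Temperature falls geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / max(1, n_steps - 1))
        # Propose a move: reassign the state of one randomly chosen residue.
        trial = list(current)
        trial[rng.randrange(n_residues)] = rng.randrange(N_STATES)
        delta = toy_energy(trial) - current_e
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_e = trial, current_e + delta
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e

best, best_e = anneal(n_residues=20)
print("lowest toy energy found:", best_e)
```

A real pipeline would replace `toy_energy` with a physical or knowledge-based scoring function and would propose fragment insertions rather than single-state changes.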
Ab initio prediction at CASP
Before CASP (BC): "solved" (biased results)
CASP1: worse than random
CASP2: worse than random with one exception
CASP3: consistently predicted correct topology, ~6.0 Å for 60+ residues
CASP4: consistently predicted correct topology, ~4-6.0 Å for 60-80+ residues
**T97/er29: 6.0 Å (80 residues; 18-97)
*T98/sp0a: 6.0 Å (60 residues; 37-105)
**T102/as48: 5.3 Å (70 residues; 1-70)
**T106/sfrp3: 6.2 Å (70 residues; 6-75)
**T110/rbfa: 4.0 Å (80 residues; 1-80)
*T114/afp1: 6.5 Å (45 residues; 36-80)
Comparative modelling of protein structure
scan, align:
KDHPFGFAVPTKNPDGTMNLMNWECAIP
KDPPAGIGAPQDN----QNIMLWNAVIP
build initial model: minimum perturbation
construct non-conserved side chains and main chains: graph theory, semfold
refine: physical functions
de novo simulation
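The two aligned sequences on this slide can be used to illustrate how sequence identity is counted from an alignment. This is a minimal sketch, assuming identity is measured over columns where neither sequence has a gap; other conventions for the denominator exist, and the function name `percent_identity` is illustrative.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-"):
    """Return (identical, aligned, percent) over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    aligned = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        aligned += 1
        if a == b:
            identical += 1
    percent = 100.0 * identical / aligned if aligned else 0.0
    return identical, aligned, percent

# Alignment copied from the slide.
target = "KDHPFGFAVPTKNPDGTMNLMNWECAIP"
template = "KDPPAGIGAPQDN----QNIMLWNAVIP"

identical, aligned, percent = percent_identity(target, template)
print(f"{identical} identical of {aligned} aligned positions ({percent:.1f}%)")
```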
A graph theoretic representation of protein structure
Steps: represent residues as nodes, weigh nodes, construct graph, find cliques.
Figure: five nodes with weights V1 (-0.6), I (-0.5), V2 (-0.9), K (-0.7), F (-1.0); edges carry weights between -0.1 and -0.4; a clique with total weight W = -4.5 is found.
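Finding the best clique in a small weighted graph can be sketched by brute force. The node weights below are the ones shown on the slide; the edge weights, and the choice to leave V1 and V2 unconnected, are hypothetical because the figure's edge assignments did not survive extraction. The result is therefore not expected to reproduce the W = -4.5 shown on the slide.

```python
from itertools import combinations

# Node weights as shown on the slide.
NODE_WEIGHTS = {"V1": -0.6, "I": -0.5, "V2": -0.9, "K": -0.7, "F": -1.0}

# Hypothetical edge weights (assumption); V1 and V2 are left unconnected.
EDGE_WEIGHTS = {
    frozenset(("V1", "I")): -0.1,
    frozenset(("V1", "F")): -0.1,
    frozenset(("V1", "K")): -0.4,
    frozenset(("I", "F")): -0.2,
    frozenset(("I", "K")): -0.1,
    frozenset(("I", "V2")): -0.3,
    frozenset(("F", "K")): -0.2,
    frozenset(("F", "V2")): -0.1,
    frozenset(("K", "V2")): -0.2,
}

def is_clique(nodes):
    """True if every pair of nodes is joined by an edge."""
    return all(frozenset(pair) in EDGE_WEIGHTS for pair in combinations(nodes, 2))

def clique_weight(nodes):
    """Sum of node weights plus the weights of all edges inside the clique."""
    node_sum = sum(NODE_WEIGHTS[n] for n in nodes)
    edge_sum = sum(EDGE_WEIGHTS[frozenset(p)] for p in combinations(nodes, 2))
    return node_sum + edge_sum

def best_clique():
    """Enumerate every subset of nodes, keep the cliques, return the lowest-weight one."""
    names = sorted(NODE_WEIGHTS)
    cliques = [
        subset
        for size in range(1, len(names) + 1)
        for subset in combinations(names, size)
        if is_clique(subset)
    ]
    return min(cliques, key=clique_weight)

best = best_clique()
print(best, round(clique_weight(best), 2))
```

Exhaustive enumeration is only practical for a handful of nodes; larger graphs would need a dedicated clique-finding algorithm.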
Comparative modelling at CASP
         alignment   side chain   short loops   longer loops
BC       excellent   ~80%         1.0 Å         2.0 Å
CASP1    poor        ~50%         ~3.0 Å        >5.0 Å
CASP2    fair        ~75%         ~1.0 Å        ~3.0 Å
CASP3    fair        ~75%         ~1.0 Å        ~2.5 Å
CASP4    fair        ~75%         ~1.0 Å        ~2.0 Å
CASP4: overall model accuracy ranging from 1 Å to 6 Å for 50-10% sequence identity
**T128/sodm: 1.0 Å (198 residues; 50%)
**T111/eno: 1.7 Å (430 residues; 51%)
**T122/trpa: 2.9 Å (241 residues; 33%)
**T125/sp18: 4.4 Å (137 residues; 24%)
**T112/dhso: 4.9 Å (348 residues; 24%)
**T92/yeco: 5.6 Å (104 residues; 12%)
Prediction for Invb using de novo fold recognition
Computational aspects of structural genomics
A. sequence space
B. comparative modelling
C. fold recognition
D. ab initio prediction
E. target selection (targets)
F. analysis
(Figure idea by Steve Brenner.)
Computational aspects of functional genomics
G. assign function
structure-based methods: microenvironment analysis, structure comparison (figure labels: zinc binding site?, homology)
+ sequence-based methods: sequence comparison, motif searches, phylogenetic profiles, domain fusion analyses
+ experimental data
function? -> assign function to entire protein space
Bioverse – explore relationships among molecules and systems
http://bioverse.compbio.washington.edu
Jason McDermott
Bioverse – human protein-protein interaction network
Jason McDermott/Zach Frazier
Bioverse – mapping pathways on networks
Inositol phosphate metabolism
Benzoate degradation
Sphingoglycolipid metabolism
Starch/sucrose metabolism
Nicotinate and nicotinamide metabolism
Jason McDermott
Take home message
Prediction of protein structure and function can be used to model whole genomes to understand organismal function and evolution.
Acknowledgements: group members