A brief on: Domain Families & Classification
Classification into Families We can classify proteins into families by: – A. Sequence – B. Structure – C. Function (annotation) – D. Evolution
Terms in use: Motif = Domain = Signature = Profile = Seed Family = Cluster. These terms are used interchangeably; they are very (too) flexible.
Motif = Domain = Function??? • A motif is a sequence signature. • Structural definition of a domain: an independently folding structural unit. • A protein family is not well-defined. • Protein function is not well-defined (some proteins can have several functions). • Conclusion: these terms are used interchangeably, but they are very flexible.
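To make "a motif is a sequence signature" concrete, here is a minimal sketch that turns a PROSITE-style pattern into a regular expression and scans a sequence with it. The pattern is the well-known P-loop (ATP/GTP-binding site motif A) signature; the helper name `prosite_to_regex`, the supported syntax subset, and the toy sequence are assumptions made for illustration, not part of the slides.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a small subset of PROSITE pattern syntax into a Python regex.

    Supported (assumed subset): residues, x, [ABC], {ABC}, and repeat
    counts such as x(4) or x(2,3). Anchors and other features are ignored.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split off an optional repeat count such as "(4)" or "(2,3)".
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # any residue except these
        parts.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)

# P-loop signature written in PROSITE syntax.
P_LOOP = "[AG]-x(4)-G-K-[ST]"

# Made-up toy sequence, used only to exercise the pattern.
toy_sequence = "MSGAPGSGKSTLLR"

regex = prosite_to_regex(P_LOOP)
hit = re.search(regex, toy_sequence)
```

A match only says the signature is present; as the slide notes, it does not by itself establish a domain boundary or a function.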
Protein folds: Toxin binding protein (TolB), Di-isopropylfluorophosphatase, Glucose dehydrogenase
Dominant domain fold types. Holm and Sander. PROTEINS: Structure, Function, and Genetics 33:88–96 (1998)
Why Research Protein Families? • Function prediction and annotation. • Evolutionary research - finding orthologs and paralogs. • Search for new protein folds. • Functional research by similarity in characteristics.
Domains are the building blocks of evolution: some facts. • Each domain occurs in diverse sets of protein families. • The number of domains in a protein ranges from 1 up to tens. • Structure-based domains are ~150 aa. • Length varies: some are very short (30-40 aa), others are long (> 500 aa). • Domain definition is somewhat blurred. • Determining domain boundaries is an unsolved problem. Figure: Pyruvate kinase (PDB: 1pkn), 3 domains.
How is a novel gene born? • Domains are the evolutionary units of sequence that comprise the gene coding regions. • Most genes are built from more than one domain. • Novel genes can be created by recombination of domains into new domain arrangements.
Correspondence between functional associations and genes linked by the fusion method. Example from glycolysis: • Separate genes in M. genitalium: PGK, TIM, GAPDH. • Fused genes: Thermotoga maritima PGK+TIM; Phytophthora infestans TIM+GAPDH. • Pathway metabolites shown: Glycerone-P, Glyceraldehyde-3P, Glycerate-1,3P2, Glycerate-3P.
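The idea behind the fusion method in the glycolysis example can be sketched in a few lines: two components that are separate proteins in one organism are predicted to be functionally associated if they appear fused into a single protein in another organism. The sketch assumes each protein has already been decomposed into named components (in practice this comes from sequence similarity searches); the function names and the data layout are hypothetical.

```python
from itertools import combinations

# Organism -> list of proteins, each protein given as a tuple of components.
# The content mirrors the slide example; the layout is an assumption.
proteomes = {
    "M. genitalium": [("PGK",), ("TIM",), ("GAPDH",)],
    "Thermotoga maritima": [("PGK", "TIM")],
    "Phytophthora infestans": [("TIM", "GAPDH")],
}

def fusion_links(proteomes):
    """Map each pair of components to the organisms where they are fused."""
    links = {}
    for organism, proteins in proteomes.items():
        for components in proteins:
            for a, b in combinations(sorted(set(components)), 2):
                links.setdefault(frozenset((a, b)), []).append(organism)
    return links

def predicted_associations(proteomes, query_organism):
    """Pairs that are separate proteins in the query but fused elsewhere."""
    standalone = {p[0] for p in proteomes[query_organism] if len(p) == 1}
    result = {}
    for pair, organisms in fusion_links(proteomes).items():
        others = [o for o in organisms if o != query_organism]
        if pair <= standalone and others:
            result[pair] = others
    return result

associations = predicted_associations(proteomes, "M. genitalium")
```

For M. genitalium this links PGK with TIM (via the Thermotoga maritima fusion) and TIM with GAPDH (via the Phytophthora infestans fusion), matching the adjacency of these enzymes in the pathway.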
What is a Protein Family? • Protein family: a group of proteins that have a common protein ancestor. • Is it that simple? • Domains: non-linear evolution. • Who is in this family?
A protein can contain several copies of the same domain or several different domains. Example: fibronectin (PDB: 1fnf)
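One simple way to represent this is to treat a protein as an ordered list of domain labels and count repeats. The sketch below assumes such a list is already available from an annotation step; the labels in `example` are hypothetical placeholders and are not the actual annotation of 1fnf.

```python
from collections import Counter

def summarize_architecture(domains):
    """Summarize an ordered list of domain labels for one protein."""
    counts = Counter(domains)
    return {
        "n_domains": len(domains),    # total domain instances
        "n_distinct": len(counts),    # number of different domain types
        "repeated": {d: n for d, n in counts.items() if n > 1},
    }

# Hypothetical architecture: two domain types repeated, one occurring once.
example = ["FN1", "FN1", "FN2", "FN3", "FN3", "FN3"]
summary = summarize_architecture(example)
```

Keeping the list ordered preserves the domain arrangement, which matters when comparing proteins that share domains but combine them differently.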
The Power of Integration: Pfam, PROSITE, SMART, PRINTS, TIGRFAMs, ProDom; InterPro; SCOP, CATH, FSSP; GO; KEGG