CISC 667 Intro to Bioinformatics Fall 2005 Lecture
![CISC 667 Intro to Bioinformatics (Fall 2005) Lecture 1 Course Overview Li Liao Computer CISC 667 Intro to Bioinformatics (Fall 2005) Lecture 1 Course Overview Li Liao Computer](https://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-1.jpg)
CISC 667 Intro to Bioinformatics (Fall 2005), Lecture 1: Course Overview. Li Liao, Computer and Information Sciences, University of Delaware. CISC 667, F 05, Lec 1, Liao
![Administrative stuff u u Syllabus and tentative schedule (check frequently for update) Office hours: Administrative stuff u u Syllabus and tentative schedule (check frequently for update) Office hours:](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-2.jpg)
Administrative stuff

- Syllabus and tentative schedule (check frequently for updates)
- Office hours: 10:00 AM-11:30 AM Tuesdays and Thursdays
  - Appointments
- Collect student info (name, email, dept, language)
- Introduce textbook and other resources
  - URLs, PDF/PS files, or hardcopy handouts
  - A reading list
- Workload
  - 4 homework assignments (hands-on, to learn the nuts and bolts)
    - Language issue: Perl is strongly recommended (a tutorial is provided)
  - Mid-term and final exams
- Late policy: 15% off per class, up to two class meetings
![Bioinformatics Books · D. W. Mount, Bioinformaics: Sequence and Genome Analysis, CSHLP 2004. · Bioinformatics Books · D. W. Mount, Bioinformaics: Sequence and Genome Analysis, CSHLP 2004. ·](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-3.jpg)
Bioinformatics Books

- D. W. Mount, Bioinformatics: Sequence and Genome Analysis, CSHLP 2004.
- Dan E. Krane & Michael L. Raymer, Fundamental Concepts of Bioinformatics, Benjamin Cummings 2002.
- João Meidanis & João Carlos Setubal, Introduction to Computational Molecular Biology, PWS Publishing Company, Boston, 1996.
- Peter Clote and Rolf Backofen, Computational Molecular Biology: An Introduction, Wiley 2000.
- R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998.
- Dan Gusfield, Algorithms on Strings, Trees, and Sequences, Cambridge University Press, 1997.
- P. Baldi and S. Brunak, Bioinformatics: The Machine Learning Approach, The MIT Press, 1998.
![Molecular Biology Books Free materials: u Kimball's biology u Lawrence Hunter: Molecular biology for Molecular Biology Books Free materials: u Kimball's biology u Lawrence Hunter: Molecular biology for](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-4.jpg)
Molecular Biology Books

Free materials:
- Kimball's biology
- Lawrence Hunter: Molecular biology for computer scientists
- DOE's Molecular Genetics Primer

Books:
- Instant Notes series: Biochemistry, Molecular Biology, and Genetics
- Molecular Biology of the Cell, by Alberts et al.
![Bioinformatics - use and develop computing methods to solve biological problems The field is Bioinformatics - use and develop computing methods to solve biological problems The field is](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-5.jpg)
Bioinformatics: use and develop computing methods to solve biological problems

The field is characterized by
- an explosion of data
- difficulty in interpreting the data
- a large number of open problems
- until recently, a relative lack of sophistication in computational techniques (compared with, say, signal processing, graphics, etc.)
![Why is this course good for you? n According to a report in recent Why is this course good for you? n According to a report in recent](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-6.jpg)
Why is this course good for you?

- According to a report in a recent ACM TechNews, CS enrollment has dropped, for good or bad.
  - A factor in this drop is "the growing prominence of biotechnology and other fields."
- Bioinformatics is a computational wing of biotechnology.
![CISC 667, F 05, Lec 1, Liao CISC 667, F 05, Lec 1, Liao](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-7.jpg)
![CISC 667, F 05, Lec 1, Liao CISC 667, F 05, Lec 1, Liao](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-8.jpg)
![CISC 667, F 05, Lec 1, Liao CISC 667, F 05, Lec 1, Liao](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-9.jpg)
![CISC 667, F 05, Lec 1, Liao CISC 667, F 05, Lec 1, Liao](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-10.jpg)
![CISC 667, F 05, Lec 1, Liao CISC 667, F 05, Lec 1, Liao](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-11.jpg)
![CISC 667, F 05, Lec 1, Liao CISC 667, F 05, Lec 1, Liao](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-12.jpg)
![It is “much easier” to teach people with those skills about biology than to It is “much easier” to teach people with those skills about biology than to](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-13.jpg)
It is “much easier” to teach people with those skills about biology than to teach biologists how to code well.
![Industry is moving in n IBM: u Blue. Gene, the fastest computer with 1 Industry is moving in n IBM: u Blue. Gene, the fastest computer with 1](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-14.jpg)
Industry is moving in

- IBM:
  - BlueGene, the fastest computer, with 1 million CPUs
  - Blueprint worldwide collects all the protein information
  - Bioinformatics segment will be $40 billion in 2004, up from $22 billion in 2000
- GlaxoSmithKline
- Celera
- Merck
- AstraZeneca
- …
![Computing and IT skills Algorithm design and model building u Working with unix system/Web Computing and IT skills Algorithm design and model building u Working with unix system/Web](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-15.jpg)
Computing and IT skills

- Algorithm design and model building
- Working with Unix systems and Web servers
- Programming (in Perl, Java, etc.)
- RDBMS: SQL, Oracle PL/SQL
![People n n International Society for Computational Biology (www. iscb. org) ~ 1000 members People n n International Society for Computational Biology (www. iscb. org) ~ 1000 members](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-16.jpg)
People

- International Society for Computational Biology (www.iscb.org): ~1000 members
- Severe shortage of qualified bioinformaticians
![Conferences n n n ISMB (Intelligent Systems for Molecular Biology) started in 1992 RECOMB Conferences n n n ISMB (Intelligent Systems for Molecular Biology) started in 1992 RECOMB](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-17.jpg)
Conferences

- ISMB (Intelligent Systems for Molecular Biology), started in 1992
- RECOMB (International Conference on Computational Molecular Biology), started in 1997
- PSB (Pacific Symposium on Biocomputing), started in 1996
- TIGR Computational Genomics, started in 1997
- ...
![Journals n n n Bioinformatics Journal of Computational Biology Genomics Genome Research Nucleic Acids Journals n n n Bioinformatics Journal of Computational Biology Genomics Genome Research Nucleic Acids](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-18.jpg)
Journals

- Bioinformatics
- Journal of Computational Biology
- Genomics
- Genome Research
- Nucleic Acids Research
- ...
![How should I learn this course? Come to the class, do homework assignments, reading How should I learn this course? Come to the class, do homework assignments, reading](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-19.jpg)
How should I learn this course?

Come to class, do the homework and reading assignments, and ask questions!

Nuts and bolts: a lot of facts, new terminology, models, and algorithms.

At the beginning of each chapter of the text:
- What should be learned
- Glossary terms

A typical approach to studying almost any subject:
- What is already known? (What is the state of the art, so you won't reinvent the wheel.)
- What is unknown?
  - Known unknowns
  - Unknowns
![How much should I know about biology? - Apparently, the more the better - How much should I know about biology? - Apparently, the more the better -](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-20.jpg)
How much should I know about biology?

- Apparently, the more the better.
- At the least, Pevzner's 3-page "All you need to know about Molecular biology".
  - I will tell you.
- We adopt an "object-oriented" scheme: we will transform biological problems into abstract computing problems and hide unnecessary details. So another big goal of this course is to learn how to do abstraction.
![Organisms: three kindoms -- eukaryotes, eubacteria, and archea Cell: the basic unit of life Organisms: three kindoms -- eukaryotes, eubacteria, and archea Cell: the basic unit of life](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-21.jpg)
Organisms: three kingdoms (eukaryotes, eubacteria, and archaea)

Cell: the basic unit of life

Chromosome (DNA):
- circular, also called a plasmid when small (for bacteria)
- linear (for eukaryotes)

Genes: segments of DNA that contain the instructions for an organism's structure and function

Proteins: the workhorses of the cell
- establishment and maintenance of structure
- transport, e.g., hemoglobin and integral transmembrane proteins
- protection and defense, e.g., immunoglobulin G
- control and regulation, e.g., receptors and DNA-binding proteins
- catalysis, e.g., enzymes
![Small molecules: > sugar: carbohydrate > fatty acids > nucleotides: A, C, G, T Small molecules: > sugar: carbohydrate > fatty acids > nucleotides: A, C, G, T](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-22.jpg)
Small molecules:
- sugar: carbohydrate
- fatty acids
- nucleotides: A, C, G, T --> DNA (double helix, hydrogen bonds, complementary bases A-T, G-C)

Four bases: adenine, cytosine, guanine, and thymine (uracil in RNA)
- 5' end: phosphate group
- 3' end: free
- 1' position: attached to the base

Double-stranded DNA sequences form a helix via hydrogen bonds between complementary bases.

Hydrogen bond:
- weak: about 3~5 kJ/mol (a covalent C-C bond has 380 kJ/mol); will break when heated
- saturation
- specific
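
The complementary base pairing described above (A-T, G-C) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides: the names `COMPLEMENT`, `complement`, and `reverse_complement` are hypothetical, and the input is assumed to contain only the letters A, C, G, and T.

```python
# Watson-Crick pairing as stated on the slide: A pairs with T, G pairs with C.
# The dictionary and function names are illustrative choices, not from the course.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand (same direction)."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the opposite strand read in its own 5'-to-3' direction."""
    return complement(strand)[::-1]
```

For example, `reverse_complement("ATGC")` gives `"GCAT"`: the two strands of a double helix run in opposite directions, so the partner strand is the complement read backwards.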
![Information Expression 1 -D information array 3 -D biochemical structure CISC 667, F 05, Information Expression 1 -D information array 3 -D biochemical structure CISC 667, F 05,](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-23.jpg)
Information Expression

- 1-D information array
- 3-D biochemical structure
![Genetic Code: codons CISC 667, F 05, Lec 1, Liao Genetic Code: codons CISC 667, F 05, Lec 1, Liao](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-24.jpg)
Genetic Code: codons
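
As a rough illustration of how codons map to amino acids, the sketch below builds the standard genetic code table and translates a DNA coding sequence three bases at a time. The slide itself only names the topic, so everything here is an assumption made for illustration: the function name `translate`, the choice to read from position 0, and the choice to stop at the first stop codon.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon (hypothetical helper).

    Reading starts at position 0 and stops at the first stop codon;
    trailing bases that do not fill a whole codon are ignored.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

Here `translate("ATGGCCTAA")` reads ATG (M), GCC (A), and then the stop codon TAA, giving `"MA"`.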
![Challenges in Life Sciences n n n Understanding correlation between genotype and phenotype Predicting Challenges in Life Sciences n n n Understanding correlation between genotype and phenotype Predicting](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-25.jpg)
Challenges in Life Sciences

- Understanding correlation between genotype and phenotype
- Predicting genotype <=> phenotype
- Phenotypes:
  - drug/therapy response
  - drug-drug interactions for expression
  - drug mechanism
  - interacting pathways of metabolism
![Topics Mapping and assembly u Sequence analysis (Similarity -> Homology): u Pairwise alignment (database Topics Mapping and assembly u Sequence analysis (Similarity -> Homology): u Pairwise alignment (database](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-26.jpg)
Topics

- Mapping and assembly
- Sequence analysis (Similarity -> Homology):
  - Pairwise alignment (database searching)
  - Multiple sequence alignment
  - Gene prediction
  - Pattern (motif) discovery and recognition
- Phylogenetic analysis
  - Character based
  - Distance based
  - Probabilistic
- Structure prediction
  - RNA secondary
  - Protein secondary & tertiary
- Network analysis:
  - Metabolic pathway reconstruction
  - Regulatory networks (gene expression)
![Goals? At the end of this course, you should be able to - Describe Goals? At the end of this course, you should be able to - Describe](http://slidetodoc.com/presentation_image_h2/1da4aa1ad90c0b5bcea430e28ba6e36a/image-27.jpg)
Goals?

At the end of this course, you should be able to
- Describe the main computational challenges in molecular biology.
- Implement and use basic algorithms.
- Describe several advanced algorithms.
  - Sequence alignment using dynamic programming
  - Hidden Markov models
  - Support vector machines
  - Monte Carlo simulation
  - Hierarchical clustering
- Know the existing resources: databases, software, …
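
To give a flavor of sequence alignment using dynamic programming, here is a minimal sketch of global alignment scoring in the style of Needleman-Wunsch. It is not taken from the course materials: the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen only to keep the example small.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Best global alignment score of a and b under a simple linear gap penalty.

    The scoring parameters are illustrative defaults, not values from the course.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap          # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            score[i][j] = max(
                diagonal,                  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,     # gap in b
                score[i][j - 1] + gap,     # gap in a
            )
    return score[-1][-1]
```

With these assumed scores, `global_alignment_score("ACGT", "AGT")` is 2: three matches plus one gap. The table is filled in O(len(a) * len(b)) time, which is the key idea behind the dynamic programming approach.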