A Whirlwind Tour of Biomedical Informatics
Kun-Mao Chao (趙坤茂)
National Taiwan University
http://www.csie.ntu.edu.tw/~kmchao/

About this course
• Course: Introduction to Biomedical Informatics
• Spring semester, 2013; 9:10 - 12:10 Monday, 101 CSIE Building. 3 credits
• Web site: http://www.csie.ntu.edu.tw/~kmchao/bioinformatics13spr
• Instructor: Kun-Mao Chao (趙坤茂)
• Teaching assistants:
  – Chia-Jung Chang (張家榮) & Wu-Lung R. Yang (楊伍隆)

TA: Chia-Jung Chang (張家榮) & Wu-Lung R. Yang (楊伍隆)

Coursework
• Homework assignments and class participation (10%)
• Two midterm exams (70%; 35% each):
  – Midterm #1: April 1, 2013 (tentative)
  – Midterm #2: May 13, 2013 (tentative)
• Oral presentation of selected papers/projects (20%)

The Best? The Cheapest?
[Figure labels: The Best, Entrance, The Cheapest]

Bio-X? X-Informatics?
[Figure labels: Bio-X, Bioinformatics, X-Informatics]
Source: NIH, Bioinformatics Journal, NPS

Interdisciplinary Pioneers
• Archimedes of Syracuse
• Leonardo da Vinci
• Isaac Newton
Source: Wikipedia

Amphibia, Triphibia
Source: Wikipedia, xplanes

Band Alignment
(Joint work with W. Pearson and W. Miller, 1992)
[Figure axes: Seq. 1, Seq. 2]

Alignment in an Arbitrary Region
(Joint work with R. C. Hardison and W. Miller, 1993)

Aligning Very Similar Sequences
(Joint work with J. Zhang, J. Ostell and Webb Miller, 1997)

Generalized Global Alignment
(Joint work with X. Huang, 2003)

Tag SNPs & Haplotype Inference
(Joint work with Y.-T. Huang et al., 2006)
Yao-Ting Huang, Kui Zhang, Ting Chen, Chia-Jung Chang, Kun-Mao Chao

Sequence Comparison: Theory and Methods
(Joint work with L. Zhang, 2009)

Bioinformatics for Biologists
Edited by Pavel Pevzner and Ron Shamir
Cambridge University Press, 2011

Central Dogma of Molecular Biology
Source: http://www.ncbi.nlm.nih.gov

From Genes to Proteins
Source: http://www.ornl.gov

Double Helix
Source: http://www.nature.com

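The flow from gene to protein named on these two slides (DNA is transcribed to mRNA, and mRNA is translated to protein) can be sketched in a few lines of Python. This is a minimal illustration, not material from the slides: the function names `transcribe` and `translate` are hypothetical, the input is assumed to be the coding strand read 5' to 3' in frame, and the codon table is the standard genetic code written in the usual T, C, A, G ordering.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA codon by codon from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna, translate(mrna))
```

Real genes add complications this sketch ignores, such as introns, alternative start sites, and organism-specific genetic codes.
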
A Brief History of Genetics
• 1859 Charles Darwin published “The Origin of Species.”
• 1865 Genes are particulate factors. [Gregor Mendel]
• 1869 Discovery of nucleic acid [Friedrich Miescher]
• 1903 Chromosomes are hereditary units. [Walter Sutton]
• 1910 Genes lie on chromosomes. [Thomas Hunt Morgan]
• 1913 Chromosomes are linear arrays of genes. [Alfred Sturtevant]
• 1931 Recombination occurs by crossing over. [Harriet Creighton and Barbara McClintock]

A Brief History of Genetics (cont’d)
• 1944 DNA is the genetic material. [Oswald Avery, Colin MacLeod and Maclyn McCarty]
• 1953 DNA is a double helix. [James Watson and Francis Crick]
• 1961-1967 Genetic code is triplet. [Marshall Nirenberg, Har Gobind Khorana, Sydney Brenner & Francis Crick]
• 1977 DNA was sequenced for the first time. [Fred Sanger, Walter Gilbert, and Allan Maxam]
• 21st century: Many genomes completely sequenced
MIT OpenCourseWare: Biology 7.012 Introduction to Biology

Multiple Nobel Laureates

Milestones of Bioinformatics
• 1962 Pauling's theory of molecular evolution
• 1965 Margaret Dayhoff's Atlas of Protein Sequences
• 1970 Needleman-Wunsch algorithm
• 1977 DNA sequencing and software to analyze it (Staden)
• 1981 Smith-Waterman algorithm developed
• 1981 The concept of a sequence motif (Doolittle)
• 1982 GenBank Release 3 made public
• 1982 Phage lambda genome sequenced

Milestones of Bioinformatics (cont’d)
• 1983 Sequence database searching algorithm (Wilbur-Lipman)
• 1985 FASTP/FASTN: fast sequence similarity searching
• 1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
• 1990 BLAST: fast sequence similarity searching
• 1991 EST: expressed sequence tag sequencing
• 1993 Sanger Centre, Hinxton, UK
• 1994 EMBL European Bioinformatics Institute, Hinxton, UK

Milestones of Bioinformatics (cont’d)
• 1995 First bacterial genomes completely sequenced
• 1996 Yeast genome completely sequenced
• 1997 PSI-BLAST
• 1998 Worm (multicellular) genome completely sequenced
• 1999 Fly genome completely sequenced

Milestones of Bioinformatics (cont’d)
• Human Genome Project (1990-2003)
• Mouse 2002
• Rat 2004
• Chimpanzee 2005
• Completed Genomes

Chimpanzee Genome

The Primate Family Tree
Source: Nature

orz’s Sequence Evolution
• orz (kid)
• OTZ (adult)
• Orz (big head)
• Crz (motorcycle driver)
• on_ (soldier)
• or2 (bottom up)
• oΩ (back high)
• STO (the other way around)
• Oroz (me)

• The origin?
• Their evolutionary relationships?
• Their putative functional relationships?

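One simple way to start asking how the emoticon variants on this slide relate to each other is to count how many single-character edits separate each pair. The sketch below uses plain Levenshtein edit distance (insertions, deletions, and substitutions each cost 1) as an assumed stand-in for a real evolutionary distance. The function name `edit_distance` and the choice of unit costs are illustrative and do not come from the slides.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between strings a and b, one row at a time."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


variants = ["orz", "OTZ", "Orz", "Crz", "on_", "or2", "oΩ", "STO", "Oroz"]

# Pairwise distances; small values suggest closely related variants.
distances = {
    (x, y): edit_distance(x, y) for x, y in combinations(variants, 2)
}
for (x, y), d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{x:>4} {y:<4} {d}")
```

A distance table like this is the usual input to distance-based tree-building methods, though a comparison this simple treats upper and lower case as unrelated characters and so says nothing about what the shapes depict.
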
Topics
• Sequencing and genotyping technologies
• Molecular sequence analysis
• Recognition of genes and regulatory elements
• Comparative genomics
• Gene expression
• Molecular structural biology
• Biological networks
• Systems biology
• Computational proteomics
• Molecular evolution
• Phylogenetic trees
• Population genetics
• Medical informatics

Bioinformatics Centers
• National Center for Biotechnology Information (NCBI, NIH):
  – http://www.ncbi.nlm.nih.gov/
• European Bioinformatics Institute (EBI):
  – http://www.ebi.ac.uk/
• DNA Data Bank of Japan (DDBJ):
  – http://www.ddbj.nig.ac.jp/index-e.html
• UCSC Genome Browser Home
• RCSB Protein Data Bank

Bioinformatics Departments
• Computational Biology and Bioinformatics, USC
• Bioinformatics and Systems Biology, UCSD
• The Broad Institute of MIT and Harvard
• Computational and Genomic Biology, UC Berkeley
• Biomedical Informatics Research, Stanford University
• Comparative Genomics and Bioinformatics, Penn State
• Penn Center for Bioinformatics
• Max Planck Institute for Molecular Genetics
• Bioinformatics and Computational Biology, Iowa State

Bioinformatics Journals
• Bioinformatics
• Journal of Computational Biology
• Genome Research
• Nature
• Nucleic Acids Research
• PLoS Computational Biology
• Science

Nature & Science

Bioinformatics Conferences
• The Annual International Conference on Research in Computational Molecular Biology (RECOMB)
• The Symposium on Intelligent Systems for Molecular Biology (ISMB)
• The European Conference on Computational Biology (ECCB)

Bioinformatics Books

Bioinformatics Community
• The International Society for Computational Biology (ISCB)
  – Senior Scientist Accomplishment Award

10 Steps to Success in Bioinformatics by Webb Miller
1. Become a biologist.
2. Value your number of citations above your number of publications.
3. Collaborate, and do it with great collaborators.
4. Do not expect a warm welcome from everyone.
5. Be a good collaborator.
6. Distribute and maintain software and/or run web servers that you personally continue to use.

10 Steps to Success in Bioinformatics by Webb Miller
7. Alternate between working on specific datasets and writing general-purpose software.
8. Write some of your own software.
9. Don't give up.
10. Be excited about your work.