• Bioinformatics • BIO 520/INF 520 • Jim Lund • Assigned reading: • Ch 1 & 2
Bioinformatics applies principles of information science (derived from applied math, computer science, and statistics) to make the vast, diverse, and complex life sciences data more understandable and useful. It automates simple but repetitive types of analysis. Computational biology uses mathematical and computational approaches to address theoretical and experimental questions in biology.
BIO 520 Topics • Navigating biological databases. • Sequence alignment. • Proteins: 3D structure visualization, prediction, motif analysis. • DNA sequence annotation. – Gene finding in prokaryotes and eukaryotes. • RNA structure. • Phylogenetic inference. • Genome/transcriptome/proteome – Function & Analyses.
Molecular information-DNA • Raw bacterial DNA sequence – Coding or not? – Parse into genes? – Find regulatory sequences? – PCR primers, vector engineering? – 4 bases: ACGT • 1 kb for a gene • Mb for a genome
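The "Coding or not?" and "Parse into genes?" questions above can be illustrated with a naive open reading frame (ORF) scan. This is only a minimal sketch, not a real gene finder: it assumes ATG is the only start codon, uses the three standard stop codons, and ignores everything else a prokaryotic gene finder would model. The names `find_orfs`, `reverse_complement`, and `min_codons` are made up for this example.

```python
# Naive ORF scan: a sketch under simplifying assumptions (ATG-only starts,
# no ribosome binding sites, no codon-usage statistics).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=3):
    """Return (frame, start, end) tuples for ATG...stop ORFs on one strand.

    start is the 0-based index of the ATG; end is exclusive and includes
    the stop codon. min_codons counts codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # close the ORF and keep scanning
    return orfs


if __name__ == "__main__":
    dna = "CCATGAAATTTGGGTAACC"
    print(find_orfs(dna))                      # forward strand
    print(find_orfs(reverse_complement(dna)))  # reverse strand
```

Scanning both the sequence and its reverse complement covers all six reading frames. On a real genome most short ORFs found this way are not genes, which is why length thresholds and statistical models are needed.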
http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html
Protein Structure Prediction
Proteomics 1978-1998 MALDI-TOF? ESI-MS?
Metabolic Networks KEGG, 1998
Regulatory Networks KEGG
Bioinformatics-what is it? Acquisition, curation, and analysis of biological data Hypothesis
Bioinformatic Data-1978 to 2008 • DNA sequence • Gene expression • Protein Structure • Genome mapping • Metabolic networks • Regulatory networks • Trait mapping • Gene function analysis • Scientific literature
Goals of the HGP, 1998-2003 • Reference Human Genome Sequence – Draft 2001, Finished in 2003 • Improved Sequence Technology – $0.25 per finished base • Human Genome Sequence Variation • Technology for Functional Genomics • Comparative Genomics – Finish Mouse by 2005 (well ahead here) • ELSI Genome sequences highlight the finiteness of the set of sequences!
What remains to be done? • Comparative Genomics • Description of mRNAs, proteins (identity and structure) • Functional analysis • Detailed understanding of development, regulation, variation
The Gene for…
Other Reasons to Care Affymetrix Genentech
Biologist User Training • Internet sites – Range from high quality to unreliable. • Unread documentation • Popular program sites with NO documentation – “Perhaps one day I will get around to writing some documentation” – Help from a WWW service, hit several hundred times per day!
Dramatic Changes in Information Science • Information Storage – Digital: text, numbers, images • Computerized Data Analysis • Automated Data Analysis • Information Distribution – Internet, cloud, etc.
Moore’s Law Intel Corporation
Computer Science and bioinformatics • Operating Systems • Programming • Algorithms – New problems keep turning up! • Data structure/databases • Interfaces • Search and visualization
BIO 520 Nuts and Bolts • Syllabus & Schedule • Labs on Fridays in Young B-35 • Textbook – Internet – Program documentation • Exams (2 + final) • Grading: – 12 labs: 10 pts – Exams: 50 pts – Final: 50 pts http://elegans.uky.edu/520
Textbooks Required textbook: • Understanding Bioinformatics by Marketa Zvelebil and Jeremy Baum Supplemental reading (don’t buy): • Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, 3rd Ed. – Baxevanis and Ouellette Biology background material: – Genes IX (Lewin) – Cell Biology (Watson et al, Darnell et al) – NCBI Bookshelf (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Books&itool=toolbar)
Computer Resources • http://elegans.uky.edu/520 • Locally installed Programs: – Cn3D, Clustal, TreeView, Chime • Web based tools: – Databases – Software programs
Biological Principles • Evolution by natural selection • DNA -> RNA -> Protein • Structure and function
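The DNA -> RNA -> Protein flow above can be sketched in a few lines of code. This is a minimal illustration and makes several assumptions: the input is the coding (sense) strand, so transcription reduces to replacing T with U; translation uses the standard genetic code and starts at the first base; and the helper names `transcribe` and `translate` are invented for this example.

```python
from itertools import product

# Standard genetic code, laid out in the conventional TCAG codon order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Coding-strand DNA to mRNA (assumption: input is the sense strand)."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA from its first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA codons
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGAAATTTGGGTAA")
    print(mrna)
    print(translate(mrna))
```

Real sequences need more care (ambiguous bases such as N, alternative genetic codes, introns in eukaryotes), which this sketch deliberately leaves out.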